[w3m-dev-en 00082] escape characters -> locale, isprint()

From: Sven Mascheck (mascheck@faw.uni-ulm.de)
Date: Thu Feb 17 2000 - 14:14:43 CST


Hello,

From time to time `me and w3m' stumble over websites using invalid
control characters in their pages (i guess they are proprietarily
`featuring' microsoft browsers).

In particular, the byte 0x96 completely messes up the xfree-xterm,
which interprets it as the control character
``ESC V , Start of Guarded Area (SPA: 0x96)'' :
Text written after that point gets `stored' and won't be `cleared' easily
(see an illustration at http://www.uni-ulm.de/~s_smasch/w3m_test.html).

Using Hironori Sakamoto's i18n-patch certainly helps,
but as i don't know when that will be merged into the official w3m,
i'd like to ask for your opinion about using isprint() (in connection with locales).

Please see the small patch in the attachment:
 - Setting the locale from the corresponding environment variables.
 - Testing the char (or char+256, if needed) with isprint().
   (I stumbled over possibly negative `char's, although isprint() is
    expecting a positive `int'. I just don't know how to solve that better).
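
The two steps above could be sketched roughly as follows. This is not the
attached patch, only a minimal illustration; the helper names
init_ctype_locale(), is_printable_byte() and sanitize_line() are
hypothetical and do not exist in w3m. For the negative-`char' problem, the
usual idiom is to cast to `unsigned char' before calling isprint(), which
gives the same value as char+256 for negative chars on 8-bit platforms.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical helper: call once at startup so that isprint() follows
 * the LC_ALL / LC_CTYPE / LANG environment variables. */
void init_ctype_locale(void)
{
    setlocale(LC_CTYPE, "");
}

/* Hypothetical helper: nonzero if the byte is printable in the current
 * locale. The cast avoids passing a negative value to isprint() on
 * platforms where plain char is signed. */
int is_printable_byte(char c)
{
    return isprint((unsigned char)c) != 0;
}

/* Hypothetical helper: overwrite non-printable bytes in a line with
 * `replacement`, leaving tab and newline untouched. */
void sanitize_line(char *s, char replacement)
{
    for (; *s != '\0'; s++) {
        if (*s == '\n' || *s == '\t')
            continue;
        if (!is_printable_byte(*s))
            *s = replacement;
    }
}
```

Without the setlocale() call the program stays in the "C" locale, where
bytes above 0x7f (including 0x96) are not reported as printable; with an
ISO-8859-1 locale the range 0xa0-0xff would normally pass, while the
0x80-0x9f control range would still be rejected.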

This additional code is only active ``#ifdef EN''.

Certainly the original cause is _invalid_ html, but using
isprint()/locale would be a reasonable small feature anyway.
So far i haven't noticed any drawbacks.

What do you think?

Regards,
Sven Mascheck

-- 



