[w3m-dev-en 00065] Re: w3m-i18n/m17n

From: Christian Weisgerber (w3m-dev-en@mips.rhein-neckar.de)
Date: Sun Jan 30 2000 - 09:56:54 CST


Hironori Sakamoto <hsaka@mth.biglobe.ne.jp> wrote:

> If w3m receives these characters in the character sets of CJK,
> w3m understands character width is 2 columns.
> But, at displaying with UTF-8, it becomes 1 column,

I don't understand this.

Characters are received in some charset; they are mapped to Unicode;
from there they are mapped to the display charset (which may be
Unicode in UTF-8 encoding). Each character should have an associated
width in the display charset. The width in the input charset doesn't
matter. (Leaving aside columnar text in <pre>...</pre> for the moment.)
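
To make the pipeline concrete, here is a rough sketch in Python. It is not w3m code. The function names are made up for illustration, and the width rule (East Asian Wide and Fullwidth characters take two columns, everything else one) is my assumption about what a terminal would do.

```python
import unicodedata


def display_width(text: str) -> int:
    """Count terminal columns for already-decoded Unicode text.

    Assumption: East Asian Wide (W) and Fullwidth (F) characters take
    two columns and everything else takes one.
    """
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in text)


def to_display(raw: bytes, input_charset: str, display_charset: str = "utf-8"):
    """Map input bytes to Unicode, then to the display charset.

    Returns the encoded bytes and the column count. The count is
    computed from the Unicode text, never from the input charset.
    """
    text = raw.decode(input_charset)
    return text.encode(display_charset), display_width(text)


# The same three kanji arrive in two different CJK encodings.
for charset in ("euc_jp", "shift_jis"):
    encoded, columns = to_display("日本語".encode(charset), charset)
    print(charset, columns)
```

Both input encodings should give six columns for the UTF-8 output, because the width is looked up after the mapping to Unicode.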

> Do you know ISO-2022 as encoding system ?

Not really. I understand it allows switching between different
character sets/encodings.

> ISO-2022 can display some character sets (ISO-8859-*, CJK, etc.)
> with combination.

There is no supporting code base for the use of ISO-2022 in Europe.
Also, now that Unicode exists, I don't think ISO-2022 will ever
see any use in Europe. It introduces a stateful encoding which
people really abhor. UTF-8 also introduces state, but that is
limited to single characters at least.
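
What I mean by "limited to single characters" can be sketched as follows. In UTF-8 every continuation byte has the bit pattern 10xxxxxx, so a decoder dropped at an arbitrary byte offset only needs to skip to the next byte that is not a continuation byte. The helper name `resync` is hypothetical.

```python
def resync(data: bytes, offset: int) -> int:
    """Return the first offset >= `offset` that starts a UTF-8 character.

    Continuation bytes match 10xxxxxx (0x80..0xBF); skipping them is all
    the recovery that is needed, since no state outlives one character.
    """
    while offset < len(data) and (data[offset] & 0xC0) == 0x80:
        offset += 1
    return offset


data = "aé日b".encode("utf-8")   # 1 + 2 + 3 + 1 = 7 bytes
start = resync(data, 4)          # offset 4 is in the middle of "日"
print(data[start:].decode("utf-8"))
```

With a stateful ISO-2022 stream the same trick is not available, since the meaning of a byte depends on escape sequences that may have occurred arbitrarily far back.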

> With only Unicode, it is impossible to handle right-to-left or
> top-to-bottom writing (traditional writing of Japan and China, do you know ?)
> or combining characters(Thai, India, etc.),

I don't know about top-to-bottom, but Unicode does have some
provisions for dealing with right-to-left scripts. And of course
Unicode deals with combining characters, ranging from the simple
addition of accents to Latin letters to complex scripts such as
Thai. Now, whether an application that makes use of Unicode can
handle all these features is another matter. The people on the
linux-utf8 mailing list can explain the details to you.
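
As a small illustration of the character properties involved, the sketch below uses Python's standard `unicodedata` module. The helper `cell_count` is hypothetical, and treating only the Mn and Me categories as zero-width is a simplifying assumption; real rendering of Thai or bidirectional text needs considerably more than this.

```python
import unicodedata


def cell_count(text: str) -> int:
    """Count base characters, skipping nonspacing and enclosing marks.

    Assumption: characters in general categories Mn and Me are drawn
    on top of the preceding base character and occupy no cell.
    """
    return sum(1 for ch in text
               if unicodedata.category(ch) not in ("Mn", "Me"))


latin = "e\u0301"        # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT
thai = "\u0e01\u0e34"    # THAI KO KAI + THAI SARA I (a vowel mark above)
hebrew = "\u05d0"        # HEBREW LETTER ALEF

print(cell_count(latin), cell_count(thai))
# NFC normalization composes the Latin pair into a single code point.
print(len(unicodedata.normalize("NFC", latin)))
# The bidirectional class is what a right-to-left layout pass would consult.
print(unicodedata.bidirectional(hebrew))
```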

> I have disappointment to xterm... Why does xterm support ISO-2022,
> Big5(Taiwan's), Veitnamese's, Thai's,.. coding systems.

(Presumably you meant to say "why doesn't...".)

> Is xterm a software for only Eropean and American ?
> There are MANY documents which are not written in Unicode.
> There are MANY software which do not support Unicode.
> When xterm will support both Unicode and existing coding systems,
> xterm will be a software for people in the world.
> The support of only Unicode is not a solution of multilingualization
> and internationalization.

You are contesting an argument I have not made.

I do *not* propose that xterm should be the sole X11 terminal
emulator. I do *not* propose that all character encodings of the
world should be replaced by Unicode. That is an entirely different
discussion--and one which is of no concern to this list.

I *do* propose that UTF-8 should be added as an *additional* display
encoding to w3m. In no way should this impinge on the continued
availability of other display encodings.

-- 
Christian "naddy" Weisgerber                  naddy@mips.rhein-neckar.de


