[w3m-dev-en 00060] Re: w3m-i18n/m17n

From: Hironori Sakamoto (hsaka@mth.biglobe.ne.jp)
Date: Sat Jan 29 2000 - 02:43:15 CST


>> Sender: naddy@mips.rhein-neckar.de
>> > BUT!, on the character sets of Japan/Korea/China these use 2 columns.
>> Hmmm.
>> But this is a characteristic of those character sets, then. These
>> characters could be single width on a UTF-8 display and double
>> width on a CJK one.

If w3m receives these characters in a CJK character set, it
treats each character as 2 columns wide.
But when they are displayed with UTF-8 they become 1 column wide, so
the rendering of tables and of the popup menu will break.
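
To illustrate the column mismatch described above, here is a minimal
Python sketch. It is not w3m code: the function name display_width and
the flag wide_is_two are made up for this example, and it ignores
combining characters. It uses the East Asian Width property from the
standard unicodedata module to decide which characters count as wide.

```python
import unicodedata


def display_width(text, wide_is_two=True):
    """Count terminal columns for text.

    Hypothetical helper: characters whose East Asian Width is
    'W' (wide) or 'F' (fullwidth) take 2 columns when wide_is_two
    is true; every other character takes 1 column.
    """
    width = 0
    for ch in text:
        if wide_is_two and unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


cell = "日本語"
# Width the application assumes when it lays out a table cell.
assumed = display_width(cell, wide_is_two=True)
# Width actually used by a display that draws every character in 1 column.
drawn = display_width(cell, wide_is_two=False)
print(assumed, drawn)
```

When the two numbers differ, every column border to the right of the
cell is drawn in the wrong place, which is the broken table layout
mentioned above.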

>> documents are common. However, there is *no* display character set
>> to display a document that has e.g. both French and Polish text,
>> or German and Russian, etc. The *only* display encoding that combines
>> all these character sets is Unicode/UTF-8. Somewhere under my

Do you know ISO-2022 as an encoding system?
The system of w3m-i18n is based on ISO-2022.
emacs (mule) and kterm (a multilingual xterm) are also based on ISO-2022.
ISO-8859-* and almost all CJK encoding systems are based on ISO-2022.
ISO-2022 can display several character sets (ISO-8859-*, CJK, etc.)
in combination.
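
As a small illustration of combining character sets in one ISO-2022
stream, the sketch below uses Python's standard iso2022_jp_2 codec
(not w3m's own implementation) to encode ASCII, Japanese and Korean
text together, and lists the escape sequences that switch between
character sets. The helper name escape_sequences is hypothetical.

```python
import re


def escape_sequences(data):
    """Return the ISO-2022 designation escapes found in data.

    An escape is ESC, then one or more intermediate bytes
    (0x20-0x2F), then one final byte (0x30-0x7E).
    """
    return re.findall(rb"\x1b[\x20-\x2f]+[\x30-\x7e]", data)


text = "w3m 日本語 한국어"
encoded = text.encode("iso2022_jp_2")

# Each escape designates the character set used for the bytes after it.
for esc in escape_sequences(encoded):
    print(esc)

# The stream is 7-bit and decodes back to the original text.
decoded = encoded.decode("iso2022_jp_2")
```

The text itself is carried in plain 7-bit bytes; only the escape
sequences say which character set those bytes belong to.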

For display alone, Unicode is not needed. Therefore my first plan
had no Unicode support. But I implemented Unicode support because it
is needed for conversion between character sets.
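
The idea of using Unicode only as a pivot for conversion can be
sketched as follows. This is an illustration with Python's standard
codecs, not the conversion code of w3m-i18n; the function name
transcode is made up for the example.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert data from one encoding to another via Unicode.

    Hypothetical helper: decode to Unicode first, then encode to
    the target. Only one table to and from Unicode is needed per
    character set, instead of one table per pair of character sets.
    """
    text = data.decode(source)   # source encoding -> Unicode
    return text.encode(target)   # Unicode -> target encoding


euc = "日本語".encode("euc_jp")
jis = transcode(euc, "euc_jp", "iso2022_jp")
```

Characters that the target character set does not contain make the
encode step raise UnicodeEncodeError, so a real converter has to
decide how to substitute them.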

>> > I think when I will get a terminal emulator which fully support Unicode
>> > and fully Unicode fonts, I will may start the support of displaying
>> > with UTF-8.
>> Well, what is "full" Unicode support? Double-width characters?
>> Combining characters? Mixing left-to-right and right-to-left writing?
>> Automatic linking and breaking of ligatures?

With Unicode alone, it is impossible to handle right-to-left or
top-to-bottom writing (the traditional writing direction of Japan and
China, do you know it?) or combining characters (Thai, Indian scripts, etc.).
Therefore, I do not expect those from Unicode.
My wish list:
* CJK fonts (in a variety of widths and heights).
* An application running on xterm can select the character width.
  (This is indispensable for w3m.)
* Conversion between ISO-2022 and Unicode.

>> [1] As I mentioned previously, generic xterm is growing Unicode/UTF-8
>> support. You can get the current version at
>> http://www.clark.net/pub/dickey/xterm/xterm.html

I am disappointed in xterm... Why does xterm not support ISO-2022,
Big5 (Taiwan), Vietnamese, Thai, and other coding systems?
Is xterm software only for Europeans and Americans?
There are MANY documents which are not written in Unicode.
There are MANY programs which do not support Unicode.
When xterm supports both Unicode and the existing coding systems,
it will be software for people all over the world.
Supporting only Unicode is not a solution for multilingualization
and internationalization.

PS.
I am afraid that you may have misunderstood me because of my poor English.
-----------------------------------
Hironori Sakamoto <hsaka@mth.biglobe.ne.jp>
 http://www2u.biglobe.ne.jp/~hsaka/



This archive was generated by hypermail 2b29 : Wed Jul 19 2000 - 10:30:43 CDT