UTF8 terminal input and missing locales

Hello all 🙂

Some time ago two of this forum's members (Tom Gewecke and Jun. T.) [url="http://discussions.apple.com/click.jspa?searchID=-1&messageID=3526723"]helped me[/url] get Vietnamese input working in X11. That was a huge help, and I appreciate it very much.

Although we worked around that problem by creating a custom keymap, the cause of my input problems overall, the lack of a UTF8 Vietnamese locale in the base OSX system, remains.

1. I would really like to install a Vietnamese UTF8 locale. I have searched, but I can't find anything on how to find or create missing OSX locales. Can anyone here help me with that, or point me to instructions on it?

2. I want to get terminal input (specifically [url="http://iterm.sourceforge.net/"]iTerm[/url]) working for my language. Can I get iTerm (or the bash shell running in it) to call my custom keymap, as we did with xterm?

I find iTerm in most ways a more comfortable terminal than Terminal.app or PathFinder's terminal, but its lack of UTF8 input is really a problem for me. [url="http://www.rift.dk/news.php?item.7.6"]This article[/url] explains how to get UTF8 input working for OSX terminals, but also points out iTerm's deficiencies in this area. I have added information to a [url="http://sourceforge.net/tracker/index.php?func=detail&aid=1623392&group_id=67789&atid=518973"]current bug[/url] on enabling UTF8 input in iTerm. I have also joined the iTerm mailing list, and I will follow this issue up there and report back here.

The lack of a UTF8 Vietnamese locale anywhere on disk seems to me to be a fundamental problem. I want to solve it. Surely, the languages supported by default under the OSX GUI should also be supported by the base system?

And I simply want to get Vietnamese input working in terminal.

Thanks for any help you can offer with this. 🙂

White MacBook 1.83 GHz, 2 GB RAM, 120 GB HDD, Mac OS X (10.4.8)

Posted on Dec 30, 2006 9:16 PM

10 replies

Dec 31, 2006 1:54 AM in response to Clytie

I think the locale is not important for inputting Vietnamese in Terminal.app (and other OSX applications). You may use any UTF-8 locale such as en_US.UTF-8 (if you want to use UTF-8 encoding). What you need is a Vietnamese keyboard layout. As you know, MacOS X has a Vietnamese keyboard layout, but it requires hitting 'e then 5' to input è. If you don't like this, you can create your own keyboard layout using software such as Ukelele.

I'm assuming the character encoding for your Terminal.app is set to UTF-8.
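
To see which UTF-8 locales are actually installed, something like the following sketch may help. It assumes the standard `locale -a` command is available; `utf8_locales` is a hypothetical helper name, and `vi_VN` is assumed to be the name a Vietnamese locale would carry.

```shell
# Hypothetical helper: keep only locale names whose codeset is UTF-8
# (matches endings such as .UTF-8 or .utf8, case-insensitively).
utf8_locales() {
    grep -i -E '\.utf-?8$'
}

# List the UTF-8 locales this system knows about.
locale -a 2>/dev/null | utf8_locales || true

# Check specifically for a Vietnamese UTF-8 locale (assumed to be named vi_VN.*).
if locale -a 2>/dev/null | utf8_locales | grep -q '^vi_VN'; then
    echo "vi_VN UTF-8 locale is installed"
else
    echo "no vi_VN UTF-8 locale found; another UTF-8 locale such as en_US.UTF-8 can be used"
fi
```

If the second branch is taken, that matches the situation described in this thread, and exporting any other UTF-8 locale should still allow Vietnamese input.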

PowerMac G4 Mac OS X (10.4.8)

Dec 31, 2006 8:53 AM in response to Clytie

1. I would really like to install a Vietnamese UTF8 locale.


I too think this is not really the issue. The problems relate to making Terminal or a substitute for Terminal do UTF-8 properly for any language, and then again perhaps the same thing for any app you want to use in Terminal.

What is it you need to do in Terminal exactly -- input Vietnamese filenames in commands?

Dec 31, 2006 1:32 PM in response to Clytie

PS Do these settings help for you in Terminal? Also make sure you have a font selected which has all the characters.

* In the Terminal Inspector:
    o In the Emulation section, turn off the Escape non-ASCII characters option.
    o In the Display section, choose Unicode (UTF-8) as the Character Set Encoding.
* Add the following line to your .profile: export LC_CTYPE=en_US.UTF-8
* Add the following lines to your .inputrc:

set meta-flag on
set input-meta on
set output-meta on
set convert-meta off

* Apply changes by doing a source ~/.profile and a source ~/.inputrc.
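
After applying the steps above, a quick check along these lines can confirm that the shell really ended up with a UTF-8 character type. This is only a sketch: `is_utf8_locale` and `effective_ctype` are hypothetical helper names, and the check looks only at the environment variables, not at what the terminal emulator itself is set to.

```shell
# Hypothetical helper: succeeds if a locale name advertises a UTF-8 codeset,
# for example en_US.UTF-8 or vi_VN.utf8.
is_utf8_locale() {
    case "$1" in
        *.UTF-8|*.utf-8|*.UTF8|*.utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Effective character-type locale: LC_ALL overrides LC_CTYPE,
# which overrides LANG; fall back to "C" if none is set.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

if is_utf8_locale "$(effective_ctype)"; then
    echo "character type is UTF-8: $(effective_ctype)"
else
    echo "character type is not UTF-8: $(effective_ctype)"
fi
```

Note that in bash the .inputrc file is read by readline rather than by the shell itself, so `bind -f ~/.inputrc` (or simply opening a new terminal window) is the usual way to have its settings re-read.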

Dec 31, 2006 7:47 PM in response to Tom Gewecke

Tom, you're a genius! 😀

I can now input Vietnamese in iTerm (this probably works in Terminal.app as well).

I'll send your instructions to the iTerm list and the author of the Internationalizing your Terminal article, so others can benefit from them.

To answer your question, I need, for example, to input search terms when searching in translated manpages or other Vietnamese text. UTF8 filenames will also be more usable now.

Thanks again! 🙂

Jan 1, 2007 9:13 PM in response to Clytie

Please use UTF-8, UTF8 gives you a headache.

Note that if you have bytes with the eighth bit on in your script (for
example embedded Latin-1 in your string literals), "use utf8" will be
unhappy since the bytes are most probably not well-formed UTF-8. If
you want to have such bytes and use utf8, you can disable utf8 until
the end the block (or file, if at top level) by "no utf8;".

If you want to automatically upgrade your 8-bit legacy bytes to UTF-8,
use the "encoding" pragma instead of this pragma. For example, if you
want to implicitly upgrade your ISO 8859-1 (Latin-1) bytes to UTF-8 as
used in e.g. "chr()" and "\x{...}", try this:

use encoding "latin-1";
my $c = chr(0xc4);
my $x = "\x{c5}";

In case you are wondering: yes, "use encoding 'utf8';" works much the
same as "use utf8;".

2.16 Ghz Core Duo Mac OS X (10.4.8)

Jan 3, 2007 4:20 AM in response to CodLBi

Thanks Tom, with your help it looks like I can actually create a locale. That would be great, because then I can upload it for others to find, and it might also work for BSD.

CodLBi, are you sure you have the right thread? I'm not writing a script, and I really don't understand what you are trying to say. If you are trying to help me, thank you for your effort. Perhaps you have misunderstood my original question. I was trying to get utf8 input working in my shell, and also want to create a utf8 locale for my language. Tom has helped me a lot with both of those. This is certainly a great place to find information. 🙂

Jan 3, 2007 11:58 PM in response to Clytie

Hi Clytie,

I just wish someone had shown me this message 6 years ago.
I finally got it.

🙂

On 22 Oct 1997, wrote:

In article <344CFAF2.C3B136CD at netscape.com>,>
see us defined something like UTF8 as the default charset used. While I
heard from people who agreed with me on this, I didn't hear any
objections. Is this OK? Do people think this would be a bad thing? If
not should we change the draft?


UTF8 sounds fine.


Please always use UTF-8 and not UTF8. UTF-8 is the correct MIME charset
value; UTF8 only is risking confusion.

[The only place where I have seen UTF8 instead of UTF-8 is VRML 2.0,
but that's not an IETF standard, and other values are not allowed
anyway.]

Regards, Martin.

2.16 Ghz Core Duo Mac OS X (10.4.8)

