yet another terminal/iterm umlaut irritation

Hi all

I'm back again with umlaut issues.

However I'll begin by describing the set up.

I have GNU bash, version 3.2.17(1)-release (powerpc-apple-darwin8.9.0) installed.
I also use vim version 7.1b.1 BETA

My .bashrc contains the following

PATH=$PATH:/usr/local/bin/:/Users/duffy/'Java Arkiv'/Glassfish_Arkiv/glassfish/bin/
export LC_ALL=sv_SE.UTF-8
export LANG=sv_SE.UTF-8
export JAVA_HOME=/Library/Java/Home
export TERM=xterm-16color
alias mysql=/usr/local/mysql/bin/mysql
alias vim=/opt/local/bin/vim
alias vimdiff=/opt/local/bin/vimdiff
SHELL=/opt/local/bin/bash

my .bash_profile contains

if [ -f ~/.bashrc ]; then
source ~/.bashrc
fi

my .inputrc contains

set input-meta on
set output-meta on

set convert-meta off
set meta-flag on

My terminal settings are as follows:
I log in with /usr/bin/login
I use xterm-color
Screen settings -> UTF-8 encoding, wide characters count as 2

And in my .vimrc I have set the encoding to utf-8

That about does it for my set up.

Now - here goes.
The Terminal window displays umlauts correctly in command output, and appears to display them correctly when I type. If I run ls on a directory containing a file called, let's say, påsk.txt, the file name looks fine. If I type ls på* I get "No such file or directory". However, if I select the file name in Terminal and copy and paste "på" into the ls command, I get ls på* and it works fine. In other words, what the keyboard inputs and what the terminal outputs don't seem to be the same, and I can't figure out why.

On top of that, when I start vim I can type å ä ö, but each one takes up two character positions, so the word påsk looks like på sk. This happens despite the fact that the terminal uses UTF-8 and vim's encoding is set to utf-8.

Maybe I've missed some info here, but I've pretty much put all the info in here I think is relevant.

I await words from the great Gurus out there. 🙂

Pb G4, Mac OS X (10.4.1)

Posted on Sep 12, 2007 11:13 AM

Sep 13, 2007 3:44 AM in response to snigelman

If i type ls på* I get No such file or directory


In HFS+, which is Mac OS X's default file system, accented characters in file names are stored in "decomposed form". For example, "å" is stored as "a" + "COMBINING RING ABOVE". The file names output by the ls command are therefore decomposed:

ls p*.txt | od -t x1 -c


If you enter "å" from your keyboard, on the other hand, it is in pre-composed form, i.e., a single Unicode character:

echo å | od -t x1 -c

If you type "ls -w -l påsk.txt", the "å" is pre-composed and bash passes this pre-composed form to some system call (such as stat(2)). But the system call seems to internally convert it into decomposed form, so it works.

If you type "ls -w på*", on the other hand, bash gets filenames (which are decomposed) in the current directory and compares them with "på" (which is pre-composed), resulting in no match. A workaround is to use "ls -w pa*", i.e., without the accent.
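The difference between the two forms can be sketched without a Mac or a particular keyboard layout. This is only an illustration under stated assumptions: the byte sequences are written out with octal escapes, and the helper names `hex_of` and `matches_prefix` are made up for this example. The `case` pattern stands in for the byte-wise comparison that a glob like `på*` ends up doing.

```shell
# Byte sequences written with octal escapes, so the example does not depend
# on how the keyboard or terminal encodes the character.
precomposed=$(printf 'p\303\245sk.txt')    # "å" as one code point, U+00E5
decomposed=$(printf 'pa\314\212sk.txt')    # "a" + U+030A COMBINING RING ABOVE

# Hypothetical helper: print the bytes of $1 as one lowercase hex string.
hex_of() {
    printf '%s' "$1" | od -An -t x1 | tr -d ' \n'
}

# Hypothetical helper: succeed if name $1 starts with the literal prefix $2,
# similar in spirit to matching the name against the glob "prefix*".
matches_prefix() {
    case "$1" in
        "$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

echo "precomposed: $(hex_of "$precomposed")"
echo "decomposed:  $(hex_of "$decomposed")"

typed_prefix=$(printf 'p\303\245')         # what the keyboard sends for "på"
if matches_prefix "$decomposed" "$typed_prefix"; then
    echo "typed prefix matches the stored name"
else
    echo "typed prefix does not match the stored name"
fi

# The workaround from above: drop the accent from the pattern.
if matches_prefix "$decomposed" "pa"; then
    echo "unaccented prefix matches the stored name"
fi
```

The two names look identical on screen but differ in their bytes, which is why the pasted "på" (copied from the decomposed ls output) matches while the typed one does not.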

I also use vim version 7.1b.1 BETA


Where did you get this? I guess it has been built without multibyte support. In vim, type

:version<Return>

and search for the string "multi_byte". If it has "-" in front of it, then multibyte support is not included in that vim.
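The same check can be sketched from the shell, since `vim --version` prints the same feature list as `:version`. The helper name `has_multi_byte` is made up for this example, and the check looks at whichever vim the shell resolves, which may not be the one your alias points to.

```shell
# Hypothetical helper: reads a Vim feature list on stdin and succeeds
# if "+multi_byte" appears in it (a "-multi_byte" entry does not count).
has_multi_byte() {
    grep -F -q -e '+multi_byte'
}

# Only run the check if a vim can be found at all.
if command -v vim >/dev/null 2>&1; then
    if vim --version | has_multi_byte; then
        echo "multi_byte: included"
    else
        echo "multi_byte: not included"
    fi
fi
```

To test a specific binary, replace `vim` with its full path, for example the /opt/local/bin/vim from the .bashrc above.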

The ones in
http://macvim.org/OSX/index.php (vim7.0) or
http://code.google.com/p/macvim/ (vim7.1)
will support multibyte.
Or you can build vim7.1 by yourself (if you have Xcode Tools installed).

Sep 13, 2007 9:09 AM in response to Jun T.

Hi

Thanks for a very good explanation of Unicode and Mac OS X. Very helpful and interesting. As for my version of Vim, I believe it's from DarwinPorts. But I have now deactivated it since it did not have multi_byte support. I now have VIM - Vi IMproved 7.0, which has multi_byte.

Still, my problem remains with vim and Terminal. If I have the terminal encoding set to UTF-8, vim displays å ä ö as double-width characters, making påsk look like på sk.

And yes, I have checked "wide characters count as 2" under Screen appearance (I'm translating from the Swedish version).

iTerm is even weirder. iTerm seems incapable of displaying umlauts correctly no matter what encoding is used, so the ls command never shows correct umlauts. Vim has the same issues in iTerm as in Terminal when the encoding is UTF-8.

" If you type "ls -w på*" "

That works, kind of.
I have noticed that ls lists some files tabbed in from the others.
An example:

file1.txt
file2.txt
    file3.txt
    file4.txt
file5.txt

ls -w pa* works fine on the files that are not tabbed, i.e. file1, 2 and 5, but not on 3 and 4.

Finally, as for Vim 7.1: yes, I can build it, but I guess 7.0 is more stable.

Sep 13, 2007 9:59 AM in response to snigelman

vim displays å ä ö as double-width characters, making påsk look like på sk.


Strange.
On my Mac, even /usr/bin/vim (vim6.2, pre-installed by Apple) can display them correctly. Which font are you using in Terminal.app? If you are using Monaco then I have no idea what's wrong with your settings...

ls -w pa* works fine on the files that are not tabbed, i.e. file1, 2 and 5, but not on 3 and 4.


I guess some of the files actually have a TAB character in their names. Try 'ls -b'.
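As a rough sketch of what 'ls -b' would reveal, the following creates a throwaway directory containing one file whose name starts with a TAB and lists it. The directory and file names are invented for the example; with the GNU and BSD versions of ls, the TAB should show up as a visible escape instead of as indentation.

```shell
# Throwaway directory for the demonstration (names are made up).
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/lsdemo.XXXXXX")

# One ordinary name, and one name that begins with a TAB character.
tab_name=$(printf '\tfile3.txt')
touch "$demo_dir/file1.txt" "$demo_dir/$tab_name"

# -b prints non-printable characters as C-style escapes,
# so a TAB in a name becomes visible instead of looking like indentation.
escaped_listing=$(ls -b "$demo_dir")
printf '%s\n' "$escaped_listing"

# Clean up.
rm -r "$demo_dir"
```

If a name in your own directory turns out to start with a TAB, a pattern like pa* cannot match it, because the name does not begin with "p".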

By the way, when you post a program, shell output, or the like, please enclose it in {code} tags, as follows:

{code}
your code here
{code}
