Character set/encoding issues when dealing with Address Book import/export

I work with a networked mixture of SPARC/Solaris workstations and Apple iMacs and MacBooks, and use a Treo 650 as my PDA/phone. I wrote a set of awk programs to
convert a ~20-year-old home-grown ASCII
addressbook database system that was built upon the troff/bibIX bibliographic
database system. The output contains one address entry per line, each line in
the form of tab-separated values (TSV). I have successfully used such a file to
import addresses into Thunderbird, Palm Desktop, and Apple's Address Book
application, v. 4.1 (687.1) running under Mac OS X 10.5.2 (Leopard). In evaluating
IMAP mail clients and address book systems that I can use across my platforms, I
have settled on using Thunderbird as the mail agent, but using Address Book on the Apple for my
addresses, with the goal of exporting vCard files from that tool that can be
imported into t-bird and Palm Desktop. The problem is this: the original data
contains foreign (European) accent marks in the form of troff escape sequences,
which I would like to translate into a form usable within my new framework.

Bear with me if some of what follows is oversimplified or naive. The vi and awk
applications under Solaris are almost certainly NOT Unicode-capable. However,
I was able to use awk to change the troff escape sequences into what I assume
are probably their ISO-Latin-1 (8859-1) equivalents -- the characters look as
desired within vi. If I ftp such a file to a Mac, it also looks as desired within
vi. If I use the Apple "character palette" to insert accented characters into
such a file, they appear the same, so I am guessing that this approach is also
using ISO-Latin-1 (understanding that ASCII, and perhaps ISO-Latin-1, is
Unicode, if the UTF-8 transport encoding is in use). However, if I import
such a file into AddressBook, the accented characters appear as other characters
altogether. If I change them within AddressBook and export them, they do not
appear as desired in outside applications. Clearly the two environments are
using different character encoding conventions. How do I cleanly solve this
issue? I do have iconv available, of course, and I suspect that use of it will
be part of the solution.

Thanks in advance for any helpful guidance, and best wishes for success in your
own computing endeavors...

iMac, MacBook Pro, Mac OS X (10.5.2)

Posted on Apr 24, 2008 12:17 PM


Apr 24, 2008 12:42 PM in response to unix4vr

understanding that ASCII, and perhaps ISO-Latin-1, is
Unicode, if the UTF-8 transport encoding is in use


ASCII text is also valid UTF-8, but Latin-1 text with accented characters is not. é is the single byte hex E9 in Latin-1 but the two bytes C3 A9 in UTF-8.
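The byte values quoted above can be checked directly. The following is a minimal Python sketch (Python is not used anywhere in this thread; it is chosen here only for illustration) that encodes the same character both ways and shows that a lone Latin-1 byte is rejected by a UTF-8 decoder.

```python
# Compare how the same character is stored under each encoding.
text = "\u00e9"  # e-acute, U+00E9

latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

print(latin1_bytes.hex())  # e9
print(utf8_bytes.hex())    # c3a9

# Pure ASCII text is byte-for-byte identical in ASCII, Latin-1, and UTF-8.
ascii_text = "Address Book"
assert ascii_text.encode("ascii") == ascii_text.encode("latin-1") == ascii_text.encode("utf-8")

# The lone Latin-1 byte E9 is not a valid UTF-8 sequence, so decoding fails.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print(latin1_is_valid_utf8)  # False
```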

However, if I import
such a file into AddressBook, the accented characters appear as other characters
altogether.


You might try opening your file with TextEdit set to Latin-1, then resaving with TE set to UTF-8 and UTF-16, then importing into Address Book. One of those should work.
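The same re-encoding step can be done without TextEdit. Below is a rough Python sketch, assuming the input really is Latin-1; the function name `reencode` and the file names are hypothetical, and the sample row is made up for the demonstration.

```python
import tempfile
from pathlib import Path


def reencode(src, dst, src_encoding="latin-1", dst_encoding="utf-8"):
    """Read src as src_encoding and write the same text to dst as dst_encoding."""
    text = Path(src).read_bytes().decode(src_encoding)
    Path(dst).write_bytes(text.encode(dst_encoding))


# Build a small Latin-1 sample file in a scratch directory (hypothetical names).
workdir = Path(tempfile.mkdtemp())
src = workdir / "addresses-latin1.tsv"
src.write_bytes("Ren\u00e9\tZ\u00fcrich\n".encode("latin-1"))

# One copy as UTF-8, one as UTF-16. Python's "utf-16" codec prepends a
# byte-order mark and uses the machine's native byte order.
reencode(src, workdir / "addresses-utf8.tsv")
reencode(src, workdir / "addresses-utf16.tsv", dst_encoding="utf-16")

print((workdir / "addresses-utf8.tsv").read_bytes())
```

On the command line, `iconv -f ISO-8859-1 -t UTF-8 infile > outfile` should perform the same Latin-1 to UTF-8 conversion. Whether Address Book then reads the result correctly is a separate question.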

Apr 24, 2008 1:33 PM in response to Tom Gewecke

Tom,

Thanks for taking the time to reply with what sounded like a clever simple fix.
I did confirm that my input is ISO-Latin-1; I imported it into TextEdit and
then saved it back out as UTF-8. It's good to know TextEdit has such a
capability, and a bit embarrassing not to have noticed it before. 🙂

However, importing this file into Address Book resulted in bogus character
mappings (an e-acute, for example, gets mapped to a square-root radical followed
by a copyright mark -- two characters!). I then tried saving the original file
from TextEdit in UTF-16, thinking that would work. However, Address Book appears
to be hanging up in the process of importing the file. The beach ball is still
spinning as I write, many minutes after starting the import process. Any further
thoughts? I'm quite surprised this does not appear to have worked...
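One possible reading of the symptom above, which nothing in this thread confirms, is that the importer interpreted the UTF-8 bytes as MacRoman. In MacRoman, byte C3 is the square-root sign and byte A9 is the copyright sign, which matches the two characters described. A small Python sketch of that hypothesis:

```python
# UTF-8 stores e-acute as the two bytes C3 A9.
utf8_bytes = "\u00e9".encode("utf-8")

# Assumption: the reader treats each byte as one MacRoman character.
misread = utf8_bytes.decode("mac_roman")
print(misread)  # square-root sign followed by copyright sign

# The damage is reversible as long as the bytes were not otherwise altered.
repaired = misread.encode("mac_roman").decode("utf-8")
print(repaired == "\u00e9")  # True
```

If this guess is right, the TSV importer is not detecting UTF-8 on its own, so the file's encoding would need to match whatever the importer assumes.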

Jun 12, 2008 4:52 PM in response to Tom Gewecke

Actually I have a similar problem, but it's an export issue. I'd like to export "My Card" to a vcf file and use another application to read it. So I have the following AppleScript:

tell application "Finder"
    set theFolder to folder "TemporaryItems" of folder "Caches" of folder "Library" of home
end tell

tell application "Address Book"
    set theVcard to (vcard of my card) as Unicode text
    set theFile to (theFolder as text) & "myCard.vcf"
    set theFileHandle to open for access theFile with write permission
    -- write (ASCII character of 254) to theFileHandle
    -- write (ASCII character of 255) to theFileHandle
    write theVcard to theFileHandle
    close access theFileHandle
end tell

OK, now the problem is that if "My Card" has any double-byte characters like Chinese or Arabic, the output file just holds question marks ("??"). The code works in Tiger but not in Leopard. (The two commented-out lines add characters 254 and 255, the UTF-16 BE byte-order mark, just to make it a UTF-16 BE file, so they don't really change the result of this script.)

I started digging, and I guess this is the reason: in the following statement, "vcard of my card" always returns plain text:
    set theVcard to (vcard of my card) as Unicode text

I got this clue from the help of ScriptEditor:
vcard (text, r/o) : Person information in vCard format, this always returns a card in version 3.0 format.
But in Tiger, you can see:
vcard (Unicode text, r/o) : Person information in vCard format, this always returns a card in version 3.0 format.
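To illustrate the two ideas in play here (this is a Python analogy, not a fix for the AppleScript): characters 254 and 255 are the bytes FE FF, the UTF-16 big-endian byte-order mark, and text that is forced through a plain single-byte conversion before it is written can lose every non-ASCII character to "?". The sample name and card below are made up.

```python
import codecs

# Characters 254 and 255 from the commented-out lines are the UTF-16 BE BOM.
bom = bytes([254, 255])
print(bom == codecs.BOM_UTF16_BE)  # True

# Hypothetical card content with a two-character Chinese name.
name = "\u4e2d\u6587"
vcard_text = "BEGIN:VCARD\nVERSION:3.0\nFN:" + name + "\nEND:VCARD\n"

# Writing BOM + UTF-16 BE keeps the name intact.
payload = bom + vcard_text.encode("utf-16-be")

# If the text has already been reduced to plain (ASCII-only) text,
# each unrepresentable character becomes a question mark.
lossy = name.encode("ascii", errors="replace")
print(lossy)  # b'??'
```

This suggests the loss would already have happened by the time the string reaches the write command, in which case adding the BOM afterwards could not restore the characters.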

Any workaround to solve this problem?

(And if this is not the right place to ask such question, please just let me know.)

Thanks a lot!

Message was edited by: calvinliu
