
Issues with iTunes scripting API and data encoded as UTF8

I have written some AppleScript which interacts with iTunes through the normal iTunes scripting API. As I understand it, AppleScript nowadays uses Unicode throughout. I am also given to understand that iTunes handles Unicode for things like track tags.


My code queries iTunes to look up tracks based on name, album and artist (these values are stored in variables). This works correctly for all plain ASCII data, but if any of the text in these variables includes non-ASCII data (for example the string 'Élan') encoded as UTF-8, it fails to find a match even though the track is there (searching within iTunes itself finds it okay).


Is this some deficiency in the scripting API? Or do I have to do something special in order to get searches to work as expected when I pass in UTF8 data? I've tried converting to Unicode using 'as Unicode text' but that does not make any difference.


Thanks for any pointers...


Chris

MacBook Pro with Retina display, OS X Yosemite (10.10.3)

Posted on Apr 19, 2015 8:01 AM

1 reply

Apr 20, 2015 11:23 AM in response to ChrisJenkins

Hello


I'm not sure, but I'd first suspect an NFD vs. NFC issue.


HFS Plus names are stored as NFD (Normalisation Form D) UTF-16. (Strictly, it is a variant of NFD which differs from standard NFD in some code ranges.) An application that can convert between Unicode encoding forms (UTF-8, UTF-16, etc.) transparently cannot necessarily convert between normalisation forms (NFC, NFKC, NFD, NFKD) transparently, in which case we need to do the conversion manually.
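To make the difference between the two forms concrete outside AppleScript, here is a minimal Python sketch using the standard `unicodedata` module. It is only an illustration of NFC vs. NFD, not part of the scripts in this thread; the variable names are my own.

```python
import unicodedata

# "éléments" written with escapes so the starting form is unambiguous
# (precomposed é = U+00E9).
text = "\u00e9l\u00e9ments"

nfc = unicodedata.normalize("NFC", text)  # é stays a single code point, U+00E9
nfd = unicodedata.normalize("NFD", text)  # é becomes e (U+0065) + combining acute (U+0301)

# Length in code points, followed by the UTF-8 bytes in hex.
print(len(nfc), nfc.encode("utf-8").hex())
print(len(nfd), nfd.encode("utf-8").hex())

# Python compares strings code point by code point, so the two forms differ.
print(nfc == nfd)
```

Both strings are valid UTF-8 and render identically, which is why converting the encoding alone does not help.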



E.g. 1.


set t to "éléments"
set t1 to NFD(t)
{hexdump_C(t), hexdump_C(t1), t = t1, t's id = t1's id}

(*
on NFD(t)
	do shell script "printf '%s' " & t's quoted form & "| iconv -f UTF-8 -t UTF-8-MAC" without altering line endings
end NFD
*)

on NFD(t)
	do shell script "printf '%s' " & t's quoted form & "| perl -CS -MUnicode::Normalize -ne 'print NFD($_)'" without altering line endings
end NFD

on hexdump_C(t)
	do shell script "printf '%s' " & t's quoted form & "| hexdump -C"
end hexdump_C




As shown above, "éléments" typed directly into (Apple)Script Editor is in NFC (Normalisation Form C). The AppleScript language treats "éléments" in NFD and "éléments" in NFC as equal when they are compared as strings (but not when their ids, i.e. their lists of code points, are compared).


However, a "whose" test in AppleScript is processed by the target application, so whether differences between NFD and NFC are ignored in the test depends solely on each application.
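Since you cannot know in advance which form an application expects, one workaround is to generate both spellings of the search string and try each in turn. The Python sketch below shows the idea only; `candidate_forms` and `same_text` are hypothetical helper names of mine, and I have not tested this approach against iTunes.

```python
import unicodedata

def candidate_forms(text):
    """Return the distinct NFC and NFD spellings of text, NFC first."""
    forms = []
    for name in ("NFC", "NFD"):
        form = unicodedata.normalize(name, text)
        if form not in forms:  # plain ASCII yields the same string twice
            forms.append(form)
    return forms

def same_text(a, b):
    """Compare two strings while ignoring the NFC/NFD difference."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# "Élan" gives two candidates; each would be tried as the search key.
for form in candidate_forms("\u00c9lan"):
    print([hex(ord(ch)) for ch in form])
```

In AppleScript the equivalent would be to run the "whose" test once with the original string and once with the result of the NFD handler shown in the examples.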



E.g. 2.


set t to "éléments"
set t1 to NFD(t)

tell application "Pages" -- Pages v4
	tell (make new document)'s body text
		set its text to t1
		return {words whose it = t, words whose it = t1}
		--> {{}, {"éléments"}} : NFD ≠ NFC in whose test
	end tell
end tell

on NFD(t)
	do shell script "printf '%s' " & t's quoted form & "| perl -CS -MUnicode::Normalize -ne 'print NFD($_)'" without altering line endings
end NFD




E.g. 3.


set t to "éléments"
set t1 to NFD(t)

tell application "TextEdit"
	tell (make new document)
		set its text to t1
		return {words whose it = t, words whose it = t1}
		--> {{"éléments"}, {"éléments"}} : NFD = NFC in whose test
	end tell
end tell

on NFD(t)
	do shell script "printf '%s' " & t's quoted form & "| perl -CS -MUnicode::Normalize -ne 'print NFD($_)'" without altering line endings
end NFD




Tested under OS X 10.6.8. I don't know about recent iTunes.


Hope this may help.

H


EDIT: fixed typo in code
