Unicode & UTF-8 support problems?

I'm trying to view text on a website, but all I get are diamonds with question marks. The site's webmaster is not familiar with Mac/Safari/OS X, but informed me that the text is in a Unicode font and that I should select UTF-8 in Safari > View > Text Encoding. However, that does not resolve the problem. Does anyone know what font I need to have installed in order to view these pages?

sacred-texts.com website

example of what I'm seeing

running Safari 2.0.3

PowerBook g4 800m-p/1g-r/40g-hd & PowerMac g4 d450m-p/512m-r/160g-hd, Mac OS X (10.4.5), Resurrected iBook g3 600m-p/20g-hd/640m-r

Posted on Apr 21, 2006 5:41 PM


Posted on Apr 21, 2006 7:41 PM

Leaf Roller,

Give the Webmaster the W3C Markup Validation Service link which says:
Sorry, I am unable to validate this document because on line 20-44 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.
;~)
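
For anyone who wants to reproduce that kind of check locally, here is a minimal sketch in Python. It is not the W3C validator's code; the function name `invalid_utf8_lines` and the sample bytes are made up for illustration. It simply tries a strict UTF-8 decode of each line and reports the lines that fail.

```python
def invalid_utf8_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines whose bytes are not valid UTF-8."""
    bad = []
    # Splitting on the newline byte is safe: 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")  # strict error handling is the default
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Hypothetical sample: line 2 is well-formed, line 3 is not.
sample = b"plain ascii\n\xe1\xbc\x90 valid\n\xe1\xfc\xd0 invalid\n"
print(invalid_utf8_lines(sample))
```

To check a real page, the bytes could be read with `open(path, "rb").read()` and passed to the same function.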

Apr 23, 2006 11:43 AM in response to Leaf Roller

Yes if you could please email it to me.


Because it is nearly 4MB, I put it on my iDisk. Try downloading it from here. Will mail if no go.

http://homepage.mac.com/thgewecke/fs/FileSharing9.html

So, IE 6 has the ability to display the correct text
even if the files are corrupted? How is that
possible?


Explanation to follow when I make sure I understand it:-)

Apr 23, 2006 12:27 PM in response to Leaf Roller

So, IE 6 has the ability to display the correct text
even if the files are corrupted? How is that
possible?


OK, here goes:

It appears that the UTF-8 parser in Win IE (and I think Win Outlook too) does not check UTF-8 byte sequences for validity, and it also takes shortcuts. For a character that begins with the byte E1, it knows the next two bytes, if valid, should both have the binary form 10xxxxxx. So it ignores the first two bits of each and reads only the last 6. As long as those last 6 bits are correct, the character is displayed. In the case of the very first letter in the Septuagint, U+1F10 Greek small letter epsilon with psili (ἐ), the valid and invalid sequences are as follows:

Invalid (E1) FC D0 > (E1) 11111100 11010000

Valid (E1) BC 90 > (E1) 10111100 10010000
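
The effect can be sketched in a few lines of Python. This is only a model of the shortcut described above, not IE's actual code, and `lenient_decode3` is a hypothetical helper written for illustration. It masks off the top two bits of each trailing byte without checking that they are `10`, so both sequences yield the same code point, while Python's own strict decoder rejects the invalid one.

```python
def lenient_decode3(b0: int, b1: int, b2: int) -> str:
    """Decode a three-byte sequence WITHOUT validating the trailing bytes.

    A strict decoder would first require (b1 & 0xC0) == 0x80 and
    (b2 & 0xC0) == 0x80; this sketch deliberately skips that check.
    """
    code_point = ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)
    return chr(code_point)


valid = bytes([0xE1, 0xBC, 0x90])
invalid = bytes([0xE1, 0xFC, 0xD0])

# Both sequences collapse to U+1F10 once the top two bits are ignored.
assert lenient_decode3(*valid) == "\u1f10"
assert lenient_decode3(*invalid) == "\u1f10"

# A strict decoder accepts only the valid sequence.
assert valid.decode("utf-8") == "\u1f10"
try:
    invalid.decode("utf-8")
    strict_accepts_invalid = True
except UnicodeDecodeError:
    strict_accepts_invalid = False
assert not strict_accepts_invalid
```

The last check corresponds to what a validating browser does when it shows a replacement character (the diamond with a question mark) instead of the letter.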

Apr 23, 2006 4:17 PM in response to Leaf Roller

excuse my lack of knowledge on this, but is that a
good thing or a bad thing?

So basically the files were corrupted and IE 6 had
the ability to ignore the corruption and correctly
display the text?


BAD, if you ask me. Shame on MS.

I never did get to test the code 2000, it seems he's
changed everything on his site back to character
entities.


Code 2000 is only needed if you are running Windows.
