3 Replies Latest reply: Jan 18, 2013 10:43 PM by Hiroto
Blessed91 Level 1 (0 points)

I ran into a problem after exporting my book to EPUB from InDesign CS4. The book uses the Latin diphthong æ. Unfortunately, it didn't translate very well into the EPUB or the corresponding XHTML files. Is there a way to do a batch change using AppleScript to change these symbols √¶ into æ, which is the code for the diphthong? If so, how?

 

I also have an issue with the symbol †; in some of the documents it appears as ‚Ä†. Can I do a batch change for that as well? If so, how?

 

I'm new to using AppleScript so I appreciate all of the help. Thank you so much!!


MacBook Pro, Mac OS X (10.5.8), 3.06 GHz, 8 GB
  • 1. Re: How to find and replace using AppleScript for TextWrangler?
    twtwtw Level 5 (4,690 points)

    I take it you're talking about the standard Unicode characters and not some odd thing from a specialized font, right? This should be fairly easy, either in AppleScript or by using TextWrangler; I just want to make sure that I understand the parameters of the problem. I'm wary of the fact that the EPUB is misrepresenting the characters. That suggests to me that the problem may not be as straightforward as you're presenting it; EPUB ought not to have a problem with standard Unicode.

  • 2. Re: How to find and replace using AppleScript for TextWrangler?
    Tom Gewecke Level 9 (71,700 points)

    It sounds like you may have used the wrong text encoding designation in InDesign.  It should be UTF-8 and not Latin-1 or something else.

  • 3. Re: How to find and replace using AppleScript for TextWrangler?
    Hiroto Level 5 (5,015 points)

    Hello

     

    Here are some observations:

     

    æ = U+00E6
    = <c3 a6> (UTF-8)
    = <c3 a6> (MacRoman) = √¶
    
    † = U+2020
    = <e2 80 a0> (UTF-8)
    = <e2 80 a0> (MacRoman) = ‚Ä†
    

     

    which likely means that your source data is UTF-8 text, but the destination (or a viewer or an intermediate converter) is interpreting it as MacRoman text.
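
The byte-level observation above can be reproduced with a few lines of script. This is only a sketch, written in Python rather than AppleScript; the function names `misread_as_macroman` and `undo_misread` are made up for illustration. It assumes the garbling comes from exactly one UTF-8 to MacRoman misreading, which may not hold for every file.

```python
# Sketch: reproduce the garbling by decoding UTF-8 bytes as MacRoman,
# then reverse it. Function names are hypothetical, for illustration only.

def misread_as_macroman(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode those bytes as MacRoman."""
    return text.encode("utf-8").decode("mac_roman")


def undo_misread(garbled: str) -> str:
    """Recover the original bytes via MacRoman, then decode them as UTF-8."""
    return garbled.encode("mac_roman").decode("utf-8")


for ch in ("æ", "†"):
    garbled = misread_as_macroman(ch)
    restored = undo_misread(garbled)
    print(f"U+{ord(ch):04X} {ch} -> {garbled} -> {restored}")
```

If the XHTML files themselves contain correct UTF-8 and only the viewer misreads them, nothing in the files needs replacing; the reverse step is relevant only when the garbled sequences have actually been saved into the text.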

     

    Make sure you properly declare the encoding of the XHTML or EPUB document as UTF-8.
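
As a rough way to check what a given XHTML file declares, something like the following might help. It is a sketch in Python, not AppleScript; `declared_encoding` is a hypothetical helper, and it uses simple regular expressions rather than a real XML parser, so it can miss unusual markup. The sample string is invented and is not taken from an actual InDesign export.

```python
import re

# Hypothetical helper: return the encoding named in the XML declaration
# or in a <meta> tag, lowercased, or None if neither declares one.
def declared_encoding(xhtml: str):
    # Look for e.g. <?xml version="1.0" encoding="UTF-8"?>
    m = re.search(r'<\?xml[^>]*\bencoding\s*=\s*["\']([^"\']+)["\']', xhtml)
    if m:
        return m.group(1).lower()
    # Look for e.g. <meta http-equiv="Content-Type"
    #                     content="text/html; charset=utf-8" />
    m = re.search(r'<meta[^>]*\bcharset\s*=\s*["\']?([\w-]+)', xhtml, re.I)
    return m.group(1).lower() if m else None


# Invented sample document for illustration.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'
    '</head><body><p>\u00e6 \u2020</p></body></html>'
)

print(declared_encoding(sample))
```

A result other than `utf-8` (or no result at all) would be a hint that the declaration is worth fixing before attempting any find-and-replace.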

     

    Good luck,

    H