Apple Mail (Lion) treats imported sent mails as duplicates!

Impressed by its beauty, usability and "iCloudiness" I was planning to switch my e-mailing from Outlook 2011 to Apple Mail (as I had done before with iCal and Address Book).


My software situation: OS 10.7.2 with Outlook 2011 (or Thunderbird for Mac) as source and Apple Mail 5.1 as target software.


My problem occurs: When I try to import the content of the local Outlook 2011 Sent Mails mailbox (or the local Sent Mail from any other e-mail software) to Apple Mail (even if I import to the Apple Mail mailbox for sent mails).


My problem is: Apple Mail won't show all imported mails separately, as they were under Outlook 2011 or Thunderbird for Mac. Instead it erroneously aggregates most (not all) of them and treats them as duplicates of single huge mails. In most cases these have nothing in common with the mails sorted under them (neither a related conversation nor the same recipients) except for the sender (which for "sent mails" is, of course, always me).


My attempts so far: I read tons of articles that gave the impression that Eudora Mailbox Cleaner would have been the solution, if I didn't have Lion (which no longer runs PowerPC applications). So I installed Thunderbird for Mac instead, with the result that all imported mails (including those in the Sent Mail mailbox) show up there exactly as in Outlook (in other words: the way they should). From there I tried a lot:

1) Using the mailbox export function of Outlook 2011 (dragging mailboxes from the Outlook window onto the Mac desktop) and then importing these mbox files into Apple Mail (>> problem occurs).

2) Using Thunderbird as a bridge and importing the mails from Thunderbird via the Apple Mail import function (>> problem is even worse, all mails show up as one huge mail with one zillion duplicates sorted under it).

3) Using the usual OLM-file export function of Outlook and then using Emailchemy to convert the OLM file into Apple Mail compatible mbox files (>> problem occurs).

4) Using Emailchemy as a virtual local IMAP server holding all imported Outlook mails (cross-platform approach) and then downloading the mails from this "server" to Apple Mail (>> problem occurs).


Interesting fact: If I look in the Apple Mail folder in "Library" I can see all mails there separately as emlx files. So no matter what I do, Apple Mail will aggregate most (not all) of them and then treat them as duplicates of single huge mails (and consequently sort these so-called duplicates under the latter).


PLEASE HELP - I would be very grateful!


With best regards,

EOSMAC

iMac, Mac OS X (10.7.2)

Posted on Oct 22, 2011 9:25 AM


Oct 26, 2011 1:18 AM in response to EOSMAC

Hello


Exporting an mbox file from Entourage / Outlook is as simple as dragging and dropping a mail folder onto the desktop.


But be careful: only the first-level folder is exported, not its sub-folders. Also take care not to export too many emails at a time; from experience, 1000-1500 emails seems to be the maximum.


Before a big export, try verifying your mail database in Entourage / Outlook.


To do that, quit all Office apps first, then hold the ALT key and launch Entourage / Outlook. This launches the Microsoft Database Utility, which offers rebuild, compress and verify options.


HTH

Pierre

Oct 26, 2011 2:21 AM in response to rezamercury

Re


I think the most important thing is to qualify the exported "sent.mbox" file.


In the Sent folder add a sub-folder, move only 100 emails into it, then drop this folder onto the desktop.


Then RE-IMPORT it into Outlook and check whether the import is correct.


The other way is to export message by message, to get simple files to import or read in Mail.app, but that is sooo time-consuming.


Pierre

Oct 26, 2011 4:35 PM in response to durignieux pierre

Hello Pierre,


Thanks for your answers. I'm afraid I have to confirm what rezamercury mentioned: It does not solve the problem if I import the E-Mails in smaller mbox archives. In fact, the problem even occurs if I import every single E-Mail individually. As soon as Apple Mail 5.1 will have more than, say, three or four mails of the "sent mail" kind in one of its mailboxes it will erroneously aggregate most (not all) of them and then treat them as duplicates of single large mails.


With best regards,

EOSMAC

Jan 4, 2012 11:23 AM in response to EOSMAC

I am having the same problem.


I actually set up my iCloud (me.com) account on Outlook 2011, set up the folders identically as in Outlook and copied the emails into the iCloud folders. The duplicate issue definitely occurs when I try to open the iCloud account in Apple Mail. I can't fix it regardless of what I do. I can actually view all the "duplicate" emails by highlighting an email and clicking on "duplicates" in the preview pane. The duplicates appear in the mailbox then. You have to do this for EVERY email that shows "duplicates" in the upper right hand corner of the preview pane. This would be okay if I could get this view to remain once I navigate out of the window, but that is not the case.


I hope I am not being too confusing.


I hope a solution exists soon because I really want to use AppleMail instead of Outlook!

Jan 18, 2012 2:39 PM in response to EOSMAC

Same problem here. I have found that for some reason Outlook creates duplicate email IDs in the header, even on emails separated by years, and then, understandably, the Apple mail client regards them as duplicates. I confirmed this by copying a set of duplicates to a Finder window and then using a text editor to modify the GUIDs so that they are unique. Importing the messages back into the Apple mail client resolved the duplication problem in 2 out of the 3 emails I edited (I never did figure out why the one remaining mail was still regarded as a duplicate).


So, for some duplicate emails the cause is in the duplicate ID. But given that I have hundreds, maybe over a thousand, emails that are incorrectly tagged as duplicates, this manual approach is not feasible. It seems like there should be some way to force the Apple email client to ignore duplicates. Perhaps there is a setting in a config file somewhere, but I have not found such a file yet.


Another option would be to write some kind of script that traversed all your mailboxes and replaced the ID in the header of each email with some other number that is guaranteed to be unique. If the email is in a SQL based database, it might be possible to do this with a SQL query if you can figure out the table structure.
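
A first step for such a script could be simply finding out which IDs are repeated. The following is only a minimal sketch, assuming an exported mbox file and Python's standard mailbox module; the function name duplicate_message_ids and the file name in the usage line are made up for illustration.

```python
import mailbox
from collections import Counter


def duplicate_message_ids(mbox_path):
    """Return {message_id: count} for every Message-ID used more than once."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        # Header lookup is case-insensitive, so "Message-Id" is found too.
        counts = Counter(
            str(msg.get("Message-ID") or "").strip() for msg in box
        )
    finally:
        box.close()
    counts.pop("", None)  # messages without any Message-ID are not counted
    return {mid: n for mid, n in counts.items() if n > 1}
```

Calling duplicate_message_ids("Sent.mbox") on an export would then list the IDs that Apple Mail is presumably collapsing, together with how often each one occurs.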

Jan 20, 2012 10:37 AM in response to EOSMAC

Hi all,


EOSMAC contacted me by email and drew my attention to this thread as he had seen a post of mine on another forum regarding Outlook and message id's.


Even though I'm not really concerned by this issue and my post on the other forum had nothing to do with your problems, I got kind of curious and did a couple of tests myself.


It looks, as mentioned by FiberBoy, that Apple Mail does duplicate tests based only on the Message-ID (which seems, by the way, a correct approach as those Id's are supposed to be unique).


Also, as suggested by FiberBoy, the easy solution here seems to be replacing the message ids by unique ones.


Unfortunately you can't do it within Outlook or Mail, as those IDs are read-only to AppleScript.


To not make it a manual and very time consuming task, one would have to automate this. I've done a little test, based only on a couple of messages, and it seems to work. So, before going back to my own business, here's what I did :


1) Export an Outlook folder to an mbox file (drag the folder onto the desktop)

2) Open that file with TextWrangler (I used this as it is scriptable and free)

3) Write a little AppleScript that does a find / replace on the Message-Id lines in that file and run it

4) Save the modified file

5) Import to Apple Mail


Here's the script I wrote to do the find / replace :


tell application "TextWrangler"
    tell front text window

        set theLines to every line of text document 1

        -- Modify the following 2 lines to your needs
        set personalString to "MYSTRING" -- only the first 6 characters are used
        set domainSufix to "@me.com>"

        set personalString to text 1 thru 6 of personalString

        -- Date/time stamp, taken once when the script starts
        set now to (current date)
        set y to my sFormat(year of now, 4) as string
        set m to my sFormat(((month of now) * 1), 2) as string
        set d to my sFormat(day of now, 2) as string
        set HH to my sFormat(hours of now, 2) as string
        set mm to my sFormat(minutes of now, 2) as string
        set secs to my sFormat(seconds of now, 4) as string

        set theNewMsgID to "Message-Id: <" & y & m & d & "-" & HH & mm & "-" & secs & "-" & secs & "-" & personalString

        set counter to 1

        repeat with theLine in theLines

            set findResult to (find "Message-ID:" searching in theLine with selecting match)

            if (found of findResult) then
                -- Replace the whole line with the new, numbered Message-Id
                set text of theLine to theNewMsgID & (my sFormat(counter, 6) as string) & domainSufix
                set counter to counter + 1
            end if
        end repeat

    end tell
end tell

on sFormat(n, w)
    -- provides leading zeros to get total width w
    return text -w thru -1 of ("0000000" & n)
end sFormat



What it does :

It constructs a new Message-Id as follows :

YYYYMMDD-HHmm-secs-secs-PersonalString+Counter@domainSuffix


The Date/Time part is generated at the beginning of the script (which should ensure a unique ID even if more than one mbox file is modified). The counter is incremented for each find / replace (ensuring a unique ID for each message in the mbox file).


The PersonalString is an additional safeguard to guarantee a unique ID.


-- PersonalString and domainSuffix can be modified at the beginning of the script.
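
For readers who prefer to see the ID layout outside AppleScript, here is a rough Python sketch of the same scheme. The function name build_message_id and its parameters are hypothetical; it mirrors the script above, including the seconds field padded to 4 digits and used twice, and the personal string cut to 6 characters.

```python
from datetime import datetime


def build_message_id(stamp, personal, counter, domain="me.com"):
    """Build an ID shaped like YYYYMMDD-HHmm-secs-secs-Personal+Counter@domain."""
    secs = f"{stamp.second:04d}"  # the AppleScript pads seconds to width 4
    return (
        f"<{stamp:%Y%m%d}-{stamp:%H%M}-{secs}-{secs}-"
        f"{personal[:6]}{counter:06d}@{domain}>"
    )


# One time stamp per run, one counter value per message.
print(build_message_id(datetime(2012, 1, 20, 10, 37, 5), "MYSTRING", 1))
# prints <20120120-1037-0005-0005-MYSTRI000001@me.com>
```

Since the time stamp is fixed for the whole run, the counter is what tells two messages from the same file apart.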


How to run it :

Open the mbox file in TextWrangler (only one file at a time, the script only processes the first file)

Copy the script to AppleScript Editor and run it


However, I haven't tried it on heavy files, so I don't know how it performs. You might want to test it on medium-size files first to see how it performs. A stand-alone application would obviously be a much smarter choice but I didn't have the time for that. Also, I haven't read through the specifications of Message-IDs, so the generated IDs might not be "standard".


I don't know if this can help you out, just let me know.



Take care

Lutz

Jan 20, 2012 6:25 PM in response to lume96

Dear Lutz,


Thanks a million for your fantastic efforts!

It seems to me that the bug has been identified: erroneously identical Message-IDs. As far as I remember, MS Outlook does not (or at least did not always) assign Message-IDs to sent mails itself but rather lets the mail server or the recipient assign them. This would somehow explain why the "imaginary duplicate bug" occurs mostly with mails in the sent mailbox.


I followed your instructions step by step. Unfortunately, however, TextWrangler hangs when handling an mbox file of e.g. 90 MB (which is, in fact, the smallest of my Outlook mailboxes).


Thanks again to you and to FiberBoy for the excellent impulse - it seems like we're getting closer to a solution.

With best regards,

EOSMAC

Jan 28, 2012 3:29 AM in response to EOSMAC

Dear Lutz,


Thanks for your reply.


I followed your procedure and got the following error:



tell application "TextWrangler"

get every line of text document 1 of text window 1

Result:

error "TextWrangler got an error: AppleEvent timed out." number -1712



I'm using TextWrangler 3.5.3, and I'm not sure whether this problem is caused by the large mbox file or by the "tell application and window" method. I tried the script with BBEdit and the result was the same. I'm not very experienced with AppleScript.


I would appreciate your reply about this problem. Also, I'm not a native English speaker, so please excuse any mistakes in my comment.


Best Regards,

Reza

Jan 28, 2012 9:49 PM in response to EOSMAC

Hi,


Well, I guess I was a bit too optimistic about the performance of TextWrangler and / or AppleScript on very large text files.


I'm not directly confronted with this problem but got "drawn here" by EOSMAC, who contacted me by mail. So I never tried it, other than on small test files.


As mentioned in my original post, the best solution would obviously be a little stand-alone tool that does the find / replace of duplicate Message-IDs. What it should do is :

1) replace ONLY the duplicate Message-IDs (not like my test script that replaces all IDs)

2) check for the presence of a Message-ID in every mail and add one if not present (according to EOSMAC's last reply, there might be messages with no Message-ID at all).
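
The two requirements above could be sketched roughly as follows. This is an untested-on-real-exports outline rather than a finished tool, assuming Python's standard mailbox module and email.utils.make_msgid; the function name fix_message_ids, the output path and the placeholder domain "example.invalid" are made up for illustration. It writes a new mbox file and leaves the original export untouched.

```python
import mailbox
from email.utils import make_msgid


def fix_message_ids(src_path, dst_path, domain="example.invalid"):
    """Copy src mbox to dst, replacing repeated or missing Message-IDs.

    The first message carrying a given ID keeps it. Returns the number of
    messages that received a new ID. dst_path should not exist yet,
    otherwise the messages are appended to it.
    """
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    seen = set()
    changed = 0
    try:
        for msg in src:
            msg_id = str(msg.get("Message-ID") or "").strip()
            if not msg_id or msg_id in seen:
                # Duplicate or missing: drop any old header, add a fresh one.
                del msg["Message-ID"]
                msg_id = make_msgid(domain=domain)
                msg["Message-ID"] = msg_id
                changed += 1
            seen.add(msg_id)
            dst.add(msg)
        dst.flush()
    finally:
        src.close()
        dst.close()
    return changed
```

Something like fix_message_ids("Sent.mbox", "Sent-fixed.mbox") would then produce the file to import into Apple Mail. Whether Mail really compares nothing but the Message-ID is still only the assumption from the tests earlier in this thread.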


Unfortunately the only dev I do right now is for my company's needs on Windows (C# .NET) and I haven't found the time to look at Xcode, so I can't be of much help here. Sorry.

Feb 26, 2012 10:29 AM in response to lume96

This is a wonderful discovery. Hopefully Apple can fix this but I'm not holding my breath waiting for it. 😉


Instead of using AppleScript, I would recommend writing a Python script using the mailbox library:


http://docs.python.org/library/mailbox.html#mailbox.Mailbox


http://onlamp.com/pub/a/python/2007/06/28/processing-mailbox-files-with-mailboxpy.html


When I get some free time, I'll try to whip something up but if anyone else has some spare time to write a Python script to assign unique Message-IDs to emails in an mbox file, please share when you're done!
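
In the meantime, here is a rough sketch of what such a script might look like for big exports like the 90 MB file mentioned above. It reads the mbox file line by line instead of loading it into an editor, and, like the AppleScript earlier in the thread, gives every message a new ID. The function name rewrite_ids_streaming and the placeholder domain are made up, and it assumes a well-formed mbox in which body lines starting with "From " are escaped as ">From ".

```python
from email.utils import make_msgid


def rewrite_ids_streaming(src_path, dst_path, domain="example.invalid"):
    """Stream src mbox to dst, replacing every Message-ID header.

    Only header blocks are touched; text in message bodies is copied as is.
    Returns the number of headers replaced.
    """
    replaced = 0
    in_headers = False
    skipping = False  # True while dropping folded lines of an old ID
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            if skipping and line[:1] in (b" ", b"\t"):
                continue  # continuation of the header just replaced
            skipping = False
            if line.startswith(b"From "):
                in_headers = True  # mbox separator starts a new message
            elif in_headers and line in (b"\n", b"\r\n"):
                in_headers = False  # blank line ends the header block
            elif in_headers and line.lower().startswith(b"message-id:"):
                eol = b"\r\n" if line.endswith(b"\r\n") else b"\n"
                new_id = make_msgid(domain=domain).encode("ascii")
                line = b"Message-ID: " + new_id + eol
                replaced += 1
                skipping = True
            dst.write(line)
    return replaced
```

Messages that have no Message-ID header at all are left alone by this version, so it covers only the "replace" half of the wish list above.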
