AppleScript to rename EML files

I have exported all my emails to a folder in ".eml" format.


I want to rename all those files based on the "date received" and "sender's email address" information within each EML file. How can I do this?


Example: "20171206-123254 - example@domainname.com -.eml"


--

Kindest Regards,

Paulo Guedes

Wednesday, 6 of December, 2017 - 22h:24m [+0000]

MacBook Pro with Retina display, macOS High Sierra (10.13.1)

Posted on Dec 6, 2017 2:29 PM

Question marked as Top-ranking reply

Posted on Dec 8, 2017 6:48 PM

One final version before I stop fretting at this. If a sender sends you more than one message at exactly the same time (it happened to me with JustGiving), this adds a 2-digit random number to the end of the file name. It also handles dates that have no leading "Mon..." weekday better, and makes better use of text item delimiters:



set the_files to (choose file of type "eml" with prompt "Choose the emails you want to rename:" with multiple selections allowed)

repeat with each_file in the_files
    set the_path to quoted form of POSIX path of each_file
    set the_sender to (do shell script "grep ^\"From: \" " & the_path)
    set the_date to (do shell script "grep ^\"Date: \" " & the_path)

    set AppleScript's text item delimiters to {"From:"}
    set the_sender to text item 2 of the_sender as string

    set AppleScript's text item delimiters to {"Date: ", "("}
    set date_string to text item 2 of the_date

    try
        set ndformat to (do shell script "date -j -f '%a, %d %b %Y %X %z' " & quoted form of date_string & " +'%Y%m%d - %H%M%S'")
    on error
        try
            set ndformat to (do shell script "date -j -f '%d %b %Y %X %z' " & quoted form of date_string & " +'%Y%m%d - %H%M%S'")
        end try
    end try
    set file_name to ndformat & " " & the_sender & ".eml"

    tell application "Finder"
        activate
        try
            set name of each_file to file_name
        on error number errnum
            if errnum = -48 then -- identical sender, time & date to a previously processed message results in an identical file name
                set the_rand to random number from 10 to 99
                set file_name to ndformat & " " & the_sender & the_rand & ".eml"
                set name of each_file to file_name
            end if
        end try
    end tell
end repeat
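
A two-digit random suffix can itself repeat when several messages share the same sender and timestamp. As a hedged alternative, and in Python rather than AppleScript, the sketch below tries a numbered suffix until the name is free. The function name `unique_name` and the sample stem are made up for this illustration; this is not the approach used in the script above.

```python
import os
import tempfile


def unique_name(folder, stem, ext=".eml"):
    """Return a file name not yet present in folder, adding -01, -02, ... if needed."""
    candidate = stem + ext
    counter = 1
    while os.path.exists(os.path.join(folder, candidate)):
        candidate = f"{stem}-{counter:02d}{ext}"
        counter += 1
    return candidate


# Demonstration in a throwaway folder so no real files are touched.
with tempfile.TemporaryDirectory() as folder:
    stem = "20171206 - 123254 sender@example.com"  # hypothetical stem
    first = unique_name(folder, stem)
    open(os.path.join(folder, first), "w").close()
    second = unique_name(folder, stem)
    print(first)
    print(second)
```

The second call sees the file created after the first call and therefore returns the stem with "-01" appended.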

45 replies

Dec 9, 2017 9:31 AM in response to HD

Hello HD,


Thank You so much


I just tested it, but it is not working. I will explain with a real example.

This is the result of your last script: "20171209 - 150200 Twitter <info@twitter.com>xyxy.eml"

"Twitter <info@twitter.com>xyxy" is not an email address. The email address is only "info@twitter.com".

So in this example, the result should be "[20171209-150200][info@twitter.com].eml"

Also, is it possible to have the "date received" in GMT instead of local time?


--

Kindest Regards,

Paulo

Saturday, 9 of December, 2017 - 17h:29m [+0000]
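
As a hedged illustration of the two requests in this post (the bare address only, and GMT instead of local time), here is a minimal Python sketch rather than AppleScript. It relies only on the standard library's `email.utils`; the function name `bare_sender_and_utc_stamp` and the sample header values are assumptions made up for the example.

```python
from datetime import timezone
from email.utils import parseaddr, parsedate_to_datetime


def bare_sender_and_utc_stamp(from_header, date_header):
    """Return (address, 'YYYYmmdd-HHMMSS' in UTC) from raw header values."""
    # parseaddr splits 'Twitter <info@twitter.com>' into (display name, address).
    _display_name, address = parseaddr(from_header)
    # With a numeric offset in the header this is a timezone-aware datetime.
    # Without one it is naive, and astimezone() would assume local time.
    sent = parsedate_to_datetime(date_header)
    stamp = sent.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return address, stamp


address, stamp = bare_sender_and_utc_stamp(
    "Twitter <info@twitter.com>", "Sat, 09 Dec 2017 10:02:00 -0500"
)
print(f"[{stamp}][{address}].eml")
```

Here 10:02:00 at -0500 becomes 15:02:00 in UTC, so the printed name is "[20171209-150200][info@twitter.com].eml".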

Dec 9, 2017 10:54 AM in response to HD

Thank You so much


Can you put the square brackets like in my example and remove the spaces in the date/time?


Your output = 20171209 - 150200 info@twitter.com.eml


What I would like to have = [20171209-150200][info@twitter.com].eml


--

Kindest Regards,

Paulo Guedes

Saturday, 9 of December, 2017 - 18h:52m [+0000]

Dec 9, 2017 11:09 AM in response to HD

Thank You so much. It is pretty much what I need. Are there any limits on how many EML files I can process at once with this AppleScript?


One more thing: your "the_date" returns "Date: Fri, 08 Dec 2017 08:15:10 +0000". This "+0000" is the time zone. So how can I put the time zone in the name, like this: "[20171209-150200][+0000][info@twitter.com].eml"? If this is a big problem, leave it.


--

Kindest Regards,

Paulo

Saturday, 9 of December, 2017 - 19h:09m [+0000]

Dec 8, 2017 5:22 AM in response to HD

RFC 2822 (Internet Message Format, pp. 7-8) allows up to 998 characters per header line, although lines should stay within 78 characters to accommodate the majority of mail clients. Line "folding" is permitted on longer entries: a CRLF is inserted before whitespace, which turns the entry into a multi-line header. Mail clients are supposed to perform "unfolding" by returning the entry to a single line of text.
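
To make the folding rule concrete, here is a minimal Python sketch of unfolding, offered as an illustration only. The function name `unfold_headers` and the sample header text are made up for the example.

```python
import re


def unfold_headers(raw_headers):
    """Join folded header lines.

    A line break (CRLF, or a bare LF in files saved with Unix line endings)
    that is immediately followed by a space or tab marks a continuation of
    the previous header line, so the break is removed.
    """
    return re.sub(r"\r?\n(?=[ \t])", "", raw_headers)


folded = (
    "Return-Path:\r\n"
    " <bounce@example.com>\r\n"
    "Date: Fri, 08 Dec 2017 08:15:10 +0000\r\n"
)
unfolded = unfold_headers(folded)
print(unfolded.splitlines())
```

After unfolding, the Return-Path address sits on the same line as its field name, so a line-oriented search such as grep can find it.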


The maximum filename length in macOS APFS and HFS+ filesystems is 255 characters. This poses a problem for indiscriminately appending the Return-Path line content to the reformatted date string of the renamed .eml files. In randomly checking some 15 Return-Path entries here, I found one that was 95 characters long. The other concern is that these can include characters that are repugnant to the Finder (:, nul) and Bash shells.


The OP may want to rethink the original renaming criteria to fit into filename character capacity.
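
One hedged way to keep constructed names within the limit is to clean and truncate the stem before adding the extension. The Python sketch below is an illustration only. The function name `safe_eml_name` is made up, the choice of "_" as the replacement character is arbitrary, and treating the limit as 255 UTF-8 bytes is an assumption of this sketch (the post only says 255 characters).

```python
def safe_eml_name(stem, max_bytes=255, ext=".eml"):
    """Replace troublesome characters and cap the total name length."""
    # ':' and '/' act as path separators on macOS; NUL and the other
    # control characters are replaced as well.
    cleaned = "".join(
        "_" if ch in ":/" or ord(ch) < 32 else ch for ch in stem
    )
    # Truncate the stem so that stem + extension fits in max_bytes of UTF-8.
    budget = max_bytes - len(ext.encode("utf-8"))
    encoded = cleaned.encode("utf-8")[:budget]
    # errors="ignore" drops a multi-byte character cut in half by the slice.
    return encoded.decode("utf-8", errors="ignore") + ext


long_name = safe_eml_name("[20171201-101855][" + "a" * 300 + "@example.com]")
print(len(long_name.encode("utf-8")))
print(safe_eml_name("a:b/c\x00d"))
```

Truncation can cut off the closing bracket or part of the address, so a real script would need to decide which part of the name is allowed to shrink.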

Dec 9, 2017 11:19 AM in response to HD

The following delivers the end file format. It still does not address a folded Return-Path, where the actual return address is on a following line. It cannot solve constructed filenames that exceed 255 characters, the macOS limit. AppleScript mandatory escapes do not make this more legible. Tested on El Capitan 10.11.6.


set foobar to read (choose file of type "eml" with prompt "Choose an eml file:" without multiple selections allowed)

set captures to paragraphs of (do shell script "perl -ne 'print \"$1\\n$2\" if /.*Return-Path:\\s<(.*?)>|.*Date:\\s*(.*?)\\s\\(/;' <<<" & foobar's quoted form)
log captures

set ndformat to (do shell script "date -j -f \"%a, %d %b %Y %X %z\" " & (last item of captures)'s quoted form & " +\"%Y%m%d-%H%M%S\"")

set newfilename to ndformat & "-" & (first item of captures) & ".eml"
log newfilename
return

Result:

"20171201-101855-no-reply@notifications.skype.com.eml"

Dec 9, 2017 12:59 PM in response to pguedes

Here you go - with some amendments to catch email addresses not enclosed in diamond brackets.


set the_files to (choose file of type "eml" with prompt "Choose the emails you want to rename:" with multiple selections allowed)

repeat with each_file in the_files
    set the_path to quoted form of POSIX path of each_file
    set the_sender to (do shell script "grep ^\"From: \" " & the_path)
    set the_date to (do shell script "grep -i ^\"Date: \" " & the_path)

    set AppleScript's text item delimiters to {"From: "}
    set the_sender to text item 2 of the_sender as string

    set AppleScript's text item delimiters to {"<", ">"}
    try
        set the_sender to text item 2 of the_sender
    on error
        set AppleScript's text item delimiters to "From: "
        set the_sender to text item 1 of the_sender
    end try

    set the_sender to "[" & the_sender & "]"

    set AppleScript's text item delimiters to {"Date: ", "("}
    set date_string to text item 2 of the_date
    set time_diff to "[" & (text -5 thru -1 of date_string) & "]"

    try
        set ndformat to (do shell script "date -j -f '%a, %d %b %Y %X %z' " & quoted form of date_string & " +'[%Y%m%d-%H%M%S]'")
    on error
        try
            set ndformat to (do shell script "date -j -f '%d %b %Y %X %z' " & quoted form of date_string & " +'[%Y%m%d-%H%M%S]'")
        end try
    end try

    set file_name to ndformat & time_diff & the_sender & ".eml"

    tell application "Finder"
        activate
        try
            set name of each_file to file_name
        on error number errnum
            if errnum = -48 then -- identical time & date
                set the_rand to random number from 10 to 99
                set file_name to ndformat & time_diff & the_sender & the_rand & ".eml"
                set name of each_file to file_name
            end if
        end try
    end tell
end repeat



I'm still finding occasional oddities, including email addresses not enclosed in diamond bracket characters in the From: header. I have run it successfully on 170 .eml files. There may still be odd configurations or misconfigurations in email headers that will cause it to throw errors, or return inconsistencies in the time zones.
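
For the From: headers with and without diamond brackets mentioned above, a header-aware parser avoids the separate cases. This is a hedged Python sketch, not part of the AppleScript; the sample header values are made up, and `parseaddr` is the standard library function from `email.utils`.

```python
from email.utils import parseaddr

from_values = [
    "Twitter <info@twitter.com>",      # display name plus bracketed address
    "info@twitter.com",                # bare address, no brackets
    '"Doe, Jane" <jane@example.com>',  # quoted display name containing a comma
]

# parseaddr returns (display name, address); only the address is kept here.
addresses = [parseaddr(value)[1] for value in from_values]
for found in addresses:
    print("[" + found + "]")
```

Headers that are folded or otherwise malformed may still need extra handling before they reach a parser like this.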

Dec 10, 2017 7:23 AM in response to HD

I tested ten different emails saved out to .eml format. This was a mixture of domestic and internationally sourced emails from different ISPs. I didn't find any Date field missing the weekday, or I would likely have provided a means to handle it. Your solution would do the job, unless the Date field contains other non-standard anomalies.
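
As a hedged illustration of the Date variants discussed in this thread (with weekday, without weekday, and with a trailing zone name in parentheses), the Python sketch below feeds all three to the standard library's `parsedate_to_datetime`. The sample values are made up or taken from the examples in this thread, and the output keeps the original wall-clock time and offset rather than converting to local time.

```python
from email.utils import parsedate_to_datetime

samples = [
    "Mon, 30 Oct 2017 06:43:23 -0500",       # weekday and offset
    "30 Oct 2017 06:43:23 -0500",            # no weekday
    "Fri, 8 Dec 2017 10:27:37 -0500 (EST)",  # trailing zone comment
]

# Format each parsed date as [YYYYmmdd-HHMMSS][offset].
stamps = [
    parsedate_to_datetime(value).strftime("[%Y%m%d-%H%M%S][%z]")
    for value in samples
]
for stamp in stamps:
    print(stamp)
```

A single parser call covers the cases that the scripts in this thread handle with separate `date -j -f` format strings.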

Dec 12, 2017 4:31 PM in response to pguedes

Here is the fifth revision of the AppleScript to parse your .eml files. The principal fix is handling multiple CRLF line breaks (folding) in the Return-Path: content, including CRLF within the angle brackets. I have spent most of the afternoon modifying an .eml and testing different regular expressions to get this to work with my sample of .eml files.


With the data files that I have to work with, the code runs multiple times without any error dialog (and no -1712 Apple Event timeout).


Copy the entire (scrollable) code below into the Script Editor and save it.


--emlname.applescript
-- Process exported .eml files from selected folder and construct new filename
-- format from captured Return-Path and Date fields. Rename each file to new
-- filename construction. Original TZ offset will be omitted if not in Date field:
-- [YYYYmmdd-HHMMSS][original TZ offset][contents of return path].eml
-- Version 5, Multiple, folded Return-Path lines are now handled with improved RE syntax.
-- VikingOSX, 2017-12-12, Apple Support Communities

property isDesktop : (path to desktop as text) as alias

set dow to {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}
-- use pm characters to see if timezone offset is in captured Date string
set pm to {"+", "-"}

-- Perl regular expressions
-- allow for RFC2822 folded return paths if present
-- set rpath_re to "/^Return-[Pp]ath:\\R?\\s*/m;' "
set rpath_re to " /^Return-[Pp]ath:\\R?\\s*/ms;' -ne '$val = $1;$val =~ s/\\R\\s*?//g;print \"$val\";' "
-- capture date through +- timezone offset, but exclude trailing (ZONE) if present
set fdate_re to "/^Date:\\s*([A-Za-z0-9,+-: ]+)\\s?.*$/;' "

-- UNIX BSD Date strftime settings for parsing the captured Date string
-- legend: dow = day of week, tz = time zone, n = no
-- Reference: strftime(3) : https://www.freebsd.org/cgi/man.cgi?query=strftime&sektion=3
set date_parse_dow_ntz to quoted form of "%a, %d %b %Y %X " -- Mon, 30 Oct 2017 06:43:23
set date_parse_dow_tz to quoted form of "%a, %d %b %Y %X %z " -- Mon, 30 Oct 2017 06:43:23 -0500
set date_parse_ndow_tz to quoted form of "%d %b %Y %X %z " -- 30 Oct 2017 06:43:23 -0500
set date_parse_ndow_ntz to quoted form of "%d %b %Y %X " -- 30 Oct 2017 06:43:23
set date_format to quoted form of "+[%Y%m%d-%H%M%S]"

set emlFolder to (choose folder with prompt "Choose a folder containing raw email (.eml) files:" default location isDesktop without invisibles, multiple selections allowed and showing package contents)

try
    tell application "Finder"
        set eml_list to (every item in folder emlFolder whose kind contains "Email Message" and name extension contains "eml") as alias list
    end tell
    if length of eml_list = 0 then return

    -- scan body of email looking for return path and Date fields to capture
    repeat with anItem in eml_list
        set x to (POSIX path of (anItem as alias))'s quoted form
        set rpath to (do shell script "perl -0777 -ne '$_ =~ " & rpath_re & x)
        set fdate to (do shell script "perl -ne 'print \"$1\" if " & fdate_re & x)

        -- store the original TZ offset in variable pzf for later filename inclusion
        -- Does this fdate capture have the +- TZ prefix offset? If so, return the
        -- starting character location in the fdate string.
        set ispm to (do shell script "ruby -e 'puts ARGV.first =~ /\\+|\\-/' " & fdate's quoted form) as integer
        -- snap out the timezone offset if it is present
        if ispm > 0 then
            set pz to text (ispm + 1) thru -1 of (fdate as text)
            -- trim leading/trailing white-space of the timezone offset
            set pzf to do shell script "xargs <<< " & pz
        else
            set pzf to ""
        end if

        -- we want to convert the output date string with the correct parser
        -- so make determination in this decision tree.
        if plus_minus(fdate, pm) and dow contains word 1 of fdate then -- have dow and TZ
            set parser to date_parse_dow_tz
        else if plus_minus(fdate, pm) and dow does not contain word 1 of fdate then -- no dow but TZ
            set parser to date_parse_ndow_tz
        else if not plus_minus(fdate, pm) and dow contains word 1 of fdate then -- have dow but no TZ
            set parser to date_parse_dow_ntz
        else if not plus_minus(fdate, pm) and dow does not contain word 1 of fdate then -- have no dow or Tz
            set parser to date_parse_ndow_ntz
        end if

        -- automatically convert the captured date into users localtime, and output in
        -- specified formatting.
        set ndformat to (do shell script "date -j -f " & parser & space & fdate's quoted form & space & date_format)

        set RP to "" -- return path
        if not rpath = "" then
            set RP to " [" & rpath & "]"
        end if
        set TZOFFSET to "" -- TZ offset if any
        if not pzf = "" then
            set TZOFFSET to " [" & pzf & "]"
        end if
        set newfilename to ndformat & TZOFFSET & RP & ".eml"
        if length of newfilename > 255 then
            display alert "Filename exceeds 255 character maximum for filesystem" giving up after 10
        end if

        -- rename original filename to new
        -- log (emlFolder & newfilename) as text as alias
        tell application "Finder"
            if not (exists item ((emlFolder as text) & newfilename)) = true then
                set anItem's name to newfilename
            end if
        end tell
    end repeat
on error errmsg number errnbr
    my error_handler(errnbr, errmsg)
end try
return

on plus_minus(adatestr, alist)
    -- loop through Date string to see if +- characters indicate timezone offset
    repeat with achar in adatestr
        if achar is in alist then return true
    end repeat
    return false
end plus_minus

on error_handler(nbr, msg)
    return display alert "[ " & nbr & " ] " & msg as critical giving up after 10
end error_handler

Dec 9, 2017 2:05 PM in response to HD

The time zone sometimes works and sometimes doesn't. Here are two examples where the time zone is wrong:


"Date: Fri, 8 Dec 2017 10:27:37 -0500 (EST)"

[20171208-152737][0500 ][Bennn@linkedselling.com].eml


"Date: Sat, 09 Dec 2017 16:49:16 +0000 (UTC)"

[20171209-164916][0000 ][aandreww@mykeysmart.com].eml


--

Kindest Regards,

Paulo

Saturday, 9 of December, 2017 - 22h:04m [+0000]
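
In both examples above the Date line ends with a zone name in parentheses. The script splits the line at "(" and keeps the last five characters (`text -5 thru -1`), which leaves a space after the digits and pushes the sign out of range, hence "0500 " and "0000 ". As a hedged Python sketch of a more tolerant extraction, the code below searches for a sign followed by four digits anywhere in the header. The names `OFFSET_RE` and `tz_offset` are made up for this illustration.

```python
import re

# A numeric zone is a sign followed by exactly four digits, e.g. -0500.
# It must start after whitespace (or at the start) and not run into more digits.
OFFSET_RE = re.compile(r"(?<!\S)([+-]\d{4})(?!\d)")


def tz_offset(date_header):
    """Return the numeric offset (e.g. '-0500'), or '' if the header has none."""
    match = OFFSET_RE.search(date_header)
    return match.group(1) if match else ""


headers = [
    "Date: Fri, 8 Dec 2017 10:27:37 -0500 (EST)",
    "Date: Sat, 09 Dec 2017 16:49:16 +0000 (UTC)",
    "Date: Fri, 08 Dec 2017 08:15:10",  # no offset at all
]
offsets = [tz_offset(header) for header in headers]
print(offsets)
```

With this approach the first two headers give "-0500" and "+0000", and a header without an offset gives an empty string that can be left out of the file name.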
