
How can I use Automator to extract specific Data from a text file?

I have several hundred text files that contain a bunch of information. I only need six values from each file, ideally as columns in an Excel file.


How can I use Automator to extract specific data from the text files and create either a new text file or an Excel file with the info? I have looked all over but can't find a solution. If anyone could help, I would be eternally grateful! If there is a better solution than Automator, please let me know!


Example of File Contents:



Link Time =DD/MMM/YYYY
RandomText

161 179 bytes of CODE memory (+ 68 range fill )
16 789 bytes of DATA memory (+ 59 absolute )
1 875 bytes of XDATA memory (+ 1 855 absolute )
90 783 bytes of FARCODE memory


What I would like to have as a final file:


EXCEL COLUMN1   Column 2    Column3   Column4   Column5   Column6
MM/DD/YYYY      filename1   161179    16789     1875      90783
MM/DD/YYYY      filename2   xxxxx     xxxxx     xxxxx     xxxxx
MM/DD/YYYY      filename3   xxxxx     xxxxx     xxxxx     xxxxx


Is this possible? I can't imagine having to go through each and every file one by one. Please help!!!

Automator-OTHER

Posted on Oct 11, 2013 7:50 PM

27 replies

Oct 13, 2013 6:43 AM in response to ChrisAbreu

I don't know much about Automator, but it's certainly possible to do what you want with AppleScript.


I have a couple of questions:


1. Are all the files located in the same folder? How are they named?


2. From the layout of the example file contents, there is some ambiguity about what they actually contain, for example: is "DD/MMM/YYYY" literally that, or does it contain some (useful) date string?


3. By the way the data are set out, it's unclear what separators are used: is it just spaces or a mix of tabs and spaces?


4. Is this all the file contains, or just the first few lines?


If it's easier, you can get my email from my profile and send me one of the files to answer Q2-Q4!

Oct 13, 2013 6:10 PM in response to Arkouda

Hi Bernard,


Thanks for your help on this.


1. All the files are in the same folder but they don't seem to follow the same naming convention.


Some files are:


LCRC_0.28.0_QC.map (in this format)


While others are like:


FP_LXD_0.14.221_QC.map

FP_ESME_0.55.4_QC.map


2. The date line contains an actual date; I wrote it that way to show its format.


3. Only spaces are used


4. The files contain A LOT of lines, these are the only ones that matter to me.


Thank you so much for your help! I will email you separately!

Oct 14, 2013 6:58 AM in response to ChrisAbreu

Here's a bash script that will work on the data that you posted.

Change ~/Downloads/* to the directory with the files to process.

A text file named Report.txt will be on your Desktop.

(you can wrap this in Automator (Run Shell Script Action) if you want):


#!/bin/bash

Report=~/Desktop/Report.txt

for f in ~/Downloads/*
do
     if [ -f "$f" ] ; then
          # start a new row for this file
          echo >> "$Report"
          # date: grab DD/MM/YYYY and swap it to MM/DD/YYYY
          /usr/bin/grep -E -o '[0-9]{2}/[0-9]{2}/[0-9]{4}' "$f" | sed 's/\([0-9][0-9]\/\)\([0-9][0-9]\/*\)/\2\1/' | tr '\n' ' ' >> "$Report"
          # file name without the leading path
          echo -n "${f##*/} " >> "$Report"
          # sizes: keep the text up to "bytes", then strip everything but the digits
          /usr/bin/grep -E -o '^[[:blank:][:digit:]].*bytes' "$f" | sed 's/[^0-9]*//g' | tr '\n' ' ' >> "$Report"
     fi
done



Note: I assumed that DD/MMM/YYYY was a typo and that you meant DD/MM/YYYY.

If this is not the case, please clarify and I can adjust the script.
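
If the dates really are DD/MMM/YYYY with a three-letter English month (for example 11/Oct/2013, which is only a guess since no real date was posted), the grep and sed patterns above would not match. A minimal sketch of a helper for that case; to_us_date is a made-up name and is not part of the script above:

```shell
#!/bin/sh
# Hypothetical helper: turn DD/Mon/YYYY (e.g. 11/Oct/2013) into MM/DD/YYYY.
# Assumes English three-letter month abbreviations; the thread never shows a real date.
to_us_date() {
    day=${1%%/*}        # text before the first slash
    rest=${1#*/}        # text after the first slash
    mon=${rest%%/*}     # month abbreviation
    year=${rest#*/}     # text after the second slash
    case "$mon" in
        Jan) mon=01 ;; Feb) mon=02 ;; Mar) mon=03 ;; Apr) mon=04 ;;
        May) mon=05 ;; Jun) mon=06 ;; Jul) mon=07 ;; Aug) mon=08 ;;
        Sep) mon=09 ;; Oct) mon=10 ;; Nov) mon=11 ;; Dec) mon=12 ;;
        *) return 1 ;;  # unknown month: report failure, print nothing
    esac
    printf '%s/%s/%s\n' "$mon" "$day" "$year"
}

to_us_date "11/Oct/2013"
```

Inside the loop it could be fed from grep with a pattern such as '[0-9]{2}/[A-Za-z]{3}/[0-9]{4}' in place of the all-digit one.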

Oct 14, 2013 9:29 AM in response to ChrisAbreu

This is a little bit more efficient.

grep stops searching the file after the 1st match (-m1) of "DD/MM/YYYY" and after the 4th match (-m4) of "bytes"



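#!/bin/bash

Report=~/Desktop/Report.txt

for f in ~/Downloads/*
do
     if [ -f "$f" ] ; then
          # start a new row for this file
          echo >> "$Report"
          # date: stop at the first line with DD/MM/YYYY (-m1) and swap it to MM/DD/YYYY
          /usr/bin/grep -E -o -m1 '[0-9]{2}/[0-9]{2}/[0-9]{4}' "$f" | sed 's/\([0-9][0-9]\/\)\([0-9][0-9]\/*\)/\2\1/' | tr '\n' ' ' >> "$Report"
          # file name without the leading path
          echo -n "${f##*/} " >> "$Report"
          # sizes: stop after the fourth "bytes" line (-m4), then strip everything but the digits
          /usr/bin/grep -E -o -m4 '^[[:blank:][:digit:]].*bytes' "$f" | sed 's/[^0-9]*//g' | tr '\n' ' ' >> "$Report"
     fi
done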
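
To see what -m1 does in isolation, here is a small sketch on made-up data (the sample lines and the first_date name are illustrative only). Note that -m counts matching lines rather than individual matches, so with -o a single line holding two dates would still print both.

```shell
#!/bin/sh
# Print the date(s) found on the first matching line of a file, then stop reading.
first_date() {
    grep -E -o -m1 '[0-9]{2}/[0-9]{2}/[0-9]{4}' "$1"
}

# Made-up sample: two lines that both contain a date.
sample=$(mktemp)
printf '%s\n' 'Link Time = 11/10/2013' 'Build Time = 12/10/2013' > "$sample"

first_date "$sample"    # only the date from the first matching line is printed
rm -f "$sample"
```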

Oct 14, 2013 6:45 PM in response to ChrisAbreu

Hello


You may try the following AppleScript. It asks you to choose a root folder in which to search for *.map files, then creates a CSV file named "out.csv" on the Desktop, which you can import into Excel.



set f to (choose folder with prompt "Choose the root folder to start searching")'s POSIX path
if f ends with "/" then set f to f's text 1 thru -2

do shell script "/usr/bin/perl -CSDA -w <<'EOF' - " & f's quoted form & " > ~/Desktop/out.csv
use strict;
use open IN => ':crlf';

chdir $ARGV[0] or die qq($!);
local $/ = qq(\\0);
my @ff = map {chomp; $_} qx(find . -type f -iname '*.map' -print0);
local $/ = qq(\\n);

# 
#     CSV spec
# 
#     - record separator is CRLF
#     - field separator is comma
#     - every field is quoted
#     - text encoding is UTF-8
# 
local $\\ = qq(\\015\\012);    # CRLF
local $, = qq(,);            # COMMA

# print column header row
my @dd = ('column 1', 'column 2', 'column 3', 'column 4', 'column 5', 'column 6');
print map { s/\"/\"\"/og; qq(\").$_.qq(\"); } @dd;

# print data row per each file
while (@ff) {
    my $f = shift @ff;    # file path
    if ( ! open(IN, '<', $f) ) {
        warn qq(Failed to open $f: $!);
        next;
    }
    $f =~ s%^.*/%%og;    # file name
    @dd = ('', $f, '', '', '', '');
    while (<IN>) {
        chomp;
        $dd[0] = \"$2/$1/$3\" if m%Link Time\\s*=\\s*([0-9]{2})/([0-9]{2})/([0-9]{4})%o;
        ($dd[2] = $1) =~ s/ //g if m/([0-9 ]+)\\s+bytes of CODE\\s/o;
        ($dd[3] = $1) =~ s/ //g if m/([0-9 ]+)\\s+bytes of DATA\\s/o;
        ($dd[4] = $1) =~ s/ //g if m/([0-9 ]+)\\s+bytes of XDATA\\s/o;
        ($dd[5] = $1) =~ s/ //g if m/([0-9 ]+)\\s+bytes of FARCODE\\s/o;
        last unless grep { /^$/ } @dd;
    }
    close IN;
    print map { s/\"/\"\"/og; qq(\").$_.qq(\"); } @dd;
}
EOF
"


Hope this may help,

H
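
The CSV rules listed in the script's comments (every field quoted, comma between fields, CRLF at the end of each record), together with the doubling of embedded double quotes that its print lines perform, can also be sketched in plain shell. This only illustrates the quoting rule and is not a replacement for the Perl code; csv_row is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: print its arguments as one CSV record.
csv_row() {
    sep=''
    for field in "$@"; do
        # double any embedded double quotes, as the CSV convention requires
        escaped=$(printf '%s' "$field" | sed 's/"/""/g')
        printf '%s"%s"' "$sep" "$escaped"
        sep=','
    done
    printf '\r\n'       # CRLF record separator
}

csv_row '10/11/2013' 'LCRC_0.28.0_QC.map' '161179' '16789' '1875' '90783'
```

Quoting every field, even numeric ones, keeps a file name containing a comma from being split into two columns on import.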

Oct 14, 2013 8:18 PM in response to ChrisAbreu

This will handle CODE, DATA, etc. in any order.

Change ~/Downloads/* to the directory with the files to process

Run as a bash script, or just copy into Automator


#!/bin/bash

Report=~/Desktop/Report.txt

for f in ~/Downloads/*
do
     if [ -f "$f" ] ; then
          # first DD/MM/YYYY date in the file, swapped to MM/DD/YYYY
          DATE=$(/usr/bin/grep -E -o -m1 '[0-9]{2}/[0-9]{2}/[0-9]{4}' "$f" | sed 's/\([0-9][0-9]\/\)\([0-9][0-9]\/*\)/\2\1/')
          # each size is looked up by its own label, so the order in the file does not matter
          CODE=$(/usr/bin/grep -E -o -m1 '^[[:blank:][:digit:]].*bytes of CODE' "$f" | sed 's/[^0-9]*//g')
          DATA=$(/usr/bin/grep -E -o -m1 '^[[:blank:][:digit:]].*bytes of DATA' "$f" | sed 's/[^0-9]*//g')
          XDATA=$(/usr/bin/grep -E -o -m1 '^[[:blank:][:digit:]].*bytes of XDATA' "$f" | sed 's/[^0-9]*//g')
          FARCODE=$(/usr/bin/grep -E -o -m1 '^[[:blank:][:digit:]].*bytes of FARCODE' "$f" | sed 's/[^0-9]*//g')
          # one space-delimited row per file
          echo $DATE "${f##*/}" $CODE $DATA $XDATA $FARCODE >> "$Report"
     fi
done


Open in Excel as a space delimited file.
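
If Excel's space-delimited import is awkward, the report can be converted to comma-separated values first. A rough sketch, assuming no file name contains a space or a comma (report_to_csv is a made-up name):

```shell
#!/bin/sh
# Hypothetical helper: read space-delimited rows on stdin, write comma-separated rows.
report_to_csv() {
    # NF skips blank lines; assigning $1 to itself makes awk rebuild the row using OFS
    awk 'BEGIN { OFS = "," } NF { $1 = $1; print }'
}

printf '%s\n' '10/11/2013 LCRC_0.28.0_QC.map 161179 16789 1875 90783' | report_to_csv
```

On the real report it would be run as: report_to_csv < ~/Desktop/Report.txt > ~/Desktop/Report.csv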
