How to make this? Applescript / unix. Database changement

Question

How to make this? Applescript / unix. Database changement

Hello,

I have a database with various fields / columns.

Tab delimited (mac roman western) .txt files

Various records from 24 to ....

But I need to change 1 column:

One of the columns (fieldname) is "destination":

First I should add @ so the fieldname become "@destination"

In this column I get text and have 3 possibilities:

Office - 96

Office and address - 240

Address with stamp - 144

The number at the end is variable so it can be every number.

Only look for the begin of the field.

So I would like to change the database like this

For fields that start with "Office" change content of those fields to office.pdf

For fields that start with "Adress" change content of those fields to stamp.pdf

Just save the database in same directory.

Can some one help me with this?

Different workstations from 10.3.9 to latest// G5 server - OS 10.3.9, Mac OS X (10.5.6)

Posted on Apr 29, 2015 7:52 AM

Reply

Answer 1

rccharles

Level 6

12,957 points

Apr 29, 2015 12:55 PM in response to Colin @ mac.com

Post an example of twenty or so records of the input file. Post example of the changed output file.

for posting, use advanced version of this editor. then look for the icon in blue >> > syntax hilighting > Plain

Reply

Answer 2

Apr 30, 2015 12:16 AM in response to rccharles

These are the original columns I need to change.

This I how the output database should become

Reply

Answer 3

Hiroto

Level 5

7,461 points

May 4, 2015 12:52 AM in response to Colin @ mac.com

Hello

If I understand it correctly, you might try the following AppleScript script which is a simple wrapper of Perl script.

Currently it is assumed that each field in TSV data is quoted such as

"destination" "street" "number" "box" "CL" "ad_Plaats"

and each field does not contain TAB, LF, CR character.

Output file is named after the original file with _mod appended to basename. E.g., given original.txt, the output is original_mod.txt in the same directory as the original.

on run open (choose file with multiple selections allowed) end run on open argv _main(argv) end open on _main(argv) (* list argv : alias list *) set args to "" repeat with a in argv set args to args & space & a's POSIX path's quoted form end repeat do shell script "perl <<'EOF' - " & args & " use strict; my $Q = q(\"); # q(\") if every value is quoted; q() otherwise. for my $f (@ARGV) { my $t; open(IN, '<', $f) or die qq($!); { local $/; $t = <IN>; } close IN; my @aa = &text2array($t, rs => qr/\\015\\012|\\015|\\012/o); # get original record separator my $rs = ($t =~ /\\015\\012|\\015|\\012/o) ? $& : qq(\\n); # get target column index my ( $k ) = grep { $aa[$_][0] and $aa[$_][0] eq $Q.'destination'.$Q } 0..$#aa; # edit values in target column $aa[$k][0] = $Q.'@destination'.$Q; for $a (@aa) { $a->[$k] = $Q.'office.pdf'.$Q if $a->[$k] and $a->[$k] =~ /^$Q Office/ixo; $a->[$k] = $Q.'stamp.pdf'.$Q if $a->[$k] and $a->[$k] =~ /^$Q Address/ixo; } # output result to file with _mod apended name (e.g., original.txt -> original_mod.txt) (my $f1 = $f) =~ s:(\\.[^./]+$|$):_mod$1:o; open(OUT, '>', $f1) or die qq($!); print OUT &array2text(\\@aa, rs => $rs); close OUT; } sub text2array($;@) { # $ : source text # @ : key value pairs of options # fs => field_separator # rs => record_separator # re => regex pattern to extract elements from each row # * if re is specified, fs and rs are ignored. # * fs, rs and re may be string or compiled pattern. # return : 2d-array my $t = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), re => undef, @_ ); my ($fs, $rs, $re) = @opts{'fs', 'rs', 're'}; my @rr = (); unless ($re) { ($fs, $rs) = map { ref($_) eq 'Regexp' ? $_ : scalar($_ = quotemeta $_, qr/$_/) } ($fs, $rs); for (split $rs, $t, -1) { push @rr, [split $fs, $_, -1]; } return @rr; } else { $re = ref($re) eq 'Regexp' ? $re : qr/$re/; my @tt = ( $t =~ /$re/g ); my $len = @+ - 1; push @rr, [ splice(@tt, 0, $len) ] while ( @tt >= $len ); return @rr; } } sub array2text($;@) { # $ : source 2d-array ref # @ : key value pairs of options # fs => field_separator # rs => record_separator # fm => format string for sprintf for each row # * if fm is specified, fs and rs are ignored # return : text representation of source 2d-array my $aref = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), fm => undef, @_ ); my ($fs, $rs, $fm) = @opts{'fs', 'rs', 'fm'}; unless ($fm) { return (join $rs, map { join $fs, @{$_}; } @{$aref}) . $rs; } else { my $t = ''; map { $t .= sprintf $fm, @{$_}; } @{$aref}; return $t; } } EOF" end _main

Tested under OS X 10.6.8.

Hope this may help,

H

EDIT: fixed code a little to name output file.

Reply

Answer 4

Hiroto

Level 5

7,461 points

May 4, 2015 12:52 AM in response to Hiroto

Oops. I posted wrong code which mistakes row index for column index in processing header...

WRONG:

# get target column index my ( $k ) = grep { $aa[$_][0] and $aa[$_][0] eq $Q.'destination'.$Q } 0..$#aa; # edit values in target column $aa[$k][0] = $Q.'@destination'.$Q;

CORRECT:

# get target column index my ( $k ) = grep { $aa[0][$_] and $aa[0][$_] eq $Q.'destination'.$Q } 0..$#aa; # edit values in target column $aa[0][$k] = $Q.'@destination'.$Q;

Corrected script is as follows.

on run open (choose file with multiple selections allowed) end run on open argv _main(argv) end open on _main(argv) (* list argv : alias list *) set args to "" repeat with a in argv set args to args & space & a's POSIX path's quoted form end repeat do shell script "perl <<'EOF' - " & args & " use strict; my $Q = q(\"); # q(\") if every value is quoted; q() otherwise. for my $f (@ARGV) { my $t; open(IN, '<', $f) or die qq($!); { local $/; $t = <IN>; } close IN; my @aa = &text2array($t, rs => qr/\\015\\012|\\015|\\012/o); # get original record separator my $rs = ($t =~ /\\015\\012|\\015|\\012/o) ? $& : qq(\\n); # get target column index my ( $k ) = grep { $aa[0][$_] and $aa[0][$_] eq $Q.'destination'.$Q } 0..$#aa; # edit values in target column $aa[0][$k] = $Q.'@destination'.$Q; for $a (@aa) { $a->[$k] = $Q.'office.pdf'.$Q if $a->[$k] and $a->[$k] =~ /^$Q Office/ixo; $a->[$k] = $Q.'stamp.pdf'.$Q if $a->[$k] and $a->[$k] =~ /^$Q Address/ixo; } # output result to file with _mod apended name (e.g., original.txt -> original_mod.txt) (my $f1 = $f) =~ s:(\\.[^./]+$|$):_mod$1:o; open(OUT, '>', $f1) or die qq($!); print OUT &array2text(\\@aa, rs => $rs); close OUT; } sub text2array($;@) { # $ : source text # @ : key value pairs of options # fs => field_separator # rs => record_separator # re => regex pattern to extract elements from each row # * if re is specified, fs and rs are ignored. # * fs, rs and re may be string or compiled pattern. # return : 2d-array my $t = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), re => undef, @_ ); my ($fs, $rs, $re) = @opts{'fs', 'rs', 're'}; my @rr = (); unless ($re) { ($fs, $rs) = map { ref($_) eq 'Regexp' ? $_ : scalar($_ = quotemeta $_, qr/$_/) } ($fs, $rs); for (split $rs, $t, -1) { push @rr, [split $fs, $_, -1]; } return @rr; } else { $re = ref($re) eq 'Regexp' ? $re : qr/$re/; my @tt = ( $t =~ /$re/g ); my $len = @+ - 1; push @rr, [ splice(@tt, 0, $len) ] while ( @tt >= $len ); return @rr; } } sub array2text($;@) { # $ : source 2d-array ref # @ : key value pairs of options # fs => field_separator # rs => record_separator # fm => format string for sprintf for each row # * if fm is specified, fs and rs are ignored # return : text representation of source 2d-array my $aref = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), fm => undef, @_ ); my ($fs, $rs, $fm) = @opts{'fs', 'rs', 'fm'}; unless ($fm) { return (join $rs, map { join $fs, @{$_}; } @{$aref}) . $rs; } else { my $t = ''; map { $t .= sprintf $fm, @{$_}; } @{$aref}; return $t; } } EOF" end _main

Sorry for the confusion.

Hiroto

Reply

Answer 5

Hiroto

Level 5

7,461 points

May 1, 2015 2:08 PM in response to Colin @ mac.com

Hello Colin,

In case, here's a revised script which should be able to process both quoted and unquoted values in TSV data properly.

All the best,

H

on run open (choose file with multiple selections allowed) end run on open argv _main(argv) end open on _main(argv) (* list argv : alias list *) set args to "" repeat with a in argv set args to args & space & a's POSIX path's quoted form end repeat do shell script "perl <<'EOF' - " & args & " use strict; for my $f (@ARGV) { my $t; open(IN, '<', $f) or die qq($!); { local $/; $t = <IN>; } close IN; my @aa = &text2array($t, rs => qr/\\015\\012|\\015|\\012/o); # get original record separator my $rs = ($t =~ /\\015\\012|\\015|\\012/o) ? $& : qq(\\n); # get target column index my ( $k ) = grep { $aa[0][$_] and $aa[0][$_] =~ /^\"?destination\"?$/ } 0..$#aa; # edit values in target column $aa[0][$k] =~ s/^(\"?)(destination)(\"?)$/$1.'@'.$2.$3/oe; for $a (@aa) { $a->[$k] = $1.'office.pdf'.$1 if ($a->[$k] and $a->[$k] =~ /^(\"?) Office/ixo ); $a->[$k] = $1.'stamp.pdf'.$1 if ($a->[$k] and $a->[$k] =~ /^(\"?) Address/ixo ); } # output result to file with _mod apended name (e.g., original.txt -> original_mod.txt) (my $f1 = $f) =~ s:(\\.[^./]+$|$):_mod$1:o; open(OUT, '>', $f1) or die qq($!); print OUT &array2text(\\@aa, rs => $rs); close OUT; } sub text2array($;@) { # $ : source text # @ : key value pairs of options # fs => field_separator # rs => record_separator # re => regex pattern to extract elements from each row # * if re is specified, fs and rs are ignored. # * fs, rs and re may be string or compiled pattern. # return : 2d-array my $t = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), re => undef, @_ ); my ($fs, $rs, $re) = @opts{'fs', 'rs', 're'}; my @rr = (); unless ($re) { ($fs, $rs) = map { ref($_) eq 'Regexp' ? $_ : scalar($_ = quotemeta $_, qr/$_/) } ($fs, $rs); for (split $rs, $t, -1) { push @rr, [split $fs, $_, -1]; } return @rr; } else { $re = ref($re) eq 'Regexp' ? $re : qr/$re/; my @tt = ( $t =~ /$re/g ); my $len = @+ - 1; push @rr, [ splice(@tt, 0, $len) ] while ( @tt >= $len ); return @rr; } } sub array2text($;@) { # $ : source 2d-array ref # @ : key value pairs of options # fs => field_separator # rs => record_separator # fm => format string for sprintf for each row # * if fm is specified, fs and rs are ignored # return : text representation of source 2d-array my $aref = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), fm => undef, @_ ); my ($fs, $rs, $fm) = @opts{'fs', 'rs', 'fm'}; unless ($fm) { return (join $rs, map { join $fs, @{$_}; } @{$aref}) . $rs; } else { my $t = ''; map { $t .= sprintf $fm, @{$_}; } @{$aref}; return $t; } } EOF" end _main

Reply

Answer 6

Hiroto

Level 5

7,461 points

May 3, 2015 1:02 PM in response to Colin @ mac.com

Ah... I noticed I had overlooked another error in the previous corrections.

WRONG:

# get target column index my ( $k ) = grep { $aa[0][$_] and $aa[0][$_] =~ /^\"?destination\"?$/ } 0..$#aa;

CORRECT:

# get target column index my ( $k ) = grep { $aa[0][$_] and $aa[0][$_] =~ /^\"?destination\"?$/ } 0..$#{$aa[0]};

Here's the last revision. Sorry for correction over corrections.

Regards,

H

on run open (choose file with multiple selections allowed) end run on open argv _main(argv) end open on _main(argv) (* list argv : alias list *) set args to "" repeat with a in argv set args to args & space & a's POSIX path's quoted form end repeat do shell script "perl <<'EOF' - " & args & " use strict; for my $f (@ARGV) { my $t; open(IN, '<', $f) or die qq($!); { local $/; $t = <IN>; } close IN; my @aa = &text2array($t, rs => qr/\\015\\012|\\015|\\012/o); # get original record separator my $rs = ($t =~ /\\015\\012|\\015|\\012/o) ? $& : qq(\\n); # get target column index my ( $k ) = grep { $aa[0][$_] and $aa[0][$_] =~ /^\"?destination\"?$/ } 0..$#{$aa[0]}; next unless defined $k; # edit values in target column $aa[0][$k] =~ s/^(\"?)(destination)(\"?)$/$1.'@'.$2.$3/oe; for $a (@aa) { $a->[$k] = $1.'office.pdf'.$1 if ($a->[$k] and $a->[$k] =~ /^(\"?) Office/ixo ); $a->[$k] = $1.'stamp.pdf'.$1 if ($a->[$k] and $a->[$k] =~ /^(\"?) Address/ixo ); } # output result to file with _mod apended name (e.g., original.txt -> original_mod.txt) (my $f1 = $f) =~ s:(\\.[^./]+$|$):_mod$1:o; open(OUT, '>', $f1) or die qq($!); print OUT &array2text(\\@aa, rs => $rs); close OUT; } sub text2array($;@) { # $ : source text # @ : key value pairs of options # fs => field_separator # rs => record_separator # re => regex pattern to extract elements from each row # * if re is specified, fs and rs are ignored. # * fs, rs and re may be string or compiled pattern. # return : 2d-array my $t = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), re => undef, @_ ); my ($fs, $rs, $re) = @opts{'fs', 'rs', 're'}; my @rr = (); unless ($re) { ($fs, $rs) = map { ref($_) eq 'Regexp' ? $_ : scalar($_ = quotemeta $_, qr/$_/) } ($fs, $rs); for (split $rs, $t, -1) { push @rr, [split $fs, $_, -1]; } return @rr; } else { $re = ref($re) eq 'Regexp' ? $re : qr/$re/; my @tt = ( $t =~ /$re/g ); my $len = @+ - 1; push @rr, [ splice(@tt, 0, $len) ] while ( @tt >= $len ); return @rr; } } sub array2text($;@) { # $ : source 2d-array ref # @ : key value pairs of options # fs => field_separator # rs => record_separator # fm => format string for sprintf for each row # * if fm is specified, fs and rs are ignored # return : text representation of source 2d-array my $aref = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), fm => undef, @_ ); my ($fs, $rs, $fm) = @opts{'fs', 'rs', 'fm'}; unless ($fm) { return (join $rs, map { join $fs, @{$_}; } @{$aref}) . $rs; } else { my $t = ''; map { $t .= sprintf $fm, @{$_}; } @{$aref}; return $t; } } EOF" end _main

Reply

Answer 7

May 4, 2015 12:55 AM in response to Hiroto

Many many thanks Hiroto,

It works perfect. I send you a package of good vibes and good luck.

Many thanks - what a nice way to start the week.

Reply

Answer 8

Hiroto

Level 5

7,461 points

May 4, 2015 1:01 PM in response to Colin @ mac.com

My pleasure! And thank you for your warm feedback. 🙂

Hiroto

Reply