awk FPAT does not work in AppleScript, but does work in the Terminal

The issue:

I have an awk command that uses FPAT to ignore commas inside the quoted fields of a CSV record. This command works in the Terminal (default zsh shell):


awk -v FPAT='"[^"]*"|[^,]+' 'NF==6 {split($NF, a, /\. # /)} NR!=1&&NF<=6 {print a[5],$5,$3,$4} ' input.txt


Sample record:

http://publications.europa.eu/resource/cellar/3befa3c3-a9af-4dac-baa2-92e95cb6e3ab,http://publications.europa.eu/resource/cellar/3befa3c3-a9af-4dac-baa2-92e95cb6e3ab.0002,ECLI:EU:C:1985:443,61984CJ0239,Gerlach,"Judgment of the Court (Third Chamber) of 24 October 1985. # Gerlach & Co. BV, Internationale Expeditie, v Minister van Economische Zaken. # Reference for a preliminary ruling: College van Beroep voor het Bedrijfsleven - Netherlands. # Article 41 ECSC - Anti-dumping duties. # Case 239/84."


However, it does not work in my AppleScript, where I have entered it (with escaping) like this.


set r1 to do shell script "awk -v FPAT='\"[^\"]*\"|[^,]+' '{print a[5],$5,$3,$4
}' <<<" & quoted form of theInput


With this command, awk still treats a space as the field delimiter. In other words, it appears to ignore the `FPAT` completely. This is very clear when one replaces the `print` statement with a simple `print $1`.


Solutions tried:


  • When I do
awk -W version

the output is:

GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)


Posted on Mar 11, 2021 12:46 AM

10 replies
Question marked as Top-ranking reply

Mar 15, 2021 2:46 AM in response to BobHarris

Thanks, the script suddenly behaved normally when I used /usr/local/bin/awk instead of just awk. I don't fully understand why, though.


Could it be that when I did 'brew install gawk', this also changed my awk in /usr/local/bin/awk? So that when I call awk (without the path) from within AppleScript, a vanilla awk version is used that does not support FPAT?

Mar 15, 2021 5:10 AM in response to Flimofly

Apple ships

/usr/bin/awk --version
awk version 20070501

which is old and does not know about FPAT


When you install gawk via HomeBrew you get

/usr/local/bin/awk --version
GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)
Copyright (C) 1989, 1991-2020 Free Software Foundation.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.


/usr/local/bin/awk knows about FPAT
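
A quick way to check whether a particular awk binary honors FPAT is to feed it a record containing a quoted comma and count the fields. This is only a sketch: the function name supports_fpat is made up for illustration, and the paths in the usage note are simply the ones mentioned in this thread.

```shell
# Succeeds only if the given awk splits fields according to FPAT.
# An awk without FPAT support treats it as an ordinary variable and
# falls back to whitespace splitting, so it sees a single field.
supports_fpat() {
    awk_bin=$1
    field_count=$(printf '%s\n' '"a,b",c' |
        "$awk_bin" -v FPAT='"[^"]*"|[^,]+' '{ print NF }' 2>/dev/null)
    [ "$field_count" = "2" ]
}
```

For example, supports_fpat /usr/bin/awk && echo yes || echo no reports on Apple's awk, and the same call with /usr/local/bin/awk reports on the Homebrew one.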


Apple does not ship gawk because it is licensed under GPL v3, which is poison to any commercial operating system developer. It is also the reason Apple does not provide a newer bash: the bash it does ship is licensed under GPL v2, which is fine for Apple, but current bash versions are GPL v3 and again poison to Apple.


You are of course allowed to install any GPL v3 software you like, and Homebrew is happy to do that for you. But Apple's aversion to the GPL v3 terms keeps it from including any GPL v3 software in its software installation packages.

Mar 15, 2021 5:45 AM in response to Flimofly

One has to realize that an AppleScript do shell script always runs /bin/sh (Bash in sh-compatibility mode on macOS), even after chsh -s /bin/zsh, and does not read your local dotfiles, so it never picks up a custom PATH setting that would put /usr/local/bin before /usr/bin. Consequently, a bare awk resolves to /usr/bin/awk, which has no FPAT facility. It still ran without error because, to that awk, you were simply assigning your regex to an ordinary, unused variable named FPAT.
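
To reproduce that lookup from the Terminal, one can start /bin/sh with an empty environment and a bare system PATH and ask it where a command resolves. A rough sketch: the PATH value below is my assumption of what do shell script uses (it is the value given in Apple's technical note on do shell script), and resolve_in_minimal_path is a hypothetical helper name.

```shell
# Resolve a command name the way a shell with no dotfiles and a bare
# system PATH would. The PATH value is an assumption, not read from
# AppleScript itself.
resolve_in_minimal_path() {
    env -i PATH=/usr/bin:/bin:/usr/sbin:/sbin \
        /bin/sh -c 'command -v "$1"' sh "$1"
}

resolve_in_minimal_path awk
```

Comparing this with command -v awk in your normal Terminal session shows whether the two environments pick different binaries.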


By the way, when I performed a brew install gawk, that is the executable name that was installed in my /opt/homebrew/bin location (as an arm64 binary) on my M1 mini using brew v3.0.5.


Either use the full path to awk, or have the do shell script source your Bash dotfile first so that it gets your PATH environment variable rather than the system's. You can override Bash and use the Zsh shell here too:


do shell script "source ~/.bash_profile;gawk '...' " & fname's quoted form
do shell script "/usr/local/bin/gawk '...' " & fname's quoted form
do shell script "/bin/zsh -c 'print $ZSH_VERSION'"



To avoid all of the lost productivity of wrestling AppleScript's escaping alligator, I would put clean awk/gawk code in an external script and use that on the do shell script invocation as I demonstrated previously.


Mar 11, 2021 5:55 AM in response to Flimofly

Just to clarify: you have a non-standard GNU awk installed under the same name as the BSD awk that ships with Big Sur. Have you seen this Stack Overflow article?


Also given a single-row CSV with the following content:

"Company Name, LLC",12345,Type1,SubType3


The following AppleScript with its horrendous escaping syntax will capture "Company Name, LLC":

use scripting additions

set acsv to POSIX path of ((path to desktop as text) & "bark.csv") as text
set firstStr to (do shell script "/opt/homebrew/bin/gawk 'BEGIN{FPAT = \"([^,]+)|(\\\"[^\\\"]+\\\")\"}{print $1}' " & acsv's quoted form) as text


Mar 11, 2021 6:47 AM in response to Flimofly

One must escape any backslash, and if a double-quote appears within the double-quoted command string, it too must be escaped. So that said, here is the original FPAT as it appears in the Zsh shell:


FPAT = "([^,]+)|(\"[^\"]+\")"


But, because the awk/gawk script is surrounded by double quotes as a condition of the do shell script, I have to escape every double-quote and backslash in that existing FPAT to this mess:


FPAT = \"([^,]+)|(\\\"[^\\\"]+\\\")\"


The triple escaping is one backslash to escape the existing backslash, and another backslash to escape the adjacent double-quote. Syntax only a mother could love…
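
To see the awk layer of that escaping in isolation, one can hand awk the program text as the shell receives it (that is, after AppleScript has stripped its own backslashes) and print the resulting FPAT string. A minimal sketch: fpat_program is just an illustrative variable name, and any awk will do here because FPAT is only printed, not used for splitting.

```shell
# Single quotes keep the shell from touching the backslashes, so awk
# sees \" inside its string literal and turns each one into a plain ".
fpat_program='BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")"; print FPAT }'
awk "$fpat_program"
# prints: ([^,]+)|("[^"]+")
```

The printed value is the regex that gawk actually applies to each record.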

Mar 11, 2021 8:29 AM in response to Flimofly

Another way to keep your sanity with AppleScript and avoid it requiring any escaping at all is to have an executable Awk script that you call in the do shell script invocation:


fpat.awk

#!/opt/homebrew/bin/gawk -f
BEGIN{FPAT = "([^,]+)|(\"[^\"]+\")"}
{print $1}


and from AppleScript:

use scripting additions

set AWKCMD to "$HOME/Desktop/fpat.awk "
set acsv to "$HOME/Desktop/bark.csv"
set r1 to (do shell script AWKCMD & acsv) as text

Result: "Company Name, LLC"
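
For anyone trying the external-script route, here is a sketch of setting it up and testing it from the Terminal before involving AppleScript. Assumptions on my part: it uses a temporary directory (workdir) instead of ~/Desktop, it runs the file with gawk -f rather than relying on the shebang line, and it expects gawk to be on the PATH. If you keep the shebang version from the post above, the file also needs chmod +x.

```shell
# Hypothetical scratch directory; the post above uses ~/Desktop.
workdir=$(mktemp -d)

# The quoted heredoc delimiter makes the shell copy the awk source
# verbatim, so no extra escaping is needed.
cat > "$workdir/fpat.awk" <<'EOF'
BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")" }
{ print $1 }
EOF

# Same single-row CSV as in the earlier reply.
printf '%s\n' '"Company Name, LLC",12345,Type1,SubType3' > "$workdir/bark.csv"

if command -v gawk >/dev/null 2>&1; then
    first_field=$(gawk -f "$workdir/fpat.awk" "$workdir/bark.csv")
    printf '%s\n' "$first_field"
else
    echo "gawk not found on PATH; FPAT needs GNU awk" >&2
fi
```

Once this works in the Terminal, the do shell script call only has to pass two file paths, which keeps the AppleScript side free of regex escaping.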

