
trying to find joy with the find command

You all are having so much fun with the find command, so I'll post some more questions. Basically, the challenge is to process any file name Unix allows, in bash, using find as the filter. Is this possible? How hard is it? It almost seems that I need to look at each file name and write different code depending on what the name is.


What am I missing?


#!/bin/bash 
echo "in bash script $0"


ls -l '/Users/mac/exifinner/'
fileCount=$(ls -l '/Users/mac/exifinner/'   | wc -l)
let fileCount--
echo "${fileCount} files found."

count=1
echo "starting finding files"

# Hexadecimal: numbers preceded by '0x' or '0X'
#let "hex = 0x00"
#IFS=${hex}
echo "xargs"
echo "I'm not sure how to use this command"
echo "I loose the flow because all the files are glommed together."
find '/Users/mac/exifinner/'  -print0 -type f  \( -iname "*.txt"  \)  | xargs -0 echo

echo "print0"
echo "0x00 isn't being inserted"
echo "notice how the path to the first file appears twice. brilliant. "
echo "bash variables cannot contain 0x00? Nothing like rubbish C strings."
theFiles=$(find '/Users/mac/exifinner/'  -print0 -type f  \( -iname "*.txt"  \) )
echo $theFiles | hexdump -C    

echo "print"
echo "separator seems to be a blank"
theFiles=$(find '/Users/mac/exifinner/'  -print -type f  \( -iname "*.txt"  \) )
echo $theFiles | hexdump -C         

echo "default"
echo "default seems to be to put in a blank.  Why is thi done?"
echo "I find magic annoying.  Why not just put in the lf?  I could parse it out"
theFiles=$(find '/Users/mac/exifinner/'  -type f  \( -iname "*.txt"  \) )
echo $theFiles | hexdump -C  
echo "to terminal"
echo "How do I get terminal output to a string?"
echo "a lf is being inserted.  Guess I could write out to a file.  Seems to be a PITA."
echo "And I thought Unix was supposed to be consistent?"
find '/Users/mac/exifinner/'  -type f  \( -iname "*.txt"  \)       

echo "read loop"
echo "  so my first file is \"\\ \\&.txt\", but where are the two backslashes?"
find '/Users/mac/exifinner/'  -type f  \( -iname "*.txt"  \)  | (
        while  read theFilePath
        do
          echo -n " ${count}: "
          echo "processing file " ${theFilePath} 
          echo "               file  ${theFilePath}" 
          echo ${theFilePath} | hexdump -C 
          let count++
          
        done
      )

echo "leaving $0"

exit 0


Here are some test cases.


mac $ ~/config/groupReadFind#5.sh 
in bash script /Users/mac/config/groupReadFind#5.sh
total 0
-rw-r--r--  1 mac  staff  0 Mar  1 19:13 \ \&.txt
-rw-r--r--  1 mac  staff  0 Mar  1 19:12 normal.txt
2 files found.
starting finding files
xargs
I'm not sure how to use this command
I loose the flow because all the files are glommed together.
/Users/mac/exifinner/ /Users/mac/exifinner//\ \&.txt /Users/mac/exifinner//normal.txt
print0
0x00 isn't being inserted
notice how the path to the first file appears twice. brilliant. 
bash variables cannot contain 0x00? Nothing like rubbish C strings.
00000000  2f 55 73 65 72 73 2f 6d  61 63 2f 65 78 69 66 69  |/Users/mac/exifi|
00000010  6e 6e 65 72 2f 2f 55 73  65 72 73 2f 6d 61 63 2f  |nner//Users/mac/|
00000020  65 78 69 66 69 6e 6e 65  72 2f 2f 5c 20 5c 26 2e  |exifinner//\ \&.|
00000030  74 78 74 2f 55 73 65 72  73 2f 6d 61 63 2f 65 78  |txt/Users/mac/ex|
00000040  69 66 69 6e 6e 65 72 2f  2f 6e 6f 72 6d 61 6c 2e  |ifinner//normal.|
00000050  74 78 74 0a                                       |txt.|
00000054
print
separator seems to be a blank
00000000  2f 55 73 65 72 73 2f 6d  61 63 2f 65 78 69 66 69  |/Users/mac/exifi|
00000010  6e 6e 65 72 2f 20 2f 55  73 65 72 73 2f 6d 61 63  |nner/ /Users/mac|
00000020  2f 65 78 69 66 69 6e 6e  65 72 2f 2f 5c 20 5c 26  |/exifinner//\ \&|
00000030  2e 74 78 74 20 2f 55 73  65 72 73 2f 6d 61 63 2f  |.txt /Users/mac/|
00000040  65 78 69 66 69 6e 6e 65  72 2f 2f 6e 6f 72 6d 61  |exifinner//norma|
00000050  6c 2e 74 78 74 0a                                 |l.txt.|
00000056
default
default seems to be to put in a blank.  Why is thi done?
I find magic annoying.  Why not just put in the lf?  I could parse it out
00000000  2f 55 73 65 72 73 2f 6d  61 63 2f 65 78 69 66 69  |/Users/mac/exifi|
00000010  6e 6e 65 72 2f 2f 5c 20  5c 26 2e 74 78 74 20 2f  |nner//\ \&.txt /|
00000020  55 73 65 72 73 2f 6d 61  63 2f 65 78 69 66 69 6e  |Users/mac/exifin|
00000030  6e 65 72 2f 2f 6e 6f 72  6d 61 6c 2e 74 78 74 0a  |ner//normal.txt.|
00000040
to terminal
How do I get terminal output to a string?
a lf is being inserted.  Guess I could write out to a file.  Seems to be a PITA.
And I thought Unix was supposed to be consistent?
/Users/mac/exifinner//\ \&.txt
/Users/mac/exifinner//normal.txt
read loop
  so my first file is "\ \&.txt", but where are the two backslashes?
 1: processing file  /Users/mac/exifinner// &.txt
               file  /Users/mac/exifinner// &.txt
00000000  2f 55 73 65 72 73 2f 6d  61 63 2f 65 78 69 66 69  |/Users/mac/exifi|
00000010  6e 6e 65 72 2f 2f 20 26  2e 74 78 74 0a           |nner// &.txt.|
0000001d
 2: processing file  /Users/mac/exifinner//normal.txt
               file  /Users/mac/exifinner//normal.txt
00000000  2f 55 73 65 72 73 2f 6d  61 63 2f 65 78 69 66 69  |/Users/mac/exifi|
00000010  6e 6e 65 72 2f 2f 6e 6f  72 6d 61 6c 2e 74 78 74  |nner//normal.txt|
00000020  0a                                                |.|
00000021
leaving /Users/mac/config/groupReadFind#5.sh
mac $

Mac mini, OS X Yosemite (10.10.5), Fall 2014; iPhone 4 7.1.2

Posted on Mar 1, 2018 6:40 PM

Question marked as Best reply

Posted on Mar 9, 2018 7:59 PM

I’m at a disadvantage as the recent heavy wet snow in New England has me without cable, so I can only use my iPhone.


variable=$(find ...)


Has the newlines


echo $variable


Will replace newlines with spaces


echo "$variable"


Will preserve newlines.


I would give example output, but as I said, doing this on my phone is limiting
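
A small sketch of the difference, assuming a throwaway directory made with mktemp (the directory and file names are invented for the test, not taken from the original script). It also shows one way to walk the saved list line by line with a here-string:

```shell
#!/bin/bash
dir=$(mktemp -d)                      # hypothetical scratch directory
touch "$dir/normal.txt" "$dir/two words.txt"

# Command substitution keeps the newline between the two paths.
theFiles=$(find "$dir" -type f -iname '*.txt')

unquoted=$(echo $theFiles | wc -l)    # word splitting: echo prints a single line
quoted=$(echo "$theFiles" | wc -l)    # quoted: one line per file

# A here-string feeds the quoted variable to read, one path per iteration.
seen=0
while IFS= read -r path
do
    echo "file: ${path}"
    seen=$((seen + 1))
done <<< "$theFiles"

echo "unquoted lines: $((unquoted)), quoted lines: $((quoted)), loop saw: ${seen}"
rm -rf "$dir"
```

This still assumes no file name contains a newline, since the newline is the only separator left inside the variable.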


Mar 3, 2018 8:39 PM in response to rccharles

This might be a useful way to use the find command in a shell script to process the files it finds:

while read -r file
do
    echo "${file} between the 'do' and 'done' process each file found however you like"
done < <(find /Users/mac/exifinner/ -type f -iname '*.txt')

There is a space between the first < and the <(...)

The first < is I/O redirection into the while/do/done loop

The <(...) is Process Substitution where the stdout from the command inside the <(...) is made available as if it was a file. Since <(...) is immediately following a stdin < I/O redirection, the <(...) becomes the stdin file redirected as input to the while/do/done loop.


Confused? 🙂 I hope not, because Process Substitution is very powerful: the while loop runs in the same scope as the rest of the script, so any variables set inside it stay visible to the rest of the script.


If you did something like find | while read file; do echo "$file"; done, then the entire while/do/done loop is run in a sub-process outside the scope of the current shell script, and any variables set between the 'do' and the 'done' will not be in the same shell script scope, so as soon as the while/do/done loop finishes, those variables disappear. In your case it may be $count that is not available when the loop ends, and that might be exactly when you want to print it.
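
A minimal sketch of that scope difference, assuming bash with its default settings and a scratch directory created just for the test (the names piped and substituted are made up for the illustration):

```shell
#!/bin/bash
dir=$(mktemp -d)                      # hypothetical scratch directory
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# Pipeline: the loop runs in a subshell, so its counter is lost afterwards.
piped=0
find "$dir" -type f -iname '*.txt' | while IFS= read -r file
do
    piped=$((piped + 1))
done
echo "after pipeline: ${piped}"

# Process substitution: the loop runs in the current shell.
substituted=0
while IFS= read -r file
do
    substituted=$((substituted + 1))
done < <(find "$dir" -type f -iname '*.txt')
echo "after process substitution: ${substituted}"

rm -rf "$dir"
```

The first counter is still 0 after its loop, while the second one holds the number of files found.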


xargs - VikingOSX has given xargs examples. The most important thing to know about xargs is that it lets you do things like delete all the files found using as few 'rm' commands as possible, without having to invoke rm for each and every file, which would be slower when there are a lot of files to delete. I am using rm as an example, but it is a real one, as there are times you need to delete a large number of files but must select them carefully. Other commands can be used with xargs as well.


Another powerful use of xargs is when you want a wildcard to include a large number of files on the command line, but the number and length of the names is so large that you exceed the command line length limit. On some platforms this is as low as a few thousand bytes, and on others it can be as high as a million bytes, but even then you can have enough files in a wildcard match to exceed the limit.

  • rm *.jpg # failed because too many files on command line
  • find . -iname '*.jpg' -print0 | xargs -0 rm # this will work as fast as a single rm command, but never get a command line length too long error
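
To make the batching visible, here is a rough sketch that assumes a scratch directory with five invented file names. The -n 2 option is only there to force small batches for the demonstration; in real use you would normally leave it off and let xargs pack each command line as full as it can.

```shell
#!/bin/bash
dir=$(mktemp -d)                      # hypothetical scratch directory
for i in 1 2 3 4 5
do
    touch "$dir/photo $i.jpg"         # names with a space, on purpose
done

# Each sh invocation reports how many file names it was handed.
batches=$(find "$dir" -iname '*.jpg' -print0 |
          xargs -0 -n 2 sh -c 'echo "one command, $# files"' sh | wc -l)

# Without -n, xargs hands rm as many names as fit on one command line.
find "$dir" -iname '*.jpg' -print0 | xargs -0 rm
remaining=$(find "$dir" -iname '*.jpg' | wc -l)

echo "batches: $((batches)), files left: $((remaining))"
rm -rf "$dir"
```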


About 'find' arguments. The arguments form a logical expression whose terms are connected by -a (AND) or -o (OR); if neither -a nor -o is specified, -a (AND) is implied. So by default a file has to match the -first AND the -second AND the -third test to be selected.


As VikingOSX has pointed out, -print0 is an action: it prints the file name that has survived the logical expression up to that point in the list of arguments. You might occasionally want intermediate names included in the output, in which case you could put it before the end, but in general you want it at the end of the logical expression.


So your 'find' command would more logically look like

theFiles=$(find '/Users/mac/exifinner/' -type f -a -iname "*.txt" -a -print )
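
A quick sketch of why the position of -print matters, assuming a scratch directory holding one .txt file and one other file (both names invented). With -print first, every entry is printed before any test runs, including the starting directory itself, which is why the directory path showed up in the original output:

```shell
#!/bin/bash
dir=$(mktemp -d)                      # hypothetical scratch directory
touch "$dir/normal.txt" "$dir/photo.dng"

# Action first: printed for the directory and both files, tests come too late.
early=$(find "$dir" -print -type f -iname '*.txt' | wc -l)

# Action last: printed only for entries that passed both tests.
late=$(find "$dir" -type f -iname '*.txt' -print | wc -l)

echo "print first: $((early)) lines, print last: $((late)) lines"
rm -rf "$dir"
```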

While I'm here, I want to talk about \( ... \)


These are just logical expression parens to group and change evaluation precedence, the same as parens would be used in any math expression. The \ is needed because normally parens on a shell command line mean something to the shell, so you need to protect the parens from the shell seeing them. You could protect them with quotes as well; it is just not the convention.

find /path '(' -type f -o -type l ')' -a -iname '*jpg' -a -print0 | xargs -0 rm # both of these

find /path \( -type f -o -type l \) -a -iname '*jpg' -a -print0 | xargs -0 rm # are the same
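
As an illustration of what the parens change, here is a sketch that assumes a scratch directory with one regular .jpg file, one symlink to it, and one unrelated file (all invented names). It uses -print instead of rm so nothing is deleted. Because -a binds tighter than -o, the ungrouped form is read as "-type f, OR (-type l AND -iname AND -print)", so regular files never reach -print:

```shell
#!/bin/bash
dir=$(mktemp -d)                      # hypothetical scratch directory
touch "$dir/real.jpg" "$dir/notes.txt"
ln -s "$dir/real.jpg" "$dir/link.jpg"

# Grouped: (file OR symlink) AND name matches.
grouped=$(find "$dir" \( -type f -o -type l \) -iname '*jpg' -print | wc -l)

# Ungrouped: only the symlink branch carries the name test and the -print.
ungrouped=$(find "$dir" -type f -o -type l -iname '*jpg' -print | wc -l)

echo "grouped: $((grouped)), ungrouped: $((ungrouped))"
rm -rf "$dir"
```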

About the blank when you want a newline

echo "default seems to be to put in a blank. Why is thi done?"

echo "I find magic annoying. Why not just put in the lf? I could parse it out"

Well 'find' DID separate each file name found with a newline, but you asked the shell to take the output from 'find' and substitute it into the command line, and that command line just happened to be on the right side of a shell variable assignment.


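Command substitution itself keeps those newlines (it only strips trailing ones). It is the unquoted $theFiles on your echo line that the shell splits into words, and echo then joins the words with single spaces. This has ZERO to do with the 'find' command and EVERYTHING to do with how the shell expands the variable.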


The while loop example I gave at the beginning of this reply takes advantage of the fact that 'find' outputs each file name on a separate line, because the 'read' command takes each line from stdin and puts it into the variable $file.
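
One caveat with line-by-line reading: Unix also allows a newline inside a file name, and then a single name arrives as two lines. A hedged sketch of a common workaround is to have find separate names with NUL bytes (-print0) and have read split on NUL (-d ''), collecting the names in an array. The directory and file names below are invented for the test:

```shell
#!/bin/bash
dir=$(mktemp -d)                      # hypothetical scratch directory
touch "$dir/normal.txt" "$dir/  leading blanks.txt" "$dir/"'\ \&.txt' \
      "$dir/"$'two\nlines.txt'

# Newline-separated output counts the odd name twice.
lines=$(find "$dir" -type f -iname '*.txt' | wc -l)

# NUL-separated output read into an array, one element per file.
files=()
while IFS= read -r -d '' file
do
    files+=("$file")
done < <(find "$dir" -type f -iname '*.txt' -print0)

count=1
for file in "${files[@]}"
do
    printf '%d: processing file [%s]\n' "$count" "$file"
    count=$((count + 1))
done

echo "lines: $((lines)), files: ${#files[@]}"
rm -rf "$dir"
```

The array is used here because a plain bash variable cannot hold a NUL byte, so the NUL-separated list cannot be stored in one string.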


so my first file is "\ \&.txt", but where are the two backslashes?

You had to have a file name with backslashes in it! 🙂


My example while loop above uses the -r option with the 'read' command, which stops read from treating backslashes as escape characters.


The 'find' command passes the file name exactly as it is to stdout.


Your problem is you did NOT include the -r option in your uses of the 'read' command.
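
A small sketch of the three variants, using a made-up input line with leading blanks and backslashes. It assumes bash, since it relies on here-strings. Note that -r alone still trims leading and trailing blanks; setting IFS to empty for the read keeps them:

```shell
#!/bin/bash
line='  \ \&.txt'                     # invented name: two blanks, then \ \&.txt

read plain <<< "$line"                # backslashes consumed, leading blanks trimmed
read -r raw <<< "$line"               # backslashes kept, leading blanks trimmed
IFS= read -r exact <<< "$line"        # kept exactly as given

printf 'plain: [%s]\n' "$plain"
printf 'raw:   [%s]\n' "$raw"
printf 'exact: [%s]\n' "$exact"
```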


Your bigger problem is that you have not been sufficiently beaten down for years as a Unix user such that you NEVER even think to include shell magic characters, such as \ & () : ` ' " * $ ! {} [] | ? < > or spaces in a file name 😮 After 33 years of working with Unix it never crosses my mind to use any of these characters in a file name, as I just know it is going to cause me problems. Actually the : is because of Macs.


I'm not saying you are wrong to use any of these characters in a filename, I'm just saying I avoid using them, because I know I'll be writing a script to process my files eventually. As I write scripts all the time, eventually might be tomorrow 🙂


Feel free to ask follow up questions

Mar 1, 2018 8:27 PM in response to rccharles

This sequence will put the null terminated strings into your output. The -print0 is what is putting the null bytes on the end of each find result, and it should come last in the find usage. Notice how I have removed all of your quoting and escaping, as it is not needed. The -Pn option to xargs sets the maximum number of processes to run in parallel.


The following is tested in a Bash script exactly as written, though I used a different directory path.


find /Users/mac/exifinner/ -type f -iname '*.txt' -print0 | \
xargs -0 -P2 -I{} echo "{}"
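
For comparison, a sketch of what -I{} changes, assuming a scratch directory with three invented file names. With -I{}, xargs runs the command once per file name, so each name gets its own line; without it, all the names land on one echo command line, which is the "glommed together" output from the original question. The -P2 is left out here only to keep the output order predictable:

```shell
#!/bin/bash
dir=$(mktemp -d)                      # hypothetical scratch directory
touch "$dir/normal.txt" "$dir/two words.txt" "$dir/"'\ \&.txt'

# One echo per file name.
per_file=$(find "$dir" -type f -iname '*.txt' -print0 |
           xargs -0 -I{} echo "file: {}" | wc -l)

# One echo for the whole list.
all_at_once=$(find "$dir" -type f -iname '*.txt' -print0 |
              xargs -0 echo | wc -l)

echo "with -I{}: $((per_file)) lines, without: $((all_at_once)) line"
rm -rf "$dir"
```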

Mar 2, 2018 10:54 AM in response to rccharles

I have been using multiple UNIX solutions for decades. I got into the habit of using -print0 as the last item a long time ago.


Did you know that xargs can be used to strip leading and trailing space from a non-newline terminated string, and then add a newline to that result? Same drill for awk.


xargs <<<" cat dog " | hexdump -C

awk '$1=$1' <<<" cat dog " | hexdump -C
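
A short sketch of both tricks with their results captured in variables, plus one caveat that is easy to trip over. The variable names are made up. Both commands also squeeze runs of blanks between words down to a single space. The bare awk '$1=$1' form prints a line only when the assigned value counts as true, so a line whose first field is 0 is dropped; the block form with an explicit print avoids that. Separately, xargs gives quotes and backslashes in its input special meaning, so it is not a safe trimmer for arbitrary text.

```shell
#!/bin/bash
trimmed_xargs=$(printf '  cat   dog  ' | xargs)
trimmed_awk=$(printf '  cat   dog  \n' | awk '{ $1 = $1; print }')

# Pattern form: the value of the assignment decides whether the line prints.
dropped=$(printf ' 0 \n' | awk '$1=$1')
kept=$(printf ' 0 \n' | awk '{ $1 = $1; print }')

printf '[%s] [%s] [%s] [%s]\n' "$trimmed_xargs" "$trimmed_awk" "$dropped" "$kept"
```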

Mar 3, 2018 9:30 PM in response to BobHarris

Your bigger problem is that you have not been sufficiently beaten down for years as a Unix user such that you NEVER even think to include shell magic characters,

I have no control over this because I was writing the script for someone else. Yesterday, I spent two hours writing the script in AppleScript, and I have already spent 10 hours on bash.


I do appreciate your insight. I'll read it again tomorrow so that I can sort through my questions. I find your explanation very useful.

[ fyi, the folks over on stackoverflow cannot even agree on what the magic characters are 👿 ]


R

Mar 4, 2018 1:36 PM in response to rccharles

[ fyi, the folks over on stackoverflow cannot even agree on what the magic characters are 👿 ]

Well, it all depends on whether you are in the "Harry Potter" universe, the "Lord of the Rings" universe, the "Discworld" universe, the "Narnia" universe, the "Oz" universe, the "Xanth" universe, etc... Which translates to the specific shell (sh, bash, zsh, csh, tcsh, etc...), or awk, Perl, Python, etc... Many have similar common magic characters, but they also each have some magic characters of their own.

Mar 9, 2018 6:28 PM in response to BobHarris

Well 'find' DID separate each file name found with a newline, but you asked the shell to take the output from 'find' and substitute it into the command line, and that command line just happened to be on the right side of a shell variable assignment.


So, is there any way to tell the shell not to do this, beyond the two methods used here: read or < <(...)?


Or is this a distinction without a difference?


R

Mar 10, 2018 10:28 AM in response to BobHarris

Here is your example! Success!


I haven't coded much in Bash in a while. When I did my one big program in bash, I adopted a certain coding style. I've forgotten those rules. I think that I remember writing all comparisons of strings as "${stringVar}".


#!/bin/bash
echo "in bash script $0"


ls -l '/Users/mac/exifinner/'
fileCount=$(ls -l '/Users/mac/exifinner/'   | wc -l)
let fileCount--
echo "${fileCount} files found."

count=1
echo "start experiment"



echo "Five echoes"
echo "Need to put in quotes to avoid changing newline to a space"
theFiles=$(find '/Users/mac/exifinner/'  -type f   -iname "*.txt")
echo -n "$theFiles" | hexdump -C 
echo "printing \$theFiles without quotes"
echo $theFiles
echo "printing \$theFiles with double quotes"
echo "$theFiles"


echo "leaving $0"

exit 0




results:


mac $ /Users/mac/config/trying-newline-varients.sh
in bash script /Users/mac/config/trying-newline-varients.sh
total 15696
-rw-r--r--  1 mac  staff        0 Mar  3 13:42   blankblank-ending with blank
-rw-r--r--@ 1 mac  staff  2676625 Mar  2 21:49 Mc'Green.dng
-rw-r--r--@ 1 mac  staff  2676625 Mar  2 21:49 \ \&.dng
-rw-r--r--  1 mac  staff        0 Mar 10 12:50 morenormal.txt
-rw-r--r--  1 mac  staff        0 Mar  2 21:46 normal.txt
-rw-r--r--  1 mac  staff        0 Mar 10 12:50 normalist.txt
-rw-r--r--@ 1 mac  staff  2676625 Mar  2 21:48 photocardwithiPhone-copy.dng
7 files found.
start experiment
Five echoes
Need to put in quotes to avoid changing newline to a space
00000000  2f 55 73 65 72 73 2f 6d  61 63 2f 65 78 69 66 69  |/Users/mac/exifi|
00000010  6e 6e 65 72 2f 2f 6d 6f  72 65 6e 6f 72 6d 61 6c  |nner//morenormal|
00000020  2e 74 78 74 0a 2f 55 73  65 72 73 2f 6d 61 63 2f  |.txt./Users/mac/|
00000030  65 78 69 66 69 6e 6e 65  72 2f 2f 6e 6f 72 6d 61  |exifinner//norma|
00000040  6c 2e 74 78 74 0a 2f 55  73 65 72 73 2f 6d 61 63  |l.txt./Users/mac|
00000050  2f 65 78 69 66 69 6e 6e  65 72 2f 2f 6e 6f 72 6d  |/exifinner//norm|
00000060  61 6c 69 73 74 2e 74 78  74                       |alist.txt|
00000069
printing $theFiles without quotes
/Users/mac/exifinner//morenormal.txt /Users/mac/exifinner//normal.txt /Users/mac/exifinner//normalist.txt
printing $theFiles with double quotes
/Users/mac/exifinner//morenormal.txt
/Users/mac/exifinner//normal.txt
/Users/mac/exifinner//normalist.txt
leaving /Users/mac/config/trying-newline-varients.sh
mac $


R

PS. Somehow, writing scripts in Python or AppleScript has a higher startup overhead. I figure this will be a short and quick script, so I'll just do it in Bash. I forget about all of the gotchas.
