Regex Greedy
Mac mini, OS X Yosemite (10.10.5), Fall 2014; iPhone 4 7.1.2
“Greedy” quantifiers (e.g. .*, .+) match as much as possible while still letting the rest of the pattern match, and “lazy” quantifiers (e.g. .*?, .+?) match as little as possible. More in this Stack Overflow article.
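A small sketch of the difference, written in Python rather than sed or BBEdit (an assumption made only because Python's `re` module supports both greedy and lazy quantifiers). The sample string is borrowed from the file name that appears later in the thread, and `[0-9]` stands in for `[[:digit:]]`, which Python does not support.

```python
import re

text = "Tax - 0002.jpg"

# Greedy: .* grabs as much as it can, then gives back only what the
# rest of the pattern needs (here, a single digit).
greedy = re.match(r"(.*)([0-9]+)", text)

# Lazy: .*? grabs as little as it can, growing only when the rest
# of the pattern cannot match yet.
lazy = re.match(r"(.*?)([0-9]+)", text)

print(greedy.groups())  # ('Tax - 000', '2')
print(lazy.groups())    # ('Tax - ', '0002')
```

Both patterns match the same overall text; only the split between the two groups changes.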
You can also visualize your regex and captures with online feedback tools:
And here is a regex learning site. There are different dialects of regex, so keep that in mind.
I'm still a little confused about this.
(.*?)[[:digit:]][[:digit:]][[:digit:]]*(.*)
Why is [[:digit:]][[:digit:]][[:digit:]] being forced to the right? It seems these are acting greedy. I like to think of (.*) as getting the stuff to the left of [[:digit:]][[:digit:]][[:digit:]]* and the trailing (.*) as getting the stuff to the right. I think of .* as affecting stuff to its left and not affecting stuff to its right, but I am seeing .* affect stuff on both sides. I would have thought the * in [[:digit:]][[:digit:]][[:digit:]]* said to maximize the number of 9s. I guess (.*) takes precedence over the stuff to the right of it.
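The behavior described here can be reproduced with a short sketch. It assumes a backtracking engine (Python's `re` here; BBEdit and sed may differ in details), uses `[0-9]` in place of `[[:digit:]]`, and adds an extra pair of parentheses around the digit run purely to show what that part matched. The pattern is processed left to right, so a leading greedy group gets first pick and the later `*` only gets what is left over.

```python
import re

name = "Tax - 0002.jpg"  # sample input taken from the thread

# Leading group greedy: it swallows everything, then backs off just far
# enough for the two required digits to match. The digit run is pushed right.
greedy_first = re.match(r"(.*)([0-9][0-9][0-9]*)(.*)", name)

# Leading group lazy: it stops at the first place the digits can start,
# so the trailing [0-9]* is free to take the whole run.
lazy_first = re.match(r"(.*?)([0-9][0-9][0-9]*)(.*)", name)

print(greedy_first.groups())  # ('Tax - 00', '02', '.jpg')
print(lazy_first.groups())    # ('Tax - ', '0002', '.jpg')
```

So the `*` on the digits is still greedy in both cases; with a greedy group in front of it, there is simply nothing left for it to take.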
R
I meant "the bigger picture". Are you trying to rename files or something? You mentioned that you were going to use sed. BBEdit is OK, but sed might behave much differently.
I think your pattern has too many wildcards. What happens if you have two sets of matching digits? I am a bit concerned about the "[[:digit:]]*". Although I do a lot of Perl, I still find regex really tricky.
And your captures at the beginning and end seem superfluous. If you just want to replace any sequence of 2 or more digits with "444", you could just search for "[0-9]{2,}" and replace with "444". There are many ways to do the same thing. There isn't necessarily one right way. But it does seem premature to be going into greedy vs. non-greedy searching for something like this.
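A rough illustration of the simpler search-and-replace suggested above, again in Python as a stand-in for BBEdit or sed. The helper name `replace_digit_runs` and the second sample file name are hypothetical, made up here to show what happens when a name contains two sets of matching digits.

```python
import re

def replace_digit_runs(s, first_only=False):
    """Replace runs of 2 or more digits with "444".

    first_only=True replaces only the first run, similar to a sed
    substitution without the g flag.
    """
    count = 1 if first_only else 0  # count=0 means "replace all" in re.sub
    return re.sub(r"[0-9]{2,}", "444", s, count=count)

print(replace_digit_runs("Tax - 0002.jpg"))
# Tax - 444.jpg
print(replace_digit_runs("Tax 7 - 0002 of 15.jpg"))
# Tax 7 - 444 of 444.jpg  (the single digit 7 is left alone)
print(replace_digit_runs("Tax 7 - 0002 of 15.jpg", first_only=True))
# Tax 7 - 444 of 15.jpg
```

No capture groups or wildcards are needed, because the text around the digits is never part of the match.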
Thanks.
Ended up with this:
set toUnix to "echo " & quotedDropped & " | sed s/[[:digit:]][[:digit:]]*/#\\ " & pageCount & "/"
log "toUnix is " & toUnix
set fromUnix to do shell script toUnix
log "sed output is " & fromUnix

(*quotedDropped is 'Macintosh HD:Users:mac:Desktop:Tax - 0002.jpg'*)
(*toUnix is echo 'Macintosh HD:Users:mac:Desktop:Tax - 0002.jpg' | sed s/[[:digit:]][[:digit:]]*/#\ 3/*)
(*sed output is Macintosh HD:Users:mac:Desktop:Tax - # 3.jpg*)

What exactly are you trying to do with this expression?