Hi Dwight,
Where are you copying the text from? I would guess that the hard returns have been inserted by a text editor app at some point in the process. If you are copying from a text editor, you may be able to find a setting that will toggle between having and not having hard returns at the end of each line. If so, set it to 'off' before doing the select and Copy.
Is there any way to distinguish these extra returns from the legitimate ones by examining the characters that immediately precede and follow them?
Returns at the end of a paragraph should immediately follow a punctuation mark (period, question mark, or exclamation mark) and should be immediately followed by the first character of the next paragraph.
End-of-line returns should not follow one of these marks, except where the mark (and the space after it) falls so close to the maximum line length that the following word will not fit on the same line.
If you are lucky, and the document was produced by someone not wise to (or not trusting of) the ability of software to add space after a paragraph, you may find that two consecutive returns have been used (as I have in this message) to create that space.
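Before choosing a method, it may help to count which kinds of returns the text actually contains. Here is a minimal Python sketch of that check. The function name survey_returns and the category names are my own invention for illustration, and it assumes the returns are plain "\n" characters (convert "\r\n" or "\r" first if they are not).

```python
def survey_returns(text: str, marks: str = ".?!") -> dict:
    """Count returns by the characters immediately around them."""
    counts = {"double": 0, "after_mark": 0, "other": 0}
    for i, ch in enumerate(text):
        if ch != "\n":
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if prev == "\n" or nxt == "\n":
            # Part of a run of consecutive returns (blank line between paragraphs).
            counts["double"] += 1
        elif prev and prev in marks:
            # Single return directly after an end mark: probably a paragraph end.
            counts["after_mark"] += 1
        else:
            # Single return in mid-sentence: probably an unwanted line wrap.
            counts["other"] += 1
    return counts
```

If "double" is non-zero, the double-return approach is probably the safer one; if it is zero, you are likely stuck with the punctuation approach.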
Knowledge of these patterns can give you a three-pass method for bulk removal of those extra returns using Find/Replace.
On the first pass, MARK the returns you want to keep.
- If double returns have been used, use two returns as your Find string, and two identical markers as the Replace string. Each marker should be a pair of characters that do not appear, or do not appear in pairs, in the document. "##" comes to mind.
- If you need to rely on punctuation, you'll need to do a separate pass for each punctuation mark that has been used at the end of a paragraph. For each, the Find string will be the mark followed by a return, and the Replace string will be the mark followed by ##.
That first pass will replace all of the returns you want to keep with "##" (preceded in the second method by the punctuation mark that is also to be retained).
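For anyone who would rather script it, the first pass might look something like this in Python. This is only a sketch: the function names are made up for illustration, "##" is assumed not to occur anywhere in the text, and returns are assumed to be plain "\n" characters.

```python
MARKER = "##"  # assumption: this pair does not occur in the document

def mark_double_returns(text: str) -> str:
    # Each of the two returns becomes one marker, so "\n\n" turns into "####".
    return text.replace("\n\n", MARKER * 2)

def mark_returns_after_punctuation(text: str, marks: str = ".?!") -> str:
    # One replace per end mark; the mark itself is kept in front of the marker.
    for mark in marks:
        text = text.replace(mark + "\n", mark + MARKER)
    return text
```

Use one function or the other, depending on which pattern the document follows. The returns still left in the text afterwards are the ones to remove.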
On the second pass, remove all of the returns that are left in the document. Find string: a return. Replace string: one space.
NOTE: This assumes that the end of line return has replaced the space that would normally occur between words at this point. If you have determined that there is an end of line return AND a space between the word at the end of one line and the word at the beginning of the next line, leave the Replace box empty.
CAUTIOUS route: Include the space, then see the fourth pass below.
On the third pass, re-insert the required returns. Find: ## Replace with: return
Note that if your first pass searched for and retained a punctuation mark, you do NOT need to restore this—it was kept during the first pass.
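Put together, the three passes can be sketched in Python as follows. Treat it as an illustration under assumptions rather than a finished tool: the name unwrap and its parameters are hypothetical, returns are assumed to be plain "\n", and the function refuses to run if the "##" marker already appears in the text. Pass joiner="" for the case where a space already sits next to each end-of-line return.

```python
MARKER = "##"  # assumption: not present in the document

def unwrap(text: str, double_returns: bool = True,
           marks: str = ".?!", joiner: str = " ") -> str:
    if MARKER in text:
        raise ValueError("marker already occurs in the text; pick another")
    # Pass 1: mark the returns to keep.
    if double_returns:
        text = text.replace("\n\n", MARKER * 2)
    else:
        for mark in marks:
            text = text.replace(mark + "\n", mark + MARKER)
    # Pass 2: every return still present is an unwanted end-of-line return.
    text = text.replace("\n", joiner)
    # Pass 3: turn each marker back into a return.
    return text.replace(MARKER, "\n")
```

For example, unwrap("one two\nthree.\n\nfour\nfive.") gives back "one two three.\n\nfour five.", with the blank line between paragraphs intact.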
Optional fourth pass: One of the things I taught students when we started using word processing in the elementary grades was to never press the space bar twice in a row. You need one space between words, no spaces between a word and the punctuation mark after the word, and one space between the end mark of a sentence and the first word of the next sentence in the same paragraph. This Find/Replace pass removes extra spaces using Find: two spaces Replace with: one space.
Repeat this pass as many times as necessary (until Find/Replace reports 'none found' or a similar message).
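In a script, the "repeat until none found" part is just a loop. A minimal Python sketch (the function name collapse_spaces is my own):

```python
def collapse_spaces(text: str) -> str:
    # Keep replacing two spaces with one until no double space remains.
    while "  " in text:
        text = text.replace("  ", " ")
    return text
```

A single replace is not enough because a run of four spaces only shrinks to two on the first go, which is why the pass has to be repeated.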
Regards,
Barry
PS: This method can probably also be written into an AppleScript, which would cut down on the necessary steps and simplify the whole thing.
B