10 Replies Latest reply: Jan 2, 2013 7:25 AM by SadLion
SadLion Level 1 (0 points)

I recently upgraded to Mountain Lion (MacBook Pro Mid 2010, 4 GB RAM, 2.66 GHz). The upgrade went fine and the system is running well in general.

However, I previously used grep (with -x or -w) in Terminal to take a file of patterns and pull matching lines out of another file, as in 'grep -F -x -f patternfile basefile > outputfile'. Both files are admittedly very large (20,000 - 30,000 rows). Under Snow Leopard, this would take about a minute. Under Mountain Lion, it is taking hours! I mean literally hours!

Has anybody else had this problem? Is there any way to fix it? I handle huge files like this all the time and I'd rather downgrade to Snow Leopard than have grep behave like this.

PS. I know grep has been changed and the -P option removed, but I don't use that option so surely that shouldn't affect me.


MacBook Pro, OS X Mountain Lion (10.8.2)
  • 1. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    VikingOSX Level 5 (5,500 points)

    Are either or both files compressed?

     

    The grep(1) man page indicates that fgrep is faster than grep or egrep on fixed patterns. Try the following to see if your elapsed time improves.

    Code:
    fgrep -x -f patternfile basefile > outputfile

  • 2. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    VikingOSX Level 5 (5,500 points)

    fgrep -x -f patternfile basefile > outfile

  • 3. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    SadLion Level 1 (0 points)

    No, neither file is compressed, and I'm afraid fgrep has made no significant difference: it is still running 2 hours later, whereas grep under Snow Leopard took less than a minute to do the same job.

  • 4. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    VikingOSX Level 5 (5,500 points)

    Arrrggh.

     

    Is it possible to reorder the pattern file so that the patterns run from highest to lowest probability of matching the basefile?

     

    Mountain Lion has shed GNU grep (Snow Leopard) for FreeBSD grep due to licensing whims outside of Apple. The Perl-Regex (-P) code that made the Snow Leopard grep so fast was the sacrificial lamb. Even though you stated you were not explicitly using the -P option in Snow Leopard, it may have been implicitly used in the pattern matching process anyway.

     

    I have spent some time now with Google search attempting to find any optimization techniques for FreeBSD grep, and so far have found mostly rhetoric. It is possible to download the source to GNU grep and build it locally with the Xcode command line tools, but it also has some prerequisites and poor build documentation that make this a migraine.

     

    If you have language skills, perhaps you can write something in Ruby, Python, or Perl that allows you to create a faster pattern matching solution than offered by FreeBSD grep.

     

    Good luck.
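
As a rough illustration of the scripting route suggested above, here is a minimal Python sketch of what 'grep -F -x -f patternfile basefile' does: keep every line of the base file that is exactly equal to some line of the pattern file. The function names `match_whole_lines` and `filter_file` are made up for this example, and the sketch assumes both files fit in memory and use plain "\n" line endings. Loading the patterns into a set means each base line is checked with one hash lookup rather than against every pattern in turn.

```python
def match_whole_lines(pattern_lines, base_lines):
    """Return the base lines that exactly equal some pattern line.

    Mirrors fixed-string, whole-line matching (grep -F -x -f).
    Both arguments are iterables of strings, e.g. open file objects.
    """
    # Strip only the trailing newline so all other characters
    # (quotes, dots, colons) are compared literally.
    patterns = {p.rstrip("\n") for p in pattern_lines}
    return [line for line in base_lines if line.rstrip("\n") in patterns]


def filter_file(pattern_path, base_path, output_path):
    """Hypothetical helper: read two files, write the matching lines to a third."""
    with open(pattern_path) as pf, open(base_path) as bf, \
            open(output_path, "w") as out:
        out.writelines(match_whole_lines(pf, bf))
```

Calling `filter_file("patternfile", "basefile", "outputfile")` would then play the role of the original command. Files with Windows-style line endings would need the "\r" stripped as well.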

  • 5. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    Linc Davis Level 10 (118,110 points)

    Test in safe mode. Same?

  • 6. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    SadLion Level 1 (0 points)

    No, I'm afraid there is no way of structuring the pattern file, as no patterns are intrinsically more likely to match than others (it's biological data).

     

    Interestingly the pattern file is generated by running awk on a different file also ~25,000 rows in length and it works just fine in well under a minute.  I'm therefore inclined to agree that the new free grep is the problem - it's obviously rubbish.

     

    I have very basic knowledge of Terminal.  Is there a way I can use awk to do the job of grep?  I think I tried at the beginning and ran into problems because the patterns to be matched are so odd.  Here's an example of a typical pattern:

     

    "NOC2L:uc001aby.4:exon8:c.T573C:p.T191T

     

    (Note that the first quotation mark is part of the pattern to be matched)

     

    I would need awk to take each pattern in turn from a file with around 20,000 patterns as weird as the one above and search for it in a separate csv file. If it found that pattern (the whole pattern, that is, not part of it) in any field of the file, it should print the whole row to a new file. Unlike before, the pattern doesn't have to be the only thing in the field; there may be more text before or after it.  Does anyone know how to do that?

     

    Old grep did do that job easily in under a minute.  I have a new found respect for whoever wrote old grep.
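
The request above amounts to literal substring matching: print every row of the csv file that contains at least one whole pattern anywhere in it. A minimal Python sketch follows, offered as one possible approach rather than a tuned solution. The name `rows_containing_any` is invented for this example. Because the patterns are compared as plain strings, the leading quotation mark, dots, and colons need no escaping. The sketch tests every pattern against every row, so it is not optimized for very large inputs.

```python
def rows_containing_any(patterns, rows):
    """Return the rows that contain at least one pattern as a literal substring.

    patterns and rows are iterables of strings, e.g. open file objects.
    The whole pattern must appear in the row; text before or after is allowed.
    """
    # Drop trailing newlines and skip empty patterns, which would
    # otherwise match every row.
    wanted = [p for p in (raw.rstrip("\n") for raw in patterns) if p]
    return [row for row in rows if any(p in row for p in wanted)]
```

With open files this could be used as `out.writelines(rows_containing_any(pattern_file, csv_file))`, where `pattern_file`, `csv_file`, and `out` are file objects opened by the caller.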

  • 7. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    VikingOSX Level 5 (5,500 points)

    Short of downloading the latest stable GNU grep, compiling it, and installing it in /usr/local/bin so it doesn't step on the FreeBSD grep, there probably is no liveable grep solution for you on Mountain Lion. Depending on coding skill and energy, you may or may not have an AWK or other regular expression solution working before the next paragraph.

     

    I just revisited 10.7.5 and it has GNU grep. If you had originally purchased Lion, then you could redownload it on Mountain Lion (but not install it), burn it to a USB stick, and then perform a clean 10.7.5 install on an external USB drive. Then, reboot from the 10.7.5 drive, and let fly with GNU grep as before. Based on the amount of time the FreeBSD grep is costing you, the Lion download and installation would probably finish first.

     

    Thoughts?

  • 8. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    codinglamp Level 1 (0 points)

    I had the same problem on Mountain Lion.  Using grep -v -f patterns-file on my 24MB test data file took >10s.  I then used the grep from my Lion partition and it took only 0.2s!

  • 9. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    vogelw Level 1 (10 points)

    Here is my solution: using MacPorts, https://www.macports.org, install the GNU version of grep.

     

    $ sw_vers -productVersion
    10.8.2

    $ sudo port install grep

    $ which grep
    /opt/local/bin/grep

    $ grep --version
    grep (GNU grep) 2.14
    Copyright (C) 2012 Free Software Foundation, Inc.

  • 10. Re: Grep running extremely slow on Mountain Lion.  What has happened to it?
    SadLion Level 1 (0 points)

    Thanks, Vogelw.  That worked perfectly and was relatively easy.