9 Replies Latest reply: Dec 29, 2007 10:54 PM by j.v.
Eric S Level 1 (80 points)
Is there a way to generate an md5 sum for an entire directory (as opposed to a single file)? The md5 command only seems to accept files.

Thanks in advance.

Eric

MacBook Pro, Mac OS X (10.5.1)
  • 1. Re: md5 of a directory
    etresoft Level 7 (24,270 points)
    Try "md5 directory/*" or
    "find directory -type f -exec md5 {} \;"
  • 2. Re: md5 of a directory
    Cole Tierney Level 4 (1,375 points)
    Something like this should work:

    cat somedir/* | md5

    --
    Cole
  • 3. Re: md5 of a directory
    Eric S Level 1 (80 points)
    Thanks for the good idea. The "cat somedir/* | md5" command comes close, but it skips all the subdirectories of the main directory. There doesn't seem to be an option to have cat act recursively.

    Any other ideas?

    e
  • 4. Re: md5 of a directory
    j.v. Level 5 (4,150 points)
    How about tar -cf test.tar someDir && md5 test.tar && rm test.tar?

    The tar part creates a "tarball" (Tape ARchive) of the directory "someDir" and all its files and subdirectories, recursively. md5 then checksums the tarball, and the trailing rm is an optional step that deletes the tar file once the checksum has been run on it.

    If disk space is an issue while creating the temporary tarball, modify the command to read
    tar -czf test.tgz someDir && md5 test.tgz && rm test.tgz instead. This creates a compressed tarball to run the checksum against and, as before, optionally deletes it afterwards. This second way takes a bit longer to execute, but the disk space savings during the lifetime of the tarball can be quite significant. You won't save any space with already-compressed or encrypted files such as ".mov" or ".gpg", but you sure will with text-type files like .doc, .txt, or .rtf.
  • 5. Re: md5 of a directory
    Cole Tierney Level 4 (1,375 points)
    This seems to work:

    find somedir -type f -exec cat {} \; | md5

    --
    Cole
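One caveat with the find-and-cat pipeline above: find makes no promise about the order in which it lists files, so the same tree could be concatenated in a different order on another filesystem and give a different checksum. Below is a minimal sketch that sorts the paths first. It assumes file names contain no newlines; the helper names md5_stdin and dir_md5 are made up for illustration, and the md5sum fallback is only there for systems that lack the md5 command.

```shell
# Hypothetical helper: print the MD5 of whatever arrives on stdin.
# Uses md5 (macOS/BSD) when present, otherwise falls back to md5sum.
md5_stdin() {
    if command -v md5 >/dev/null 2>&1; then
        md5 -q
    else
        md5sum | cut -d ' ' -f 1
    fi
}

# Hypothetical helper: one checksum over the contents of every regular
# file under the directory given as $1.
# Assumption: no file name contains a newline.
dir_md5() {
    # Sort the paths bytewise so the concatenation order is stable.
    find "$1" -type f | LC_ALL=C sort | while IFS= read -r f; do
        cat "$f"
    done | md5_stdin
}
```

Running dir_md5 somedir prints a single checksum. It covers file contents only, so a rename that leaves the sort order unchanged would go unnoticed.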
  • 6. Re: md5 of a directory
    j.v. Level 5 (4,150 points)
    So, out of the three methods, it appears to me that Etresoft's "find" method gives you an MD5 checksum for each and every file nested in the directory of interest. Not sure if that's what you were looking for or not. My (j.v.) "nozip" tar method took 53 secs to spit out a single md5 checksum value for a 210,032,640-byte directory (my ~/Library) on an iBook G4/800, but it required a temporary allocation of free disk space the size of what I was tarring/md5ing. Cole's method on that same directory (find ~/Library -type f -exec cat {} \; | md5) didn't require allocating any temporary disk space, and took 96 secs to spit out a single checksum value. I didn't time Etresoft's or my "tar&zip" methods. So there you have it -- three methods to do the same thing, each with its own advantages and disadvantages. Use whichever fits your situation.
  • 7. Re: md5 of a directory
    etresoft Level 7 (24,270 points)
    I like your tar method. I think you are just so used to dealing with tarballs that you forgot an important plot point. You don't need the "f" option. You can just do "tar c <dir> | md5"! You don't need a temporary file at all.
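The streaming form can be sketched as a small function. This is only a sketch under a few assumptions: tar_md5 and md5_stdin are hypothetical names, the md5sum fallback is for systems that lack md5, and -f - is spelled out so the archive goes to standard output regardless of what a given tar build uses as its default archive. Because a tar stream also records metadata such as modification times and ownership, two directories with identical file contents can still produce different checksums.

```shell
# Hypothetical helper: print the MD5 of whatever arrives on stdin.
# Uses md5 (macOS/BSD) when present, otherwise falls back to md5sum.
md5_stdin() {
    if command -v md5 >/dev/null 2>&1; then
        md5 -q
    else
        md5sum | cut -d ' ' -f 1
    fi
}

# Hypothetical helper: checksum a directory by streaming a tar archive
# of it straight into md5, with no temporary file on disk.
tar_md5() {
    # -C switches to the parent directory first, so the archive stores
    # only the directory's own name rather than the full path.
    tar -cf - -C "$(dirname "$1")" "$(basename "$1")" | md5_stdin
}
```

For example, tar_md5 ~/Library prints one checksum for the whole tree.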
  • 8. Re: md5 of a directory
    Eric S Level 1 (80 points)
    Thank you to everyone for your creative efforts. I have to admit, tar c <dir> | md5 looks pretty much like what I was hoping for.

    e
  • 9. Re: md5 of a directory
    j.v. Level 5 (4,150 points)
    Thanks for the compliment, Etresoft. And I'm glad to see the O.P. liked your take on my tar method, minus the "-f".