iTunes has two modes for displaying duplicates. The first is accessed via File > Display Duplicates and lists all tracks in the current playlist where the same track title occurs more than once. This is, more often than not, a rather loose definition of the term and can bring back obviously different songs, e.g. studio and live versions of the same song, the same song recorded by different artists, or by the same artist but on different albums, or completely different songs that have nothing in common but their title.
A more useful feature can be found if you hold down the SHIFT key and then select File > Display Exact Duplicates. This gives a list of all tracks where more than one track has essentially the same metadata: name, track no., artist, album, etc. (I'm not sure what the exact criteria are for an exact duplicate as far as iTunes is concerned, but it's good enough for most needs.)
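To illustrate the difference between the two modes, here is a minimal Python sketch that groups tracks on a configurable set of fields. It is not how iTunes works internally: the dictionary layout, the field names and the choice of `EXACT_FIELDS` are all assumptions made for the example, since the real criteria for an exact duplicate aren't documented here.

```python
from collections import defaultdict

# Hypothetical field sets: the loose mode matches on title only, the
# "exact" mode on a guess at the metadata iTunes might compare.
LOOSE_FIELDS = ("name",)
EXACT_FIELDS = ("name", "artist", "album", "track_number")


def find_duplicates(tracks, fields=LOOSE_FIELDS):
    """Return groups of tracks whose chosen fields all match.

    Each track is a plain dict. Values are compared as trimmed,
    lower-cased strings so trivial differences don't hide a match.
    """
    groups = defaultdict(list)
    for track in tracks:
        key = tuple(str(track.get(f, "")).strip().lower() for f in fields)
        groups[key].append(track)
    # Only keys shared by more than one track count as duplicates.
    return [group for group in groups.values() if len(group) > 1]
```

Calling `find_duplicates(tracks)` mimics the loose title-only listing, while `find_duplicates(tracks, EXACT_FIELDS)` narrows it to tracks that agree on every listed field.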
There are three types of files that can show up as exact duplicates:
- Multiple library entries for the same physical file
These can occur if your media location is temporarily unavailable, e.g. due to a disconnected external drive. iTunes notices and marks the tracks as missing; you then reconnect the drive and add the media again. iTunes won't recognize that the files are already in the library and will create duplicate entries.
- Multiple copies of the same physical file
This can happen if you have "Copy to iTunes media folder when adding to library" enabled and accidentally add the same folders from outside the iTunes Media folder to your library more than once.
- Different copies of the same track, in different formats or bit rates
This can happen if, on a later occasion, you rip a CD that is already in your library or, for example, create MP3 versions of some existing AAC files and then don't remove the originals.
Note you could also have files which have the same audio data but different metadata, e.g. the same album ripped twice but with a slightly altered album title. However, there isn't an obvious way to highlight these within iTunes.
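The three types listed above can, in principle, be told apart from the file paths and file contents alone. The following Python sketch shows one way to do it, under the assumption that copies of the same file are byte-for-byte identical (which may not hold if the tags of one copy were edited afterwards). The function names are made up for the example, and a mixed group is reported under the highest-numbered type that applies.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_group(paths):
    """Classify the file paths behind one group of exact duplicates.

    Returns 1 if every entry points at the same file, 2 if there are
    several files with identical contents, and 3 if the contents differ
    (e.g. different formats or bit rates).
    """
    # Normalize so that two spellings of one path compare equal.
    unique_paths = {os.path.normcase(os.path.abspath(p)) for p in paths}
    if len(unique_paths) == 1:
        return 1
    digests = {file_digest(p) for p in unique_paths}
    return 2 if len(digests) == 1 else 3
```

Hashing every file is slow on a large library, so it makes sense to run this only on the groups that a duplicate listing has already narrowed down.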
Once you have used the Display Exact Duplicates feature you can set about deleting all but one copy of each duplicate file. If all the duplicates have recently been added to your library then you could sort the list on the date added column, select a range of tracks with SHIFT-click and delete all the recent entries. If the duplicates have been added to the library at various times then sort the list by album or track name, select all but one of each group of matching tracks with CTRL-click and delete the selection. It's probably best to work one screenful at a time.
Of course the issue isn't quite as simple as it seems. Number the three types above 1 to 3 in the order listed. If the duplicates are of type 1 then you want to delete the tracks from the library, but not delete the underlying files or send them to the recycle bin, as there is only one copy of each file referenced by the duplicate entries. For tracks of type 2 you do want to delete the duplicate physical copies, leaving one remaining version. For files of type 3 you probably want to decide which of the two or more copies to keep: the smallest to take up less room, the largest because it is the best quality, the MP3 version because it is the most portable, or more than one because each has its uses for you.
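For type 3 duplicates, the choice of which copy to keep could be expressed as a simple policy. This Python sketch is only an illustration: the `size` and `path` keys and the policy names are assumptions, and "largest" is used as a stand-in for "best quality", which is a rule of thumb rather than a guarantee.

```python
def choose_keeper(copies, policy="largest"):
    """Pick the copy to keep from a group of differing copies.

    Each copy is a dict with a file "path" and a "size" in bytes.
    Policies: "smallest" saves space, "largest" is a rough proxy for
    quality, "mp3" prefers the most portable format.
    """
    if policy == "smallest":
        return min(copies, key=lambda c: c["size"])
    if policy == "largest":
        return max(copies, key=lambda c: c["size"])
    if policy == "mp3":
        mp3s = [c for c in copies if c["path"].lower().endswith(".mp3")]
        # Fall back to the largest copy when no MP3 is present.
        return max(mp3s or copies, key=lambda c: c["size"])
    raise ValueError(f"unknown policy: {policy!r}")
```

Keeping more than one copy, because each has its uses, is the case no single policy covers, so a real tool would still need to ask.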
By now you should have some idea of why deduping is not a trivial issue. Of the three types I have described it turns out that cleaning up type 1 duplicates (multiple library entries for the same physical file) is quite easy to automate. I've written a script called DeDuper which can go through a selection of tracks, spot those with the same path and remove the redundant entries. As a bonus it retains the entry with the earliest date added value, adds in the play & skip counts from the deleted entries and sets the last played/skipped dates to their most recent values. Although the script would work on the whole library it will be much more efficient to use the iTunes Display Exact Duplicates tool before running the script. You should also back up your library or, at the very least, the iTunes database file iTunes Library.itl before running the script. Chances are you have type 2 duplicates (multiple copies of the same physical file) and the script is of no use to you in its present form...
I hope to extend the script over the next few days to cope with type 2 and type 3 duplicates, ideally making the distinction between the two types, handling type 2 automatically (hopefully moving the removed files to somewhere they could still be restored from if required) and prompting the user to choose which file to keep in the case of type 3. Again metadata would be merged. Hmm, just realized I need to do something about ratings too.
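The merge rules described above are easy to show in isolation. What follows is a Python sketch of those rules only, not the DeDuper script itself; the dictionary keys (`path`, `date_added`, `play_count`, `skip_count`, `last_played`, `last_skipped`) are assumed names chosen for the example.

```python
from collections import defaultdict


def latest(values):
    """Return the most recent non-empty value, or None if all are empty."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


def merge_same_path(entries):
    """Collapse library entries that point at the same file path.

    For each path the entry with the earliest date added is kept, play
    and skip counts are summed over the whole group, and the last
    played/skipped dates take their most recent values.
    """
    by_path = defaultdict(list)
    for entry in entries:
        by_path[entry["path"]].append(entry)

    merged = []
    for group in by_path.values():
        # Copy the oldest entry so the input list is left untouched.
        keeper = dict(min(group, key=lambda e: e["date_added"]))
        keeper["play_count"] = sum(e.get("play_count", 0) for e in group)
        keeper["skip_count"] = sum(e.get("skip_count", 0) for e in group)
        keeper["last_played"] = latest(e.get("last_played") for e in group)
        keeper["last_skipped"] = latest(e.get("last_skipped") for e in group)
        merged.append(keeper)
    return merged
```

The sketch returns new entries rather than deleting anything, so the result can be checked before any change is made to the real library.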