fdupes

From Gentoo Wiki

FDUPES is a tool for identifying duplicate files across a set of directories. It scans the specified directories, compares file sizes and MD5 signatures to find candidate matches, and then confirms each match with a byte-by-byte comparison. It can work in tandem with duperemove, a deduplication tool that is not limited to btrfs.
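The checksum step can be illustrated with standard tools. The sketch below is not how fdupes is implemented; it is a rough approximation that assumes GNU findutils and coreutils, and the function name list_md5_groups is hypothetical. It skips the size check and the final byte-by-byte comparison, and file names containing newlines or backslashes are not handled.

```shell
# Hypothetical helper (not part of fdupes): group files by MD5 checksum.
# Assumes GNU find, xargs, md5sum, sort and uniq.
list_md5_groups() {
    # Checksum every regular file below the given directories,
    # sort by checksum, then keep only lines whose first 32
    # characters (the MD5 hex digest) occur more than once.
    find "$@" -type f -print0 |
        xargs -0 -r md5sum |
        sort |
        uniq -w32 --all-repeated=separate
}
```

Called as list_md5_groups /path/to/dir/one /path/to/dir/two, it prints groups of matching files separated by blank lines. Two files from the same group can then be checked byte by byte with cmp -s, which exits with status 0 when the files are identical.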

Installation

Emerge

root #emerge --ask app-misc/fdupes

Configuration

FDUPES has no configuration options other than optional command-line parameters.

Usage

Invocation

user $fdupes --help

Find duplicate files recursively

To find duplicate files recursively in the target directories, use the following command. The --size option also shows the size of the duplicate files:

user $fdupes --recurse --size /path/to/dir/one /path/to/dir/two
Note
Permission problems may occur when some files are owned by root or another user; in this case, run the fdupes command again with appropriate privileges.

When many files are involved, it is usually better to redirect the output of the fdupes command to a file:

user $fdupes --recurse --size /path/to/dir/one /path/to/dir/two > /tmp/fdupes_file_list.txt

Saving the list to a file is especially useful when a large number of files are being compared. It is much easier to search a long file list in a text editor than to scroll back through a terminal buffer.
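A saved list can also be summarized from the shell. The following is a minimal sketch, assuming the list uses fdupes' usual layout in which sets of duplicates are separated by blank lines; the function name count_duplicate_sets is hypothetical.

```shell
# Hypothetical helper: count the duplicate sets in a saved fdupes listing.
# Assumes sets are separated by blank lines.
count_duplicate_sets() {
    # RS = "" puts awk in paragraph mode, so each blank-line-separated
    # block is one record; NR in END is the number of blocks.
    awk 'BEGIN { RS = "" } END { print NR }' "$1"
}
```

For the file created above, it would be called as count_duplicate_sets /tmp/fdupes_file_list.txt.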

Find and delete files recursively

Run one of the commands above before running the command in this section, and verify that the output is as expected. The file listed first in each set of duplicates is the one that is kept, so confirm the order in that output. Once the output is satisfactory, the following command can be used to delete all but the first file in each set. List the directories in order of precedence so that the correct files are preserved. For example, to keep all the files in the home directory, list the home directory last so that its files show up first in each set.

The following command uses the --noprompt (-N) and --delete (-d) options to delete all but the first file in each set of duplicates without prompting the user. It scans the directories again rather than reading a previously saved file list:

user $fdupes --noprompt --delete --recurse /path/to/dir/one /path/to/dir/two
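Before deleting anything, a saved list such as /tmp/fdupes_file_list.txt can be used to preview which files are not first in their set. The sketch below is an illustration only, and the function name list_all_but_first is hypothetical. It assumes blank-line-separated sets and, when --size was used, a header line of the form "N bytes each:" at the top of each set. The list is a snapshot, so it only predicts the result if the delete command is run on the same directories, in the same order, and nothing has changed in between.

```shell
# Hypothetical helper: from a saved fdupes listing, print every file
# except the first one in each set (the candidates for deletion).
# Assumes blank-line-separated sets and an optional "N bytes each:"
# header line per set, as written when --size is used.
list_all_but_first() {
    awk 'BEGIN { RS = ""; FS = "\n" }
    {
        start = 1
        # Skip the size header if the set begins with one.
        if ($1 ~ /^[0-9]+ bytes? each:$/) start = 2
        # Field "start" is the first file; print the ones after it.
        for (i = start + 1; i <= NF; i++) print $i
    }' "$1"
}
```

Running list_all_but_first /tmp/fdupes_file_list.txt prints one path per line, which can be reviewed in a pager or text editor.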

Removal

Unmerge

No special files need to be removed. Uninstall FDUPES via:

root #emerge --ask --depclean --verbose app-misc/fdupes

See also