JDUPES(1)                   General Commands Manual                  JDUPES(1)



NAME
       jdupes - finds and performs actions upon duplicate files

SYNOPSIS
       jdupes [ options ] FILES and/or DIRECTORIES ...


DESCRIPTION
       Searches the given path(s) for duplicate files. Such files are found by
       comparing file sizes, then partial and full file hashes, followed by a
       byte-by-byte comparison. The default behavior with no other "action
       options" specified (delete, summarize, link, dedupe, etc.) is to print
       sets of matching files.
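
       The staged comparison described above can be sketched in Python. This
       is an illustration only, not the jdupes implementation: SHA-256 is a
       stand-in hash, the 4096-byte partial window is taken from the CAVEATS
       section, and the names find_duplicates, _digest, _same_bytes and _split
       are hypothetical.

```python
import hashlib
import os
from collections import defaultdict

PARTIAL_BYTES = 4096  # assumed partial-hash window (see the -T caveat)


def _digest(path, limit=None):
    """Hash the whole file, or only its first `limit` bytes."""
    h = hashlib.sha256()  # stand-in hash, not the one jdupes uses
    with open(path, "rb") as f:
        if limit is not None:
            h.update(f.read(limit))
        else:
            for block in iter(lambda: f.read(65536), b""):
                h.update(block)
    return h.digest()


def _same_bytes(a, b):
    """Final byte-by-byte confirmation of two candidate files."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            x, y = fa.read(65536), fb.read(65536)
            if x != y:
                return False
            if not x:
                return True


def _split(groups, key):
    """Re-bucket every group by `key`, dropping buckets with one file."""
    out = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        out.extend(g for g in buckets.values() if len(g) > 1)
    return out


def find_duplicates(paths):
    # Zero-length files are skipped, mirroring the default behavior.
    groups = [[p for p in paths if os.path.getsize(p) > 0]]
    groups = _split(groups, os.path.getsize)                      # 1. size
    groups = _split(groups, lambda p: _digest(p, PARTIAL_BYTES))  # 2. partial
    groups = _split(groups, _digest)                              # 3. full hash
    confirmed = []
    for group in groups:                                          # 4. bytes
        remaining = list(group)
        while remaining:
            first, rest = remaining[0], remaining[1:]
            same = [first] + [p for p in rest if _same_bytes(first, p)]
            if len(same) > 1:
                confirmed.append(sorted(same))
            remaining = [p for p in rest if p not in same]
    return confirmed
```

       Each stage only narrows the candidate sets left by the previous one, so
       the expensive full reads happen only for files that already agree on
       size and on their first block.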


OPTIONS
       -@ --loud
              output annoying low-level debug info while running

       -0 --printnull
              when printing matches, use null bytes instead of CR/LF bytes,
              just like 'find -print0' does. This has no effect with any
              action mode other than the default "print matches" (delete,
              link, etc. will still print normal line endings in the output.)

       -1 --one-file-system
              do not match files that are on different filesystems or devices

       -A --nohidden
              exclude hidden files from consideration

       -B --dedupe
              issue the btrfs same-extents ioctl to trigger a deduplication on
              disk. The program must be built with btrfs support for this
              option to be available

       -C --chunksize=BYTES
              set the I/O chunk size manually; larger values may improve
              performance on rotating media by reducing the number of head
              seeks required, but they also increase memory usage and can
              reduce performance in some cases

       -D --debug
              if this feature is compiled in, show debugging statistics and
              info at the end of program execution

       -d --delete
              prompt user for files to preserve, deleting all others (see
              CAVEATS below)

       -f --omitfirst
              omit the first file in each set of matches

       -H --hardlinks
              normally, when two or more files point to the same disk area
              they are treated as non-duplicates; with this option they are
              treated as duplicates instead

       -h --help
              displays help

       -i --reverse
              reverse (invert) the sort order of matches

       -I --isolate
              isolate each command-line parameter from one another; only match
              if the files are under different parameter specifications

       -L --linkhard
              replace all duplicate files with hardlinks to the first file in
              each set of duplicates

       -m --summarize
              summarize duplicate file information

       -M --printwithsummary
              print matches and summarize the duplicate file information at
              the end

       -N --noprompt
              when used together with --delete, preserve the first file in
              each set of duplicates and delete the others without prompting
              the user

       -n --noempty
              exclude zero-length files from consideration; this option is the
              default behavior and does nothing (also see -z/--zeromatch)

       -O --paramorder
              parameter order preservation is more important than the chosen
              sort; this is particularly useful with the -N option to ensure
              that automatic deletion behaves in a controllable way

       -o --order=WORD
              order files according to WORD:

              time - sort by modification time
              name - sort by filename (default)

       -p --permissions
              don't consider files with different owner/group or permission
              bits as duplicates

       -P --print=type
              print extra information to stdout; valid options are:

              early - matches that pass early size/permission/link/etc. checks
              partial - files whose partial hashes match
              fullhash - files whose full hashes match

       -Q --quick
              [WARNING: RISK OF DATA LOSS, SEE CAVEATS] skip byte-for-byte
              verification of duplicate pairs (use hashes only)

       -q --quiet
              hide progress indicator

       -R --recurse:
              for each directory given after this option, follow
              subdirectories encountered within (note the ':' at the end of
              the option; see the EXAMPLES section below for further
              explanation)

       -r --recurse
              for every directory given follow subdirectories encountered
              within

       -l --linksoft
              replace all duplicate files with symlinks to the first file in
              each set of duplicates

       -S --size
              show size of duplicate files

       -s --symlinks
              follow symlinked directories

       -T --partial-only
              [WARNING: EXTREME RISK OF DATA LOSS, SEE CAVEATS] match based on
              hash of first block of file data, ignoring the rest

       -v --version
              display jdupes version and compilation feature flags

       -x --xsize=[+]SIZE (NOTE: deprecated in favor of -X)
              exclude files of size less than SIZE from consideration, or if
              SIZE is prefixed with a '+' (e.g. jdupes -x +226 [files]) then
              exclude files larger than SIZE. Suffixes K/M/G can be used.

       -X --exclude=spec:info
              exclude files based on specified criteria; supported specs are:

              `size[+-=]:number[suffix]'
                     Match only if size is greater (+), less than (-), or
                     equal to (=) the specified number, with an optional
                     multiplier suffix. The +/- and = specifiers can be
                     combined; ex: "size+=:4K" will match if size is greater
                     than or equal to four kilobytes (4096 bytes). Suffixes
                     supported are K/M/G/T/P/E with a B or iB extension (all
                     case-insensitive); no extension or an iB extension
                     specifies binary multipliers while a B extension
                     specifies decimal multipliers (ex: 4K or 4KiB = 4096,
                     4KB = 4000).

       -z --zeromatch
              consider zero-length files to be duplicates; this replaces the
              old default behavior when -n was not specified

       -Z --softabort
              if the user aborts the program (as with CTRL-C) act on the
              matches that were found before the abort was received. For
              example, if -L and -Z are specified, all matches found prior to
              the abort will be hard linked. The default behavior without -Z
              is to abort without taking any actions.
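
       The size specification accepted by -X above can be illustrated with a
       short Python sketch. It is not the jdupes parser: parse_size and
       size_matches are hypothetical helpers, and rejecting a bare "B"
       extension without a K/M/G/T/P/E prefix is an assumption made here for
       simplicity.

```python
import re

# Exponent applied to the base (1024 or 1000) for each suffix letter.
_EXPONENT = {"K": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}


def parse_size(text):
    """Convert '4K', '4KiB' or '4KB' to a byte count (case-insensitive)."""
    m = re.fullmatch(r"(\d+)(?:([KMGTPE])(IB|B)?)?", text.strip().upper())
    if m is None:
        raise ValueError("unrecognized size: %r" % text)
    number, prefix, ext = m.groups()
    if prefix is None:
        return int(number)
    # No extension or iB means binary; a plain B extension means decimal.
    base = 1000 if ext == "B" else 1024
    return int(number) * base ** _EXPONENT[prefix]


def size_matches(spec, size):
    """Evaluate a 'size[+-=]:number[suffix]' spec against a file size."""
    m = re.fullmatch(r"size([+=-]+):(\S+)", spec)
    if m is None:
        raise ValueError("unrecognized spec: %r" % spec)
    ops, limit = m.group(1), parse_size(m.group(2))
    return (
        ("+" in ops and size > limit)
        or ("-" in ops and size < limit)
        or ("=" in ops and size == limit)
    )
```

       With this sketch, parse_size("4K") and parse_size("4KiB") both give
       4096 while parse_size("4KB") gives 4000, matching the figures in the
       -X description.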


NOTES
       A set of arrows is used in hard linking to show what action was taken
       on each link candidate. These arrows are as follows:


       ---->  This file was successfully hard linked to the first file in the
              duplicate chain

       -@@->  This file was successfully symlinked to the first file in the
              chain

       -==->  This file was already a hard link to the first file in the chain

       -//->  Linking this file failed due to an error during the linking
              process


       Duplicate files are listed together in groups with each file displayed
       on a separate line. The groups are then separated from each other by
       blank lines.
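
       A script that consumes this output can rebuild the groups by splitting
       on the separator and treating empty entries as group boundaries. The
       Python sketch below uses a hypothetical helper, parse_groups. Passing
       sep="\0" for output produced with -0 assumes that the null-delimited
       form keeps the same layout with null bytes in place of line endings,
       which this page does not spell out.

```python
def parse_groups(output, sep="\n"):
    """Split match output into a list of groups of file names."""
    groups, current = [], []
    for entry in output.split(sep):
        if entry:
            current.append(entry)      # another file in the current group
        elif current:
            groups.append(current)     # empty entry closes the group
            current = []
    if current:                        # output without a trailing separator
        groups.append(current)
    return groups
```

       File names that contain newlines cannot be recovered from the default
       output, which is the usual reason for preferring a null-delimited
       form in scripts.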


EXAMPLES
       jdupes a --recurse: b
              will follow subdirectories under b, but not those under a.

       jdupes a --recurse b
              will follow subdirectories under both a and b.

       jdupes -O dir1 dir3 dir2
              will always place 'dir1' results first in any match set (where
              relevant)


CAVEATS
       Using -1 or --one-file-system prevents matches that cross filesystems,
       but a more relaxed form of this option may be added that allows cross-
       matching for all filesystems that each parameter is present on.

       When using -d or --delete, care should be taken to guard against
       accidental data loss.

       -Z or --softabort used to be --hardabort in jdupes prior to v1.5 and
       had the opposite behavior.  Defaulting to taking action on abort is
       probably not what most users would expect. The decision to invert
       rather than reassign to a different option was made because this
       feature was still fairly new at the time of the change.

       The -O or --paramorder option allows the user greater control over what
       appears in the first position of a match set, specifically for keeping
       the -N option from deleting all but one file in a set in a seemingly
       random way. All directories specified on the command line will be used
       as the sorting order of result sets first, followed by the sorting
       algorithm set by the -o or --order option. This means that the order of
       all match pairs for a single directory specification will retain the
       old sorting behavior even if this option is specified.
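
       The two-level ordering can be pictured as a sort on a compound key:
       the position of the command-line parameter first, then the normal sort.
       The Python sketch below is an illustration under assumptions, not the
       jdupes code: order_match_set is a hypothetical helper, paths are
       matched to parameters by simple prefix, and a plain string comparison
       stands in for the name sort.

```python
def order_match_set(paths, params, param_order=False):
    """Order one match set, optionally honoring command-line order."""

    def param_index(path):
        # Position of the first parameter that contains this path.
        for i, root in enumerate(params):
            if path == root or path.startswith(root.rstrip("/") + "/"):
                return i
        return len(params)

    if param_order:
        # Parameter position wins; the name sort only breaks ties.
        return sorted(paths, key=lambda p: (param_index(p), p))
    return sorted(paths)
```

       For the command line "jdupes -O dir1 dir3 dir2" from the EXAMPLES
       section, this key keeps every dir1 entry ahead of dir3 and dir3 ahead
       of dir2, while entries under the same parameter stay in name order.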

       When -d or --delete is used together with -s or --symlinks, a user
       could accidentally preserve a symlink while deleting the file it
       points to.

       The -Q or --quick option only reads each file once, hashes it, and
       performs comparisons based solely on the hashes. There is a small but
       significant risk of a hash collision; the failsafe byte-for-byte
       comparison that this option explicitly bypasses exists to guard against
       exactly that. Do not use it on ANY data set for which any amount of
       data loss is unacceptable. This option is not included in the help
       text for the program due to its risky nature. You have been warned!

       The -T or --partial-only option produces results based on a hash of the
       first block of file data in each file, ignoring everything else in the
       file. Partial hash checks have always been an important exclusion step
       in the jdupes algorithm, usually hashing the first 4096 bytes of data
       and allowing files that are different at the start to be rejected
       early. In certain scenarios it may be a useful heuristic for a user to
       see that a set of files has the same size and the same starting data,
       even if the remaining data does not match; one example of this would be
       comparing files with data blocks that are damaged or missing such as an
       incomplete file transfer or checking a data recovery against known-good
       copies to see what damaged data can be deleted in favor of restoring
       the known-good copy. This option is meant to be used with informational
       actions and can result in EXTREME DATA LOSS if used with options that
       delete files, create hard links, or perform other destructive actions
       on data based on the matching output. Because of the potential for
       massive data destruction, this option MUST BE SPECIFIED TWICE to take
       effect and will error out if it is only specified once.

       Using the -C or --chunksize option to override I/O chunk size can
       increase performance on rotating storage media by reducing "head
       thrashing," reading larger amounts of data sequentially from each file.
       This tunable size can have bad side effects; the default size maximizes
       algorithmic performance without regard to the I/O characteristics of
       any given device and uses a modest amount of memory, but other values
       may greatly increase memory usage or incur a lot more system call
       overhead. Try several different values to see how they affect
       performance for your hardware and data set. This option does not affect
       match results in any way, so even if it slows down the file matching
       process it will not hurt anything.
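
       The claim that chunk size changes only the cost of reading, never the
       result, can be checked with a small Python sketch. It hashes the same
       data with two chunk sizes and counts the reads; hash_stream is a
       hypothetical helper and SHA-256 is a stand-in for whatever hash jdupes
       uses.

```python
import hashlib
import io


def hash_stream(stream, chunk_size):
    """Hash a binary stream in chunks; return (hex digest, read count)."""
    h = hashlib.sha256()  # stand-in hash for illustration
    reads = 0
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        h.update(block)
        reads += 1
    return h.hexdigest(), reads


data = bytes(range(256)) * 1000  # 256000 bytes of sample data
small = hash_stream(io.BytesIO(data), 4096)
large = hash_stream(io.BytesIO(data), 1 << 20)
```

       Both calls produce the same digest; only the number of reads differs,
       which is where the trade-off between system call overhead and memory
       usage comes from.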


REPORTING BUGS
       Send all bug reports to jody@jodybruchon.com or use the Issue tracker
       at http://github.com/jbruchon/jdupes/issues


AUTHOR
       jdupes is a fork of 'fdupes'; it is maintained by and contains extra
       code copyrighted by Jody Bruchon <jody@jodybruchon.com>

       Based on 'fdupes' created by Adrian Lopez <adrian2@caribe.net>



                                                                     JDUPES(1)