DIFF(1POSIX)                POSIX Programmer's Manual               DIFF(1POSIX)



PROLOG
       This manual page is part of the POSIX Programmer's Manual.  The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.


NAME
       diff — compare two files

SYNOPSIS
       diff [−c|−e|−f|−u|−C n|−U n] [−br] file1 file2

DESCRIPTION
       The diff utility shall compare the contents of file1 and file2 and write
       to standard output a list of changes necessary to convert file1 into
       file2.  This list should be minimal. No output shall be produced if the
       files are identical.

OPTIONS
       The diff utility shall conform to the Base Definitions volume of
       POSIX.1‐2008, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       −b        Cause any amount of white space at the end of a line to be
                 treated as a single <newline> (that is, the white-space
                 characters preceding the <newline> are ignored) and other
                 strings of white-space characters, not including <newline>
                 characters, to compare equal.

       −c        Produce output in a form that provides three lines of copied
                 context.

       −C n      Produce output in a form that provides n lines of copied
                 context (where n shall be interpreted as a positive decimal
                 integer).

       −e        Produce output in a form suitable as input for the ed utility,
                 which can then be used to convert file1 into file2.

       −f        Produce output in an alternative form, similar in format to −e,
                 but not intended to be suitable as input for the ed utility,
                 and in the opposite order.

       −r        Apply diff recursively to files and directories of the same
                 name when file1 and file2 are both directories.

                 The diff utility shall detect infinite loops; that is, entering
                 a previously visited directory that is an ancestor of the last
                 file encountered.  When it detects an infinite loop, diff shall
                 write a diagnostic message to standard error and shall either
                 recover its position in the hierarchy or terminate.

       −u        Produce output in a form that provides three lines of unified
                 context.

       −U n      Produce output in a form that provides n lines of unified
                 context (where n shall be interpreted as a non-negative decimal
                 integer).
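
       As a rough illustration of the −b rule described above, the following
       Python sketch reduces a line to a canonical form, so that two lines
       have the same canonical form when −b would treat them as equal. The
       name normalize_b is hypothetical, and Python's notion of white space
       (str.rstrip and \s) is only assumed to approximate the white-space
       class of the current locale.

```python
import re


def normalize_b(line):
    """Return a canonical form of one line for a -b style comparison (sketch)."""
    # White space preceding the <newline> is ignored, so strip it
    # together with the <newline> itself.
    body = line.rstrip()
    # Every other run of white space compares equal to any other run,
    # so collapse each run to a single space.
    return re.sub(r"\s+", " ", body)
```

       Note that leading white space is collapsed, not removed: a line
       starting with a <space> still differs from one that does not.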

OPERANDS
       The following operands shall be supported:

       file1, file2
                 A pathname of a file to be compared. If either the file1 or
                 file2 operand is '−', the standard input shall be used in its
                 place.

       If both file1 and file2 are directories, diff shall not compare block
       special files, character special files, or FIFO special files to any
       files and shall not compare regular files to directories.  Further
       details are as specified in Diff Directory Comparison Format.  The
       behavior of diff on other file types is implementation-defined when found
       in directories.

       If only one of file1 and file2 is a directory, diff shall be applied to
       the non-directory file and the file contained in the directory file with
       a filename that is the same as the last component of the non-directory
       file.

STDIN
       The standard input shall be used only if one of the file1 or file2
       operands references standard input. See the INPUT FILES section.

INPUT FILES
       The input files may be of any type.

ENVIRONMENT VARIABLES
       The following environment variables shall affect the execution of diff:

       LANG      Provide a default value for the internationalization variables
                 that are unset or null. (See the Base Definitions volume of
                 POSIX.1‐2008, Section 8.2, Internationalization Variables for
                 the precedence of internationalization variables used to
                 determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of all
                 the other internationalization variables.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input files).

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard error
                 and informative messages written to standard output.

       LC_TIME   Determine the locale for affecting the format of file
                 timestamps written with the −C and −c options.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

       TZ        Determine the timezone used for calculating file timestamps
                 written with a context format. If TZ is unset or null, an
                 unspecified default timezone shall be used.

ASYNCHRONOUS EVENTS
       Default.

STDOUT
   Diff Directory Comparison Format
       If both file1 and file2 are directories, the following output formats
       shall be used.

       In the POSIX locale, each file that is present in only one directory
       shall be reported using the following format:

           "Only in %s: %s\n", <directory pathname>, <filename>

       In the POSIX locale, subdirectories that are common to the two
       directories may be reported with the following format:

           "Common subdirectories: %s and %s\n", <directory1 pathname>,
               <directory2 pathname>

       For each file common to the two directories, if the two files are not to
       be compared: if the two files have the same device ID and file serial
       number, or are both block special files that refer to the same device, or
       are both character special files that refer to the same device, in the
       POSIX locale the output format is unspecified.  Otherwise, in the POSIX
       locale an unspecified format shall be used that contains the pathnames of
       the two files.

       For each file common to the two directories, if the files are compared
       and are identical, no output shall be written. If the two files differ,
       the following format is written:

           "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

       where <diff_options> are the options as specified on the command line.

       All directory pathnames listed in this section shall be relative to the
       original command line arguments. All other names of files listed in this
       section shall be filenames (pathname components).
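
       The "Only in" format above can be sketched in Python as follows. The
       function name only_in_lines and its parameters are hypothetical, the
       directory listings are passed in as plain collections of filenames,
       and the sorted order is an arbitrary choice, since this page does not
       specify a search order.

```python
def only_in_lines(dir1, names1, dir2, names2):
    """Build "Only in" report lines for two directory listings (sketch)."""
    set1, set2 = set(names1), set(names2)
    lines = []
    for name in sorted(set1 | set2):
        if name not in set2:
            # Present only in the first directory.
            lines.append(f"Only in {dir1}: {name}\n")
        elif name not in set1:
            # Present only in the second directory.
            lines.append(f"Only in {dir2}: {name}\n")
    return lines
```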

   Diff Binary Output Format
       In the POSIX locale, if one or both of the files being compared are not
       text files, it is implementation-defined whether diff uses the binary
       file output format or the other formats as specified below. The binary
       file output format shall contain the pathnames of two files being
       compared and the string "differ".

       If both files being compared are text files, depending on the options
       specified, one of the following formats shall be used to write the
       differences.

   Diff Default Output Format
       The default (without −e, −f, −c, −C, −u, or −U options) diff utility
       output shall contain lines of these forms:

           "%da%d\n", <num1>, <num2>

           "%da%d,%d\n", <num1>, <num2>, <num3>

           "%dd%d\n", <num1>, <num2>

           "%d,%dd%d\n", <num1>, <num2>, <num3>

           "%dc%d\n", <num1>, <num2>

           "%d,%dc%d\n", <num1>, <num2>, <num3>

           "%dc%d,%d\n", <num1>, <num2>, <num3>

           "%d,%dc%d,%d\n", <num1>, <num2>, <num3>, <num4>

       These lines resemble ed subcommands to convert file1 into file2.  The
       line numbers before the action letters shall pertain to file1; those
       after shall pertain to file2.  Thus, by exchanging a for d and reading
       the line in reverse order, one can also determine how to convert file2
       into file1.  As in ed, identical pairs (where num1 = num2) are abbreviated
       as a single number.

       Following each of these lines, diff shall write to standard output all
       lines affected in the first file using the format:

           "< %s", <line>

       and all lines affected in the second file using the format:

           "> %s", <line>

       If there are lines affected in both file1 and file2 (as with the c
       subcommand), the changes are separated with a line consisting of three
       <hyphen> characters:

           "---\n"
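
       The change-command lines of the default format are regular enough to
       be parsed with one pattern. The following Python sketch (the function
       names are hypothetical) parses such a line into its two line ranges
       and action letter, and also applies the reversal rule stated above:
       exchange a for d and swap the two ranges.

```python
import re

# <num>[,<num>] <action letter> <num>[,<num>]
_CHANGE_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_change_command(line):
    """Return ((first1, last1), action, (first2, last2)) for one command line."""
    match = _CHANGE_COMMAND.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"not a default-format change command: {line!r}")
    first1, last1, action, first2, last2 = match.groups()
    # An abbreviated single number stands for an identical pair.
    range1 = (int(first1), int(last1 or first1))
    range2 = (int(first2), int(last2 or first2))
    return range1, action, range2


def reverse_change_command(line):
    """Rewrite a command so that it describes converting file2 into file1."""
    range1, action, range2 = parse_change_command(line)
    swapped = {"a": "d", "d": "a", "c": "c"}[action]

    def fmt(rng):
        return str(rng[0]) if rng[0] == rng[1] else f"{rng[0]},{rng[1]}"

    return f"{fmt(range2)}{swapped}{fmt(range1)}"
```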

   Diff −e Output Format
       With the −e option, a script shall be produced that shall, when provided
       as input to ed, along with an appended w (write) command, convert file1
       into file2.  Only the a (append), c (change), d (delete), i (insert), and
       s (substitute) commands of ed shall be used in this script. Text lines,
       except those consisting of the single character <period> ('.'), shall be
       output as they appear in the file.

   Diff −f Output Format
       With the −f option, an alternative format of script shall be produced. It
       is similar to that produced by −e, with the following differences:

        1. It is expressed in reverse sequence; the output of −e orders changes
           from the end of the file to the beginning, whereas the output of −f
           orders them from beginning to end.

        2. The command form <lines> <command-letter> used by −e is reversed. For
           example, 10c with −e would be c10 with −f.

        3. The form used for ranges of line numbers is <space>-separated, rather
           than <comma>-separated.
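
       Differences 2 and 3 can be illustrated with a small Python sketch
       that rewrites a single −e command line into its −f form. The function
       name e_command_to_f is hypothetical, and the sketch handles only the
       a, c, and d command lines; it does not reorder a whole script
       (difference 1) and does not touch the text lines that follow a
       command.

```python
import re


def e_command_to_f(command):
    """Convert one -e command line such as "10,12c" to -f form (sketch)."""
    match = re.fullmatch(r"(\d+)(?:,(\d+))?([acd])", command)
    if match is None:
        raise ValueError(f"unsupported -e command line: {command!r}")
    first, last, letter = match.groups()
    # Ranges are <space>-separated in -f output rather than <comma>-separated.
    lines = first if last is None else f"{first} {last}"
    # The command letter moves in front of the line numbers.
    return f"{letter}{lines}"
```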

   Diff −c or −C Output Format
       With the −c or −C option, the output format shall consist of affected
       lines along with surrounding lines of context. The affected lines shall
       show which ones need to be deleted or changed in file1, and those added
       from file2.  With the −c option, three lines of context, if available,
       shall be written before and after the affected lines. With the −C option,
       the user can specify how many lines of context are written.  The exact
       format follows.

       The name and last modification time of each file shall be output in the
       following format:

           "*** %s %s\n", file1, <file1 timestamp>
           "--- %s %s\n", file2, <file2 timestamp>

       Each <file> field shall be the pathname of the corresponding file being
       compared. The pathname written for standard input is unspecified.

       In the POSIX locale, each <timestamp> field shall be equivalent to the
       output from the following command:

           date "+%a %b %e %T %Y"

       without the trailing <newline>, executed at the time of last modification
       of the corresponding file (or the current time, if the file is standard
       input).

       Then, the following output formats shall be applied for every set of
       changes.

       First, a line shall be written in the following format:

           "***************\n"

       Next, the range of lines in file1 shall be written in the following
       format if the range contains two or more lines:

           "*** %d,%d ****\n", <beginning line number>, <ending line number>

       and the following format otherwise:

           "*** %d ****\n", <ending line number>

       The ending line number of an empty range shall be the number of the
       preceding line, or 0 if the range is at the start of the file.
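
       The choice between the two range-line formats can be sketched in
       Python as follows. The function name context_range_line is
       hypothetical; the marker argument is "*" for the file1 range
       described here and "-" for the corresponding file2 range, which uses
       the same rule. An empty range is assumed to be passed as
       end = begin − 1.

```python
def context_range_line(marker, begin, end):
    """Format the range line of one context-format hunk (sketch)."""
    lead, trail = marker * 3, marker * 4
    if end - begin + 1 >= 2:
        # Two or more lines: write both the beginning and ending numbers.
        return f"{lead} {begin},{end} {trail}\n"
    # One line, or an empty range: write only the ending line number.
    return f"{lead} {end} {trail}\n"
```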

       Next, the affected lines along with lines of context (unaffected lines)
       shall be written. Unaffected lines shall be written in the following
       format:

           "  %s", <unaffected_line>

       Deleted lines shall be written as:

           "- %s", <deleted_line>

       Changed lines shall be written as:

           "! %s", <changed_line>

       Next, the range of lines in file2 shall be written in the following
       format if the range contains two or more lines:

           "--- %d,%d ----\n", <beginning line number>, <ending line number>

       and the following format otherwise:

           "--- %d ----\n", <ending line number>

       Then, lines of context and changed lines shall be written as described in
       the previous formats. Lines added from file2 shall be written in the
       following format:

           "+ %s", <added_line>

   Diff −u or −U Output Format
       The −u or −U options behave like the −c or −C options, except that the
       context lines are not repeated; instead, the context, deleted, and added
       lines are shown together, interleaved.  The exact format follows.

       The name and last modification time of each file shall be output in the
       following format:

           "--- %s\t%s%s %s\n", file1, <file1 timestamp>, <file1 frac>, <file1 zone>
           "+++ %s\t%s%s %s\n", file2, <file2 timestamp>, <file2 frac>, <file2 zone>

       Each <file> field shall be the pathname of the corresponding file being
       compared, or the single character '−' if standard input is being
       compared. However, if the pathname contains a <tab> or a <newline>, or if
       it does not consist entirely of characters taken from the portable
       character set, the behavior is implementation-defined.

       Each <timestamp> field shall be equivalent to the output from the
       following command:

           date '+%Y-%m-%d %H:%M:%S'

       without the trailing <newline>, executed at the time of last modification
       of the corresponding file (or the current time, if the file is standard
       input).

       Each <frac> field shall be either empty, or a decimal point followed by
       at least one decimal digit, indicating the fractional-seconds part (if
       any) of the file timestamp. The number of fractional digits shall be at
       least the number needed to represent the file's timestamp without loss of
       information.

       Each <zone> field shall be of the form "shhmm", where "shh" is a signed
       two-digit decimal number in the range −24 through +25, and "mm" is an
       unsigned two-digit decimal number in the range 00 through 59.  It
       represents the timezone of the timestamp as the number of hours (hh) and
       minutes (mm) east (+) or west (−) of UTC for the timestamp.  If the hours
       and minutes are both zero, the sign shall be '+'.  However, if the
       timezone is not an integral number of minutes away from UTC, the <zone>
       field is implementation-defined.
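
       The following Python sketch assembles one such header line from a
       timezone-aware datetime value. It assumes that a <tab> separates the
       pathname from the timestamp, that microsecond precision is enough to
       represent the file's timestamp, and that the timezone offset is a
       whole number of minutes; the function name unified_header is
       hypothetical.

```python
from datetime import datetime, timedelta, timezone


def unified_header(prefix, path, mtime):
    """Format a "---" or "+++" header line of the unified format (sketch)."""
    stamp = mtime.strftime("%Y-%m-%d %H:%M:%S")
    # <frac> is either empty or a decimal point followed by digits.
    frac = f".{mtime.microsecond:06d}" if mtime.microsecond else ""
    # For an aware datetime with a whole-minute offset, %z yields "shhmm",
    # and a zero offset is written as "+0000".
    zone = mtime.strftime("%z")
    return f"{prefix} {path}\t{stamp}{frac} {zone}\n"
```

       For example, a modification time of 23:30:39.942229 on 2002-02-21 in
       a zone eight hours west of UTC yields the fields
       "2002-02-21 23:30:39", ".942229", and "-0800".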

       Then, the following output formats shall be applied for every set of
       changes.

       First, the range of lines in each file shall be written in the following
       format:

           "@@ -%s +%s @@", <file1 range>, <file2 range>

       Each <range> field shall be of the form:

           "%1d", <beginning line number>

       if the range contains exactly one line, and:

           "%1d,%1d", <beginning line number>, <number of lines>

       otherwise. If a range is empty, its beginning line number shall be the
       number of the line just before the range, or 0 if the empty range starts
       the file.
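
       The <range> rules can be sketched in Python as follows. The function
       names are hypothetical; first_line is the 1-based number of the first
       line in the range, or of the line where an empty range would be
       inserted.

```python
def unified_range(first_line, count):
    """Format one <range> field of a unified hunk header (sketch)."""
    if count == 1:
        # Exactly one line: the line count is omitted.
        return f"{first_line}"
    if count == 0:
        # Empty range: report the line just before it (0 at start of file).
        return f"{first_line - 1},0"
    return f"{first_line},{count}"


def hunk_header(first1, count1, first2, count2):
    """Format the "@@ ... @@" line for one set of changes (sketch)."""
    return f"@@ -{unified_range(first1, count1)} +{unified_range(first2, count2)} @@"
```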

       Next, the affected lines along with lines of context shall be written.
       Each non-empty unaffected line shall be written in the following format:

           " %s", <unaffected_line>

       where the contents of the unaffected line shall be taken from file1.  It
       is implementation-defined whether an empty unaffected line is written as
       an empty line or a line containing a single <space> character. This line
       also represents the same line of file2, even though file2's line may
       contain different contents due to the −b option.  Deleted lines shall be
       written as:

           "-%s", <deleted_line>

       Added lines shall be written as:

           "+%s", <added_line>

       The order of lines written shall be the same as that of the corresponding
       file. A deleted line shall never be written immediately after an added
       line.

       If −U n is specified, the output shall contain no more than n consecutive
       unaffected lines; and if the output contains an affected line and this
       line is adjacent to up to n consecutive unaffected lines in the
       corresponding file, the output shall contain these unaffected lines.  −u
       shall act like −U3.
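
       The effect of the context count can be tried out with Python's
       difflib module. difflib is not an implementation of this utility, and
       its output is not guaranteed to match in every detail, but the n
       argument of difflib.unified_diff plays the same role as the n of
       −U n. The wrapper name hunks_with_context is hypothetical.

```python
import difflib


def hunks_with_context(old_lines, new_lines, n):
    """Return unified-format hunk lines with n lines of context (sketch)."""
    diff = difflib.unified_diff(old_lines, new_lines, n=n, lineterm="")
    # Drop the two file header lines ("---" and "+++"); keep the hunks.
    return list(diff)[2:]
```

       With old_lines = list("abcdefg"), the fourth line changed to "D", and
       n = 1, only the lines "c" and "e" surround the change as context.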

STDERR
       The standard error shall be used only for diagnostic messages.

OUTPUT FILES
       None.

EXTENDED DESCRIPTION
       None.

EXIT STATUS
       The following exit values shall be returned:

        0    No differences were found.

        1    Differences were found.

       >1    An error occurred.
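
       A caller that only needs to know whether two files differ can rely on
       these exit values alone. The following Python sketch shows one way to
       do so; the function names are hypothetical, and files_differ assumes
       that a diff utility is available on PATH.

```python
import subprocess


def classify_exit_status(status):
    """Map a diff exit value to "identical", "different", or "error"."""
    if status == 0:
        return "identical"
    if status == 1:
        return "different"
    # Values greater than 1 indicate an error; anything else unexpected
    # (such as termination by a signal) is treated the same way here.
    return "error"


def files_differ(path1, path2):
    """Run diff on two files and return True if differences were found."""
    result = subprocess.run(
        ["diff", path1, path2],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    outcome = classify_exit_status(result.returncode)
    if outcome == "error":
        raise RuntimeError(f"diff failed with status {result.returncode}")
    return outcome == "different"
```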

CONSEQUENCES OF ERRORS
       Default.

       The following sections are informative.

APPLICATION USAGE
       If lines at the end of a file are changed and other lines are added, diff
       output may show this as a delete and add, as a change, or as a change and
       add; diff is not expected to know which happened and users should not
       care about the difference in output as long as it clearly shows the
       differences between the files.

EXAMPLES
       If dir1 is a directory containing a directory named x, dir2 is a
       directory containing a directory named x, dir1/x and dir2/x both contain
       files named date.out, and dir2/x contains a file named y, the command:

           diff -r dir1 dir2

       could produce output similar to:

           Common subdirectories: dir1/x and dir2/x
           Only in dir2/x: y
           diff -r dir1/x/date.out dir2/x/date.out
           1c1
           < Mon Jul  2 13:12:16 PDT 1990
           ---
           > Tue Jun 19 21:41:39 PDT 1990

RATIONALE
       The −h option was omitted because it was insufficiently specified and
       does not add to applications portability.

       Historical implementations employ algorithms that do not always produce a
       minimum list of differences; the current language about making every
       effort is the best this volume of POSIX.1‐2008 can do, as there is no
       metric that could be employed to judge the quality of implementations
       against any and all file contents. The statement ``This list should be
       minimal'' clearly implies that implementations are not expected to
       provide the following output when comparing two 100-line files that
       differ in only one character on a single line:

           1,100c1,100
           all 100 lines from file1 preceded with "< "
           ---
           all 100 lines from file2 preceded with "> "

       The ``Only in'' messages required when the −r option is specified are not
       used by most historical implementations if the −e option is also
       specified. They are required here because they provide useful
       information that is needed to update a target directory hierarchy to
       match a source hierarchy. The ``Common subdirectories'' messages are
       written by System V and 4.3 BSD when the −r option is specified. They
       are allowed here but are not required because they report on something
       that is the same, not a difference, and are not needed to update a
       target hierarchy.

       The −c option, which writes output in a format using lines of context,
       has been included. The format is useful for a variety of reasons, among
       them being much improved readability and the ability to understand
       difference changes when the target file has line numbers that differ from
       another similar, but slightly different, copy. The patch utility is most
       valuable when working with difference listings using a context format.
       The BSD version of −c takes an optional argument specifying the amount of
       context. Rather than overloading −c and breaking the Utility Syntax
       Guidelines for diff, the standard developers decided to add a separate
       option for specifying a context diff with a specified amount of context
       (−C).  Also, the format for context diffs was extended slightly in 4.3
       BSD to allow multiple changes that are within context lines from each
       other to be merged together. The output format contains an additional
       four <asterisk> characters after the range of affected lines in the first
       filename. This was to provide a flag for old programs (like old versions
       of patch) that only understand the old context format. The version of
       context described here does not require that multiple changes within
       context lines be merged, but it does not prohibit it either. The
       extension is upwards-compatible, so any vendors that wish to retain the
       old version of diff can do so by adding the extra four <asterisk>
       characters (that is, utilities that currently use diff and understand the
       new merged format will also understand the old unmerged format, but not
       vice versa).

       The −u and −U options of GNU diff have been included. Their output
       format, designed by Wayne Davison, takes up less space than −c and −C
       format, and in many cases is easier to read. The format's timestamps do
       not vary by locale, so LC_TIME does not affect it. The format's line
       numbers are rendered with the %1d format, not %d, because the file format
       notation rules would allow extra <blank> characters to appear around the
       numbers.

       The substitute command was added as an additional format for the −e
       option. This was added to provide implementations with a way to fix the
       classic ``dot alone on a line'' bug present in many versions of diff.
       Since many implementations have fixed this bug, the standard developers
       decided not to standardize broken behavior, but rather to provide the
       necessary tool for fixing the bug. One way to fix this bug is to output
       two periods whenever a lone period is needed, then terminate the append
       command with a period, and then use the substitute command to convert the
       two periods into one period.
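
       The workaround described above can be sketched in Python as follows.
       This is one possible realization, not a required one: the function
       name ed_append_commands is hypothetical, new_lines holds the text
       lines without their <newline> characters, and after_line is the ed
       address after which the text is appended.

```python
def ed_append_commands(after_line, new_lines):
    """Return ed input lines that append new_lines after after_line (sketch)."""
    script = []
    in_append = False
    first = True
    for text in new_lines:
        if not in_append:
            # Later append commands need no address: they continue
            # after the current line.
            script.append(f"{after_line}a" if first else "a")
            first = False
            in_append = True
        if text == ".":
            script.append("..")     # two periods stand in for the lone period
            script.append(".")      # terminate the append command
            script.append("s/.//")  # remove one period from the current line
            in_append = False
        else:
            script.append(text)
    if in_append:
        script.append(".")          # terminate the final append command
    return script
```

       For the lines "x", ".", and "y" appended after line 3, this yields
       the script lines 3a, x, .., ., s/.//, a, y, and a closing period.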

       The BSD-derived −r option was added to provide a mechanism for using diff
       to compare two file system trees. This behavior is useful, is standard
       practice on all BSD-derived systems, and is not easily reproducible with
       the find utility.

       The requirement that diff not compare files in some circumstances, even
       though they have the same name, is based on the actual output of
       historical implementations.  The specified behavior precludes the
       problems arising from running into FIFOs and other files that would cause
       diff to hang waiting for input with no indication to the user that diff
       was hung. An earlier version of this standard specified the output format
       more precisely, but in practice this requirement was widely ignored and
       the benefit of standardization seemed small, so it is now unspecified. In
       most common usage, diff −r should indicate differences in the file
       hierarchies, not the difference of contents of devices pointed to by the
       hierarchies.

       Many early implementations of diff require seekable files. Since the
       System Interfaces volume of POSIX.1‐2008 supports named pipes, the
       standard developers decided that such a restriction was unreasonable.
       Note also that the allowed filename '−' almost always refers to a pipe.

       No directory search order is specified for diff.  The historical ordering
       is, in fact, not optimal, in that it prints out all of the differences at
       the current level, including the statements about all common
       subdirectories before recursing into those subdirectories.

       The message:

           "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

       does not vary by locale because it is the representation of a command,
       not an English sentence.

FUTURE DIRECTIONS
       None.

SEE ALSO
       cmp, comm, ed, find

       The Base Definitions volume of POSIX.1‐2008, Chapter 8, Environment
       Variables, Section 12.2, Utility Syntax Guidelines

COPYRIGHT
       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, Copyright (C) 2013 by the Institute of Electrical
       and Electronics Engineers, Inc and The Open Group.  (This is POSIX.1-2008
       with the 2013 Technical Corrigendum 1 applied.) In the event of any
       discrepancy between this version and the original IEEE and The Open Group
       Standard, the original IEEE and The Open Group Standard is the referee
       document. The original Standard can be obtained online at
       http://www.unix.org/online.html .

       Any typographical or formatting errors that appear in this page are most
       likely to have been introduced during the conversion of the source files
       to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .



IEEE/The Open Group                   2013                          DIFF(1POSIX)