file

FILE(1P)                   POSIX Programmer's Manual                  FILE(1P)



PROLOG
       This manual page is part of the POSIX Programmer's Manual.  The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.


NAME
       file — determine file type

SYNOPSIS
       file [−dh] [−M file] [−m file] file...

       file −i [−h] file...

DESCRIPTION
       The file utility shall perform a series of tests in sequence on each
       specified file in an attempt to classify it:

        1. If file does not exist, cannot be read, or its file status could
           not be determined, the output shall indicate that the file was
           processed, but that its type could not be determined.

        2. If the file is not a regular file, its file type shall be
           identified.  The file types directory, FIFO, socket, block special,
           and character special shall be identified as such. Other
           implementation-defined file types may also be identified. If file
           is a symbolic link, by default the link shall be resolved and file
           shall test the type of file referenced by the symbolic link. (See
           the −h and −i options below.)

        3. If the length of file is zero, it shall be identified as an empty
           file.

        4. The file utility shall examine an initial segment of file and shall
           make a guess at identifying its contents based on position-
           sensitive tests. (The answer is not guaranteed to be correct; see
           the −d, −M, and −m options below.)

        5. The file utility shall examine file and make a guess at identifying
           its contents based on context-sensitive default system tests. (The
           answer is not guaranteed to be correct.)

        6. The file shall be identified as a data file.

       If file does not exist, cannot be read, or its file status could not be
       determined, the output shall indicate that the file was processed, but
       that its type could not be determined.

       If file is a symbolic link, by default the link shall be resolved and
       file shall test the type of file referenced by the symbolic link.

OPTIONS
       The file utility shall conform to the Base Definitions volume of
       POSIX.1‐2008, Section 12.2, Utility Syntax Guidelines, except that the
       order of the −m, −d, and −M options shall be significant.

       The following options shall be supported by the implementation:

       −d        Apply any position-sensitive default system tests and
                 context-sensitive default system tests to the file. This is
                 the default if no −M or −m option is specified.

       −h        When a symbolic link is encountered, identify the file as a
                 symbolic link. If −h is not specified and file is a symbolic
                 link that refers to a nonexistent file, file shall identify
                 the file as a symbolic link, as if −h had been specified.

       −i        If a file is a regular file, do not attempt to classify the
                 type of the file further, but identify the file as specified
                 in the STDOUT section.

       −M file   Specify the name of a file containing position-sensitive
                 tests that shall be applied to a file in order to classify it
                 (see the EXTENDED DESCRIPTION). No position-sensitive default
                 system tests nor context-sensitive default system tests shall
                 be applied unless the −d option is also specified.

       −m file   Specify the name of a file containing position-sensitive
                 tests that shall be applied to a file in order to classify it
                 (see the EXTENDED DESCRIPTION).

       If the −m option is specified without specifying the −d option or the
       −M option, position-sensitive default system tests shall be applied
       after the position-sensitive tests specified by the −m option. If the
       −M option is specified with the −d option, the −m option, or both, or
       the −m option is specified with the −d option, the concatenation of the
       position-sensitive tests specified by these options shall be applied in
       the order specified by the appearance of these options. If a −M or −m
       file option-argument is , the results are unspecified.

OPERANDS
       The following operand shall be supported:

       file      A pathname of a file to be tested.

STDIN
       The standard input shall be used if a file operand is '−' and the
       implementation treats the '−' as meaning standard input.  Otherwise,
       the standard input shall not be used.

INPUT FILES
       The file can be any file type.

ENVIRONMENT VARIABLES
       The following environment variables shall affect the execution of file:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1‐2008, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files).

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard error
                 and informative messages written to standard output.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

ASYNCHRONOUS EVENTS
       Default.

STDOUT
       In the POSIX locale, the following format shall be used to identify
       each operand, file specified:

           "%s: %s\n", <file>, <type>

       The values for <type> are unspecified, except that in the POSIX locale,
       if file is identified as one of the types listed in the following
       table, <type> shall contain (but is not limited to) the corresponding
       string, unless the file is identified by a position-sensitive test
       specified by a −M or −m option. Each <space> shown in the strings shall
       be exactly one <space>.

                       Table 4-9: File Utility Output Strings

───────┬─────────────────────────────────────────────┬──────────────────────────────────┬─      │
       │         If file is:                    <type> shall contain the string:   Notes│       │
───────┼─────────────────────────────────────────────┼──────────────────────────────────┼─      │
 Nonexistent                                    cannot open                             │       │
       │                                             │                                  │       │
       │Block special                                │ block special                    │ 1     │
       │Character special                            │ character special                │ 1     │
       │Directory                                    │ directory                        │ 1     │
       │FIFO                                         │ fifo                             │ 1     │
       │Socket                                       │ socket                           │ 1     │
       │Symbolic link                                │ symbolic link to                 │ 1     │
       │Regular file                                 │ regular file                     │ 1,2   │
       │Empty regular file                           │ empty                            │ 3     │
       │Regular file that cannot be read             │ cannot open                      │ 3     │
       │                                             │                                  │       │
       │Executable binary                            │ executable                       │ 3,4,6 │
       │ar archive library (see ar)                  │ archive                          │ 3,4,6 │
       │Extended cpio format (see pax)               │ cpio archive                     │ 3,4,6 │
       │Extended tar format (see ustar in pax)       │ tar archive                      │ 3,4,6 │
       │                                             │                                  │       │
       │Shell script                                 │ commands text                    │ 3,5,6 │
       │C-language source                            │ c program text                   │ 3,5,6 │
       │FORTRAN source                               │ fortran program text             │ 3,5,6 │
       │                                             │                                  │       │
       │Regular file whose type cannot be determined │ data                             │ 3     │
       └─────────────────────────────────────────────┴──────────────────────────────────┴───────┘
       Notes:

                  1. This is a file type test.

                  2. This test is applied only if the −i option is specified.

                  3. This test is applied only if the −i option is not
                     specified.

                  4. This is a position-sensitive default system test.

                  5. This is a context-sensitive default system test.

                  6. Position-sensitive default system tests and context-
                     sensitive default system tests are not applied if the −M
                     option is specified unless the −d option is also
                     specified.

       In the POSIX locale, if file is identified as a symbolic link (see the
       −h option), the following alternative output format shall be used:

           "%s: %s %s\n", <file>, <type>, <contents of link>"

       If the file named by the file operand does not exist, cannot be read,
       or the type of the file named by the file operand cannot be determined,
       this shall not be considered an error that affects the exit status.

STDERR
       The standard error shall be used only for diagnostic messages.

OUTPUT FILES
       None.

EXTENDED DESCRIPTION
       A file specified as an option-argument to the −m or −M options shall
       contain one position-sensitive test per line, which shall be applied to
       the file. If the test succeeds, the message field of the line shall be
       printed and no further tests shall be applied, with the exception that
       tests on immediately following lines beginning with a single '>'
       character shall be applied.

       Each line shall be composed of the following four <tab>-separated
       fields. (Implementations may allow any combination of one or more
       white-space characters other than <newline> to act as field
       separators.)

       offset    An unsigned number (optionally preceded by a single '>'
                 character) specifying the offset, in bytes, of the value in
                 the file that is to be compared against the value field of
                 the line. If the file is shorter than the specified offset,
                 the test shall fail.

                 If the offset begins with the character '>', the test
                 contained in the line shall not be applied to the file unless
                 the test on the last line for which the offset did not begin
                 with a '>' was successful. By default, the offset shall be
                 interpreted as an unsigned decimal number. With a leading 0x
                 or 0X, the offset shall be interpreted as a hexadecimal
                 number; otherwise, with a leading 0, the offset shall be
                 interpreted as an octal number.

       type      The type of the value in the file to be tested. The type
                 shall consist of the type specification characters d, s, and
                 u, specifying signed decimal, string, and unsigned decimal,
                 respectively.

                 The type string shall be interpreted as the bytes from the
                 file starting at the specified offset and including the same
                 number of bytes specified by the value field. If insufficient
                 bytes remain in the file past the offset to match the value
                 field, the test shall fail.

                 The type specification characters d and u can be followed by
                 an optional unsigned decimal integer that specifies the
                 number of bytes represented by the type. The type
                 specification characters d and u can be followed by an
                 optional C, S, I, or L, indicating that the value is of type
                 char, short, int, or long, respectively.

                 The default number of bytes represented by the type
                 specifiers d, f, and u shall correspond to their respective
                 C-language types as follows. If the system claims conformance
                 to the C-Language Development Utilities option, those
                 specifiers shall correspond to the default sizes used in the
                 c99 utility. Otherwise, the default sizes shall be
                 implementation-defined.

                 For the type specifier characters d and u, the default number
                 of bytes shall correspond to the size of a basic integer type
                 of the implementation. For these specifier characters, the
                 implementation shall support values of the optional number of
                 bytes to be converted corresponding to the number of bytes in
                 the C-language types char, short, int, or long.  These
                 numbers can also be specified by an application as the
                 characters C, S, I, and L, respectively. The byte order used
                 when interpreting numeric values is implementation-defined,
                 but shall correspond to the order in which a constant of the
                 corresponding type is stored in memory on the system.

                 All type specifiers, except for s, can be followed by a mask
                 specifier of the form &number. The mask value shall be AND'ed
                 with the value of the input file before the comparison with
                 the value field of the line is made. By default, the mask
                 shall be interpreted as an unsigned decimal number. With a
                 leading 0x or 0X, the mask shall be interpreted as an
                 unsigned hexadecimal number; otherwise, with a leading 0, the
                 mask shall be interpreted as an unsigned octal number.

                 The strings byte, short, long, and string shall also be
                 supported as type fields, being interpreted as dC, dS, dL,
                 and s, respectively.

       value     The value to be compared with the value from the file.

                 If the specifier from the type field is s or string, then
                 interpret the value as a string. Otherwise, interpret it as a
                 number. If the value is a string, then the test shall succeed
                 only when a string value exactly matches the bytes from the
                 file.

                 If the value is a string, it can contain the following
                 sequences:

                 \character  The <backslash>-escape sequences as specified in
                             the Base Definitions volume of POSIX.1‐2008,
                             Table 5-1, Escape Sequences and Associated
                             Actions ('\\', '\a', '\b', '\f', '\n', '\r',
                             '\t', '\v').  In addition, the escape sequence
                             '\ ' (the <backslash> character followed by a
                             <space> character) shall be recognized to
                             represent a <space> character. The results of
                             using any other character, other than an octal
                             digit, following the <backslash> are unspecified.

                 \octal      Octal sequences that can be used to represent
                             characters with specific coded values. An octal
                             sequence shall consist of a <backslash> followed
                             by the longest sequence of one, two, or three
                             octal-digit characters (01234567).

                 By default, any value that is not a string shall be
                 interpreted as a signed decimal number. Any such value, with
                 a leading 0x or 0X, shall be interpreted as an unsigned
                 hexadecimal number; otherwise, with a leading zero, the value
                 shall be interpreted as an unsigned octal number.

                 If the value is not a string, it can be preceded by a
                 character indicating the comparison to be performed.
                 Permissible characters and the comparisons they specify are
                 as follows:

                 =     The test shall succeed if the value from the file
                       equals the value field.

                 <     The test shall succeed if the value from the file is
                       less than the value field.

                 >     The test shall succeed if the value from the file is
                       greater than the value field.

                 &     The test shall succeed if all of the set bits in the
                       value field are set in the value from the file.

                 ^     The test shall succeed if at least one of the set bits
                       in the value field is not set in the value from the
                       file.

                 x     The test shall succeed if the file is large enough to
                       contain a value of the type specified starting at the
                       offset specified.

       message   The message to be printed if the test succeeds. The message
                 shall be interpreted using the notation for the printf
                 formatting specification; see printf.  If the value field was
                 a string, then the value from the file shall be the argument
                 for the printf formatting specification; otherwise, the value
                 from the file shall be the argument.

EXIT STATUS
       The following exit values shall be returned:

        0    Successful completion.

       >0    An error occurred.

CONSEQUENCES OF ERRORS
       Default.

       The following sections are informative.

APPLICATION USAGE
       The file utility can only be required to guess at many of the file
       types because only exhaustive testing can determine some types with
       certainty. For example, binary data on some implementations might match
       the initial segment of an executable or a tar archive.

       Note that the table indicates that the output contains the stated
       string. Systems may add text before or after the string. For
       executables, as an example, the machine architecture and various facts
       about how the file was link-edited may be included. Note also that on
       systems that recognize shell script files starting with "#!" as
       executable files, these may be identified as executable binary files
       rather than as shell scripts.

EXAMPLES
       Determine whether an argument is a binary executable file:

           file −− "$1" | grep −q ':.*executable' &&
               printf "%s is executable.\n$1"

RATIONALE
       The −f option was omitted because the same effect can (and should) be
       obtained using the xargs utility.

       Historical versions of the file utility attempt to identify the
       following types of files: symbolic link, directory, character special,
       block special, socket, tar archive, cpio archive, SCCS archive, archive
       library, empty, compress output, pack output, binary data, C source,
       FORTRAN source, assembler source, nroff/troff/eqn/tbl source troff
       output, shell script, C shell script, English text, ASCII text, various
       executables, APL workspace, compiled terminfo entries, and CURSES
       screen images. Only those types that are reasonably well specified in
       POSIX or are directly related to POSIX utilities are listed in the
       table.

       Historical systems have used a ``magic file'' named /etc/magic to help
       identify file types. Because it is generally useful for users and
       scripts to be able to identify special file types, the −m flag and a
       portable format for user-created magic files has been specified. No
       requirement is made that an implementation of file use this method of
       identifying files, only that users be permitted to add their own
       classifying tests.

       In addition, three options have been added to historical practice. The
       −d flag has been added to permit users to cause their tests to follow
       any default system tests. The −i flag has been added to permit users to
       test portably for regular files in shell scripts. The −M flag has been
       added to permit users to ignore any default system tests.

       The POSIX.1‐2008 description of default system tests and the
       interaction between the −d, −M, and −m options did not clearly indicate
       that there were two types of ``default system tests''. The ``position-
       sensitive tests'' determine file types by looking for certain string or
       binary values at specific offsets in the file being examined. These
       position-sensitive tests were implemented in historical systems using
       the magic file described above.  Some of these tests are now built into
       the file utility itself on some implementations so the output can
       provide more detail than can be provided by magic files. For example, a
       magic file can easily identify a core file on most implementations, but
       cannot name the program file that dropped the core. A magic file could
       produce output such as:

           /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1

       but by building the test into the file utility, you could get output
       such as:

           /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'

       These extended built-in tests are still to be treated as position-
       sensitive default system tests even if they are not listed in
       /etc/magic or any other magic file.

       The context-sensitive default system tests were always built into the
       file utility. These tests looked for language constructs in text files
       trying to identify shell scripts, C, FORTRAN, and other computer
       language source files, and even plain text files. With the addition of
       the −m and −M options the distinction between position-sensitive and
       context-sensitive default system tests became important because the
       order of testing is important. The context-sensitive system default
       tests should never be applied before any position-sensitive tests even
       if the −d option is specified before a −m option or −M option due to
       the high probability that the context-sensitive system default tests
       will incorrectly identify arbitrary text files as text files before
       position-sensitive tests specified by the −m or −M option would be
       applied to give a more accurate identification.

       Leaving the meaning of −M − and −m − unspecified allows an existing
       prototype of these options to continue to work in a backwards-
       compatible manner. (In that implementation, −M − was roughly equivalent
       to −d in POSIX.1‐2008.)

       The historical −c option was omitted as not particularly useful to
       users or portable shell scripts. In addition, a reasonable
       implementation of the file utility would report any errors found each
       time the magic file is read.

       The historical format of the magic file was the same as that specified
       by the Rationale in the ISO POSIX‐2:1993 standard for the offset,
       value, and message fields; however, it used less precise type fields
       than the format specified by the current normative text. The new type
       field values are a superset of the historical ones.

       The following is an example magic file:

           0  short     070707              cpio archive
           0  short     0143561             Byte-swapped cpio archive
           0  string    070707              ASCII cpio archive
           0  long      0177555             Very old archive
           0  short     0177545             Old archive
           0  short     017437              Old packed data
           0  string    \037\036            Packed data
           0  string    \377\037            Compacted data
           0  string    \037\235            Compressed data
           >2 byte&0x80 >0                  Block compressed
           >2 byte&0x1f x                   %d bits
           0  string    \032\001            Compiled Terminfo Entry
           0  short     0433                Curses screen image
           0  short     0434                Curses screen image
           0  string    <ar>                System V Release 1 archive
           0  string    !<arch>\n__.SYMDEF  Archive random library
           0  string    !<arch>             Archive
           0  string    ARF_BEGARF          PHIGS clear text archive
           0  long      0x137A2950          Scalable OpenFont binary
           0  long      0x137A2951          Encrypted scalable OpenFont binary

       The use of a basic integer data type is intended to allow the
       implementation to choose a word size commonly used by applications on
       that architecture.

       Earlier versions of this standard allowed for implementations with
       bytes other than eight bits, but this has been modified in this
       version.

FUTURE DIRECTIONS
       None.

SEE ALSO
       ar, ls, pax, printf

       The Base Definitions volume of POSIX.1‐2008, Table 5-1, Escape
       Sequences and Associated Actions, Chapter 8, Environment Variables,
       Section 12.2, Utility Syntax Guidelines

COPYRIGHT
       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, Copyright (C) 2013 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group.  (This is
       POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.unix.org/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .



IEEE/The Open Group                  2013                             FILE(1P)