utf8trans

utf8trans(1)                        docbook2X                       utf8trans(1)



NAME
       utf8trans - Transliterate UTF-8 characters according to a table

SYNOPSIS
       utf8trans charmap [file]...

DESCRIPTION
       utf8trans transliterates characters in the specified files (or standard
       input, if none are specified) and writes the output to standard output.
       All input and output is in the UTF-8 encoding.

       This program is usually used to render characters in Unicode text files
       as markup escapes or ASCII transliterations.  (It is not intended for
       general charset conversions.)  It provides functionality similar to the
       character maps in XSLT 2.0 (Extensible Stylesheet Language
       Transformations, version 2.0).

OPTIONS
       -m, --modify
              Modifies the given files in-place with their transliterated
              output, instead of sending it to standard output.

              This option is useful for efficient transliteration of many files
              at once.

       --help Show brief usage information and exit.

       --version
              Show version and exit.

USAGE
       The translation is done according to the rules in the ‘character map’,
       which is read from the file charmap. It has the following format:

       1.  Each line represents a translation entry, except for blank lines and
           comment lines, which are ignored.

       2.  Any amount of whitespace (space or tab) may precede the start of an
           entry.

       3.  Comment lines begin with #.  Everything on the same line is ignored.

       4.  Each entry consists of the Unicode codepoint of the character to
           translate, in hexadecimal, followed by one space or tab, followed by
           the translation string, up to the end of the line.

       5.  The translation string is taken literally, including any leading and
           trailing spaces (except the delimiter between the codepoint and the
           translation string), and all types of characters. The newline at the
           end is not included.
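
       The rules above can be sketched in a few lines of Python. This is
       only an illustration of the format, not the utf8trans source; the
       names parse_charmap and transliterate are hypothetical. It assumes
       that codepoints are written as bare hexadecimal digits, that comment
       lines may be indented like entries, and that an entry with no
       delimiter maps to an empty string.

```python
import re

# Hypothetical sketch of the character map format described above.
# Group 1: hexadecimal codepoint. Group 2: translation string after exactly
# one space or tab (optional; its absence is treated as empty, an assumption).
ENTRY = re.compile(r"([0-9A-Fa-f]+)(?:[ \t](.*))?")


def parse_charmap(text):
    """Return a dict mapping codepoints (int) to translation strings."""
    table = {}
    for raw in text.split("\n"):
        line = raw.lstrip(" \t")                  # rule 2: leading whitespace
        if line == "" or line.startswith("#"):    # rules 1 and 3
            continue
        match = ENTRY.fullmatch(line)
        if match is None:
            raise ValueError("malformed charmap entry: %r" % raw)
        # Rules 4 and 5: the translation is kept literally, including any
        # leading or trailing spaces after the single delimiter.
        table[int(match.group(1), 16)] = match.group(2) or ""
    return table


def transliterate(text, table):
    """Replace each mapped character; unmapped characters pass through."""
    return text.translate(table)


sample = "# sample character map\nA9 (c)\n2014 --\n"
print(transliterate("\u00a9 2007 \u2014 docbook2X", parse_charmap(sample)))
# prints: (c) 2007 -- docbook2X
```

       Because the translation string runs to the end of the line, a
       trailing space in the character map becomes part of the output.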

       The above format is intentionally restrictive, to keep utf8trans simple.
       If an XML-based format is desired, the xmlcharmap2utf8trans script that
       comes with the docbook2X distribution converts character maps in
       XSLT 2.0 format to the utf8trans format.
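
       To show how the two formats correspond, here is a rough Python
       sketch of such a conversion. It is not the xmlcharmap2utf8trans
       script, whose behavior may differ; the function name
       charmap_to_utf8trans is hypothetical. It assumes the input uses the
       standard XSLT 2.0 xsl:output-character elements with character and
       string attributes, and it uses a tab as the delimiter.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def charmap_to_utf8trans(xml_text):
    """Convert xsl:output-character elements to utf8trans entries (sketch)."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter("{%s}output-character" % XSL_NS):
        char = elem.get("character")
        string = elem.get("string")
        if char is None or string is None or len(char) != 1:
            raise ValueError("unexpected xsl:output-character element")
        if "\n" in string or "\0" in string:
            # The utf8trans format cannot express these (see LIMITATIONS).
            raise ValueError("newline or null in substitution string")
        # Hexadecimal codepoint, one tab as delimiter, literal translation.
        lines.append("%X\t%s" % (ord(char), string))
    return "\n".join(lines) + "\n"


stylesheet = (
    '<xsl:stylesheet version="2.0" xmlns:xsl="%s">'
    '<xsl:character-map name="example">'
    '<xsl:output-character character="&#169;" string="(c)"/>'
    '<xsl:output-character character="&#8212;" string="---"/>'
    '</xsl:character-map>'
    '</xsl:stylesheet>' % XSL_NS
)
print(charmap_to_utf8trans(stylesheet), end="")
```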

LIMITATIONS
       • utf8trans does not work with binary files, because malformed
         UTF-8 sequences in the input are substituted with U+FFFD characters.
         However, null characters in the input are handled correctly. This
         limitation may be removed in the future.

       • There is no way to include a newline or null in the substitution
         string.
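
       The effect of the first limitation can be illustrated with Python's
       own UTF-8 decoder. This is an analogy only: roundtrip_utf8 is a
       hypothetical helper, and utf8trans may group malformed bytes
       differently when it substitutes U+FFFD.

```python
def roundtrip_utf8(data):
    """Decode as UTF-8, replacing malformed sequences with U+FFFD, and re-encode."""
    return data.decode("utf-8", errors="replace").encode("utf-8")


text = "caf\u00e9\x00".encode("utf-8")   # valid UTF-8, including a null
binary = b"\x89PNG\xff\x00"              # contains bytes that are not valid UTF-8

print(roundtrip_utf8(text) == text)      # valid text survives unchanged
print(roundtrip_utf8(binary) == binary)  # binary data does not
```

       Valid text, including the null character, passes through intact,
       while the invalid bytes in the binary data are each replaced by the
       three-byte encoding of U+FFFD, so the original cannot be recovered.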

AUTHOR
       Steve Cheng <stevecheng@users.sourceforge.net>.



docbook2X 0.8.8                   3 March 2007                      utf8trans(1)