onsgmls − An SGML/XML parser and validator

onsgmls [−BCdeghlnpRrsuvx] [−alinktype] [−Aarchitecture]
[−bbctf] [−csysid...] [−Ddirectory] [−Emax_errors] [−ffile]
[−iname] [−ooutput_option...] [−tfile] [−wwarning_type...]

     onsgmls parses and validates the SGML document whose
document entity is specified by the system identifiers and
prints on the standard output a simple text representation
of its Element Structure Information Set. (This is the
information set which a structure−controlled conforming SGML
application should act upon.) If more than one system
identifier is specified, then the corresponding entities
will be concatenated to form the document entity. Thus the
document entity may be spread among several files; for
example, the SGML declaration, prolog and document instance
set could each be in a separate file. If no system
identifiers are specified, then onsgmls will read the
document entity from the standard input. A command line
system identifier of − can be used to refer to the standard
input. (Normally in a system identifier, <OSFD>0 is used to
refer to standard input.)

     Part of an SGML System Conforming to International
Standard ISO 8879 −− Standard Generalized Markup Language.
An SGML Extended Facilities system conforming to Annex A of
Internal Standard ISO/IEC 10744 −− Hypermedia/Time−based
Structuring Language

     The following options are available:

     −alinktype, −−activate=linktype
     Make link type linktype active. Not all ESIS
     information is output in this case: the active LPDs are
     not explicitly reported, although each link attribute
     is qualified with its link type name; there is no
     information about result elements; when there are
     multiple link rules applicable to the current element,
     onsgmls always chooses the first.

     −Aarchitecture, −−architecture=architecture
     Parse with respect to architecture architecture.

     −bbctf, −−bctf=bctf, −bencoding, −−encoding=encoding
     This determines the encoding used for output. If in
     fixed character set mode it specifies the name of an
     encoding; if not, it specifies the name of a BCTF.

     −B, −−batch_mode
     Batch mode. Parse each specified on the command line
     separately, rather than concatenating them. This is


     useful mainly with −s.

     If −tfilename is also specified, then the specified
     filename will be prefixed to the sysid to make the
     filename for the RAST result for each sysid.

     −csysid, −−catalog=sysid
     Map public identifiers and entity names to system
     identifiers using the catalog entry file whose system
     identifier is sysid. Multiple −c options are allowed.
     If there is a catalog entry file called catalog in the
     same place as the document entity, it will be searched
     for immediately after those specified by −c.

     −C, −−catalogs
     The arguments specify catalog files rather than the
     document entity. The document entity is specified by
     the first DOCUMENT entry in the catalog files.

     −Ddirectory, −−directory=directory
     Search directory for files specified in system
     identifiers. Multiple −D options are allowed. See the
     description of the osfile storage manager for more
     information about file searching.

     −e, −−open−entities
     Describe open entities in error messages. Error
     messages always include the position of the most
     recently opened external entity.

     −Emax_errors, −−max−errors=max_errors
     onsgmls will exit after max_errors errors. If
     max_errors is 0, there is no limit on the number of
     errors. The default is 200.

     −ffile, −−error−file=file
     Redirect errors to file. This is useful mainly with
     shells that do not support redirection of stderr.

     −g, −−open−elements
     Show the generic identifiers of open elements in error

     −h, −−help
     Show a help message and exit.

     −iname, −−include=name
     Pretend that

     <!ENTITY % name "INCLUDE">

     occurs at the start of the document type declaration
     subset in the SGML document entity. Since repeated
     definitions of an entity are ignored, this definition


     will take precedence over any other definitions of this
     entity in the document type declaration. Multiple −i
     options are allowed. If the SGML declaration replaces
     the reserved name INCLUDE then the new reserved name
     will be the replacement text of the entity. Typically
     the document type declaration will contain

     <!ENTITY % name "IGNORE">

     and will use %name; in the status keyword specification
     of a marked section declaration. In this case the
     effect of the option will be to cause the marked
     section not to be ignored.

     −n, −−error−numbers
     Show message numbers in error messages.

     −ooutput_option, −−option=output_option
     Output additional information according to

     entity Output definitions of all general entities not
     just for data or subdoc entities that are referenced or
     named in an ENTITY or ENTITIES attribute.

     id Distinguish attributes whose declared value is ID.

     line Output L commands giving the current line number
     and filename.

     included Output an i command for included sub−elements.

     empty Output an e command for elements which are not
     allowed to have an end−tag, that is those with a
     declared content of empty or with a content reference

     notation−sysid Output an f command before an N command,
     if a system identifier could be generated for that

     nonsgml In fixed character set mode, output \% escape
     sequences for non−SGML data characters. Non−SGML data
     characters can result from numeric character

     data−attribute Output the notation name and attributes
     for DATA attributes. Otherwise, DATA attributes are
     treated like CDATA attributes. For more details see
     clause 4.4.3 of Annex K of ISO 8879.

     comment Output an _ command with the contents of a
     comment. Multiple comments in a single comment
     declaration will result in multiple distinct _


     commands, just as if the comments were each in a
     separate comment declaration.

     omitted Output an o command before a command which was
     implied by the input document, but omitted from the
     actual markup. This currently affects (,), and A

     tagomit As omitted, but only for ( and ) commands.

     attromit As omitted, but only for A commands.

     Multiple −o options are allowed.

     −p, −−only−prolog
     Parse only the prolog.  onsgmls will exit after parsing
     the document type declaration. Implies −s.

     −R, −−restricted
     Restrict file reading. This option is intended for use
     with onsgmls−based Web tools (e.g. CGI scripts) to
     prevent reading of arbitrary files on the Web server.
     With this option enabled, onsgmls will not read any
     local files unless they are located in a directory (or
     subdirectory) specified by the −D option or included in
     the SGML_SEARCH_PATH environment variable. As a further
     security precaution, this option limits filesnames to
     the characters A−Z, a−z, 0−9, '?', '.', '_', '−' and
     does not allow filenames containing "..". On systems
     with MS−DOS file names ':' and '\' are also allowed.

     −s, −−no−output
     Suppress output. Error messages will still be printed.

     −tfile, −−rast−file=file
     Output to file the RAST result as defined by ISO/IEC
     13673:1995 (actually this isn't quite an IS yet; this
     implements the Intermediate Editor's Draft of
     1994/08/29, with changes to implement ISO/IEC
     JTC1/SC18/WG8 N1777). The normal output is not

     −v, −−version
     Print the version number.

     −wtype, −−warning=type
     Control warnings and errors. Multiple −w options are
     allowed. The following values of type enable warnings:

     xml Warn about constructs that are not allowed by XML.

     mixed Warn about mixed content models that do not allow
     #PCDATA anywhere.


     sgmldecl Warn about various dubious constructions in
     the SGML declaration.

     should Warn about various recommendations made in ISO
     8879 that the document does not comply with.
     (Recommendations are expressed with "should", as
     distinct from requirements which are usually expressed
     with "shall".)

     default Warn about defaulted references.

     duplicate Warn about duplicate entity declarations.

     undefined Warn about undefined elements: elements used
     in the DTD but not defined.

     unclosed Warn about unclosed start and end−tags.

     empty Warn about empty start and end−tags.

     net Warn about net−enabling start−tags and null

     min−tag Warn about minimized start and end−tags.
     Equivalent to combination of unclosed, empty and net

     unused−map Warn about unused short reference maps: maps
     that are declared with a short reference mapping
     declaration but never used in a short reference use
     declaration in the DTD.

     unused−param Warn about parameter entities that are
     defined but not used in a DTD. Unused internal
     parameter entities whose text is INCLUDE or IGNORE
     won't get the warning.

     notation−sysid Warn about notations for which no system
     identifier could be generated.

     all Warn about conditions that should usually be
     avoided (in the opinion of the author). Equivalent to:
     mixed, should, default, undefined, sgmldecl,
     unused−map, unused−param, empty and unclosed.

     immediate−recursion Warn about immediately recursive
     elements. For more detais see clause 2.2.5 of Annex K
     of ISO 8879.

     fully−declared Warn if the document instance fails to
     be fully declared. This has the effect of changing the
     SGML declaration to specify IMPLYDEF ATTLIST NO ELEMENT
     NO ENTITY NO NOTATION NO. For more details see clause
     2.2.1 of Annex K of ISO 8879.


     fully−tagged Warn if the document instance fails to be
     fully−tagged. This has the effect of changing the SGML
     declaration to specify DATATAG NO, RANK NO, OMITTAG NO,
     NO. For more details see clause 2.2.2 of Annex K of ISO

     amply−tagged, amply−tagged−recursive Warn if the
     doucment instance fails to be amply−tagged. Implicitly
     defined elements may be immediately recurisve if
     amply−tagged−recursive is specified. This has the
     effect of changing the SGML declaration to specify
     IMPLYDEF ELEMENT YES. For more details see clause 2.2.4
     of Annex K of ISO 8879.

     type−valid Warn if the document instance fails to be
     type−valid. This has the effect of changing the SGML
     declaration to specify VALIDITY YES. For more details
     see clause 2.2.3 of Annex K of ISO 8879.

     entity−ref Warn about references to non−predefined
     entities. This has the effect of changing the SGML
     declaration to specify ENTITIES REF NONE. For more
     details see clause 2.3.2 of Annex K of ISO 8879.

     external−entity−ref Warn about references to external
     entities. This includes references to an external DTD
     subset. This has the effect of changing the SGML
     declaration to specify ENTITIES REF INTERNAL. For more
     details see clause 2.3.3 of Annex K of ISO 8879.

     integral Warn if the document instance is not
     integrally stored. This has the effect of changing the
     SGML declaration to specify ENTITIES INTEGRAL YES. For
     more details see clause 2.3.1 of Annex K of ISO 8879.

     A warning can be disabled by using its name prefixed
     with no−. Thus −wall −wno−duplicate will enable all
     warnings except those about duplicate entity

     The following values for warning_type disable errors:

     no−idref Do not give an error for an ID reference value
     which no element has as its ID. The effect will be as
     if each attribute declared as an ID reference value had
     been declared as a name.

     no−significant Do not give an error when a character
     that is not a significant character in the reference
     concrete syntax occurs in a literal in the SGML
     declaration. This may be useful in conjunction with


     certain buggy test suites.

     no−valid Do not require the document to be type−valid.
     This has the effect of changing the SGML declaration to
     ELEMENT YES. An option of −wvalid has the effect of
     changing the SGML declaration to specify VALIDITY TYPE
     and IMPLYDEF ATTLIST NO ELEMENT NO. If neither −wvalid
     nor −wno−valid are specified, then the VALIDITY and
     IMPLYDEF specified in the SGML declaration will be

     no−afdr Do not give errors when AFDR meta−DTD notation
     features are used in the DTD. These errors are normally
     produced when parsing the DTD, but suppressed when
     parsing meta−DTDs.

     −x, −−references
     Show information about relevant clauses (from ISO
     8879:1986) in error messages.

     The following options are also supported for backward
compatibility with sgmls:

     Same as −wduplicate.

     Same as −oline.

     Same as −c.

     Same as −wdefault.

     Same as −wundef.

     ospent(1), ospam(1), onsgmlnorm(1), osx(1)

     James Clark

     Ian Castle <ian.castle@openjade.org>