NAME

mootchurn - File format converter for moocow's PoS tagger.


SYNOPSIS

mootchurn [OPTIONS] INPUT(s)

 Arguments:
    INPUT(s)  Input files / file-lists.
 Options
    -h          --help                      Print help and exit.
    -V          --version                   Print version and exit.
    -cFILE      --rcfile=FILE               Read an alternate configuration file.
    -vLEVEL     --verbose=LEVEL             Verbosity level.
    -B          --no-banner                 Suppress initial banner message (implied at verbosity levels <= 2)
    -dNTOKS     --dots=NTOKS                Print a dot for every NTOKS tokens processed.
    -l          --list                      INPUTs are file-lists, not filenames.
    -oFILE      --output=FILE               Specify output file (default=stdout).
 Format Options
    -t          --tokens                    Read input token-wise.
    -IFORMAT    --input-format=FORMAT       Specify input file(s) format(s).
    -OFORMAT    --output-format=FORMAT      Specify output file format.
 XML Options
                --input-encoding=ENCODING   Override XML document input encoding
                --output-encoding=ENCODING  Set default XML output encoding


DESCRIPTION

File format converter for moocow's PoS tagger.

'mootchurn' is a file-format converter for use with the 'moot' part-of-speech tagging tools. See the mootfiles manpage for details on moot file formats.


ARGUMENTS

INPUT(s)

Input files / file-lists.

Input files should be 'cooked' in some format known to moot.

See also the '--list' option.

For details on moot file formats, see the mootfiles manpage.


OPTIONS

--help , -h

Print help and exit.

Default: '0'

--version , -V

Print version and exit.

Default: '0'

--rcfile=FILE , -cFILE

Read an alternate configuration file.

Default: 'NULL'

See also: CONFIGURATION FILES.

--verbose=LEVEL , -vLEVEL

Verbosity level.

Default: '3'

Be more or less verbose. Recognized values are in the range 0..6:

  0. (silent)

    Disable all diagnostic messages.

  1. (errors)

    Print error messages to stderr.

  2. (warnings)

    Print warnings to stderr.

  3. (info)

    Print general diagnostic information to stderr.

  4. (progress)

    Print progress information to stderr.

  5. (debug)

    Print debugging information to stderr (if applicable).

  6. (trace)

    Print execution trace information to stderr (if applicable).

--no-banner , -B

Suppress the initial banner message (implied at verbosity levels <= 2).

Default: '0'

--dots=NTOKS , -dNTOKS

Print a dot for every NTOKS tokens processed.

Default: '0'

Zero (the default) means that no dots will be printed.

--list , -l

INPUTs are file-lists, not filenames.

Default: '0'

Useful for large batch-processing jobs.
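
As a rough illustration of batch use, the sketch below builds a file-list and passes it with '--list'. It assumes that a file-list is plain text naming one input file per line, which is not spelled out above. The names 'corpus1.txt', 'corpus2.txt', 'inputs.list', and 'merged.out' are hypothetical.

```shell
# Assumption: a file-list is plain text naming one input file per line.
# 'corpus1.txt', 'corpus2.txt', 'inputs.list' and 'merged.out' are hypothetical.
printf '%s\n' corpus1.txt corpus2.txt > inputs.list

# Run only when the tool and both (hypothetical) inputs are present.
if command -v mootchurn >/dev/null 2>&1 && [ -r corpus1.txt ] && [ -r corpus2.txt ]; then
    mootchurn --list --output=merged.out inputs.list
else
    echo "skipped: mootchurn or input files not available" >&2
fi
```

Keeping the list in a file avoids very long command lines when many inputs are converted in one run.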

--output=FILE , -oFILE

Specify output file (default=stdout).

Default: '-'

Format Options

--tokens , -t

Read input token-wise.

Default: '0'

Default behavior is to read sentence-wise.

--input-format=FORMAT , -IFORMAT

Specify input file(s) format(s).

Default: 'NULL'

Value should be a comma-separated list of format flag names, optionally prefixed with an exclamation point (!) to indicate negation.

If this option is not given, the input format defaults to 'WellDone'.

See 'I/O Format Flags' in the mootfiles manpage for details.

--output-format=FORMAT , -OFORMAT

Specify output file format.

Default: 'NULL'

Value should be a comma-separated list of format flag names, optionally prefixed with an exclamation point (!) to indicate negation.

If this option is not given, the output format defaults to 'WellDone'.

See 'I/O Format Flags' in the mootfiles manpage for details.
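
A minimal sketch of an explicit conversion using both format options. 'WellDone' is the only flag name mentioned in this document; 'XML' is an assumed flag name and should be checked against the mootfiles manpage. The filenames 'corpus.wd' and 'corpus.xml' are hypothetical.

```shell
# Hypothetical filenames; 'XML' is an assumed flag name (check mootfiles).
input=corpus.wd
output=corpus.xml
in_fmt=WellDone
out_fmt=XML

# Run only when the tool and the (hypothetical) input are present.
if command -v mootchurn >/dev/null 2>&1 && [ -r "$input" ]; then
    mootchurn --input-format="$in_fmt" --output-format="$out_fmt" \
              --output="$output" "$input"
else
    echo "skipped: mootchurn or $input not available" >&2
fi
```

When a flag is negated with an exclamation point, single-quote the whole value (for example '--output-format='!FLAG'', where FLAG is a placeholder), since interactive shells such as bash otherwise treat '!' as history expansion.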

XML Options

--input-encoding=ENCODING

Override XML document input encoding.

Default: 'NULL'

Potentially useful for XML documents without encoding declarations.

--output-encoding=ENCODING

Set default XML output encoding.

Default: 'NULL'

Potentially useful for human-readable XML documents, but also dangerous.


CONFIGURATION FILES

Configuration files are expected to contain lines of the form:

    LONG_OPTION_NAME    OPTION_VALUE

where LONG_OPTION_NAME is the long name of some option, without the leading '--', and OPTION_VALUE is the value for that option, if any. Fields are whitespace-separated. Blank lines and comments (lines beginning with '#') are ignored.
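
The format described above might look like the following in practice. This is only a sketch: the file name 'mootchurn.rc', the input name 'corpus.txt', and the chosen values are hypothetical, and only options that take a value are shown.

```shell
# Write a hypothetical configuration file: long option names without the
# leading '--', each followed by whitespace and the option value.
cat > mootchurn.rc <<'EOF'
# Example settings (comment lines and blank lines are ignored)

dots            1000
output          converted.out
output-format   WellDone
EOF

# Hypothetical input name; run only when the tool and the input are present.
if command -v mootchurn >/dev/null 2>&1 && [ -r corpus.txt ]; then
    mootchurn --rcfile=mootchurn.rc corpus.txt
else
    echo "skipped: mootchurn or corpus.txt not available" >&2
fi
```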

The following configuration files are read by default:


ADDENDA

Caveats

When converting to XML, you should first ensure that your data is properly encoded, using either character entities or UTF-8 to encode non-ASCII characters.

When converting from XML, all data will be written in the encoding declared in the document, or in UTF-8 if no encoding was declared.
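
For the first caveat, one possible way to prepare data is to re-encode it as UTF-8 with the standard 'iconv' utility before conversion. The sketch below assumes the source data is ISO-8859-1; the filenames 'latin1.txt' and 'utf8.txt' are hypothetical, and the sample file is created only so the example is self-contained.

```shell
# Create a small ISO-8859-1 sample: "caf" followed by byte 0xE9 (e-acute).
printf 'caf\351\n' > latin1.txt

# Re-encode to UTF-8; the source encoding here is an assumption about the data.
iconv -f ISO-8859-1 -t UTF-8 latin1.txt > utf8.txt

# Show the resulting bytes; the non-ASCII character is now a two-byte sequence.
od -An -tx1 utf8.txt
```

The re-encoded file can then be given to mootchurn as input in place of the original.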

About this Document

Documentation file auto-generated by optgen.perl version 0.07 using Getopt::Gen version 0.13. Translation was initiated as:

   optgen.perl -l --nocfile --nohfile --notimestamp -F mootchurn mootchurn.gog


BUGS AND LIMITATIONS

None known.


ACKNOWLEDGEMENTS

Perl by Larry Wall.

Getopt::Gen by Bryan Jurish.


AUTHOR

Bryan Jurish <moocow@cpan.org>


SEE ALSO

the mootfiles manpage, the mootpp manpage, mootm(1), the mootrain manpage, the mootcompile manpage, the mootdump manpage, the moot manpage, the mooteval manpage