NAME

gfsmtrain - EXPERIMENTAL: count successful paths for training string pairs in a transducer

SYNOPSIS

gfsmtrain [OPTIONS] PAIR_FILE(s)...

 Arguments:
    PAIR_FILE(s)...  Input string-pair file(s)

 Options:
    -h         --help                Print help and exit.
    -V         --version             Print version and exit.
    -iLABELS   --ilabels=LABELS      Specify input (lower) labels file.
    -oLABELS   --olabels=LABELS      Specify output (upper) labels file.
    -lLABELS   --labels=LABELS       Set -i and -o labels simultaneously.
    -a         --att-mode            Parse string(s) in AT&T-compatible mode.
    -q         --quiet               Suppress warnings about undefined symbols.
    -u         --utf8                Assume UTF-8 encoded alphabet and input.
    -B         --best                Only consider cost-minimal path(s) for each training pair.
    -O         --ordered             Count permutations in arc-order as multiple paths.
    -P         --distribute-by-path  Distribute pair-mass over multiple paths.
    -A         --distribute-by-arc   Distribute path-mass over arcs.
    -fFSTFILE  --fst=FSTFILE         Transducer to apply (required).
    -zLEVEL    --compress=LEVEL      Specify compression level of output file.
    -FFILE     --output=FILE         Specify output file (default=stdout).

DESCRIPTION

EXPERIMENTAL: count successful paths for training string pairs in a transducer

ARGUMENTS

PAIR_FILE(s)...

Input string-pair file(s)

One pair per line, TAB-separated.
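As a rough illustration of the pair-file format, the following Python sketch reads TAB-separated pairs. It is not part of gfsmtrain; the function name read_pairs is hypothetical, and the handling of blank or malformed lines is an assumption, since the format description above only states one pair per line.

```python
def read_pairs(lines):
    """Yield (input, output) string pairs from TAB-separated lines.

    Hypothetical helper for illustration; not gfsmtrain's own parser.
    """
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # assumption: blank lines are skipped
        fields = line.split("\t")
        if len(fields) != 2:
            # assumption: anything other than exactly two fields is an error
            raise ValueError(
                f"line {lineno}: expected 2 TAB-separated fields, got {len(fields)}"
            )
        yield fields[0], fields[1]
```

For example, list(read_pairs(open("pairs.txt", encoding="utf-8"))) would collect all pairs from a hypothetical file pairs.txt.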

OPTIONS

--help , -h

Print help and exit.

Default: '0'

--version , -V

Print version and exit.

Default: '0'

--ilabels=LABELS , -iLABELS

Specify input (lower) labels file.

Default: 'NULL'

--olabels=LABELS , -oLABELS

Specify output (upper) labels file.

Default: 'NULL'

--labels=LABELS , -lLABELS

Set -i and -o labels simultaneously.

Default: 'NULL'

--att-mode , -a

Parse string(s) in AT&T-compatible mode.

Default: '0'

--quiet , -q

Suppress warnings about undefined symbols.

Default: '0'

--utf8 , -u

Assume UTF-8 encoded alphabet and input.

Default: '0'

--best , -B

Only consider cost-minimal path(s) for each training pair.

Default: '0'

If specified and true, only minimal-cost path(s) will be considered for each training pair; otherwise, all successful paths will be considered.
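The selection described above might be sketched as follows. This is an illustrative Python model only: the representation of a path as an (arcs, cost) tuple and the name select_paths are assumptions, and costs are compared by exact equality for simplicity.

```python
def select_paths(paths, best=False):
    """Return the paths to be counted for one training pair.

    paths: list of (arcs, cost) tuples (hypothetical representation).
    best:  models the --best flag.
    """
    if not best or not paths:
        return list(paths)
    min_cost = min(cost for _, cost in paths)
    # keep every path that ties for the minimal cost
    return [(arcs, cost) for arcs, cost in paths if cost == min_cost]
```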

--ordered , -O

Count permutations in arc-order as multiple paths.

Default: '0'

If unspecified or false, only unique successful paths modulo arc-ordering will be considered; e.g. (q --[<epsilon>:a]--> q --[a:<epsilon>]--> q) and (q --[a:<epsilon>]--> q --[<epsilon>:a]--> q) are duplicates in this sense, since they differ only in the ordering of the arcs.
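One way to picture "unique modulo arc-ordering" is to compare paths as multisets of arcs. The Python sketch below does this under stated assumptions: arcs are modeled as hypothetical (source, lower, upper, target) tuples, and unique_paths is an illustrative name, not a gfsm function. The actual implementation may identify duplicates differently.

```python
from collections import Counter


def unique_paths(paths, ordered=False):
    """Drop paths that differ from an earlier path only in arc order.

    paths:   list of paths, each a sequence of hashable arc tuples.
    ordered: models the --ordered flag; if true, nothing is dropped.
    """
    if ordered:
        return list(paths)
    seen = set()
    unique = []
    for path in paths:
        # a multiset of arcs ignores the order in which the arcs occur
        key = frozenset(Counter(path).items())
        if key not in seen:
            seen.add(key)
            unique.append(path)
    return unique


# The two paths from the example above, as (source, lower, upper, target) arcs.
p1 = [("q", "<epsilon>", "a", "q"), ("q", "a", "<epsilon>", "q")]
p2 = [("q", "a", "<epsilon>", "q"), ("q", "<epsilon>", "a", "q")]
```

With these definitions, unique_paths([p1, p2]) keeps only p1, while unique_paths([p1, p2], ordered=True) keeps both.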

--distribute-by-path , -P

Distribute pair-mass over multiple paths.

Default: '0'

If true, a total count-mass of 1 will be added for each (input,output) pair and distributed uniformly among all successful paths for that pair. Otherwise, each successful path for a given pair will receive a count-mass of 1 (one).
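The arithmetic can be illustrated with a short Python sketch. The function name path_masses is hypothetical; it only models the rule stated above and is not taken from the gfsm sources.

```python
def path_masses(n_paths, distribute_by_path=False):
    """Return the count-mass assigned to each of n_paths successful paths.

    distribute_by_path models the --distribute-by-path flag.
    """
    if n_paths == 0:
        return []  # no successful path, nothing to count
    mass = 1.0 / n_paths if distribute_by_path else 1.0
    return [mass] * n_paths
```

For a pair with 4 successful paths, each path receives 0.25 with the flag set and 1.0 without it, so the pair contributes a total of 1 or 4 respectively.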

--distribute-by-arc , -A

Distribute path-mass over arcs.

Default: '0'

If true, the total count-mass added to each successful path will be distributed uniformly over all its arcs and its final weight. Otherwise, each arc in the path will receive the full count-mass allotted to that path.
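A minimal Python sketch of this rule follows. The name arc_mass is hypothetical, and the division by the number of arcs plus one reflects the reading that the final weight takes one equal share alongside the arcs.

```python
def arc_mass(path_mass, n_arcs, distribute_by_arc=False):
    """Return the count-mass each arc of a path receives.

    path_mass:         mass allotted to the whole path.
    n_arcs:            number of arcs in the path.
    distribute_by_arc: models the --distribute-by-arc flag.
    """
    if distribute_by_arc:
        # n_arcs shares for the arcs plus one share for the final weight
        return path_mass / (n_arcs + 1)
    return path_mass
```

For a path of 3 arcs with path mass 1, each arc receives 0.25 with the flag set and 1.0 without it.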

--fst=FSTFILE , -fFSTFILE

Transducer to apply (required).

Default: 'NULL'

--compress=LEVEL , -zLEVEL

Specify compression level of output file.

Default: '-1'

Specify zlib compression level of output file. -1 (default) indicates the default compression level, 0 (zero) indicates no zlib compression at all, and 9 indicates the best possible compression.
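The meaning of the zlib levels can be tried out with Python's standard zlib module. This sketch only demonstrates the levels themselves on made-up sample data; it does not reproduce the file format gfsmtrain writes.

```python
import zlib

# Made-up, highly repetitive sample data (not real gfsm output).
data = b"0\t1\ta\tb\t1.0\n" * 1000

default = zlib.compress(data, -1)  # -1: zlib's default level
stored = zlib.compress(data, 0)    #  0: no compression, data is only wrapped
best = zlib.compress(data, 9)      #  9: best compression, slowest

# Every level round-trips to the same bytes; only the size differs.
sizes = {"default": len(default), "stored": len(stored), "best": len(best)}
```

On repetitive input like this, the level 0 result is slightly larger than the input because of framing overhead, while level 9 is much smaller.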

--output=FILE , -FFILE

Specify output file (default=stdout).

Default: '-'

ADDENDA

About this Document

Documentation file auto-generated by optgen.perl version 0.07 using Getopt::Gen version 0.14. Translation was initiated as:

   optgen.perl -l --no-handle-rcfile --nocfile --nohfile --notimestamp -F gfsmtrain gfsmtrain.gog

BUGS AND LIMITATIONS

No negative-cost epsilon cycles are allowed in the transducer.

ACKNOWLEDGEMENTS

Perl by Larry Wall.

Getopt::Gen by Bryan Jurish.

AUTHOR

Bryan Jurish <moocow.bovine@gmail.com>

SEE ALSO

gfsmutils