NAME

mootutils - moot Commandline Utilities


PROGRAMS

The following is a list of the programs and user documentation contained in the libmoot package. See the individual manpages for details.

moot

moocow's HMM part-of-speech tagger/disambiguator. (the moot manpage)

mootchurn

File format converter for moocow's PoS tagger. (the mootchurn manpage)

mootcompile

moocow's HMM part-of-speech tagger/disambiguator: model compiler. (the mootcompile manpage)

mootconfig

moocow's part-of-speech tagger: report configuration. (the mootconfig manpage)

mootdump

moocow's HMM part-of-speech tagger/disambiguator: model dumper. (the mootdump manpage)

mootdyn

moocow's dynamic HMM part-of-speech tagger/disambiguator. (the mootdyn manpage)

mooteval

Output evaluator for moocow's PoS tagger. (the mooteval manpage)

mootpp

Rudimentary tokenizer for moocow's part-of-speech tagger. (the mootpp manpage)

mootrain

moocow's part-of-speech tagger: HMM trainer. (the mootrain manpage)

moottaste

moocow's HMM part-of-speech tagger: heuristic token classifier. (the moottaste manpage)

waste

Word- and Sentence-Token Extractor using a Hidden Markov Model. (the waste manpage)

mootfiles

moot file formats. (the mootfiles manpage)


DESCRIPTION

The mootutils package provides a suite of command-line tools for statistical part-of-speech (PoS) tagging using the libmoot library. In addition to traditional bigram tagging routines, libmoot supports user-specified a priori sets of possible analyses for each input token ("lexical classes"), a technique which has been shown to reduce errors by up to 21% relative to traditional Hidden-Markov-Model (HMM) methods.
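The idea of lexical classes can be illustrated with a minimal sketch. The Python below is not libmoot code and does not reflect its API or its actual probability model; it is a toy bigram Viterbi decoder in which each token may carry a set of admissible tags that restricts the search. All names (viterbi, TAGS, TRANS, EMIT) and all probabilities are hypothetical and chosen only for illustration.

```python
import math

# Toy model (hypothetical numbers, not taken from any moot model file).
TAGS = {"DET", "NOUN", "VERB"}
TRANS = {  # P(tag | previous tag)
    ("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.1, ("<s>", "VERB"): 0.1,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.1,
    ("NOUN", "VERB"): 0.6, ("NOUN", "NOUN"): 0.4,
    ("VERB", "NOUN"): 0.5, ("VERB", "VERB"): 0.1, ("VERB", "DET"): 0.4,
}
EMIT = {  # P(word | tag)
    ("DET", "the"): 0.9,
    ("NOUN", "can"): 0.05, ("VERB", "can"): 0.2,
    ("NOUN", "rusts"): 0.01, ("VERB", "rusts"): 0.02,
}


def viterbi(tokens, classes, trans=TRANS, emit=EMIT, tags=TAGS,
            start="<s>", floor=1e-6):
    """Return the most probable tag sequence under a bigram HMM.

    classes[i] is the set of tags allowed for tokens[i] (its "lexical
    class" in this sketch); None or an empty set means "any tag".
    Unseen events get the probability `floor` (a crude assumption).
    """
    def logp(table, key):
        return math.log(table.get(key, floor))

    # best[t] = (log-probability, path) of the best path ending in tag t
    best = {start: (0.0, [])}
    for tok, cls in zip(tokens, classes):
        candidates = cls if cls else tags
        step = {}
        for t in candidates:
            e = logp(emit, (t, tok))
            score, path = max(
                ((s + logp(trans, (p, t)) + e, pth)
                 for p, (s, pth) in best.items()),
                key=lambda x: x[0],
            )
            step[t] = (score, path + [t])
        best = step
    return max(best.values(), key=lambda x: x[0])[1]


tokens = ["the", "can", "rusts"]
# Unrestricted decoding considers every tag for every token.
print(viterbi(tokens, [None, None, None]))      # ['DET', 'NOUN', 'VERB']
# An a priori class for "can" removes the NOUN reading from the search.
print(viterbi(tokens, [None, {"VERB"}, None]))  # ['DET', 'VERB', 'NOUN']
```

In this sketch the class information acts only as a hard filter on candidate tags; whether and how class membership also enters the probability estimates is left out here.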


ADDENDA

About this Document

Documentation file auto-generated by optgen.perl version 0.07. Translation was initiated as:

   /usr/local/bin/optgen.perl --nocfile --nohfile --nopod --no-handle-help --no-handle-version --no-handle-rcfile --no-handle-error --notimestamp --template=mootutils.skel -


ACKNOWLEDGEMENTS

Initial development of this package was supported by the project 'Kollokationen im Wörterbuch' ("collocations in the dictionary", http://www.bbaw.de/forschung/kollokationen) in association with the project 'Digitales Wörterbuch der deutschen Sprache des 20. Jahrhunderts (DWDS)' ("digital dictionary of the German language of the 20th century", http://www.dwds.de) at the Berlin-Brandenburgische Akademie der Wissenschaften (http://www.bbaw.de) with funding from the Alexander von Humboldt Stiftung (http://www.avh.de) and from the Zukunftsinvestitionsprogramm of the German federal government. Development of the DynHMM and WASTE extensions was supported by the DFG-funded projects 'Deutsches Textarchiv' ("German text archive", http://www.deutschestextarchiv.de) and 'DLEX' at the Berlin-Brandenburgische Akademie der Wissenschaften.

The authors are grateful to Christiane Fellbaum, Alexander Geyken, Gerald Neumann, Edmund Pohl, Alexey Sokirko, and others for offering useful insights in the course of development of this package. Thomas Hanneforth wrote and maintains the libFSM C++ library for finite-state device operations used by the class-based HMM tagger / disambiguator, without which moot could not have been built. Alexander Geyken and Thomas Hanneforth developed the rule-based morphological analysis system for German which was used in the development and testing of the class-based HMM tagger / disambiguator.


SEE ALSO

moot(1), mootchurn(1), mootcompile(1), mootconfig(1), mootdump(1), mootdyn(1), mooteval(1), mootpp(1), mootrain(1), moottaste(1), waste(1), mootm(1)


AUTHOR

Bryan Jurish <moocow@ling.uni-potsdam.de>