README for package 'moot'

Last updated for moot version 2.0.4


DESCRIPTION

moot - moocow's part-of-speech tagger and utilities.


REQUIREMENTS

pkg-config (Required)
Available from: http://www.freedesktop.org/software/pkgconfig/

To build from CVS, you will also need the pkg-config autoconf macros, which come with the source distribution of pkg-config.

STL headers (Required)
If your C++ compiler does not have the STL headers already installed, you will need to get them from somewhere. For gcc-2.x, I recommend STLport >= 4.5.3 (available from http://www.stlport.org ). Before building, you will have to set the environment variables CPPFLAGS, LDFLAGS, and LIBS according to your installation.
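
As a rough sketch of what that might look like in a Bourne-compatible shell: the install prefix /usr/local/STLport, the header subdirectory, and the library name below are all assumptions for illustration, not values taken from this package. Adjust them to your own STLport installation.

```shell
# Hypothetical STLport install prefix; adjust to your installation.
STLPORT_PREFIX=/usr/local/STLport

# Header search path for the C++ preprocessor (assumed layout).
CPPFLAGS="-I${STLPORT_PREFIX}/stlport"
# Library search path for the linker (assumed layout).
LDFLAGS="-L${STLPORT_PREFIX}/lib"
# Library name is an assumption; it depends on how STLport was built.
LIBS="-lstlport_gcc"

# Export so that a later ./configure run in this shell sees the values.
export CPPFLAGS LDFLAGS LIBS
```

Run the configure script from the same shell session afterwards, so that it inherits the exported values.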

flex++ , bison++ (Optional)
Alain Coetmeur's C++ ports of the famous lexer/parser generator pair, available from: ftp://iecc.com/pub/file/bison++flex++ or from the official distribution site of this package.

Tested with flex++-v2.3.8-4 and bison++-v1.21-5.

Only required if you want or need to modify the native I/O formats.

expat (Optional)
XML parser toolkit library by James Clark, required for XML input, available from http://expat.sourceforge.net

Tested version 1.95.6.

librecode (Optional)
Character-set recoding library by François Pinard, useful for XML output, available from http://www.gnu.org/directory/recode.html

Tested version 3.6.

zlib (Optional)
Compression library by Jean-loup Gailly and Mark Adler, useful for compressed binary HMM files. Available from: http://www.gzip.org/zlib

Tested version 1.2.1.

doxygen (Optional)
Required for building library documentation. Available from: http://www.doxygen.org

Tested version 1.2.15.

Perl (Optional)
Required for building command-line parsers, utility documentation, library documentation, etc. Available from: http://www.cpan.org or http://www.perl.com

Getopt::Gen
A Perl module used to generate command-line option parsers. Available from: http://www.ling.uni-potsdam.de/~moocow/projects/perl/index.html#gog

pod2man, pod2text, pod2html (Optional)
The Perl documentation conversion utilities, required for building the corresponding program documentation formats. They should have come with your Perl.


INSTALLATION

Issue the following commands to the shell:

 sh ./configure
 make
 make install

See the file INSTALL in the top-level distribution directory for details.


BUILD FROM CVS

To build from CVS, you need the GNU utilities aclocal, automake, autoconf, and libtool. If you have these, you can just run the top-level script:

 sh ./autogen.sh

This will create the 'configure' script and other necessary build files.

You will also need Perl and the Getopt::Gen Perl module, which should be available from wherever you acquired these sources.
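
Before running autogen.sh, it can help to confirm that the tools named above are actually on your PATH. The following is only a sketch: the helper function check_build_tools is hypothetical and not part of this distribution, and it checks for presence only, not for versions.

```shell
# Report which of the given tools are missing from PATH.
# Prints one "missing: NAME" line per absent tool and
# returns non-zero if at least one tool is missing.
check_build_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools this README names for building from CVS.
check_build_tools aclocal automake autoconf libtool perl pkg-config \
    || echo "install the tools listed above before running autogen.sh"
```

Note that some systems install libtool under a different name, so a "missing" report for it is a hint to look further rather than a definite failure.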


KNOWN ISSUES

Common Warnings

``flex++bison++.pc not found''
You can keep your distro's versions of flex++ and bison++ and ignore this warning. If you want it to go away, install my (old, unmaintained) flex++bison++ package from http://www.ling.uni-potsdam.de/~moocow/projects/moot/flex++bison++-0.0.5.tar.gz

``cannot find optgen.pl program''
If you're building from CVS, this will be fatal. Get my Getopt::Gen Perl module (and Perl, if you don't already have it), build it, install it, then run moot's ./configure again.
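
To see where things stand before re-running ./configure, a quick check along these lines may be useful. It is a sketch only: it assumes that optgen.pl is expected on your PATH and that the module loads under the name Getopt::Gen. The variable names are made up for this example.

```shell
# Is the optgen.pl program named in the warning on PATH?
if command -v optgen.pl >/dev/null 2>&1; then
    optgen_found=yes
else
    optgen_found=no
fi

# Can Perl load the module? -M loads a module, -e 1 runs a no-op script.
# If perl itself is absent, this also ends up as "no".
if perl -MGetopt::Gen -e 1 >/dev/null 2>&1; then
    module_found=yes
else
    module_found=no
fi

echo "optgen.pl on PATH: $optgen_found"
echo "Getopt::Gen loadable: $module_found"
```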

Known Bugs

``osfcn.h No such file or directory''
osfcn.h appears to be a relic of my antiquated flex++bison++ package; you may attempt to use and/or modify the file 'src/libmoot/myosfcn.h' that comes with the distribution as a workaround. Otherwise, you should try to rebuild the lexer/parser .cc and .h files by hand:
  bash$ cd moot-XX.YY
  bash$ touch ./src/libmoot/*.ll ./src/libmoot/*.yy
  bash$ make

... ought to do the trick.


ACKNOWLEDGEMENTS

Development of this package was supported by the project 'Kollokationen im Wörterbuch' (``collocations in the dictionary'', http://www.bbaw.de/forschung/kollokationen ) in association with the project 'Digitales Wörterbuch deutscher Sprache des 20. Jahrhunderts (DWDS)' (``digital dictionary of the German language of the 20th century'', http://www.dwds.de ) at the Berlin-Brandenburgische Akademie der Wissenschaften ( http://www.bbaw.de ) with funding from the Alexander von Humboldt Stiftung ( http://www.avh.de ) and from the Zukunftsinvestitionsprogramm of the German federal government.

I am grateful to Christiane Fellbaum, Alexander Geyken, Thomas Hanneforth, Gerald Neumann, Edmund Pohl, Alexey Sokirko, and others for offering useful insights in the course of development of this package.

Thomas Hanneforth wrote and maintains the libFSM C++ library for finite-state device operations used in the development of the HMM tagger / disambiguator.

Alexander Geyken and Thomas Hanneforth developed the rule-based morphological analysis system for German which was used in the development and testing of the class-based HMM tagger / disambiguator.


AUTHOR

Bryan Jurish <moocow@ling.uni-potsdam.de>