README for package 'moot'
Last updated for moot version 2.0.4
moot - moocow's part-of-speech tagger and utilities.
- pkg-config (Required)
-
Available from:
http://www.freedesktop.org/software/pkgconfig/
-
To build from CVS, you will also need the pkg-config
autoconf macros, which come with the source distribution
of pkg-config.
- STL headers (Required)
-
If your C++ compiler did not come with the STL
headers already installed, you will need
to install them separately.
For gcc-2.x, I recommend STLport >= 4.5.3,
available from http://www.stlport.org
You will have to set the environment variables
CPPFLAGS, LDFLAGS, and LIBS according to
your installation before building.
- flex++ , bison++ (Optional)
-
Alain Coetmeur's C++ ports of the famous lexer/parser
generator pair, available from:
ftp://iecc.com/pub/file/bison++flex++
or
from the official distribution site of this package.
-
Tested with flex++-v2.3.8-4 and bison++-v1.21-5.
-
These should only be required if you want or need to
modify the native I/O formats.
- expat (Optional)
-
XML parser toolkit library by James Clark, required for XML input,
available from
http://expat.sourceforge.net
-
Tested version 1.95.6.
- librecode (Optional)
-
Character-set recoding library by François Pinard, useful for XML output,
available from
http://www.gnu.org/directory/recode.html
-
Tested version 3.6.
- zlib (Optional)
-
Compression library by Jean-loup Gailly and Mark Adler,
useful for compressed binary HMM files.
Available from:
http://www.gzip.org/zlib
-
Tested version 1.2.1.
- doxygen (Optional)
-
Required for building library documentation.
Available from:
http://www.doxygen.org
-
Tested version 1.2.15.
- Perl (Optional)
-
Get it from http://www.cpan.org or http://www.perl.com
Required for building command-line parsers, utility
documentation, library documentation, etc.
- Getopt::Gen
-
A Perl module used to generate command-line option parsers.
Available from:
http://www.ling.uni-potsdam.de/~moocow/projects/perl/index.html#gog
- pod2man, pod2text, pod2html (Optional)
-
The Perl documentation conversion utilities, required
for building the corresponding program documentation
formats. These should have come with your Perl.
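For the STL headers requirement above, the environment variables
might be set along the following lines before running configure.
This is only a sketch: the installation prefix, the include
subdirectory, and the library name 'stlport_gcc' are assumptions
and must be adjusted to match your own STLport installation.

```shell
# Hypothetical STLport installation prefix; adjust to your system.
STLPORT_PREFIX="${STLPORT_PREFIX:-/usr/local}"

# Header search path (assumes headers are in $STLPORT_PREFIX/include/stlport).
CPPFLAGS="-I${STLPORT_PREFIX}/include/stlport ${CPPFLAGS:-}"

# Library search path.
LDFLAGS="-L${STLPORT_PREFIX}/lib ${LDFLAGS:-}"

# The library name is an assumption; check the actual file name
# in your library directory.
LIBS="-lstlport_gcc ${LIBS:-}"

# Export the variables so that ./configure can see them.
export CPPFLAGS LDFLAGS LIBS
```

Any values already present in these variables are kept and appear
after the STLport settings.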
To build and install the package, issue the following
commands to the shell:
    sh ./configure
    make
    make install
See the file INSTALL in the top-level distribution
directory for details.
To build from CVS, you need the GNU utilities
aclocal, automake, autoconf, and libtool. If
you have these, you can just run the top-level
script:
    sh ./autogen.sh
This will create the 'configure' script and other
necessary build files.
You will also need Perl and the Getopt::Gen Perl module,
which should be available from wherever you acquired
these sources.
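Before running autogen.sh, it may help to check that the tools
named above are actually on your PATH. The following is a minimal
sketch; the helper name 'missing_tools' is made up for this example
and is not part of the moot distribution. Note that some systems
install these tools under different names, in which case the check
will report them as missing.

```shell
# Print the name of each given tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools needed for a CVS build.
missing=$(missing_tools aclocal automake autoconf libtool perl)
if [ -n "$missing" ]; then
    # $missing is left unquoted so the names are printed on one line.
    echo "missing build tools:" $missing >&2
else
    echo "all build tools found"
fi
```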
- ``flex++bison++.pc not found''
-
If you want this to go away, install my (old, unmaintained)
flex++bison++ package from
http://www.ling.uni-potsdam.de/~moocow/projects/moot/flex++bison++-0.0.5.tar.gz
Otherwise, keep your distro's versions and
ignore the warning.
- ``cannot find optgen.pl program''
-
If you're building from CVS, this will be fatal. Get my Getopt::Gen
Perl module (and Perl, if you haven't already), build it,
install it, then run moot's ./configure again.
- ``osfcn.h No such file or directory''
-
osfcn.h appears to be a relic of my antiquated flex++bison++
package; you may attempt to use and/or modify the file
'src/libmoot/myosfcn.h' that comes with the distribution as a
workaround. Otherwise, you should try to rebuild the lexer/parser
.cc and .h files by hand:
-
    bash$ cd moot-XX.YY
    bash$ touch ./src/libmoot/*.ll ./src/libmoot/*.yy
    bash$ make
-
... ought to do the trick.
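To see whether pkg-config can find the flex++bison++ module named
in the first warning above, a check like the following may be
useful. It is a sketch under two assumptions: that the module name
matches the .pc file name from the warning, and that the helper
name 'have_pc_module' (made up for this example) suits your setup.
If the package is installed under a non-standard prefix, pointing
PKG_CONFIG_PATH at the directory containing the .pc file may help.

```shell
# Succeed if pkg-config is installed and knows the given module.
have_pc_module() {
    command -v pkg-config >/dev/null 2>&1 && pkg-config --exists "$1"
}

if have_pc_module 'flex++bison++'; then
    echo "flex++bison++ is visible to pkg-config"
else
    echo "flex++bison++ is not visible to pkg-config"
fi
```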
Development of this package was supported by the project
'Kollokationen im Wörterbuch'
(``collocations in the dictionary'', http://www.bbaw.de/forschung/kollokationen )
in association with the project
'Digitales Wörterbuch deutscher Sprache des 20. Jahrhunderts (DWDS)'
(``digital dictionary of the German language of the 20th century'', http://www.dwds.de )
at the Berlin-Brandenburgische Akademie der Wissenschaften ( http://www.bbaw.de )
with funding from
the Alexander von Humboldt Stiftung ( http://www.avh.de )
and from the Zukunftsinvestitionsprogramm of the
German federal government.
I am grateful to Christiane Fellbaum, Alexander Geyken,
Thomas Hanneforth, Gerald Neumann, Edmund Pohl, Alexey Sokirko,
and others for offering useful insights in the course of development
of this package.
Thomas Hanneforth wrote and maintains the libFSM C++ library
for finite-state device operations
used in the development of the HMM tagger / disambiguator.
Alexander Geyken and Thomas Hanneforth developed the
rule-based morphological analysis system for German
which was used in the development and testing of the
class-based HMM tagger / disambiguator.
Bryan Jurish <moocow@ling.uni-potsdam.de>