moot::mootHMMTrainer Class Reference

High-level class to gather training data for a mootHMM or mootCHMM.

Public Types

Training types
typedef mootNgrams::Ngram Ngram
 
typedef mootNgrams::NgramCount CountT
 
typedef set< mootTagString > TagSet
 

Public Member Functions

Constructor / destructor
 mootHMMTrainer (void)
 
 ~mootHMMTrainer (void)
 
Reset / Clear
void clear (void)
 
Top-level training methods
bool train_from_reader (TokenReader *reader)
 
bool train_from_stream (FILE *in=stdin, const string &srcname="(unknown)")
 
bool train_from_file (const string &filename)
 
bool train_finish (void)
 
Mid-level training methods
void train_init (void)
 
void train_bos (void)
 
void train_token (const mootToken &curtok)
 
void train_eos (void)
 
Warnings / Errors
void carp (const char *fmt,...)
 

Public Attributes

Training data
mootNgrams ngrams
 
mootLexfreqs lexfreqs
 
mootClassfreqs lcfreqs
 
mootTaster taster
 
Flags
bool want_ngrams
 
bool want_lexfreqs
 
bool want_classfreqs
 
bool want_flavors
 
Pragmatic constants
mootTagString eos_tag
 

Protected Attributes

Runtime training state
Ngram ng
 
bool last_was_eos
 

Member Typedef Documentation

◆ Ngram

Type for an N-gram

◆ CountT

Type for counts

◆ TagSet

Type for current tag-sets

Constructor & Destructor Documentation

◆ mootHMMTrainer()

moot::mootHMMTrainer::mootHMMTrainer ( void  )
inline

Default constructor

◆ ~mootHMMTrainer()

moot::mootHMMTrainer::~mootHMMTrainer ( void  )
inline

Default destructor

Member Function Documentation

◆ clear()

void moot::mootHMMTrainer::clear ( void  )
inline

◆ train_from_reader()

bool moot::mootHMMTrainer::train_from_reader ( TokenReader *  reader)

Gather training data using TokenIO layer

◆ train_from_stream()

bool moot::mootHMMTrainer::train_from_stream ( FILE *  in = stdin,
const string &  srcname = "(unknown)" 
)

Gather training data from a native text-format C-stream

◆ train_from_file()

bool moot::mootHMMTrainer::train_from_file ( const string &  filename)

Gather training data from a file using mootTaggerLexer

◆ train_finish()

bool moot::mootHMMTrainer::train_finish ( void  )

Finish training and compute "special" pseudo-frequencies (e.g. flavors).

◆ train_init()

void moot::mootHMMTrainer::train_init ( void  )

Initialize training data

◆ train_bos()

void moot::mootHMMTrainer::train_bos ( void  )

Initialize data for training a new sentence

◆ train_token()

void moot::mootHMMTrainer::train_token ( const mootToken &  curtok)

Gather training information for a single token, using mootToken interface

◆ train_eos()

void moot::mootHMMTrainer::train_eos ( void  )

Gather training information for a sentence boundary.
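
The mid-level methods above suggest a per-sentence call order: train_init() once, then train_bos(), one train_token() per token, and train_eos() for each sentence, followed by train_finish(). The sketch below illustrates that order with a self-contained stand-in. SketchTrainer, its Bigram typedef, and train_sentences() are hypothetical names, not part of moot; the stand-in takes a plain tag string where the real train_token() takes a mootToken, and it only counts tag bigrams.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in that only mirrors the documented method names.
// It is NOT the moot implementation; it counts tag bigrams to show call order.
struct SketchTrainer {
  typedef std::pair<std::string, std::string> Bigram;
  std::map<Bigram, unsigned long> bigrams;  // bigram -> observed count
  std::string eos_tag;                      // sentence-boundary tag
  std::string prev;                         // previous tag in the window

  SketchTrainer() : eos_tag("__$") {}

  // Reset all gathered data.
  void train_init() { bigrams.clear(); prev = eos_tag; }
  // Start a new sentence: the window begins at the boundary tag.
  void train_bos() { prev = eos_tag; }
  // Count the transition from the previous tag to this one.
  void train_token(const std::string &tag) {
    ++bigrams[Bigram(prev, tag)];
    prev = tag;
  }
  // Count the transition from the last tag to the boundary tag.
  void train_eos() {
    ++bigrams[Bigram(prev, eos_tag)];
    prev = eos_tag;
  }
  // Nothing to post-process in this sketch.
  bool train_finish() { return true; }
};

// Drive the stand-in over a corpus given as sentences of tags.
inline bool train_sentences(
    SketchTrainer &t,
    const std::vector<std::vector<std::string> > &corpus) {
  t.train_init();
  for (std::size_t i = 0; i < corpus.size(); ++i) {
    t.train_bos();
    for (std::size_t j = 0; j < corpus[i].size(); ++j)
      t.train_token(corpus[i][j]);
    t.train_eos();
  }
  return t.train_finish();
}
```

In real code the top-level methods (train_from_reader(), train_from_stream(), train_from_file()) would presumably run an equivalent loop internally, so the mid-level calls matter mainly when tokens come from a custom source.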

◆ carp()

void moot::mootHMMTrainer::carp ( const char *  fmt,
  ... 
)

Error reporting
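
The signature of carp() is printf-style: a format string followed by variadic arguments. As a rough illustration of how such a reporter can be written, the sketch below formats the message with va_list and std::vsnprintf. The names vformat_message(), format_message(), and sketch_carp() are hypothetical, and the 512-byte buffer and stderr destination are assumptions; this is not moot's implementation of carp().

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Format a printf-style message from an already-started va_list.
// Output longer than the buffer is truncated (buffer size is an assumption).
inline std::string vformat_message(const char *fmt, va_list ap) {
  char buf[512];
  int n = std::vsnprintf(buf, sizeof(buf), fmt, ap);
  return n < 0 ? std::string() : std::string(buf);
}

// Variadic front end that returns the formatted text.
inline std::string format_message(const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  std::string msg = vformat_message(fmt, ap);
  va_end(ap);
  return msg;
}

// Hypothetical carp-like reporter: same signature shape, writes to stderr.
inline void sketch_carp(const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  std::string msg = vformat_message(fmt, ap);
  va_end(ap);
  std::fputs(msg.c_str(), stderr);
}
```

Splitting the va_list version from the variadic front ends keeps the formatting logic in one place, since a variadic function cannot forward its arguments to another variadic function directly.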

Member Data Documentation

◆ ngrams

mootNgrams moot::mootHMMTrainer::ngrams

Raw n-gram frequency data

◆ lexfreqs

mootLexfreqs moot::mootHMMTrainer::lexfreqs

Raw lexical frequency data

◆ lcfreqs

mootClassfreqs moot::mootHMMTrainer::lcfreqs

Raw lexical-class frequency data

◆ taster

mootTaster moot::mootHMMTrainer::taster

Heuristic token classifier (default: built-in rules)

◆ want_ngrams

bool moot::mootHMMTrainer::want_ngrams

Whether to gather n-gram frequency data

◆ want_lexfreqs

bool moot::mootHMMTrainer::want_lexfreqs

Whether to gather lexical frequency data

◆ want_classfreqs

bool moot::mootHMMTrainer::want_classfreqs

Whether to gather lexical-class frequency data

◆ want_flavors

bool moot::mootHMMTrainer::want_flavors

Whether to gather lexical-flavor information
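
The want_* flags each switch one kind of data gathering on or off. A minimal sketch of that gating pattern follows. FlagSketch and observe() are hypothetical, the default values are assumptions, and the real class stores its counts in mootNgrams and mootLexfreqs rather than in plain maps.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in: flag names mirror the documented attributes, but the
// gating logic and the defaults are assumptions, not moot's implementation.
struct FlagSketch {
  bool want_ngrams;    // gather tag counts (unigrams only, for brevity)
  bool want_lexfreqs;  // gather (word, tag) counts
  std::map<std::string, unsigned long> tag_counts;
  std::map<std::pair<std::string, std::string>, unsigned long> lex_counts;

  FlagSketch() : want_ngrams(true), want_lexfreqs(true) {}

  // Update only the tables whose flag is set.
  void observe(const std::string &word, const std::string &tag) {
    if (want_ngrams) ++tag_counts[tag];
    if (want_lexfreqs) ++lex_counts[std::make_pair(word, tag)];
  }
};
```

Clearing a flag before training skips the corresponding table entirely, which is useful when only one of the output files is needed.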

◆ eos_tag

mootTagString moot::mootHMMTrainer::eos_tag

String indicating end-of-sentence: this is usually __$

◆ ng

Ngram moot::mootHMMTrainer::ng
protected

Current n-gram window

◆ last_was_eos

bool moot::mootHMMTrainer::last_was_eos
protected

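Stupid hack: judging by its name, records whether the most recently processed item was an end-of-sentence marker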


The documentation for this class was generated from the following file: