
moot Namespace Reference

Classes

String Utilities

Named File Utilities

Command-line utilities

Typedefs

Enumerations

Functions

Variables


Detailed Description

Default input buffer length for XML parsers


Typedef Documentation

typedef ProbT moot::CountT
 

Count types (for raw frequencies)

typedef list<mootToken> moot::mootSentence
 

Sentences are just lists of mootToken objects

typedef set<mootTagString> moot::mootTagSet
 

Tagset (read "lexical class") type

typedef string moot::mootTagString
 

Tag-string type

typedef mootTokenTypeE moot::mootTokenType
 

typedef string moot::mootTokString
 

Token-string type

typedef float moot::ProbT
 

Type for probabilities

typedef AssocVector<mootEnumID,ProbT> moot::SuffixTrieDataT
 

Typedef for suffix trie data

typedef TokenIOFormatE moot::TokenIOFormat
 


Enumeration Type Documentation

enum mootTokenFlavor
 

Enum for TnT-style token typification

Enumeration values:
TokFlavorAlpha  (Mostly) alphabetic token: "foo", "bar", "foo2bar"
TokFlavorCard  Digits only: "42"
TokFlavorCardPunct  Digits with single-char punctuation suffix: "42."
TokFlavorCardSuffix  Digits with (almost any) suffix: "42nd"
TokFlavorCardSeps  Digits with interpunctuation: "420.24/7"
TokFlavorUnknown  Special "Unknown" token-type
NTokFlavors  Not really a token-type
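
The flavor examples above can be illustrated with a small stand-alone classifier. This is only a sketch: the rules are inferred from the example strings in the list, and the names SketchFlavor and sketch_token_flavor are hypothetical, not part of libmoot. The actual tokenFlavor() heuristics may differ.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for moot::mootTokenFlavor (not the library's enum).
enum class SketchFlavor { Alpha, Card, CardPunct, CardSuffix, CardSeps };

// Classify a token using rules inferred from the documented examples only.
inline SketchFlavor sketch_token_flavor(const std::string &tok) {
  auto is_digit = [](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; };
  auto is_punct = [](char c) { return std::ispunct(static_cast<unsigned char>(c)) != 0; };

  // Count the leading run of digits.
  std::size_t i = 0;
  while (i < tok.size() && is_digit(tok[i])) ++i;

  if (i == 0) return SketchFlavor::Alpha;          // "foo", "foo2bar"
  if (i == tok.size()) return SketchFlavor::Card;  // "42"
  if (i + 1 == tok.size() && is_punct(tok[i]))
    return SketchFlavor::CardPunct;                // "42."

  // The remainder is either digits mixed with punctuation, or some other suffix.
  for (std::size_t j = i; j < tok.size(); ++j) {
    if (!is_digit(tok[j]) && !is_punct(tok[j]))
      return SketchFlavor::CardSuffix;             // "42nd"
  }
  return SketchFlavor::CardSeps;                   // "420.24/7"
}
```

Tokens that do not start with a digit fall through to the alphabetic flavor here, which is an assumption of the sketch rather than documented behavior.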

enum mootTokenTypeE
 

Enumeration values:
TokTypeUnknown  we dunno what it is -- could be anything
TokTypeVanilla  plain "vanilla" token (+/-besttag,+/-analyses)
TokTypeLibXML  plain XML token; much like 'Vanilla'
TokTypeXMLRaw  Raw XML text (for lossless XML I/O)
TokTypeComment  a comment, should be ignored by processing routines
TokTypeEOS  end-of-sentence
TokTypeEOF  end-of-file
TokTypeUser  user-defined token type: use in conjunction with 'user_data'
NTokTypes  number of token-types (not a type itself)

enum TokenIOFormatE
 

Enum for I/O format flags

Enumeration values:
tiofNone  no format
tiofUnknown  unknown format
tiofNull  null i/o, useful for testing
tiofUser  some user-defined format
tiofNative  native text format
tiofXML  XML format.
tiofConserve  Conserve raw XML.
tiofPretty  Pretty-print (XML only).
tiofText  Pretty-print (XML only).
tiofAnalyzed  input is pre-analyzed (>= "medium rare")
tiofTagged  input is tagged ("medium" or "well done")
tiofPruned  pruned output
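
Since these values are described as format flags, they are presumably meant to be combined bitwise. The sketch below shows that general pattern with a hypothetical enum; the names (SketchFormat, sf*) and bit values are assumptions made for illustration and are not the actual tiof* constants.

```cpp
// Hypothetical flag enum; libmoot's real tiof* values are not shown on this page.
enum SketchFormat : unsigned {
  sfNone   = 0u,
  sfNative = 1u << 0,
  sfXML    = 1u << 1,
  sfPretty = 1u << 2,
  sfTagged = 1u << 3
};

// True iff flag f is set in mask.
inline bool sketch_has_flag(unsigned mask, SketchFormat f) {
  return (mask & f) != 0u;
}

// Return mask with flag f added.
inline unsigned sketch_set_flag(unsigned mask, SketchFormat f) {
  return mask | f;
}
```

With such an encoding, a request like "pretty-printed XML" would be written as the bitwise OR of two flags, and modifiers such as the pretty-print flag stay meaningful only alongside the format they apply to.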


Function Documentation

bool hmm_parse_model_name(const std::string &modelname,
    std::string &binfile,
    std::string &lexfile,
    std::string &ngfile,
    std::string &lcfile)
 

Utility for mootHMM::load_model() and friends: parse a model name according to the conventions described in mootfiles(5).

Parameters:
modelname name of the model
binfile output string for binary model filename
lexfile output string for lexical frequency text-format filename
ngfile output string for n-gram frequency text-format filename
lcfile output string for class frequency text-format filename

bool hmm_parse_model_name_text(const std::string &modelname,
    std::string &lexfile,
    std::string &ngfile,
    std::string &lcfile)
 

Utility for mootHMM::load_model() and friends: parse a text-model name according to the conventions described in mootfiles(5).

Parameters:
modelname name of the model
lexfile output string for lexical frequency text-format filename
ngfile output string for n-gram frequency text-format filename
lcfile output string for class frequency text-format filename

bool isTokFlavorName(const mootTokString tokstr) [inline]
 

Returns true iff tokstr is a pseudo-identifier for a non-alpha type. Used during HMM and trie compilation.

std::string moot_banner(void)
 

Return a banner string for the library

char* moot_extension(const char *filename) [inline]
 

Get extension of a filename (including leading '.')

char* moot_extension(const char *filename,
    size_t pos) [inline]
 

Get final extension of a filename (including leading '.'), reading backwards from (filename+pos). Returns a pointer into filename. If no next extension is found, returns NULL.
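
The backwards search described above can be sketched as follows. This is a hypothetical re-implementation for illustration (sketch_extension is not a libmoot name); it assumes the search starts at the character just before filename+pos and does not treat directory separators specially.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Scan backwards from filename+pos for a '.', returning a pointer into
// filename, or nullptr if no '.' precedes that position.
inline const char *sketch_extension(const char *filename, std::size_t pos) {
  assert(filename != nullptr);
  while (pos > 0) {
    --pos;
    if (filename[pos] == '.') return filename + pos;
  }
  return nullptr;
}

// Convenience overload: start from the end of the string.
inline const char *sketch_extension(const char *filename) {
  return sketch_extension(filename, std::strlen(filename));
}
```

Under these assumptions, calling the one-argument form on "corpus.tar.gz" yields a pointer to ".gz", and passing that pointer's offset back in as pos yields the next extension to the left, a pointer to ".tar.gz".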

bool moot_file_exists(const char *filename)
 

Check whether a file exists by trying to open it with 'fopen()'
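
A minimal sketch of an fopen()-based existence check is shown below. The function name sketch_file_exists and the read-only open mode are assumptions; the page does not say which mode libmoot uses.

```cpp
#include <cstdio>

// Report whether filename can be opened for reading.
// Hypothetical helper, not libmoot's implementation.
inline bool sketch_file_exists(const char *filename) {
  std::FILE *f = std::fopen(filename, "r");
  if (f == nullptr) return false;
  std::fclose(f);  // only the open attempt matters; release the handle
  return true;
}
```

A check of this kind cannot distinguish a missing file from one that exists but is unreadable, so a false result means "could not be opened" rather than strictly "does not exist".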

std::string moot_normalize_ws(const std::string &s,
    bool trim_left = true,
    bool trim_right = true) [inline]


Create and return a whitespace-normalized STL string from a different STL string.

Parameters:
s source string
trim_left whether to trim all leading whitespace
trim_right whether to trim all trailing whitespace

std::string moot_normalize_ws(const char *s,
    bool trim_left = true,
    bool trim_right = true) [inline]


Create and return a whitespace-normalized STL string from a NUL-terminated C string.

Parameters:
s source string
trim_left whether to trim all leading whitespace
trim_right whether to trim all trailing whitespace

std::string moot_normalize_ws(const char *buf,
    size_t len,
    bool trim_left = true,
    bool trim_right = true) [inline]


Create and return a whitespace-normalized STL string from a C memory buffer.

Parameters:
buf source buffer
len length of source buffer, in bytes
trim_left whether to trim all leading whitespace
trim_right whether to trim all trailing whitespace

void moot_normalize_ws(const char *s,
    std::string &out,
    bool trim_left = true,
    bool trim_right = true) [inline]


Append a whitespace-normalized NUL-terminated C string to an STL string.

Parameters:
s source string
out destination STL string
trim_left whether to trim all leading whitespace
trim_right whether to trim all trailing whitespace

void moot_normalize_ws(const std::string &in,
    std::string &out,
    bool trim_left = true,
    bool trim_right = true)


Append a whitespace-normalized C++ string to another C++ string. All whitespace substrings in in are replaced with a single space in out. out is not cleared.

Parameters:
in source string
out destination string
trim_left whether to trim all leading whitespace
trim_right whether to trim all trailing whitespace

void moot_normalize_ws(const char *buf,
    size_t len,
    std::string &out,
    bool trim_left = true,
    bool trim_right = true)


Append a whitespace-normalized C buffer to an STL string. All whitespace substrings in buf are replaced with a single space in out. out is not cleared.

Parameters:
buf source buffer
len length of source buffer in bytes
out destination STL string
trim_left whether to trim all leading whitespace
trim_right whether to trim all trailing whitespace
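
The append-style normalization described for this family of functions can be sketched as below. This is a stand-alone illustration under stated assumptions, not libmoot's code: sketch_normalize_ws is a hypothetical name, and "whitespace" is taken to mean whatever std::isspace accepts in the current locale.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Append a whitespace-normalized copy of buf[0..len) to out.
// Runs of whitespace collapse to one space; out is not cleared.
inline void sketch_normalize_ws(const char *buf, std::size_t len, std::string &out,
                                bool trim_left = true, bool trim_right = true) {
  auto is_ws = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };

  std::size_t begin = 0, end = len;
  if (trim_left)  while (begin < end && is_ws(buf[begin])) ++begin;
  if (trim_right) while (end > begin && is_ws(buf[end - 1])) --end;

  bool in_ws = false;  // true while inside a run of whitespace
  for (std::size_t i = begin; i < end; ++i) {
    if (is_ws(buf[i])) {
      if (!in_ws) out.push_back(' ');  // emit one space per run
      in_ws = true;
    } else {
      out.push_back(buf[i]);
      in_ws = false;
    }
  }
}
```

Because the destination is appended to rather than overwritten, a caller that wants a fresh result clears out first. With both trim flags disabled, leading and trailing runs are kept but still collapse to a single space each.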

bool moot_parse_doubles(char *str,
    double *dbls,
    size_t ndbls)
 

Parse a comma-separated list of doubles (at most 'ndbls') from str into dbls. You should already have allocated space for ndbls doubles in dbls.
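
One way to parse such a list into a caller-allocated array is sketched below using std::strtod. The function name sketch_parse_doubles and the meaning given to the bool result (true iff every field seen was a valid number) are assumptions; the page does not document what moot_parse_doubles returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Parse up to ndbls comma-separated doubles from str into dbls.
// Hypothetical helper; input beyond ndbls fields is ignored (assumption).
inline bool sketch_parse_doubles(const char *str, double *dbls, std::size_t ndbls) {
  assert(str != nullptr && (dbls != nullptr || ndbls == 0));
  const char *p = str;
  for (std::size_t n = 0; n < ndbls; ++n) {
    char *end = nullptr;
    const double v = std::strtod(p, &end);
    if (end == p) return false;     // no number where one was expected
    dbls[n] = v;
    if (*end == '\0') return true;  // whole string consumed
    if (*end != ',') return false;  // unexpected character after the number
    p = end + 1;                    // skip the comma
  }
  return true;
}
```

The array must hold at least ndbls elements before the call, matching the allocation requirement stated above. Unlike the documented function, the sketch takes a const pointer and does not modify the input string.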

std::string moot_program_banner(const std::string &prog_name,
    const std::string &prog_version,
    const std::string &prog_author,
    bool is_free = true)
 

Return a full banner string for a program using the library.

void moot_remove_newlines(std::string &s) [inline]
 

Remove all newlines from an STL string.

void moot_remove_newlines(char *s) [inline]
 

Remove all newlines from a NUL-terminated C string.

void moot_remove_newlines(char *buf,
    size_t len) [inline]


Remove all newlines from a C buffer. Every newline is replaced with a single space.

Parameters:
buf target buffer
len length of target buffer in bytes
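
The in-place replacement described for the buffer variant amounts to a single pass over the bytes. The sketch below uses a hypothetical name (sketch_remove_newlines) and treats only '\n' as a newline; whether libmoot also handles '\r' is not stated on this page.

```cpp
#include <cstddef>

// Replace every '\n' in buf[0..len) with a single space, in place.
// Hypothetical helper, not libmoot's implementation.
inline void sketch_remove_newlines(char *buf, std::size_t len) {
  for (std::size_t i = 0; i < len; ++i) {
    if (buf[i] == '\n') buf[i] = ' ';
  }
}
```

Since each newline is swapped for exactly one space, the buffer length is unchanged and no reallocation is needed.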

std::list<std::string> moot_strtok(const std::string &s,
    const std::string &delim) [inline]


Tokenize an STL string to a new list.

Parameters:
s source string
delim string of delimiter characters

void moot_strtok(const std::string &s,
    const std::string &delim,
    std::list< std::string > &out)


Tokenize an STL string to an existing list.

Parameters:
s source string
delim string of delimiter characters
out destination string list
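
Both tokenizer variants can be sketched with std::string's character-set searches. The code below is a hypothetical illustration (sketch_strtok is not a libmoot name). It assumes strtok-like semantics, where runs of delimiters produce no empty tokens, and it appends to the existing list without clearing it; the library's actual behavior on both points is not documented here.

```cpp
#include <list>
#include <string>

// Split s on any character in delim, appending tokens to out.
inline void sketch_strtok(const std::string &s, const std::string &delim,
                          std::list<std::string> &out) {
  const std::string::size_type npos = std::string::npos;
  std::string::size_type pos = s.find_first_not_of(delim);  // start of first token
  while (pos != npos) {
    const std::string::size_type end = s.find_first_of(delim, pos);  // end of token
    out.push_back(s.substr(pos, end == npos ? npos : end - pos));
    if (end == npos) break;
    pos = s.find_first_not_of(delim, end);  // skip the delimiter run
  }
}

// Convenience overload returning a new list.
inline std::list<std::string> sketch_strtok(const std::string &s,
                                            const std::string &delim) {
  std::list<std::string> out;
  sketch_strtok(s, delim, out);
  return out;
}
```

The delimiter argument is a set of characters rather than a separator string, so a delim of ",;" splits on either character.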

std::string moot_unextend(const char *filename)
 

Get path+basename of a file

mootTokenFlavor tokenFlavor(const mootTokString token) [inline]
 

Get the token flavor (a mootTokenFlavor value) for a given token

bool tokenFlavor_isCardPunctChar(const char c) [inline]
 

TnT compatibility hack

bool tokenFlavor_isCardSuffixChar(const char c) [inline]
 

TnT compatibility hack


Variable Documentation

const char* moot::mootTokenFlavorNames[NTokFlavors]
 

Convert token flavors (mootTokenFlavor values) to symbolic names

const char* moot::mootTokenTypeNames[NTokTypes]
 

Useful for debugging token types


Generated on Mon Jun 27 13:05:26 2005 for libmoot by  doxygen 1.3.8-20040913