ddc
CStringIndexator Class Reference

#include <StringIndexator.h>

Public Types

typedef map< string, string > IndexAliasMap
 typedef for index alias maps
 
typedef map< string, CStringIndexSet * > IndexMap
 typedef for index symbol table
 

Public Member Functions

 CStringIndexator ()
 
 ~CStringIndexator ()
 
bool RegisterStringIndices (const string &IndicesStr)
 read index declarations from a string and register them
 
bool RegisterIndexAliases (const string &IndexAliasStr)
 read index alias declarations from a string and register them; returns true iff all registrations were successful
 
bool RegisterIndexAlias (const string &AliasFrom, const string &AliasTo)
 register a single index alias (low-level); returns true iff AliasTo resolves to a known index according to m_IndexMap
 
void RegisterIndexAlias (const string &AliasFrom, CStringIndexSet *idx)
 register a single index label or alias (lowest-level); if idx is NULL, any existing entry for AliasFrom will be deleted
 
void SetPath (string Path)
 set the path to the indices
 
string GetIndicesString () const
 return all registered index declarations, in opt-file syntax
 
string GetIndexAliasString () const
 return all registered index aliases, in opt-file syntax
 
size_t GetSearchPeriodsCount () const
 return the number of corpus periods
 
const CTokenNo & GetSearchPeriod (size_t i) const
 get a corpus period by an index
 
bool StartIndexing (string Path)
 call CreateTempFiles for all registered indices
 
void TerminateIndexing ()
 call DeleteTempFiles for all registered indices
 
bool FinalSaveAllIndices (bool bAfterLoading)
 final saving all indices to disk (converting temp files to persistent)
 
bool AddInputLoadIndexToMemoryLoadIndex ()
 unites input index with memory index and clears input load index
 
bool AddMemoryLoadIndexToMainLoadIndex ()
 unites memory index with main index and clears memory load index
 
bool SaveMemoryLoadIndex ()
 store memory load index on the disk
 
CStringIndexSet * GetIndexByName (const string &Name)
 return a pointer to the index by CStringIndexSet::m_Name (linear search)
 
CStringIndexSet * GetIndexByNameOrShortName (const string &Name)
 return a pointer to the index by CStringIndexSet::m_Name or CStringIndexSet::m_ShortName (linear search)
 
CStringIndexSet * GetIndexByAlias (const string &Alias) const
 return a pointer to the index by long-name, short-name, or alias (most abstract, uses m_IndexMap)
 
CStringIndexSet * GetTokenIndex ()
 return the first index that normally contains tokens themselves
 
const CStringIndexSet * GetTokenIndex () const
 return the first index that normally contains tokens themselves
 

Public Attributes

string m_Path
 where all indices are stored
 
bool m_bMemoryMap
 whether to directly mmap() index file data (default=false)
 
vector< CStringIndexSet * > m_Indices
 the registered indices, by positional index
 
IndexAliasMap m_IndexAlias
 declared index aliases (FROM -> TO); not really used at runtime
 
IndexMap m_IndexMap
 all registered indices, keyed by long-name, short-name, or label (LABEL -> INDEX)
 
size_t m_MaxRegExpExpansionSize
 the maximal number of index items which can be included in an expansion set of one regular expression
 
CStringIndexSet * m_pChunkIndex
 a quick reference to a chunk index, if CConcIndexator::m_bIndexChunks is on, otherwise null
 

Protected Member Functions

bool RegisterChunkIndex ()
 register chunk index (chunks:NP, VP etc)
 
string GetSearchPeriodsFileName () const
 return the file name for search periods
 
bool DestroyIndices ()
 call DestroyIndexSet for all registered indices
 
void ReadIndicesFromTheDisk ()
 call ReadFromTheDisk for all registered indices
 
void ClearStringIndices ()
 clear m_Indices
 
void IndexOneToken (CTokenIndexator *document, const char *Line, bool tryFixErrors=true)
 index one token and its properties (delimited by CConcCommon.h::globalFieldDelimeter)
 
void IndexTokenFixLongColumns (const size_t MaxLen, const size_t nCols, const char *InputLine, char *Out)
 moo: truncate long columns in InputLine, storing result in Out
 

Protected Attributes

vector< CTokenNo > m_SearchPeriods
 search periods of the corpus
 

Detailed Description

CStringIndexator contains the set of all token indices and the corpus periods. It also contains the main path to the project file.

Member Typedef Documentation

◆ IndexAliasMap

typedef map<string,string> CStringIndexator::IndexAliasMap

typedef for index alias maps

◆ IndexMap

typedef map<string,CStringIndexSet*> CStringIndexator::IndexMap

typedef for index symbol table

Constructor & Destructor Documentation

◆ CStringIndexator()

CStringIndexator::CStringIndexator ( )

◆ ~CStringIndexator()

CStringIndexator::~CStringIndexator ( )

References ClearStringIndices().

Member Function Documentation

◆ RegisterChunkIndex()

bool CStringIndexator::RegisterChunkIndex ( )
protected

register chunk index (chunks:NP, VP etc)

References ChunkIndexName, GetIndexByAlias(), CStringIndexSet::InitIndexSet(), m_Indices, and m_pChunkIndex.

Referenced by CConcordance::LoadOptionsFromString().

◆ GetSearchPeriodsFileName()

string CStringIndexator::GetSearchPeriodsFileName ( ) const
protected

return the file name for search periods

References m_Path, and MakeFName().

Referenced by CConcIndexator::DestroyIndex(), FinalSaveAllIndices(), and ReadIndicesFromTheDisk().

◆ DestroyIndices()

bool CStringIndexator::DestroyIndices ( )
protected

call DestroyIndexSet for all registered indices

References m_Indices.

Referenced by CConcIndexator::DestroyIndex().

◆ ReadIndicesFromTheDisk()

void CStringIndexator::ReadIndicesFromTheDisk ( )
protected

call ReadFromTheDisk for all registered indices

References GetSearchPeriodsFileName(), m_Indices, m_SearchPeriods, and ReadVector().

◆ ClearStringIndices()

void CStringIndexator::ClearStringIndices ( )
protected

clear m_Indices

References m_Indices.

Referenced by RegisterStringIndices(), and ~CStringIndexator().

◆ IndexOneToken()

void CStringIndexator::IndexOneToken ( CTokenIndexator *  document,
const char *  Line,
bool  tryFixErrors = true 
)
protected

index one token and its properties (delimited by CConcCommon.h::globalFieldDelimeter)

◆ IndexTokenFixLongColumns()

void CStringIndexator::IndexTokenFixLongColumns ( const size_t  MaxLen,
const size_t  nCols,
const char *  InputLine,
char *  Out 
)
protected

moo: truncate long columns in InputLine, storing result in Out

References ddcLogWarn, Format(), globalFieldDelimeter, and stringSplit().

Referenced by IndexOneToken().
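
The column truncation described above can be illustrated with a small standalone sketch. This is not the DDC implementation: `kFieldDelim` and `TruncateLongColumns` are hypothetical names, the delimiter is assumed to be a tab (the actual value of globalFieldDelimeter is not stated here), truncation is byte-wise, and the `nCols` parameter is ignored because its exact role is not described in this reference.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical delimiter; stands in for globalFieldDelimeter, whose real
// value is not given in this reference.
const char kFieldDelim = '\t';

// Return a copy of `line` in which every delimiter-separated column is cut
// to at most `maxLen` bytes. Empty columns and trailing delimiters are kept.
std::string TruncateLongColumns(std::size_t maxLen, const std::string& line) {
    std::string out;
    std::size_t start = 0;
    bool first = true;
    while (start <= line.size()) {
        // Find the end of the current column (delimiter or end of line).
        std::size_t end = line.find(kFieldDelim, start);
        if (end == std::string::npos) end = line.size();
        if (!first) out += kFieldDelim;
        // Copy at most maxLen bytes of this column.
        out += line.substr(start, std::min(maxLen, end - start));
        first = false;
        start = end + 1;
    }
    return out;
}
```

Unlike the member function, which writes into a caller-supplied `char *Out` buffer, the sketch returns a `std::string` to keep the example free of buffer-size concerns.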

◆ RegisterStringIndices()

bool CStringIndexator::RegisterStringIndices ( const string &  IndicesStr)

read index declarations from a string and register them

References ClearStringIndices(), ErrorMessage(), GetIndexByAlias(), CStringIndexSet::InitIndexSet(), m_Indices, Name, RegisterIndexAlias(), Trim(), and StringTokenizer::val().

Referenced by CConcordance::LoadOptionsFromString().

◆ RegisterIndexAliases()

bool CStringIndexator::RegisterIndexAliases ( const string &  IndexAliasStr)

read index alias declarations from a string and register them; returns true iff all registrations were successful

References ddcLogWarn, Format(), StringTokenizer::next_token(), RegisterIndexAlias(), Trim(), and StringTokenizer::val().

Referenced by CConcordance::LoadOptionsFromString().

◆ RegisterIndexAlias() [1/2]

bool CStringIndexator::RegisterIndexAlias ( const string &  AliasFrom,
const string &  AliasTo 
)

register a single index alias (low-level); returns true iff AliasTo resolves to a known index according to m_IndexMap

References ddcLogWarn, Format(), GetIndexByAlias(), and m_IndexAlias.

Referenced by RegisterIndexAliases(), and RegisterStringIndices().

◆ RegisterIndexAlias() [2/2]

void CStringIndexator::RegisterIndexAlias ( const string &  AliasFrom,
CStringIndexSet *  idx 
)

register a single index label or alias (lowest-level); if idx is NULL, any existing entry for AliasFrom will be deleted

References ddcLogWarn, Format(), GetIndexByAlias(), m_IndexMap, and CStringIndexSet::m_Name.
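
The two RegisterIndexAlias() overloads, together with GetIndexByAlias(), describe a label-to-index symbol table. The sketch below models that behaviour in isolation. `IndexSet` and `AliasTable` are hypothetical stand-ins rather than DDC classes, and the real code (which also emits warnings via ddcLogWarn and records declarations in m_IndexAlias) may well differ in detail.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for CStringIndexSet; only the name matters here.
struct IndexSet {
    std::string m_Name;
};

// Hypothetical model of the LABEL -> INDEX table (compare m_IndexMap).
class AliasTable {
public:
    typedef std::map<std::string, IndexSet*> IndexMap;

    // Lowest-level registration: a null idx deletes any entry for `from`.
    void Register(const std::string& from, IndexSet* idx) {
        if (idx == nullptr) {
            m_IndexMap.erase(from);
        } else {
            m_IndexMap[from] = idx;
        }
    }

    // Low-level registration: succeeds only if `to` already resolves.
    bool Register(const std::string& from, const std::string& to) {
        IndexSet* target = Lookup(to);
        if (target == nullptr) return false;
        Register(from, target);
        return true;
    }

    // Resolve a long name, short name, or alias; nullptr if unknown.
    IndexSet* Lookup(const std::string& label) const {
        IndexMap::const_iterator it = m_IndexMap.find(label);
        return it == m_IndexMap.end() ? nullptr : it->second;
    }

private:
    IndexMap m_IndexMap;
};
```

Under these assumptions, registering an alias to an unknown target fails without touching the table, which matches the documented "returns true iff AliasTo resolves to a known index" contract.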

◆ SetPath()

void CStringIndexator::SetPath ( string  Path)

set the path to the indices

References m_Path.

◆ GetIndicesString()

string CStringIndexator::GetIndicesString ( ) const

return all registered index declarations, in opt-file syntax

References ChunkIndexName, Format(), m_Indices, and Trim().

Referenced by CConcordance::LoadOptionsFromString(), and CConcordance::SaveOptionsToString().

◆ GetIndexAliasString()

string CStringIndexator::GetIndexAliasString ( ) const

return all registered index aliases, in opt-file syntax

References m_IndexAlias.

Referenced by CConcordance::LoadOptionsFromString(), and CConcordance::SaveOptionsToString().

◆ GetSearchPeriodsCount()

size_t CStringIndexator::GetSearchPeriodsCount ( ) const

return the number of corpus periods

◆ GetSearchPeriod()

const CTokenNo& CStringIndexator::GetSearchPeriod ( size_t  i) const
inline

get a corpus period by an index

◆ StartIndexing()

bool CStringIndexator::StartIndexing ( string  Path)

call CreateTempFiles for all registered indices

References m_Indices, and m_Path.

Referenced by CConcIndexator::StartIndexing().

◆ TerminateIndexing()

void CStringIndexator::TerminateIndexing ( )

call DeleteTempFiles for all registered indices

References m_Indices.

Referenced by CConcIndexator::TerminateIndexing().

◆ FinalSaveAllIndices()

bool CStringIndexator::FinalSaveAllIndices ( bool  bAfterLoading)

final saving all indices to disk (converting temp files to persistent)

References GetSearchPeriodsFileName(), m_Indices, m_SearchPeriods, and WriteVector().

Referenced by CConcIndexator::CreateAsUnion(), ConcIndexatorInvoker::FinalizeIndex(), and CConcIndexator::SplitProject().

◆ AddInputLoadIndexToMemoryLoadIndex()

bool CStringIndexator::AddInputLoadIndexToMemoryLoadIndex ( )

unites input index with memory index and clears input load index

References m_Indices.

Referenced by ConcIndexatorInvoker::AddInputLoadIndexToMemoryLoadIndexWrapper(), ConcIndexatorInvoker::FinalizeIndex(), and ConcIndexatorInvoker::SaveLoadIndexToDisk().

◆ AddMemoryLoadIndexToMainLoadIndex()

bool CStringIndexator::AddMemoryLoadIndexToMainLoadIndex ( )

unites memory index with main index and clears memory load index

References m_Indices.

Referenced by ConcIndexatorInvoker::FinalizeIndex(), and ConcIndexatorInvoker::SaveLoadIndexToDisk().
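
AddInputLoadIndexToMemoryLoadIndex() and AddMemoryLoadIndexToMainLoadIndex() both follow a "unite, then clear the source" pattern across three tiers (input, memory, main). The toy model below shows only that pattern. `LoadIndex`, `UniteAndClear`, and `ToyIndexSet` are hypothetical, and representing a load index as a map from token string to a list of positions is an assumption made for illustration; the actual storage format is not described in this reference.

```cpp
#include <map>
#include <string>
#include <vector>

// Assumed toy representation: token string -> list of positions.
typedef std::map<std::string, std::vector<unsigned> > LoadIndex;

// Append all entries of `from` to `into`, then clear `from`.
void UniteAndClear(LoadIndex& from, LoadIndex& into) {
    for (LoadIndex::const_iterator it = from.begin(); it != from.end(); ++it) {
        std::vector<unsigned>& dst = into[it->first];
        dst.insert(dst.end(), it->second.begin(), it->second.end());
    }
    from.clear();
}

// Hypothetical three-tier holder mirroring the input/memory/main naming.
struct ToyIndexSet {
    LoadIndex input;
    LoadIndex memory;
    LoadIndex mainIndex;

    // Input tier is folded into the memory tier and emptied.
    void AddInputToMemory() { UniteAndClear(input, memory); }

    // Memory tier is folded into the main tier and emptied.
    void AddMemoryToMain() { UniteAndClear(memory, mainIndex); }
};
```

In CStringIndexator itself, each of these member functions references m_Indices, so the operation is presumably applied to every registered index in turn.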

◆ SaveMemoryLoadIndex()

bool CStringIndexator::SaveMemoryLoadIndex ( )

store memory load index on the disk

References m_Indices.

Referenced by ConcIndexatorInvoker::FinalizeIndex(), and ConcIndexatorInvoker::SaveLoadIndexToDisk().

◆ GetIndexByName()

CStringIndexSet * CStringIndexator::GetIndexByName ( const string &  Name)

return a pointer to the index by CStringIndexSet::m_Name (linear search)

References m_Indices.

◆ GetIndexByNameOrShortName()

CStringIndexSet * CStringIndexator::GetIndexByNameOrShortName ( const string &  Name)

return a pointer to the index by CStringIndexSet::m_Name or CStringIndexSet::m_ShortName (linear search)

References m_Indices, and Name.
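
For comparison with the map-based GetIndexByAlias(), the linear search described for GetIndexByName() and GetIndexByNameOrShortName() can be sketched as follows. `NamedIndex` and `FindByNameOrShortName` are hypothetical names used only for this example; the member names m_Name and m_ShortName are taken from the descriptions above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for CStringIndexSet with the two name fields.
struct NamedIndex {
    std::string m_Name;
    std::string m_ShortName;
};

// Scan the positional list and return the first index whose long or short
// name equals `name`; nullptr if none matches.
NamedIndex* FindByNameOrShortName(const std::vector<NamedIndex*>& indices,
                                  const std::string& name) {
    for (std::size_t i = 0; i < indices.size(); ++i) {
        if (indices[i]->m_Name == name || indices[i]->m_ShortName == name) {
            return indices[i];
        }
    }
    return nullptr;
}
```

A linear scan is reasonable when only a handful of indices are registered; a map lookup additionally covers user-declared aliases.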

◆ GetIndexByAlias()

CStringIndexSet * CStringIndexator::GetIndexByAlias ( const string &  Alias) const

return a pointer to the index by long-name, short-name, or alias (most abstract, uses m_IndexMap)

◆ GetTokenIndex() [1/2]

CStringIndexSet * CStringIndexator::GetTokenIndex ( )

return the first index that normally contains tokens themselves

References m_Indices.

Referenced by CQFContextSort::Compile().

◆ GetTokenIndex() [2/2]

const CStringIndexSet * CStringIndexator::GetTokenIndex ( ) const

return the first index that normally contains tokens themselves

References m_Indices.

Member Data Documentation

◆ m_SearchPeriods

vector<CTokenNo> CStringIndexator::m_SearchPeriods
protected

search periods of the corpus

◆ m_Path

string CStringIndexator::m_Path

where all indices are stored

◆ m_bMemoryMap

bool CStringIndexator::m_bMemoryMap

whether to directly mmap() index file data (default=false)

◆ m_Indices

vector<CStringIndexSet*> CStringIndexator::m_Indices

the registered indices, by positional index

◆ m_IndexAlias

IndexAliasMap CStringIndexator::m_IndexAlias

declared index aliases (FROM -> TO); not really used at runtime

Referenced by GetIndexAliasString(), and RegisterIndexAlias().

◆ m_IndexMap

IndexMap CStringIndexator::m_IndexMap

all registered indices, keyed by long-name, short-name, or label (LABEL -> INDEX)

Referenced by GetIndexByAlias(), CDDCLeafServer::handle__info(), and RegisterIndexAlias().

◆ m_MaxRegExpExpansionSize

size_t CStringIndexator::m_MaxRegExpExpansionSize

the maximal number of index items which can be included in an expansion set of one regular expression

◆ m_pChunkIndex

CStringIndexSet* CStringIndexator::m_pChunkIndex

a quick reference to a chunk index, if CConcIndexator::m_bIndexChunks is on, otherwise null

The documentation for this class was generated from the following files: