ddc
Class for a single (thread-local) DDC query session; formerly CConcHolder. An instance of CConcSession is created for each thread querying a corpus. It contains all user options for query processing, such as m_ResultLimit (maximal number of hits to output) or m_QueryResultStr (the string representation of the query result).
#include <ConcSession.h>
Public Member Functions | |
Low-level API (formerly private) | |
void | AddFileReference (const long FileNo) |
add a reference to FileNo according to m_ResultFormat | |
void | ShowBibliographyForTextOrHtml (const CHit &Hit, DWORD PageNumber) |
add bibliographical information about Hit to m_QueryResultStr | |
bool | ShowBibliographyForTable (DWORD PageNumber, const CHit &Hit, const vector< COutputToken > &Tokens) |
add bibliographical information about Hit to m_QueryResultStr under TableFormat | |
bool | GenerateOneHitString (DWORD PageNumber, const CHit &Hit, const vector< COutputToken > &Tokens) |
add hit string built from Hit to m_QueryResultStr | |
bool | GenerateOneHitStringJson (DWORD PageNumber, const CHit &Hit, const vector< COutputToken > &Tokens) |
json: add hit string built from Hit to m_QueryResultStr | |
bool | GetContext (int StartBreakNo, int EndBreakNo, const DWORD CurrFileNo, const bool bConvertASCIIToHtmlSymbols, string &Result) const |
add hit strings [StartBreakNo, EndBreakNo) without highlighting to m_QueryResultStr | |
bool | GetContextJson (int StartBreakNo, int EndBreakNo, const DWORD CurrFileNo, string &js) const |
append json hit strings [StartBreakNo, EndBreakNo) without highlighting to js | |
DDCErrorEnum | GetAllHits (const string &Query, size_t Start, size_t Limit) |
DDCErrorEnum | GetAllHits (CQuery *QueryRoot, size_t Start, size_t Limit) |
bool | IsUniversalCountQuery (CQuery *QueryRoot) const |
check if CountQuery is a count(*) query suitable for use with GetUniversalCounts() | |
bool | TryToGetFromCache (const string &Query, DWORD &EndHitNo) |
checks if Query is already in the cache and, if so, returns its hit results from the cache | |
void | SaveToCache (const string &Query, vector< size_t >::const_iterator start, vector< size_t >::const_iterator end) |
stores Query to the cache | |
void | SetHitType () |
sets hit type, initializing m_pBreaks | |
bool | GetFileSnippets (const int HitNo, vector< COutputToken > &Tokens) const |
creates snippets, concatenating contexts of found words | |
bool | SaveOccurrences (const vector< DWORD > &ChunkLengths, int ContextSize, const vector< CTokenNo > &Occurrences, const vector< CHit > &Hits, SaveTriggerType SaveTrigger, DWORD LParam) |
saves the currently found hits using SaveTrigger; this function is only called from GetOccurrences() | |
bool | GetTokensFromStorageByBreak (size_t IndexNo, size_t BreakNo, vector< COutputToken > &Tokens) const |
initializes Tokens with words of hit BreakNo | |
void | InitFileReferences (vector< CHit > &Hits) const |
initializes CHit::m_FileNo for each hit of Hits | |
void | InitSortKeyForHits (const CQuery *pQuery, const CDDCFilterWithBounds &Filter, vector< size_t > &PeriodHitsIndex) |
void | InitSortByRank (const CQuery *pQuery) const |
void | InitSortBySize (const CQuery *pQuery) const |
void | InitSortByRandom (const CQuery *pQuery) const |
void | InitSortByContext (const CQuery *pQuery, const CDDCFilterWithBounds &Filter) const |
void | SortKeyLB (CHitSortKey &key, const CDDCFilterWithBounds &Filter) |
Constructors etc. | |
CConcSession (CConcSessionContext *SessionContext=NULL) | |
~CConcSession () | |
Worker-thread and ConcSessionContext utilities | |
CConcSession * | WorkerClone (size_t WorkerId) |
void | WorkerCloneFree () |
int | LockSessionContext () |
int | UnlockSessionContext () |
void | ClearQueryCache () |
size_t | CacheSize (void) const |
Public Member Functions inherited from ConcIndexatorInvoker | |
ConcIndexatorInvoker () | |
void | SetCurrMessage (string Message) const |
outputs a message to stdout or to GUI | |
bool | BuildIndex (string ProjectFile) |
builds index files for project ProjectFile | |
Public Member Functions inherited from CQueryResult | |
void | ClearQueryResults () |
clears CQueryResult fields | |
template<typename HitIndexLessThanT > | |
void | SortResultsByIndex (HitIndexLessThanT &HitIndexLess) |
template<typename HitLessThanT > | |
void | SortResultsByHit (HitLessThanT &HitLess) |
template<typename HitIndexLessThanT > | |
void | SortResultsByIndexP (HitIndexLessThanT HitIndexLess) |
template<typename HitLessThanT > | |
void | SortResultsByHitP (HitLessThanT HitLess) |
Public Attributes | |
Low-level data (formerly private) | |
CConcSessionContext * | m_pSessionContext |
shared session data (cache, etc.) | |
bool | m_bSessionMaster |
are we acting as a session master? if false, m_pSessionContext will be freed on object destruction; default=true | |
size_t | m_WorkerId |
local worker-thread ID (default=0) | |
CQueryCompiler * | m_pQueryCompiler |
current query compiler, for compilation & evaluation of input queries. | |
DDCRandom * | m_pRandom |
pseudo-random number generator | |
time_t | m_QueryEndTime |
time limit for query processing; unlimited (-1) by default | |
const ddcBreakVector * | m_pBreaks |
a pointer to the current hits collection | |
DDCFormatTypeEnum | m_ResultFormat |
the format of query result | |
High-level data | |
unsigned int | m_RandomSeed |
initial random-state components for m_pRandom | |
CShortOccurCacheMap | m_ShortOccurCaches |
a cache for short occurrence lists, used while iterating through corpus periods and evaluating the same query | |
string | m_QueryResultStr |
the result of the query (its format depends upon m_ResultFormat) | |
string | m_ErrorStr |
most recent error message (if applicable) | |
CConcordance * | m_pConcordance |
m_pConcordance is the main (and the only) pointer to corpus indices and break collections. During the querying this pointer is used as a constant. Class CConcSession's original name "CConcHolder" was chosen because the class "holds" this pointer. | |
size_t | m_ResultOffset |
size_t | m_ResultLimit |
string | m_ResultMinKey |
string | m_RequestPath |
full request path leading to this session (used by CDDCLeafServer) | |
size_t | m_CurrentSearchPeriodNo |
the index of the subcorpus currently being processed | |
string | m_AdditionalHitDelimiter |
a delimiter which should be used between hits in m_QueryResultStr in the distributed model | |
Public Attributes inherited from ConcIndexatorInvoker | |
bool | m_bStoppedByUser |
if true, ConcIndexatorInvoker tries to stop indexing | |
bool | m_bOnlyReindexMorphology |
if true, then BuildIndex should only rebuild MorphPattern index | |
bool | m_bSkipInitialFileChecking |
if true, then there is no initial checking whether the source files exist | |
string | m_CurrMessage |
the last message from indexing process | |
bool | m_bStdout |
whether DDC should send all messages to stdout | |
int | m_CurrentSourceFileNo |
the index of the source file currently being processed | |
int | m_SourceFilesNumber |
the number of files to index | |
string | m_CurrentSourceFileName |
the name of the current source file | |
CMyTimeSpanHolder | m_Profiler |
a slot to gather profiling information for the loading stage | |
Public Attributes inherited from CQueryResult | |
vector< CHit > | m_Hits |
found hits (not more than m_pConcordance->m_MaxCachedHitsCount). | |
vector< CTokenNo > | m_HighlightOccurs |
words that should be highlighted in hits; m_HighlightOccurs is the concatenation of CQueryNode::m_Occurrences for all subcorpora | |
vector< BYTE > | m_HighlightIds |
highlighting match-ids for m_HighlightOccurs | |
size_t | m_AllHitsCount |
the number of all found hits (if m_AllHitsCount < m_pConcordance->m_MaxCachedHitsCount, then m_Hits.size() == m_AllHitsCount) | |
size_t | m_RelevantDocumentCount |
the number of documents in which at least one hit was found | |
bool | m_bSortByString: 1 |
whether to sort by string-value | |
bool | m_bPrune: 1 |
whether primary sort is a prune-sort | |
HitSortOrderEnum | m_SortOrder: 4 |
hit sort order | |
vector< BYTE > | m_DebugInfo |
? | |
High-level API (public) | |
const ddcBreakVector & | GetBreaks () const |
GetBreaks returns the vector of current breaks (by m_pBreaks). | |
DDCFormatTypeEnum | GetResultFormat () const |
return the current format of hit | |
string | GetResultFormatStr () const |
return string representation of m_ResultFormat | |
void | SetResultFormat (string ResultTypeStr) |
set the current format of hit | |
DDCErrorEnum | GetOccurrences (const string &Query, int ContextSize, SaveTriggerType SaveTrigger, DWORD LParam) |
Finds all occurrences of Query (only occurrences, not hits!) if Query is an atomic query (CQueryNode::m_bAtomic). For each occurrence found, it calls SaveTrigger, which should normally save all occurrences to a file. This function is called in the application ConcordPattern. | |
DDCErrorEnum | SimpleQuery (const string &Query, DWORD &EndHitNo, DWORD &HitsCount) |
SimpleQuery finds hits by the given query. EndHitNo is used as an input/output parameter. | |
DDCErrorEnum | GetHits (const string &QueryStr, DWORD &EndHitNo) |
DDCErrorEnum | GetHits (CQuery *QueryRoot, DWORD &EndHitNo) |
DDCErrorEnum | GetHits (CQuery *QueryRoot, DWORD &EndHitNo, const string &QueryStr) |
DDCErrorEnum | GenerateHitStrings (const int StartHitNo, bool UseAdditionalHitDelimiter=true) |
DDCErrorEnum | GenerateCountStrings (const int StartHitNo, bool UseAdditionalHitDelimiter=true) |
size_t | GetOffsetHint (const size_t StartHitNo) const |
get offset-hint appropriate for next page (after GenerateHitStrings()); used by CDDCLeafServer | |
string | GetSortKeyHint (const size_t StartHitNo) const |
get sortkey-hint appropriate for next page (after GenerateHitStrings()); used by CDDCLeafServer | |
string | GetHitIds () const |
string | GetCountIds () const |
HitSortOrderEnum | HitSortOrder () const |
get logical hit sort order; replaces HitsShouldBeSorted() | |
void | SetTimeOut (int TimeOut) |
sets timeout for query processing | |
void | ClearQuery () |
clears the current parsed query (if any) | |
int | GetTextArea () const |
return the text area to be searched | |
void | ClearQueryResults () |
clears CQueryResult fields, also m_ErrorStr and m_ResultOffset | |
bool | HasRankOrderOperator () const |
bool | HasMatchIdOperator () const |
int | GetBreakStarterLength () const |
return the length of break prefix, where DDC should search (#within[sentence, 10]) | |
string | BuildJsonContextString (const vector< COutputToken > &Tokens, bool doHighlight=true) const |
moo: build a json context string by parsing delimited token data | |
string | CanonicalQueryString (const string &Query) |
moo: return a canonical representation of the query string Query (implicitly parses) | |
string | JsonQueryString (const string &Query) |
moo: return a JSON representation of the query string Query (implicitly parses) | |
void | SetRandomSeed (unsigned int seed1=0) const |
moo: set internal random seed to m_RandomSeed+seed1 | |
TxDispatcher * | GetTxDispatcher () const |
moo: get term-expansion dispatcher for this object (wrapper for &m_pConcordance->m_Txd) | |
static DDCFormatTypeEnum | GetResultFormatByString (const string &ResultTypeStr) |
converts a string to a DDCFormatTypeEnum | |
static void | DecorateQueryResults (const string &ResultTypeStr, string &QueryResultString) |
adds header and footer to QueryResultString according to format ResultTypeStr | |
Class for a single (thread-local) DDC query session; formerly CConcHolder. An instance of CConcSession is created for each thread querying a corpus. It contains all user options for query processing, such as m_ResultLimit (maximal number of hits to output) or m_QueryResultStr (the string representation of the query result).
As of v2.1.0, data to be shared between multiple threads (e.g. the cache) lives in CConcSessionContext; see that class for details.
CConcSession::CConcSession | ( | CConcSessionContext * | SessionContext = NULL | ) |
Default constructor
SessionContext | shared context for this session; if unspecified or NULL, a new local CConcSessionContext will be created and freed when the CConcSession object itself is destroyed; otherwise, user is responsible for freeing SessionContext. |
References DDC_ResultText, hsoNone, SIZE_MAX, and TheEndOfTheWorld.
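The ownership rule for SessionContext can be illustrated with a minimal stand-in sketch. The types SketchContext and SketchSession below are hypothetical and not part of DDC; they only mirror the rule stated above: a NULL argument yields a locally owned context that is freed with the session, while a non-NULL argument stays owned by the caller.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; NOT the real CConcSessionContext / CConcSession.
struct SketchContext {
    static int live;               // number of contexts currently alive
    SketchContext()  { ++live; }
    ~SketchContext() { --live; }
};
int SketchContext::live = 0;

class SketchSession {
public:
    // NULL (the default) means: create a private context and own it.
    explicit SketchSession(SketchContext *ctx = NULL)
        : m_ctx(ctx ? ctx : new SketchContext()), m_ownsCtx(ctx == NULL) {}

    // Free the context only if this session created it.
    ~SketchSession() { if (m_ownsCtx) delete m_ctx; }

    bool OwnsContext() const { return m_ownsCtx; }

private:
    SketchSession(const SketchSession &);            // non-copyable
    SketchSession &operator=(const SketchSession &);
    SketchContext *m_ctx;
    bool m_ownsCtx;
};
```

With this sketch, several sessions constructed from the same caller-owned context share it, and none of them deletes it on destruction.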
CConcSession::~CConcSession | ( | ) |
Default destructor
void CConcSession::AddFileReference | ( | const long | FileNo | ) |
add a reference to FileNo according to m_ResultFormat
References DDC_ResultHTML, DDC_ResultTable, and DDC_ResultText.
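void CConcSession::ShowBibliographyForTextOrHtml | ( | const CHit & | Hit, |
DWORD | PageNumber | ||
) |
add bibliographical information about Hit to m_QueryResultStr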
References DDC_ResultHTML, DDC_ResultTable, DDC_ResultText, TinyXPath::dummy, Format(), CBibliography::m_DateStr, CHit::m_DebugRankNo, CHit::m_FileNo, CBibliography::m_OrigBibl, CBibliography::m_ScanBibl, CStringIndexSet::m_ShortName, CHit::m_Value, and UnknownPageNumber.
bool CConcSession::ShowBibliographyForTable | ( | DWORD | PageNumber, |
const CHit & | Hit, | ||
const vector< COutputToken > & | Tokens | ||
) |
add bibliographical information about Hit to m_QueryResultStr under TableFormat
References ConvertASCIIToHtmlSymbols(), Format(), globalTableItemsDelim, CHitSortKey::i, CBibliography::m_DateStr, CHit::m_DebugRankNo, CHit::m_FileNo, CBibliography::m_OrigBibl, CBibliography::m_ScanBibl, CStringIndexSet::m_ShortName, CHit::m_SortKey, CHit::m_Value, and UnknownPageNumber.
bool CConcSession::GenerateOneHitString | ( | DWORD | PageNumber, |
const CHit & | Hit, | ||
const vector< COutputToken > & | Tokens | ||
) |
add hit string built from Hit to m_QueryResultStr
References BuildHtmlHitStrWithHighlighting(), DDC_ResultDocIds, DDC_ResultHTML, DDC_ResultJson, DDC_ResultTable, DDC_ResultText, ErrorMessage(), Format(), CHitSortKey::i, CHit::m_BreakNo, CHit::m_DebugRankNo, CHit::m_FileNo, CHit::m_SortKey, and CHit::m_Value.
bool CConcSession::GenerateOneHitStringJson | ( | DWORD | PageNumber, |
const CHit & | Hit, | ||
const vector< COutputToken > & | Tokens | ||
) |
json: add hit string built from Hit to m_QueryResultStr
References ErrorMessage(), Format(), CHitSortKey::i, jsonStr(), CHit::m_BreakNo, CBibliography::m_DateStr, CHit::m_DebugRankNo, CHit::m_FileNo, CHit::m_HighlightOccurrenceEnd, CBibliography::m_OrigBibl, CBibliography::m_ScanBibl, CHit::m_SortKey, CHit::m_Value, CHitSortKey::s, and UnknownPageNumber.
bool CConcSession::GetContext | ( | int | StartBreakNo, |
int | EndBreakNo, | ||
const DWORD | CurrFileNo, | ||
const bool | bConvertASCIIToHtmlSymbols, | ||
string & | Result | ||
) | const |
add hit strings [StartBreakNo, EndBreakNo) without highlighting to m_QueryResultStr
References BuildHtmlHitStrWithHighlighting(), DDC_ResultTable, ErrorMessage(), Format(), PredefinedFileBreakName, and ddcVecFile< T >::size().
bool CConcSession::GetContextJson | ( | int | StartBreakNo, |
int | EndBreakNo, | ||
const DWORD | CurrFileNo, | ||
string & | js | ||
) | const |
append json hit strings [StartBreakNo, EndBreakNo) without highlighting to js
defined(DDC_JSON_DEEP_CONTEXT)
References ErrorMessage(), errOther, Format(), PredefinedFileBreakName, and ddcVecFile< T >::size().
DDCErrorEnum CConcSession::GetAllHits | ( | const string & | Query, |
size_t | Start, | ||
size_t | Limit | ||
) |
evaluate query on the corpus and initialize slots from CQueryResult
DDCErrorEnum CConcSession::GetAllHits | ( | CQuery * | QueryRoot, |
size_t | Start, | ||
size_t | Limit | ||
) |
evaluate query on the corpus and initialize slots from CQueryResult, given a pre-parsed query
References CQCount::CanCountByFile(), ClearContainer(), concord_daemon_log(), CQCount::ConvertCountsToHits(), CQCount::CountLocal(), CQCount::CountUniversal(), ddcLogDebug, dumpHitIndex(), errNone, errTimeoutElapsed, Format(), HIT_TRIM_DEBUG, HitSortEnumNames, hsoNone, CHitSortKey::i, IsLessByHitSortKey::IsLessByHitSortKey(), CDDCFilterWithBounds::IsPruneFilter(), jsonStr(), CDDCFilterWithBounds::m_AttrName, CQueryOptions::m_bSeparateHits, CDDCFilterWithBounds::m_bSet, CQueryNode::m_bUseMatchIds, CQCount::m_CountSample, CQCount::m_dtr, CQueryOptions::m_Filters, CDDCFilterWithBounds::m_FilterType, CHit::m_HighlightOccurrenceEnd, CQueryNode::m_Hits, CDDCFilterWithBounds::m_KeyHi, CDDCFilterWithBounds::m_KeyLo, CQuery::m_Node, CQueryNode::m_OccurrenceMatchIds, CQueryNode::m_Occurrences, CQuery::m_Options, CQCount::m_sample, CDDCFilterWithBounds::m_SatisfiedValues, my_lower_bound(), my_upper_bound(), CHitSortKey::s, CDDCFilterWithBounds::SortOrder(), and TheEndOfTheWorld.
bool CConcSession::IsUniversalCountQuery | ( | CQuery * | QueryRoot | ) | const |
check if CountQuery is a count(*) query suitable for use with GetUniversalCounts()
bool CConcSession::TryToGetFromCache | ( | const string & | Query, |
DWORD & | EndHitNo | ||
) |
checks if Query is already in the cache and, if so, returns its hit results from the cache
void CConcSession::SaveToCache | ( | const string & | Query, |
vector< size_t >::const_iterator | start, | ||
vector< size_t >::const_iterator | end | ||
) |
stores Query to the cache
void CConcSession::SetHitType | ( | ) |
sets hit type, initializing m_pBreaks
References PredefinedTextAreaBreakName, and UnknownTextAreaNo.
bool CConcSession::GetFileSnippets | ( | const int | HitNo, |
vector< COutputToken > & | Tokens | ||
) | const |
creates snippets, concatenating contexts of found words
References errReadSourceFile.
bool CConcSession::SaveOccurrences | ( | const vector< DWORD > & | ChunkLengths, |
int | ContextSize, | ||
const vector< CTokenNo > & | Occurrences, | ||
const vector< CHit > & | Hits, | ||
SaveTriggerType | SaveTrigger, | ||
DWORD | LParam | ||
) |
saves the currently found hits using SaveTrigger; this function is only called from GetOccurrences()
References concord_daemon_log(), DDC_STATIC_BUFLEN, and Format().
bool CConcSession::GetTokensFromStorageByBreak | ( | size_t | IndexNo, |
size_t | BreakNo, | ||
vector< COutputToken > & | Tokens | ||
) | const |
initializes Tokens with words of hit BreakNo
References errReadSourceFile, CStringIndexSet::GetTokensFromStorage(), and CStringIndexSet::m_Name.
void CConcSession::InitFileReferences | ( | vector< CHit > & | Hits | ) | const |
initializes CHit::m_FileNo for each hit of Hits
References ddcVecFile< T >::begin(), DDC_FILEREF_BINSEARCH_COEF, ddcVecFile< T >::end(), log2u32(), CHit::m_BreakNo, CHit::m_FileNo, CHit::m_SortKey, ddcVecFile< T >::size(), and VectorStride().
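One plausible way to map a hit's break number to a file number is a binary search over a sorted list of file boundaries. The sketch below assumes a hypothetical vector fileEnds, where fileEnds[i] is the first break number past file i. The real layout of the file-break index and the role of DDC_FILEREF_BINSEARCH_COEF are not shown in this documentation, so this is an illustration only and not the DDC implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of DDC.
// fileEnds must be sorted ascending; fileEnds[i] is assumed to be the
// exclusive end break number of file i.
// Returns the index of the file containing breakNo, or fileEnds.size()
// if breakNo lies past the last file.
std::size_t FileNoForBreak(const std::vector<std::size_t> &fileEnds,
                           std::size_t breakNo)
{
    // Find the first file whose exclusive end is greater than breakNo.
    std::vector<std::size_t>::const_iterator it =
        std::upper_bound(fileEnds.begin(), fileEnds.end(), breakNo);
    return static_cast<std::size_t>(it - fileEnds.begin());
}
```

When hits are already ordered by break number, a linear merge over both sequences may be cheaper than one binary search per hit; which strategy DDC picks is not stated here.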
void CConcSession::InitSortKeyForHits | ( | const CQuery * | pQuery, |
const CDDCFilterWithBounds & | Filter, | ||
vector< size_t > & | PeriodHitsIndex | ||
) |
initializes CHit::m_SortKey for each hit in m_pQueryCompiler->m_pQueryTree->m_Hits. Also sets m_bSortDescending=Filter.isDescending(), m_bSortByString, m_bPrune appropriately
References cfbiString, cfbiStringConstant, GreaterByDate, GreaterByFreeBiblField, GreaterByLeftContext, GreaterByMiddleContext, GreaterByPruneKey, GreaterByRank, GreaterByRightContext, GreaterBySize, HitSortsCount, LessByDate, LessByFreeBiblField, LessByLeftContext, LessByMiddleContext, LessByPruneKey, LessByRank, LessByRightContext, LessBySize, CDDCFilterWithBounds::m_AttrName, CDDCFilterWithBounds::m_FilterType, CQueryNode::m_Hits, CQuery::m_Node, CDDCFilterWithBounds::m_Parent, NoSort, RandomSort, and CDDCFilterWithBounds::SortOrder().
void CConcSession::InitSortByRank | ( | const CQuery * | pQuery | ) | const |
References Format(), GetHitRankLen(), CQueryNode::GetNodeFrequencyByNodeIndex(), CHitSortKey::i, log(), CQueryOptions::m_bDebugRank, CHit::m_BreakNo, CHit::m_DebugRankNo, CHit::m_HighlightOccurrenceEnd, CQueryNode::m_Hits, CHitRank::m_IDFs, CQuery::m_Node, CQueryNode::m_OccurrenceNodeIndices, CQueryNode::m_Occurrences, CQuery::m_Options, CHitRank::m_PassageEnd, CHitRank::m_PassageStart, CHitRank::m_QueryNodeFreqs, CHit::m_SortKey, CHit::m_Value, and ddcVecFile< T >::size().
void CConcSession::InitSortBySize | ( | const CQuery * | pQuery | ) | const |
References CHit::m_BreakNo, CQueryNode::m_Hits, CQuery::m_Node, and CHit::m_SortKey.
void CConcSession::InitSortByRandom | ( | const CQuery * | pQuery | ) | const |
References CQueryNode::m_Hits, CQuery::m_Node, and CHit::m_SortKey.
void CConcSession::InitSortByContext | ( | const CQuery * | pQuery, |
const CDDCFilterWithBounds & | Filter | ||
) | const |
References errOther, Format(), CQueryNode::GetFirstOccurrenceInHit(), CStringIndexSet::GetIndexItemStr(), CQueryNode::GetLastOccurrenceInHit(), CQueryNode::GetMiddleOccurrenceInHit(), CStringIndexSet::GetTokenIndexId(), GreaterByLeftContext, GreaterByMiddleContext, GreaterByRightContext, CHitSortKey::i, LessByLeftContext, LessByMiddleContext, LessByRightContext, CDDCFilterWithBounds::m_AttrName, CHit::m_BreakNo, CQueryNode::m_bUseMatchIds, CDDCFilterWithBounds::m_ContextMatchId, CDDCFilterWithBounds::m_ContextOffset, CDDCFilterWithBounds::m_FilterType, CHit::m_HighlightOccurrenceEnd, CQueryNode::m_Hits, CIndexSetForQueryingStage::m_Index, CQuery::m_Node, CQueryNode::m_OccurrenceMatchIds, CQueryNode::m_Occurrences, CHit::m_SortKey, and CHitSortKey::s.
void CConcSession::SortKeyLB | ( | CHitSortKey & | key, |
const CDDCFilterWithBounds & | Filter | ||
) |
initialize a CHitSortKey integer lower-bound for its string key with respect to Filter
Key | (input/output) |
Filter | (input) filter with respect to which key is to be initialized |
References ddcLogWarn, errOther, Format(), CFreeBiblIndexInterface::GetIntegerLowerBound(), CStringIndexSet::GetTypeIndexIdLowerBound(), GreaterByFreeBiblField, GreaterByLeftContext, GreaterByMiddleContext, GreaterByPruneKey, GreaterByRightContext, hex2int(), HitSortEnumStrings, CHitSortKey::i, LessByFreeBiblField, LessByLeftContext, LessByMiddleContext, LessByPruneKey, LessByRightContext, CDDCFilterWithBounds::m_AttrName, CDDCFilterWithBounds::m_BiblIndex, CDDCFilterWithBounds::m_FilterType, NoSort, and CHitSortKey::s.
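The idea of an integer lower bound for a string key can be sketched with a sorted table of strings: the bound is the index of the first entry that is not less than the key. The table sortedTypes and the function IntegerLowerBound below are hypothetical. The documentation only states that SortKeyLB() references CStringIndexSet::GetTypeIndexIdLowerBound() and CFreeBiblIndexInterface::GetIntegerLowerBound(), whose implementations are not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of DDC.
// sortedTypes must be sorted ascending (plain byte-wise string order).
// Returns the index of the first entry >= key, or sortedTypes.size()
// if every entry is less than key.
std::size_t IntegerLowerBound(const std::vector<std::string> &sortedTypes,
                              const std::string &key)
{
    return static_cast<std::size_t>(
        std::lower_bound(sortedTypes.begin(), sortedTypes.end(), key)
        - sortedTypes.begin());
}
```

Comparing such integer bounds is cheaper than comparing strings, which is presumably why a sort key carries both an integer and a string component (CHitSortKey::i and CHitSortKey::s).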
CConcSession * CConcSession::WorkerClone | ( | size_t | WorkerId | ) |
create a minimal copy of this object for use by a worker thread
!!!!!!! hic sunt dracones !!!!!!!
References concord_daemon_log(), Format(), m_AdditionalHitDelimiter, m_pBreaks, m_pConcordance, m_pRandom, m_pSessionContext, m_RandomSeed, m_WorkerId, and DDCRandom::set_seed().
Referenced by CDDCLeafServer::WorkerCloneInit().
void CConcSession::WorkerCloneFree | ( | ) |
perform local cleanup of a worker clone prior to deletion
Referenced by CDDCLeafServer::WorkerCloneFree().
int CConcSession::LockSessionContext | ( | ) |
lock session context
int CConcSession::UnlockSessionContext | ( | ) |
unlock session context
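Paired lock/unlock calls are easy to leave unbalanced when a function returns early, so a small scope guard can help. The sketch below uses a hypothetical SketchLockable stand-in with the same two method names as CConcSession. It assumes that a return value of 0 means success, which this documentation does not confirm.

```cpp
#include <cassert>

// Hypothetical stand-in exposing the same two method names as CConcSession.
struct SketchLockable {
    int depth;                     // how many times the context is locked
    SketchLockable() : depth(0) {}
    int LockSessionContext()   { ++depth; return 0; }  // 0 = success (assumed)
    int UnlockSessionContext() { --depth; return 0; }
};

// Scope guard: locks on construction, unlocks on destruction.
template <typename SessionT>
class ContextLockGuard {
public:
    explicit ContextLockGuard(SessionT &s)
        : m_session(s), m_locked(s.LockSessionContext() == 0) {}
    // Unlock only if the lock was actually acquired.
    ~ContextLockGuard() { if (m_locked) m_session.UnlockSessionContext(); }
    bool Locked() const { return m_locked; }
private:
    ContextLockGuard(const ContextLockGuard &);            // non-copyable
    ContextLockGuard &operator=(const ContextLockGuard &);
    SessionT &m_session;
    bool m_locked;
};
```

Because the guard is a template, the same pattern would apply to any type that offers these two methods.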
void CConcSession::ClearQueryCache | ( | ) |
clears the shared query cache
Referenced by CDDCLeafServer::handle__clear_cache().
size_t CConcSession::CacheSize | ( | void | ) | const |
return shared query cache size
Referenced by CDDCLeafServer::handle__status().
const ddcBreakVector & CConcSession::GetBreaks | ( | ) | const |
GetBreaks returns the vector of current breaks (by m_pBreaks).
Referenced by CQueryNode::ConvertOccurrencesToHits(), CQueryNode::ConvertOccurrencesToHitsForPatterns(), and CQCountKeyExprToken::Evaluate().
DDCFormatTypeEnum CConcSession::GetResultFormat | ( | ) | const |
inline |
return the current format of hit
References BuildJsonContextString(), CanonicalQueryString(), ClearQuery(), ClearQueryResults(), DecorateQueryResults(), GenerateCountStrings(), GenerateHitStrings(), GetBreakStarterLength(), GetCountIds(), GetHitIds(), GetHits(), GetOccurrences(), GetOffsetHint(), GetResultFormatByString(), GetResultFormatStr(), GetSortKeyHint(), GetTextArea(), HasMatchIdOperator(), HasRankOrderOperator(), HitSortOrder(), JsonQueryString(), m_ResultFormat, SetRandomSeed(), SetResultFormat(), SetTimeOut(), and SimpleQuery().
string CConcSession::GetResultFormatStr | ( | ) | const |
return string representation of m_ResultFormat
References DDC_ResultDocIds, DDC_ResultHTML, DDC_ResultJson, and DDC_ResultTable.
Referenced by GetResultFormat(), CDDCLeafServer::handle__get_hit_strings(), and CDDCLeafServer::handle__run_query().
void CConcSession::SetResultFormat | ( | string | ResultTypeStr | ) |
set the current format of hit
Referenced by GetResultFormat(), CDDCLeafServer::handle__get_hit_strings(), and CDDCLeafServer::handle__run_query().
DDCErrorEnum CConcSession::GetOccurrences | ( | const string & | Query, |
int | ContextSize, | ||
SaveTriggerType | SaveTrigger, | ||
DWORD | LParam | ||
) |
Finds all occurrences of Query (only occurrences, not hits!) if Query is an atomic query (CQueryNode::m_bAtomic). For each occurrence found, it calls SaveTrigger, which should normally save all occurrences to a file. This function is called in the application ConcordPattern.
References CQueryNode::ConvertOccurrencesToHitsForPatterns(), ddcLogError, errNone, ErrorMessage(), errParseError, CQueryNode::EvaluateWithoutHits(), CQueryNode::m_ChunkLengths, CExpc::m_ErrorCode, CQueryNode::m_Hits, CQueryNode::m_Occurrences, and CExpc::m_strCause.
Referenced by GetResultFormat().
DDCErrorEnum CConcSession::SimpleQuery | ( | const string & | Query, |
DWORD & | EndHitNo, | ||
DWORD & | HitsCount | ||
) |
SimpleQuery finds hits by the given query. EndHitNo is used as an input/output parameter.
Let H0...Hn be all hits which match Query. EndHitNo must satisfy 0 <= EndHitNo <= n. Let S be min(n, EndHitNo+m_ResultLimit-1). The function saves hits H[EndHitNo], H[EndHitNo+1], ..., H[S] to the result strings and then sets EndHitNo to S+1. The function returns errNone if there is no parse error in the given query.
(moo) put simply:
References DDC_ResultHTML, ddcLogError, errNone, ErrorMessage(), Format(), and CExpc::m_strCause.
Referenced by GetResultFormat().
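The paging arithmetic of SimpleQuery() can be sketched as a small pure function, reading S as the absolute index of the last hit on the page. PageWindow and NextPage are hypothetical names, and the sketch assumes at least one hit and a result limit of at least 1. It only mirrors the description and does not call into DDC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical result type, not part of DDC.
struct PageWindow {
    std::size_t first;         // index of the first hit on this page
    std::size_t last;          // index S of the last hit on this page (inclusive)
    std::size_t nextEndHitNo;  // value of EndHitNo after the call (S + 1)
};

// n is the index of the last matching hit (hits are H0..Hn).
PageWindow NextPage(std::size_t n, std::size_t endHitNo, std::size_t resultLimit)
{
    assert(endHitNo <= n && resultLimit >= 1);
    PageWindow w;
    w.first = endHitNo;
    w.last = std::min(n, endHitNo + resultLimit - 1);
    w.nextEndHitNo = w.last + 1;
    return w;
}
```

A caller can keep requesting pages, feeding nextEndHitNo back in as endHitNo, until nextEndHitNo exceeds n.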
DDCErrorEnum CConcSession::GetHits | ( | const string & | QueryStr, |
DWORD & | EndHitNo | ||
) |
GetHits does the same as SimpleQuery does, but without GenerateHitStrings()
References errParseError.
Referenced by CQKeys::Compile(), GetResultFormat(), CDDCLeafServer::handle__get_first_hits(), and CDDCLeafServer::handle__run_query().
DDCErrorEnum CConcSession::GetHits | ( | CQuery * | QueryRoot, |
DWORD & | EndHitNo | ||
) |
GetHits() variant for pre-parsed queries, cache key is generated as QueryRoot->toString() + QueryRoot->optionsToString()
References CQuery::optionsToString(), and CQuery::toString().
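The cache-key construction can be sketched with stand-in types. SketchQuery, SketchCache, CacheKey and TryGet are hypothetical; only the rule "key = toString() + optionsToString()" comes from the description above, and the strings used in the usage note are made-up placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed query; NOT the real CQuery.
struct SketchQuery {
    std::string body;     // stands in for the query expression
    std::string options;  // stands in for the query options
    std::string toString() const { return body; }
    std::string optionsToString() const { return options; }
};

// Key rule taken from the description: expression string + options string.
std::string CacheKey(const SketchQuery &q)
{
    return q.toString() + q.optionsToString();
}

typedef std::map<std::string, std::vector<std::size_t> > SketchCache;

// Returns true and fills hits if the query's key is cached.
bool TryGet(const SketchCache &cache, const SketchQuery &q,
            std::vector<std::size_t> &hits)
{
    SketchCache::const_iterator it = cache.find(CacheKey(q));
    if (it == cache.end()) return false;
    hits = it->second;
    return true;
}
```

Two queries with the same body but different options, for example "word" with " #opt_a" versus "word" with " #opt_b", produce different keys and therefore do not share a cache entry.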
DDCErrorEnum CConcSession::GetHits | ( | CQuery * | QueryRoot, |
DWORD & | EndHitNo, | ||
const string & | QueryStr | ||
) |
GetHits() guts: pre-parsed QueryRoot and given string cache-key QueryStr
References CQueryResultIndex::Apply(), ddcLogDebug, ddcLogError, dumpHits(), errNone, ErrorMessage(), errParseError, errRuntime, Format(), HIT_SORT_DEBUG, HIT_TRIM_DEBUG, IsLessByHitSortKey::IsLessByHitSortKey(), CQueryCompiler::m_bSatisfiable, CQuery::m_Compiler, and CExpc::m_strCause.
DDCErrorEnum CConcSession::GenerateHitStrings | ( | const int | StartHitNo, |
bool | UseAdditionalHitDelimiter = true | ||
) |
initializes m_QueryResultStr using current m_Hits and m_HighlightOccurs (for context-queries)
StartHitNo | logical offset of 1st hit |
References DDC_ResultHTML, DDC_ResultJson, errNone, errReadSourceFile, errRuntime, IsCountSort(), and PredefinedFileBreakName.
Referenced by GetResultFormat(), CDDCLeafServer::handle__get_hit_strings(), and CDDCLeafServer::handle__run_query().
DDCErrorEnum CConcSession::GenerateCountStrings | ( | const int | StartHitNo, |
bool | UseAdditionalHitDelimiter = true | ||
) |
initializes m_QueryResultStr using current m_Hits (for count-queries)
StartHitNo | logical offset of 1st count-hit |
References DDC_ResultJson, ddcLogDebug, dumpHits(), errNone, Format(), GenerateCountString(), and HIT_SORT_DEBUG.
Referenced by GetResultFormat().
size_t CConcSession::GetOffsetHint | ( | const size_t | StartHitNo | ) | const |
get offset-hint appropriate for next page (after GenerateHitStrings()); used by CDDCLeafServer
Referenced by GetResultFormat(), CDDCLeafServer::handle__get_hit_strings(), and CDDCLeafServer::handle__run_query().
string CConcSession::GetSortKeyHint | ( | const size_t | StartHitNo | ) | const |
get sortkey-hint appropriate for next page (after GenerateHitStrings()); used by CDDCLeafServer
References int2hex(), and IsCountSort().
Referenced by GetResultFormat(), and CDDCLeafServer::handle__run_query().
string CConcSession::GetHitIds | ( | ) | const |
stores sort-keys (CHit::m_SortKey) of current m_Hits to a string
References DDC_SORTKEY_MAXLEN, Format(), int2hex(), IsCountSort(), CHit::m_SortKey, and CHitSortKey::s.
Referenced by GetResultFormat(), and CDDCLeafServer::handle__get_first_hits().
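Serializing integer sort keys into one string can be sketched as hex encoding plus a delimiter. JoinKeysAsHex is a hypothetical helper; the actual output format of GetHitIds(), the behavior of int2hex() and the handling of string keys (CHitSortKey::s) are not documented here, so the delimiter and the unpadded lowercase hex are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of DDC.
// Writes each key as unpadded lowercase hex, separated by delim.
std::string JoinKeysAsHex(const std::vector<unsigned long> &keys, char delim)
{
    std::ostringstream out;
    out << std::hex;                       // sticky: applies to all later integers
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (i > 0) out << delim;           // delimiter between keys, not after the last
        out << keys[i];
    }
    return out.str();
}
```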
string CConcSession::GetCountIds | ( | ) | const |
GetHitIds() variant for count-queries; returns count-IDs of all hits in m_Hits[]
References DDC_SORTKEY_MAXLEN, and Format().
Referenced by GetResultFormat().
HitSortOrderEnum CConcSession::HitSortOrder | ( | ) | const |
get logical hit sort order; replaces HitsShouldBeSorted()
Referenced by GetResultFormat(), CDDCLeafServer::handle__get_first_hits(), and CDDCLeafServer::handle__run_query().
void CConcSession::SetTimeOut | ( | int | TimeOut | ) |
sets timeout for query processing
References TheEndOfTheWorld.
Referenced by GetResultFormat(), CDDCLeafServer::handle__expand_terms(), CDDCLeafServer::handle__get_first_hits(), and CDDCLeafServer::handle__run_query().
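A timeout is commonly turned into an absolute deadline that an evaluation loop can compare against the clock. The sketch below assumes that a non-positive timeout means "no limit" and uses a hypothetical sentinel NoDeadline(). The real semantics of SetTimeOut(), m_QueryEndTime and TheEndOfTheWorld are not spelled out in this documentation.

```cpp
#include <cassert>
#include <ctime>
#include <limits>

// Hypothetical sentinel meaning "never time out".
inline std::time_t NoDeadline()
{
    return std::numeric_limits<std::time_t>::max();
}

// Convert a relative timeout (in seconds) into an absolute deadline.
std::time_t DeadlineFor(std::time_t now, int timeoutSeconds)
{
    if (timeoutSeconds <= 0) return NoDeadline();  // assumed: <= 0 means unlimited
    return now + timeoutSeconds;
}

// True once the clock has reached a finite deadline.
bool TimedOut(std::time_t now, std::time_t deadline)
{
    return deadline != NoDeadline() && now >= deadline;
}
```

Storing the deadline rather than the duration means the check inside a long-running loop is a single comparison.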
void CConcSession::ClearQuery | ( | ) |
clears the current parsed query (if any)
Referenced by CDDCLeafServer::Close(), and GetResultFormat().
int CConcSession::GetTextArea | ( | ) | const |
return the text area to be searched
Referenced by CQueryNode::ConvertOccurrencesToHits(), CQueryNode::ConvertOccurrencesToHitsForPatterns(), and GetResultFormat().
void CConcSession::ClearQueryResults | ( | ) |
clears CQueryResult fields, also m_ErrorStr and m_ResultOffset
References CQueryResult::ClearQueryResults(), and ClearString().
Referenced by CConcSessionContext::CacheGet(), CDDCLeafServer::Close(), CQKeys::Compile(), and GetResultFormat().
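DDCFormatTypeEnum CConcSession::GetResultFormatByString | ( | const string & | ResultTypeStr | ) |
static |
converts a string to a DDCFormatTypeEnum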
References DDC_ResultDocIds, DDC_ResultHTML, DDC_ResultJson, DDC_ResultTable, DDC_ResultText, and EngMakeUpper().
Referenced by GetResultFormat(), CDDCBranchServer::handle__get_hit_strings(), CDDCLeafServer::handle__run_query(), CDDCBranchServer::RunDistributed(), and CRunQueryData::toString().
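A case-insensitive string-to-enum lookup of this kind can be sketched as follows. The enum SketchFormat, the spellings "HTML", "TABLE", "JSON" and "DOCIDS", and the fallback to plain text are hypothetical placeholders. The real DDCFormatTypeEnum values and accepted strings should be taken from the DDC headers.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical enum; NOT the real DDCFormatTypeEnum.
enum SketchFormat { FmtText, FmtHTML, FmtTable, FmtJson, FmtDocIds };

// ASCII upper-casing, in the spirit of the referenced EngMakeUpper().
std::string MakeUpper(std::string s)
{
    for (std::size_t i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    return s;
}

SketchFormat FormatByString(const std::string &name)
{
    const std::string u = MakeUpper(name);
    if (u == "HTML")   return FmtHTML;
    if (u == "TABLE")  return FmtTable;
    if (u == "JSON")   return FmtJson;
    if (u == "DOCIDS") return FmtDocIds;
    return FmtText;   // assumed fallback for unknown names
}
```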
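void CConcSession::DecorateQueryResults | ( | const string & | ResultTypeStr, |
string & | QueryResultString | ||
) |
static |
adds header and footer to QueryResultString according to format ResultTypeStr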
References DDC_ResultHTML.
Referenced by GetResultFormat(), and CDDCBranchServer::RunDistributed().
bool CConcSession::HasRankOrderOperator | ( | ) | const |
return true if the input query contains #less_by_rank or #greater_by_rank. moo: sick, bad, ugly, and wrong!
References GreaterByRank, LessByRank, and CDDCFilterWithBounds::m_FilterType.
Referenced by GetResultFormat(), and CQueryNode::SetHolder().
bool CConcSession::HasMatchIdOperator | ( | ) | const |
return true iff the input query contains a match-id operator (=ID). moo: sicker, badder, uglier, and wronger!
Referenced by GetResultFormat(), and CQueryNode::SetHolder().
int CConcSession::GetBreakStarterLength | ( | ) | const |
return the length of break prefix, where DDC should search (#within[sentence, 10])
Referenced by GetResultFormat().
string CConcSession::BuildJsonContextString | ( | const vector< COutputToken > & | Tokens, |
bool | doHighlight = true | ||
) | const |
moo: build a json context string by parsing delimited token data
References jsonStr(), and CStringIndexSet::m_ShortName.
Referenced by GetResultFormat().
string CConcSession::CanonicalQueryString | ( | const string & | Query | ) |
moo: return a canonical representation of the query string Query
(implicitly parses)
Referenced by GetResultFormat().
string CConcSession::JsonQueryString | ( | const string & | Query | ) |
moo: return a JSON representation of the query string Query
(implicitly parses)
Referenced by GetResultFormat().
void CConcSession::SetRandomSeed | ( | unsigned int | seed1 = 0 | ) | const |
moo: set internal random seed to m_RandomSeed+seed1
Referenced by CQFRandomSort::Compile(), and GetResultFormat().
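The seeding rule "m_RandomSeed+seed1" can be illustrated with a small stand-alone sketch. It assumes a std::mt19937 generator in place of DDCRandom, whose interface is not shown here; the struct SketchSession and its members baseSeed, rng and Next() are hypothetical names introduced only for this example.

```cpp
#include <random>

// Hypothetical miniature of a session that owns a reseedable generator.
struct SketchSession {
    unsigned int baseSeed = 0;   // plays the role of m_RandomSeed
    mutable std::mt19937 rng;    // plays the role of *m_pRandom (assumption)

    // Reseed with baseSeed + seed1; const because only the mutable
    // generator state changes, as with the documented const method.
    void SetRandomSeed(unsigned int seed1 = 0) const {
        rng.seed(baseSeed + seed1);
    }

    // Draw the next pseudo-random value.
    unsigned int Next() const {
        return static_cast<unsigned int>(rng());
    }
};
```

Under these assumptions, two sessions with the same base seed and the same seed1 produce identical sequences, which is what makes a random sort reproducible across repeated queries.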
|
inline |
moo: get term-expansion dispatcher for this object (wrapper for &m_pConcordance->m_Txd)
References CConcordance::m_Txd.
Referenced by CQTokInfl::GetChain(), CDDCLeafServer::handle__expand_terms(), and CDDCLeafServer::handle__info().
CConcSessionContext* CConcSession::m_pSessionContext |
shared session data (cache, etc.)
Referenced by WorkerClone().
bool CConcSession::m_bSessionMaster |
are we acting as a session master? if false, m_pSessionContext will be freed on object destruction; default=true
size_t CConcSession::m_WorkerId |
local worker-thread ID (default=0)
Referenced by WorkerClone().
CQueryCompiler* CConcSession::m_pQueryCompiler |
current query compiler, for compilation & evaluation of input queries.
DDCRandom* CConcSession::m_pRandom |
pseudo-random number generator
Referenced by WorkerClone().
time_t CConcSession::m_QueryEndTime |
how long a query may be processed; unlimited (-1) by default
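One way such a limit could be checked is sketched below. The sketch assumes that the field holds an absolute end time and that -1 means "no limit"; neither the helper SketchQueryTimedOut() nor the exact comparison is taken from the DDC sources.

```cpp
#include <ctime>

// Returns true if the query has run past its deadline.
// Assumption: endTime is an absolute time_t, and (time_t)-1 disables the limit.
inline bool SketchQueryTimedOut(std::time_t endTime, std::time_t now) {
    if (endTime == static_cast<std::time_t>(-1)) {
        return false;  // unlimited: never time out
    }
    return now > endTime;
}
```

Passing the current time as a parameter, for instance std::time(nullptr) at the call site, keeps the check easy to test.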
const ddcBreakVector* CConcSession::m_pBreaks |
a pointer to the current hits collection
Referenced by WorkerClone().
DDCFormatTypeEnum CConcSession::m_ResultFormat |
the format of the query result
Referenced by GetResultFormat().
unsigned int CConcSession::m_RandomSeed |
initial random-state components for m_pRandom
Referenced by WorkerClone().
|
mutable |
a cache for short occurrence lists, used while iterating through corpus periods during evaluation of the same query
Referenced by CDDCLeafServer::Close(), and CQueryTokenNode::EvaluateWithoutHits().
string CConcSession::m_QueryResultStr |
the result of the query (its format depends upon m_ResultFormat)
Referenced by CDDCLeafServer::GetHitContexts(), CDDCLeafServer::GetHitCounts(), CDDCLeafServer::handle__get_hit_strings(), and CDDCLeafServer::handle__run_query().
string CConcSession::m_ErrorStr |
most recent error message (if applicable)
Referenced by CDDCLeafServer::handle__get_first_hits(), CDDCLeafServer::handle__get_hit_strings(), and CDDCLeafServer::handle__run_query().
CConcordance* CConcSession::m_pConcordance |
m_pConcordance is the main (and only) pointer to the corpus indices and break collections. During querying, this pointer is treated as constant. The class's original name, "CConcHolder", was chosen because the class "holds" this pointer.
Referenced by CQToken::BreakName(), CQueryTokenNode::BuildRegExp(), CConcSessionContext::CacheGet(), CConcSessionContext::CacheSet(), CQueryOptions::CheckSatisfiable(), CQueryOptions::Compile(), CQCountKeyExprIndexed::Compile(), CQFBiblSort::Compile(), CQCountKeyExprToken::Compile(), CQFContextSort::Compile(), CQFHasFieldValue::Compile(), CQFHasFieldRegex::Compile(), CQFHasFieldSet::Compile(), CQueryNode::ConvertOccurrencesToHits(), CQueryNode::ConvertOccurrencesToHitsForPatterns(), CQTokLemma::Create(), CQTokChunk::Create(), CQTokFile::Create(), CQueryTokenNode::CreateChunkPattern(), CQueryTokenNode::CreateLemmaPattern(), CQueryTokenNode::CreateThesPattern(), CQueryTokenNode::EvaluateWithoutHits(), CQFSort::GetBiblConstant(), CQueryTokenNode::GetIndex(), CDDCLeafServer::handle__info(), CDDCLeafServer::handle__reload(), CQToken::IndexName(), CDDCLeafServer::LoadHolder(), CQFPrune::PruneHitsIndex(), CQFSort::ResolveAttributeName(), and WorkerClone().
size_t CConcSession::m_ResultOffset |
actual logical offset (starting from zero) of the first returnable hit in m_Hits[] (~ StartHitNo)
Referenced by CConcSessionContext::CacheGet().
size_t CConcSession::m_ResultLimit |
the maximal number of hits to be returned by the current query operation
Referenced by CConcSessionContext::CacheGet(), CQKeys::Compile(), CDDCLeafServer::handle__get_first_hits(), CDDCLeafServer::handle__get_hit_strings(), and CDDCLeafServer::handle__run_query().
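Taken together, m_ResultOffset and m_ResultLimit describe a window over the full hit list. A minimal sketch of that windowing, assuming hits are held in a std::vector and using the hypothetical helper SketchWindow() (not part of DDC), might look like this:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Return at most `limit` hits starting at zero-based `offset`.
// Offsets past the end yield an empty result rather than an error (assumption).
template <typename Hit>
std::vector<Hit> SketchWindow(const std::vector<Hit>& hits,
                              std::size_t offset, std::size_t limit) {
    const std::size_t first = std::min(offset, hits.size());
    const std::size_t count = std::min(limit, hits.size() - first);
    const auto begin = hits.begin() + static_cast<std::ptrdiff_t>(first);
    return std::vector<Hit>(begin, begin + static_cast<std::ptrdiff_t>(count));
}
```

Clamping both the offset and the count avoids out-of-range iterators when a client requests a page beyond the last hit.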
string CConcSession::m_ResultMinKey |
minimum final sort-key of hits to be retrieved by GetAllHits()
Referenced by CDDCLeafServer::handle__get_first_hits(), and CDDCLeafServer::handle__run_query().
string CConcSession::m_RequestPath |
full request path leading to this session (used by CDDCLeafServer)
Referenced by CQueryOptions::CheckSatisfiable(), CDDCLeafServer::handle__get_first_hits(), and CDDCLeafServer::handle__run_query().
size_t CConcSession::m_CurrentSearchPeriodNo |
The index of the subcorpus currently being processed.
Referenced by CQueryTokenNode::EvaluateWithoutHits().
string CConcSession::m_AdditionalHitDelimiter |
a delimiter which should be used between hits in m_QueryResultStr in the distributed model
Referenced by WorkerClone().
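The role of a hit delimiter can be sketched as a plain join over per-hit strings. The helper SketchJoinHits() and the sample delimiter used in the test are hypothetical; how DDC actually assembles m_QueryResultStr in the distributed model is not shown in this reference.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Concatenate hit strings, placing `delimiter` between consecutive hits
// (not before the first hit and not after the last one).
inline std::string SketchJoinHits(const std::vector<std::string>& hits,
                                  const std::string& delimiter) {
    std::string result;
    for (std::size_t i = 0; i < hits.size(); ++i) {
        if (i > 0) {
            result += delimiter;
        }
        result += hits[i];
    }
    return result;
}
```

A receiver that knows the delimiter can split the combined string back into individual hits, provided the delimiter never occurs inside a hit.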