Taxi::Mysql::Query::Parser - extendable full-text index using mysql: high-level query parser
##========================================================================
## PRELIMINARIES
use Taxi::Mysql::Query::Parser;
##========================================================================
## Constructors etc.
$qp = $CLASS_OR_OBJ->new(%args);
$undef = $qp->free(); ##-- explicit destruction REQUIRED!
$qp = $qp->useIndex($index);
##========================================================================
## API: High-level Parsing
$undef = $qp->reset();
$query_or_undef = $qp->parse(@query_strings);
##========================================================================
## API: Mid-level: Query Generation API
$q = $parser->newQuery(@args);
$q = $parser->finishQuery($srcQuery);

$quotedStr = $qp->sqlQuoteString($str);
$varLabel = $parser->newVariable(%args);
$varLabel = $qp->newReference(src=>$srcVarName, ref=>$refColName);

[$tokVar,$attr,...] = $CLASS_OR_OBJ->parseReference($varName);
$tokenVarName = $CLASS_OR_OBJ->reference2token($varName);

$q = $parser->constantQuery($sqlWhereFragment);
$q = $parser->literalQuery($literal_text);
$q = $qp->soundsLikeQuery($attributeId,$soundsLikeText);
$q = $parser->attributeQuery($attributeId, $sqlOpFragment, $sqlValueFragment);
$q = $parser->precedesQuery($q1,$q2);
$q = $parser->sequenceQuery(\@queryList);
$q = $qp->nearQuery($maxDist, \@queryList);
$q = $parser->withinQuery($srcQuery, $withinTabName);
$q = $qp->metaQueryLocal($metaPath,$srcQuery,$sqlOpFragment,$sqlValueFragment);
undef = $qp->metaQueryDelayed($metaPath, $sqlOpFragment, $sqlValueFragment);

$sqlStr = $qp->attributeValue($attributeIdOrValue);
$varName = $qp->parseReferencePath([$varNameOrUndef]);
[$varName,$attrPath] = $qp->parseAttributePath([@refPath,$attrName]);

$sqlFragment = $qp->sqlGreatestId($query,\@tokenVars);
$sqlFragment = $qp->sqlLeastId($query,\@tokenVars);
$sqlFragment = $qp->sqlMinMax($func,$query);

$q_expanded = $qp->expandMeta($q);
##========================================================================
## API: Low-level: Lexer/Parser Connection, Error Reporting, etc.
\&yylex_sub = $qp->_yylex_sub();
\&yyerror_sub = $qp->_yyerror_sub();
$errorString = $qp->setError($errorCode,\%userMacros);
##========================================================================
## I/O: Hooks
$tmpData = $obj->preSaveHook();
undef = $obj->postSaveHook($tmpData);
undef = $obj->postLoadHook();
Taxi::Mysql::Query::Parser is a high-level parser for user queries expressed in a convenient Taxi-native query language. It uses a Parse::Lex subclass and a Parse::Yapp-generated parser for low-level parsing.
NOTE/TODO: the low-level lexer class is NOT thread-safe -- it isn't even a true object in the sense that creating multiple instances causes the re-creation of the same packages and package-global variables. This is a bug in Parse::Lex, which appears no longer to be maintained. At some point, the Parse::Lex subclass should be replaced by a true object class.
Taxi::Mysql::Query::Parser inherits from Taxi::Mysql::Base.
$qp = $CLASS_OR_OBJ->new(%args);
Constructor.
NOTE: you should probably call free() before destroying the returned object to be safe.
Object structure / known %args:
{
 ##-- Query Defaults
 index => $ix, ##-- underlying index
 default_attribute => \@attributePath, ##-- for literal queries; default:[qw(type text)]
 default_op => $sqlOp, ##-- for literal queries; default:'='

 ##-- Status flags
 errstr => $current_errstr, ##-- false indicates no error

 ##-- Underlying lexer/parser pair
 lexer => $yylexer, ##-- a Taxi::Mysql::YYLexer object
 parser => $yyparser, ##-- a Taxi::Mysql::YYParser object
 yydebug => $mask, ##-- yydebug value

 ##-- Dynamic data (parse-local)
 joins => { $joinStr=>undef, ... }, ##-- dynamic joins
 qtmp => $query, ##-- dummy query, used for variable allocation etc.
 meta => [ [$metaPath,$sqlOp,$sqlVal],... ], ##-- meta queries

 ##-- Closures
 yylex => \&yylex, ##-- yapp-friendly lexer sub
 yyerror => \&yyerror, ##-- yapp-friendly parser sub
}
$undef = $qp->free();
Performs required pre-destruction cleanup (trims circular references, etc.). In particular, it clears $qp itself as well as $qp->{parser}{USER}; this leaves $qp unusable, but allows it to be destroyed.
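To illustrate why explicit cleanup can be needed at all, here is a minimal, self-contained sketch of a reference cycle between an object and a closure it stores. The CycleDemo package and its fields are hypothetical and only assume that the parser's closures capture the parser object; this is not the module's actual code.

```perl
use strict;
use warnings;

package CycleDemo;    # hypothetical demo class, not part of Taxi::Mysql

our $destroyed = 0;   # counts DESTROY calls, for demonstration only

sub new {
    my $class = shift;
    my $self  = bless { errstr => '' }, $class;
    # The closure captures $self, and $self stores the closure:
    # this is a reference cycle that reference counting cannot reclaim.
    $self->{yyerror} = sub { $self->{errstr} = "parse error: @_" };
    return $self;
}

# Break the cycle by emptying the object; it is unusable afterwards.
sub free {
    my $self = shift;
    %$self = ();
    return;
}

sub DESTROY { $destroyed++ }

package main;

# Object goes out of scope without free(): it is not destroyed.
sub demo_leak {
    my $obj = CycleDemo->new;
    return;
}

# Object is freed explicitly first: it is destroyed on scope exit.
sub demo_free {
    my $obj = CycleDemo->new;
    $obj->free;
    return;
}

demo_leak();
print "destroyed after leak: $CycleDemo::destroyed\n";
demo_free();
print "destroyed after free: $CycleDemo::destroyed\n";
```

Without the call to free(), the object in demo_leak() survives until global destruction, which is the kind of leak that explicit cleanup avoids.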
$qp = $qp->useIndex($index);
Sets up parser to use the Taxi::Mysql index $index.
The following methods comprise the top-level parsing API.
undef = $qp->reset();
Reset all parse-relevant data structures in preparation for parsing a new query.
$query_or_undef = $qp->parse(@query_strings);
$query_or_undef = $qp->parse(\*query_fh);
Parse and return a user query as a Taxi::Mysql::Query::Base object (or subclass) from a (list of) string(s) [first form], or from an open filehandle [second form]. If an error is encountered, parse() returns undef.
The following methods comprise the mid-level parsing API. Users should never need to call these methods directly, but they may be useful if you are deriving a new parser (sub)class, e.g. implementing an alternate query syntax.
$q = $parser->newQuery(@args);
Wrapper for $parser->{index}->newQuery(@args)
$q = $parser->finishQuery($srcQuery);
Imposes the default 'hit' restrictor on the parsed query $srcQuery and applies other finishing touches (join insertion, variable merge, independent variable check, meta expansion).
$quotedStr = $qp->sqlQuoteString($str);
Adds single quotes around $str and escapes any string-internal single quotes.
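The quoting step can be sketched in a few lines. The function name sql_quote_string is a hypothetical stand-in, and doubling embedded quotes is an assumption: the text above does not say which escaping convention the real method uses.

```perl
use strict;
use warnings;

# Hypothetical stand-in for sqlQuoteString(): wraps $str in single quotes
# and doubles embedded single quotes (assumed escaping convention).
sub sql_quote_string {
    my ($str) = @_;
    (my $escaped = $str) =~ s/'/''/g;
    return "'" . $escaped . "'";
}

print sql_quote_string("it's"), "\n";    # prints 'it''s'
```

For values that come from untrusted input, the quote() method of a DBI database handle is usually the safer choice, since it follows the rules of the connected driver.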
$varLabel = $qp->newVariable(%args);
Wrapper for $qp->{qtmp}->newVariable(), with different default semantics. Known %args:
label => $varLabel, ##-- variable label (default=(generated))
table => $varTable, ##-- variable table name (default=$qp->{qtmp}{default_table})
tok => $bool, ##-- is this var independent? (default=true)
$varLabel = $qp->newReference(src=>$srcVarName, ref=>$refColName);
Wrapper for newVariable() which creates a new variable dependent on $srcVarName which will be joined to the table referenced by the 'ref' field $refColName, for transparent de-referencing in queries. Implementation handles variable aliasing by naming conventions, and performs some basic sanity checks.
[$tokVar,$attr,...] = $CLASS_OR_OBJ->parseReference($varName);
Parses a reference returned by the low-level parser as a dot-separated string of the form ``${tokenVarName}.${refAttr1}.(...).${refAttrN}.${attrName}''.
$tokenVarName = $CLASS_OR_OBJ->reference2token($varName);
Hack: get the name of the independent (token) variable associated with the dot-separated string $varName.
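As a rough illustration of the dot-separated reference format, the sketch below splits a reference into its components and picks out the leading token variable. The function names and the sample reference 't1.type.pho' are hypothetical; the real methods may perform additional checks.

```perl
use strict;
use warnings;

# Hypothetical stand-in for parseReference(): returns an ARRAY-ref
# [$tokVar, $attr, ...] for a dot-separated reference string.
sub parse_reference {
    my ($var_name) = @_;
    return [ split /\./, $var_name ];
}

# Hypothetical stand-in for reference2token(): the token variable is
# assumed to be the first dot-separated component.
sub reference_to_token {
    my ($var_name) = @_;
    return parse_reference($var_name)->[0];
}

print join(', ', @{ parse_reference('t1.type.pho') }), "\n";    # prints t1, type, pho
print reference_to_token('t1.type.pho'), "\n";                  # prints t1
```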
$q = $parser->constantQuery($sqlWhereFragment);
Simple constant query (boolean true or false). Can also be used to add literal SQL fragments to a query object.
$q = $parser->literalQuery($literal_text);
Handler for ``literal'' single-word or -string queries. Default is an attribute query on $q->{defaultAttribute} via $q->{defaultOp} with value $literal_text.
Sets $q->{tok} to the newly generated variable as a side-effect.
$q = $qp->soundsLikeQuery($attributeId,$soundsLikeText);
Handler for ``sounds-like'' queries over $attributeId with (orthographic) value $soundsLikeText. Default uses path 'type.pho' on $varName, 'pho' on $soundsLikeText (implicit table: 'type').
$q = $parser->attributeQuery($attributeId, $sqlOpFragment, $sqlValueFragment);
Handler for generic attribute queries, where $attributeId = [$varName,$attrName].
$sqlStr = $qp->attributeValue($attributeIdOrValue);
Returns an SQL-string representing $attributeIdOrValue, where $attributeIdOrValue is one of the following:
a literal value (numeric or string, pre-parsed)
a pair [ $varName, $attrName ]
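The two accepted argument shapes can be told apart with ref(). The sketch below is an assumption about the resulting SQL: it renders a pair as a qualified column name "var.attr" and passes pre-parsed literals through unchanged. The name attribute_value is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for attributeValue().
sub attribute_value {
    my ($id_or_value) = @_;
    if (ref($id_or_value) eq 'ARRAY') {
        # pair [ $varName, $attrName ]: assumed to become a column reference
        my ($var_name, $attr_name) = @$id_or_value;
        return $var_name . '.' . $attr_name;
    }
    # literal value: already pre-parsed, so it is returned as-is
    return $id_or_value;
}

print attribute_value([ 't1', 'text' ]), "\n";    # prints t1.text
print attribute_value("'foo'"), "\n";             # prints 'foo'
```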
$varName = $qp->parseReferencePath([$varNameOrUndef]);
$varNameN = $qp->parseReferencePath([$varNameOrUndef,@refNames]);
Calls newVariable() if $varNameOrUndef is undefined to allocate an independent base variable, and calls newReference() for each reference named in @refNames to perform nested variable de-referencing.
[$varName,$attrPath] = $qp->parseAttributePath([@refPath,$attrName]);
Wrapper which calls parseReferencePath() on non-final components of [@refPath,$attrName] and returns an $attributeId ARRAY-ref [$varName,$attrPath] representing the (de-referenced) argument array.
$q = $parser->precedesQuery($q1,$q2);
Enforces restriction that all 'tok' variables in $q1 precede all those in $q2 (by primary key).
$sqlFragment = $qp->sqlGreatestId($query,\@tokenVars);
Returns SQL fragment representing the value of the greatest primary key of any independent variable named in \@tokenVars.
$sqlFragment = $qp->sqlLeastId($query,\@tokenVars);
Returns SQL fragment representing the value of the smallest primary key of any independent variable named in \@tokenVars.
$sqlFragment = $qp->sqlMinMax($func,$query);
$sqlFragment = $qp->sqlMinMax($func,$query,\@tokenVars);
Guts for sqlGreatestId() and sqlLeastId(): returns $tokenVars[0] if only one token is specified in @tokenVars, otherwise applies SQL function $func to SQL forms of @tokenVars.
$q = $parser->sequenceQuery(\@queryList);
Handler for back-to-back ordered sequences of queries. Default implementation interprets these as serial order of independent variables' primary keys.
$q = $qp->unearQuery($maxDist, \@queryList)
Handles unordered 'near' queries over at most $maxDist intervening tokens.
$q = $qp->nearQuery($maxDist, \@queryList);
Handles ordered 'near' queries over at most $maxDist intervening tokens.
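The difference between the ordered and unordered variants can be shown with plain integer positions. This sketch covers only the two-token case and assumes that consecutive tokens have consecutive primary keys; the real handlers generate SQL conditions, and both function names here are hypothetical.

```perl
use strict;
use warnings;

# Ordered: $p1 must precede $p2 with at most $max_dist tokens in between.
sub near_ordered {
    my ($max_dist, $p1, $p2) = @_;
    my $gap = $p2 - $p1 - 1;    # number of intervening tokens
    return ($gap >= 0 && $gap <= $max_dist) ? 1 : 0;
}

# Unordered: either token may come first.
sub near_unordered {
    my ($max_dist, $p1, $p2) = @_;
    my $gap = abs($p2 - $p1) - 1;
    return ($gap >= 0 && $gap <= $max_dist) ? 1 : 0;
}

print near_ordered(2, 10, 13), "\n";      # prints 1 (two tokens in between)
print near_ordered(2, 13, 10), "\n";      # prints 0 (wrong order)
print near_unordered(2, 13, 10), "\n";    # prints 1 (order is ignored)
```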
$q = $parser->withinQuery($srcQuery, $withinTabName);
Handles 'within' queries: imposes default 'hit' container by join clause manipulation.
$q = $qp->metaQueryLocal($metaPath,$srcQuery,$sqlOpFragment,$sqlValueFragment);
Handler for metadata queries. Current version performs immediate expansion on all token vars in $srcQuery. This is the Right Way To Do It if metadata queries should be locally scoped.
undef = $qp->metaQueryDelayed($metaPath, $sqlOpFragment, $sqlValueFragment);
Alternate handler for metadata queries (currently unused). This version performs no expansion when the meta-query is parsed, but rather enqueues all metadata queries for later expansion (e.g. on $qp->finishQuery()). This would be the Right Way To Do It if metadata queries should always be interpreted as globally scoped.
$q_expanded = $qp->expandMeta($q);
Expands delayed metadata conditions in $qp->{meta} (if any) into $q.
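The enqueue-now, expand-later pattern behind the delayed variant might look like the sketch below. Only the layout of the {meta} entries, [$metaPath,$sqlOp,$sqlVal], is taken from the object structure documented above; the function names, the sample path 'doc.author', and the model of a query as a list of WHERE condition strings are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for metaQueryDelayed(): just remember the condition.
sub meta_query_delayed {
    my ($qp, $meta_path, $sql_op, $sql_val) = @_;
    push @{ $qp->{meta} }, [ $meta_path, $sql_op, $sql_val ];
    return;
}

# Hypothetical stand-in for expandMeta(): append every queued condition
# to the query, modelled here as an ARRAY-ref of SQL condition strings.
sub expand_meta {
    my ($qp, $where) = @_;
    for my $m (@{ $qp->{meta} }) {
        my ($meta_path, $sql_op, $sql_val) = @$m;
        push @$where, join(' ', $meta_path, $sql_op, $sql_val);
    }
    return $where;
}

my $qp    = { meta => [] };
my $where = [ "t1.text = 'house'" ];
meta_query_delayed($qp, 'doc.author', '=', "'Goethe'");
expand_meta($qp, $where);
print join(' AND ', @$where), "\n";
```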
\&yylex_sub = $qp->_yylex_sub();
Returns a Parse::Yapp-friendly lexer subroutine.
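For readers unfamiliar with the convention, a Parse::Yapp lexer subroutine returns a (token name, token value) pair on each call and signals end of input with ('', undef). The sketch below builds such a closure over a pre-computed token list; make_yylex and the token names are hypothetical, and the real method wraps the module's own lexer object instead.

```perl
use strict;
use warnings;

# Build a yapp-style lexer closure over a fixed list of
# [ $token_name, $token_value ] pairs (illustrative only).
sub make_yylex {
    my @tokens = @_;
    return sub {
        my $tok = shift @tokens;
        # ('', undef) is the end-of-input signal expected by Parse::Yapp
        return defined $tok ? @$tok : ('', undef);
    };
}

my $yylex = make_yylex([ 'WORD', 'foo' ], [ 'AND', '&&' ], [ 'WORD', 'bar' ]);
while (1) {
    my ($name, $value) = $yylex->();
    last if $name eq '';
    print "$name: $value\n";
}
```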
\&yyerror_sub = $qp->_yyerror_sub();
Returns error subroutine for the underlying Yapp parser.
$errorString = $qp->setError($errorCode,\%userMacros);
Should set $qp->{errstr} to expanded $errorString. The default behavior just replaces the following macros in $errorCode:
__LINE__
__COL__
__LEXTOKNAME__
__LEXTOKTEXT__
__TOKNAME__
__TOKTEXT__
__EXPECTED__
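Macro expansion of this kind amounts to a single substitution pass. In the sketch below, the function name expand_error and the use of the full macro text (including underscores) as hash keys are assumptions; unknown macros are left untouched.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the macro expansion done by setError().
sub expand_error {
    my ($error_code, $macros) = @_;
    (my $msg = $error_code) =~
        s/(__[A-Z]+__)/exists $macros->{$1} ? $macros->{$1} : $1/ge;
    return $msg;
}

my %macros = (
    '__LINE__'    => 3,
    '__COL__'     => 7,
    '__TOKTEXT__' => '&&',
);
print expand_error('syntax error at line __LINE__, column __COL__ near "__TOKTEXT__"',
                   \%macros), "\n";
# prints: syntax error at line 3, column 7 near "&&"
```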
$tmpData = $obj->preSaveHook();
Sanitizes the object for saving; returns temporary data.
undef = $obj->postSaveHook($tmpData);
(undocumented)
undef = $obj->postLoadHook();
(undocumented)
Perl by Larry Wall.
Bryan Jurish <moocow@ling.uni-potsdam.de>
Copyright (C) 2006 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or, at your option, any later version of Perl 5 you may have available.
perl(1), Taxi::Mysql(3perl), Taxi::Mysql::Query::YYLexer(3perl), Taxi::Mysql::Query::YYParser(3perl), Taxi::Mysql::Query::Base(3perl), Taxi::Mysql::Query::Boolean(3perl).