Taxi::Mysql::Query::Parser - extendable full-text index using mysql: high-level query parser



NAME

Taxi::Mysql::Query::Parser - extendable full-text index using mysql: high-level query parser


SYNOPSIS

 ##========================================================================
 ## PRELIMINARIES
 use Taxi::Mysql::Query::Parser;
 ##========================================================================
 ## Constructors etc.
 $qp    = $CLASS_OR_OBJ->new(%args);
 $undef = $qp->free();                ##-- explicit destruction REQUIRED!
 $qp    = $qp->useIndex($index);
 ##========================================================================
 ## API: High-level Parsing
 $undef = $qp->reset();
 $query_or_undef = $qp->parse(@query_strings);
 ##========================================================================
 ## API: Mid-level: Query Generation API
 $q = $parser->newQuery(@args);
 $q = $parser->finishQuery($srcQuery);
 $quotedStr = $qp->sqlQuoteString($str);
 $varLabel = $parser->newVariable(%args);
 $varLabel = $qp->newReference(src=>$srcVarName, ref=>$refColName);
 [$tokVar,$attr,...] = $CLASS_OR_OBJ->parseReference($varName);
 $tokenVarName = $CLASS_OR_OBJ->reference2token($varName);
 $q = $parser->constantQuery($sqlWhereFragment);
 $q = $parser->literalQuery($literal_text);
 $q = $qp->soundsLikeQuery($attributeId,$soundsLikeText);
 $q = $parser->attributeQuery($attributeId, $sqlOpFragment, $sqlValueFragment);
 $q = $parser->precedesQuery($q1,$q2);
 $q = $parser->sequenceQuery(\@queryList);
 $q = $qp->nearQuery($maxDist, \@queryList);
 $q = $parser->withinQuery($srcQuery, $withinTabName);
 $q = $qp->metaQueryLocal($metaPath,$srcQuery,$sqlOpFragment,$sqlValueFragment);
 undef = $qp->metaQueryDelayed($metaPath, $sqlOpFragment, $sqlValueFragment);
 $sqlStr = $qp->attributeValue($attributeIdOrValue);
 $varName  = $qp->parseReferencePath([$varNameOrUndef]);
 [$varName,$attrPath] = $qp->parseAttributePath([@refPath,$attrName]);
 $sqlFragment = $qp->sqlGreatestId($query,\@tokenVars);
 $sqlFragment = $qp->sqlLeastId($query,\@tokenVars);
 $sqlFragment = $qp->sqlMinMax($func,$query);
 $q_expanded = $qp->expandMeta($q);
 ##========================================================================
 ## API: Low-level: Lexer/Parser Connection, Error Reporting, etc.
 \&yylex_sub   = $qp->_yylex_sub();
 \&yyerror_sub = $qp->_yyerror_sub();
 $errorString  = $qp->setError($errorCode,\%userMacros);
 ##========================================================================
 ## I/O: Hooks
 $tmpData = $obj->preSaveHook();
 undef = $obj->postSaveHook($tmpData);
 undef = $obj->postLoadHook();


DESCRIPTION

Taxi::Mysql::Query::Parser is a high-level parser for user queries expressed in a convenient Taxi-native query language. It uses a Parse::Lex subclass and a Parse::Yapp generated parser for low-level parsing.

NOTE/TODO: the low-level lexer class is NOT thread-safe -- it isn't even a true object in the sense that creating multiple instances causes the re-creation of the same packages and package-global variables. This is a bug in Parse::Lex, which appears no longer to be maintained. At some point, the Parse::Lex subclass should be replaced by a true object class.

Globals etc.

Variable: @ISA

Taxi::Mysql::Query::Parser inherits from Taxi::Mysql::Base.

Constructors etc.

new
 $qp = $CLASS_OR_OBJ->new(%args);

Constructor. NOTE: call free() on the returned object before discarding it; the object holds circular references (see free()), so Perl's reference counting cannot reclaim it otherwise.

Object structure / known %args:

   {
    ##-- Query Defaults
    index             => $ix,                ##-- underlying index
    default_attribute => \@attributePath,    ##-- for literal queries; default:[qw(type text)]
    default_op        => $sqlOp,             ##-- for literal queries; default:'='
    ##-- Status flags
    errstr => $current_errstr, ##-- false indicates no error
    ##-- Underlying lexer/parser pair
    lexer  => $yylexer,   ##-- a Taxi::Mysql::YYLexer object
    parser => $yyparser,  ##-- a Taxi::Mysql::YYParser object
    yydebug => $mask,     ##-- yydebug value
    ##-- Dynamic data (parse-local)
    joins   => { $joinStr=>undef, ... },          ##-- dynamic joins
    qtmp    => $query,                            ##-- dummy query, used for variable allocation etc.
    meta    => [ [$metaPath,$sqlOp,$sqlVal],... ], ##-- meta queries
    ##-- Closures
    yylex    => \&yylex,   ##-- yapp-friendly lexer sub
    yyerror  => \&yyerror, ##-- yapp-friendly error sub
   }

free
 $undef = $qp->free();

Performs the required pre-destruction cleanup by breaking circular references. In particular, it clears $qp itself as well as $qp->{parser}{USER}, which leaves $qp useless but destroyable.
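
The need for an explicit free() follows from Perl's reference counting: an object that is reachable from its own closures or from a parser slot pointing back at it is never reclaimed on scope exit. The following is a minimal sketch of that effect, not the module's actual code; Demo::Parser and its fields are hypothetical stand-ins that only mimic the documented {parser}{USER} layout.

```perl
use strict;
use warnings;

package Demo::Parser;   ##-- hypothetical stand-in, NOT Taxi::Mysql::Query::Parser

our $DESTROYED = 0;     ##-- counts DESTROY calls so the effect is observable

sub new {
  my $class = shift;
  my $self  = bless { parser => { USER => {} } }, $class;
  ##-- two cycles: the closure captures $self, and the USER slot points back at it
  $self->{yylex}            = sub { return $self->{errstr} };
  $self->{parser}{USER}{qp} = $self;
  return $self;
}

sub free {
  my $self = shift;
  %{ $self->{parser}{USER} } = ();   ##-- drop the back-reference held by the parser
  %$self = ();                       ##-- drop closures etc.; the object is now empty
  return;
}

sub DESTROY { ++$DESTROYED }

package main;

{
  my $leaky = Demo::Parser->new();
}                                    ##-- cycles keep the refcount above zero
my $after_leak = $Demo::Parser::DESTROYED;

{
  my $clean = Demo::Parser->new();
  $clean->free();
}                                    ##-- reclaimed as soon as the scope ends
my $after_free = $Demo::Parser::DESTROYED;

print "destroyed without free(): $after_leak; with free(): ",
      $after_free - $after_leak, "\n";
## prints: destroyed without free(): 0; with free(): 1
```

The leaked object is only cleaned up at interpreter shutdown, which matters for long-running processes that create many parsers.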

useIndex
 $qp = $qp->useIndex($index);

Sets up parser to use the Taxi::Mysql index $index.

API: High-level Parsing

The following methods comprise the top-level parsing API.

reset
 undef = $qp->reset();

Reset all parse-relevant data structures in preparation for parsing a new query.

parse
 $query_or_undef = $qp->parse(@query_strings);
 $query_or_undef = $qp->parse(\*query_fh);

Parse and return a user query as a Taxi::Mysql::Query::Base object (or subclass) from a (list of) string(s) [first form], or from an open filehandle [second form]. If an error is encountered, parse() returns undef.

API: Mid-level: Query Generation API

The following methods comprise the mid-level parsing API. Users should never need to call these methods directly, but they may be useful if you are deriving a new parser (sub)class, e.g. implementing an alternate query syntax.

newQuery
 $q = $parser->newQuery(@args);

Wrapper for $parser->{index}->newQuery(@args).

finishQuery
 $q = $parser->finishQuery($srcQuery);

Imposes default 'hit' restrictor on parsed query $srcQuery, and other finalizing touches (join insertion, variable merge, independent variable check, meta expansion).

sqlQuoteString
 $quotedStr = $qp->sqlQuoteString($str);

Adds single quotes around $str and escapes any string-internal single quotes.
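
As an illustration of the quoting behavior described above, here is a minimal sketch. The function name sql_quote_string is hypothetical, and the escaping style (doubling each embedded quote) is an assumption; the module may escape with a backslash instead.

```perl
use strict;
use warnings;

##-- hypothetical stand-in for sqlQuoteString(); the escaping style is an assumption
sub sql_quote_string {
  my $str = shift;
  $str =~ s/'/''/g;        ##-- double each embedded single quote
  return "'$str'";         ##-- wrap the result in single quotes
}

print sql_quote_string("O'Brien"), "\n";
## prints: 'O''Brien'
```

This sketch does not treat backslashes or other special characters. For untrusted input, DBI's $dbh->quote() or placeholders are the safer choice.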

newVariable
 $varLabel = $qp->newVariable(%args);

Wrapper for $qp->{qtmp}->newVariable(), with different default semantics. Known %args:

 label => $varLabel,    ##-- variable label           (default=(generated))
 table => $varTable,    ##-- variable table name      (default=$qp->{qtmp}{default_table})
 tok   => $bool,        ##-- is this var independent? (default=true)

newReference
 $varLabel = $qp->newReference(src=>$srcVarName, ref=>$refColName);

Wrapper for newVariable() which creates a new variable dependent on $srcVarName which will be joined to the table referenced by the 'ref' field $refColName, for transparent de-referencing in queries. Implementation handles variable aliasing by naming conventions, and performs some basic sanity checks.

parseReference
 [$tokVar,$attr,...] = $CLASS_OR_OBJ->parseReference($varName);

Parses a reference returned by the low-level parser as a dot-separated string of the form "${tokenVarName}.${refAttr1}.(...).${refAttrN}.${attrName}".

reference2token
 $tokenVarName = $CLASS_OR_OBJ->reference2token($varName);

Hack: get the name of the independent (token) variable associated with the dot-separated string $varName.
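
The dot-separated reference format handled by parseReference() and reference2token() can be illustrated with a short sketch. The function names and the example variable 't1.type.lemma' are hypothetical; the real methods may do additional validation.

```perl
use strict;
use warnings;

##-- hypothetical sketch: split "tokVar.ref1.(...).attr" into its components
sub parse_reference {
  my $varName = shift;
  return [ split /\./, $varName ];
}

##-- hypothetical sketch: the independent (token) variable is the first component
sub reference_to_token {
  my $varName = shift;
  return parse_reference($varName)->[0];
}

my $parts = parse_reference('t1.type.lemma');
print join(', ', @$parts), "\n";                  ## prints: t1, type, lemma
print reference_to_token('t1.type.lemma'), "\n";  ## prints: t1
```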

constantQuery
 $q = $parser->constantQuery($sqlWhereFragment);

Simple constant query (boolean true or false). Can also be used to add literal SQL fragments to a query object.

literalQuery
 $q = $parser->literalQuery($literal_text);

Handler for "literal" single-word or -string queries. Default is an attribute query on $qp->{default_attribute} via $qp->{default_op} with value $literal_text.

Sets $q->{tok} to the newly generated variable as a side-effect.

soundsLikeQuery
 $q = $qp->soundsLikeQuery($attributeId,$soundsLikeText);

Handler for "sounds-like" queries over $attributeId with (orthographic) value $soundsLikeText. Default uses path 'type.pho' on $varName, 'pho' on $soundsLikeText (implicit table: 'type').

attributeQuery
 $q = $parser->attributeQuery($attributeId, $sqlOpFragment, $sqlValueFragment);

Handler for generic attribute queries, where $attributeId = [$varName,$attrName].

attributeValue
 $sqlStr = $qp->attributeValue($attributeIdOrValue);

Returns an SQL-string representing $attributeIdOrValue, where $attributeIdOrValue is one of the following:

parseReferencePath
 $varName  = $qp->parseReferencePath([$varNameOrUndef]);
 $varNameN = $qp->parseReferencePath([$varNameOrUndef,@refNames]);

Calls newVariable() if $varNameOrUndef is undefined to allocate an independent base variable, and calls newReference() for each reference named in @refNames to perform nested variable de-referencing.

parseAttributePath
 [$varName,$attrPath] = $qp->parseAttributePath([@refPath,$attrName]);

Wrapper which calls parseReferencePath() on the non-final components of [@refPath,$attrName] and returns an $attributeId ARRAY-ref [$varName,$attrPath] representing the (de-referenced) argument array.

precedesQuery
 $q = $parser->precedesQuery($q1,$q2);

Enforces restriction that all 'tok' variables in $q1 precede all those in $q2 (by primary key).

sqlGreatestId
 $sqlFragment = $qp->sqlGreatestId($query,\@tokenVars);

Returns SQL fragment representing the value of the greatest primary key of any independent variable named in \@tokenVars.

sqlLeastId
 $sqlFragment = $qp->sqlLeastId($query,\@tokenVars);

Returns SQL fragment representing the value of the smallest primary key of any independent variable named in \@tokenVars.

sqlMinMax
 $sqlFragment = $qp->sqlMinMax($func,$query);
 $sqlFragment = $qp->sqlMinMax($func,$query,\@tokenVars);

Guts for sqlGreatestId() and sqlLeastId(): returns $tokenVars[0] if only one token is specified in @tokenVars, otherwise applies SQL function $func to SQL forms of @tokenVars.
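
The single-variable shortcut and the function-call form might look roughly like the sketch below. It is a simplification under stated assumptions: the $query argument is omitted, the primary key column is assumed to be named 'id', and GREATEST/LEAST are the MySQL functions assumed for $func. The precedes_condition() helper is likewise hypothetical and only shows how such fragments could express the ordering restriction of precedesQuery().

```perl
use strict;
use warnings;

##-- hypothetical sketch; the key column name 'id' is an assumption
sub sql_min_max {
  my ($func, $tokenVars) = @_;
  die "no token variables given\n" unless @$tokenVars;
  my @ids = map { "$_.id" } @$tokenVars;
  return $ids[0] if @ids == 1;                 ##-- one variable: no function call needed
  return "$func(" . join(',', @ids) . ")";     ##-- otherwise apply $func to all keys
}
sub sql_greatest_id { return sql_min_max('GREATEST', $_[0]) }
sub sql_least_id    { return sql_min_max('LEAST',    $_[0]) }

##-- hypothetical: every variable in @$vars1 precedes every variable in @$vars2
sub precedes_condition {
  my ($vars1, $vars2) = @_;
  return sql_greatest_id($vars1) . ' < ' . sql_least_id($vars2);
}

print precedes_condition(['t1','t2'], ['t3']), "\n";
## prints: GREATEST(t1.id,t2.id) < t3.id
```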

sequenceQuery
 $q = $parser->sequenceQuery(\@queryList);

Handler for back-to-back ordered sequences of queries. Default implementation interprets these as serial order of independent variables' primary keys.

unearQuery
 $q = $qp->unearQuery($maxDist, \@queryList);

Handles unordered 'near' queries over at most $maxDist intervening tokens.

nearQuery
 $q = $qp->nearQuery($maxDist, \@queryList);

Handles ordered 'near' queries over at most $maxDist intervening tokens.
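
The difference between the ordered and unordered 'near' variants can be sketched as a condition on primary keys. This assumes, as with sequenceQuery(), that adjacent tokens have consecutive keys, and that the key column is named 'id'. The near_condition() helper and its two-variable signature are hypothetical; the real handlers operate on whole query lists.

```perl
use strict;
use warnings;

##-- hypothetical sketch for two token variables; the 'id' column is an assumption
sub near_condition {
  my ($maxDist, $v1, $v2, $ordered) = @_;
  my $span = $maxDist + 1;    ##-- k intervening tokens means the keys differ by k+1
  return $ordered
    ? "($v2.id > $v1.id AND $v2.id - $v1.id <= $span)"           ##-- nearQuery-like
    : "($v1.id <> $v2.id AND ABS($v2.id - $v1.id) <= $span)";    ##-- unearQuery-like
}

print near_condition(2, 't1', 't2', 1), "\n";
## prints: (t2.id > t1.id AND t2.id - t1.id <= 3)
print near_condition(2, 't1', 't2', 0), "\n";
## prints: (t1.id <> t2.id AND ABS(t2.id - t1.id) <= 3)
```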

withinQuery
 $q = $parser->withinQuery($srcQuery, $withinTabName);

Handles 'within' queries: imposes default 'hit' container by join clause manipulation.

metaQueryLocal
 $q = $qp->metaQueryLocal($metaPath,$srcQuery,$sqlOpFragment,$sqlValueFragment);

Handler for metadata queries. Current version performs immediate expansion on all token vars in $srcQuery. This is the Right Way To Do It if metadata queries should be locally scoped.

metaQueryDelayed
 undef = $qp->metaQueryDelayed($metaPath, $sqlOpFragment, $sqlValueFragment);

Alternate handler for metadata queries (currently unused). This version performs no expansion when the meta-query is parsed, but rather enqueues all metadata queries for later expansion (e.g. on $qp->finishQuery()). This would be the Right Way To Do It if metadata queries should always be interpreted as globally scoped.

expandMeta
 $q_expanded = $qp->expandMeta($q);

Expands delayed metadata conditions in $qp->{meta} (if any) into $q.

API: Low-level: Lexer/Parser Connection, Error Reporting, etc.

_yylex_sub
 \&yylex_sub = $qp->_yylex_sub();

Returns a Parse::Yapp-friendly lexer subroutine.

_yyerror_sub
 \&yyerror_sub = $qp->_yyerror_sub();

Returns error subroutine for the underlying Yapp parser.

setError
 $errorString = $qp->setError($errorCode,\%userMacros);

Should set $qp->{errstr} to expanded $errorString.

The default behavior just replaces the following macros in $errorCode:

 __LINE__
 __COL__
 __LEXTOKNAME__
 __LEXTOKTEXT__
 __TOKNAME__
 __TOKTEXT__
 __EXPECTED__
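
Macro replacement of this kind can be sketched as a single substitution over the error template. The expand_error() function, the template text, and the macro values below are hypothetical; in the real method the values would presumably come from the lexer and parser state plus \%userMacros.

```perl
use strict;
use warnings;

##-- hypothetical sketch of macro expansion; not the module's setError()
sub expand_error {
  my ($errorCode, $macros) = @_;
  ##-- build one alternation of all macro names, longest first
  my $names = join '|',
              map  { quotemeta($_) }
              sort { length($b) <=> length($a) } keys %$macros;
  return $errorCode unless length $names;
  (my $errstr = $errorCode) =~ s/($names)/$macros->{$1}/g;
  return $errstr;
}

##-- keys are quoted because __LINE__ is also a Perl special literal
my %macros = ('__LINE__' => 3, '__COL__' => 14, '__TOKTEXT__' => 'NEAR');
print expand_error('syntax error at line __LINE__, column __COL__ near "__TOKTEXT__"',
                   \%macros), "\n";
## prints: syntax error at line 3, column 14 near "NEAR"
```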

I/O: Hooks

preSaveHook
 $tmpData = $obj->preSaveHook();

Sanitizes the object for saving; returns temporary data.

postSaveHook
 undef = $obj->postSaveHook($tmpData);

(undocumented)

postLoadHook
 undef = $obj->postLoadHook();

(undocumented)


ACKNOWLEDGEMENTS

Perl by Larry Wall.


AUTHOR

Bryan Jurish <moocow@ling.uni-potsdam.de>


COPYRIGHT AND LICENSE

Copyright (C) 2006 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or, at your option, any later version of Perl 5 you may have available.


SEE ALSO

perl(1), Taxi::Mysql(3perl), Taxi::Mysql::Query::YYLexer(3perl), Taxi::Mysql::Query::YYParser(3perl), Taxi::Mysql::Query::Base(3perl), Taxi::Mysql::Query::Boolean(3perl).
