Taxi::Mysql::Grimm - Grimm index subclass of Taxi::Mysql


NAME

Taxi::Mysql::Grimm - Grimm index subclass of Taxi::Mysql (v1)



PACKAGES

Taxi::Mysql::Grimm
Taxi::Mysql::URI::Grimm::WordInfo
Taxi::Mysql::URI::Grimm::Status



SYNOPSIS

 ##========================================================================
 ## PRELIMINARIES
 use Taxi::Mysql::Grimm;

Taxi::Mysql::Grimm Synopsis

 ##========================================================================
 ## Constructors etc.
 $q = $CLASS_OR_OBJ->new(%args);
 ##========================================================================
 ## Overrides: analysis
 $bool = $ix->analyzeDataFiles(%loadDataArgs);
 ##========================================================================
 ## Miscellaneous data-file utilities
 $filename = $ix->_loadDataFilename($table_or_name,%loadDataArgs);
 ##========================================================================
 ## Analysis: Files: LTS (Phonetic)
 $lts        = $ix->ltsAutomaton();
 $bool       = $ix->analyzeLTS(%loadDataArgs);
 ($rc,@ARGV) = $ix->admin_analyzeLTS(@ARGV);
 $undef      = $ix->ltsSummary($lts,$elapsed_secs);
 ##========================================================================
 ## Analysis: Files: Morph (morphological)
 $morph      = $ix->morphAutomaton();
 $bool       = $ix->analyzeMorph(%loadDataArgs);
 ($rc,@ARGV) = $ix->admin_analyzeMorph(@ARGV);
 $undef      = $ix->morphSummary($morph,$elapsed_secs);
 ##========================================================================
 ## Analysis: database-global
 $bool = $ix->insertCoverageRow($typClass, $whereConds);
 ##========================================================================
 ## Word Type Details
 $xmlDoc  = $ix->wordInfoXml($word,%options);
 $htmlDoc = $ix->wordInfoHtml($xmlDoc);
 ##========================================================================
 ## Index Status Details
 $ixStatusDoc = $ix->indexStatusXml(%args);
 $coverageElt = $ix->indexStatusTypeElement($eltName);
 $typeElt     = $ix->indexStatusSubTypeElement($eltName,$coverageTypClassKey);
 $dbStatusElt = $ix->indexStatusDbElement();
 $htmlDoc     = $ix->indexStatusHtml($xmlDoc);

Taxi::Mysql::URI::Grimm::WordInfo Synopsis

 ##========================================================================
 ## URI package: Word Information
 $uri = $class_or_obj->new(%options);
 \%clientRequest = $uri->parseClientRequest($server, $localPath, $clientSocket, $clientHttpRequest);
 $rc = $uri->processClientRequest($server, $clientRequest);

Taxi::Mysql::URI::Grimm::Status Synopsis

 ##========================================================================
 ## URI package: Database Info
 $uri = $class_or_obj->new(%options);
 \%clientRequest = $uri->parseClientRequest($server, $localPath, $clientSocket, $clientHttpRequest);
 $rc = $uri->processClientRequest($server, $clientRequest);



DESCRIPTION

The Taxi::Mysql::Grimm module contains all derived classes for the pre-1 version of the Taxi/Grimm server.

Taxi::Mysql::Grimm Description

The Taxi::Mysql::Grimm class is a Taxi::Mysql subclass for indexing a corpus of quotation evidence drawn from the electronic sources of the Deutsches Woerterbuch (DWB) by Jacob and Wilhelm Grimm.

It is usable out of the box once you have set the relevant database connection options in 'handleArgs', 'prefix', and 'dbEncoding', as well as the automaton locations in 'ltsFstFiles' and 'morphFstFiles'.

Globals etc.

Variable: @ISA

Taxi::Mysql::Grimm inherits from Taxi::Mysql and supports all Taxi::Mysql methods.

Variable: $index_metadata

Set this to false if you don't want to index metadata attributes in the backend DB.

Variable: $strdef_utf8

SQL string datatype definition for UTF-8 strings.

Variable: $strdef_lat1

SQL string datatype definition for Latin-1 strings (currently unused).

Variable: $strdef

SQL string datatype definition.

Constructors etc.

new
 $q = $CLASS_OR_OBJ->new(%args);

The constructor supports all Taxi::Mysql %args, most of which already have sensible defaults in the Grimm subclass. New %args (optional):

 ltsFstFiles => {
   fst  =>'lts-grimm.gfst',       ##-- LTS transducer filename
   lab  =>'lts-grimm-latin1.lab', ##-- LTS labels filename
   dict =>'lts-grimm.dict',       ##-- LTS dictionary filename
 },
 morphFstFiles => {
   fst  =>'morph-grimm.gfst',     ##-- morphology transducer filename
   lab  =>'morph-grimm.lab',      ##-- morphology labels filename
   dict =>'morph-grimm.dict',     ##-- morphology dictionary filename
 },

Filenames for all automata, labels, and dictionaries may be undef to disable acquisition and indexing of the relevant attributes.
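As a rough sketch of how these options might be combined, the following argument list keeps the LTS files at their documented names and disables morphological analysis by setting each of its filenames to undef. Only the key names come from the listing above. The choice of which files to disable is illustrative, and whether 'morphFstFiles' itself may be undef (rather than its individual entries) is not specified here.

```perl
use strict;
use warnings;

##-- hypothetical constructor arguments: only the key names are documented above
my %args = (
  ltsFstFiles => {
    fst  => 'lts-grimm.gfst',
    lab  => 'lts-grimm-latin1.lab',
    dict => 'lts-grimm.dict',
  },
  morphFstFiles => {
    fst  => undef,  ##-- undef: disable this file (assumed to work per entry)
    lab  => undef,
    dict => undef,
  },
);

##-- with Taxi::Mysql::Grimm installed, the index object would then be built as:
##   my $ix = Taxi::Mysql::Grimm->new(%args);
```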

Overrides: analysis

analyzeDataFiles
 $bool = $ix->analyzeDataFiles(%loadDataArgs);

Data file preprocessor: calls $ix->analyzeLTS(%loadDataArgs) and $ix->analyzeMorph(%loadDataArgs).
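The delegation described here can be sketched with a stand-in class. My::SketchIndex and its stub methods are hypothetical and exist only so the example runs on its own. That the result is true only when both analyses succeed is an assumption, not something stated above.

```perl
use strict;
use warnings;

package My::SketchIndex;  ##-- hypothetical stand-in for the real index class
sub new { my ($class,%args) = @_; return bless { calls=>[], %args }, $class; }

##-- stubs: record the call and report success
sub analyzeLTS   { my ($ix,%args) = @_; push(@{$ix->{calls}},'lts');   return 1; }
sub analyzeMorph { my ($ix,%args) = @_; push(@{$ix->{calls}},'morph'); return 1; }

## $bool = $ix->analyzeDataFiles(%loadDataArgs)
##  + assumption: succeeds only if both analyses succeed
sub analyzeDataFiles {
  my ($ix,%args) = @_;
  return $ix->analyzeLTS(%args) && $ix->analyzeMorph(%args);
}

package main;
my $ix = My::SketchIndex->new();
my $ok = $ix->analyzeDataFiles(keepall => 1);
print "ok=$ok; calls=@{$ix->{calls}}\n";
```

In this sketch a failed LTS run short-circuits the morphological run. The real method may behave differently.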

Miscellaneous data-file utilities

_loadDataFilename
 $filename = $ix->_loadDataFilename($table_or_name,%loadDataArgs);

Returns filename for $table_or_name according to %loadDataArgs. This should really live somewhere else.

Analysis: Files: LTS (Phonetic)

ltsAutomaton
 $lts = $ix->ltsAutomaton();

Returns $ix->{ltsFst} (a Lingua::LTS::Gfsm object) if present, otherwise returns a new Lingua::LTS::Gfsm created & loaded using $ix->{ltsFstArgs}, $ix->{ltsFstFiles}.
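The return-if-present, otherwise create-and-load behaviour is a common lazy accessor pattern. Below is a minimal sketch using stand-in classes: My::FakeFst and My::FakeIndex are hypothetical and replace Lingua::LTS::Gfsm and the index class so that the example is self-contained. Storing the new object back into $ix->{ltsFst} is an assumption; the description above only says that a new object is returned.

```perl
use strict;
use warnings;

##-- stand-in for Lingua::LTS::Gfsm (hypothetical; counts how often load() runs)
package My::FakeFst;
our $nloads = 0;
sub new  { my ($class,%args) = @_; return bless { %args }, $class; }
sub load { my ($fst,%files) = @_; ++$nloads; $fst->{files} = { %files }; return $fst; }

##-- stand-in for the index class (hypothetical)
package My::FakeIndex;
sub new { my ($class,%args) = @_; return bless { ltsFstArgs=>{}, ltsFstFiles=>{}, %args }, $class; }

## $lts = $ix->ltsAutomaton()
##  + returns the cached automaton if present, else creates, loads, and caches one
sub ltsAutomaton {
  my $ix = shift;
  return $ix->{ltsFst} if (defined $ix->{ltsFst});
  my $lts = My::FakeFst->new(%{$ix->{ltsFstArgs}});
  $lts->load(%{$ix->{ltsFstFiles}});
  return $ix->{ltsFst} = $lts;  ##-- caching here is an assumption
}

package main;
my $ix   = My::FakeIndex->new(ltsFstFiles => { fst => 'lts-grimm.gfst' });
my $lts1 = $ix->ltsAutomaton();  ##-- first call: creates and loads
my $lts2 = $ix->ltsAutomaton();  ##-- second call: returns the cached object
print "same object: ", ($lts1 == $lts2 ? "yes" : "no"), "; loads: $My::FakeFst::nloads\n";
```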

analyzeLTS
 $bool = $ix->analyzeLTS(%loadDataArgs);

Performs phonetic analysis on all orthographic types in the 'type' table. Additional %loadDataArgs:

 keepall => $bool, ##-- set to true to keep temporary (renamed) files

admin_analyzeLTS
 ($rc,@ARGV) = $ix->admin_analyzeLTS(@ARGV);

taxi-admin.perl wrapper for the analyzeLTS() method.

ltsSummary
 $undef = $ix->ltsSummary($lts,$elapsed_secs);

Prints out a summary of a completed LTS analysis run.

Analysis: Files: Morph (morphological)

morphAutomaton
 $morph = $ix->morphAutomaton();

Returns $ix->{morphFst} (a Lingua::LTS::Gfsm object) if present, otherwise returns a new Lingua::LTS::Gfsm created & loaded using $ix->{morphFstArgs}, $ix->{morphFstFiles}.

analyzeMorph
 $bool = $ix->analyzeMorph(%loadDataArgs);

Performs morphological analysis on all orthographic types in the 'type' table. Additional %loadDataArgs:

 keepall => $bool, ##-- set to true to keep temporary (renamed) files

admin_analyzeMorph
 ($rc,@ARGV) = $ix->admin_analyzeMorph(@ARGV);

taxi-admin.perl wrapper for the analyzeMorph() method.

morphSummary
 $undef = $ix->morphSummary($morph,$elapsed_secs);

Prints out a summary of a completed morphological analysis run.

Analysis: database-global

dbAnalyze

Updates the 'haspmorph', 'freq', and 'isalpha' columns of the types table using destructive SQL queries on the backend. Also populates the 'coverage' table.

insertCoverageRow
 $bool = $ix->insertCoverageRow($typClass, $whereConds);

Inserts a row into the index 'coverage' table for symbolic $typClass, identified by $whereConds.

Word Type Details

The following methods may be used to retrieve information on a single word type.

wordInfoXml
 $xmlDoc = $ix->wordInfoXml($word,%options);

%options:

 encoding => $xmlEncoding,
 client   => \%eltNameToText, ##-- particularly 'detailURL', 'contextURL', 'homeURL'
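A sketch of an options list for this method is given below. The URL values are borrowed from the defaults listed for Taxi::Mysql::URI::Grimm::WordInfo, and the '/wordinfo' path and the example word are purely illustrative assumptions.

```perl
use strict;
use warnings;

##-- hypothetical options for wordInfoXml(): only the key names are documented above
my %options = (
  encoding => 'UTF-8',
  client   => {
    homeURL    => '/index.html',  ##-- 'Home' navigation link
    contextURL => '/grimm',       ##-- base URL for context query links
    detailURL  => '/wordinfo',    ##-- illustrative path, not a documented default
  },
);

##-- with a configured index object $ix, the call would then look like:
##   my $xmlDoc  = $ix->wordInfoXml('Haus', %options);
##   my $htmlDoc = $ix->wordInfoHtml($xmlDoc);
```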

wordInfoHtml
 $htmlDoc = $ix->wordInfoHtml($xmlDoc);

Links require the XPaths "/*/client/detailURL" and "/*/client/contextURL".

Index Status Details

The following methods may be used to retrieve global information on the status and structure of the backend index.

indexStatusXml
 $ixStatusDoc = $ix->indexStatusXml(%args);

Get index status / coverage information as an XML document.

indexStatusTypeElement
 $coverageElt = $ix->indexStatusTypeElement($eltName);
 $coverageElt = $ix->indexStatusTypeElement($eltName,$typClassBasename);

Coverage XML generation utility. $eltName defaults to 'all'; $typClassBasename defaults to $eltName.

indexStatusSubTypeElement
 $typeElt = $ix->indexStatusSubTypeElement($eltName,$coverageTypClassKey);

Coverage XML generation utility. $eltName defaults to 'all'; $coverageTypClassKey defaults to $eltName.

indexStatusDbElement
 $dbStatusElt = $ix->indexStatusDbElement();
 $dbStatusElt = $ix->indexStatusDbElement($eltName);

Returns an element representing the database structure. $eltName defaults to 'db'.

indexStatusHtml
 $htmlDoc = $ix->indexStatusHtml($xmlDoc);

Returns an HTML document representing database structure and information.

Links require the XPath "/*/client/homeURL".

Taxi::Mysql::URI::Grimm::WordInfo Description

CGI-like URI class for type-wise word information.

Variable: @ISA

Inherits from Taxi::Mysql::URI.

new
 $uri = $class_or_obj->new(%options);

%options:

 encoding   => 'UTF-8',       ##-- query encoding
 homeURL    => '/index.html', ##-- URL for 'Home' navigation link
 contextURL => '/grimm',      ##-- base URL for context query links
 detailURL  => '',            ##-- base URL for wordInfo (detail) query links

parseClientRequest
 \%clientRequest = $uri->parseClientRequest($server, $localPath, $clientSocket, $clientHttpRequest);

processClientRequest
 $rc = $uri->processClientRequest($server, $clientRequest);

Taxi::Mysql::URI::Grimm::Status Description

CGI-like URI class for database-global information and coverage statistics.

Variable: @ISA

Inherits from Taxi::Mysql::URI.

new
 $uri = $class_or_obj->new(%options);

New %options: (?)

 xmlStatusOptions => {
  encoding   => 'UTF-8',
  homeURL    => '/index.html',
  contextURL => '/grimm',
  detailURL  => '',
 }

parseClientRequest
 \%clientRequest = $uri->parseClientRequest($server, $localPath, $clientSocket, $clientHttpRequest);

processClientRequest
 $rc = $uri->processClientRequest($server, $clientRequest);



ACKNOWLEDGEMENTS

Perl by Larry Wall.



AUTHOR

Bryan Jurish <moocow@ling.uni-potsdam.de>



COPYRIGHT AND LICENSE

Copyright (C) 2006 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or, at your option, any later version of Perl 5 you may have available.



SEE ALSO

perl(1), Taxi::Mysql(3perl), Taxi::Mysql::Grimm2(3perl).
