Lingua::LTS::ACPM - native Perl Aho-Corasick pattern matcher object
##========================================================================
## PRELIMINARIES

use Lingua::LTS::Trie;
use Lingua::LTS::ACPM;

##========================================================================
## Constructors etc.

$obj  = CLASS_OR_OBJ->new(%args);
$acpm = $class_or_obj->newFromTrie($lingua_lts_trie,%compile_args);

##========================================================================
## Methods: Construction

$acpm = $acpm->fromTrie($lingua_lts_trie,%args);
$acpm = $acpm->compile(%args);
$acpm = $acpm->complete(%args);

##========================================================================
## Methods: Class-Expansion

$acpm = $acpm->expand($acpm, \%classes, %args);

##========================================================================
## Methods: Lookup

$q       = $acpm->s2q($str);
\@states = $acpm->s2path($str);

##========================================================================
## Methods: Full Match

@outputs = $acpm->matches($str);   ##-- list context

##========================================================================
## Methods: Export: Gfsm

$labs    = $acpm->gfsmInputLabels();
$gfsmDFA = $acpm->gfsmAutomaton(%args);

##========================================================================
## Methods: Inherited

#... any Lingua::LTS::Trie method ...
$obj = CLASS_OR_OBJ->new(%args);
Creates and returns a new ACPM object. Output values (in $args{out}) are assumed to be hashrefs where they are defined.
Object structure / keyword %args:
##-- inherited from Lingua::LTS::Trie
goto  => \@delta,        ##-- [$qid]{$sym} => $qid_to  s.t. $qid --$sym--> $qid_to
rgoto => \%rdelta,       ##-- [$qid_to] => "$qid $sym" s.t. $qid --$sym--> $qid_to
out   => \%output,       ##-- {$qid} => $output_hashref
chars => \%chars,        ##-- {$char} => undef
cw    => $symbol_width,  ##-- scalar width of a single input symbol (default=1)
nq    => $nstates,       ##-- scalar: number of states (>= 1)
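To make the layout above concrete, here is a small standalone sketch in plain Perl that builds tables of the same shape by hand for the words "he" and "she". It does not use Lingua::LTS::Trie; the helper name build_trie and the choice of output hashrefs are assumptions made for illustration, and rgoto is shown as an array indexed by target state, following the [$qid_to] comment.

```perl
use strict;
use warnings;

# Hypothetical helper: build trie-like tables for a list of words.
sub build_trie {
    my @words = @_;
    my %t = (goto => [ {} ], rgoto => [], out => {}, chars => {}, cw => 1, nq => 1);
    for my $w (@words) {
        my $q = 0;                                   # start at the root state
        for my $c (split //, $w) {
            $t{chars}{$c} = undef;                   # pseudo-set of input symbols
            if (!defined $t{goto}[$q]{$c}) {
                my $qto = $t{nq}++;                  # allocate a new state
                $t{goto}[$q]{$c} = $qto;
                $t{goto}[$qto]   = {};
                $t{rgoto}[$qto]  = "$q $c";          # reverse arc: source state and symbol
            }
            $q = $t{goto}[$q]{$c};
        }
        $t{out}{$q} = { $w => undef };               # output is a hashref (pseudo-set)
    }
    return \%t;
}

my $trie = build_trie(qw(he she));
printf "states=%d, chars=%s\n", $trie->{nq}, join(',', sort keys %{ $trie->{chars} });
```

State 0 is the root; each distinct prefix of an input word gets its own state, and only states that end a word carry an entry in out.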
$acpm = $class_or_obj->newFromTrie($lingua_lts_trie,%compile_args);
Creates and compiles a new ACPM object from a Lingua::LTS::Trie object.
$acpm = $acpm->fromTrie($lingua_lts_trie,%args);
(Re-)initialize and compile an existing ACPM object from a Lingua::LTS::Trie.
%args are as for $acpm->compile().
$acpm = $acpm->compile(%args);
Compiles an ACPM object. This method accepts an ACPM in trie-like format, completes its {goto} key, populates its {fail} key, and updates its {out} key using the user-specified join callback (if any).
Recognized %args:
joinout => \&sub   ##-- $out1_NEW = &sub($out1_old,$out2)
                   ##-- i.e. a union operation: if undefined, no output is joined
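The role of the {fail} key and the joinout callback can be illustrated with the textbook breadth-first construction of Aho-Corasick failure links. The following is a standalone sketch under that assumption, not the module's actual code; compile_sketch and union_out are hypothetical names, and the tables for "he" and "she" are written out by hand.

```perl
use strict;
use warnings;

# Trie-like input for the patterns "he" and "she" (tables written out by hand).
my %acpm = (
    goto => [ { h => 1, s => 3 }, { e => 2 }, {}, { h => 4 }, { e => 5 }, {} ],
    out  => { 2 => { he => undef }, 5 => { she => undef } },
    nq   => 6,
);

# Hypothetical joinout callback: union of two output pseudo-sets.
sub union_out {
    my ($out1, $out2) = @_;
    return $out1 unless defined $out2;
    return +{ %{ $out1 || {} }, %$out2 };
}

# Breadth-first computation of failure links, joining outputs along the way.
sub compile_sketch {
    my ($m, %args) = @_;
    my $join = $args{joinout};
    my @fail = (0);
    my @queue;
    for my $qto (values %{ $m->{goto}[0] }) {        # depth-1 states fail to the root
        $fail[$qto] = 0;
        push @queue, $qto;
    }
    while (defined(my $q = shift @queue)) {
        for my $c (sort keys %{ $m->{goto}[$q] }) {
            my $qto = $m->{goto}[$q]{$c};
            my $f   = $fail[$q];
            # follow failure links until some state has an arc on $c (or we hit the root)
            $f = $fail[$f] while $f != 0 && !defined $m->{goto}[$f]{$c};
            $f = $m->{goto}[$f]{$c} if defined $m->{goto}[$f]{$c};
            $fail[$qto] = $f;
            # without a joinout callback, outputs are left untouched
            $m->{out}{$qto} = $join->($m->{out}{$qto}, $m->{out}{$f})
                if $join && defined $m->{out}{$f};
            push @queue, $qto;
        }
    }
    $m->{fail} = \@fail;
    return $m;
}

compile_sketch(\%acpm, joinout => \&union_out);
print "fail: @{ $acpm{fail} }\n";
print "out(5): ", join(',', sort keys %{ $acpm{out}{5} }), "\n";
```

In this sketch the state for "she" fails to the state for "he", so the union callback makes its output contain both patterns.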
$acpm = $acpm->complete(%args);
Adds {goto} links for all {fail} arcs.
Currently does not recognize any %args at all.
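One common way to add {goto} links for {fail} arcs is to make the transition table total: every state gets an arc for every known symbol, borrowed from its fail state where none exists. The sketch below shows that idea on hand-written tables for "he" and "she"; complete_sketch is a hypothetical name, and the behaviour of the real complete() may differ in detail.

```perl
use strict;
use warnings;

# Compiled tables for the patterns "he" and "she" (written out by hand).
my %acpm = (
    goto  => [ { h => 1, s => 3 }, { e => 2 }, {}, { h => 4 }, { e => 5 }, {} ],
    fail  => [ 0, 0, 0, 0, 1, 2 ],
    chars => { e => undef, h => undef, s => undef },
    nq    => 6,
);

# Hypothetical stand-in for complete(): make {goto} total over {chars}.
sub complete_sketch {
    my ($m) = @_;
    my @chars = sort keys %{ $m->{chars} };
    # Collect states in breadth-first order so that fail targets are completed first.
    my @queue = (0);
    my @order;
    while (defined(my $q = shift @queue)) {
        push @order, $q;
        push @queue, map { $m->{goto}[$q]{$_} } sort keys %{ $m->{goto}[$q] };
    }
    for my $q (@order) {
        for my $c (@chars) {
            next if defined $m->{goto}[$q]{$c};
            # the root loops on unmatched symbols; other states borrow from their fail state
            $m->{goto}[$q]{$c} = $q == 0 ? 0 : $m->{goto}[ $m->{fail}[$q] ]{$c};
        }
    }
    return $m;
}

complete_sketch(\%acpm);
print "delta(5,h) = ", $acpm{goto}[5]{h}, "\n";
```

After completion a lookup never has to consult {fail} at run time, at the cost of a larger {goto} table.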
$acpm = $acpm->expand($acpm, \%classes, %args);
Expands class-labelled arcs in $acpm to arcs labelled with the literal terminal symbols belonging to the respective classes.
%classes maps ACPM class-symbols to pseudo-sets (keys) of literal symbols.
Accepted %args:
packas  => $template_char  ##-- either 'S' or 'L': default='L'
joinout => \&sub,          ##-- as for compile()
Requires:
A completed ACPM $acpm (see the complete() method).
The Gfsm package.
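The shape of %classes and the basic idea of expansion can be sketched without Gfsm as a naive relabelling of arcs. This is only an illustration: expand_arcs, the class names V and C, and their members are hypothetical, the packas and joinout arguments are not modelled, and the sketch assumes classes do not overlap on any state (overlap would require merging target states, which is not attempted here).

```perl
use strict;
use warnings;

# Class symbols (keys) mapped to pseudo-sets of literal symbols.
my %classes = (
    V => { a => undef, e => undef },     # hypothetical "vowel" class
    C => { t => undef },                 # hypothetical "consonant" class
);

# Arcs of a tiny automaton whose labels are class symbols: 0 --C--> 1 --V--> 2
my @goto = ( { C => 1 }, { V => 2 }, {} );

# Naive expansion: replace each class-labelled arc by one arc per member symbol.
sub expand_arcs {
    my ($goto, $classes) = @_;
    my @expanded;
    for my $q (0 .. $#$goto) {
        $expanded[$q] = {};
        for my $lab (sort keys %{ $goto->[$q] }) {
            my $qto = $goto->[$q]{$lab};
            # labels that are not class symbols are kept as literals
            my @syms = exists $classes->{$lab} ? sort keys %{ $classes->{$lab} } : ($lab);
            $expanded[$q]{$_} = $qto for @syms;
        }
    }
    return \@expanded;
}

my $lit = expand_arcs(\@goto, \%classes);
print join(' ', map { "$_:" . $lit->[1]{$_} } sort keys %{ $lit->[1] }), "\n";
```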
$q = $acpm->s2q($str);
Returns the state reached after following one arc for each character in $str.
\@states = $acpm->s2path($str);
Returns the state path induced by following one arc for each character in $str.
@outputs = $acpm->matches($str); ##-- list context
$outputs = $acpm->matches($str); ##-- scalar context (ARRAY-ref)
Gathers output(s) produced by following one arc for each character in $str.
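The three lookups described above (final state, state path, gathered outputs) all amount to walking a completed {goto} table one character at a time. The standalone sketch below shows this on hand-written tables for "he" and "she". The names path_sketch, q_sketch and matches_sketch are hypothetical, and two details are assumptions rather than documented behaviour: unknown symbols reset to the root, and the match list holds one entry per input position (undef where there is no output).

```perl
use strict;
use warnings;

# Completed tables for the patterns "he" and "she" over the symbols e, h, s.
my %acpm = (
    goto => [
        { e => 0, h => 1, s => 3 },    # state 0 (root)
        { e => 2, h => 1, s => 3 },    # state 1: "h"
        { e => 0, h => 1, s => 3 },    # state 2: "he"
        { e => 0, h => 4, s => 3 },    # state 3: "s"
        { e => 5, h => 1, s => 3 },    # state 4: "sh"
        { e => 0, h => 1, s => 3 },    # state 5: "she"
    ],
    out => { 2 => { he => undef }, 5 => { he => undef, she => undef } },
);

# Hypothetical stand-in for s2path(): the state reached after each character.
sub path_sketch {
    my ($m, $str) = @_;
    my $q = 0;
    my @path = ($q);
    for my $c (split //, $str) {
        $q = $m->{goto}[$q]{$c} // 0;   # assumption: unknown symbols reset to the root
        push @path, $q;
    }
    return \@path;
}

# Hypothetical stand-in for s2q(): only the final state.
sub q_sketch { return path_sketch(@_)->[-1] }

# Hypothetical stand-in for matches(): one output (or undef) per input position.
sub matches_sketch {
    my ($m, $str) = @_;
    my @path = @{ path_sketch($m, $str) };
    shift @path;                        # drop the initial state
    my @outputs = map { $m->{out}{$_} } @path;
    return wantarray ? @outputs : \@outputs;
}

my @out = matches_sketch(\%acpm, 'ushers');
for my $i (0 .. $#out) {
    next unless defined $out[$i];
    print "position $i: ", join(',', sort keys %{ $out[$i] }), "\n";
}
```

The wantarray check mirrors the list-context and scalar-context forms of matches() shown in the signature.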
$labs = $acpm->gfsmInputLabels();
$labs = $acpm->gfsmInputLabels($labs,%args)
Returns ACPM input labels as a Gfsm::Alphabet object.
$gfsmDFA = $acpm->gfsmAutomaton(%args);
Returns ACPM as a Gfsm::Automaton object (recognizer).
Recognized %args:
fsm     => $fsm,       ##-- output automaton
ilabels => $inLabels,  ##-- default: $trie->gfsmInputLabels()
dosort  => $bool,      ##-- sort automaton? (default=yes)
Bryan Jurish <moocow@ling.uni-potsdam.de>
Copyright (C) 2006 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.