NAME

DTA::CAB::Chain - serial multi-analyzer pipeline

SYNOPSIS

 use DTA::CAB::Chain;
 
 ##========================================================================
 ## Constructors etc.
 
 $obj = CLASS_OR_OBJ->new(%args);
 @keys = $anl->typeKeys(\%opts);
 
 ##========================================================================
 ## Methods: Chain selection
 
 \@analyzers = $ach->chain();
 \@analyzers = $ach->subAnalyzers();
 
 ##========================================================================
 ## Methods: I/O
 
 $bool = $ach->ensureLoaded();
 
 ##========================================================================
 ## Methods: Analysis
 
 $bool = $ach->canAnalyze();
 $bool = $anl->enabled(\%opts);
 undef = $anl->initInfo();
 
 $doc = $ach->analyzeTypes($doc,$types,\%opts);
 $doc = $ach->analyzeSentences($doc,\%opts);
 $doc = $ach->analyzeLocal($doc,\%opts);
 $doc = $ach->analyzeClean($doc,\%opts);
 

DESCRIPTION

DTA::CAB::Chain is an abstract DTA::CAB::Analyzer subclass for implementing serial document processing "pipelines" or "cascades" in terms of a flat list of DTA::CAB::Analyzer objects.

Constructors etc.

new
 $obj = CLASS_OR_OBJ->new(%args);

%$obj, %args:

 chain => [ $a1, $a2, ..., $aN ],        ##-- default analysis chain; see also chain() method (default: empty)

typeKeys
 @keys = $anl->typeKeys(\%opts);

Returns the list of type-wise keys to be expanded for this analyzer by expandTypes(). The default implementation just concatenates typeKeys() for all sub-analyzers.
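The concatenation described above can be sketched without the real classes. MockAnalyzer and MockChain below are hypothetical stand-ins, not part of DTA::CAB, and the key names are purely illustrative:

```perl
use strict;
use warnings;

## Hypothetical stand-in for a DTA::CAB::Analyzer subclass (NOT the real class)
package MockAnalyzer;
sub new { my ($class,%args) = @_; return bless { keys=>[], %args }, $class; }
sub typeKeys { return @{ $_[0]{keys} }; }

## Hypothetical stand-in for a chain: concatenates sub-analyzer keys in chain order
package MockChain;
sub new { my ($class,%args) = @_; return bless { chain=>[], %args }, $class; }
sub typeKeys {
  my ($ach,$opts) = @_;
  return map { $_->typeKeys($opts) } @{ $ach->{chain} };
}

package main;
my $ach = MockChain->new(chain => [
  MockAnalyzer->new(keys => ['xlit']),          ##-- illustrative key names only
  MockAnalyzer->new(keys => ['morph','lts']),
]);
my @keys = $ach->typeKeys({});
print join(',', @keys), "\n";   ##-- prints "xlit,morph,lts"
```

The keys come back in chain order, so the position of an analyzer in {chain} determines where its keys appear.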

Methods: Chain selection

chain
 \@analyzers = $ach->chain();
 \@analyzers = $ach->chain(\%opts);

Gets the selected analyzer chain. The default method returns all globally enabled analyzers in $ach->{chain}.
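A minimal sketch of this selection, assuming each sub-analyzer exposes an enabled() method as documented in this page. The Mock* packages and the label field values are hypothetical and only mimic the documented behaviour:

```perl
use strict;
use warnings;

## Hypothetical stand-in for an analyzer with an on/off switch
package MockAnalyzer;
sub new { my ($class,%args) = @_; return bless { label=>'anon', enabled=>1, %args }, $class; }
sub enabled { return $_[0]{enabled}; }

## Hypothetical stand-in for a chain: chain() filters {chain} by enabled()
package MockChain;
sub new { my ($class,%args) = @_; return bless { chain=>[], %args }, $class; }
sub chain {
  my ($ach,$opts) = @_;
  return [ grep { $_->enabled($opts) } @{ $ach->{chain} } ];
}

package main;
my $ach = MockChain->new(chain => [
  MockAnalyzer->new(label=>'a1'),
  MockAnalyzer->new(label=>'a2', enabled=>0),   ##-- disabled: skipped by chain()
  MockAnalyzer->new(label=>'a3'),
]);
my @labels = map { $_->{label} } @{ $ach->chain({}) };
print "@labels\n";   ##-- prints "a1 a3"
```

Note that the sketch returns a new array-ref rather than $ach->{chain} itself, so callers cannot accidentally modify the stored chain.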

subAnalyzers
 \@analyzers = $ach->subAnalyzers();
 \@analyzers = $ach->subAnalyzers(\%opts);

Returns an array-ref of all sub-analyzers. This override just calls chain().

Methods: I/O

ensureLoaded
 $bool = $ach->ensureLoaded();
 $bool = $ach->ensureLoaded(\%opts);

Ensures analysis data is loaded from default files. This override calls $a->ensureLoaded() for each $a in $ach->subAnalyzers(\%opts).

Methods: Analysis

canAnalyze
 $bool = $ach->canAnalyze();
 $bool = $ach->canAnalyze(\%opts);

Returns true if the analyzer can perform its function (e.g. data is loaded and non-empty). This override returns true if all enabled analyzers in the chain can analyze.
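The "all enabled analyzers" rule can be illustrated with a small self-contained sketch. The Mock* packages and the loaded field are hypothetical; the real classes may decide readiness differently:

```perl
use strict;
use warnings;

## Hypothetical stand-in: {loaded} simulates whether analysis data is available
package MockAnalyzer;
sub new { my ($class,%args) = @_; return bless { enabled=>1, loaded=>1, %args }, $class; }
sub enabled    { return $_[0]{enabled}; }
sub canAnalyze { return $_[0]{loaded}; }

## Hypothetical stand-in for a chain
package MockChain;
sub new { my ($class,%args) = @_; return bless { chain=>[], %args }, $class; }
sub chain {
  my ($ach,$opts) = @_;
  return [ grep { $_->enabled($opts) } @{ $ach->{chain} } ];
}
sub canAnalyze {
  my ($ach,$opts) = @_;
  ##-- only enabled analyzers are consulted; the first failure decides
  foreach my $sub (@{ $ach->chain($opts) }) {
    return 0 if !$sub->canAnalyze($opts);
  }
  return 1;
}

package main;
## a disabled, unloaded analyzer does not block the chain ...
my $ok  = MockChain->new(chain => [ MockAnalyzer->new(), MockAnalyzer->new(enabled=>0, loaded=>0) ]);
## ... but an enabled, unloaded one does
my $bad = MockChain->new(chain => [ MockAnalyzer->new(), MockAnalyzer->new(loaded=>0) ]);
printf "ok=%d bad=%d\n", $ok->canAnalyze({}), $bad->canAnalyze({});   ##-- prints "ok=1 bad=0"
```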

enabled
 $bool = $anl->enabled(\%opts);

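Returns true if $anl->{enabled} is true and at least one sub-analyzer is enabled (i.e. $anl->{enabled} conjoined with the disjunction of enabled() over all sub-analyzers).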
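As a rough sketch of this rule, using hypothetical Mock* packages in which enabled defaults to true (an assumption of the sketch, not a statement about the real defaults):

```perl
use strict;
use warnings;

## Hypothetical stand-in for a leaf analyzer
package MockAnalyzer;
sub new { my ($class,%args) = @_; return bless { enabled=>1, %args }, $class; }
sub enabled { return $_[0]{enabled}; }

## Hypothetical stand-in for a chain: own flag AND (any sub-analyzer enabled)
package MockChain;
sub new { my ($class,%args) = @_; return bless { enabled=>1, chain=>[], %args }, $class; }
sub enabled {
  my ($ach,$opts) = @_;
  return 0 if !$ach->{enabled};
  return (grep { $_->enabled($opts) } @{ $ach->{chain} }) ? 1 : 0;
}

package main;
my $on  = MockChain->new(chain => [ MockAnalyzer->new(enabled=>0), MockAnalyzer->new() ]);
my $off = MockChain->new(chain => [ MockAnalyzer->new(enabled=>0) ]);
my $top = MockChain->new(enabled=>0, chain => [ MockAnalyzer->new() ]);
printf "%d %d %d\n", $on->enabled({}), $off->enabled({}), $top->enabled({});   ##-- prints "1 0 0"
```

Under this reading, a chain whose sub-analyzers are all disabled reports itself as disabled even when its own flag is set.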

initInfo
 undef = $anl->initInfo();

Logs initialization info. Default method reports values of {label}, enabled().

Methods: Analysis: API

analyzeTypes
 $doc = $ach->analyzeTypes($doc,$types,\%opts);

Perform type-wise analysis of all (text) types in $doc->{types}. Chain default calls $a->analyzeTypes for each analyzer $a in the chain.

analyzeSentences
 $doc = $ach->analyzeSentences($doc,\%opts);

Perform sentence-wise analysis of all sentences $doc->{body}[$si]. Chain default calls $a->analyzeSentences for each analyzer $a in the chain.

analyzeLocal
 $doc = $ach->analyzeLocal($doc,\%opts);

Perform local document-level analysis of $doc. Chain default calls $a->analyzeLocal for each analyzer $a in the chain.

analyzeClean
 $doc = $ach->analyzeClean($doc,\%opts);

Clean up any temporary data associated with $doc. Chain default calls $a->analyzeClean for each analyzer $a in the chain, then superclass Analyzer->analyzeClean.
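All four analysis methods share the same serial dispatch pattern: the chain passes $doc to each sub-analyzer in turn. The sketch below shows it for analyzeLocal only. The Mock* packages and the {log} document field are hypothetical, and threading the returned $doc into the next call is an assumption of the sketch rather than documented behaviour:

```perl
use strict;
use warnings;

## Hypothetical stand-in: records its label in $doc->{log} (an invented field)
package MockAnalyzer;
sub new { my ($class,%args) = @_; return bless { label=>'anon', %args }, $class; }
sub analyzeLocal {
  my ($anl,$doc,$opts) = @_;
  push @{ $doc->{log} }, $anl->{label};
  return $doc;
}

## Hypothetical stand-in for a chain: delegates to each sub-analyzer in order
package MockChain;
sub new { my ($class,%args) = @_; return bless { chain=>[], %args }, $class; }
sub chain { return $_[0]{chain}; }
sub analyzeLocal {
  my ($ach,$doc,$opts) = @_;
  foreach my $sub (@{ $ach->chain($opts) }) {
    $doc = $sub->analyzeLocal($doc,$opts);   ##-- assumption: result feeds the next analyzer
  }
  return $doc;
}

package main;
my $ach = MockChain->new(chain => [ map { MockAnalyzer->new(label=>$_) } qw(a1 a2 a3) ]);
my $doc = $ach->analyzeLocal({ body=>[], log=>[] }, {});
my $trace = join(' -> ', @{ $doc->{log} });
print "$trace\n";   ##-- prints "a1 -> a2 -> a3"
```

Because the calls are strictly sequential, an analyzer that depends on another's output must come later in the chain.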

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2010-2019 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

dta-cab-analyze.perl(1), DTA::CAB::Analyzer(3pm), DTA::CAB::Chain::Multi(3pm), DTA::CAB(3pm), perl(1), ...