DTA::TokWrap::Processor::standoff::xsl - DTA tokenizer wrappers: t.xml -> (s.xml, w.xml, a.xml) via XSL
use DTA::TokWrap::Processor::standoff::xsl;
$so = DTA::TokWrap::Processor::standoff::xsl->new(%opts);
$doc_or_undef = $so->sosxml($doc);
$doc_or_undef = $so->sowxml($doc);
$doc_or_undef = $so->soaxml($doc);
$doc_or_undef = $so->standoff($doc);
##-- debugging
undef = $so->dump_t2s_stylesheet($filename_or_fh);
undef = $so->dump_t2w_stylesheet($filename_or_fh);
undef = $so->dump_t2a_stylesheet($filename_or_fh);
This module is deprecated; prefer DTA::TokWrap::Processor::standoff.
DTA::TokWrap::Processor::standoff::xsl provides an object-oriented DTA::TokWrap::Processor wrapper which generates sentence-level (.s.xml), token-level (.w.xml), and token-analysis-level (.a.xml) standoff XML for DTA::TokWrap::Document objects via (slow) XSL stylesheet transformations.
Most users should use the high-level DTA::TokWrap wrapper class instead of using this module directly.
DTA::TokWrap::Processor::standoff::xsl inherits from DTA::TokWrap::Processor.
$so = $CLASS_OR_OBJECT->new(%args);
Constructor.
%args, %$so:
##-- Stylesheet: tx2sx (t.xml -> s.xml)
t2s_stylestr => $stylestr, ##-- xsl stylesheet string
t2s_stylesheet => $stylesheet, ##-- compiled xsl stylesheet
##
##-- Stylesheet: tx2wx (t.xml -> w.xml)
t2w_stylestr => $stylestr, ##-- xsl stylesheet string
t2w_stylesheet => $stylesheet, ##-- compiled xsl stylesheet
##
##-- Stylesheet: tx2ax (t.xml -> a.xml)
t2a_stylestr => $stylestr, ##-- xsl stylesheet string
t2a_stylesheet => $stylesheet, ##-- compiled xsl stylesheet
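As an illustration of the constructor arguments listed above, the following minimal sketch passes a custom sentence-level stylesheet string via t2s_stylestr. The XSL content is a hypothetical identity transform, not the stylesheet this module generates, and the sketch assumes that a user-supplied string is honored and compiled later on. The module is loaded only if it is installed.

```perl
use strict;
use warnings;

# Hypothetical minimal XSLT 1.0 stylesheet (identity transform); a real
# t.xml -> s.xml stylesheet would select sentence elements instead.
my $custom_xsl = <<'XSL';
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>
XSL

# Only the stylesheet string is given; assumption: the compiled stylesheet
# is derived from it by the processor itself.
my %opts = (t2s_stylestr => $custom_xsl);

# Load the module only if available, so the sketch degrades gracefully.
my $so;
if (eval { require DTA::TokWrap::Processor::standoff::xsl; 1 }) {
  $so = DTA::TokWrap::Processor::standoff::xsl->new(%opts);
} else {
  warn "DTA::TokWrap::Processor::standoff::xsl not installed; skipping construction\n";
}
```

Omitting the key entirely should leave the module's own default stylesheet in effect.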
%defaults = CLASS->defaults();
Static class-dependent defaults.
$so = $so->init();
Dynamic object-dependent defaults.
Low-level utility methods.
The stylesheets returned may or may not accurately reflect the documents generated by the sosxml(), sowxml(), and soaxml() methods.
$so_or_undef = $so->ensure_stylesheets();
Ensures that required XSL stylesheets have been compiled.
$xsl_str = $so->t2s_stylestr();
Returns XSL stylesheet string for generation of sentence-level standoff XML (.s.xml) from "master" tokenized XML (.t.xml).
$xsl_str = $so->t2w_stylestr();
Returns XSL stylesheet string for generation of token-level standoff XML (.w.xml) from "master" tokenized XML (.t.xml).
$xsl_str = $so->t2a_stylestr();
Returns XSL stylesheet string for generation of token-analysis-level standoff XML (.a.xml) from "master" tokenized XML (.t.xml).
$so->dump_t2s_stylesheet($filename_or_fh);
Dumps the generated sentence-level standoff stylesheet to $filename_or_fh.
$so->dump_t2w_stylesheet($filename_or_fh);
Dumps the generated token-level standoff stylesheet to $filename_or_fh.
$so->dump_t2a_stylesheet($filename_or_fh);
Dumps the generated token-analysis-level standoff stylesheet to $filename_or_fh.
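For debugging, the three dump methods above can be driven from one loop. This is a minimal sketch: the helper dump_all_stylesheets(), the output file names, and the StubDumper class are hypothetical and exist only so that the example runs without the module installed. With a real processor object, pass $so in place of the stub.

```perl
use strict;
use warnings;
use File::Spec;

# Dump all three stylesheets into $dir; returns the list of target files.
# The file names (t2s.xsl etc.) are illustrative, not a module convention.
sub dump_all_stylesheets {
  my ($so, $dir) = @_;
  my @files;
  for my $level (qw(s w a)) {
    my $method = "dump_t2${level}_stylesheet";
    my $file   = File::Spec->catfile($dir, "t2${level}.xsl");
    $so->$method($file);
    push @files, $file;
  }
  return @files;
}

# Hypothetical stand-in which only records calls instead of writing files.
package StubDumper {
  sub new { return bless { calls => [] }, shift }
  sub dump_t2s_stylesheet { push @{ $_[0]{calls} }, [ t2s => $_[1] ]; return }
  sub dump_t2w_stylesheet { push @{ $_[0]{calls} }, [ t2w => $_[1] ]; return }
  sub dump_t2a_stylesheet { push @{ $_[0]{calls} }, [ t2a => $_[1] ]; return }
}

my $stub_dumper = StubDumper->new;
my @dumped      = dump_all_stylesheets($stub_dumper, 'xsl-debug');
print "$_\n" for @dumped;
```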
$doc_or_undef = $CLASS_OR_OBJECT->standoff($doc);
$doc_or_undef = $CLASS_OR_OBJECT->sosxml($doc);
Generate sentence-level standoff for the DTA::TokWrap::Document object $doc.
Relevant %$doc keys:
xtokdoc => $xtokdoc, ##-- (input) XML-ified tokenizer output data, as XML::LibXML::Document
xtokdata => $xtokdata, ##-- (input) fallback: string source for $xtokdoc
sosdoc => $sosdoc, ##-- (output) standoff sentence data, refers to $doc->{sowfile}
##
sosxml_stamp0 => $f, ##-- (output) timestamp of operation begin
sosxml_stamp => $f, ##-- (output) timestamp of operation end
sosdoc_stamp => $f, ##-- (output) timestamp of operation end
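The begin/end timestamp keys listed above can be used to measure how long an operation took. The sketch below assumes the stamps are numeric times in seconds; the helper stamp_elapsed() and the sample values are hypothetical, and a plain hash stands in for a DTA::TokWrap::Document object.

```perl
use strict;
use warnings;

# Elapsed time for one operation, from the ${op}_stamp0 / ${op}_stamp keys.
# Returns undef if either timestamp is missing (operation not run or failed).
sub stamp_elapsed {
  my ($doc, $op) = @_;
  my ($t0, $t1) = @{$doc}{ "${op}_stamp0", "${op}_stamp" };
  return undef unless defined($t0) && defined($t1);
  return $t1 - $t0;
}

# Made-up values in a plain hash standing in for the document object.
my $stamped_doc = {
  sosxml_stamp0 => 1000.25,
  sosxml_stamp  => 1000.75,
  sosdoc_stamp  => 1000.75,
};

my $elapsed = stamp_elapsed($stamped_doc, 'sosxml');
printf "sosxml took %.2f s\n", $elapsed if defined $elapsed;
```

The same helper applies to the other operations by passing 'sowxml' or 'soaxml' as the second argument.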
$doc_or_undef = $CLASS_OR_OBJECT->sowxml($doc);
Generate token-level standoff for the DTA::TokWrap::Document object $doc.
Relevant %$doc keys:
xtokdoc => $xtokdoc, ##-- (input) XML-ified tokenizer output data, as XML::LibXML::Document
xtokdata => $xtokdata, ##-- (input) fallback: string source for $xtokdoc
sowdoc => $sowdoc, ##-- (output) standoff token data, refers to $doc->{xmlfile}
##
sowxml_stamp0 => $f, ##-- (output) timestamp of operation begin
sowxml_stamp => $f, ##-- (output) timestamp of operation end
sowdoc_stamp => $f, ##-- (output) timestamp of operation end
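Since xtokdata is documented as a fallback source for xtokdoc, a caller may want to check which input is present before invoking a generator. The following is only a sketch of such a pre-flight check; the helper tokenizer_input_source() is hypothetical and is not part of the module, and plain hashes stand in for document objects.

```perl
use strict;
use warnings;

# Report which documented input key a document offers: the parsed
# xtokdoc is preferred, the xtokdata string is the fallback.
sub tokenizer_input_source {
  my $doc = shift;
  return 'xtokdoc'  if defined $doc->{xtokdoc};
  return 'xtokdata' if defined $doc->{xtokdata};
  return undef;
}

# Plain hashes with placeholder values.
my $with_string = { xtokdata => '<sentences/>' };
my $without_any = {};

for my $candidate ($with_string, $without_any) {
  my $source = tokenizer_input_source($candidate);
  print defined($source) ? "input available via $source\n" : "no tokenizer input\n";
}
```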
$doc_or_undef = $CLASS_OR_OBJECT->soaxml($doc);
Generate token-analysis-level standoff for the DTA::TokWrap::Document object $doc.
Relevant %$doc keys:
xtokdoc => $xtokdoc, ##-- (input) XML-ified tokenizer output data, as XML::LibXML::Document
xtokdata => $xtokdata, ##-- (input) fallback: string source for $xtokdoc
soadoc => $soadoc, ##-- (output) standoff token-analysis data, refers to $doc->{sowdoc}
##
soaxml_stamp0 => $f, ##-- (output) timestamp of operation begin
soaxml_stamp => $f, ##-- (output) timestamp of operation end
soadoc_stamp => $f, ##-- (output) timestamp of operation end
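All three generators above share the $doc_or_undef return convention, so they can be chained with an early exit on the first failure. The sketch below shows one way to do that by hand; the helper run_all_standoff() and the StubGenerator class are hypothetical, and the stub exists only so that the example runs without the module installed.

```perl
use strict;
use warnings;

# Run the three documented generators in order; stop at the first one
# which returns undef, otherwise return the document.
sub run_all_standoff {
  my ($so, $doc) = @_;
  for my $method (qw(sosxml sowxml soaxml)) {
    return undef unless defined($so->$method($doc));
  }
  return $doc;
}

# Hypothetical stand-in: records calls, fails for the methods named in new().
package StubGenerator {
  sub new    { my ($class, %fail) = @_; return bless { fail => {%fail}, seen => [] }, $class }
  sub _step  { my ($so, $name, $doc) = @_; push @{ $so->{seen} }, $name; return $so->{fail}{$name} ? undef : $doc }
  sub sosxml { return $_[0]->_step(sosxml => $_[1]) }
  sub sowxml { return $_[0]->_step(sowxml => $_[1]) }
  sub soaxml { return $_[0]->_step(soaxml => $_[1]) }
}

my $good_gen = StubGenerator->new;
my $bad_gen  = StubGenerator->new(sowxml => 1);
my $some_doc = {};

print defined(run_all_standoff($good_gen, $some_doc)) ? "all levels ok\n" : "failed\n";
print defined(run_all_standoff($bad_gen,  $some_doc)) ? "all levels ok\n" : "failed\n";
```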
DTA::TokWrap::Intro(3pm), dta-tokwrap.perl(1), ...
Bryan Jurish <jurish@bbaw.de>
Copyright (C) 2009-2018 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.