DTA::CAB::Format::Text - Datum parser: verbose human-readable text
use DTA::CAB::Format::Text;

##========================================================================
## Constructors etc.
$fmt = DTA::CAB::Format::Text->new(%args);

##========================================================================
## Methods: Input
$fmt = $fmt->parseTextString($str);

##========================================================================
## Methods: Output
$fmt = $fmt->putToken($tok);
DTA::CAB::Format::Text inherits from DTA::CAB::Format via DTA::CAB::Format::TT.
This module registers the filename regex:
/\.(?i:txt|text)$/
with DTA::CAB::Format.
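To illustrate which filenames the registered pattern selects, the following minimal sketch applies the same regex to a few made-up filenames. The helper name is_text_filename and the filenames are hypothetical and are not part of DTA::CAB; only the regex itself is taken from the text above.

```perl
use strict;
use warnings;

## Regex copied from the documentation above;
## (?i:...) makes the extension match case-insensitively.
my $TEXT_RE = qr/\.(?i:txt|text)$/;

## Hypothetical helper (not part of the module):
## returns 1 if $filename matches the registered pattern, else 0.
sub is_text_filename {
  my $filename = shift;
  return $filename =~ $TEXT_RE ? 1 : 0;
}

## Made-up filenames, for illustration only.
for my $name (qw(corpus.txt CORPUS.TXT notes.text data.tt corpus.txt.gz)) {
  printf "%-14s %s\n", $name, (is_text_filename($name) ? 'matches' : 'no match');
}
```

Because the pattern is anchored at the end of the name, a compressed file such as corpus.txt.gz does not match.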
$fmt = CLASS_OR_OBJ->new(%args);
Constructor. Inherited from DTA::CAB::Format::TT.
%args, %$fmt:
##---- Input
doc => $doc,                ##-- buffered input document
##
##---- Output
#level => $formatLevel,     ##-- output formatting level: n/a
outbuf => $stringBuffer,    ##-- buffered output
##
##---- Common
encoding => $encoding,      ##-- default: 'UTF-8'
$fmt = $fmt->parseTextString($str);
Guts for document parsing: parse string $str into local document buffer $fmt->{doc}.
$fmt = $fmt->parseTTString($str);
Alias for parseTextString().
$fmt = $fmt->putToken($tok);
Override: append formatted token $tok to output buffer.
An example file in the format accepted/generated by this module is:
wie
	+[xlit] isLatin1=1 isLatinExt=1 latin1Text=wie
	+[lts] vi <0>
	+[eqpho] Wie
	+[eqpho] wie
	+[morph] wie[_ADV] <0>
	+[morph] wie[_KON] <0>
	+[morph] wie[_KOKOM] <0>
	+[morph] wie[_KOUS] <0>
	+[morph/safe] 1
oede
	+[xlit] isLatin1=1 isLatinExt=1 latin1Text=oede
	+[lts] ?2de <0>
	+[eqpho] Oede
	+[eqpho] Öde
	+[eqpho] öde
	+[morph/safe] 0
	+[rw] öde <1>
	+[rw/lts] ?2de <0>
	+[rw/morph] öde[_ADJD] <0>
	+[rw/morph] öde[_ADJA][pos][sg][nom]*[weak] <0>
	+[rw/morph] öde[_ADJA][pos][sg][nom][fem][strong_mixed] <0>
	+[rw/morph] öde[_ADJA][pos][sg][acc][fem]* <0>
	+[rw/morph] öde[_ADJA][pos][sg][acc][neut][weak] <0>
	+[rw/morph] öde[_ADJA][pos][pl][nom_acc]*[strong] <0>
	+[rw/morph] öd~en[_VVFIN][first][sg][pres][ind] <0>
	+[rw/morph] öd~en[_VVFIN][first][sg][pres][subjI] <0>
	+[rw/morph] öd~en[_VVFIN][third][sg][pres][subjI] <0>
	+[rw/morph] öd~en[_VVIMP][sg] <0>
!
	+[xlit] isLatin1=1 isLatinExt=1 latin1Text=!
	+[lts] <0>
	+[morph/safe] 1
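As a rough illustration of how such a file is organized, the sketch below groups analysis values by their bracketed key for each token. It is not the module's parser: the helper parse_text_sketch, the hash layout it returns, and the assumption that each token sits on its own line followed by lines of the form "+[key] value" are all introduced here for illustration, and the sample is a shortened excerpt of the example above.

```perl
use strict;
use warnings;

## Shortened sample in the layout assumed here: one token per line,
## followed by analysis lines of the form "+[key] value".
my $sample = join("\n",
  "wie",
  "\t+[xlit] isLatin1=1 isLatinExt=1 latin1Text=wie",
  "\t+[lts] vi <0>",
  "\t+[morph] wie[_ADV] <0>",
  "\t+[morph] wie[_KON] <0>",
  "\t+[morph/safe] 1",
  "!",
  "\t+[lts] <0>",
  "\t+[morph/safe] 1",
) . "\n";

## Hypothetical helper (NOT the module's parser):
## returns an array-ref of { text => ..., analyses => { key => [values] } }.
sub parse_text_sketch {
  my $str = shift;
  my @tokens;
  for my $line (split /\n/, $str) {
    next if $line =~ /^\s*$/;                      ## skip blank lines
    if ($line =~ /^\s*\+\[([^\]]+)\]\s?(.*)$/) {   ## analysis line: "+[key] value"
      next unless @tokens;                         ## ignore analyses before any token
      push @{ $tokens[-1]{analyses}{$1} }, $2;
    } else {                                       ## any other line starts a new token
      (my $text = $line) =~ s/^\s+|\s+$//g;
      push @tokens, { text => $text, analyses => {} };
    }
  }
  return \@tokens;
}

## Print each token with the number of values found per analysis key.
my $tokens = parse_text_sketch($sample);
for my $tok (@$tokens) {
  print "$tok->{text}\n";
  for my $key (sort keys %{ $tok->{analyses} }) {
    print "  $key: ", scalar(@{ $tok->{analyses}{$key} }), " value(s)\n";
  }
}
```

Keeping each value as an uninterpreted string avoids guessing at the meaning of suffixes such as <0>, which the text above does not define.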
Bryan Jurish <jurish@bbaw.de>
Copyright (C) 2009 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.