DTA::CAB::Format::XmlPerl - Datum parser|formatter: XML (perl-like)
use DTA::CAB::Format::XmlPerl;
##========================================================================
## Constructors etc.
$fmt = DTA::CAB::Format::XmlPerl->new(%args);
##========================================================================
## Methods: Input
$obj = $fmt->parseNode($nod);
$doc = $fmt->parseDocument();
##========================================================================
## Methods: Output
$xmlnod = $fmt->tokenNode($tok);
$xmlnod = $fmt->sentenceNode($sent);
$xmlnod = $fmt->documentNode($doc);
$body_array_node = $fmt->xmlBodyNode();
$sentence_array_node = $fmt->xmlSentenceNode();
$fmt = $fmt->putToken($tok);
$fmt = $fmt->putSentence($sent);
$fmt = $fmt->putDocument($doc);
DTA::CAB::Format::XmlPerl inherits from DTA::CAB::Format::XmlCommon.
DTA::CAB::Format::XmlPerl registers the filename regex:
/\.(?i:xml-perl|perl[\-\.]xml)$/
with DTA::CAB::Format.
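As a quick illustration of which filenames the registered regex accepts, the following sketch applies it to a few names. The helper looks_like_xmlperl() and the sample filenames are hypothetical and not part of DTA::CAB; only the regex itself is taken from the documentation above.

```perl
use strict;
use warnings;

## Filename regex exactly as documented above
my $XMLPERL_RE = qr/\.(?i:xml-perl|perl[\-\.]xml)$/;

## Hypothetical helper (not part of DTA::CAB): returns 1 if $file matches, else 0
sub looks_like_xmlperl {
  my ($file) = @_;
  return $file =~ $XMLPERL_RE ? 1 : 0;
}

## Try a few sample filenames (invented for illustration)
for my $file (qw(doc.xml-perl doc.perl.xml DOC.PERL-XML doc.xml doc.perl)) {
  printf "%-14s %s\n", $file, looks_like_xmlperl($file) ? 'match' : 'no match';
}
```

The (?i:...) group makes only the extension case-insensitive, so both lower-case and upper-case extensions match, while a plain .xml suffix does not.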
$fmt = CLASS_OR_OBJ->new(%args);
Constructor.
%args, %$fmt:
##-- input
xdoc => $xdoc, ##-- XML::LibXML::Document
xprs => $xprs, ##-- XML::LibXML parser
##
##-- output
encoding => $inputEncoding, ##-- default: UTF-8; applies to output only!
level => $level, ##-- output formatting level (default=0)
##
##-- common
#(nothing here)
@keys = $class_or_obj->noSaveKeys();
Override: returns the list of keys not to be saved. Here, returns qw(xdoc xprs).
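One way such a key list might be used is to drop transient members before an object is stored. The sketch below is an assumption about the intent, not the module's actual save logic: the hash %fmt stands in for a format object, its string values stand in for the real XML::LibXML handles, and saveable_copy() is a hypothetical helper.

```perl
use strict;
use warnings;

## Hypothetical stand-in for a format object; a real object would hold
## XML::LibXML instances under xdoc and xprs
my %fmt = (
  xdoc     => 'DOCUMENT-HANDLE',
  xprs     => 'PARSER-HANDLE',
  encoding => 'UTF-8',
  level    => 0,
);

## Keys named in the documentation for noSaveKeys()
my @noSaveKeys = qw(xdoc xprs);

## Hypothetical helper: shallow-copy a hash, omitting the given keys
sub saveable_copy {
  my ($obj, @skip) = @_;
  my %skip = map { ($_ => 1) } @skip;
  return { map { ($_ => $obj->{$_}) } grep { !$skip{$_} } keys %$obj };
}

my $saved = saveable_copy(\%fmt, @noSaveKeys);
print join(', ', sort keys %$saved), "\n";
```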
$obj = $fmt->parseNode($nod);
Returns the perl object represented by the XML::LibXML::Node $nod.
$doc = $fmt->parseDocument();
Override: parses the buffered XML::LibXML::Document in $fmt->{xdoc}.
$xmlnod = $fmt->tokenNode($tok);
Returns an XML::LibXML::Node representing the token $tok.
$xmlnod = $fmt->sentenceNode($sent);
Returns an XML::LibXML::Node representing the sentence $sent.
$xmlnod = $fmt->documentNode($doc);
Returns an XML::LibXML::Node representing the document $doc.
$body_array_node = $fmt->xmlBodyNode();
Gets or creates the buffered array node representing the document body.
$sentence_array_node = $fmt->xmlSentenceNode();
Gets or creates the buffered array node representing the current document sentence.
$fmt = $fmt->putToken($tok);
Override: writes token $tok to the output buffer.
$fmt = $fmt->putSentence($sent);
Override: writes sentence $sent to the output buffer.
$fmt = $fmt->putDocument($doc);
Override: writes document $doc to the output buffer.
An example file in the format accepted/generated by this module is:
<?xml version="1.0" encoding="UTF-8"?>
<m ref="DTA::CAB::Document">
<l key="body">
<m>
<a key="lang">de</a>
<l key="tokens">
<m>
<m key="moot">
<a key="lemma">wie</a>
<a key="word">wie</a>
<a key="tag">PWAV</a>
</m>
<l key="lang">
<a>de</a>
</l>
<a key="hasmorph">1</a>
<a key="msafe">1</a>
<a key="text">wie</a>
<a key="exlex">wie</a>
<a key="errid">ec</a>
<m key="xlit">
<a key="latin1Text">wie</a>
<a key="isLatinExt">1</a>
<a key="isLatin1">1</a>
</m>
</m>
<m>
<a key="text">oede</a>
<a key="msafe">0</a>
<m key="moot">
<a key="word">öde</a>
<a key="tag">ADJD</a>
<a key="lemma">öde</a>
</m>
<m key="xlit">
<a key="latin1Text">oede</a>
<a key="isLatin1">1</a>
<a key="isLatinExt">1</a>
</m>
</m>
<m>
<m key="moot">
<a key="lemma">!</a>
<a key="tag">$.</a>
<a key="word">!</a>
</m>
<a key="msafe">1</a>
<a key="text">!</a>
<a key="exlex">!</a>
<a key="errid">ec</a>
<m key="xlit">
<a key="latin1Text">!</a>
<a key="isLatin1">1</a>
<a key="isLatinExt">1</a>
</m>
</m>
</l>
</m>
</l>
</m>
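Judging from the example above, <m> elements appear to correspond to Perl hashes, <l> elements to arrays, and <a> elements to plain scalars, with the key attribute naming the hash key. The following sketch emits that shape from a nested Perl structure without any XML library. It is a hypothetical illustration rather than the module's implementation: to_xmlperl() and xml_escape() are invented names, keys are sorted only to make the output deterministic, and the ref attribute seen on the root element is not handled.

```perl
use strict;
use warnings;

## Escape XML special characters in text content and attribute values
sub xml_escape {
  my ($s) = @_;
  $s =~ s/&/&amp;/g;
  $s =~ s/</&lt;/g;
  $s =~ s/>/&gt;/g;
  $s =~ s/"/&quot;/g;
  return $s;
}

## Hypothetical serializer: HASH -> <m>, ARRAY -> <l>, anything else -> <a>
sub to_xmlperl {
  my ($val, $key) = @_;
  my $attr = defined($key) ? ' key="' . xml_escape($key) . '"' : '';
  if (ref($val) eq 'HASH') {
    ## Sorted keys are an assumption made here for reproducible output
    return "<m$attr>"
      . join('', map { to_xmlperl($val->{$_}, $_) } sort keys %$val)
      . "</m>";
  }
  if (ref($val) eq 'ARRAY') {
    ## List items carry no key attribute
    return "<l$attr>" . join('', map { to_xmlperl($_) } @$val) . "</l>";
  }
  return "<a$attr>" . xml_escape($val) . "</a>";
}

## A reduced version of the first token from the example file
my $tok = {
  text => 'wie',
  lang => ['de'],
  moot => { word => 'wie', tag => 'PWAV', lemma => 'wie' },
};
print to_xmlperl($tok), "\n";
```

The output is a single unindented line, which corresponds loosely to an output formatting level of 0.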
Bryan Jurish <moocow@cpan.org>
Copyright (C) 2009-2019 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.