DTA::CAB::Format::XmlNative - Datum parser|formatter: XML (native)
use DTA::CAB::Format::XmlNative;
##========================================================================
## Methods
$fmt = DTA::CAB::Format::XmlNative->new(%args);
$obj = $fmt->parseNode($nod);
$doc = $fmt->parseDocument();
$fmt = $fmt->putDocument($doc);
##========================================================================
## Utilities
$nod = $fmt->xmlNode($thingy,$name);
$val = PACKAGE::_pushValue(\%hash, $key, $val); ##-- $hash{$key}=$val;
DTA::CAB::Format::XmlNative is a DTA::CAB::Format subclass for document I/O using a native XML dialect. It inherits from DTA::CAB::Format::XmlCommon.
$fmt = CLASS_OR_OBJ->new(%args);
%$fmt, %args:
##-- input: inherited
xdoc => $xdoc, ##-- XML::LibXML::Document
xprs => $xprs, ##-- XML::LibXML parser
##
##-- input: new
parseXmlData => $bool, ##-- if specified and true, _xmldata key will be populated by parseNode() (default=unspecified:true)
##
##-- input+output: new
xml2key => \%xml2key, ##-- maps xml keys to internal keys
ignoreKeys => \%key2undef, ##-- keys to ignore for i/o
##
##-- output: new
arrayEltKeys => \%akey2ekey, ##-- maps array keys to element keys for output
arrayImplicitKeys => \%akey2undef, ##-- pseudo-hash of array keys NOT mapped to explicit elements
key2xml => \%key2xml, ##-- maps keys to XML-safe names
##
##-- output: inherited
encoding => $inputEncoding, ##-- default: UTF-8; applies to output only!
level => $level, ##-- output formatting level (default=0)
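The key2xml and xml2key options above describe a renaming between internal keys and XML names. The following is only a minimal sketch of that idea in plain Perl, not the module's implementation: the map entries and the helper map_keys() are hypothetical, and it assumes the two maps are inverses of each other, which this documentation does not state.

```perl
use strict;
use warnings;

##-- hypothetical key2xml map: internal key => XML-safe name (not the module's defaults)
my %key2xml = (
  'text'   => 't',
  'my key' => 'my_key',
);

##-- derive the inverse map; assumes every XML name is unique
my %xml2key = reverse %key2xml;

##-- map_keys(\%map, \%data): hypothetical helper
##   returns a copy of %data with keys renamed via %map; unmapped keys pass through
sub map_keys {
  my ($map, $data) = @_;
  my %out;
  for my $k (keys %$data) {
    my $k2 = exists $map->{$k} ? $map->{$k} : $k;
    $out{$k2} = $data->{$k};
  }
  return \%out;
}

my $tok  = { text => 'wie', lang => 'de' };
my $xml  = map_keys(\%key2xml, $tok);   ##-- keys as they would appear in XML
my $back = map_keys(\%xml2key, $xml);   ##-- keys restored to internal names
```

Keeping both directions as explicit hashes makes lookups cheap on input and output alike, at the cost of having to keep the two maps consistent.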
$doc = $fmt->parseDocument();
Parses buffered XML::LibXML::Document into a buffered DTA::CAB::Document.
Returns the "official" short name for this format, here just 'xml'.
$fmt = $fmt->putDocument($doc);
Formats the DTA::CAB::Document $doc as XML to the in-memory buffer $fmt->{xdoc}.
$obj = $fmt->parseNode($nod);
Returns the Perl object represented by the XML::LibXML::Node $nod, attempting to map the XML to a Perl data structure "sensibly".
DTA::CAB::Datum nodes (document, sentence, token) get some additional baggage:
_xmldata => $data, ##-- unparsed content (raw string)
$nod = $fmt->xmlNode($thingy,$name);
Returns an XML node for the Perl scalar $thingy, using $name as its key. Used when constructing XML output documents.
$val = PACKAGE::_pushValue(\%hash, $key, $val); ##-- $hash{$key}=$val;
$val = PACKAGE::_pushValue(\@array, $key, $val); ##-- push(@array,$val)
Convenience routine used by parseNode() when constructing perl data structures from XML input.
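The two call forms above differ only in the container type. As a rough illustration of those documented semantics, here is a standalone sketch; push_value() is a hypothetical stand-in written for this example, not the module's own _pushValue(), and its error handling is an assumption.

```perl
use strict;
use warnings;

##-- push_value($container, $key, $val): hypothetical stand-in for _pushValue()
sub push_value {
  my ($container, $key, $val) = @_;
  if (ref($container) eq 'HASH') {
    $container->{$key} = $val;   ##-- $hash{$key}=$val
  } elsif (ref($container) eq 'ARRAY') {
    push(@$container, $val);     ##-- push(@array,$val); $key is not used here
  } else {
    die "push_value(): container must be a HASH or ARRAY reference";
  }
  return $val;
}

my (%tok, @toks);
push_value(\%tok,  'lemma', 'wie');   ##-- stores under the given key
push_value(\@toks, 'w',     \%tok);   ##-- appends; the key is ignored
```

A single entry point like this lets a recursive parser add a child value without first checking whether its parent is a hash or an array.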
An example file in the format accepted/generated by this module is:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<s lang="de">
<w exlex="wie" hasmorph="1" msafe="1" errid="ec" t="wie" lang="de">
<moot word="wie" lemma="wie" tag="PWAV"/>
<xlit latin1Text="wie" isLatin1="1" isLatinExt="1"/>
</w>
<w msafe="0" t="oede">
<moot tag="ADJD" lemma="öde" word="öde"/>
<xlit isLatinExt="1" isLatin1="1" latin1Text="oede"/>
</w>
<w msafe="1" errid="ec" t="!" exlex="!">
<moot lemma="!" word="!" tag="$."/>
<xlit isLatinExt="1" isLatin1="1" latin1Text="!"/>
</w>
</s>
</doc>
Bryan Jurish <moocow@cpan.org>
Copyright (C) 2010-2019 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.
dta-cab-convert.perl(1), DTA::CAB::Format::XmlCommon(3pm), DTA::CAB::Format::Builtin(3pm), DTA::CAB::Format(3pm), DTA::CAB(3pm), perl(1), ...