NAME

DTA::CAB::Format::XmlPerl - Datum parser|formatter: XML (perl-like)

SYNOPSIS

 use DTA::CAB::Format::XmlPerl;
 
 ##========================================================================
 ## Constructors etc.
 
 $fmt = DTA::CAB::Format::XmlPerl->new(%args);
 
 ##========================================================================
 ## Methods: Input
 
 $obj = $fmt->parseNode($nod);
 $doc = $fmt->parseDocument();
 
 ##========================================================================
 ## Methods: Output
 
 $xmlnod = $fmt->tokenNode($tok);
 $xmlnod = $fmt->sentenceNode($sent);
 $xmlnod = $fmt->documentNode($doc);
 $body_array_node = $fmt->xmlBodyNode();
 $sentence_array_node = $fmt->xmlSentenceNode();
 $fmt = $fmt->putToken($tok);
 $fmt = $fmt->putSentence($sent);
 $fmt = $fmt->putDocument($doc);

DESCRIPTION

Globals

Variable: @ISA

DTA::CAB::Format::XmlPerl inherits from DTA::CAB::Format::XmlCommon.

Filenames

DTA::CAB::Format::XmlPerl registers the filename regex:

 /\.(?i:xml-perl|perl[\-\.]xml)$/

with DTA::CAB::Format.
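
As a quick check of which names the pattern accepts, the following sketch applies the regex to a handful of filenames. The filenames are hypothetical and chosen only to exercise the pattern; the regex itself is copied from above.

```perl
use strict;
use warnings;

# Regex copied verbatim from the documentation above.
my $re = qr/\.(?i:xml-perl|perl[\-\.]xml)$/;

# Hypothetical filenames, used only for illustration.
my @names = qw(doc.xml-perl doc.perl.xml doc.perl-xml DOC.PERL.XML doc.xml doc.perlxml);

# Map each filename to 1 (matches) or 0 (does not match).
my %matches = map { ($_ => ($_ =~ $re ? 1 : 0)) } @names;

printf "%-14s %s\n", $_, ($matches{$_} ? 'matches' : 'no match') for @names;
```

The first four names match (the suffix is case-insensitive, and either "-" or "." may separate "perl" and "xml"); a plain ".xml" suffix and "perlxml" without a separator do not.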

Constructors etc.

new
 $fmt = CLASS_OR_OBJ->new(%args);

Constructor.

%args, %$fmt:

 ##-- input
 xdoc => $xdoc,                          ##-- XML::LibXML::Document
 xprs => $xprs,                          ##-- XML::LibXML parser
 ##
 ##-- output
 encoding => $outputEncoding,            ##-- default: UTF-8; applies to output only!
 level => $level,                        ##-- output formatting level (default=0)
 ##
 ##-- common
 #(nothing here)

Methods: Persistence

noSaveKeys
 @keys = $class_or_obj->noSaveKeys();

Override: returns list of keys not to be saved. Here, returns qw(xdoc xprs).

Methods: Input

parseNode
 $obj = $fmt->parseNode($nod);

Returns the perl object represented by the XML::LibXML::Node $nod.

parseDocument
 $doc = $fmt->parseDocument();

Override: parses the buffered XML::LibXML::Document in $fmt->{xdoc}.

Methods: Output

tokenNode
 $xmlnod = $fmt->tokenNode($tok);

Returns an XML::LibXML::Node representing the token $tok.

sentenceNode
 $xmlnod = $fmt->sentenceNode($sent);

Returns an XML::LibXML::Node representing the sentence $sent.

documentNode
 $xmlnod = $fmt->documentNode($doc);

Returns an XML::LibXML::Node representing the document $doc.
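
To give a rough idea of the kind of node these methods return, here is a minimal sketch that turns a plain Perl data structure into XML::LibXML nodes. It is not the module's implementation: the helper encode_value() is hypothetical, the token data is made up, and the mapping (hash to "m", array to "l", scalar to "a", with a "key" attribute for hash entries) is an assumption inferred from the EXAMPLE section.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);
use XML::LibXML;

# Hypothetical helper: encode a perl value as an XML::LibXML::Element.
# Assumed mapping (inferred from the EXAMPLE section, not from the module source):
#   hash -> <m>, array -> <l>, anything else -> <a> with text content.
sub encode_value {
  my ($xdoc, $val, $key) = @_;
  my $type = reftype($val) // '';
  my $nod  = $xdoc->createElement($type eq 'HASH' ? 'm' : $type eq 'ARRAY' ? 'l' : 'a');
  $nod->setAttribute('key', $key) if defined $key;   # only set for entries of a hash
  if ($type eq 'HASH') {
    # sorted keys, so that the output is deterministic
    $nod->appendChild(encode_value($xdoc, $val->{$_}, $_)) for sort keys %$val;
  }
  elsif ($type eq 'ARRAY') {
    $nod->appendChild(encode_value($xdoc, $_)) for @$val;
  }
  else {
    $nod->appendTextNode($val);
  }
  return $nod;
}

my $xdoc = XML::LibXML::Document->new('1.0', 'UTF-8');
my $tok  = { text => 'wie', moot => { tag => 'PWAV', lemma => 'wie' } };  # made-up token
my $node = encode_value($xdoc, $tok);
$xdoc->setDocumentElement($node);
print $xdoc->toString(1);   # argument 1 requests indented output
```

The sketch ignores the "ref" attribute seen on the document element in the EXAMPLE section; the real methods may add it and may order keys differently.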

xmlBodyNode
 $body_array_node = $fmt->xmlBodyNode();

Gets or creates the buffered array node representing the document body.

xmlSentenceNode
 $sentence_array_node = $fmt->xmlSentenceNode();

Gets or creates the buffered array node representing the current document sentence.

putToken
 $fmt = $fmt->putToken($tok);

Override: write token $tok to output buffer.

putSentence
 $fmt = $fmt->putSentence($sent);

Override: write sentence $sent to output buffer.

putDocument
 $fmt = $fmt->putDocument($doc);

Override: write document $doc to output buffer.

EXAMPLE

An example file in the format accepted/generated by this module is:

 <?xml version="1.0" encoding="UTF-8"?>
 <m ref="DTA::CAB::Document">
   <l key="body">
     <m>
       <a key="lang">de</a>
       <l key="tokens">
         <m>
           <m key="moot">
             <a key="lemma">wie</a>
             <a key="word">wie</a>
             <a key="tag">PWAV</a>
           </m>
           <l key="lang">
             <a>de</a>
           </l>
           <a key="hasmorph">1</a>
           <a key="msafe">1</a>
           <a key="text">wie</a>
           <a key="exlex">wie</a>
           <a key="errid">ec</a>
           <m key="xlit">
             <a key="latin1Text">wie</a>
             <a key="isLatinExt">1</a>
             <a key="isLatin1">1</a>
           </m>
         </m>
         <m>
           <a key="text">oede</a>
           <a key="msafe">0</a>
           <m key="moot">
             <a key="word">öde</a>
             <a key="tag">ADJD</a>
             <a key="lemma">öde</a>
           </m>
           <m key="xlit">
             <a key="latin1Text">oede</a>
             <a key="isLatin1">1</a>
             <a key="isLatinExt">1</a>
           </m>
         </m>
         <m>
           <m key="moot">
             <a key="lemma">!</a>
             <a key="tag">$.</a>
             <a key="word">!</a>
           </m>
           <a key="msafe">1</a>
           <a key="text">!</a>
           <a key="exlex">!</a>
           <a key="errid">ec</a>
           <m key="xlit">
             <a key="latin1Text">!</a>
             <a key="isLatin1">1</a>
             <a key="isLatinExt">1</a>
           </m>
         </m>
       </l>
     </m>
   </l>
 </m>
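
The structure above can be read back into plain Perl data with a few lines of XML::LibXML. The following is only a sketch under stated assumptions, not the module's parseNode(): the helper decode_node() is hypothetical, and the element semantics ("m" is a hash, "l" is an array, "a" is a scalar, "key" names a hash entry) are inferred from the example. The "ref" attribute is ignored here.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: decode an element of the format shown above.
# Assumed semantics (inferred from the example, not from the module source):
#   <m> = hash, <l> = array, <a> = scalar; 'key' names an entry of the enclosing hash.
sub decode_node {
  my $nod  = shift;
  my $name = $nod->nodeName;
  my @kids = $nod->findnodes('*');   # child elements only, skips whitespace text nodes
  if ($name eq 'a') { return $nod->textContent; }
  if ($name eq 'l') { return [ map { decode_node($_) } @kids ]; }
  if ($name eq 'm') { return { map { ($_->getAttribute('key') => decode_node($_)) } @kids }; }
  die "unexpected element <$name>";
}

# Abbreviated version of the example document.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<m ref="DTA::CAB::Document">
  <l key="body">
    <m>
      <a key="lang">de</a>
      <l key="tokens">
        <m><a key="text">wie</a><m key="moot"><a key="tag">PWAV</a></m></m>
      </l>
    </m>
  </l>
</m>
XML

my $data = decode_node(XML::LibXML->load_xml(string => $xml)->documentElement);
print $data->{body}[0]{tokens}[0]{moot}{tag}, "\n";   # prints "PWAV"
```

Under these assumptions, a document decodes to a hash whose "body" entry is an array of sentences, each holding a "tokens" array of token hashes.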

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2009-2019 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.
