DTA::CAB::Format::Raw - Document parser/formatter: raw untokenized text (dispatch)
use DTA::CAB::Format::Raw;
##========================================================================
## Methods
$class = $DTA::CAB::Format::Raw::DEFAULT_SUBCLASS;
$fmt = DTA::CAB::Format::Raw->new(%args);
DTA::CAB::Format::Raw is an input-only DTA::CAB::Format subclass for untokenized raw string input. This class really just acts as a wrapper for the actual default tokenizing class, $DTA::CAB::Format::Raw::DEFAULT_SUBCLASS.
As an output format, DTA::CAB::Format::Raw writes canonical surface forms to the output stream using DTA::CAB::Format::Raw::Base. Each output sentence is terminated by a single newline ("\n"), and output tokens are separated by a single space character.
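The output convention described above can be illustrated with a minimal stand-alone sketch. It does not call DTA::CAB::Format::Raw::Base; the function name format_raw_sketch and the array-of-arrays input are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

## format_raw_sketch(\@sentences): hypothetical helper, not part of DTA::CAB.
## Each sentence is assumed to be an array-ref of surface-form strings.
sub format_raw_sketch {
  my ($sentences) = @_;
  ##-- join tokens with a single space, terminate each sentence with "\n"
  return join('', map { join(' ', @$_) . "\n" } @$sentences);
}

my $out = format_raw_sketch([ [qw(Dies ist ein Satz .)], [qw(Noch einer .)] ]);
print $out;
```

Each inner array yields one output line, so the result can be split back into sentences on "\n" and into tokens on single spaces.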
$DTA::CAB::Format::Raw::DEFAULT_SUBCLASS is the default tokenizing subclass which this class wraps. It defaults to the value of the environment variable DTA_CAB_FORMAT_RAW_DEFAULT_SUBCLASS if set, or to "DTA::CAB::Format::Raw::Waste" otherwise.
Prior to v1.92, always defaulted to "DTA::CAB::Format::Raw::HTTP".
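The fallback rule for the default subclass might be sketched as follows. This is not the module's own code: default_subclass_sketch is a hypothetical helper, and treating "set" as "defined" is an assumption (the module may, for example, also ignore an empty value).

```perl
use strict;
use warnings;

## default_subclass_sketch(\%env): hypothetical helper, not part of DTA::CAB.
sub default_subclass_sketch {
  my ($env) = @_;
  ##-- assumption: "set" means "defined"; the module's own check may differ
  return $env->{DTA_CAB_FORMAT_RAW_DEFAULT_SUBCLASS}
    // 'DTA::CAB::Format::Raw::Waste';
}

print default_subclass_sketch(\%ENV), "\n"; ##-- depends on the caller's environment
print default_subclass_sketch({}), "\n";    ##-- empty environment: the Waste fallback
```

Passing the environment as an argument rather than reading %ENV directly keeps the sketch easy to exercise with different settings.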
$fmt = CLASS_OR_OBJ->new(%args);
%args:
##-- Input
class => $class, ##-- actual subclass to generate
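The dispatching behavior of new() can be pictured with a self-contained sketch. The packages My::Raw and My::Raw::Tokenizer and the label argument are hypothetical stand-ins; the sketch assumes that an explicit class argument takes precedence over the default subclass and is not passed on to the generated object, which may differ from the actual implementation.

```perl
use strict;
use warnings;

##-- hypothetical stand-in for a tokenizing subclass
package My::Raw::Tokenizer;
sub new {
  my ($that, %args) = @_;
  return bless({ %args }, ref($that) || $that);
}

##-- hypothetical stand-in for the dispatching wrapper
package My::Raw;
our $DEFAULT_SUBCLASS = 'My::Raw::Tokenizer';
sub new {
  my ($that, %args) = @_;
  ##-- assumption: an explicit 'class' argument overrides the default
  my $class = delete($args{class}) // $DEFAULT_SUBCLASS;
  return $class->new(%args);
}

package main;
my $fmt = My::Raw->new(label => 'demo');
print ref($fmt), "\n"; ##-- the object belongs to the subclass, not the wrapper
```

In this arrangement the wrapper never instantiates itself, so callers always receive an object of the concrete tokenizing class.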
Bryan Jurish <moocow@cpan.org>
Copyright (C) 2010-2019 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.
dta-cab-convert.perl(1), DTA::CAB::Format::Raw::HTTP(3pm), DTA::CAB::Format::Raw::Waste(3pm), DTA::CAB::Format::Raw::Perl(3pm), DTA::CAB::Format::Builtin(3pm), DTA::CAB::Format(3pm), DTA::CAB(3pm), perl(1), ...