dtatw-cids2local.perl - convert //c/@xml:id attributes to page-local encoding
dtatw-cids2local.perl [OPTIONS] [XMLFILE(s)...]
Options:
-help # this help message
-output FILE # specify output file (default='-' (STDOUT))
-trace TRACEFILE # send trace output to file (default=none)
-xmlns, -noxmlns # do/don't prepend 'xml:' to output id attributes (default=don't)
Not yet written.
Converts //c/@xml:id attributes to page-local encoding.
New IDs are computed page-locally, where the page element associated with each //c is given by the XPath preceding::pb[1], abbreviated hereafter as $pb. The associated $pb supplies a (unique) prefix $pbid to all //c elements on the given page. The prefix is determined according to the following rules:
If $pb has a @facs attribute, $pbid is derived from its value by removing any prefix matching the regex /#?f?0*/ and prepending a "p", e.g. the following //pb elements all map to $pbid="p42":
<pb facs="42"/>
<pb facs="#42"/>
<pb facs="#f0042"/>
<pb facs="f00042"/>
<pb facs="000042"/>
Otherwise, $pbid is "pz" followed by the value of a global counter over all //pb elements, including those which have a @facs attribute. The counter is initialized to "0" (zero) before the initial //pb, e.g.:
<!-- before first page: $pbid="pz0" -->
<pb /> <!-- first page, no @facs: $pbid="pz1" -->
<pb /> <!-- second page, no @facs: $pbid="pz2" -->
<pb facs="42"/> <!-- third page, with @facs: $pbid="p42" -->
<pb /> <!-- fourth page: $pbid="pz4" -->
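The two prefix rules above can be sketched in a few lines of Python. This is an illustration only, not the script's actual Perl implementation: the function names pbid_for and page_prefixes are hypothetical, the regex is anchored with "^" so that it removes a prefix only, and a missing @facs is represented by None. How the script treats a @facs value consisting only of zeros is not stated here; the sketch would return a bare "p" in that case.

```python
import re

def pbid_for(facs, pb_count):
    """Return the page prefix for one <pb>.

    facs     -- value of the @facs attribute, or None if absent (assumption)
    pb_count -- 1-based position of this <pb> among all <pb> elements
    """
    if facs is not None:
        # Strip a leading "#", an optional "f", and any leading zeros.
        return "p" + re.sub(r"^#?f?0*", "", facs)
    # No @facs: fall back to the global <pb> counter.
    return "pz%d" % pb_count

def page_prefixes(facs_values):
    """Map a document-order list of @facs values (None = no @facs) to prefixes."""
    return [pbid_for(facs, n) for n, facs in enumerate(facs_values, start=1)]

# The four-page example from above: two pages without @facs, one with, one without.
print(page_prefixes([None, None, "42", None]))
# prints ['pz1', 'pz2', 'p42', 'pz4']
```

Note that the counter advances on every //pb, which is why the fourth page receives "pz4" rather than "pz3".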
Finally, each //c/@xml:id attribute is set to a value of the form ${pbid}.c${ci}, where $ci is a page-local counter which starts at 1 on each page, e.g.:
<!-- before first page -->
<c xml:id="pz0.c1"/>
<c xml:id="pz0.c2"/>
<!-- ... -->
<!-- first page, with @facs -->
<pb facs="#f0042"/>
<c xml:id="p42.c1"/>
<c xml:id="p42.c2"/>
<!-- ... -->
<!-- second page, no @facs -->
<pb/>
<c xml:id="pz2.c1"/>
<c xml:id="pz2.c2"/>
<!-- ... -->
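Putting the pieces together, the whole renumbering can be sketched as a single pass over the document in document order. The Python below is a hedged illustration under several assumptions, not the script itself: it uses the standard xml.etree.ElementTree module, assumes the pb and c elements carry no XML namespace, always writes xml:id (ignoring the -xmlns / -noxmlns option), and relies on pb being an empty element so that "the most recent pb seen so far" coincides with preceding::pb[1]. The name localize_c_ids is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree spells the xml:id attribute with the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def localize_c_ids(root):
    """Rewrite every <c> xml:id below root to the form ${pbid}.c${ci}."""
    pb_count = 0    # global counter over all <pb> elements
    pbid = "pz0"    # prefix in effect before the first <pb>
    ci = 0          # page-local <c> counter
    for elem in root.iter():  # iter() visits elements in document order
        if elem.tag == "pb":
            pb_count += 1
            facs = elem.get("facs")
            if facs is not None:
                pbid = "p" + re.sub(r"^#?f?0*", "", facs)
            else:
                pbid = "pz%d" % pb_count
            ci = 0  # a new page restarts the <c> counter
        elif elem.tag == "c":
            ci += 1
            elem.set(XML_ID, "%s.c%d" % (pbid, ci))
    return root

# Same layout as the example above: two <c> before any page, then two pages.
doc = ET.fromstring(
    '<text><c/><c/><pb facs="#f0042"/><c/><c/><pb/><c/><c/></text>'
)
localize_c_ids(doc)
print([c.get(XML_ID) for c in doc.iter("c")])
# prints ['pz0.c1', 'pz0.c2', 'p42.c1', 'p42.c2', 'pz2.c1', 'pz2.c2']
```

A tree-based pass keeps the sketch short; for large input files a streaming parser would likely be preferable, since only the current prefix and two counters need to be kept in memory.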
dta-tokwrap.perl(1), dtatw-add-c.perl(1), dtatw-rm-c.perl(1), ...
Bryan Jurish <jurish@bbaw.de>