Getting a handle on <bookmap> CSS selectors
Posted: Fri May 10, 2019 5:29 pm
Hi everyone,
Some of our books use <part> volume dividers in the bookmaps and some do not. It's been a bit challenging to write CSS selectors that work in both cases.
To see more easily which attributes are available, I wrote a small Perl script, show_html_structure.pl, that processes the temporary merged HTML file and prints a summary of its structure:
Before you use the script, you must install the XML::Twig and HTML::TreeBuilder Perl modules from CPAN, which you can do with:
Use the script by running it on the merged HTML file in the out/pdf-css-html5/ directory. For example, a test book with the following ditamap:
results in this output:
You can comment or uncomment lines in the Perl script to show or hide various things. For example, you could keep the titles so you can see which classes they can be selected with:
Note that I'm using a beta 21.1 build of PDF Chemistry provided to resolve some issues. It defines a new @topicrefclass attribute on articles that preserves the topicref context of articles that come from <bookmap> constructs.
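As a minimal sketch of how @topicrefclass could be used, the rules below select articles by their bookmap role instead of by nesting depth, so they should apply whether or not a <part> wraps the chapters. The attribute values are copied from the sample output in this post; the properties are placeholders chosen only for illustration, and the attribute itself is assumed to be present only in builds that define it.

```css
/* Sketch only: attribute values come from the sample output in this post;
   the properties are illustrative placeholders. */

/* Any chapter, regardless of how deeply it is nested (nested0, nested1, ...). */
article[topicrefclass~="bookmap/chapter"] {
  page-break-before: always;
}

/* Front matter pieces, selected by their bookmap role. */
article[topicrefclass~="bookmap/notices"],
article[topicrefclass~="bookmap/preface"] {
  font-style: italic;
}

/* Appendixes, including the glossary that is referenced through <appendix>. */
article[topicrefclass~="bookmap/appendix"] {
  page-break-before: always;
}
```

The `~=` operator matches one whitespace-separated token, so it is not affected by the leading "- map/topicref" token or the trailing space in the attribute value.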
Code (show_html_structure.pl):
#!/usr/bin/perl
use warnings;
use strict;
use XML::Twig;
my $twig = XML::Twig->new->parsefile_html(shift);
$twig->root->first_child('head')->delete;
$_->cut_children for $twig->get_xpath('//div[@class = "- topic/body body"]'); # empty topic body contents
$_->set_att('HERE', 1) for $twig->get_xpath('//div[@class =~ /glossgroup/]'); # flag glossgroup divs with a HERE="1" attribute
$_->cut_children('article') for $twig->get_xpath('//article[@class =~ /glossgroup/]'); # delete glossary entries (subtopics)
$_->delete for $twig->descendants('h1|h2|h3|h4|h5|h6|h7|h8|h9'); # delete titles
$_->delete for $twig->root->get_xpath('//div[@class =~ /wh_related_links/]'); # delete related-links blocks
$_->delete for $twig->root->get_xpath('//div[@class =~ /topic.body /]'); # delete empty topic body placeholders
$twig->root->strip_att($_) for qw(id nd:nd-id oid xmlns:nd aria-labelledby break-before cascade ditaarch:ditaarchversion lang xml:lang xmlns:ditaarch); # delete attributes of non-interest
$twig->print(pretty_print => 'indented');
Code (installing the Perl modules):
sudo cpan -i App::cpanminus  # only needed if cpanm is not already installed
sudo cpanm XML::Twig HTML::TreeBuilder
Code (test ditamap):
<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="urn:oasis:names:tc:dita:rng:bookmap.rng" schematypens="http://relaxng.org/ns/structure/1.0"?>
<bookmap>
  <title>Sample Book</title>
  <frontmatter>
    <notices href="notices.dita" id="notices"/>
    <preface href="preface.dita" id="preface"/>
  </frontmatter>
  <part id="partdiv_id7" navtitle="Part 1">
    <chapter href="chapter1.dita" id="chapter1">
      <topicref href="topic1.dita" id="topic1"/>
    </chapter>
  </part>
  <part id="partdiv_id9" navtitle="Part 2">
    <chapter href="chapter2.dita" id="chapter2">
      <topicref href="topic2.dita" id="topic2"/>
    </chapter>
  </part>
  <appendix href="appendix.dita" id="appendix"/>
  <appendix href="glossary.dita" id="glossary"/>
</bookmap>
Code (script output):
$ show_html_structure.pl OPENME.merged.html
<html xtrf="file:/C:/Users/...deleted.../OPENME.ditamap">
  <body class="wh_topic_page">
    <div class="wh_content_area">
      <div class="wh_topic_body">
        <div class="wh_topic_content">
          <div class="- map/map bookmap/bookmap map bookmap" domains="(map bookmap) (topic abbrev-d) (topic delay-d) (map ditavalref-d) (topic hazard-d) (topic hi-d) (topic indexing-d) (map mapgroup-d) (topic markup-d xml-d) (topic markup-d) (topic pr-d) (topic relmgmt-d) (topic sw-d) (topic ui-d) (topic ut-d) (topic xnal-d) a(props deliveryTarget)">
            <div class="- front-page/front-page front-page">
              <div class="- front-page/front-page-title front-page-title">
                <div class="- topic/title title">Sample Book</div>
              </div>
            </div>
            <article class="- topic/topic topic nested0" topicrefclass="- map/topicref bookmap/notices "></article>
            <article class="- topic/topic topic nested0" topicrefclass="- map/topicref bookmap/preface "></article>
            <article class="+ topic/topic pdf2-d/placeholder topic placeholder nested0" is-chapter="true" is-part="true">
              <article class="- topic/topic topic nested1" is-chapter="true" topicrefclass="- map/topicref bookmap/chapter ">
                <article class="- topic/topic topic nested2" topicrefclass="- map/topicref "></article>
              </article>
            </article>
            <article class="+ topic/topic pdf2-d/placeholder topic placeholder nested0" is-chapter="true" is-part="true">
              <article class="- topic/topic topic nested1" is-chapter="true" topicrefclass="- map/topicref bookmap/chapter ">
                <article class="- topic/topic topic nested2" topicrefclass="- map/topicref "></article>
              </article>
            </article>
            <article class="- topic/topic topic nested0" is-chapter="true" topicrefclass="- map/topicref bookmap/appendix "></article>
            <article class="- topic/topic concept/concept glossgroup/glossgroup topic concept glossgroup nested0" is-chapter="true" topicrefclass="- map/topicref bookmap/appendix "></article>
          </div>
        </div>
      </div>
    </div>
  </body>
</html>
Code (script output with titles kept):
$ show_html_structure.pl OPENME.merged.html
<html xtrf="file:/C:/Users/...deleted.../OPENME.ditamap">
  <body class="wh_topic_page">
    <div class="wh_content_area">
      <div class="wh_topic_body">
        <div class="wh_topic_content">
          <div class="- map/map bookmap/bookmap map bookmap" domains="(map bookmap) (topic abbrev-d) (topic delay-d) (map ditavalref-d) (topic hazard-d) (topic hi-d) (topic indexing-d) (map mapgroup-d) (topic markup-d xml-d) (topic markup-d) (topic pr-d) (topic relmgmt-d) (topic sw-d) (topic ui-d) (topic ut-d) (topic xnal-d) a(props deliveryTarget)">
            <div class="- front-page/front-page front-page">
              <div class="- front-page/front-page-title front-page-title">
                <div class="- topic/title title">Sample Book</div>
              </div>
            </div>
            <article class="- topic/topic topic nested0" topicrefclass="- map/topicref bookmap/notices ">
              <h1 class="- topic/title title topictitle1">Notices</h1>
            </article>
            <article class="- topic/topic topic nested0" topicrefclass="- map/topicref bookmap/preface ">
              <h1 class="- topic/title title topictitle1">Preface</h1>
            </article>
            <article class="+ topic/topic pdf2-d/placeholder topic placeholder nested0" is-chapter="true" is-part="true">
              <h1 class="- topic/title title topictitle1">Part 1</h1>
              <article class="- topic/topic topic nested1" is-chapter="true" topicrefclass="- map/topicref bookmap/chapter ">
                <h2 class="- topic/title title topictitle2">Chapter</h2>
                <article class="- topic/topic topic nested2" topicrefclass="- map/topicref ">
                  <h3 class="- topic/title title topictitle3">Topic</h3>
                </article>
              </article>
            </article>
            <article class="+ topic/topic pdf2-d/placeholder topic placeholder nested0" is-chapter="true" is-part="true">
              <h1 class="- topic/title title topictitle1">Part 2</h1>
              <article class="- topic/topic topic nested1" is-chapter="true" topicrefclass="- map/topicref bookmap/chapter ">
                <h2 class="- topic/title title topictitle2">Chapter</h2>
                <article class="- topic/topic topic nested2" topicrefclass="- map/topicref ">
                  <h3 class="- topic/title title topictitle3">Topic</h3>
                </article>
              </article>
            </article>
            <article class="- topic/topic topic nested0" is-chapter="true" topicrefclass="- map/topicref bookmap/appendix ">
              <h1 class="- topic/title title topictitle1">Appendix</h1>
            </article>
            <article class="- topic/topic concept/concept glossgroup/glossgroup topic concept glossgroup nested0" is-chapter="true" topicrefclass="- map/topicref bookmap/appendix ">
              <h1 class="- topic/title title topictitle1">Glossary</h1>
            </article>
          </div>
        </div>
      </div>
    </div>
  </body>
</html>
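The listing with titles suggests why heading-level selectors are fragile here: when parts are present, the part title takes the h1 (topictitle1) and the chapter title becomes an h2 (topictitle2). One possible way around this, sketched below, is to select the title as a child of its article and ignore the heading level. The attribute values are taken from the output above; the properties are placeholders, and support for `:not()` in the processor is an assumption.

```css
/* Sketch only: selectors are based on the sample output above;
   the properties are illustrative placeholders. */

/* Part divider titles: the placeholder article carries is-part="true". */
article[is-part="true"] > *[class~="topic/title"] {
  font-size: 28pt;
}

/* Chapter titles, whatever heading level they end up with. */
article[topicrefclass~="bookmap/chapter"] > *[class~="topic/title"] {
  font-size: 22pt;
}

/* Chapter-level titles that are not part dividers. In this output the
   part placeholders also carry is-chapter="true", so exclude them. */
article[is-chapter="true"]:not([is-part]) > *[class~="topic/title"] {
  color: navy;
}
```

Note that is-chapter="true" alone is not enough to tell chapters from parts in this output, which is why the last rule excludes articles that have an is-part attribute.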