Hi,
When you transform the DITA Map to PDF, at some stage the DITA Open Toolkit merges all referenced topics into one large XML file.
If you edit the transformation scenario used to transform the DITA Map to PDF, in the Parameters tab you can set the clean.temp parameter to "no" so that the temporary files folder is no longer deleted.
Then, in the temporary files folder, you can open a file called ditaMapFileName_MERGED.xml.
That file contains the content from all DITA Topics.
You can then apply an XSLT 2.0 stylesheet like the one below to it (using Saxon 9 EE):
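<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <!-- Wrap all the counts in a single root element. -->
  <xsl:template match="/">
    <counts>
      <xsl:apply-templates/>
    </counts>
  </xsl:template>

  <!-- Suppress text in the default mode; only the counts are output. -->
  <xsl:template match="text()"/>

  <!-- For each map element, gather its text and count the tokens. -->
  <xsl:template match="*[contains(@class, 'map/map')]">
    <xsl:variable name="text">
      <xsl:apply-templates mode="getText" select="node()"/>
    </xsl:variable>
    <count>
      <!-- Split on whitespace and punctuation, then drop empty tokens. -->
      <xsl:value-of
          select="count(tokenize(lower-case($text),'(\s|[,.!:;]|[n][b][s][p][;])+')[string(.)])"/>
    </count>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Skip nested maps while gathering text; they get their own count. -->
  <xsl:template match="*[contains(@class, 'map/map')]"
      mode="getText"/>
</xsl:stylesheet>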
The result should be an estimated word count.
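To make it clearer what the tokenize() expression in the stylesheet counts, here is a minimal Python sketch of the same splitting rule applied to a plain string. The function name estimate_word_count and the sample sentence are made up for illustration, and it assumes Python's re module treats this particular pattern the same way XPath's tokenize() does. It is not a replacement for the stylesheet, just a quick way to try the rule.

```python
import re

# Same separators as the tokenize() call in the stylesheet: runs of
# whitespace, the listed punctuation marks, or the literal text "nbsp;".
SEPARATORS = re.compile(r"(?:\s|[,.!:;]|nbsp;)+")


def estimate_word_count(text: str) -> int:
    """Split lower-cased text on the separators and count non-empty tokens."""
    tokens = SEPARATORS.split(text.lower())
    # Like the [string(.)] predicate, ignore empty tokens at the edges.
    return sum(1 for token in tokens if token)


sample = "Open the map, then press Find All. Done!"
print(estimate_word_count(sample))  # prints 8
```

Characters that are not in the separator list (hyphens, apostrophes, question marks) stay attached to the neighbouring word, which is one reason the result is only an estimate.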
Another way to do this:
Open the DITA Map in the Oxygen DITA Maps Manager and choose "Open Map in Editor with resolved topics" from the toolbar.
When the map opens in the main editor, open the Find/Replace dialog, check the Regular expression checkbox, search for \b\w+\b and press Find All. This should find all words and also give you a count, but it will probably take longer than the first method.
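If you want to see what \b\w+\b counts as a "word" before running it on a large map, a small Python sketch like the one below can help. The function name count_words and the sample sentence are hypothetical. Oxygen uses Java regular expressions, so treat this only as an approximation: Python's \w matches Unicode letters by default, which may differ from what the dialog does.

```python
import re

# One "word" is a maximal run of word characters between word boundaries.
WORD = re.compile(r"\b\w+\b")


def count_words(text: str) -> int:
    """Count the matches of \\b\\w+\\b in the given text."""
    return len(WORD.findall(text))


sample = "Don't over-think it: count the words."
print(count_words(sample))  # prints 8
```

Here "Don't" and "over-think" are each counted as two words because the apostrophe and the hyphen are not word characters. The stylesheet in the first method keeps each of them as a single token, so the two methods can give somewhat different totals.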
Much the same trick can be done using Find/Replace in Files on the entire DITA Map, choosing to search only in element contents.
Regards,
Radu