Help with transforming xml

gjledger2k
Posts: 16
Joined: Tue Aug 01, 2006 2:56 am
Location: Chicago

Post by gjledger2k »

Hi,
Newbie here (sorry). I'm opening Word docs in Open Office, which builds a nice little XML document called content.xml. One of the weird things that happens, though, is that Open Office might use a paragraph or character style as is, or it MIGHT turn it into an automatic paragraph style with a parent-style-name attribute whose value is the original style name. Like so:

Code:

<office:automatic-styles>
<style:style style:name="P1" style:family="paragraph" style:parent-style-name="ChapterTitle"
style:master-page-name="Standard"></style:style>
...
</office:automatic-styles>
In the body of the document, each paragraph references either the actual style name or the automatic style name:

Code:

<!--text:style-name should have been "ChapterTitle" but Open Office turned it into "P1"-->
<text:p text:style-name="P1">some text</text:p>
...
<!--sometimes Open Office leaves the style name alone-->
<text:p text:style-name="ChapterTitle">some text</text:p>
Also, there is no way to predict if and when Open Office will turn styles into automatic styles, nor can you count on the automatic style name always being P1. The number is arbitrarily assigned. It could be P2, or P3, or Pn. So you can't reference it literally in the XSLT.
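
To make that concrete, here is a made-up content.xml fragment with two automatic styles. The style name BodyText and the paragraph text are assumptions for illustration only; the element nesting follows the usual OpenDocument layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <office:automatic-styles>
    <!-- two generated styles, each pointing back at a different named style -->
    <style:style style:name="P1" style:family="paragraph"
        style:parent-style-name="ChapterTitle"/>
    <!-- "BodyText" is a hypothetical style name -->
    <style:style style:name="P2" style:family="paragraph"
        style:parent-style-name="BodyText"/>
  </office:automatic-styles>
  <office:body>
    <office:text>
      <!-- these two use generated names and need a lookup -->
      <text:p text:style-name="P1">some text</text:p>
      <text:p text:style-name="P2">some other text</text:p>
      <!-- this one kept its original style name -->
      <text:p text:style-name="ChapterTitle">some text</text:p>
    </office:text>
  </office:body>
</office:document-content>
```

The goal would be to turn the first paragraph into a ChapterTitle element and the second into a BodyText element, without ever writing P1 or P2 in the stylesheet.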

What I'd like to do is write a transform so that when a paragraph's text:style-name (Pn) matches the name of an automatic style, I can create an element named after that style's parent-style-name attribute, then put the contents of the paragraph inside that element. So here was one of many tries:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0">
  <xsl:output method="xml" indent="yes"></xsl:output>

  <xsl:template match="/">
    <xsl:apply-templates select="//text:p"></xsl:apply-templates>
  </xsl:template>

  <!-- match paragraphs whose style name is one of the automatic styles -->
  <xsl:template match="text:p[@text:style-name=preceding::office:automatic-styles/style:style/@style:name]">
    <!-- intended: name the output element after the parent style -->
    <xsl:element name="{preceding::office:automatic-styles/style:style/@style:parent-style-name}">
      <xsl:value-of select="."></xsl:value-of>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
Unfortunately, all I get is the value of the first parent style name put into the element name:

Code:

<ChapterTitle>some text</ChapterTitle> <!-- right: the parent style for P1 -->
<ChapterTitle>some other text</ChapterTitle> <!--wrong, should have been the parent style for P2-->
I tried putting the xsl:for-each element around different tags, but that did not work. Any suggestions, pointers, or hints? I'd really appreciate it. Thanks in advance.

Greg

Here is a solution that seems to work

Post by gjledger2k »

Okay, I figured out this much. (Trial and error, no logic involved.) Yet it worked!

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0">
  <xsl:output method="xml" indent="yes"></xsl:output>

  <xsl:template match="/">
    <xsl:apply-templates select="//text:p"></xsl:apply-templates>
  </xsl:template>

  <!-- paragraphs that reference an automatic style (P1, P2, ...) -->
  <xsl:template match="text:p[@text:style-name=preceding::office:automatic-styles/style:style/@style:name]">
    <!-- remember this paragraph's own style name -->
    <xsl:variable name="autostyle">
      <xsl:choose>
        <xsl:when test="@text:style-name=preceding::office:automatic-styles/style:style/@style:name">
          <xsl:value-of select="@text:style-name"></xsl:value-of>
        </xsl:when>
      </xsl:choose>
    </xsl:variable>
    <!-- pick the automatic style with that name and use its parent style as the element name -->
    <xsl:element name="{preceding::office:automatic-styles/style:style[@style:name=$autostyle]/@style:parent-style-name}">
      <xsl:apply-templates></xsl:apply-templates>
    </xsl:element>
  </xsl:template>

  <!-- paragraphs that kept their original style name -->
  <xsl:template
      match="text:p[not(@text:style-name=preceding::office:automatic-styles/style:style/@style:name)]">
    <xsl:element name="{@text:style-name}">
      <xsl:value-of select="."></xsl:value-of>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
Does anyone have suggestions for making this more elegant? Once again, any help would be appreciated.
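
One direction that might tidy this up, offered as a sketch rather than a tested answer. In XSLT 1.0 an attribute value template turns a node-set into a string by taking only its first node in document order, which would explain why the first attempt always produced the first parent style name. Looking the style up by the paragraph's own style name avoids that. The version below does the lookup with xsl:key; the key name auto-style and the variable names parent and name are made up for illustration, and it assumes every text:p carries a text:style-name attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every automatic style by its generated name (P1, P2, ...).
       The key name "auto-style" is hypothetical. -->
  <xsl:key name="auto-style"
           match="office:automatic-styles/style:style"
           use="@style:name"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//text:p"/>
  </xsl:template>

  <!-- One template for both cases: automatic style or original style name. -->
  <xsl:template match="text:p">
    <!-- Parent style of the matching automatic style; empty if there is none. -->
    <xsl:variable name="parent"
        select="key('auto-style', @text:style-name)/@style:parent-style-name"/>
    <xsl:variable name="name">
      <xsl:choose>
        <!-- An automatic style with a non-empty parent was found: use the parent. -->
        <xsl:when test="string($parent)">
          <xsl:value-of select="$parent"/>
        </xsl:when>
        <!-- Otherwise fall back to the style name as written on the paragraph. -->
        <xsl:otherwise>
          <xsl:value-of select="@text:style-name"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <!-- Assumes the resulting name is a legal XML element name. -->
    <xsl:element name="{$name}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

With the key doing the lookup, the xsl:choose that repeats the match predicate is no longer needed, and the long preceding:: paths disappear. An automatic style that has no parent-style-name would come out under its generated name (P1 and so on), which may or may not be what is wanted.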

Greg