Empty xmlns attribute in XSLT output
Posted: Mon Mar 29, 2021 1:17 am
I have an XSLT conversion that is resulting in elements with xmlns="" for reasons I don't understand. I don't think this is the result of an Oxygen bug, but more likely I have an incomplete understanding of how Oxygen (or the Saxon processor) handles namespaces.
The conversion giving the unexpected result converts instances of the JATS DTD to HTML. A typical JATS article for this conversion begins with

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE article
      PUBLIC "-//NLM//DTD UCP JATS (Z39.96) Journal Publishing DTD with OASIS Tables with MathML3 v1.2 20190208//EN" "JATS-journalpublishing-oasis-article1-mathml3.ucp.dtd">
    <article
      xmlns:ali="http://www.niso.org/schemas/ali/1.0/"
      xmlns:mml="http://www.w3.org/1998/Math/MathML"
      xmlns:oasis="http://www.niso.org/standards/z39-96/ns/oasis-exchange/table"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      article-type="research-article" dtd-version="1.2" xml:lang="en">

and is converted to HTML beginning

    <html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops"
      xmlns:mml="http://www.w3.org/1998/Math/MathML"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      lang="en">

This is an abbreviated version of the XSL stylesheet:

    <xsl:stylesheet version="2.0" xmlns="http://www.w3.org/1999/xhtml"
      xmlns:mml="http://www.w3.org/1998/Math/MathML"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      exclude-result-prefixes="xs xsl">
      <xsl:output method="xml" indent="no" omit-xml-declaration="no"/>
      <xsl:strip-space elements="*"/>
      <xsl:preserve-space elements="string-name"/>
      <xsl:template match="article">
        <html xmlns:epub="http://www.idpf.org/2007/ops" lang="en">
          <head>
            <!-- [snip] -->
          </head>
          <body>
            <article>
              <xsl:apply-templates/>
            </article>
          </body>
        </html>
      </xsl:template>
      <!-- [plus many other templates] -->
      <xsl:template match="*" name="process-element">
        <xsl:copy copy-namespaces="no">
          <xsl:for-each select="@*">
            <xsl:copy/>
          </xsl:for-each>
          <xsl:apply-templates/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

I had been running this conversion on instances of a predecessor of the JATS DTD and hadn't had any problems. Now I'm preparing a version to run on instances of JATS, and I find that every <sup> or <sub> element from the JATS instance appears in the HTML instance as <sup xmlns=""> or <sub xmlns="">.
This was baffling to me because those are the only two elements that would appear in the HTML output with the empty 'xmlns' attribute. I eventually realized that these were the only two elements in the JATS input that were being processed by the default template named "process-element", shown in the code above. When I created templates specifically for those elements, like so--

    <xsl:template match="sup">
      <xsl:element name="sup">
        <xsl:apply-templates/>
      </xsl:element>
    </xsl:template>

--the resulting <sup> and <sub> elements appeared in the HTML without the empty 'xmlns' attribute. As it happens, 'sup' and 'sub' are two of a small number of element names that are found in both the JATS and HTML tag sets. I found I was able to generate more instances of the empty 'xmlns' attribute by commenting out the template for the 'p' element and letting that be handled by the default template as well.
After searching online, I found that this problem had been discussed on StackOverflow a few times, for example at https://stackoverflow.com/questions/206 ... sformation:
"If you have xmlns="" appearing on an element in your output then it means that you added an element with no namespace to the tree at a point where there was a default namespace already in force. In order to output such a structure the serializer must countermand that default with xmlns=""."
However, I'm not sure if this applies to my problem, because I didn't think there was a default namespace already in force or that I had done anything to remove it. To check this, I added a log message to my default template:

    <xsl:template match="*" name="process-element">
      <xsl:message>
        <xsl:text>PROCESS-ELEMENT called for element '</xsl:text>
        <xsl:value-of select="name(.)"/>
        <xsl:text>' in namespace '</xsl:text>
        <xsl:value-of select="namespace-uri()"/>
        <xsl:text>'</xsl:text>
      </xsl:message>
      <xsl:copy copy-namespaces="no">
        <xsl:for-each select="@*">
          <xsl:copy/>
        </xsl:for-each>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:template>

And the output I got from Oxygen for every element handled by the default template was "PROCESS-ELEMENT called for element '[whatever]' in namespace ''"
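To make the quoted StackOverflow explanation concrete, here is a minimal sketch of how it would play out in a stylesheet shaped like mine. It assumes a hypothetical two-element input, <article><sup>2</sup></article>, with both elements in no namespace; the stripped-down templates are for illustration only and are not the real stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only. Assumed input: <article><sup>2</sup></article> (no namespace). -->
<xsl:stylesheet version="2.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Literal result elements take the stylesheet's default namespace,
       so <html> and <body> are created in the XHTML namespace. -->
  <xsl:template match="article">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- xsl:copy keeps the source element's own name. The source <sup> is in
       no namespace, so the copy is in no namespace as well. -->
  <xsl:template match="*">
    <xsl:copy copy-namespaces="no">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

As far as I can tell, the <html> and <body> literal result elements end up in the stylesheet's default XHTML namespace, while xsl:copy reproduces <sup> under its original no-namespace name, so the serializer would have to write <sup xmlns=""> to keep the two apart. That would also be consistent with the log message reporting an empty namespace, since namespace-uri() there describes the source element rather than the surrounding output tree.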
At the same time that I was puzzling over this JATS-to-HTML conversion, I had a JATS-to-JATS conversion that used the same default template and whose output never included the empty 'xmlns' attribute.
Am I troubleshooting this the wrong way? I know how to fix the problem for the moment--create templates explicitly for the elements that are getting the empty 'xmlns' attribute--but I'm still not sure why it's necessary for one conversion but not for the other.
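One possible alternative to writing a separate template for each element would be a fallback template that rebuilds the element in the XHTML namespace instead of copying it. The following is only a sketch under assumptions: the match pattern limiting it to no-namespace elements and the explicit priority are my own additions (so that prefixed content such as mml:* and any more specific templates are unaffected), and I have not run it against the full stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical fallback: applies only to elements in no namespace
       (for example JATS sup, sub, p). The low explicit priority lets any
       more specific template, such as match="p", take precedence. -->
  <xsl:template match="*[namespace-uri() = '']" priority="-1">
    <!-- Recreate the element with the same local name in the XHTML namespace -->
    <xsl:element name="{local-name()}" namespace="http://www.w3.org/1999/xhtml">
      <!-- Attributes are copied unchanged -->
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

The same idea is what the explicit 'sup' template relies on: xsl:element with an unprefixed name and no namespace attribute resolves that name against the default namespace declared in the stylesheet, which here is XHTML.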
Sorry for such a long-winded description.