Transformation causing illegal html chara in *some* files

bcalla8
Posts: 6
Joined: Thu Nov 26, 2020 1:21 am

Transformation causing illegal html chara in *some* files

Post by bcalla8 »

I run about 200 xml files against a single xsl for transformation. Many of them transform just fine - but a number of them are returning the following error:

Illegal HTML character: decimal 151

The error is fatal and causes the transformation to fail. It points to the following series of tags in my xsl (the "value-of select" line is where the error is reported, but I'm including the entire template so you have the full context):

</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="."/>
</xsl:template>


It goes on to specify a number of inputs in the xml including:
<xsl:template match="controlaccess/p">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="scopecontent/p">
<span class="para"><xsl:apply-templates/></span>
</xsl:template>

And some others - don't want to bog down the forum with a whole lot of code, but can provide more if it's helpful.

I didn't write this xsl, so I'm not sure if this is good/bad code or if there is something obviously wrong with it. But figured I'd pop it in here to see if anyone had any ideas. I would describe my skill level as "able to mostly manipulate what exists, but not able to troubleshoot non-obvious problems".

Unfortunately, Oxygen isn't indicating which line of the xml files is triggering the xsl error. I have a rough idea of the area where it must be, but not enough to narrow down what the specific problem might be. My assumption is that it's the same thing in each of the xml files that fail to transform.

My instinct is that the period in the value-of select is the issue. Or possibly that 'match' is requiring an entry in the xml - but a lot of these fields (like controlaccess) are optional in the xml. Should we be using 'if's instead?

As you can tell - I'm a little out of my depth here - so will turn it over to folks with more expertise than me, if you've got any thoughts. Thanks in advance!
Radu
Posts: 9049
Joined: Fri Jul 09, 2004 5:18 pm

Re: Transformation causing illegal html chara in *some* files

Post by Radu »

Hi,

Your XSLT is probably writing HTML output, and some characters from the original XML are not valid in HTML. Maybe you can look at this post on Stack Overflow: https://stackoverflow.com/questions/231 ... ers-in-xsl

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
bcalla8

Re: Transformation causing illegal html chara in *some* files

Post by bcalla8 »

I'll take a look. Thanks!
bcalla8

Re: Transformation causing illegal html chara in *some* files

Post by bcalla8 »

This problem wound up being really easy to solve - so updating for anyone else with a similar error.

The impacted xml files all used &#151; as a stand-in for an em dash. I updated those to just regular dashes, as they were all within <p>s, and the files transform now.
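
For anyone who would rather not edit the source files: &#151; refers to code point 151 (U+0097), a control character that the HTML output method rejects. It is an em dash only in the Windows-1252 encoding; the Unicode em dash is &#8212;. A minimal sketch of an alternative, assuming the catch-all text() template from the first post can be changed and that every character 151 in the data really is meant as an em dash, would be to swap the character during the transformation:

```xml
<!-- Sketch only: a replacement for the catch-all text() template quoted in the first post.
     Assumption: every character 151 in the source is intended as an em dash. -->
<xsl:template match="text()">
  <!-- translate() swaps each U+0097 (&#151;) for a real em dash, U+2014 (&#8212;) -->
  <xsl:value-of select="translate(., '&#151;', '&#8212;')"/>
</xsl:template>
```

This only touches text nodes that reach this template. Attribute values, or text handled by a more specific template that does not call apply-templates, would still need the same treatment.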

The transformation error pointing me to the xsl file rather than the xml file threw me off. I think Oxygen was just trying to tell me that the xml was incompatible with the transformation, but it made the problem look more complicated than it was.
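
Since the error message names the stylesheet rather than the place in the xml, one way to find the offending text is a small throwaway stylesheet run against a failing file. The following is a rough sketch, not something from the original stylesheet. It assumes an XSLT 2.0 processor such as Saxon, and it assumes the problem characters fall in the 128-159 range, as 151 does:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Diagnostic sketch: lists the text nodes in the input that contain
     characters 128-159. Assumes an XSLT 2.0 processor such as Saxon. -->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Plain-text output, so the report itself cannot trip the HTML character check -->
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/">
    <xsl:for-each select="//text()[matches(., '[&#128;-&#159;]')]">
      <!-- Element names from the root down to the parent of the text node -->
      <xsl:value-of select="string-join(ancestor::*/name(), '/')"/>
      <xsl:text>: </xsl:text>
      <!-- The text itself, with each offending character shown as a visible marker -->
      <xsl:value-of select="replace(normalize-space(.), '[&#128;-&#159;]', '[BAD CHAR]')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each output line gives the element path and the surrounding text, which should be enough to search for the spot in the xml file. An empty result would suggest the illegal character sits in an attribute or comes from somewhere other than the source text.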

But I often find that a problem finds a solution almost immediately after I've given up and posted to a forum for help! Appreciate the link, which did send me in the right direction. Thanks!