Handling of special characters

Posted: Tue Jul 27, 2004 7:02 am
by fastjack
Hi there,

I have been using oXygen for a few weeks now and am quite pleased. Well done. But I have one thing to ask for. I am German and I have to write German texts from time to time, so I have to use those ugly umlauts like ä, ö, ü etc. XML as such doesn't care much whether they are encoded in UTF-8, as they are, but after the transformation to, say, XHTML I would like to have them replaced automatically by entities like &ouml; and so on.

My suggestion to stay flexible:
Add a shortcut-key table where I can configure which keystrokes result in which inserted text. Then I could set ä to become &auml; as I type. I could also define other replacements, like € to become &euro; (or (c) to become &copy;, if you like the idea of multi-character replacements).

Another example: if I frequently have to type the same text fragments, I might even want to define my own shortcuts, like CTRL-SHIFT-L, which would then insert a certain license text that I have to include often enough to appreciate that kind of shortcut.

If you don't like the flexible shortcut-key-table idea, please at least add an option like "replace language-specific special characters on the fly", using a hardcoded replacement table that gives me my ä -> &auml; conversion.

Would be nice :)

Thanks :)

Regards
Daniel

Posted: Tue Jul 27, 2004 10:01 am
by george
Hi Daniel,

How characters appear in the transformation result is not influenced by the way they are represented in the source document but by the XSLT output properties.

There is a copy.xsl in the samples/xhtml folder of oXygen. If you change output/@method from xml to html, you will see that the ä, ö, ü characters are represented in the transformation result as &auml;, &ouml;, &uuml; respectively. Alternatively, if you keep xml as the method and add encoding="ASCII", you will get them as numeric character references such as &#228;, &#246;, &#252; in the result.
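For anyone who wants to try the encoding variant outside the editor, here is a minimal sketch that runs an identity stylesheet through JAXP. It assumes the XSLT 1.0 processor bundled with the JDK. The class name AsciiOutputDemo and the inline stylesheet are made up for illustration and are not the copy.xsl shipped with oXygen. The encoding is written as US-ASCII here, on the assumption that the processor treats it like ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AsciiOutputDemo {
    // Hypothetical identity stylesheet; only the xsl:output line matters here.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' encoding='US-ASCII' omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms the given XML text and returns the serialized result.
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            // The serializer was asked for US-ASCII, so decode the bytes the same way.
            return new String(out.toByteArray(), StandardCharsets.US_ASCII);
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00e4, \u00f6, \u00fc are the umlauts a, o, u; escapes keep the source file ASCII-only.
        System.out.println(transform("<p>\u00e4 \u00f6 \u00fc</p>"));
    }
}
```

Because the umlauts cannot be represented in the requested encoding, the serializer has to write them as character references. Whether these come out in decimal or hexadecimal form depends on the processor.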

In XSLT 2.0 you can define a character map as in the example below:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="xml" use-character-maps="map"/>
  <!-- The ampersand is escaped as &amp; so that the stylesheet itself stays well-formed -->
  <xsl:character-map name="map">
    <xsl:output-character character="ä" string="&amp;auml;"/>
    <xsl:output-character character="ö" string="&amp;ouml;"/>
    <xsl:output-character character="ü" string="&amp;uuml;"/>
  </xsl:character-map>
  <!-- Match document -->
  <xsl:template match="/">
    <xsl:apply-templates mode="copy" select="."/>
  </xsl:template>
  <!-- Deep copy template -->
  <xsl:template match="*|text()|@*" mode="copy">
    <xsl:copy>
      <xsl:apply-templates mode="copy" select="@*"/>
      <xsl:apply-templates mode="copy"/>
    </xsl:copy>
  </xsl:template>
  <!-- Handle default matching -->
  <xsl:template match="*"/>
</xsl:stylesheet>

This will get &auml;, &ouml;, &uuml; in the result for ä, ö, ü in the input.

You can use XSLT 2.0 with oXygen if you configure Saxon 8 through JAXP. There is an article that describes how to do this step by step:
http://www.oxygenxml.com/doc/HowToConfi ... former.pdf

Hope that helps,
George

Posted: Tue Jul 27, 2004 1:25 pm
by fastjack
Thanks, I think it did. I see entities now. That's good :)
Regards
Daniel