Handling of special characters

fastjack
Posts: 13
Joined: Tue Jul 27, 2004 6:41 am
Location: Bremen/Germany

Post by fastjack » Tue Jul 27, 2004 7:02 am

Hi there,

I have been using oxygenxml for a few weeks now and am quite pleased. Well done. But I have one thing to ask for. I am German and I have to write German texts from time to time, so I have to use these ugly umlauts like ä, ö, ü etc. XML as such doesn't care much whether they are encoded in UTF-8, as they are, but after the transformation to, say, XHTML I would like to have them replaced automatically by entities like &ouml; and so on.

My suggestion to stay flexible:
Add a shortcut table where I can configure which keystrokes result in which inserted text. Then I could set ä to become &auml; as I type. I could also define other replacements, like € becoming &euro; (or (c) becoming &copy; if you like the idea of multi-character replacements).

Another example: if I frequently have to type the same text fragments, I might even want to define my own shortcuts, like CTRL-SHIFT-L, which would insert a certain license text that I have to include often enough to want that kind of shortcut.

If you don't like the flexible shortcut-table idea, please at least add an option like "replace language specific special characters on the fly" that uses a hardcoded replacement table and gives me my ä -> &auml; conversion.

Would be nice :)

Thanks :)

Regards
Daniel

george
Site Admin
Posts: 2102
Joined: Thu Jan 09, 2003 2:58 pm

Post by george » Tue Jul 27, 2004 10:01 am

Hi Daniel,

How characters appear in the transformation result is not influenced by the way they are represented in the source document but by the XSLT output properties.

There is a copy.xsl in the samples/xhtml folder of oXygen. If you change output/@method from xml to html, you will see that the ä, ö, ü characters are represented in the transformation result as &auml;, &ouml;, &uuml; respectively. If you instead keep xml as the method and add encoding="ASCII", you will get them as numeric character references (&#228;, &#246;, &#252;) in the result.
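As a minimal sketch of those two variants (this is not the actual copy.xsl shipped with oXygen, just an assumed identity stylesheet for illustration), only the xsl:output element needs to change. Whether the html method writes named entities, and whether numeric references come out in decimal or hexadecimal, depends on the XSLT processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Variant 1: HTML output method; the serializer is allowed to write
       characters as named entities such as &auml; -->
  <xsl:output method="html"/>

  <!-- Variant 2: use this INSTEAD of variant 1. Characters outside ASCII
       cannot be written directly, so the serializer falls back to
       numeric character references.
  <xsl:output method="xml" encoding="ASCII"/>
  -->

  <!-- Identity transform: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```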

In XSLT 2.0 you can define a character map as in the example below:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="xml" use-character-maps="map"/>
  <!-- The ampersand must be escaped as &amp; so that the stylesheet itself stays well-formed -->
  <xsl:character-map name="map">
    <xsl:output-character character="ä" string="&amp;auml;"/>
    <xsl:output-character character="ö" string="&amp;ouml;"/>
    <xsl:output-character character="ü" string="&amp;uuml;"/>
  </xsl:character-map>
  <!-- Match document -->
  <xsl:template match="/">
    <xsl:apply-templates mode="copy" select="."/>
  </xsl:template>
  <!-- Deep copy template -->
  <xsl:template match="*|text()|@*" mode="copy">
    <xsl:copy>
      <xsl:apply-templates mode="copy" select="@*"/>
      <xsl:apply-templates mode="copy"/>
    </xsl:copy>
  </xsl:template>
  <!-- Handle default matching -->
  <xsl:template match="*"/>
</xsl:stylesheet>

This produces &auml;, &ouml;, &uuml; in the result for ä, ö, ü in the input.
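
The same mechanism could also cover the other replacements asked for earlier in the thread. The following is only a sketch: it assumes the remaining German characters plus the € and © signs should map to their XHTML entity names, and the map name xhtml-entities is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" use-character-maps="xhtml-entities"/>

  <!-- Hypothetical map name; each string is written to the output verbatim -->
  <xsl:character-map name="xhtml-entities">
    <xsl:output-character character="ä" string="&amp;auml;"/>
    <xsl:output-character character="ö" string="&amp;ouml;"/>
    <xsl:output-character character="ü" string="&amp;uuml;"/>
    <xsl:output-character character="Ä" string="&amp;Auml;"/>
    <xsl:output-character character="Ö" string="&amp;Ouml;"/>
    <xsl:output-character character="Ü" string="&amp;Uuml;"/>
    <xsl:output-character character="ß" string="&amp;szlig;"/>
    <xsl:output-character character="€" string="&amp;euro;"/>
    <xsl:output-character character="©" string="&amp;copy;"/>
  </xsl:character-map>

  <!-- Identity transform: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that a character map only takes effect when the result is serialized, and the entity references are written without any check. The output is well-formed XML only if it carries a DOCTYPE that declares these entities, as the XHTML DTDs do.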

You can use XSLT 2.0 with oXygen if you configure Saxon 8 through JAXP. There is an article that describes how to do this step by step:
http://www.oxygenxml.com/doc/HowToConfi ... former.pdf

Hope that helps,
George

fastjack
Posts: 13
Joined: Tue Jul 27, 2004 6:41 am
Location: Bremen/Germany

Post by fastjack » Tue Jul 27, 2004 1:25 pm

Thanks, I think it did. I see entities now. That's good :)
Regards
Daniel
