Convert ASCII to UTF char
-
- Posts: 3
- Joined: Mon Nov 28, 2011 6:36 pm
Convert ASCII to UTF char
Post by gardefjord »
Hi all,
I have HTML files with a bunch of ASCII-escaped characters (character entities) in them. Does anyone know how I can convert all of them to normal UTF-8 characters? A sample file looks like this:
Code: Select all
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="sv">
<head>
<title>De andra</title>
<link rel="stylesheet" href="Styles.css" type="text/css" />
<link rel="stylesheet" type="application/vnd.adobe-page-template+xml" href="page-template.xpgt" />
</head>
<body>
<div class="booksection">
<h1 id="ch001"><a id="page_011"></a>Molly Beslutet</h1>
<p class="noindent_j1">När Molly vaknade sträckte hon ut ena armen mot den andra kudden. Den var lika tom som den varit det senaste halvåret. Ingen kind att smeka, ingen kropp att krypa intill. Pelle fanns helt enkelt inte där.</p>
<p class="indent_j">Hon satte sig upp och släppte ner fötterna i fårskinnsfällen. Den mjuka, lockiga känslan fick hennes kropp att långsamt vakna. Hon tog ett par steg fram till fönstret, öppnade det och drog försiktigt in den kalla luften i lungorna. Även om vintern höll på att släppa sitt grepp och det mesta av snön hade smält undan var morgnarna fortfarande svartmålade. Molly huttrade och drog igen fönstret.</p>
<p class="indent_j">I köket slängde hon några vedklampar i spisen och kaminen. Det kändes som om hon inte hade gjort något annat den sista tiden än huggit ved och eldat upp den igen.</p>
For example, the entity for å should become a plain å.
-
- Posts: 9434
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Convert ASCII to UTF char
Hi Adam,
Just open the document in the Oxygen Text page, select all, right-click, choose Source->Unescape selection, and check the Unescape Characters check box.
Or, if you want to automate this, you could apply an XSLT stylesheet to it that just copies the XML content to the output, which automatically expands character entities to characters.
An XSLT stylesheet that just copies the XML content looks like this:
Code: Select all
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <!-- Match document -->
  <xsl:template match="/">
    <xsl:apply-templates mode="copy" select="."/>
  </xsl:template>
  <!-- Deep copy template -->
  <xsl:template match="*|text()|@*" mode="copy">
    <xsl:copy>
      <xsl:apply-templates mode="copy" select="@*"/>
      <xsl:apply-templates mode="copy"/>
    </xsl:copy>
  </xsl:template>
  <!-- Handle default matching -->
  <xsl:template match="*"/>
</xsl:stylesheet>
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
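For anyone who wants to script the same idea outside Oxygen, here is a minimal Python sketch of the "parse and copy" approach described above. It is not part of Radu's answer. It assumes the input is well-formed XML and that the entities are numeric character references such as `&#228;`; named entities like `&aring;` would need the DTD to be resolved. The function name `expand_character_references` is made up for illustration. Note also that `xml.etree.ElementTree` drops the DOCTYPE and rewrites namespace prefixes unless they are registered, so treat this as a demonstration of the principle rather than a drop-in converter.

```python
import xml.etree.ElementTree as ET


def expand_character_references(xml_text: str) -> str:
    """Parse the XML and serialize it again as a Unicode string.

    The parser expands character references while reading; the serializer
    writes real characters but keeps markup-significant ones (such as "&")
    escaped, so the result is still well-formed.
    """
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample, modelled on the paragraph from the question.
sample = '<p class="noindent_j1">N&#228;r Molly vaknade, Ragnar &amp; Surteson</p>'
print(expand_character_references(sample))
```

The escaped ampersand survives the round trip as `&amp;`, which is what keeps the output parseable.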
-
- Posts: 3
- Joined: Mon Nov 28, 2011 6:36 pm
Re: Convert ASCII to UTF char
Post by gardefjord »
Thanks Radu!
Loving Oxygen XML Editor so far...
One more question though: how do I handle "&" signs?
Example:
Code: Select all
<p class="indent_j">– Jag skall ta strid mot dig var du än dyker upp, Ragnar & Surteson, viskade hon och spegelbilden upprepade löftet or dagrant.</p>
I get this error:
Code: Select all
F [Xerces] The entity name must immediately follow the '&' in the entity reference.
-
- Posts: 9434
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Convert ASCII to UTF char
Hi Adam,
According to the XML specification, a literal "&" character is not allowed in XML content and should always be escaped as &amp;.
When unescaping the entire Text content, the Unescape selection dialog has a checkbox that can be unchecked to leave escaped ampersands (&amp;) untouched.
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
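The two points above (a bare "&" must be escaped, and unescaping should leave ampersands alone) can also be sketched as a text-level script. This is a rough Python illustration, not Oxygen's implementation: the function names and regular expressions are assumptions made for this example, and the entity-name pattern is a simplification of the XML Name rules. It works on the raw text without parsing it, so it can be run on a file that is not yet well-formed.

```python
import re

# "&" that does NOT start an entity reference (&name;) or a
# character reference (&#229; / &#xE5;).
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")

# Decimal or hexadecimal numeric character references.
NUMERIC_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

# Characters that must stay escaped to keep the markup well-formed.
MARKUP_CHARS = {"&", "<", ">", '"', "'"}


def escape_bare_ampersands(text: str) -> str:
    """Turn stray "&" signs into "&amp;" and leave existing references alone."""
    return BARE_AMP.sub("&amp;", text)


def unescape_numeric_references(text: str) -> str:
    """Replace numeric references with real characters, except markup characters."""

    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            char = chr(code_point)
        except (ValueError, OverflowError):
            return match.group(0)  # not a valid code point: keep as written
        return match.group(0) if char in MARKUP_CHARS else char

    return NUMERIC_REF.sub(replace, text)


line = "Ragnar & Surteson, d&#229; och d&#xE5;"
print(unescape_numeric_references(escape_bare_ampersands(line)))
```

Escaping first and unescaping second means a stray "&" is repaired before any references are expanded, and references that decode to "&" or "<" are deliberately left in escaped form.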