[epub] Split xhtml file at headings

rustic
Posts: 7
Joined: Sun Jun 12, 2011 10:38 pm

[epub] Split xhtml file at headings

Post by rustic »

Hi,
I hope the subject line makes clear what I mean: I would like to split an XHTML document at its headings (h1, h2, and so on). Is there a way to do that properly with the existing features, or is this something you might implement in a future version of oXygen XML Editor?

Regards.
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Re: [epub] Split xhtml file at headings

Post by george »

There is no built-in function for that, but what you want can easily be achieved with XSLT 2.0, which oXygen fully supports.

Best Regards,
George
George Cristian Bina

Re: [epub] Split xhtml file at headings

Post by rustic »

Hi George,

Sorry, but I don't see how, as my knowledge of XSLT is very limited. Perhaps this is not the right place to ask this kind of question; feel free to move it to the XSLT and FOP forum.
Regards.

Re: [epub] Split xhtml file at headings

Post by george »

Please provide a sample XHTML file and describe the desired result and I will try to show you how that can be obtained with XSLT.

Best Regards,
George
George Cristian Bina

Re: [epub] Split xhtml file at headings

Post by rustic »

Sure!

Here is a sample with two titles (heading tags h1).

Code:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
<link href="stylesheet.css" type="text/css" rel="stylesheet" />
</head>

<body style="">
<h1>Title 1</h1>

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed feugiat eleifend sapien, eu posuere justo pharetra non. Mauris id cursus magna.</p>

<p>Praesent laoreet ipsum vel mi rhoncus egestas. Donec eu nisi libero.</p>

<h1>Title 2</h1>

<p>Fusce ante magna, ornare ac fringilla auctor, porta sodales ipsum. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Donec molestie pharetra malesuada.</p>
</body>
</html>
I want to split that file just before Title 2, into filename001.xhtml:

Code:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
<link href="stylesheet.css" type="text/css" rel="stylesheet" />
</head>

<body style="">
<h1>Title 1</h1>

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed feugiat eleifend sapien, eu posuere justo pharetra non. Mauris id cursus magna.</p>

<p>Praesent laoreet ipsum vel mi rhoncus egestas. Donec eu nisi libero.</p>
</body>
</html>
and filename002.xhtml:

Code:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
<link href="stylesheet.css" type="text/css" rel="stylesheet" />
</head>

<body style="">
<h1>Title 2</h1>

<p>Fusce ante magna, ornare ac fringilla auctor, porta sodales ipsum. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Donec molestie pharetra malesuada.</p>
</body>
</html>
Each output file should keep everything that precedes the body tag in the original .xhtml file, such as the CSS stylesheet declaration.

Thank you very much.

Re: [epub] Split xhtml file at headings

Post by george »

Here is a stylesheet that splits the initial file. For the sample it creates two files, file1.xhtml and file2.xhtml, with the desired content:

Code:


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0" xpath-default-namespace="http://www.w3.org/1999/xhtml">
  <xsl:template match="/">
    <xsl:apply-templates select="html/body"/>
  </xsl:template>
  <xsl:template match="body">
    <!-- Start a new group at every h1 child of body. -->
    <xsl:for-each-group select="node()" group-starting-with="h1">
      <!-- In the sample, group 1 is the whitespace before the first h1, hence position()-1. -->
      <xsl:variable name="filename" select="concat('file', position()-1)"/>
      <!-- Skip groups that contain no elements, such as that leading whitespace. -->
      <xsl:if test="count(current-group()[self::*]) > 0">
        <xsl:result-document doctype-public="-//W3C//DTD XHTML 1.1//EN"
            doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
            indent="yes" href="{$filename}.xhtml">
          <html xmlns="http://www.w3.org/1999/xhtml">
            <xsl:copy-of select="/html/@*"/>
            <xsl:for-each select="/html/node()">
              <xsl:choose>
                <!-- Everything outside body (head, whitespace) is copied unchanged. -->
                <xsl:when test="not(self::body)">
                  <xsl:copy-of select="."/>
                </xsl:when>
                <!-- body keeps its attributes but receives only the current group. -->
                <xsl:otherwise>
                  <xsl:copy>
                    <xsl:copy-of select="@*"/>
                    <xsl:copy-of select="current-group()"/>
                  </xsl:copy>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:for-each>
          </html>
        </xsl:result-document>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
Best Regards,
George
George Cristian Bina
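
Note: the original request mentioned splitting at "h1, h2 and so on" and asked for output names like filename001.xhtml. The following is only a sketch of how the stylesheet above might be adapted for that, not a tested replacement. It assumes that splitting at h1 and h2 is wanted, that 'filename' is an acceptable base name, and that copying just the head element (rather than every child of html) is enough. The number is derived from a count of headings instead of position(), so it does not depend on whether whitespace precedes the first heading.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0" xpath-default-namespace="http://www.w3.org/1999/xhtml">

  <xsl:template match="/">
    <xsl:apply-templates select="html/body"/>
  </xsl:template>

  <xsl:template match="body">
    <!-- Assumption: split at h1 and h2; extend the pattern for deeper levels. -->
    <xsl:for-each-group select="node()" group-starting-with="h1 | h2">
      <!-- Count the headings up to and including the one that opens this group.
           Content that precedes the first heading gets 0. -->
      <xsl:variable name="n"
          select="count(current-group()[1]/(self::h1 | self::h2
                        | preceding-sibling::h1 | preceding-sibling::h2))"/>
      <!-- 'filename' is a hypothetical base name; '000' pads to three digits. -->
      <xsl:variable name="filename"
          select="concat('filename', format-number($n, '000'))"/>
      <!-- Skip groups without elements (for example whitespace-only groups). -->
      <xsl:if test="current-group()[self::*]">
        <xsl:result-document doctype-public="-//W3C//DTD XHTML 1.1//EN"
            doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
            indent="yes" href="{$filename}.xhtml">
          <html xmlns="http://www.w3.org/1999/xhtml">
            <xsl:copy-of select="/html/@*"/>
            <!-- Keep the original head (title, stylesheet link) unchanged. -->
            <xsl:copy-of select="/html/head"/>
            <body>
              <xsl:copy-of select="/html/body/@*"/>
              <xsl:copy-of select="current-group()"/>
            </body>
          </html>
        </xsl:result-document>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

For the two-heading sample posted earlier in the thread, this should produce filename001.xhtml and filename002.xhtml. Elements placed before the first heading, if any, would go to filename000.xhtml.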

Re: [epub] Split xhtml file at headings

Post by rustic »

Thanks, George, but is it possible to apply this inside an EPUB archive and add the new files to it?

I've tried your XSLT script on a single (uncompressed) XHTML file, and the files are created successfully.

However, when the file is inside an archive (EPUB), the output files are saved somewhere (where?) but not inside the EPUB archive.

I hope you get my point.

Regards.

Re: [epub] Split xhtml file at headings

Post by george »

Just change the href on result-document as below:

Code:


<xsl:result-document doctype-public="-//W3C//DTD XHTML 1.1//EN"
doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
indent="yes"
href="{document-uri(/)}_{$filename}.xhtml">
Best Regards,
George
George Cristian Bina
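
Note: with the href above, each output name is the full URI of the source document followed by an underscore and the part name, so a source named chapter.xhtml (a hypothetical name) would yield chapter.xhtml_file1.xhtml next to it. If the doubled extension is unwanted, one possible variation is to strip the trailing ".xhtml" with the XPath 2.0 replace() function first. This is an untested sketch that assumes the source file name ends in ".xhtml":

```xml
<!-- Sketch: remove the source's own ".xhtml" suffix before appending the part name. -->
<xsl:result-document doctype-public="-//W3C//DTD XHTML 1.1//EN"
    doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
    indent="yes"
    href="{replace(document-uri(/), '\.xhtml$', '')}_{$filename}.xhtml">
  <!-- same content as inside xsl:result-document in the stylesheet above -->
</xsl:result-document>
```

With the same hypothetical source, the first part would then be written as chapter_file1.xhtml.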

Re: [epub] Split xhtml file at headings

Post by rustic »

Thanks for that George, you've been most helpful.