RE: [xsl] Parsing complex line (mixed text and markup)
Subject: RE: [xsl] Parsing complex line (mixed text and markup)
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 14 Feb 2008 23:14:30 -0000
This problem has come up in the past and it's not particularly easy. There
seem to be two main approaches:

(a) convert the string delimiters into element markup, and then use grouping
facilities (xsl:for-each-group) to analyze the overall structure

(b) convert the markup into string delimiters, and then use
xsl:analyze-string.

Both work, but I think (a) is probably a bit easier.

Do all the delimiters (commas) occur in top-level text nodes, or can they
occur nested within elements? I'll assume the former.

Start by making a copy of the data in which the commas are replaced by
<comma/> elements:

    <xsl:template match="tbentry">
      <xsl:variable name="temp">
        <xsl:apply-templates mode="replace-commas"/>
      </xsl:variable>
      ..[G]..
    </xsl:template>

    <xsl:template match="*" mode="replace-commas">
      <xsl:copy-of select="."/>
    </xsl:template>

    <xsl:template match="text()" mode="replace-commas">
      <xsl:analyze-string select="." regex=",">
        <xsl:matching-substring><comma/></xsl:matching-substring>
        <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:template>

Then (at [G] above) process the new tbentry using grouping. Each group
begins with its <comma/> marker, so filter the marker out when copying:

    <xsl:for-each-group select="$temp/child::node()"
                        group-starting-with="comma">
      <entry><xsl:copy-of select="current-group()[not(self::comma)]"/></entry>
    </xsl:for-each-group>

Not tested!

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Ilya Lifshits [mailto:chehlo@xxxxxxxxx]
> Sent: 14 February 2008 22:38
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Parsing complex line (mixed text and markup)
>
> Hello experts,
>
> I'm using xslt 2.0 processor both saxon and and altova.
>
> I'm trying to parse complex line like:
> <tbentry>Some text, Some more text <xref linkend="somelink">
> even more text , , ,</tbentrys>
>
> and get following output :
>
> <row>
> <entry>Some text</entry>
> <entry>Some more text <xref
> linkend="ut_man_related_docs"> and even more text </entry> </row>
>
> Number of entries is not constant.
>
> I have easily find the solution of this without mixing the
> text and markup by using tokenize function.
> But failed to separate text and markup using this approach.
> Example can be found here : http://pastebin.com/m40fd204f
>
> To formalize the goal: I want to simplify life of our tech
> writes by creating wrappers on top of DocBook that will
> help transform from my defined syntax to standard Docbook code.
> So if there is another more appropriate way (which is not WYSIWYG
> editor) to achieve this, i can completely change the source line:
> <tblrow>Some text, Some more text <xref linkend="somelink">
> even more text </tblrow> as soon as it's still easy to write
> :) The only solution i found is pass linkend entry as an
> attribute to tblrow and another attribute which will specify
> the entry number.
> But this is very limited solution and will not allow me to
> use xref in 2 entries for example.
> Additional note, I'm absolutely newby in XML.
>
> Thanks in advance,
> Ilya.
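
For reference, approach (a) from the reply can be assembled into one
stylesheet roughly as follows. This is only a sketch and has not been run
against Saxon or Altova. It assumes the input is well-formed (the xref
written as <xref linkend="somelink"/> and the element closed with
</tbentry>), that commas only occur in text nodes that are direct children
of tbentry, and that entries which are empty or whitespace-only should be
dropped. The last assumption is inferred from the desired output in the
question, where the trailing ", , ," produces no entries; the <row> wrapper
is likewise taken from that desired output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="tbentry">
    <!-- Pass 1: temporary tree in which each comma becomes a <comma/> marker. -->
    <xsl:variable name="temp">
      <xsl:apply-templates mode="replace-commas"/>
    </xsl:variable>
    <row>
      <!-- Pass 2: each <comma/> marker starts a new group of sibling nodes. -->
      <xsl:for-each-group select="$temp/node()" group-starting-with="comma">
        <!-- The marker is part of its group, so filter it out. -->
        <xsl:variable name="content"
                      select="current-group()[not(self::comma)]"/>
        <!-- Assumption: emit an entry only if the group holds an element
             or some non-whitespace text. -->
        <xsl:if test="$content[self::* or normalize-space(.)]">
          <entry><xsl:copy-of select="$content"/></entry>
        </xsl:if>
      </xsl:for-each-group>
    </row>
  </xsl:template>

  <!-- Child elements such as xref are copied through unchanged. -->
  <xsl:template match="*" mode="replace-commas">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Text nodes are split on commas; each comma becomes <comma/>. -->
  <xsl:template match="text()" mode="replace-commas">
    <xsl:analyze-string select="." regex=",">
      <xsl:matching-substring><comma/></xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With the sample line from the question, this should give one entry for
"Some text" and a second entry holding the remaining text together with the
xref. Leading and trailing spaces inside each entry are kept as they appear
in the source; trimming them would need an extra step.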