Re: [xsl] remove tags + CDATA tag out of big xml file


Subject: Re: [xsl] remove tags + CDATA tag out of big xml file
From: bw <bwakkie@xxxxxxxxx>
Date: Mon, 1 Feb 2010 15:06:59 +0100

Hi Michael,

This is exactly why I want to remove it ;-). I was even thinking about
using some fancy Perl one-liner to strip it out for now.
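
For reference, a minimal sketch of such a script, written in Python
rather than Perl. It assumes the feed fits in memory and that the fields
to clean are named <title> and <content>, as in the quoted message; the
<doc> wrapper in the usage note and the function names are hypothetical.
The XML parser hands CDATA content back as ordinary text, so the only
remaining job is to drop the HTML tags inside it.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextOnly(HTMLParser):
    """Collects character data and ignores every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(fragment):
    """Return the text of an HTML fragment with all tags removed."""
    parser = TextOnly()
    parser.feed(fragment)
    parser.close()
    # Collapse the whitespace runs left behind by removed tags.
    return " ".join("".join(parser.parts).split())


def clean_feed(xml_text, fields=("title", "content")):
    """Strip embedded markup from the named fields (assumed names)."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag in fields and elem.text:
            # CDATA arrives as plain text; it is written back escaped,
            # without a CDATA section.
            elem.text = strip_markup(elem.text)
    return ET.tostring(root, encoding="unicode")
```

Calling clean_feed() on
'<doc><content><![CDATA[<p>The <strong>keyword</strong></p>]]></content></doc>'
should give '<doc><content>The keyword</content></doc>'. For a really
big file, the same per-element step could probably be driven from
ET.iterparse() instead of loading the whole document at once.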

On 29/01/2010, Michael Ludwig <milu71@xxxxxx> wrote:
> bw wrote on 29.01.2010 at 12:02:10 (+0100):
>> Hello,
>>
>> I have a big XML feed out of my content management system that
>> includes WYSIWYG HTML tags inside CDATA sections.
>>
>> I am looking for a way to remove the CDATA and only get the text.
>
>>          <content><![CDATA[
>> <p>The <strong>keyword</strong> is nice to have but is not needed to
>> include in a solr feed</p> ...
>
> Looks like this feed is for Solr (an indexer), which won't do anything
> useful with the markup anyway. Someone has defined <title> and <content>
> as fields for the indexer but has forgotten to strip the markup from the
> source. That source markup in CDATA has no purpose in a feed for Solr
> and should not have been included in the first place.
>
> --
> Michael Ludwig
>
>


-- 
[Bb](astia{2}n)?\s?[Ww](ak{2}ie)?$
