Strip out HTML from element
Posted: Tue Feb 10, 2009 2:02 am
Hi
I've got a large XML file that I would like to convert to a CSV file.
One of the problems I'm running into is that one XML element contains a copy of an HTML email.
How do I grab just the text? Or strip away the HTML code that comes before, say, the body tag?
Code:
<?xml version="1.0" encoding="utf-8" ?><RegistrantExport>
<Registrant><ProjectID>33</ProjectID><RegistrantNo>6</RegistrantNo><RatingID>132</RatingID><SourceTypeID>260</SourceTypeID><SecondarySourceTypeID></SecondarySourceTypeID><Status>Normal</Status><ExcludeFromTraffic>No</ExcludeFromTraffic><RegistrationDate>2005-09-04</RegistrationDate><EnteredBy></EnteredBy><LastContactDate></LastContactDate><LastContactType></LastContactType><PersonalID>1238599</PersonalID><City>Savona</City><Province>BC</Province><PostalCode>V0K 2J0</PostalCode><Country>Canada</Country><IsPrimary>1</IsPrimary></Address><IsPrimary>1</IsPrimary></Email></Emails><Questions><Question><Title>I heard about Tobiano from</Title><Answers><Answer>Other</Answer></Answers></Question></Questions><History><HistoryEntry><PersonalID>1238599</PersonalID><Project>Tobiano</Project><HistoryType>Mass Mail</HistoryType><SalesRep>Andrew Karpiak</SalesRep><Date>2008-12-08 12:51:06</Date><Subject><![CDATA[Tobiano - wins again!]]></Subject><Body><![CDATA[<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Tobiano | Live, rest, and play</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<style type="text/css"><!--
body {
margin-left: 0px;
margin-top: 0px;
margin-right: 0px;
margin-bottom: 0px;
background-color: #F6F5F0;
border-top-style: none;
border-right-style: none;
border-bottom-style: none;
border-left-style: none;
}
Thanks
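One possible approach, as a rough sketch rather than a definitive answer: since the HTML email sits inside a CDATA section, an XML parser hands it back as plain text, and that text can then be run through an HTML parser that keeps only the visible character data. The example below uses Python's standard library. The element names (Registrant, RegistrantNo, HistoryEntry, Subject, Body) are taken from the sample above; the helper names `TextExtractor`, `html_to_text` and `registrants_to_csv`, the choice of CSV columns, and the shortened `SAMPLE` document (including its made-up body text) are assumptions for illustration only.

```python
import csv
import io
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects character data, skipping anything inside head/style/script."""

    SKIP = {"head", "style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(markup):
    """Return the text of an HTML fragment with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(markup or "")
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def registrants_to_csv(xml_text, out):
    """Write one CSV row per HistoryEntry: registrant number, subject, body text."""
    root = ET.fromstring(xml_text)
    writer = csv.writer(out)
    writer.writerow(["RegistrantNo", "Subject", "BodyText"])
    for registrant in root.iter("Registrant"):
        number = registrant.findtext("RegistrantNo", default="")
        for entry in registrant.iter("HistoryEntry"):
            subject = entry.findtext("Subject", default="")
            body = html_to_text(entry.findtext("Body"))
            writer.writerow([number, subject, body])


# Hypothetical, shortened, well-formed stand-in for the export shown above.
SAMPLE = """<RegistrantExport><Registrant><RegistrantNo>6</RegistrantNo>
<History><HistoryEntry><Subject><![CDATA[Tobiano - wins again!]]></Subject>
<Body><![CDATA[<?xml version="1.0" encoding="iso-8859-1"?>
<html><head><title>Tobiano | Live, rest, and play</title>
<style type="text/css"><!--
body { margin-left: 0px; }
--></style></head>
<body><p>Hello &amp; welcome</p></body></html>]]></Body>
</HistoryEntry></History></Registrant></RegistrantExport>"""

if __name__ == "__main__":
    buffer = io.StringIO()
    registrants_to_csv(SAMPLE, buffer)
    print(buffer.getvalue())
```

For a real export you would pass the file contents instead of `SAMPLE` and open the output file with `newline=""`, as the `csv` module recommends. Note that the sample pasted above has closing tags such as `</Address>` and `</Email>` with no matching opening tags, so it would need to be well-formed in the actual file before any XML parser will accept it. The sketch also puts a space between every text chunk, which can split words that span inline tags.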