Re: [xsl] Xpath Syntax Issue
Subject: Re: [xsl] Xpath Syntax Issue
From: Nathan Tallman <ntallman@xxxxxxxxx>
Date: Sun, 24 Jun 2012 12:26:25 -0400
Sorry, here's my XSLT (remove.xsl):

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:s="http://www.sitemaps.org/schemas/sitemap/0.9"
    exclude-result-prefixes="s">

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Standard copy -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="s:urlset/s:url[normalize-space(s:loc) = 'URL']"/>

</xsl:stylesheet>

XML Snippet (sitemap1.xml):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="
        http://www.sitemaps.org/schemas/sitemap/0.9
        http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>URL</loc>
    <lastmod>2012-06-23T13:37:27+00:00</lastmod>
    <changefreq>monthly</changefreq>
    <priority>1.0</priority>
  </url>
  ....
</urlset>

Command used in Linux:

xsltproc -o sitemapb.xml remove.xsl sitemap1.xml

(In case anyone is wondering why I want to remove URLs from a sitemap,
there are a few pages generated by a script, purely for crawling
reasons, as the pages don't crawl well otherwise. The sitemap feeds the
indexing engine for our website and I don't want these artificial pages
cluttering up search results. So after the sitemap is generated, I want
to run this XSLT to remove the URLs before the indexer starts.)

Thanks,
Nathan

On Sun, Jun 24, 2012 at 11:31 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>
> On 24/06/2012 15:35, Nathan Tallman wrote:
>>
>> Is there any reason why this transformation works in Oxygen, using
>> Saxon and xsltproc, yet doesn't work from the Linux command line using
>> xsltproc? When running from the command line, all the attributes from
>> urlset are removed, but the unwanted URLs remain.
>
> I for one haven't followed this thread in detail, so I'm not sure what
> "this transformation" refers to.
>
> Michael Kay
> Saxonica
>
>> On Sat, Jun 23, 2012 at 10:56 PM, Nathan Tallman <ntallman@xxxxxxxxx>
>> wrote:
>>>
>>> Thanks Chris. I had just found this explanation on
>>> <http://stackoverflow.com/questions/3836121/xslt-does-not-work-when-i-include-xmlns-http-www-sitemaps-org-schemas-sitemap>
>>> when your email came in. This takes care of it.
>>>
>>> Much appreciation.
>>> Nathan
>>>
>>> On Sat, Jun 23, 2012 at 10:51 PM, Christopher R. Maden <crism@xxxxxxxxx>
>>> wrote:
>>>>
>>>> On 06/23/2012 10:38 PM, Nathan Tallman wrote:
>>>>>
>>>>> I still wasn't getting the results in my application, so I created
>>>>> pets.xml and sure enough the template worked. It only works with
>>>>> my original document if I remove attributes found in the root
>>>>> element.
>>>>>
>>>>> The original first 6 lines:
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>> <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
>>>>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>> xsi:schemaLocation=" http://www.sitemaps.org/schemas/sitemap/0.9
>>>>> http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
>>>>>
>>>>> I had to remove all attributes from <urlset> before the XSL would
>>>>> work. Do I need to reference the schema in my XSL?
>>>>
>>>> Ahh... the good ol namespace FAQ.
>>>>
>>>> Every element type name is a pair: namespace URI and local name.
>>>>
>>>> What you thought was null-namespace plus species is in fact
>>>> http://www.sitemaps.org/schemas/sitemap/0.9 plus species (often
>>>> written as {http://www.sitemaps.org/schemas/sitemap/0.9}species). An
>>>> XPath expression matching just species matches {}species, which is a
>>>> *different name* than
>>>> {http://www.sitemaps.org/schemas/sitemap/0.9}species.
>>>>
>>>> You need, in your XSLT, to declare something like
>>>> xmlns:sitemap="http://www.sitemaps.org/schemas/sitemap/0.9" and then
>>>> use sitemap:species in your XPath. (A shorter prefix might be in
>>>> order, but a prefix is required for XSLT 1.0 and recommended (IMO) for
>>>> clarity for XSLT 2.0.)
>>>>
>>>> ~Chris
>>>> --
>>>> Chris Maden, text nerd <URL: http://crism.maden.org/>
>>>> LIVE FREE: vote for Gary Johnson, Libertarian for President.
>>>> <URL: http://garyjohnson2012.com/> <URL: http://lp.org/>
>>>> GnuPG fingerprint: DB08 CF6C 2583 7F55 3BE9 A210 4A51 DBAC 5C5C 3D5E
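
Chris's point that an element name is a (namespace URI, local name) pair can be checked outside XSLT as well. The following is only a minimal sketch in Python's standard xml.etree.ElementTree module, which nobody in the thread used; SAMPLE is a trimmed stand-in for sitemap1.xml, the second <url> entry (example.org) is made up, and remove_urls is a hypothetical helper that roughly mirrors the empty template in remove.xsl. It illustrates the namespace rule only and does not explain the difference in behaviour on the xsltproc command line.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NS = {"s": SITEMAP_NS}  # the prefix "s" is arbitrary; only the URI matters

# Trimmed stand-in for sitemap1.xml. The second <url> is invented for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>URL</loc><priority>1.0</priority></url>
  <url><loc>http://example.org/keep</loc><priority>0.5</priority></url>
</urlset>"""

root = ET.fromstring(SAMPLE)

# ElementTree stores each name in the {uri}local form that Chris mentions.
print(root.tag)

# An unqualified name looks for {}url, which is a different name, so nothing matches.
unqualified = root.findall("url")

# A prefixed name bound to the sitemap URI matches both <url> elements.
qualified = root.findall("s:url", NS)


def remove_urls(urlset, unwanted):
    """Drop every s:url child whose whitespace-normalized s:loc equals `unwanted`.

    Hypothetical helper, roughly equivalent to the empty template
    match="s:urlset/s:url[normalize-space(s:loc) = 'URL']".
    Returns the number of elements removed.
    """
    removed = 0
    for url in urlset.findall("s:url", NS):  # findall returns a list, safe to mutate
        loc = url.findtext("s:loc", default="", namespaces=NS)
        if " ".join(loc.split()) == unwanted:  # same effect as normalize-space()
            urlset.remove(url)
            removed += 1
    return removed
```

Calling remove_urls(root, "URL") on the sample leaves only the example.org entry in place. As in the stylesheet, the match depends on the prefix being bound to the sitemap namespace URI, not on the prefix string itself.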