advanced find and replace issue

Post by ziborium » Sat Feb 26, 2011 4:33 pm

Dear friends, as a newcomer to scripting I would be very thankful for any help with the following issue.

I have a big Word file of about 1100 pages in Chinese, a digitized encyclopedia from around 1900 AD. As a first step, this file has been converted into an XML file to make it full-text searchable for a university project. During the conversion, the wavy underlines under key terms were lost. My task now is to search for these terms in the Word file, locate the same term in the XML file and wrap it in a <title></title> tag. To save (a lot of) time I thought of automating this workflow.
I guess I need an external tool to do this, i.e. search the Word file for the term, find it in the XML file and wrap it in the title tag. Or is there a way to do this with Oxygen?

The tricky thing is this: some terms appear several times in different contexts, but are underlined only once or twice. I therefore need to formulate a query that takes some of the surrounding Chinese characters into account in order to identify exactly which occurrence to wrap.

I would appreciate any help very much, even if it is only a hint on how to get started.

Re: advanced find and replace issue

Post by sorin » Fri Mar 11, 2011 4:33 pm

Hello,

Just open the XML file in Oxygen and use the Find/Replace dialog box to locate all occurrences of the terms and surround them with the title element tags. The Text to find and Replace with boxes accept Unicode text, so you can enter Chinese characters. For example, to replace all occurrences of termToReplace with <title>termToReplace</title>, type termToReplace in the Text to find box and <title>termToReplace</title> in the Replace with box.
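
If the replacement ever has to be scripted outside Oxygen, the same idea can be sketched in Python. This is only a minimal sketch and not an Oxygen feature: the function name wrap_term, its before/after parameters and the sample sentence are made up for illustration. It wraps a term in title tags only where the term is directly preceded and/or followed by the given context characters, which is one way to tell apart several occurrences of the same term.

```python
import re

def wrap_term(text, term, before="", after=""):
    """Wrap `term` in <title> tags, but only where it is directly
    preceded by `before` and directly followed by `after`.
    Empty context strings mean "no restriction on that side"."""
    pattern = re.escape(term)
    if before:
        # Lookbehind: the context must be there but is not part of the match.
        pattern = "(?<=" + re.escape(before) + ")" + pattern
    if after:
        # Lookahead: same idea for the characters that follow the term.
        pattern = pattern + "(?=" + re.escape(after) + ")"
    return re.sub(pattern, lambda m: "<title>" + m.group(0) + "</title>", text)

# Hypothetical sample: the term occurs twice, only the second one is wanted.
sample = "<p>讀論語者多，注論語者少。</p>"
print(wrap_term(sample, "論語", before="注"))
```

Note that this works on the raw text of the file, so it would also match a term that happens to occur inside a tag name or an attribute value. Check the result (for example by re-opening the file in Oxygen and validating it) before keeping it.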

You deal with the tricky part by formulating an XPath expression that restricts the scope of the find/replace to the context(s) where you want to do the surrounding. The XPath expression must match only the XML elements or attributes of the desired context(s) and is typed in the XPath box of the Find/Replace dialog. For example, if termToReplace appears in the elements para1 and para2 and in the subpara3 child elements of the para3 elements, but you want to surround it with title tags only in the para2 and subpara3 elements, type the following in the XPath box:

Code:
para2 | para3/subpara3
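
For comparison, here is a rough Python sketch of the same scoping idea using the standard xml.etree.ElementTree module. It is an illustration under assumptions, not what Oxygen does internally: the helper names wrap_in_element and wrap_scoped and the tiny sample document are invented. ElementTree's limited path syntax has no | union operator, so the two contexts are passed as a list of paths.

```python
import xml.etree.ElementTree as ET

def wrap_in_element(el, term):
    """Wrap each occurrence of `term` in el.text with a <title> child.
    Only the text before the first child element is handled; text that
    follows existing children (their tails) is left untouched."""
    parts = (el.text or "").split(term)
    if len(parts) < 2:
        return  # term does not occur in this element's leading text
    el.text = parts[0]
    # Insert the <title> children in order; each tail carries the text
    # that followed that occurrence of the term.
    for i, rest in enumerate(parts[1:]):
        title = ET.Element("title")
        title.text = term
        title.tail = rest
        el.insert(i, title)

def wrap_scoped(root, term, paths):
    """Apply the wrapping only to elements matched by one of `paths`."""
    for path in paths:
        for el in root.findall(path):
            wrap_in_element(el, term)

# Hypothetical sample mirroring the para1/para2/para3 example above.
doc = ET.fromstring(
    "<doc><para1>termToReplace</para1>"
    "<para2>see termToReplace here</para2>"
    "<para3><subpara3>termToReplace again</subpara3></para3></doc>"
)
wrap_scoped(doc, "termToReplace", [".//para2", ".//para3/subpara3"])
print(ET.tostring(doc, encoding="unicode"))
```

In this sketch para1 is left as it is, while the term inside para2 and para3/subpara3 ends up inside a title element.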



Regards,
Sorin