Allow insert xml fragment that is invalid

MikeH
Posts: 5
Joined: Tue Apr 19, 2022 1:28 pm

Post by MikeH »

Hi,
I am copying an <iframe> tag from a site that does not format the tag correctly: it has some dubious attributes.
The editor will not allow the paste because of these minor attribute issues in the copied tag.
So I am trying to insert it with JavaScript instead:

authorAccess.getDocumentController().insertXMLFragment(myFrag, caretOffset);
But that won't let me insert the content without first fixing the XML.
The XML is a string taken from the clipboard.
If I try to convert it to a fragment, I get the same error because the XML is invalid.

It would be great to be able to force this particular insert without losing the protection against other invalid inserts.
Is this possible?
Even the ability to convert the string to a fragment of invalid XML would at least give me a chance to fix the fragment before I insert it into the document.
With a fragment, I can fix the attributes more easily than by pattern-matching on a string.

Thanks
Radu
Posts: 9059
Joined: Fri Jul 09, 2004 5:18 pm

Re: Allow insert xml fragment that is invalid

Post by Radu »

Hi,

The internal nodes model in the Author editor mode can contain only well-formed XML content. An AuthorDocumentFragment can likewise be created only from well-formed XML content. There is no workaround for this, because the content you are pasting may be broken in any number of ways.
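
To illustrate what "well-formed" means here, a minimal sketch that checks a clipboard string before any insert is attempted. It uses only the XML parser that ships with the JDK, not any oXygen API. The class name WellFormedCheck, the method isWellFormedFragment and the <wrapper> element are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the oXygen API.
final class WellFormedCheck {

  // Returns true if the text parses as well-formed XML content.
  // The text is wrapped in a dummy root element so that fragments with
  // several top-level nodes (or plain text) can be checked too.
  // Limitation: a fragment that closes the dummy root itself would fool this check.
  static boolean isWellFormedFragment(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // DefaultHandler rethrows fatal errors and keeps the parser from logging to stderr.
      builder.setErrorHandler(new DefaultHandler());
      builder.parse(new InputSource(new StringReader("<wrapper>" + xml + "</wrapper>")));
      return true;
    } catch (SAXException e) {
      // Any well-formedness problem surfaces as a SAXException.
      return false;
    } catch (ParserConfigurationException | IOException e) {
      // Not expected with the default factory and an in-memory string.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Unquoted attribute value and no closing tag: not well-formed.
    System.out.println(isWellFormedFragment("<iframe href=def>"));
    // Quoted attribute value and a closing tag: well-formed.
    System.out.println(isWellFormedFragment("<iframe src=\"a\"></iframe>"));
  }
}
```

A check like this only reports whether the string is well-formed. It does not repair anything, so content that fails it still has to be cleaned up before it can be inserted.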

There are a number of Java libraries which can take HTML content that is not well-formed and convert it to XHTML.
For example, Oxygen already comes bundled with the NekoHTML library (it is in the "lib" folder).
Java code that uses this library directly would look like this:

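import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;
import org.apache.xml.serializer.DOMSerializer;
import org.apache.xml.serializer.Serializer;
import org.apache.xml.serializer.SerializerFactory;
import org.cyberneko.html.parsers.DOMFragmentParser;
import org.w3c.dom.DocumentFragment;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class HtmlToXhtml {
  public static void main(String[] args) throws SAXException, IOException {
    // Parse the possibly not well-formed HTML into a DOM fragment, with lower-case element names.
    DOMFragmentParser parser = new DOMFragmentParser();
    parser.setProperty("http://cyberneko.org/html/properties/names/elems", "lower");
    org.apache.xerces.dom.DocumentImpl document = new org.apache.xerces.dom.DocumentImpl();
    DocumentFragment fragment = document.createDocumentFragment();
    parser.parse(new InputSource(new StringReader("<iframe href=def>")), fragment);
    // Serialize the DOM fragment back to a string as XML, without declaration or indentation.
    Properties props = new Properties();
    props.put("method", "xml");
    props.put("omit-xml-declaration", "yes");
    props.put("indent", "no");
    Serializer serializer = SerializerFactory.getSerializer(props);
    StringWriter sw = new StringWriter();
    serializer.setWriter(sw);
    DOMSerializer domSerializer = serializer.asDOMSerializer();
    domSerializer.serialize(fragment);
    System.err.println(sw.toString());
  }
}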
This could be converted to its JavaScript equivalent, so you could create a JavaScript function which receives an HTML string that may not be well-formed and returns a well-formed XHTML string.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com