Implementing and extending autocorrect functionality

ScandWe
Posts: 2

Implementing and extending autocorrect functionality

Mon Jun 01, 2015 4:04 pm

I'm trying to implement an autocorrect feature in Oxygen 16.1 because an upgrade to 17 is not possible at this time. To simplify parts of it, I'm looking for ways to reuse the existing spellchecking (16.1) or autocorrect (17) mechanisms, but I couldn't find any interfaces in the API that would let me intervene before or after them.

The intended functionality is as follows:
* correct words normally but enforce certain casing (e.g. change 'someword' to 'Some WORD')
* surround the word with XML if not already present (e.g. change 'someword' to '<tag>Some WORD</tag>', but only change '<tag>someword</tag>' to '<tag>Some WORD</tag>' without duplicating the XML)
* offer an option to autocorrect a whole document in the same way

Currently I'm extending the Oxygen 16.1 Eclipse plugin with an AuthorDocumentFilter that autocorrects the words manually and then checks the parent node to decide whether to add XML. The whole-document operation does much the same but uses a TextContentIterator to walk over all the text.
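
The two per-word rules described above can be sketched independently of the Oxygen API. This is only a minimal illustration: the class CasingCorrector, its dictionary, and the parentElementName parameter are hypothetical names, and in the real filter the parent element name would come from the Author node at the word's offset.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the Oxygen API.
final class CasingCorrector {
    // Lookup key (lower-cased, whitespace removed) -> enforced spelling.
    private final Map<String, String> canonical = new HashMap<>();

    void add(String enforcedSpelling) {
        canonical.put(key(enforcedSpelling), enforcedSpelling);
    }

    // The key ignores case and whitespace so that "someword" finds "Some WORD".
    private static String key(String s) {
        return s.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
    }

    // Returns the enforced spelling, or the word unchanged if it is unknown.
    String correct(String word) {
        String hit = canonical.get(key(word));
        return hit != null ? hit : word;
    }

    // Wraps the corrected word in <tag>...</tag> unless the parent element
    // already is that tag, so existing markup is not duplicated.
    String correctAndWrap(String word, String tag, String parentElementName) {
        String fixed = correct(word);
        if (tag.equals(parentElementName)) {
            return fixed;
        }
        return "<" + tag + ">" + fixed + "</" + tag + ">";
    }
}
```

Keeping this logic free of Author API calls makes it usable from both the filter and the whole-document pass.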

My 3 questions:

1. Would it be possible to somehow use the integrated spellchecking functions of Oxygen 16.1 to load a custom dictionary and handle the words or do I need to add another library like Lucene to do that myself?

2. When an update to Oxygen 17 becomes possible some time in the future, could the extension be simplified to reuse some of the normal Oxygen autocorrect functionality?

3. Adding a single surrounding XML fragment already takes about a second, so the whole-document operation is quite slow if a lot of fragments need to be added. Any suggestions on how to improve this? Maybe a batch insert?
alex_jitianu
Posts: 604

Re: Implementing and extending autocorrect functionality

Tue Jun 02, 2015 2:25 pm

Hello,

As far as I can tell you have started on the right path by using an AuthorDocumentFilter.

1. For spell checking we use Hunspell, but unfortunately there is no API that lets you use it. You could use a library like Lucene instead.
2. I'll add an issue to add some API on the auto-correct side, perhaps events fired when the support is triggered as well as after it has performed a replacement.
3. I suspect that the biggest part of the time is spent in computing the layout changes. If you want to process multiple fragments I suggest using these two API methods:

ro.sync.ecss.extensions.api.AuthorDocumentController.insertMultipleFragments(AuthorElement, AuthorDocumentFragment[], int[])
ro.sync.ecss.extensions.api.AuthorDocumentController.multipleDelete(AuthorElement, int[], int[])

By using these methods you will benefit from a single layout event after the operation is finished.
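
Both batch methods take parallel int arrays, so the offsets have to be collected first and applied in one call. Below is a rough sketch of that collection step over plain text. The class BatchOffsets and the baseOffset parameter are hypothetical, and treating the end offsets as inclusive is an assumption that should be checked against the Javadoc of multipleDelete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Oxygen API.
final class BatchOffsets {
    final int[] starts;
    final int[] ends; // assumed inclusive

    private BatchOffsets(int[] starts, int[] ends) {
        this.starts = starts;
        this.ends = ends;
    }

    // Finds every whole-word, case-insensitive occurrence of word in text.
    // baseOffset is where this text run begins in the document.
    static BatchOffsets collect(String text, String word, int baseOffset) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        List<int[]> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(new int[] { baseOffset + m.start(), baseOffset + m.end() - 1 });
        }
        int[] starts = new int[hits.size()];
        int[] ends = new int[hits.size()];
        for (int i = 0; i < hits.size(); i++) {
            starts[i] = hits.get(i)[0];
            ends[i] = hits.get(i)[1];
        }
        // The arrays would then be passed on in a single call, e.g.
        // controller.multipleDelete(parentElement, starts, ends);
        return new BatchOffsets(starts, ends);
    }
}
```

For example, collect("someword and someword", "someword", 10) yields starts {10, 23} and ends {17, 30}.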

We also like your idea of replacing words with XML fragments so I will add an issue to have this support built-in.

Best regards,
Alex
ScandWe
Posts: 2

Re: Implementing and extending autocorrect functionality

Wed Jun 03, 2015 12:06 pm

Thanks for your suggestions. It turns out that the repeated calls to DocumentController.surroundWithFragment() were what made the autocorrect so slow. It is much faster with insertMultipleFragments and multipleDelete. However, these methods introduced some other problems.

It is no longer possible to use undo to revert the autocorrections. I had used a compound edit to wrap all delete, insertText and surroundWithFragment calls into a single undo step, but this no longer seems to work with the multiple inserts. Can I somehow restore the undo functionality?

Is there a simple way to insert text in an AuthorDocumentFragment? Since I can only insert multiple fragments instead of surrounding some text with it, I have to fill the fragment with the desired text before inserting it into the document. Is there a simple way to insert text at the right position between the marker characters? So far, I'm just putting it in the center which obviously only works on a symmetrical fragment. Oxygen itself usually inserts text into the first leaf node. How do I get that position? The getContentNodes() method of the fragment only gets the fragment root with no way to reach its children.

My current solution for symmetrical fragments (e.g. <b><tm></tm></b>):

Code:

AuthorDocumentFragment fragment = authorAccess.getDocumentController().createNewDocumentFragmentInContext(surroundingXml, currentWordStartPosition);
fragment.getContent().insertChars(fragment.getLength()/2, word.toCharArray(), 0, word.toCharArray().length);
fragmentsToInsert.put(currentWordStartPosition, fragment);
alex_jitianu
Posts: 604

Re: Implementing and extending autocorrect functionality

Wed Jun 03, 2015 12:40 pm

Hello,

1. As long as you surround all the code like this, you should get just one UNDO step. This code executes from an AuthorDocumentFilter event, right?

Code:

AuthorDocumentController ctrl = ...;
ctrl.beginCompoundEdit();
try {
  // My code
} finally {
  ctrl.endCompoundEdit();
}


2. From the snippet I see a surroundingXml variable that seems to be bound to <b><tm></tm></b>. You could have a marker like this:

Code:

String surroundingXml = "<b><tm>{marker}</tm></b>";
String toInsert = surroundingXml.replace("{marker}", word);
AuthorDocumentFragment fragment = authorAccess.getDocumentController().createNewDocumentFragmentInContext(
    toInsert, currentWordStartPosition);
fragmentsToInsert.put(currentWordStartPosition, fragment);
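
One caveat with plain string substitution: the result is parsed as XML, so a word containing characters such as & or < would produce a malformed fragment. A small sketch of escaping the word before filling the template follows. FragmentTemplate and both of its methods are hypothetical helpers, not part of the Oxygen API.

```java
// Hypothetical helper, not part of the Oxygen API.
final class FragmentTemplate {
    // Escapes the characters that are significant in XML text content.
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Replaces the {marker} placeholder with the escaped word.
    static String fill(String template, String word) {
        return template.replace("{marker}", escapeXml(word));
    }
}
```

With this, FragmentTemplate.fill("<b><tm>{marker}</tm></b>", "AT&T") gives "<b><tm>AT&amp;T</tm></b>", which can then be handed to createNewDocumentFragmentInContext as in the snippet above.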


If for some reason you can't do that, AuthorDocumentFragment.getContentNodes() returns a List<AuthorNode>. Those AuthorNode(s) might be AuthorParentNode(s) which have a getContentNodes() method too. By iterating over the node hierarchy you can reach a leaf and use fragment.getContent().insertChars() at its offsets.
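
The descent to the first leaf can be sketched with a toy node type standing in for AuthorNode and AuthorParentNode. ToyNode and its fields are hypothetical. The sketch also assumes that each element occupies one marker character at its start offset, so the text position is one past the leaf's start offset; this should be verified against the real fragment offsets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for AuthorNode / AuthorParentNode, for illustration only.
final class ToyNode {
    final String name;
    final int startOffset;
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String name, int startOffset, ToyNode... kids) {
        this.name = name;
        this.startOffset = startOffset;
        this.children.addAll(Arrays.asList(kids));
    }

    // Follows the first child at every level until a node without children.
    static ToyNode firstLeaf(List<ToyNode> roots) {
        if (roots.isEmpty()) {
            return null;
        }
        ToyNode node = roots.get(0);
        while (!node.children.isEmpty()) {
            node = node.children.get(0);
        }
        return node;
    }

    // Assumed position just after the leaf's start marker character.
    static int insertionOffset(ToyNode leaf) {
        return leaf.startOffset + 1;
    }
}
```

For <b><tm></tm></b>, modelled as b at offset 0 containing tm at offset 1, the first leaf is tm and the insertion offset is 2, which matches the middle position used in the symmetric case.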

Best regards,
Alex
