Implementing and extending autocorrect functionality
Posted: Mon Jun 01, 2015 4:04 pm
I'm trying to implement an autocorrect feature in Oxygen 16.1 because an upgrade to 17 is not possible at this time. To facilitate parts of it, I'm looking for ways to reuse the existing spellchecking (16.1) or autocorrect (17) mechanisms, but I haven't found any interfaces in the API that let me intervene before or after them.
The intended functionality is as follows:
* correct words normally but enforce certain casing (e.g. change 'someword' to 'Some WORD')
* surround the word with XML if it is not already present (e.g. change 'someword' to '<tag>Some WORD</tag>', but change '<tag>someword</tag>' only to '<tag>Some WORD</tag>' without duplicating the XML)
* offer an option to autocorrect a whole document in the same way
Currently I'm extending the Oxygen 16.1 Eclipse plugin by using the AuthorDocumentFilter to autocorrect the words manually and then checking the parent node to determine when to add XML. The operation for the whole document does much the same but uses a TextContentIterator to walk over everything.
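To make the intended behaviour concrete, here is a minimal sketch in plain Java of the word-level rule described above. It deliberately leaves out the Oxygen API: the class name AutoCorrectSketch, the CORRECTIONS map and the alreadyWrapped flag are hypothetical names used only for illustration. In the plugin, the flag would come from checking the parent node, and 'tag' is just the placeholder element name from the example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; it does not use any Oxygen classes.
final class AutoCorrectSketch {

    // Hypothetical dictionary: lower-cased input -> enforced spelling and casing.
    private static final Map<String, String> CORRECTIONS = new HashMap<>();
    static {
        CORRECTIONS.put("someword", "Some WORD");
    }

    // Placeholder element name taken from the example in the list above.
    private static final String TAG = "tag";

    /**
     * Returns the replacement text for a single word.
     *
     * @param word           the word as typed or found in the document
     * @param alreadyWrapped true if the word's parent element is already TAG
     *                       (assumed to be determined by the caller)
     */
    static String correct(String word, boolean alreadyWrapped) {
        String fixed = CORRECTIONS.get(word.toLowerCase(Locale.ROOT));
        if (fixed == null) {
            return word; // not in the dictionary: leave untouched
        }
        if (alreadyWrapped) {
            return fixed; // only fix the casing, do not duplicate the XML
        }
        return "<" + TAG + ">" + fixed + "</" + TAG + ">";
    }
}
```

With this sketch, correct("someword", false) yields the wrapped form, while correct("someword", true) only fixes the casing.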
My 3 questions:
1. Would it be possible to use the integrated spellchecking functions of Oxygen 16.1 to load a custom dictionary and handle the words, or do I need to add another library like Lucene to do that myself?
2. Once an update to Oxygen 17 becomes possible, could the extension be simplified to reuse some of the normal Oxygen autocorrect functionality?
3. Adding a single surrounding XML fragment already takes about a second, so the operation for the whole document is quite slow if a lot of fragments need to be added. Any suggestions on how to improve this? Maybe a batch insert?
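To illustrate what I mean by a batch insert in question 3, here is a rough sketch on a plain string rather than on the Author document. It collects all replacements first and then applies them from the last offset to the first, so earlier offsets stay valid. The names BatchReplaceSketch, Edit and applyAll are hypothetical, the edits are assumed not to overlap, and whether the Oxygen API offers an equivalent way to group such changes is exactly what I'm unsure about.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration on a plain String; no Oxygen classes involved.
final class BatchReplaceSketch {

    // One pending replacement: text[start, end) becomes 'replacement'.
    static final class Edit {
        final int start;
        final int end;
        final String replacement;

        Edit(int start, int end, String replacement) {
            this.start = start;
            this.end = end;
            this.replacement = replacement;
        }
    }

    /**
     * Applies all edits in one pass. Edits are assumed not to overlap.
     * Working from the highest offset down means that applying one edit
     * never shifts the offsets of the edits still to be applied.
     */
    static String applyAll(String text, List<Edit> edits) {
        List<Edit> sorted = new ArrayList<>(edits);
        sorted.sort((a, b) -> Integer.compare(b.start, a.start)); // descending
        StringBuilder sb = new StringBuilder(text);
        for (Edit e : sorted) {
            sb.replace(e.start, e.end, e.replacement);
        }
        return sb.toString();
    }
}
```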