Count words

shilpa
Posts: 68
Joined: Mon Jul 04, 2022 8:42 am

Post by shilpa »

Hi Team,

I would like to know how many words there are in an XML document.
Is there a shortcut key or another way to check the word count?
I am using Oxygen XML Web Author.

Please advise.

Thanks & Regards
Shilpa.P
cristi_talau
Posts: 511
Joined: Thu Sep 04, 2014 4:22 pm

Re: Count words

Post by cristi_talau »

Hello,
The action you mentioned is not available in Web Author out of the box. However, it can be implemented in a plugin. The main steps are the following:
1. Implement a custom action: https://www.oxygenxml.com/maven/com/oxy ... ction.html
2. Count the number of words using an AuthorOperationWithResult: https://www.oxygenxml.com/maven/com/oxy ... esult.html
3. Present the result in a dialog: https://www.oxygenxml.com/maven/com/oxy ... ialog.html
We have a sample with similar functionality here: https://github.com/oxygenxml/web-author ... slt-report .
Best,
Cristian
shilpa
Posts: 68
Joined: Mon Jul 04, 2022 8:42 am

Re: Count words

Post by shilpa »

Hi Cristian,

Thank you for the response.
I have already implemented a custom action in a plugin.
What I want to know is whether there is an API to count the number of words in the document.
Or is there a way to do this on the JS side?

Regards
Shilpa.P
cristi_talau
Posts: 511
Joined: Thu Sep 04, 2014 4:22 pm

Re: Count words

Post by cristi_talau »

Hello,
In the JS API you can use the AuthorEditingSupport.getDocument() [1] method to get a reference to a Document node in the XML DOM. Then you can use standard DOM methods to obtain the textContent of the root element.
Best,
Cristian
[1] https://www.oxygenxml.com/maven/com/oxy ... nt__anchor
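
As an illustration of the suggestion above, here is a minimal sketch of the counting step once the text content is in hand. The function name countWords is hypothetical and not part of the Web Author API. It splits on runs of any whitespace (spaces, tabs, newlines) instead of a single space, so indentation in the XML does not inflate the count.

```javascript
// Hypothetical helper, not part of the Web Author API.
// Counts whitespace-separated words in a string.
function countWords(text) {
  // Split on runs of whitespace; filter(Boolean) drops the empty
  // strings produced by leading or trailing whitespace.
  return text.split(/\s+/).filter(Boolean).length;
}

// Plain strings are used here so the sketch runs outside Web Author.
// In a plugin, the input would be the text content obtained from the
// document as described above.
console.log(countWords('  Testing   shortcut\nkey '));
```

This only treats whitespace as a word separator, so text from adjacent elements with no whitespace between them is still counted as one word.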
shilpa
Posts: 68
Joined: Mon Jul 04, 2022 8:42 am

Re: Count words

Post by shilpa »

Thanks for the update, Cristian.

I am now able to get a word count, but the count is not correct.

Please see the example below.
[Attachment: image.png]
The code I used is below.
var xmlDoc = jaEditor.getEditingSupport().getDocument().getElementsByTagName('n-load')[0];
var textContent = xmlDoc.getTextContent();
// The value of textContent is 'Testing shortcut keyTesting word count'.
var wordArr = textContent.split(' ');
// The length of wordArr is now 5.

I expected the word count to be 6.
The text content has no space between the last word of one element and the first word of the next, so 'keyTesting' is counted as a single word.
It should be counted as two separate words.

Please advise.
cristi_talau
Posts: 511
Joined: Thu Sep 04, 2014 4:22 pm

Re: Count words

Post by cristi_talau »

Hello,

The code you wrote does not take into account whether elements are "block" or "inline". To improve the behavior, you need to traverse the entire XML DOM and take into account the element boundaries when you count the words. You can use node.childNodes to get the children of a DOM node.
Note that word splitting is not a trivial task in general. For example, some Asian languages do not use spaces between words.

Best,
Cristian
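
The traversal described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Web Author implementation: it assumes nodes expose the standard DOM properties nodeType, nodeName, nodeValue and childNodes; the INLINE_ELEMENTS list and the element names p and b are hypothetical and depend on the document type; and the plain-object nodes at the end are stand-ins so the sketch runs outside the browser.

```javascript
// Hypothetical list of inline element names; every other element is
// treated as a block, so its boundaries separate words.
var INLINE_ELEMENTS = { b: true, i: true };

// Walk the tree and collect text, adding a space before and after
// the content of each block element.
function collectText(node, parts) {
  if (node.nodeType === 3) {
    // Text node: keep its content as is.
    parts.push(node.nodeValue);
  } else if (node.nodeType === 1 || node.nodeType === 9) {
    // Element or document node: recurse into the children.
    var isBlock = node.nodeType === 1 && !INLINE_ELEMENTS[node.nodeName];
    if (isBlock) { parts.push(' '); }
    for (var i = 0; i < node.childNodes.length; i++) {
      collectText(node.childNodes[i], parts);
    }
    if (isBlock) { parts.push(' '); }
  }
  return parts;
}

// Count whitespace-separated words under a node.
function countWordsInNode(node) {
  return collectText(node, []).join('').split(/\s+/).filter(Boolean).length;
}

// Plain-object stand-ins for DOM nodes (assumption: real nodes expose
// the same properties).
function text(value) {
  return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function element(name, children) {
  return { nodeType: 1, nodeName: name, childNodes: children };
}

var sample = element('n-load', [
  element('p', [text('Testing shortcut key')]),
  element('p', [text('Testing '), element('b', [text('wo')]), text('rd count')])
]);

console.log(countWordsInNode(sample)); // prints 6
```

Because the inline b element adds no separator, "word" in the second paragraph is still counted once, while the boundary between the two p elements splits "key" and "Testing". Splitting on whitespace remains a simplification for languages that do not put spaces between words.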
mihaela
Posts: 500
Joined: Wed May 20, 2009 2:40 pm

Re: Count words

Post by mihaela »

Hello,

There is a specific operation called ro.sync.ecss.extensions.commons.operations.text.CountWordsOperation. It was designed for Oxygen XML Editor, but it can also work in Web Author.
Try it and see whether it suits your use case.

Best Regards,
Mihaela
Mihaela Calotescu
http://www.oxygenxml.com