
Count words

Posted: Thu Mar 23, 2023 3:22 pm
by shilpa
Hi Team,

I would like to know how many words there are in an XML document.
Is there a shortcut key or any other method to check the word count?
I am using Oxygen XML Web Author.

Please do the needful.

Thanks & Regards
Shilpa.P

Re: Count words

Posted: Thu Mar 23, 2023 7:18 pm
by cristi_talau
Hello,
The action you mentioned is not available in Web Author out of the box. However, it can be implemented in a plugin. The main steps are the following:
1. Implement a custom action: https://www.oxygenxml.com/maven/com/oxy ... ction.html
2. Count the number of words using an AuthorOperationWithResult: https://www.oxygenxml.com/maven/com/oxy ... esult.html
3. Present the result in a dialog: https://www.oxygenxml.com/maven/com/oxy ... ialog.html
We have a sample with a similar functionality here: https://github.com/oxygenxml/web-author ... slt-report .
Best,
Cristian

Re: Count words

Posted: Fri Mar 24, 2023 12:36 pm
by shilpa
Hi Cristian,

Thank you for the response.
I have already implemented the custom action in a plugin.
What I want to know is whether there is an API to count the number of words in the document.
Or is there a way to get the number of words in the document on the JS side?

Regards
Shilpa.P

Re: Count words

Posted: Fri Mar 24, 2023 1:15 pm
by cristi_talau
Hello,
In the JS API you can use the AuthorEditingSupport.getDocument() [1] method to get a reference to a Document node in the XML DOM. Then you can use standard DOM methods to obtain the textContent of the root element.
Best,
Cristian
[1] https://www.oxygenxml.com/maven/com/oxy ... nt__anchor
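
As an illustration of the step that follows once the text content is available, here is a minimal sketch of counting whitespace-separated words in a string. The helper name countWords is hypothetical, and a plain string stands in for the text obtained from the document node. It assumes words are separated by whitespace, which does not hold for every language.

```javascript
// Hypothetical helper (not part of the Web Author API):
// counts whitespace-separated words in a string.
function countWords(text) {
  var trimmed = text.trim();
  if (trimmed === '') {
    return 0; // no words in an empty or whitespace-only string
  }
  // Split on runs of whitespace so repeated spaces, tabs and
  // newlines do not produce empty entries.
  return trimmed.split(/\s+/).length;
}

// Stand-in for the text content of the root element.
var sampleText = '  Testing   shortcut key\n';
console.log(countWords(sampleText)); // prints 3
```

Splitting on a single space character instead of /\s+/ would count empty strings whenever the text contains consecutive spaces or line breaks.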

Re: Count words

Posted: Mon Mar 27, 2023 2:23 pm
by shilpa
Thanks for the update, Cristian.

Now I am able to get the word count, but it is not giving me the correct count.

Please find the example below.
[Attachment: image.png (18.06 KiB)]
The code I used is below.
var xmlDoc = jaEditor.getEditingSupport().getDocument().getElementsByTagName('n-load')[0];
var textContent = xmlDoc.getTextContent();
// The value of textContent is 'Testing shortcut keyTesting word count'
var wordArr = textContent.split(' ');
// Now the length of wordArr is 5.

Here I expect the word count to be 6.
The text content has no space between the last word of one element and the first word of the next, so "keyTesting" is counted as a single word.
They should be counted as two different words.

Please suggest how to handle this.

Re: Count words

Posted: Mon Mar 27, 2023 3:02 pm
by cristi_talau
Hello,

The code you wrote does not take into account whether elements are "block" or "inline". To improve the behavior, you need to traverse the entire XML DOM and take into account the element boundaries when you count the words. You can use node.childNodes to get the children of a DOM node.
Note that word splitting is not a trivial task in general. Some Asian languages do not use spaces between words.

Best,
Cristian
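
The traversal described above could look roughly like the sketch below. It is only an illustration under several assumptions: the list of inline element names is made up and would have to match your own schema, the element name 'p' is hypothetical, and small plain objects stand in for DOM nodes so the example can run outside the editor. It relies only on the standard DOM properties nodeType, nodeName, nodeValue and childNodes, and it still assumes that words are separated by whitespace.

```javascript
// Assumption: names of elements treated as inline (they do not end a word).
// Adjust this list to your own document type.
var INLINE_ELEMENTS = ['b', 'i', 'ph'];

// Walks the tree and collects text, adding a space around every
// element that is not inline so adjacent blocks do not merge words.
function collectText(node, parts) {
  if (node.nodeType === 3) { // 3 = text node
    parts.push(node.nodeValue);
    return;
  }
  var isBlock = node.nodeType === 1 && // 1 = element node
      INLINE_ELEMENTS.indexOf(node.nodeName) === -1;
  if (isBlock) {
    parts.push(' ');
  }
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    collectText(children[i], parts);
  }
  if (isBlock) {
    parts.push(' ');
  }
}

// Hypothetical helper: counts whitespace-separated words under a node.
function countWordsInNode(node) {
  var parts = [];
  collectText(node, parts);
  var text = parts.join('').trim();
  return text === '' ? 0 : text.split(/\s+/).length;
}

// Minimal stand-ins for DOM nodes, used only for this example.
function el(name, children) {
  return {nodeType: 1, nodeName: name, childNodes: children};
}
function txt(value) {
  return {nodeType: 3, nodeName: '#text', nodeValue: value, childNodes: []};
}

var root = el('n-load', [
  el('p', [txt('Testing shortcut key')]),
  el('p', [txt('Testing w'), el('b', [txt('or')]), txt('d count')])
]);
console.log(countWordsInNode(root)); // prints 6
```

In the second paragraph the inline element in the middle of "word" does not split it, while the boundary between the two block elements separates "key" from "Testing".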

Re: Count words

Posted: Tue Mar 28, 2023 9:43 am
by mihaela
Hello,

There is a specific operation called ro.sync.ecss.extensions.commons.operations.text.CountWordsOperation that was designed for Oxygen XML Editor but can also work in Web Author.
Try it and see whether it fits your use case.

Best Regards,
Mihaela
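
A rough sketch of invoking this operation from the JS side is shown below. It assumes that the actions manager (obtained in Web Author from editor.getActionsManager()) offers invokeOperation(className, params, callback) and that the callback receives an error and a result. Check both against the JS API documentation for your version. The thread does not say what the operation returns or which parameters it accepts, so an empty parameter object is used and a fake actions manager stands in so the example can run outside the editor.

```javascript
// Class name taken from the post above.
var COUNT_WORDS_OPERATION =
    'ro.sync.ecss.extensions.commons.operations.text.CountWordsOperation';

// Hypothetical wrapper. Assumption: actionsManager.invokeOperation takes
// (className, params, callback) and calls callback(err, result).
function invokeCountWords(actionsManager, callback) {
  actionsManager.invokeOperation(COUNT_WORDS_OPERATION, {}, callback);
}

// Stand-in for the real actions manager, used only for this example.
var fakeActionsManager = {
  invokeOperation: function(className, params, callback) {
    callback(null, 'invoked ' + className);
  }
};

invokeCountWords(fakeActionsManager, function(err, result) {
  console.log(err ? 'Operation failed' : result);
});
```

In a plugin, the real actions manager would be passed in place of fakeActionsManager.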