Term Lists (.tdi) file usage

Post by amjc »

Hi, first post, love the product.
We're using a .tdi file, shared on a network drive, and have some questions:
* what are "forbidden" words?
* is there a comments syntax format for the .tdi?
* is the best practice to use the .tdi to collect exceptions to the standard dictionaries, but ultimately move the terms to a custom Hunspell dictionary? Maybe this would allow case-sensitivity?
Thanks, John

Re: Term Lists (.tdi) file usage

Post by Radu »

Hi John,
Thanks for the kind words. Please see some answers below:
We're using .tdi file, shared on a network drive, and have questions:
* what are "forbidden" words?
Usually the .tdi file keeps words that are not in the dictionary but that you do not want the spell checker to report as invalid.
Forbidden words are the other way around: they are words that are in the dictionary but that you want reported as invalid anyway.
For example, in technical documentation you may want only the present tense to be used, so you could mark "will" as forbidden as a primitive way to discourage the future tense.
Related to forbidden words, we also have a free terminology checker add-on that lets you define forbidden sequences of words. It also supports Vale rules and lets you define your own forbidden words in a special XML file format:
https://www.oxygenxml.com/doc/versions/ ... addon.html
For example, for the Oxygen user's guide we use the Microsoft Style Guide Vale rules, which flag various sequences of words that may be problematic:
https://github.com/oxygenxml/userguide/ ... rm-checker
* is there a comments syntax format for the .tdi?
No. The format is very simple: one entry per line, with no support for comments.
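Since the format has no comment syntax, one possible workaround is to keep a separate annotated source file and generate the .tdi file from it. The following is only a minimal sketch, not an Oxygen feature. It assumes the .tdi file is plain UTF-8 text with one entry per line (the encoding is an assumption), and the `#` whole-line comment convention and both function names are invented for illustration.

```python
from pathlib import Path


def strip_comments(annotated_text: str) -> list[str]:
    """Return term entries, dropping blank lines and whole-line '#' comments."""
    entries = []
    for raw_line in annotated_text.splitlines():
        line = raw_line.strip()
        # Only whole-line comments are recognized (hypothetical convention),
        # so a term that merely contains '#', such as "C#", is kept.
        if not line or line.startswith("#"):
            continue
        entries.append(line)
    return entries


def build_term_list(source: Path, target: Path) -> int:
    """Write the comment-free term list and return the number of entries."""
    # UTF-8 is an assumption; adjust if your .tdi file uses another encoding.
    entries = strip_comments(source.read_text(encoding="utf-8"))
    target.write_text("\n".join(entries) + "\n", encoding="utf-8")
    return len(entries)
```

The annotated file would be the one that writers edit and review, and the generated .tdi file would be the one that the term list location points to.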
* is the best practice to use the .tdi to collect exceptions to the standard dictionaries, but ultimately move the terms to a custom Hunspell dictionary?
We do not have such a best practice.
Maybe this would allow case-sensitivity?
Could you give me a small example about your current problem?
We have a remark here in our user's manual:
https://www.oxygenxml.com/doc/versions/ ... rp_bgk_54b
"When such problems are reported, they cannot be learned and ignored by the application as words stored in dictionaries, term lists, and the list of learned words are not handled as case-sensitive."
So it seems that, in general, we do not currently handle the words we check in a case-sensitive manner.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com

Re: Term Lists (.tdi) file usage

Post by chrispitude »

Hi John,

Your thought about accumulating learned words in the term list, then periodically moving them to a Hunspell dictionary, is an interesting one! I think I will implement this in our environment. To start, I will configure the term file location to be in our Oxygen/Git project (${pd}/...), so that writers can commit and push new learned words to the Git repository.

The nice thing about this approach is that the .tdi file serves as a "holding area" where words can be reviewed for correctness by senior writers first, then updated to handle case/prefixes/suffixes correctly as they are converted to Hunspell dictionary entries.
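The conversion step could be sketched roughly as follows. This is an illustration under assumptions, not a tested workflow: it assumes the reviewed terms have already been read from the .tdi file as one entry per line, it writes plain entries with no affix flags, and `to_hunspell_dic` and `MINIMAL_AFF` are hypothetical names. The only Hunspell detail relied on is that a .dic file starts with an approximate entry count followed by one word per line.

```python
def to_hunspell_dic(terms: list[str]) -> str:
    """Build the text of a Hunspell .dic file from reviewed terms."""
    # Drop blank entries and exact duplicates. Case variants such as
    # "Oxygen" and "oxygen" are kept as separate entries for the reviewer.
    unique = sorted({term.strip() for term in terms if term.strip()})
    # The first line of a .dic file is the (approximate) number of entries.
    return "\n".join([str(len(unique)), *unique]) + "\n"


# A companion .aff file is also needed; this minimal one only declares the encoding.
MINIMAL_AFF = "SET UTF-8\n"
```

Entries that contain a slash need extra care, because Hunspell treats `/` as the separator between a word and its affix flags. Senior writers could then add flags for prefixes and suffixes by hand during review.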

Speaking of Hunspell dictionary files, I have a post here about making a starter Hunspell dictionary from common unknown words in your content, if it is helpful.