Quickly creating an initial custom dictionary file
Post by chrispitude
I have a set of ten books that have many terms specific to our industry. I wanted to create a custom Hunspell dictionary (.dic, .aff) that contains the words I want to be valid for our books.
If you have access to a linux-like environment (I use WSL in Windows 10), you can do the following:
1. Run "Edit > Check Spelling in Files" from the menu, and check spelling on the desired scope (in my case, the directory containing all DITA files).
2. In the results window, choose "Save Results" from the right-click context menu and save to a "spell.txt" file.
3. Run a script to post-process the results:
Code:
sed -n 's!^Description: Misspelled word: "\([^"]*\)"\..*$!\1!p' spell.txt | sort | uniq -c | sort -n > my.dic
This will give you a file that contains the "misspelled" words sorted by frequency, like this:
Code:
1 uniquification
...
17 multivoltage
22 post-DFT
24 black-box
37 clock-gating
Edit this file to include only the words that are valid. Now save the file and run the following command to remove the count values:
Code:
sed -i 's!^ *[0-9]* *!!' my.dic
Now create an empty matching .aff file:
Code:
touch my.aff
This gives you a very simple dictionary that contains the list of words you want to consider valid. It doesn't use any of Hunspell's powerful stemming, affixation, or suggestion capabilities, but it's better than starting from an empty file!
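For repeated use, the commands above could be wrapped in a couple of shell functions. This is only a sketch, not part of the original workflow: the function names `extract_counts` and `finalize_dic` are hypothetical, and it assumes the saved results contain lines of the form `Description: Misspelled word: "..."` followed by a period, as matched by the sed expression above. It departs from the commands above in one way: the Hunspell .dic format expects an approximate word count on the first line, so the sketch writes one. If your dictionary already loads correctly without that line, that step can be dropped.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers.

# Print "count word" lines, least frequent first, from a saved results file.
# $1 = results file saved from the spell-check results window
extract_counts() {
    sed -n 's!^Description: Misspelled word: "\([^"]*\)"\..*$!\1!p' "$1" |
        sort | uniq -c | sort -n
}

# Turn a reviewed "count word" file into a .dic/.aff pair.
# $1 = reviewed counts file, $2 = dictionary base name (for example "my")
finalize_dic() {
    # Strip the leading count from each line and drop blank lines.
    sed 's!^ *[0-9]* *!!' "$1" | sed '/^$/d' > "$2.words"
    # Assumption: write the approximate word count as the first .dic line.
    wc -l < "$2.words" | tr -d ' ' > "$2.dic"
    cat "$2.words" >> "$2.dic"
    rm -f "$2.words"
    # Empty affix file, as in the steps above.
    touch "$2.aff"
}
```

A possible usage: run `extract_counts spell.txt > counts.txt`, edit counts.txt to keep only the valid words, then run `finalize_dic counts.txt my` to produce my.dic and my.aff. Keeping the counts in a separate file means the review step never touches the final dictionary.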
Re: Quickly creating an initial custom dictionary file
Post by sorin_carbunaru
Thank you for your contribution, Chris! We appreciate it!