Using User Dictionaries with TX Spell .NET

A user dictionary is a separate dictionary that holds custom words such as brand names, trademarks or a specialized vocabulary. TX Spell .NET stores user dictionaries as plain text word lists with the file extension *.txd. An unlimited number of dictionaries and user dictionaries can be used at the same time. If a user dictionary is added to the DictionaryCollection, the Add to dictionary... option of the spell check dialog and of the context menu is enabled. While the spell check dialog and the user dictionary manager let users add and remove words manually, this sample shows how to access user dictionaries through the API.

The source code is contained in the following directories:

  • %USERPROFILE%\My Documents\TX Spell 7.0.NET for Windows Forms\Samples\WinForms\CSharp\UserDictionary
  • %USERPROFILE%\My Documents\TX Spell 7.0.NET for Windows Forms\Samples\WinForms\VB.NET\UserDictionary

Creating a User Dictionary

The UserDictionary constructor can be used to load an existing user dictionary from a specific path or from the default dictionary path. If only the dictionary name is given, without a path, the corresponding dictionary file inside the default dictionary folder is used. The name of the specified user dictionary file must end in ".txd". To create a new user dictionary, the constructor is called without any parameters. The following code shows how to create a new user dictionary and how to add some custom words to it:

C#:

userDic = new TXTextControl.Proofing.UserDictionary();
txSpellChecker1.Dictionaries.Add(userDic);

userDic.Name = "User Dictionary";
userDic.AddWord("TXTextControl");
userDic.AddWord("TXSpell");

VB.NET:

userDic = New TXTextControl.Proofing.UserDictionary()
txSpellChecker1.Dictionaries.Add(userDic)

userDic.Name = "User Dictionary"
userDic.AddWord("TXTextControl")
userDic.AddWord("TXSpell")
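
Loading an existing user dictionary, as described above, might look like the following C# fragment. It is a sketch only: the constructor overload that takes a string is assumed from the description in this section, both file names are hypothetical, and txSpellChecker1 is the spell checker instance used elsewhere in this sample. Check the API reference for the exact signatures.

```csharp
// Assumption: a constructor overload that accepts a dictionary name or path.
// Both file names below are hypothetical examples.

// Name only: the file is looked up in the default dictionary folder.
TXTextControl.Proofing.UserDictionary defaultDic =
    new TXTextControl.Proofing.UserDictionary("CompanyTerms.txd");

// Full path: the file is loaded from that location.
TXTextControl.Proofing.UserDictionary customDic =
    new TXTextControl.Proofing.UserDictionary(@"C:\Dictionaries\ProductNames.txd");

// Register both dictionaries with the spell checker.
txSpellChecker1.Dictionaries.Add(defaultDic);
txSpellChecker1.Dictionaries.Add(customDic);
```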

Every Dictionary (OpenOfficeDictionary or UserDictionary) is enabled for spell checking and for creating suggestions by default. That is, the Dictionary.IsSpellCheckingEnabled and Dictionary.IsGetSuggestionsEnabled properties are both set to true. To exclude a specific dictionary from spell checking, set its Dictionary.IsSpellCheckingEnabled property to false.
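
As a brief illustration, the two properties named above could be switched off as follows. This C# fragment assumes userDic is the user dictionary created earlier in this sample:

```csharp
// Keep the dictionary in the collection, but do not use it to check spelling.
userDic.IsSpellCheckingEnabled = false;

// Likewise, do not use it as a source of suggestions.
userDic.IsGetSuggestionsEnabled = false;
```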

Adding and Removing Words from the User Dictionary

In order to edit user dictionaries, the UserDictionary.IsEditable property must be set to true. This sample consists of a simple interface to add and remove words to and from the user dictionary. A ListBox is synchronized to show the content of the user dictionary. The following code is used to add a word to the user dictionary using the UserDictionary.AddWord method.

C#:

userDic.AddWord(tbNewWord.Text);

VB.NET:

userDic.AddWord(tbNewWord.Text)

To remove a word from the user dictionary, the UserDictionary.RemoveWord method is used:

C#:

userDic.RemoveWord(listBox1.Text);

VB.NET:

userDic.RemoveWord(listBox1.Text)
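
How the sample keeps the ListBox in step with the dictionary is not shown here. The C# fragment below is one possible approach, not necessarily the sample's own code. It assumes the controls tbNewWord and listBox1 from the snippets above and mirrors each change in the ListBox by hand:

```csharp
// Editing must be enabled before words can be added or removed.
userDic.IsEditable = true;

// Add: ignore empty input and entries that are already listed.
string newWord = tbNewWord.Text.Trim();
if (newWord.Length > 0 && !listBox1.Items.Contains(newWord))
{
    userDic.AddWord(newWord);
    listBox1.Items.Add(newWord);
}

// Remove: only act when an entry is selected in the ListBox.
if (listBox1.SelectedItem != null)
{
    userDic.RemoveWord(listBox1.Text);
    listBox1.Items.Remove(listBox1.SelectedItem);
}
```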

The following screenshot shows the interface of the sample application:

[Screenshot: interface of the sample application]

Using the flags SET, WORDCHARS, BREAK and ICONV

Dictionaries of both types, OpenOfficeDictionary and UserDictionary, support flags that control how the incoming string is interpreted when text is spell checked. For a user dictionary, the flags SET, WORDCHARS, BREAK and ICONV are handled when they are defined at the beginning of the loaded user dictionary file or string array. These flags have to be defined as follows:

The SET flag specifies the encoding that is used to load the user dictionary. Setting this flag is required to create a valid user dictionary file. To specify an encoding, the line must begin with SET, followed by a whitespace and the name of the encoding. Valid values are all encoding names that are supported by the .NET Framework.
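
To check in advance whether an encoding name is accepted by .NET, a SET line can be tested with Encoding.GetEncoding. The following C# sketch is not part of TX Spell .NET; the class name SetFlag and its method are hypothetical helpers introduced only for this illustration.

```csharp
using System;
using System.Text;

public static class SetFlag
{
    // Returns true if the line has the form "SET <name>" and <name> is an
    // encoding name known to the current .NET runtime.
    public static bool TryParse(string line, out Encoding encoding)
    {
        encoding = null;
        if (line == null || !line.StartsWith("SET ", StringComparison.Ordinal))
            return false;

        string name = line.Substring(4).Trim();
        try
        {
            encoding = Encoding.GetEncoding(name);
            return true;
        }
        catch (ArgumentException)
        {
            // The encoding name is unknown to this runtime.
            return false;
        }
    }
}
```

For example, SetFlag.TryParse("SET UTF-8", out enc) succeeds, while a line with an unknown encoding name is rejected.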

By default, each string that is matched by the regular expression \w+ is interpreted as a word. Additional characters can be defined by the WORDCHARS flag. That definition starts with WORDCHARS, followed by a whitespace and ends with a string that represents all additional characters.
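
The effect of additional word characters can be illustrated with a plain regular expression. The C# sketch below only mimics the behavior described above and is not the library's implementation; WordSplitter is a hypothetical helper. It extends the \w character class by the characters that would follow WORDCHARS.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class WordSplitter
{
    // Splits text into words made of \w characters plus the extra characters given.
    public static List<string> Split(string text, string extraWordChars)
    {
        var charClass = new StringBuilder(@"[\w");
        foreach (char c in extraWordChars)
        {
            // \uXXXX escapes avoid special handling of '-', ']' or '^' in the class.
            charClass.Append(@"\u").Append(((int)c).ToString("x4"));
        }
        charClass.Append("]+");

        var words = new List<string>();
        foreach (Match m in Regex.Matches(text, charClass.ToString()))
        {
            words.Add(m.Value);
        }
        return words;
    }
}
```

With no extra characters, Split("doesn't", "") yields the two words doesn and t. With the apostrophe added as a word character, Split("doesn't", "'") yields the single word doesn't.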

To specify characters that are both word characters and delimiters, the BREAK flag can be used. The sequence of characters is placed after the BREAK flag, separated by a whitespace.

In some cases the text to spell check contains characters that are related but not equal to a character that is part of the dictionary's word list. To handle that character anyway, the ICONV flag can be specified. That flag is followed by two characters, where the first one represents the character that has to be replaced by the second character before spell checking a word.
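
The replacement step can be pictured as a simple character mapping that is applied to each word before it is looked up. The following C# sketch illustrates that idea under this assumption and does not reproduce the library's internal code; InputConversion is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class InputConversion
{
    // Parses lines of the form "ICONV <from> <to>" into a character map.
    public static Dictionary<char, char> Parse(IEnumerable<string> lines)
    {
        var map = new Dictionary<char, char>();
        foreach (string line in lines)
        {
            string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length == 3 && parts[0] == "ICONV"
                && parts[1].Length == 1 && parts[2].Length == 1)
            {
                map[parts[1][0]] = parts[2][0];
            }
        }
        return map;
    }

    // Replaces every mapped character in the word; other characters are kept.
    public static string Apply(string word, Dictionary<char, char> map)
    {
        var sb = new StringBuilder(word.Length);
        foreach (char c in word)
        {
            sb.Append(map.TryGetValue(c, out char to) ? to : c);
        }
        return sb.ToString();
    }
}
```

With a map parsed from the line ICONV ´ ', the input doesn´t is converted to doesn't before the word list is consulted.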

The following shows the beginning of a sample user dictionary that includes all of the flags introduced above:

SET UTF-8
WORDCHARS '´-+.-
BREAK '´-+.-
ICONV ´ '
ICONV - -

character
doesn't
hyphen-minus
include
text
that
the

The flags of this dictionary specify that its word list is loaded with the UTF-8 encoding (SET flag). Furthermore, the characters '´-+.- are defined as additional word characters (WORDCHARS) and delimiters (BREAK). The two ICONV flags specify that the character ´ is replaced by ' and the character - by - before a string is spell checked. When the text "That text doesn't include the hyphen-minus character '-'." is spell checked with this dictionary, no misspelled words are detected. Without the flags WORDCHARS, BREAK and ICONV, the strings doesn't, hyphen and minus are identified as misspelled words.