Xerox® DocuMate® 3115 User’s Guide
130
• User Dictionary—a user dictionary is your personal dictionary with words that you want the
OCR engine to reference for better accuracy when converting the document into editable
text. For example, if you scan documents with highly technical terms or acronyms not found
in typical dictionaries, you can add them to your personal dictionary. You can also add names
that you expect to appear in the documents. This way, as the OCR process recognizes each
letter or symbol, there is a higher chance that the technical term or name will be correctly
spelled in the final document. You can create multiple user dictionaries. See the section
Creating Your Own Dictionaries on page 130.
Click the menu arrow and select a user dictionary from the list.
If you select [none] as the user dictionary, the text will be validated using the terms in the
dictionaries for the selected languages, as well as any professional dictionaries if they are
selected.
The label [current] appears next to the currently selected user dictionary.
• Professional Dictionaries—these are legal, medical, and financial dictionaries containing highly
specialized words and phrases. The options are: Dutch Legal, Dutch Medical, English
Financial, English Legal, English Medical, French Legal, French Medical, German Legal, and
German Medical. Select the appropriate dictionary for the OCR engine to use to validate the
scanned text.
• Reject Character—this is the character that the OCR process inserts for an unrecognizable text
character. For example, if the OCR process cannot recognize the J in REJECT, and ~ is the
reject character, the word would appear as RE~ECT in your document. The ~ is the default
reject character.
Type the character you want to use in the Reject Character box. Try to choose a character that
will not appear in your documents.
• Missing Character—this is the character that the OCR process inserts for a missing text
character. A missing text character is one that the OCR process recognizes, but cannot
represent because that character is not available for the selected language. For example, if
the document contains the text symbol “Ç” but the OCR process cannot represent that
character, then every place “Ç” appears, the OCR process substitutes the missing character
symbol. The caret (^) is the default symbol for the missing character.
Type the character you want to use in the Missing Character box. Try to choose a character
that will not appear in your documents.
• Recognition Quality—drag the slider to the left or right to set the degree of accuracy for the
OCR process. The higher the accuracy, the longer the OCR process takes to complete. For
clean, highly legible documents, you can set the recognition quality to a lower level to
produce results more quickly.
2. Click OK or Apply.
These options will now apply to the OCR processing when you select any text format as the page
format.
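After a scan, the reject and missing characters described above mark the places in the converted text that need a manual check. The following Python sketch is not part of the scanner software; it only illustrates how you might search a saved text file's contents for those markers. It assumes the default characters (~ and ^) are in use, and the function name find_placeholders is hypothetical.

```python
REJECT_CHAR = "~"   # default reject character described in this guide
MISSING_CHAR = "^"  # default missing character described in this guide


def find_placeholders(text, reject=REJECT_CHAR, missing=MISSING_CHAR):
    """Count placeholder characters and list the words that contain them.

    Hypothetical helper for reviewing OCR output; pass different characters
    if you changed the defaults in the OCR options.
    """
    counts = {"reject": text.count(reject), "missing": text.count(missing)}
    # Split on whitespace and keep any word holding either placeholder.
    flagged = [word for word in text.split() if reject in word or missing in word]
    return counts, flagged


counts, flagged = find_placeholders("Please RE~ECT the FA^ADE sample.")
print(counts)
print(flagged)
```

This is also why it helps to choose characters that do not otherwise appear in your documents: a genuine ~ or ^ in the original text would be flagged in the same way.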
Creating Your Own Dictionaries
You can create multiple dictionaries for your personal use. For example, you might have different
dictionaries for separate work projects, especially if each project uses different acronyms and
terminology.
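Before building a dictionary for a project, it can help to gather the project's acronyms and special terms in one place. The Python sketch below is only an illustration of that preparation step and is independent of the scanner software: it lists the words in a sample of project text that are not in a set of ordinary words you supply. The function name candidate_terms and the sample inputs are hypothetical, and the resulting list is meant for you to review and add to a dictionary yourself; this guide does not state that such a list can be loaded automatically.

```python
import re


def candidate_terms(text, known_words):
    """Return sorted unique words in text that are not in known_words.

    Hypothetical helper: the comparison ignores case, so "The" matches "the".
    """
    # A word starts with a letter and may contain digits, apostrophes, or hyphens.
    words = re.findall(r"[A-Za-z][A-Za-z0-9'-]*", text)
    known = {word.lower() for word in known_words}
    return sorted({word for word in words if word.lower() not in known})


terms = candidate_terms(
    "The HVAC unit uses a PLC",
    ["the", "unit", "uses", "a"],
)
print(terms)
```

Running the helper separately on text from each project gives one term list per project, which matches the idea of keeping a separate dictionary for each.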