OmniPage Pro 12 ScanSoft Manual

Chapter 4

Training

Training is the process of changing the OCR solutions assigned to character shapes in the image. It is useful for uniformly degraded documents or when an unusual typeface is used throughout a document. Training will be less useful for texts with random distortions. Here is an example, based on the letter “g”, which can be printed in different ways:

[Figure: four sample printings of the letter “g”]

The first two examples do not need training, because both shapes are normal for the letter “g” and the program can handle them. The third example could benefit from training because the shape of “g” is unusual, and all instances of “g” in the text are likely to look like this. The fourth example is not good for training, because the first “g” is poorly printed, and this shape is unlikely to appear again in the document.
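As a rough mental model only, the reasoning above can be sketched in Python. This is not how OmniPage Pro works internally: the shape identifiers, the function names and the `min_repeats` threshold are all hypothetical, chosen to show why a shape that recurs is worth training while a one-off misprint is not.

```python
from collections import Counter


def training_candidates(shape_ids, min_repeats=3):
    """Return the shapes that recur at least min_repeats times (hypothetical rule)."""
    counts = Counter(shape_ids)
    return {shape for shape, n in counts.items() if n >= min_repeats}


def apply_training(recognized, training):
    """Replace the OCR solution of every trained shape; leave the others alone."""
    return [(shape, training.get(shape, solution)) for shape, solution in recognized]


# Each pair is (shape identifier, current OCR solution); all values are made up.
page = [
    ("g_unusual", "q"),
    ("a_plain", "a"),
    ("g_unusual", "q"),
    ("g_smudged", "8"),  # a single poorly printed character
    ("g_unusual", "q"),
]

candidates = training_candidates([shape for shape, _ in page])
training = {shape: "g" for shape in candidates}
corrected = apply_training(page, training)

print(candidates)                        # {'g_unusual'}
print([sol for _, sol in corrected])     # ['g', 'a', 'g', '8', 'g']
```

In this sketch the unusual but consistent shape is corrected everywhere, while the smudged character is left for ordinary proofreading because training on it would help nowhere else.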

You can use training to improve recognition of special symbols such as @, ® and © or to recognize supported accented letters more reliably. The purpose of training is not to teach the program to read characters from non-supported languages or alphabets.

OmniPage Pro 12 offers two types of training: manual training and automatic training (IntelliTrain). Data coming from both types of training are combined and available for saving to a training file.
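The idea of combining both kinds of training data and saving the result can be pictured with a short sketch. Everything in it is an assumption made for illustration: the real training file format is not described here, so JSON stands in for it, and the rule that a manual entry wins over an automatic one for the same shape is a hypothetical choice, not documented program behavior.

```python
import json
import tempfile
from pathlib import Path


def combine_training(manual, automatic):
    """Merge both sources; manual entries win on conflict (an assumption)."""
    combined = dict(automatic)
    combined.update(manual)
    return combined


def save_training(data, path):
    """Write the combined data to a file (JSON is a stand-in format)."""
    Path(path).write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def load_training(path):
    """Read training data back from a file written by save_training."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Hypothetical shape identifiers mapped to their trained OCR solutions.
manual = {"shape_017": "@"}
automatic = {"shape_017": "a", "shape_042": "é"}

combined = combine_training(manual, automatic)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example_training.json"
    save_training(combined, path)
    restored = load_training(path)

print(restored)  # {'shape_017': '@', 'shape_042': 'é'}
```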

When you leave a page on which training data was generated, you will be asked how to apply it to other existing pages in the document.

Manual training

To do manual training, place the insertion point in front of the character you want to train, or select a group of characters (up to one word) and choose Train Character... from the Tools menu or the shortcut menu. You will see an enlarged view of the character(s) to be trained, along with the current OCR solution. Change this to the desired solution and click OK.

The program takes this training and examines the rest of the page. If it