White Speckle





To detect minimally open holes (Observation 1), the White Speckle metric was designed. A white speckle is defined as any 8-connected white region that is at most 3 pixels high and at most 3 pixels wide.
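The definition above can be illustrated with a minimal sketch. It assumes a binary page image stored as a list of rows in which 1 means white and 0 means black; that encoding and the helper names `white_components` and `is_white_speckle` are hypothetical choices for illustration, not part of the original system.

```python
from collections import deque

MAX_SPECKLE_SIDE = 3  # from the definition: at most 3 pixels high and wide


def white_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected white regions.

    `image` is a list of rows; 1 means white, 0 means black (assumed encoding).
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over the 8 neighbours of each pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def is_white_speckle(box):
    """True if the region's bounding box is at most 3 pixels high and wide."""
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    return height <= MAX_SPECKLE_SIDE and width <= MAX_SPECKLE_SIDE
```

Diagonal neighbours count as connected here, so two white pixels that touch only at a corner form a single region. The size test uses the bounding box of the region, which is one reasonable reading of "high and wide".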

The White Speckle Factor is defined as:

This metric measures the amount of white speckle present. We expect image quality to go down as this ratio goes up. Conversely, a page with a low white speckle factor would probably have its lakes wide open, which, provided there are no other problems with the image, would translate to high OCR accuracy.
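As a rough illustration of such a ratio, the sketch below assumes the factor is the number of white speckles divided by the total number of white connected components. That denominator is an assumption made for this example, and the function name `white_speckle_factor` and its `max_side` parameter are hypothetical.

```python
def white_speckle_factor(component_sizes, max_side=3):
    """Fraction of white connected components that qualify as white speckle.

    `component_sizes` is a list of (height, width) pairs in pixels, one per
    white connected component. The choice of denominator (all white
    components) is an assumption of this sketch.
    """
    total = len(component_sizes)
    if total == 0:
        return 0.0  # no white regions at all, so no speckle to report
    speckles = sum(
        1 for height, width in component_sizes
        if height <= max_side and width <= max_side
    )
    return speckles / total
```

For example, a page whose white regions measure 2x2, 1x4, 3x3, and 10x8 pixels has two speckles out of four regions, giving a factor of 0.5 under this assumption.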

It is important to point out that this metric is not appropriate for small type sizes. For small sizes, this metric would incorrectly consider normal "lakes" in letters to be white speckle, since they fit within 3x3 pixels. Furthermore, our training data does not include small fonts. As a result, the effect of this feature on pages containing small fonts must be investigated.