Statistical Pattern Recognition

Instead of a heuristic approach, a more traditional statistical approach could be used. Issues to explore in such a case would be:

More training data. In this research, only 24 pages were used as the training dataset. In order to implement a statistical classifier, more data is needed for the training and designing phases.
Risk concept. The introduction of the risk concept can be included and the thresholds updated according to the related specifications. This approach would be the correct way to handle the two different misclassifications (B G and G B) with two different weights. In this thesis, the BG misclassification has been determined to be of higher risk than the other type of classification; this determination has driven the design of the classifier. However, in order to cope with higher classifier accuracy, a better risk model is required.
Confidence. Instead of producing just a single binary output, it would be interesting to provide degrees of confidence to the classifier responses. A response of ``Good'' or ``Bad'' would be followed by a degree of ``certainty''.
Quantizied output. Instead of reporting ``good'' or ``bad'', the degree of image defects for each would be more convenient. In such a model, a rate of 0 would mean, for instance, ``no defect'' while a rate of 1 would be ``severe defect''. Rates for touchiness and brokeness should be reported separately as they could both be present within some pages.