Separation of text from mixed text/graphic and description of graphics
A generalized, computer-based automated documentation system that processes engineering documents is highly desirable. Because document archives are memory intensive, data compression algorithms are becoming increasingly important. This thesis presents a technique that separates text from mixed text/graphic documents and describes the graphics succinctly. It introduces two main algorithms. The first separates text from mixed text/graphic documents (Chapter 3). It comprises an Edge Expanding Search (EES) algorithm, which searches for character-shaped objects; a Neighborhood Checking (NC) algorithm, which checks the neighborhood of each object; and a Touching Character Recognition (TCR) algorithm, which identifies characters that touch a graphic. The second algorithm produces the description of the graphics (Chapter 4). The effectiveness and efficiency of these algorithms are evaluated on fifteen mixed text/graphic engineering documents. The evaluation results show that these algorithms outperform the other techniques described in Chapter 2.