Computational Vision Lab IIT Kharagpur

Text Detection, Text Matching and Image Matching within GUI Application Screen Images

"What is text?" Can it be interpreted as "structured edges", "a group of strokes", "connected components", "stable extremal regions", "high-frequency components", "a kind of texture", or a combination of all of these? Natural scenes contain many stable, structured-edge objects and high-frequency textures, such as leaves, fences, brick walls and twigs, whose stroke and texture properties resemble those of text. This makes it difficult to design a feature representation that discriminates text effectively, especially since text itself varies widely in color, size, font, style, appearance, layout, aspect ratio, orientation, alignment, background clutter, distortion, language context and resolution. Text recognition therefore faces challenges beyond those of general object recognition and remains an unsolved problem.

My research focuses on novel integrated approaches to detect, localize and recognize text, treating text as a "character composite". More specifically, most of my work so far has been on extracting text from cluttered, shaded, textured, complex and low-contrast backgrounds, where there are many possible sources of variation.

The work on text detection has also been extended to the problem of logo recognition in natural scenes. Text of varying color, font size and orientation is first detected and suppressed in the natural images containing logos. The remaining stable extremal regions (ERs) are then clustered by spatial proximity to form logo region proposals, followed by logo detection and recognition using deep learning. Segmenting the text regions beforehand not only reduces computational overhead but also facilitates the additional task of optical character recognition (OCR). OCR is useful in cases where the logo detection mechanism fails, and thereby improves the accuracy of the overall logo recognition system.
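The proximity-based grouping of regions into logo proposals can be illustrated with a minimal Python sketch. It is not the method used in this work, only one simple way to realize the idea under stated assumptions: each remaining region is represented by an axis-aligned bounding box (x, y, width, height), two boxes belong to the same cluster when the gap between them is at most a threshold, and clusters are merged with a union-find structure. The names box_gap, cluster_boxes and max_gap are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); an assumed representation


def box_gap(a: Box, b: Box) -> int:
    """Largest axis-wise gap between two boxes (0 if they touch or overlap)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    gap_x = max(0, max(ax, bx) - min(ax + aw, bx + bw))
    gap_y = max(0, max(ay, by) - min(ay + ah, by + bh))
    return max(gap_x, gap_y)


def cluster_boxes(boxes: List[Box], max_gap: int) -> List[Box]:
    """Group boxes whose gap is <= max_gap and return one enclosing box per group."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes that lie close enough to each other.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_gap(boxes[i], boxes[j]) <= max_gap:
                parent[find(j)] = find(i)

    # Collect the members of each cluster under its root index.
    groups: Dict[int, List[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    # The region proposal for a cluster is the box enclosing all its members.
    proposals = []
    for members in groups.values():
        x0 = min(x for x, _, _, _ in members)
        y0 = min(y for _, y, _, _ in members)
        x1 = max(x + w for x, _, w, _ in members)
        y1 = max(y + h for _, y, _, h in members)
        proposals.append((x0, y0, x1 - x0, y1 - y0))
    return sorted(proposals)
```

For example, with the four boxes (0, 0, 10, 10), (12, 0, 10, 10), (100, 100, 20, 20) and (118, 125, 10, 10) and max_gap set to 5, the sketch yields two proposals, (0, 0, 22, 10) and (100, 100, 28, 35). The pairwise comparison is quadratic in the number of regions, which is acceptable for a sketch but would likely need a spatial index for images with many regions.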