Title :
Web Interface Interpretation Using Graph Grammars
Author :
Kong, Jun ; Barkol, Omer ; Bergman, Ruth ; Pnueli, Ayelet ; Schein, Sagi ; Zhang, Kang ; Zhao, Chunying
Author_Institution :
Dept. of Comput. Sci., North Dakota State Univ., Fargo, ND, USA
fDate :
7/1/2012
Abstract :
With the advent of the Internet, it is desirable to interpret and extract useful information from the Web. One major challenge in Web interface interpretation is to discover the semantic structure underlying a Web interface. Many heuristic approaches have been developed to discover and group semantically related interface objects. However, those approaches do not handle nonuniformity satisfactorily and cannot tag the semantic role of each object. In contrast to existing approaches, this paper develops a robust and formal approach to recovering interface semantics using graph grammars. Because it supports spatial specifications in its abstract syntax, the spatial graph grammar (SGG) is selected to perform the semantic grouping and interpretation of segmented screen objects. Instead of analyzing HTML source code, we apply an efficient image-processing technique to recognize atomic interface objects from a screenshot of an interface and produce a spatial graph, which records significant spatial relations among the recognized objects. A spatial graph is more concise than the corresponding document object model structure and thus facilitates interface analysis and interpretation. Based on the spatial graph, the SGG parser recovers the hierarchical relations among interface objects.
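The idea of a spatial graph built from recognized screen objects can be illustrated with a minimal sketch. This is not the paper's algorithm: the `ScreenObject` type, the two relations (`left_of`, `above`), the `max_gap` threshold, and the sample bounding boxes are all assumptions made for illustration, and the image-processing step that would produce the boxes is omitted.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class ScreenObject:
    """Hypothetical atomic interface object: a name, a kind, and a pixel bounding box."""
    name: str
    kind: str
    x: int
    y: int
    w: int
    h: int


def overlaps(a_start, a_len, b_start, b_len):
    """Return True if two 1-D intervals share at least one pixel."""
    return a_start < b_start + b_len and b_start < a_start + a_len


def spatial_relation(a, b, max_gap=40):
    """Return 'left_of' or 'above' if a stands in that relation to b, else None.

    The relation set and the max_gap threshold are illustrative assumptions.
    """
    # a is left of b: vertical extents overlap and b starts just past a's right edge.
    gap_x = b.x - (a.x + a.w)
    if overlaps(a.y, a.h, b.y, b.h) and 0 <= gap_x <= max_gap:
        return "left_of"
    # a is above b: horizontal extents overlap and b starts just below a's bottom edge.
    gap_y = b.y - (a.y + a.h)
    if overlaps(a.x, a.w, b.x, b.w) and 0 <= gap_y <= max_gap:
        return "above"
    return None


def build_spatial_graph(objects, max_gap=40):
    """Return labeled edges (source, relation, target) between nearby objects."""
    edges = []
    for a, b in permutations(objects, 2):
        rel = spatial_relation(a, b, max_gap)
        if rel is not None:
            edges.append((a.name, rel, b.name))
    return edges


# Made-up bounding boxes standing in for the output of screenshot segmentation.
objects = [
    ScreenObject("user_label", "text", 10, 10, 60, 20),
    ScreenObject("user_box", "textbox", 80, 10, 120, 20),
    ScreenObject("login_button", "button", 80, 40, 60, 24),
]
graph = build_spatial_graph(objects)
```

In a pipeline of the kind the abstract describes, a graph like this would be the input to the grammar parser, which would then group related nodes (for example, a label and its text box) into higher-level structures.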
Keywords :
Internet; document image processing; graph grammars; information retrieval; user interfaces; HTML source codes; SGG; Web interface interpretation; abstract syntax; atomic interface objects; document object model structure; image-processing technology; segmented screen objects; semantic grouping; spatial graph grammar; Grammar; Graphical user interfaces; HTML; Image segmentation; Semantics; Visualization; Web pages; Data extraction; graph grammar; image processing; page segmentation
Journal_Title :
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
DOI :
10.1109/TSMCC.2011.2171335