Abstract:
Summary form only given. A number of problems in information retrieval (IR) and natural language processing (NLP) can be approached using graph theory. Representative examples in IR include Brin and Page's PageRank and Kleinberg's HITS, which rank documents using graph-based random walk models. In NLP, one could mention Pang and Lee's work on sentiment analysis using graph min-cuts, Mihalcea's work on word sense disambiguation, Zhu et al.'s label propagation algorithms, Toutanova et al.'s prepositional attachment algorithm, and McDonald et al.'s dependency parsing algorithm using minimum spanning trees. In this talk I will briefly summarize three graph-based algorithms developed recently at the University of Michigan: (a) LexRank, a method for multidocument summarization based on random walks on lexical centrality graphs; (b) TUMBL, a generic method for semi-supervised learning using bipartite graphs; and (c) biased LexRank, a semi-supervised passage ranking technique for information retrieval. I will then discuss the applicability of such techniques to other problems in NLP and IR.
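To make the "random walks on lexical centrality graphs" idea concrete, here is a minimal, hypothetical sketch of a LexRank-style centrality computation: sentences are represented as toy bag-of-words vectors, a cosine-similarity graph is built over them, and PageRank-style power iteration is run on the row-normalized similarity matrix. The function name, the damping value, and the toy data are illustrative assumptions, not taken from the talk.

```python
import math

def cosine(u, v):
    # Cosine similarity between two sparse bag-of-words vectors (dicts).
    dot = sum(u[w] * v.get(w, 0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def lexrank_scores(vectors, damping=0.85, iters=50):
    # Hypothetical LexRank-style sketch: cosine-similarity graph over
    # sentences, then power iteration with uniform teleportation.
    n = len(vectors)
    sim = [[0.0 if i == j else cosine(vectors[i], vectors[j])
            for j in range(n)] for i in range(n)]  # no self-loops
    for i in range(n):
        s = sum(sim[i])
        # Row-normalize into a stochastic transition matrix;
        # a sentence similar to nothing transitions uniformly.
        sim[i] = [x / s for x in sim[i]] if s else [1.0 / n] * n
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [(1 - damping) / n
             + damping * sum(p[i] * sim[i][j] for i in range(n))
             for j in range(n)]
    return p

# Toy "sentences": the first three overlap lexically, the last is an outlier.
sentences = [
    {"graph": 1, "random": 1, "walk": 1},
    {"graph": 1, "centrality": 1},
    {"random": 1, "walk": 1, "graph": 1},
    {"cat": 1},
]
scores = lexrank_scores(sentences)
```

In a summarizer, the highest-scoring sentences would be extracted; here the lexically central sentences score highest and the outlier lowest.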
Keywords:
graph theory; information retrieval; learning (artificial intelligence); natural language processing; bipartite graphs; graph-based random walk models; lexical centrality graphs; minimum spanning trees; multidocument summarization; semi-supervised learning; algorithm design and analysis; tree graphs;