Title :
Automatic topic segmentation and labeling in multiparty dialogue
Author :
Hsueh, P.-Y. ; Moore, J.D.
Author_Institution :
Sch. of Inf., Edinburgh Univ., Edinburgh
Abstract :
This study concerns how to segment a scenario-driven multiparty dialogue and how to label these segments automatically. We apply approaches that have been proposed for identifying topic boundaries at a coarser level to the problem of identifying agenda-based topic boundaries in scenario-based meetings. We also develop conditional models to classify segments into topic classes. Experiments in topic segmentation show that a supervised classification approach that combines lexical and conversational features outperforms the unsupervised lexical chain-based approach, achieving improvements of 20% and 12% in segmenting top-level and sub-topic segments, respectively. Experiments in topic classification suggest that it is possible to automatically categorize segments into appropriate topic classes given only the transcripts. Training with features selected using the log-likelihood ratio improves the results by 13.3%.
Keywords :
pattern classification; speech processing; agenda-based topic boundaries; automatic topic labeling; automatic topic segmentation; log likelihood ratio; scenario-based meetings; scenario-driven multiparty dialogue; supervised classification; topic boundaries; Ambient intelligence; Data mining; Decision trees; Frequency; Informatics; Labeling; Predictive models; Speech; Text categorization; Training data
Conference_Title :
Spoken Language Technology Workshop, 2006. IEEE
Conference_Location :
Palm Beach
Print_ISBN :
1-4244-0872-5
DOI :
10.1109/SLT.2006.326826