Title :
A Signal Denoising Method for Text Meaning Vectors
Author :
Hernandez, S. ; Sallis, Philip ; Garden, Kathy
Author_Institution :
Lab. de Proc. de Inf. Geoespacial, Univ. Catolica del Maule, Talca, Chile
Abstract :
The extraction of meaning, or at least of inter-dependencies, using data and text mining methods is well understood. Numerous approaches have been taken to select relevant information from often very large data sets. The discarding of items that are not relevant to a parameterized retrieval is usually based on an 'include or do not include' decision embedded in some kind of branch-and-bound algorithm, made more or less sophisticated by the use of machine learning techniques. This paper addresses the discarding process as noise elimination within the context of well-established signal processing methods. It proposes an entropy-based approach using a value-weighted matrix for word relevance matching. The whole text is partitioned according to whether word pairs are directly relevant to the declared meaning being sought, which is expressed as a set of parameters, and the noise is treated as errors in the data stream. The resulting non-noisy data is represented as a text meaning vector, in which terms of direct relevance to the initial parameter values are stored.
Keywords :
data mining; entropy; signal denoising; text analysis; tree searching; branch-and-bound algorithm; data mining methods; entropy based approach; noise elimination; signal denoising method; text meaning vectors; text mining methods; value weighted matrix; word relevance matching; Markov processes; Media; Noise measurement; Probabilistic logic; Random variables; sentiment analysis; social media; text mining; topic modeling
Conference_Title :
2011 Fifth Asia Modelling Symposium (AMS)
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4577-0193-1
DOI :
10.1109/AMS.2011.16