DocumentCode :
2548130
Title :
Online construction of subsequence automata for multiple texts
Author :
Hoshino, Hiromasa ; Shinohara, Ayumi ; Takeda, Masayuki ; Arikawa, Setsuo
Author_Institution :
Dept. of Inf., Kyushu Univ., Fukuoka, Japan
fYear :
2000
fDate :
2000
Firstpage :
146
Lastpage :
152
Abstract :
We consider a deterministic finite automaton that accepts all subsequences of a set of texts, called a subsequence automaton. We present an online algorithm for constructing a subsequence automaton for a set of texts. It runs in O(|Σ|(m+k)+N) time using O(|Σ|m) space, where |Σ| is the size of the alphabet, m is the size of the resulting subsequence automaton, k is the number of texts, and N is the total length of the texts. The algorithm can be used to preprocess a given set S of texts so that, for any query ω ∈ Σ*, the number of texts in S that contain ω as a subsequence is returned in O(|ω|) time. We also show an upper bound on the size of the automaton compared to the minimum automaton.
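The query behaviour described in the abstract can be illustrated with a minimal sketch. This is not the paper's online algorithm: it is a naive offline product construction in which each state is a tuple of positions, one per text, so the number of states may be far larger than the m of the abstract. The names `build_subsequence_dfa` and `count_texts_containing` are hypothetical and chosen only for illustration.

```python
from collections import deque


def build_subsequence_dfa(texts):
    """Naive offline construction (an assumption, not the paper's method).

    A state is a tuple with one entry per text: an index p means the next
    query character must be matched at position >= p in that text, and
    None means that text can no longer contain the query as a subsequence.
    """
    alphabet = sorted(set("".join(texts)))

    # nxt[i][p][c] = smallest index j >= p with texts[i][j] == c.
    nxt = []
    for t in texts:
        table = [dict() for _ in range(len(t) + 1)]
        for p in range(len(t) - 1, -1, -1):
            table[p] = dict(table[p + 1])
            table[p][t[p]] = p
        nxt.append(table)

    start = tuple(0 for _ in texts)
    ids = {start: 0}          # state tuple -> state number
    trans = [{}]              # trans[s][c] = target state number
    count = [len(texts)]      # count[s] = number of texts still matching
    queue = deque([start])

    while queue:
        state = queue.popleft()
        for c in alphabet:
            target = tuple(
                None if p is None or c not in nxt[i][p] else nxt[i][p][c] + 1
                for i, p in enumerate(state)
            )
            alive = sum(p is not None for p in target)
            if alive == 0:
                continue  # a missing transition means "no text matches"
            if target not in ids:
                ids[target] = len(trans)
                trans.append({})
                count.append(alive)
                queue.append(target)
            trans[ids[state]][c] = ids[target]
    return trans, count


def count_texts_containing(dfa, query):
    """Follow one transition per query character, i.e. O(|query|) lookups."""
    trans, count = dfa
    state = 0
    for c in query:
        state = trans[state].get(c)
        if state is None:
            return 0
    return count[state]


if __name__ == "__main__":
    dfa = build_subsequence_dfa(["abab", "bba", "ca"])
    print(count_texts_containing(dfa, "ba"))
```

Once the automaton is built, a query walks a single path through it, which is what gives the O(|ω|) query time; the paper's contribution lies in building a compact automaton online, which this sketch does not attempt.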
Keywords :
computational complexity; deterministic automata; finite automata; query processing; set theory; text analysis; alphabet; deterministic finite automaton; minimum automaton; multiple texts; online algorithm; online construction; preprocessing; subsequence automata; subsequence automaton; Automata; Computational complexity; Data structures; Gain measurement; Informatics; Machine learning; Machine learning algorithms; Text recognition; Upper bound;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
String Processing and Information Retrieval, 2000. SPIRE 2000. Proceedings. Seventh International Symposium on
Conference_Location :
A Coruña
Print_ISBN :
0-7695-0746-8
Type :
conf
DOI :
10.1109/SPIRE.2000.878190
Filename :
878190