Title :
Scalability of a distributed neural information retrieval system
Author :
Weeks, Michael ; Hodge, Victoria J. ; Austin, Jim
Author_Institution :
Dept. of Comput. Sci., York Univ., UK
Abstract :
Summary form only given. AURA (Advanced Uncertain Reasoning Architecture) is a generic family of techniques and implementations intended for high-speed approximate search and match operations on large unstructured datasets. AURA technology is fast, economical, and offers unique advantages for finding near-matches not available with other methods. AURA is based upon a high-performance binary neural network called a correlation matrix memory (CMM). Typically, several CMM elements are used in combination to solve soft or fuzzy pattern-matching problems. AURA takes large volumes of data and constructs a special type of compressed index. AURA finds exact and near-matches between indexed records and a given query, where the query itself may have omissions and errors. The degree of nearness required during matching can be varied through thresholding techniques. The PCI-based PRESENCE (Parallel Structured Neural Computing Engine) card is a hardware-accelerator architecture for the core CMM computations needed in AURA-based applications. The card is designed for use in low-cost workstations and incorporates 128 MByte of low-cost DRAM for CMM storage. To investigate the scalability of the distributed AURA system, we implement a word-to-document index of an AURA-based information retrieval system, called MinerTaur, over a distributed PRESENCE CMM.
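The matching behaviour described in the abstract can be illustrated with a minimal sketch of a binary correlation matrix memory used as a word-to-document index. This is not the AURA, PRESENCE, or MinerTaur implementation: the vocabulary, the documents, and the function names (`encode_words`, `build_cmm`, `recall`) are hypothetical, and the sketch assumes a simple encoding of one bit per word and one bit per document, with a fixed integer threshold on the summed activations.

```python
import numpy as np

# Hypothetical toy vocabulary and documents, for illustration only.
VOCABULARY = ["neural", "search", "index", "matrix"]
DOCUMENTS = [
    ["neural", "search"],           # document 0
    ["search", "index", "matrix"],  # document 1
    ["neural", "matrix"],           # document 2
]


def encode_words(words):
    """Return a binary vector with one bit per vocabulary word.

    Words outside the vocabulary (e.g. misspellings) set no bit.
    """
    vec = np.zeros(len(VOCABULARY), dtype=int)
    for word in words:
        if word in VOCABULARY:
            vec[VOCABULARY.index(word)] = 1
    return vec


def build_cmm(documents):
    """Train a binary CMM: rows are words, columns are documents."""
    cmm = np.zeros((len(VOCABULARY), len(documents)), dtype=bool)
    for doc_id, words in enumerate(documents):
        doc_bits = np.zeros(len(documents), dtype=int)
        doc_bits[doc_id] = 1
        # OR the outer product of the input and output patterns into the matrix.
        cmm |= np.outer(encode_words(words), doc_bits).astype(bool)
    return cmm


def recall(cmm, query_words, threshold):
    """Sum the rows selected by the query, then threshold per document."""
    sums = encode_words(query_words) @ cmm.astype(int)
    return [int(d) for d in np.flatnonzero(sums >= threshold)]


cmm = build_cmm(DOCUMENTS)
exact = recall(cmm, ["neural", "matrix"], threshold=2)  # all query words required
near = recall(cmm, ["neural", "matrix"], threshold=1)   # any one query word suffices
```

Lowering the threshold from the number of query words towards 1 relaxes the match from exact to progressively nearer matches, which is the role the abstract assigns to thresholding. A query containing an unrecognised word, such as `recall(cmm, ["neural", "matrx"], threshold=1)`, still retrieves the documents that match the remaining word.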
Keywords :
indexing; inference mechanisms; information retrieval systems; neural nets; parallel architectures; pattern matching; reconfigurable architectures; uncertainty handling; workstation clusters; 128 MByte; AURA; Advanced Uncertain Reasoning Architecture; DRAM; MinerTaur; PCI-based PRESENCE card; Parallel Structured Neural Computing Engine; compressed index; correlation matrix memory; distributed neural information retrieval system; exact matches; fuzzy pattern matching problems; hardware-accelerator architecture; high-performance binary neural network; high-speed approximate match operations; high-speed approximate search operations; indexed records; large unstructured datasets; low-cost workstations; near-matches; query; scalability; soft pattern matching problems; thresholding techniques; word-to-document index; Aircraft; Artificial neural networks; Computer architecture; Computer science; Concurrent computing; Coordinate measuring machines; Environmental economics; Information retrieval; Neural networks; Scalability;
Conference_Title :
High Performance Distributed Computing, 2002. HPDC-11 2002. Proceedings. 11th IEEE International Symposium on
Print_ISBN :
0-7695-1686-6
DOI :
10.1109/HPDC.2002.1029953