DocumentCode
2414445
Title
dQUOB: managing large data flows using dynamic embedded queries
Author
Plale, Beth ; Schwan, Karsten
Author_Institution
Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA, USA
fYear
2000
fDate
2000
Firstpage
263
Lastpage
270
Abstract
The dQUOB system satisfies clients' need for specific information from high-volume data streams. The data streams we consider are the flows of data that arise during large-scale visualizations, video streaming to large numbers of distributed users, and high-volume business transactions. We introduce the notion of conceptualizing a data stream as a set of relational database tables so that a scientist can request information with an SQL-like query. Transformation or computation that often needs to be performed on the data en route can be conceptualized as computation performed on consecutive views of the data, with computation associated with each view. The dQUOB system moves the query code into the data stream as a quoblet, that is, as compiled code. The relational data model has the significant advantage of presenting opportunities for efficient reoptimization of queries and sets of queries. Using examples from global atmospheric modeling, we illustrate the usefulness of the dQUOB system. We carry the examples through the experiments to establish, with a baseline benchmark, the viability of the approach for high-performance computing. We define a cost metric of end-to-end latency that can be used to determine realistic cases where optimization should be applied. Finally, we show that end-to-end latency can be controlled through a probability, assigned to each query, that the query will evaluate to true.
Keywords
SQL; data models; query processing; relational databases; scientific information systems; software performance evaluation; SQL; business transactions; compiled code; cost-metric; dQUOB system; data model; dynamic embedded queries; experiment; global atmospheric modeling; high performance computing; high-volume data streams; large data flows; large-scale visualizations; query optimization; quoblet; relational database tables; video streaming; Bandwidth; Data models; Data visualization; Delay; Educational institutions; Filters; Large-scale systems; Relational databases; Software libraries; Streaming media;
fLanguage
English
Publisher
IEEE
Conference_Titel
High-Performance Distributed Computing, 2000. Proceedings. The Ninth International Symposium on
Conference_Location
Pittsburgh, PA
ISSN
1082-8907
Print_ISBN
0-7695-0783-2
Type
conf
DOI
10.1109/HPDC.2000.868658
Filename
868658