Title :
A Big Data Modeling Methodology for Apache Cassandra
Author :
Chebotko, Artem ; Kashlev, Andrey ; Lu, Shiyong
Author_Institution :
DataStax Inc., USA
Abstract :
Apache Cassandra is a leading distributed database of choice when it comes to big data management with zero downtime, linear scalability, and seamless multiple data center deployment. With increasingly wider adoption of Cassandra for online transaction processing by hundreds of Web-scale companies, there is a growing need for a rigorous and practical data modeling approach that ensures sound and efficient schema design. This work i) proposes the first query-driven big data modeling methodology for Apache Cassandra, ii) defines important data modeling principles, mapping rules, and mapping patterns to guide logical data modeling, iii) presents visual diagrams for Cassandra logical and physical data models, and iv) demonstrates a data modeling tool that automates the entire data modeling process.
Keywords :
Big Data; Internet; data models; distributed databases; transaction processing; Apache Cassandra; Cassandra logical data model; Web-scale company; big data management; data modeling approach; data modeling principle; data modeling process; data modeling tool; distributed database; linear scalability; logical data modeling; mapping pattern; mapping rule; online transaction processing; physical data model; query-driven big data modeling methodology; seamless multiple data center deployment; visual diagram; zero downtime; Electronic mail; Indexes; Radiation detectors; CQL; Chebotko Diagrams; KDM; automation; data modeling; database design
Conference_Title :
Big Data (BigData Congress), 2015 IEEE International Congress on
Conference_Location :
New York, NY
Print_ISBN :
978-1-4673-7277-0
DOI :
10.1109/BigDataCongress.2015.41