Introduction to CassandraDB

Tushar B. Kute,
http://tusharkute.com
Apache CassandraDB

• Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.
• It is a type of NoSQL database.
NoSQL

• A NoSQL database (sometimes read as "Not Only SQL") is a database that provides a mechanism to store and retrieve data modeled in ways other than the tabular relations used in relational databases.
• These databases are typically schema-free, support easy replication, have simple APIs, are often eventually consistent, and can handle huge amounts of data.
• The primary objectives of a NoSQL database are
– simplicity of design,
– horizontal scaling, and
– finer control over availability.
NoSQL vs. RDBMS
Popular NoSQL Databases

• Apache HBase:
– HBase is an open-source, non-relational, distributed database modeled after Google’s BigTable and written in Java. It is developed as part of the Apache Hadoop project and runs on top of HDFS, providing BigTable-like capabilities for Hadoop.
• MongoDB:
– MongoDB is a cross-platform, document-oriented database system that avoids the traditional table-based relational structure in favor of JSON-like documents with dynamic schemas, making the integration of data in certain types of applications easier and faster.
Features of Cassandra

• Elastic scalability: Cassandra is highly scalable; it allows you to add more hardware to accommodate more customers and more data as required.
• Always-on architecture: Cassandra has no single point of failure and is continuously available for business-critical applications that cannot afford a failure.
• Fast linear-scale performance: Cassandra is linearly scalable, i.e., throughput increases as you add nodes to the cluster, which helps it maintain a quick response time.
• Flexible data storage: Cassandra accommodates structured, semi-structured, and unstructured data, and it can dynamically accommodate changes to your data structures as your needs evolve.
Features of Cassandra

• Easy data distribution: Cassandra lets you distribute data where you need it by replicating data across multiple datacenters.
• Transaction support: Cassandra does not offer full relational-style ACID transactions. It provides atomic, isolated, and durable writes at the level of a single partition, tunable consistency, and lightweight (compare-and-set) transactions.
• Fast writes: Cassandra was designed to run on cheap commodity hardware. It performs very fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.
History of Cassandra

• Cassandra was developed at Facebook for inbox search.
• It was open-sourced by Facebook in July 2008.
• Cassandra was accepted into the Apache Incubator in March 2009.
• It became an Apache top-level project in February 2010.
Data replication in Cassandra

• In Cassandra, one or more of the nodes in a cluster act as replicas for a given piece of data.
• If some of the nodes respond with an out-of-date value, Cassandra returns the most recent value to the client.
• After returning the most recent value, Cassandra performs a read repair in the background to update the stale values.
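The read-repair idea can be illustrated with a small Python sketch. This is a toy model, not Cassandra's implementation: the `Replica` class, the `read_with_repair` function, and the integer timestamps are all assumptions made for illustration, and the repair here runs synchronously rather than in the background.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Replica:
    """Hypothetical toy replica: maps key -> (value, write_timestamp)."""
    name: str
    store: Dict[str, Tuple[str, int]]


def read_with_repair(replicas: List[Replica], key: str) -> Optional[str]:
    # Collect the (value, timestamp) cell from every replica that has the key.
    cells = [r.store[key] for r in replicas if key in r.store]
    if not cells:
        return None
    # The cell with the highest write timestamp is treated as the most recent.
    newest = max(cells, key=lambda cell: cell[1])
    # Read repair: overwrite stale or missing copies with the newest cell.
    for r in replicas:
        if r.store.get(key) != newest:
            r.store[key] = newest
    return newest[0]


replicas = [
    Replica("node-a", {"user:1": ("alice@old.example", 100)}),
    Replica("node-b", {"user:1": ("alice@new.example", 200)}),
    Replica("node-c", {"user:1": ("alice@old.example", 100)}),
]
result = read_with_repair(replicas, "user:1")
```

After the call, the client has the newest value and all three toy replicas hold the same cell.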
Data replication in Cassandra
Cassandra QL

• Users access Cassandra through its nodes using the Cassandra Query Language (CQL). CQL treats the database (keyspace) as a container of tables.
• Programmers work with CQL either through cqlsh, an interactive prompt, or through drivers for their application language.
• Clients can contact any node for read and write operations. That node (the coordinator) acts as a proxy between the client and the nodes holding the data.
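The coordinator role described above can be sketched in Python. Everything in this block is hypothetical: the `Node` class, its method names, and the placement rule (a byte sum modulo the cluster size, with two copies) are stand-ins chosen to keep the example short, not how Cassandra locates data.

```python
from typing import Dict, List, Optional


class Node:
    """Hypothetical cluster node that can also act as a coordinator."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}
        self.peers: List["Node"] = []  # every node in the cluster, including self

    def owners(self, key: str) -> List["Node"]:
        # Assumed placement rule: deterministic start index, then the next node.
        start = sum(key.encode("utf-8")) % len(self.peers)
        return [self.peers[(start + i) % len(self.peers)] for i in range(2)]

    def coordinate_write(self, key: str, value: str) -> List[str]:
        # The coordinator forwards the write to the nodes that hold the key.
        targets = self.owners(key)
        for node in targets:
            node.data[key] = value
        return [node.name for node in targets]

    def coordinate_read(self, key: str) -> Optional[str]:
        # The coordinator asks the owning nodes and relays the answer.
        for node in self.owners(key):
            if key in node.data:
                return node.data[key]
        return None


cluster = [Node("n1"), Node("n2"), Node("n3")]
for node in cluster:
    node.peers = cluster

written_to = cluster[0].coordinate_write("k", "v")  # client talks to n1
value = cluster[2].coordinate_read("k")             # another client talks to n3
```

The point of the sketch is that the client may contact any node: the write goes through `n1`, the read goes through `n3`, and both reach the same owning nodes.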
Data Model

• The data model of Cassandra is significantly different from what we normally see in an RDBMS.
• A Cassandra database is distributed over several machines that operate together.
• The outermost container is known as the cluster. For failure handling, every node holds replicas, and if a node fails, a replica takes over.
• Cassandra arranges the nodes of a cluster in a ring and assigns data to them.
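A minimal sketch of how a ring can assign data to nodes, assuming one token per node and a hash-based token for each key. The `Ring` class and the `token` function are illustrative names, and SHA-256 is only a stand-in hash, not the partitioner Cassandra uses.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a stable 64-bit token (SHA-256 is only a stand-in)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, node_names):
        # Each node gets one token; sorting the tokens gives the ring order.
        self.entries = sorted((token(name), name) for name in node_names)
        self.tokens = [t for t, _ in self.entries]

    def owner_of_token(self, t: int) -> str:
        # First node whose token is >= t; past the last node, wrap to the first.
        i = bisect.bisect_left(self.tokens, t) % len(self.tokens)
        return self.entries[i][1]

    def owner(self, key: str) -> str:
        return self.owner_of_token(token(key))


nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = Ring(nodes)
placement = {key: ring.owner(key) for key in ("user:1", "user:2", "user:3")}
```

Because the token depends only on the key, every node computes the same owner for a given key without consulting a central directory.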
Data Model

• A keyspace is the outermost container for data in Cassandra. The basic attributes of a keyspace are:
• Replication factor:
– The number of machines in the cluster that will receive copies of the same data.
• Replica placement strategy:
– The strategy used to place replicas in the ring. Strategies include SimpleStrategy (rack-unaware), OldNetworkTopologyStrategy (rack-aware), and NetworkTopologyStrategy (datacenter-aware; formerly the datacenter-shard strategy).
• Column families:
– A keyspace is a container for one or more column families. A column family, in turn, is a container for a collection of rows, and each row contains ordered columns. Column families represent the structure of your data. Each keyspace has at least one and often many column families.
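How a replication factor turns into concrete replica nodes can be sketched as follows. This assumes a SimpleStrategy-like rule: the first replica goes to the node that owns the key, and the remaining replicas go to the next nodes clockwise on the ring, ignoring racks and datacenters. The function name and the list-based ring are hypothetical.

```python
from typing import List


def simple_strategy_replicas(ring: List[str], primary_index: int,
                             replication_factor: int) -> List[str]:
    """Pick replicas by walking the ring clockwise from the primary node."""
    if not 1 <= replication_factor <= len(ring):
        raise ValueError("replication factor must be between 1 and the node count")
    return [ring[(primary_index + i) % len(ring)]
            for i in range(replication_factor)]


ring_order = ["n1", "n2", "n3", "n4", "n5"]  # nodes in ring (token) order
replicas_for_key = simple_strategy_replicas(ring_order, primary_index=3,
                                            replication_factor=3)
# replicas_for_key == ["n4", "n5", "n1"]: the walk wraps around the ring
```

A topology-aware strategy would apply the same walk but skip nodes so that replicas land on different racks or in different datacenters.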
Creating keyspace

• CREATE KEYSPACE keyspace_name
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
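When the statement is generated from application code, a small helper can fill in the keyspace name and replication factor. The helper below is a hypothetical sketch: it only builds the CQL string and does not talk to a cluster. The name check and the `IF NOT EXISTS` clause are choices made for this example.

```python
import re


def create_keyspace_cql(name: str, replication_factor: int) -> str:
    """Build a CREATE KEYSPACE statement using SimpleStrategy."""
    # Accept only plain unquoted identifiers to avoid malformed statements.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        raise ValueError(f"invalid keyspace name: {name!r}")
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {{'class': 'SimpleStrategy', "
        f"'replication_factor': {replication_factor}}};"
    )


statement = create_keyspace_cql("demo", 3)
```

The resulting string can be pasted into cqlsh or passed to a driver's execute call.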
Keyspace
Column family

• A column family is a container for an ordered collection of rows. Each row, in turn, is an ordered collection of columns.
• A Cassandra column family has the following attributes:
– keys_cached: the number of locations to keep cached per SSTable.
– rows_cached: the number of rows whose entire contents will be cached in memory.
– preload_row_cache: whether to pre-populate the row cache.
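The row-and-column structure of a column family can be pictured as a map from row keys to maps of columns. The `ColumnFamily` class below is a hypothetical in-memory model for illustration only; it says nothing about how Cassandra stores data on disk.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class ColumnFamily:
    """Toy model: row key -> {column name -> value}."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[str, Dict[str, str]] = defaultdict(dict)

    def insert(self, row_key: str, column: str, value: str) -> None:
        # Rows are created on first write; each row may have different columns.
        self.rows[row_key][column] = value

    def get_row(self, row_key: str) -> List[Tuple[str, str]]:
        # Columns come back ordered by column name.
        return sorted(self.rows.get(row_key, {}).items())


users = ColumnFamily("users")
users.insert("u1", "name", "Asha")
users.insert("u1", "email", "asha@example.com")
users.insert("u2", "name", "Ravi")  # u2 has no email column

row_u1 = users.get_row("u1")
row_u2 = users.get_row("u2")
```

Unlike a relational table, the two rows here do not share a fixed set of columns.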
Column family
RDBMS vs. Cassandra
Thank you
This presentation was created using LibreOffice Impress 4.2.8.2 and can be used freely as per the GNU General Public License.

Web Resources: http://mitu.co.in, http://tusharkute.com
Blogs: http://digitallocha.blogspot.in, http://kyamputar.blogspot.in
