
APACHE CASSANDRA

CONTENTS

∙ Introduction
∙ Who is Using Cassandra?
∙ Data Model Overview
∙ Schema
∙ Table
∙ Rows
∙ Columns
∙ Partitioning
∙ Replication
∙ And More...

WRITTEN BY BRIAN O'NEIL, ARCHITECT, IRON MOUNTAIN
UPDATED BY MILAN MILOSEVIC, LEAD DATA ENGINEER, SMARTCAT

INTRODUCTION

Apache Cassandra is a high-performance, extremely scalable, fault-tolerant (i.e., no single point of failure), distributed, non-relational database solution. Cassandra combines all the benefits of Google Bigtable and Amazon Dynamo to handle the types of database management needs that traditional RDBMS vendors cannot support.

WHO IS USING CASSANDRA?

Cassandra is in use at Apple (75,000+ nodes), Spotify (3,000+ nodes), eBay, Capital One, Macy's, Bank of America, Netflix, Twitter, Urban Airship, Constant Contact, Reddit, Cisco, OpenX, Rackspace, Ooyala, and more companies that have large active data sets. The largest known Cassandra cluster has more than 300 TB of data across more than 400 machines (cassandra.apache.org).

RDBMS VS. CASSANDRA

Atomicity
  Cassandra: Success or failure for inserts/deletes in a single partition (one or more rows in a single partition).
  RDBMS: Enforced at every scope, at the cost of performance and scalability.

Sharding
  Cassandra: Native shared-nothing architecture, inherently partitioned by a configurable strategy.
  RDBMS: Often forced when scaling, partitioned by key or function.

Consistency
  Cassandra: No tunable consistency in the ACID sense. Can be tuned to provide more consistency or more availability. The consistency is configured per request. Since Cassandra is a distributed database, traditional locking and transactions are not possible (there is, however, a concept of lightweight transactions that should be used very carefully).
  RDBMS: Favors consistency over availability, tunable via isolation levels.

Durability
  Cassandra: Writes are durable to a replica node, being recorded in memory and the commit log before acknowledged. In the event of a crash, the commit log replays on restart to recover any lost writes before data is flushed to disk.
  RDBMS: Typically, data is written to a single master node, sometimes configured with synchronous replication at the cost of performance and cumbersome data restoration.

Multi-Datacenter Replication
  Cassandra: Native and out-of-the-box capabilities for data replication over lower bandwidth, higher latency, and less reliable connections.
  RDBMS: Typically, only limited long-distance replication to read-only slaves receiving asynchronous updates.

Security
  Cassandra: Coarse-grained and primitive, but authorization, authentication, roles, and data encryption are provided out of the box.
  RDBMS: Fine-grained access control to objects.


DATA MODEL OVERVIEW

Cassandra has a tabular schema comprising keyspaces, tables, partitions, rows, and columns. Note that since Cassandra 3.x, the terminology has changed due to changes in the storage engine: a "column family" is now a table, and a "row" is now a partition.

Schema/Keyspace: a collection of tables. RDBMS analogy: schema/database. Object equivalent: set.
Table/Column Family: a set of partitions. RDBMS analogy: table. Object equivalent: map.
Partition: a set of rows that share the same partition key. RDBMS analogy: N/A. Object equivalent: N/A.
Row: an ordered (inside of a partition) set of columns. RDBMS analogy: row. Object equivalent: ordered map.
Column: a key-value pair and timestamp. RDBMS analogy: column (name, value). Object equivalent: (key, value, timestamp).

SCHEMA

The keyspace is akin to a database or schema in RDBMS, contains a set of tables, and is used for replication. A keyspace is also the unit for Cassandra's access control mechanism. When enabled, users must authenticate to access and manipulate data in a schema or table.
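For example, a keyspace and its replication settings are created in CQL. A minimal sketch (the keyspace name and replication factor are illustrative, not from the original card):

CREATE KEYSPACE IF NOT EXISTS my_keyspace
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- SimpleStrategy is acceptable for a single-datacenter test cluster;
-- production clusters should use NetworkTopologyStrategy (see REPLICATION).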
TABLE

A table, previously known as a column family, is a map of rows. Similar to RDBMS, a table is defined by a primary key. The primary key consists of a partition key and clustering columns. The partition key defines data locality in the cluster, and the data with the same partition key will be stored together on a single node. The clustering columns define how the data will be ordered on disk within a partition. The client application provides rows that conform to the schema. Each row has the same fixed subset of columns.

For column values, Cassandra provides the following CQL data types (DataStax documentation):

ascii: Efficient storage for simple ASCII strings. Storage: an arbitrary number of ASCII bytes (i.e., values are 0-127).
boolean: True or false. Storage: a single byte.
blob: Arbitrary byte content. Storage: an arbitrary number of bytes.
counter: Used for counters, which are cluster-wide incrementing values. Storage: 8 bytes.
timestamp: Stores time in milliseconds. Storage: 8 bytes.
time: Value is encoded as a 64-bit signed integer representing the number of nanoseconds since midnight. Values can be represented as strings, such as 13:30:54.234. Storage: a 64-bit signed integer.
date: Value is a date with no corresponding time value; Cassandra encodes a date as a 32-bit integer representing days since epoch (January 1, 1970). Dates can be represented in queries and inserts as a string, such as 2015-05-03 (yyyy-mm-dd). Storage: a 32-bit integer.
decimal: Stores BigDecimals. Storage: 4 bytes to store the scale, plus an arbitrary number of bytes to store the value.
double: Stores Doubles. Storage: 8 bytes.
float: Stores Floats. Storage: 4 bytes.
tinyint: Stores a 1-byte integer. Storage: 1 byte.
smallint: Stores a 2-byte integer. Storage: 2 bytes.
int: Stores a 4-byte integer. Storage: 4 bytes.
varint: Stores a variable-precision integer. Storage: an arbitrary number of bytes.
bigint: Stores Longs. Storage: 8 bytes.
text, varchar: Stores text as UTF-8. Storage: UTF-8.
timeuuid: Version 1 UUID only. Storage: 16 bytes.
uuid: Suitable for UUID storage. Storage: 16 bytes.
frozen: A frozen value serializes multiple components into a single value. Non-frozen types allow updates to individual fields. Cassandra treats the value of a frozen type as a blob; the entire value must be overwritten. Storage: N/A.
inet: IP address string in IPv4 or IPv6 format, used by the python-cql driver and CQL native protocols. Storage: N/A.
list: A collection of one or more ordered elements: [literal, literal, literal]. Storage: N/A.
map: A JSON-style array of literals: { literal : literal, literal : literal ... }. Storage: N/A.
set: A collection of one or more elements: { literal, literal, literal }. Storage: N/A.
tuple: A group of 2-3 fields. Storage: N/A.
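To make the collection types concrete, here is a hypothetical table using set, list, and map columns (all names are illustrative):

CREATE TABLE IF NOT EXISTS user_profiles (
  user_id uuid PRIMARY KEY,
  display_name text,
  emails set<text>,            -- unordered, unique elements
  phone_numbers list<text>,    -- ordered, duplicates allowed
  preferences map<text, text>  -- key-value pairs
);

-- Collections support partial updates when not frozen:
UPDATE user_profiles
  SET emails = emails + {'brian@example.com'}
  WHERE user_id = 123e4567-e89b-12d3-a456-426655440000;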


ROWS

Cassandra 3.x supports tables defined with composite primary keys. The first part of the primary key is the partition key. The remaining columns are clustering columns and define the order of the data on disk. For example, let's say there is a table called users_by_location with the following primary key:

((country, town), birth_year, user_id)

In that case, the (country, town) pair is a partition key (a composite one). All users with the same (country, town) values will be stored together on a single node and replicated together based on the replication factor. The rows within the partition will be ordered by birth_year and then by user_id. The user_id column provides uniqueness for the primary key.

If the partition key is not separated by parentheses, then the first column in the primary key is considered the partition key. For example, if the primary key is defined by (country, town, birth_year, user_id), then country would be the partition key, and town would be a clustering column.
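A minimal CQL sketch of that table (the non-key columns and types are assumed for illustration; only the primary key shape comes from the example above):

CREATE TABLE IF NOT EXISTS users_by_location (
  country text,
  town text,
  birth_year int,
  user_id uuid,
  name text,
  PRIMARY KEY ((country, town), birth_year, user_id)
);

-- All rows for one (country, town) pair live in a single partition:
SELECT * FROM users_by_location
  WHERE country = 'US' AND town = 'Austin';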
COLUMNS

A column is a triplet: key, value, and timestamp. The validation and comparator on the column family define how Cassandra sorts and stores the bytes in column keys. The timestamp portion of the column is used to sequence mutations. The timestamp is defined and specified by the client. Newer versions of Cassandra drivers provide this functionality out of the box. Client application servers should have synchronized clocks.

Columns may optionally have a time-to-live (TTL), after which Cassandra asynchronously deletes them. Note that TTLs are defined per cell, so each cell in a row has an independent time-to-live and is handled by Cassandra independently.
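For example, a TTL is set per insert or update in CQL; a sketch against the hypothetical user_profiles table from above:

INSERT INTO user_profiles (user_id, display_name)
  VALUES (123e4567-e89b-12d3-a456-426655440000, 'brian')
  USING TTL 604800;  -- cells written by this statement expire after 7 days

-- Inspect the remaining time-to-live of a cell:
SELECT TTL(display_name) FROM user_profiles
  WHERE user_id = 123e4567-e89b-12d3-a456-426655440000;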

HOW DATA IS STORED ON DISK

Using the sstabledump tool, you can inspect how the data is stored on disk. This is very important if you want to develop intuition about data modeling, reads, and writes in Cassandra. Given the table defined by:

CREATE TABLE IF NOT EXISTS symbol_history (
  symbol text,
  year int,
  month int,
  day int,
  volume bigint,
  close double,
  open double,
  low double,
  high double,
  idx text static,
  PRIMARY KEY ((symbol, year), month, day)
) WITH CLUSTERING ORDER BY (month DESC, day DESC);

the data (when deserialized into JSON using the sstabledump tool) is stored on disk in this form:

[
  {
    "partition" : {
      "key" : [ "CORP", "2016" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "static_block",
        "position" : 48,
        "cells" : [
          { "name" : "idx", "value" : "NASDAQ", "tstamp" : 1457484225583260, "ttl" : 604800, "expires_at" : 1458089025, "expired" : false }
        ]
      },
      {
        "type" : "row",
        "position" : 48,
        "clustering" : [ "1", "5" ],
        "deletion_info" : { "deletion_time" : 1457484273784615, "tstamp" : 1457484273 }
      },
      {
        "type" : "row",
        "position" : 66,
        "clustering" : [ "1", "4" ],
        "liveness_info" : { "tstamp" : 1457484225586933, "ttl" : 604800, "expires_at" : 1458089025, "expired" : false },
        "cells" : [
          { "name" : "close", "value" : "8.54" },
          { "name" : "high", "value" : "8.65" },
          { "name" : "low", "value" : "8.2" },
          { "name" : "open", "value" : "8.2" },
          { "name" : "volume", "value" : "1054342" }
        ]
      },
      {
        "type" : "row",
        "position" : 131,
        "clustering" : [ "1", "1" ],
        "liveness_info" : { "tstamp" : 1457484225583260, "ttl" : 604800, "expires_at" : 1458089025, "expired" : false },
        "cells" : [
          { "name" : "close", "value" : "8.2" },
          { "name" : "high", "deletion_time" : 1457484267, "tstamp" : 1457484267368678 },
          { "name" : "low", "value" : "8.02" },
          { "name" : "open", "value" : "9.33" },
          { "name" : "volume", "value" : "1055334" }
        ]
      }
    ]
  },
  {
    "partition" : {
      "key" : [ "CORP", "2015" ],
      "position" : 194
    },
    "rows" : [
      {
        "type" : "static_block",
        "position" : 239,
        "cells" : [
          { "name" : "idx", "value" : "NYSE", "tstamp" : 1457484225578370, "ttl" : 604800, "expires_at" : 1458089025, "expired" : false }
        ]
      },
      {
        "type" : "row",
        "position" : 239,
        "clustering" : [ "12", "31" ],
        "liveness_info" : { "tstamp" : 1457484225578370, "ttl" : 604800, "expires_at" : 1458089025, "expired" : false },
        "cells" : [
          { "name" : "close", "value" : "9.33" },
          { "name" : "high", "value" : "9.57" },
          { "name" : "low", "value" : "9.21" },
          { "name" : "open", "value" : "9.55" },
          { "name" : "volume", "value" : "1054342" }
        ]
      }
    ]
  }
]

CASSANDRA ARCHITECTURE

Cassandra uses a masterless ring architecture. The ring represents a cyclic range of token values (i.e., the token space). Since Cassandra 2.0, each node is responsible for a number of small token ranges defined by the num_tokens property in cassandra.yaml.

PARTITIONING

Keys are mapped into the token space by a partitioner. The important distinction between the partitioners is order preservation (OP). Users can define their own partitioners by implementing IPartitioner, or they can use one of the native partitioners:

Murmur3Partitioner: tokens are 64-bit MurmurHash values; not order preserving.
RandomPartitioner: tokens are MD5 BigIntegers; not order preserving.
ByteOrderedPartitioner: tokens are the key bytes themselves (identity); order preserving.

The following examples illustrate this point.

RANDOM PARTITIONER

Since RandomPartitioner uses an MD5 hash function to map keys into tokens, on average those keys will evenly distribute across the cluster. The row key determines the node placement:

lisa: { state: CA } { graduated: 2008 } { gender: F }
owen: { state: TX } { gender: M }
collin: { state: UT } { gender: M }

This may result in the following ring formation, where "collin," "owen," and "lisa" are row keys.

[Figure: ring formation showing the hashed placement of the row keys collin, owen, and lisa]


With Cassandra's storage model, where each node owns the preceding token space, this results in the following storage allocation based on the tokens:

collin: MD5 hash CC982736AD62AB, stored on node 3.
owen: MD5 hash 9567238FF72635, stored on node 2.
lisa: MD5 hash 001AB62DE123FF, stored on node 1.

Notice that the keys are not in order. With RandomPartitioner, the keys are evenly distributed across the ring using hashes, but you sacrifice order, which means any range query needs to query all nodes in the ring.

MURMUR3 PARTITIONER

Murmur3Partitioner has been the default partitioner since Cassandra 1.2. The Murmur3Partitioner provides faster hashing and improved performance compared to the RandomPartitioner. The Murmur3Partitioner can be used with vnodes.

ORDER PRESERVING PARTITIONERS (OPP)

The order preserving partitioners preserve the order of the row keys as they are mapped into the token space.

In our example, since:

"collin" < "lisa" < "owen"

Then:

token("collin") < token("lisa") < token("owen")

With OPP, range queries are simplified, and a query may not need to consult each node in the ring. This seems like an advantage, but it comes at a price. Since the partitioner is preserving order, the ring may become unbalanced unless the row keys are naturally distributed across the token space, as illustrated below:

[Figure: an unbalanced ring, where most row keys cluster in one region of the token space]

To manually balance the cluster, you can set the initial token for each node in the Cassandra configuration.

Hot tip: If possible, it is best to design your data model to use Murmur3Partitioner to take advantage of the automatic load balancing and decreased administrative overhead of manually managing token assignment.

REPLICATION

Cassandra provides continuous, high availability and fault tolerance through data replication. The replication uses the ring to determine the nodes used for replication. Replication is configured on the keyspace level. Each keyspace has an independent replication factor, n. When writing information, the data is written to the target node as determined by the partitioner and to the n-1 subsequent nodes along the ring.

There are two replication strategies: SimpleStrategy and NetworkTopologyStrategy.

SIMPLESTRATEGY

The SimpleStrategy is the default strategy and blindly writes the data to subsequent nodes along the ring. This strategy is NOT RECOMMENDED for a production environment.

In the previous example with a replication factor of 2, this would result in the following storage allocation:

collin: replica 1 on node 3 (as determined by the partitioner), replica 2 on node 1 (found by traversing the ring).
owen: replica 1 on node 2, replica 2 on node 3.
lisa: replica 1 on node 1, replica 2 on node 2.

NETWORKTOPOLOGYSTRATEGY

The NetworkTopologyStrategy is useful when deploying to multiple datacenters. It ensures data is replicated across datacenters. Effectively, the NetworkTopologyStrategy executes the SimpleStrategy independently for each datacenter, spreading replicas across distant racks. Cassandra writes a copy in each datacenter as determined by the partitioner.

Data is written simultaneously along the ring to subsequent nodes within that datacenter, with preference for nodes in different racks to offer resilience to hardware failure. All nodes are peers, and data files can be loaded through any node in the cluster, eliminating the single point of failure inherent in master-slave architectures and making Cassandra fully fault-tolerant and highly available.

See the following ring and deployment topology:

[Figure: a two-datacenter ring topology, with replicas R1 and R2 in DC1 and replicas R3 and R4 in DC2]

With blue nodes deployed to one datacenter (DC1), green nodes deployed to another datacenter (DC2), and a replication factor of two per datacenter, one row will be replicated twice in Data Center 1 (R1, R2) and twice in Data Center 2 (R3, R4).

Note: Cassandra attempts to write data simultaneously to all target nodes, then waits for confirmation from the relevant number of nodes needed to satisfy the specified consistency level.
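In CQL, this topology maps to per-datacenter replication factors in the keyspace definition. A minimal sketch, assuming the datacenter names DC1 and DC2 from the figure (the keyspace name is illustrative):

CREATE KEYSPACE IF NOT EXISTS orders
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'DC1': 2,  -- R1, R2
    'DC2': 2   -- R3, R4
  };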
CONSISTENCY LEVELS

One of the unique characteristics of Cassandra that sets it apart from other databases is its approach to consistency. Clients can specify the consistency level on both read and write operations, trading off between high availability, consistency, and performance.

WRITE

ANY: The write was written in at least one node's commit log. Provides low latency and a guarantee that a write never fails. Delivers the lowest consistency and continuous availability.
LOCAL_ONE: A write must be sent to, and successfully acknowledged by, at least one replica node in the local datacenter.
ONE: A write is successfully acknowledged by at least one replica (in any DC).
TWO: A write is successfully acknowledged by at least two replicas.
THREE: A write is successfully acknowledged by at least three replicas.
QUORUM: A write is successfully acknowledged by at least n/2+1 replicas, where n is the replication factor.
LOCAL_QUORUM: A write is successfully acknowledged by at least n/2+1 replicas within the local datacenter.
EACH_QUORUM: A write is successfully acknowledged by at least n/2+1 replicas within each datacenter.
ALL: A write is successfully acknowledged by all n replicas. This is useful when absolute read consistency and/or fault tolerance are necessary (e.g., online disaster recovery).

READ

ONE: Returns a response from the closest replica, as determined by the snitch.
TWO: Returns the most recent data from two of the closest replicas.
THREE: Returns the most recent data from three of the closest replicas.
QUORUM: Returns the record after a quorum (n/2+1) of replicas from all datacenters has responded.
LOCAL_QUORUM: Returns the record after a quorum of replicas in the current datacenter, as reported by the coordinator, has responded. Avoids the latency of communication among datacenters.
EACH_QUORUM: Not supported for reads.
ALL: The client receives the most current data once all replicas have responded.
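In cqlsh, the consistency level for subsequent requests can be set with the CONSISTENCY command (drivers expose the same setting per statement):

CONSISTENCY LOCAL_QUORUM;
SELECT * FROM symbol_history WHERE symbol = 'CORP' AND year = 2016;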

LOCAL_ONE A write must be sent to, and successfully


The following table shows the snitches provided by Cassandra and
acknowledged by, at least one replica node in the
local datacenter. what you should use in your keyspace configuration for each snitch:

ONE A write is successfully acknowledged by at least one SNITCH SPECIFY


replica (in any DC).
SimpleSnitch Specify only the replication factor in your
TWO A write is successfully acknowledged by at least two strategy options.
replicas.
PropertyFileSnitch Specify the datacenter names from your
THREE A write is successfully acknowledged by at least properties file in the keyspace strategy options.
three replicas.
GossipingProperty Returns the most recent data from three of the
QUORUM A write is successfully acknowledged by at least FileSnitch closest replicas.
n/2+1 replicas, where n is the replication factor.
RackInferringSnitch Specify the second octet of the IPv4 address in
LOCAL_QUORUM A write is successfully acknowledged by at least your keyspace strategy options.
n/2+1 replicas within the local datacenter.
TABLE CONTINUES ON NEXT PAGE
TABLE CONTINUES IN NEXT COLUMN

7 BROUGHT TO YOU IN PARTNERSHIP WITH


SIMPLESNITCH

The SimpleSnitch provides Cassandra no information regarding racks or datacenters. It is the default setting and is useful for simple deployments where all servers are collocated. It is not recommended for a production environment, as it does not provide failure tolerance.

PROPERTYFILESNITCH

The PropertyFileSnitch allows users to be explicit about their network topology. The user specifies the topology in a properties file, cassandra-topology.properties. The file specifies which nodes belong to which racks and datacenters. Below is an example property file for our sample cluster:

# DC1
192.168.0.1=DC1:RAC1
192.168.0.2=DC1:RAC1
192.168.0.3=DC1:RAC2

# DC2
192.168.1.4=DC2:RAC3
192.168.1.5=DC2:RAC3
192.168.1.6=DC2:RAC4

# Default for nodes
default=DC3:RAC5

GOSSIPINGPROPERTYFILESNITCH

This snitch is recommended for production. It uses rack and datacenter information for the local node defined in the cassandra-rackdc.properties file and propagates this information to other nodes via gossip.

Unlike PropertyFileSnitch, which contains the topology for the entire cluster on every node, GossipingPropertyFileSnitch contains DC and rack information only for the local node. Each node describes and gossips its location to other nodes.

Example contents of the cassandra-rackdc.properties file:

dc=DC1
rack=RACK1

THE RACKINFERRINGSNITCH

The RackInferringSnitch infers network topology by convention. From the IPv4 address (e.g., 9.100.47.75), the snitch uses the following convention to identify the datacenter and rack:

Octet 1 (e.g., 9): indicates nothing.
Octet 2 (e.g., 100): indicates the datacenter.
Octet 3 (e.g., 47): indicates the rack.
Octet 4 (e.g., 75): indicates the node.

EC2SNITCH

The EC2Snitch is useful for deployments to Amazon's EC2. It uses Amazon's API to examine the regions to which nodes are deployed. It then treats each region as a separate datacenter.

EC2MULTIREGIONSNITCH

Use this snitch for deployments on Amazon EC2 where the cluster spans multiple regions. This snitch treats regions as datacenters and availability zones as racks within a datacenter, and it uses public IPs as broadcast_address to allow cross-region connectivity. Cassandra nodes in one EC2 region can bind to nodes in another region, thus enabling multi-datacenter support.

Hot tip: Pay attention to which snitch you are using and which file you are using to define the topology. Some of the snitches use the cassandra-topology.properties file, and other, newer ones use the cassandra-rackdc.properties file.

QUERYING/INDEXING

Cassandra provides simple primitives. Its simplicity allows it to scale linearly with continuous, high availability and very little performance degradation. That simplicity allows for extremely fast read and write operations for specific keys, but servicing more sophisticated queries that span keys requires pre-planning. Using the primitives that Cassandra provides, you can construct indexes that support exactly the query patterns of your application. Note, however, that queries may not perform well without properly designing your schema.

SECONDARY INDEXES

To satisfy simple query patterns, Cassandra provides a native indexing capability called secondary indexes. A column family may have multiple secondary indexes. A secondary index is hash-based and uses specific columns to provide a reverse lookup mechanism from a specific column value to the relevant row keys. Under the hood, Cassandra maintains hidden column families that store the index. The strength of secondary indexes is allowing queries by value.


Secondary indexes are built in the background automatically without blocking reads or writes. Creating a secondary index using CQL is straightforward. For example, you can define a table of data about movie fans, then create a secondary index on the states where they live:

CREATE TABLE fans (
  watcherID uuid,
  favorite_actor text,
  address text,
  zip int,
  state text,
  PRIMARY KEY (watcherID)
);

CREATE INDEX watcher_state ON fans (state);
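The index then supports queries by the indexed value, for example:

SELECT * FROM fans WHERE state = 'UT';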
Hot tip: Try to avoid indexes whenever possible. It is (almost) always a better idea to denormalize data and create a separate table that satisfies a particular query than it is to create an index.

RANGE QUERIES

It is important to consider partitioning when designing your schema to support range queries.

RANGE QUERIES WITH ORDER PRESERVATION

Since order is preserved, order preserving partitioners better support range queries across a range of rows. Cassandra only needs to retrieve data from the subset of nodes responsible for that range. For example, if we are querying against a column family keyed by phone number and we want to find all phone numbers that begin with 215-555, we could create a range query with start key 215-555-0000 and end key 215-555-9999.

To service this request with order preserving partitioning, it is possible for Cassandra to compute the two relevant tokens: token(215-555-0000) and token(215-555-9999). Satisfying that query then simply means consulting the nodes responsible for that token range and retrieving the rows/tokens in that range.
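As a sketch of what such a range scan looks like in CQL, assuming a hypothetical phone_book table whose partition key is phone (with an order preserving partitioner, token order matches key order):

SELECT * FROM phone_book
  WHERE token(phone) >= token('215-555-0000')
    AND token(phone) <= token('215-555-9999');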
Hot tip: Try to avoid queries with multiple partitions whenever possible. The data should be partitioned based on the access patterns, so it is a good idea to group the data in a single partition (or several) if such queries exist. If you have too many range queries that cannot be satisfied by looking into several partitions, you may want to rethink whether Cassandra is the best solution for your use case.

RANGE QUERIES WITH RANDOM PARTITIONING

The RandomPartitioner provides no guarantees of any kind between keys and tokens. In fact, ideally row keys are distributed around the token ring evenly. Thus, the corresponding tokens for a start key and end key are not useful when trying to retrieve the relevant rows from tokens in the ring with the RandomPartitioner. Consequently, Cassandra must consult all nodes to retrieve the result. Fortunately, there are well-known design patterns to accommodate range queries. These are described next.

INDEX PATTERNS

There are a few design patterns to implement indexes. Each services different query patterns. The patterns leverage the fact that Cassandra columns are always stored in sorted order and all columns for a single row reside on a single host.

INVERTED INDEXES

First, let's consider the inverted index pattern. In an inverted index, columns in one row become row keys in another. Consider the following dataset, in which user IDs are row keys:

Partition key BONE42: { name: "Brian" } { zip: 15283 } { dob: 09/19/1982 }
Partition key LKEL76: { name: "Lisa" } { zip: 98612 } { dob: 07/23/1993 }
Partition key COW89: { name: "Dennis" } { zip: 98612 } { dob: 12/25/2004 }

Without indexes, searching for users in a specific zip code would mean scanning our Users column family row by row to find the users in the relevant zip code. Obviously, this does not perform well. To remedy the situation, we can create a table that represents the query we want to perform, inverting rows and columns. This would result in the following table:

Partition key 98612: { user_id: LKEL76 } { user_id: COW89 }
Partition key 15283: { user_id: BONE42 }

Since each partition is stored on a single machine, Cassandra can quickly return all user IDs within a single zip code by returning all rows within a single partition. Cassandra simply goes to a single host based on the partition key (zip code) and returns the contents of that single partition.
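In CQL terms, the inverted table might look like this (a sketch; names and types are illustrative):

CREATE TABLE IF NOT EXISTS users_by_zip (
  zip int,
  user_id text,
  PRIMARY KEY (zip, user_id)
);

-- All user IDs for a zip code come from a single partition:
SELECT user_id FROM users_by_zip WHERE zip = 98612;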

TIME SERIES DATA

When working with time series data, consider partitioning data by time unit (hourly, daily, weekly, etc.), depending on the rate of events. That way, all the events in a single period (e.g., one hour) are grouped together and can be fetched and/or filtered based on the clustering columns. TimeWindowCompactionStrategy is specifically designed to work with time series data and is recommended in this scenario.

The TimeWindowCompactionStrategy compacts all the SSTables in a single partition per time unit. This allows for extremely fast reads of the data in a single time unit because it guarantees that only one SSTable will be read.
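A hedged sketch of a time-bucketed table with TWCS enabled (the table and bucket granularity are illustrative; the compaction option names follow the TimeWindowCompactionStrategy documentation):

CREATE TABLE IF NOT EXISTS events_by_day (
  event_day date,        -- daily time bucket as the partition key
  event_time timestamp,  -- clustering column for filtering within a bucket
  payload text,
  PRIMARY KEY (event_day, event_time)
) WITH compaction = {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_unit': 'DAYS',
  'compaction_window_size': 1
};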


DENORMALIZATION

Finally, it is worth noting that each of the indexing strategies as presented would require two steps to service a query if the request requires the actual column data (e.g., user name). The first step would retrieve the keys out of the index. The second step would fetch each relevant column by row key. We can skip the second step if we denormalize the data.

In Cassandra, denormalization is the norm. If we duplicate the data, the index becomes a true materialized view that is custom tailored to the exact query we need to support.
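For example, the inverted table above can be denormalized by carrying the user's name alongside the user ID (the extra column is illustrative):

CREATE TABLE IF NOT EXISTS users_by_zip_denormalized (
  zip int,
  user_id text,
  name text,  -- duplicated from the base table so reads take one step
  PRIMARY KEY (zip, user_id)
);

SELECT user_id, name FROM users_by_zip_denormalized WHERE zip = 98612;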
INSERTING/UPDATING/DELETING

Everything in Cassandra is an insert, typically referred to as a mutation. Since Cassandra is effectively a key-value store, operations are simply mutations of key-value pairs.
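For instance, INSERT and UPDATE in CQL are both upserts that produce equivalent mutations (a sketch against the hypothetical users_by_zip_denormalized table):

INSERT INTO users_by_zip_denormalized (zip, user_id, name)
  VALUES (15283, 'BONE42', 'Brian');

-- Writes the same cells; no prior row needs to exist:
UPDATE users_by_zip_denormalized
  SET name = 'Brian'
  WHERE zip = 15283 AND user_id = 'BONE42';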

HINTED HANDOFF

Similar to read repair, hinted handoff is a background process that ensures data integrity and eventual consistency. If a replica is down in the cluster, the remaining nodes will collect and temporarily store the data that was intended to be stored on the downed node. If the downed node comes back online soon enough (configured by the max_hint_window_in_ms option in cassandra.yaml), the other nodes will "hand off" the data to it. This way, Cassandra smooths out short network or other outages out of the box.

OPERATIONS AND MAINTENANCE

Cassandra provides tools for operations and maintenance. Some of the maintenance is mandatory because of Cassandra's eventually consistent architecture. Other facilities are useful to support alerting and statistics gathering. Use nodetool to manage Cassandra.

DataStax provides a reference card on nodetool, which is available here: https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/tools/nodetool/toolsNodetool.html

NODETOOL REPAIR

Cassandra keeps a record of deleted values for some time to support the eventual consistency of distributed deletes. These values are called tombstones. Tombstones are purged after some time (GCGraceSeconds, which defaults to 10 days). Since tombstones prevent improper data propagation in the cluster, you will want to ensure that you have consistency before they get purged.

To ensure consistency, run:

$CASSANDRA_HOME/bin/nodetool repair

The repair command replicates any updates missed due to downtime or loss of connectivity. This command ensures consistency across the cluster and obviates the tombstones. You will want to do this periodically on each node in the cluster (within the window before tombstone purge). The repair process is greatly simplified by using a tool called Cassandra Reaper (originally developed and open sourced by Spotify but taken over and improved by The Last Pickle).

MONITORING

Cassandra has support for monitoring via JMX, but the simplest way to monitor a Cassandra node is by using OpsCenter, which is designed to manage and monitor Cassandra database clusters. There is a free community edition as well as an enterprise edition that provides management of Apache SOLR and Hadoop.

Simply download mx4j and execute the following:

cp $MX4J_HOME/lib/mx4j-tools.jar $CASSANDRA_HOME/lib

The following are key attributes to track per column family:

Read Count: frequency of reads against the column family.
Read Latency: latency of reads against the column family.
Write Count: frequency of writes against the column family.
Write Latency: latency of writes against the column family.
Pending Tasks: queue of pending tasks; informative to know if tasks are queuing.

An open-source alternative to OpsCenter can be achieved by combining a few tools:

1. cassandra-diagnostics or Jolokia for exposing/shipping JMX metrics
2. Grafana or Prometheus for displaying the metrics

BACKUP

OpsCenter facilitates backing up data by providing snapshots of the data. A snapshot creates a new hardlink to every live SSTable. Cassandra also provides online backup facilities using nodetool.

To take a snapshot of the data on the cluster, invoke:

$CASSANDRA_HOME/bin/nodetool snapshot

This will create a snapshot directory in each keyspace data directory. Restoring the snapshot is then a matter of shutting down the node, deleting the commit logs and the data files in the keyspace, and copying the snapshot files back into the keyspace directory.


CLIENT LIBRARIES

Cassandra has a very active community developing libraries in different languages.

C#
DataStax Enterprise: This driver is built on top of the DataStax C# driver for Apache Cassandra: https://docs.datastax.com/en/developer/csharp-driver-dse/2.9/

C/C++
DataStax Enterprise: This driver builds on the DataStax C/C++ driver for Apache Cassandra and includes specific features for DSE: https://docs.datastax.com/en/developer/cpp-driver-dse/1.10/

PYTHON
Pycassa: Pycassa is the most well-known Python library for Cassandra: github.com/pycassa/pycassa

REST
Virgil: Virgil is a Java-based REST client for Cassandra: github.com/hmsonline/virgil

RUBY
Ruby Gem: Ruby has support for Cassandra via a gem: rubygems.org/gems/cassandra

JAVA
DataStax Java driver: The industry-standard Java driver for Apache Cassandra: github.com/datastax/java-driver
Astyanax: Inspired by Hector, Astyanax is a client library developed by the folks at Netflix: github.com/Netflix/astyanax

PHP
Cassandra-PDO: A CQL (Cassandra Query Language) driver for PHP: code.google.com/a/apache-extras.org/p/cassandra-pdo

CQL

Cassandra provides an SQL-like query language called the Cassandra Query Language (CQL). The CQL shell allows you to interact with Cassandra as if it were a SQL database. Start the shell with:

$CASSANDRA_HOME/bin/cqlsh

DataStax has a reference card for CQL available here: https://www.datastax.com/sites/default/files/content/technical-guide/2019-09/cqltop10.final_.pdf

COMMAND LINE INTERFACE (CLI)

Cassandra also provides a Command Line Interface (CLI), through which you can perform all schema-related changes. It also allows you to manipulate data. DataStax provides reference cards for both CQL and nodetool, available here: https://docs.datastax.com/en/dse/6.7/cql/cql/cqlQuickReference.html

UPDATED BY MILAN MILOSEVIC, LEAD DATA ENGINEER, SMARTCAT

Milan Milosevic is Lead Data Engineer at SmartCat, where he leads a team of data engineers and data scientists implementing end-to-end machine learning and data-intensive solutions. He is also responsible for designing, implementing, and automating highly available, scalable, distributed cloud architectures (AWS, Apache Cassandra, Ansible, CloudFormation, Terraform, MongoDB). His primary focus is on Apache Cassandra and the monitoring, performance tuning, and automation around it.

DZone, a Devada Media Property, is the resource software developers, engineers, and architects turn to time and again to learn new skills, solve software development problems, and share their expertise. Every day, hundreds of thousands of developers come to DZone to read about the latest technologies, methodologies, and best practices. That makes DZone the ideal place for developer marketers to build product and brand awareness and drive sales. DZone clients include some of the most innovative technology and tech-enabled companies in the world, including Red Hat, Cloud Elements, Sensu, and Sauce Labs.

Devada, Inc.
600 Park Offices Drive, Suite 150
Research Triangle Park, NC 27709
888.678.0399 / 919.678.0300

Copyright © 2020 Devada, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means of electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.