Tag: apache cassandra

Batch Operations in Apache Cassandra

Batches are one of the most misunderstood features of Apache Cassandra. They rarely improve performance; in fact, using batches can degrade it. To set the stage, let’s take a look at how Cassandra handles individual mutations. Individual mutations…


How to Deploy Spark in DataStax Cassandra 5.1


Spark is an open-source, distributed processing system used to manage big data workloads. Spark uses in-memory caching and optimized query execution for fast analytic queries against any data size. Simply put, Spark is used to process data on a very…


Change Your system_auth Replication Factor in Cassandra


Occasionally, clients reach out to us with authentication issues when a node is down. While this scenario shouldn’t happen in a high availability database management system (DBMS), it can if you miss a couple of very relevant lines in the…


Cassandra for Beginners: Replication


This post continues the previous post, Cassandra 101: Understanding What Cassandra Is, and highlights a series of topics related to Cassandra for beginners. Replication Factor: The replication factor in Cassandra can be…


Replacing Nodes in Cassandra


One of the many things to love about Cassandra is how operationally simple it is to add, remove or even replace nodes in a cluster. Replacing a node in Cassandra is as easy as setting your configuration files…


Cassandra Vulnerability – CVE-2020-13946 – Apache Cassandra RMI Rebind Vulnerability

On September 1, 2020, Apache disclosed a security vulnerability for Apache Cassandra. Summary: It’s possible for a local attacker without access to the Apache Cassandra process or configuration files to manipulate the RMI registry to perform a man-in-the-middle attack and…


The things I hate about Apache Cassandra

Intro: First, let me start by saying I do not hate Cassandra. I love Cassandra. In its place, Cassandra is a powerful, well-designed tool that scales to millions of operations per second over geographically distributed locations operating in a…


Examining the Lifecycle of Tombstones in Apache Cassandra

This post is the first part of a series of blog posts regarding the lifecycle and management of tombstones. Deleting and expiring data in Cassandra is something that you should carefully plan, especially if you’re about to delete a massive…


So you have a broken Cassandra SSTable file?

Every few months, a customer comes to me with the following concern: my compactions for one of my Cassandra tables are stuck, or my repairs fail when referencing one of the nodes in my Cassandra cluster. I take…


Backup strategies in Cassandra

Cassandra is a distributed, decentralized, fault-tolerant system. Data is replicated throughout multiple nodes (centers) across various data centers. The fact that Cassandra is decentralized means that it can survive single or even multi-node failures without losing any data. With Cassandra,…
