How to Perform (UDC) User-Defined Compactions in Cassandra

User-defined compactions allow us to manually select which files should be compacted. This enables us to reclaim space and limit the size of compaction so it can fit into the remaining space. These compactions are relevant only for SizeTieredCompactionStrategy (STCS)…

Let’s Deal with High Read Latencies in Cassandra

High latency values may indicate a cluster at the edge of its processing capacity, issues with the data model—such as poor choice of partition key or high levels of tombstones—or issues with the underlying infrastructure. Below are some major reasons…

Incremental Repair: Problems and a Solution

Because incremental repairs can significantly reduce the time and IO cost of performing a repair, they can seem like a great idea. However, practical implementation carries a few pitfalls which can cause severe damage to a production cluster, especially when…

Upgrading a Large Cassandra Cluster with cstar

I recently upgraded 200+ Cassandra nodes across multiple environments sitting behind multiple applications using the cstar tool. We chose the cstar tool because, out of all automation options, it has topology awareness specific to Cassandra. Here…

Spark + Cassandra Best Practices

Spark Overview: Spark was created in 2009 as a response to difficulties with map-reduce in Hadoop, particularly in supporting machine learning and other interactive data analysis. Spark simplifies the processing and analysis of data, reducing the number of steps and…

Global Analytics with Azure Cosmos Db and Synapse Analytics – SQL On The Edge Episode 21

A few months ago, Microsoft revealed that they were looking into adding the capability to query Cosmos Db data through Spark, and this immediately got me thinking about the new scenarios this would enable. The most ambitious is the capability…

Testing Cassandra compatible APIs

In this quick blog post, I’m going to assess how the databases that advertise themselves as “Cassandra API-compatible” fare in the compatibility department. But that is all I will do: only API testing, and not extensive testing, just based…

How to build your very own Cassandra 4.0 release

Over the last few months, I have been seeing references to Cassandra 4.0 and some of its new features. When that happens with a technology I am interested in, I go looking for the preview releases to download and test…

Handling a Cassandra transactional workload

Overview of Cassandra: As previously mentioned in my notes on lightweight transactions, Cassandra does not support ACID transactions. Cassandra was built to support a brisk ingest of writes while being distributed for availability. Follow the link to my previous post…

Step-by-step monitoring Cassandra with Prometheus and Grafana

In this blog, I’m going to give a detailed guide on how to monitor a Cassandra cluster with Prometheus and Grafana. For this, I’m using a new VM which I’m going to call “Monitor VM”. In this blog post, I’m…
