Upgrading a Large Cassandra Cluster with cstar

I recently did an upgrade of 200+ nodes of Cassandra across multiple environments sitting behind multiple applications using the cstar tool. We chose the cstar tool because, out of all automation options, it has topology awareness specifically to Cassandra. Here…

Read More >

Spark + Cassandra Best Practices

Spark Overview Spark was created in 2009 as a response to difficulties with map-reduce in Hadoop, particularly in supporting machine learning and other interactive data analysis. Spark simplifies the processing and analysis of data, reducing the number of steps and…

Read More >

Global Analytics with Azure Cosmos Db and Synapse Analytics – SQL On The Edge Episode 21

A few months ago, Microsoft revealed that they were looking into adding a capability of querying Cosmos Db data through Spark and this immediately got me thinking into the new scenarios this would enable. The most ambitious is the capability…

Read More >

Testing Cassandra compatible APIs

In this quick blog post, I’m going to assess how the databases that advertise themselves as “Cassandra API-compatible” fare in the compatibility department. But that is all I will do, only API testing, and not an extensive testing, just based…

Read More >

How to build your very own Cassandra 4.0 release

Over the last few months, I have been seeing references to Cassandra 4.0 and some of its new features. When that happens with a technology I am interested in, I go looking for the preview releases to download and test….

Read More >

Handling a Cassandra transactional workload

Overview of Cassandra As previously mentioned in my notes on lightweight transactions, Cassandra does not support ACID transactions. Cassandra was built to support a brisk ingest of writes while being distributed for availability. Follow the link to my previous post…

Read More >

Step-by-bstep monitoring Cassandra with with Prometheus and Grafana

In this blog, I’m going to give a detailed guide on how to monitor a Cassandra cluster with Prometheus and Grafana. For this, I’m using a new VM which I’m going to call “Monitor VM”. In this blog post, I’m…

Read More >

How to tweak the number of num_tokens (vnodes) in live Cassandra cluster

Some clients have asked us to change the number of num_tokens as their requirement changes. For example lower number of num_tokens are recommended is using DSE search etc.. The most important thing during this process is that the cluster stays…

Read More >

So you have a broken Cassandra SSTable file?

Every few months I have a customer come to me with the following concern: my compactions for one of my Cassandra tables are stuck or my repairs fail when referencing one of the nodes in my Cassandra cluster. I take…

Read More >

Proposal for a new Cassandra cluster key compaction strategy

Cassandra storage is generally described as a log-structured merge tree (LSM). In general, LSM storage provides great speed in performing writes, updates and deletes over reads. As a general rule, a write in Cassandra is an order of magnitude faster…

Read More >
Page 1 of 712345...Last Page »