How to build your very own Cassandra 4.0 release

John Schulz

February 13, 2019

Tags: Google Cloud Platform, Technical Track, Cloud, Google Cloud Platform (Gcp), Devops, Serverless

Over the last few months, I have been seeing references to Cassandra 4.0 and some of its new features. When that happens with a technology I am interested in, I go looking for preview releases to download and test. Unfortunately, so far, there are no such releases. I am still interested, though, so I have found it necessary to build my own Cassandra 4.0 release.

In my humble opinion, this is not the most desirable way to do things, since there is no Cassandra 4.0 branch yet. Instead, the 4.0 code is on the trunk. With typically at least three or four commits a week landing right now, two builds made even a commit or two apart will differ slightly. It is, in essence, a moving target.

All that said, I decided that if I could do it, the least I could do is write about how, so that everyone who wants to try it can avoid a couple of dumb things I did on my first attempt. Building your very own Cassandra 4.0 release is actually pretty easy. It consists of five steps:

Step 1) Make sure you have your prerequisites
1. Java JDK 8 (1.8) or 11, OpenJDK or Oracle
2. Ant 1.8 or later
3. Git CLI client
4. Python 2.7 (>= 2.7, < 3.0)
Step 2) Clone the Git repository
1. git clone https://gitbox.apache.org/repos/asf/cassandra.git
Step 3) Build your new Cassandra release
1. cd cassandra
2. ant
Step 4) Run Cassandra
1. cd ./bin
2. ./cassandra
Step 5) Have fun
1. ./nodetool status
2. ./cqlsh

I will discuss each step in a little bit more detail:

Step 1) Verify, and if necessary, install your prerequisites

For Java, you can confirm the JDK is present by typing:

  john@Lenny:~$ javac -version
  javac 1.8.0_191

For Ant:

  john@Lenny:~$ ant -version
  Apache Ant(TM) version 1.9.6 compiled on July 20 2018

For Git:

  john@Lenny:~$ git --version
  git version 2.7.4

For Python:

  john@Lenny:~$ python --version
  Python 2.7.12

If you have all of the right versions, you are ready for the next step. If not, you will need to install the required software, which I am not going to go into here.
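
If you would rather run the presence checks in one go, here is a minimal sketch. The helper names `have_tool` and `check_prereqs` are my own inventions for illustration, not part of Cassandra or its build. It only tells you whether each command is on your PATH; it does not check versions, so you still need the version commands above for that.

```shell
#!/bin/sh
# Succeed if the named command can be found on the PATH.
# (have_tool is a hypothetical helper name.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool and return non-zero if any is missing.
# (check_prereqs is a hypothetical helper name.)
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if have_tool "$tool"; then
            echo "ok       $tool"
        else
            echo "MISSING  $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

To use it for this build, you would call `check_prereqs javac ant git python` and look for any MISSING lines.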

Step 2) Clone the repository

Verify you do not already have an older copy of the repository:

  john@Lenny:~$ ls -l cassandra
  ls: cannot access 'cassandra': No such file or directory

If you found a cassandra directory, delete it, move it out of the way, or change to a different working directory. Otherwise, clone the repository:

  john@Lenny:~$ git clone https://git-wip-us.apache.org/repos/asf/cassandra.git
  Cloning into 'cassandra'...
  remote: Counting objects: 316165, done.
  remote: Compressing objects: 100% (51450/51450), done.
  remote: Total 316165 (delta 192838), reused 311524 (delta 189005)
  Receiving objects: 100% (316165/316165), 157.78 MiB | 2.72 MiB/s, done.
  Resolving deltas: 100% (192838/192838), done.
  Checking connectivity... done.
  Checking out files: 100% (3576/3576), done.

  john@Lenny:~$ du -sh *
  294M cassandra

At this point, you have used up 294 MB on your host and you have an honest-for-real git repo clone on your host - in my case, a Lenovo laptop running Windows 10 Linux subsystem. And your repository looks something like this:


  john@Lenny:~$ ls -l cassandra
  total 668
  drwxrwxrwx 1 john john 512 Feb 6 15:54 bin
  -rw-rw-rw- 1 john john 260 Feb 6 15:54 build.properties.default
  -rw-rw-rw- 1 john john 101433 Feb 6 15:54 build.xml
  -rw-rw-rw- 1 john john 4832 Feb 6 15:54 CASSANDRA-14092.txt
  -rw-rw-rw- 1 john john 390460 Feb 6 15:54 CHANGES.txt
  drwxrwxrwx 1 john john 512 Feb 6 15:54 conf
  -rw-rw-rw- 1 john john 1169 Feb 6 15:54 CONTRIBUTING.md
  drwxrwxrwx 1 john john 512 Feb 6 15:54 debian
  drwxrwxrwx 1 john john 512 Feb 6 15:54 doc
  -rw-rw-rw- 1 john john 5895 Feb 6 15:54 eclipse_compiler.properties
  drwxrwxrwx 1 john john 512 Feb 6 15:54 examples
  drwxrwxrwx 1 john john 512 Feb 6 15:54 ide
  drwxrwxrwx 1 john john 512 Feb 6 15:54 lib
  -rw-rw-rw- 1 john john 11609 Feb 6 15:54 LICENSE.txt
  -rw-rw-rw- 1 john john 123614 Feb 6 15:54 NEWS.txt
  -rw-rw-rw- 1 john john 2600 Feb 6 15:54 NOTICE.txt
  drwxrwxrwx 1 john john 512 Feb 6 15:54 pylib
  -rw-rw-rw- 1 john john 3723 Feb 6 15:54 README.asc
  drwxrwxrwx 1 john john 512 Feb 6 15:54 redhat
  drwxrwxrwx 1 john john 512 Feb 6 15:54 src
  drwxrwxrwx 1 john john 512 Feb 6 15:54 test
  -rw-rw-rw- 1 john john 17215 Feb 6 15:54 TESTING.md
  drwxrwxrwx 1 john john 512 Feb 6 15:54 tools
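
The check-then-clone sequence above can be wrapped up so you cannot clone over an old copy by accident. This is only a sketch: `clone_if_absent` and `built_commit` are hypothetical helper names of my own. Since the trunk is a moving target, the second helper prints the short hash of the commit you actually checked out, which is worth writing down next to any test results.

```shell
#!/bin/sh
# Clone a repository only when the target directory does not exist yet.
# (clone_if_absent is a hypothetical helper name.)
clone_if_absent() {
    url=$1
    dir=$2
    if [ -e "$dir" ]; then
        echo "refusing to clone: '$dir' already exists" >&2
        return 1
    fi
    git clone "$url" "$dir"
}

# Print the short hash of the commit checked out in an existing clone.
# (built_commit is a hypothetical helper name.)
built_commit() {
    git -C "$1" rev-parse --short HEAD
}
```

From your home directory, that would look like `clone_if_absent https://gitbox.apache.org/repos/asf/cassandra.git cassandra && built_commit cassandra`.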

Step 3) Build your new Cassandra 4.0 release

Remember what I said in the beginning? There is no branch for Cassandra 4.0 at this point, so building from the trunk is quite simple:

  john@Lenny:~$ cd cassandra
  john@Lenny:~/cassandra$ ant
  Buildfile: /home/john/cassandra/build.xml
  …
  BUILD SUCCESSFUL
  Total time: 1 minute 4 seconds

That went quickly enough. Let's take a look and see how much larger the directory has gotten:

  john@Lenny:~$ du -sh *
  375M cassandra

Our directory grew by 81 MB, pretty much all of it in the new build directory, which now has 145 new files including ./build/apache-cassandra-4.0-SNAPSHOT.jar. I am liking that version 4.0 right in the middle of the filename.
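
If you want a quick way to confirm which jar the build produced, something like the following would do. It is a sketch that assumes the jars land in the build directory of the checkout and follow the apache-cassandra-*.jar naming seen above; `list_built_jars` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the file names of Cassandra jars found in <checkout>/build.
# Returns non-zero when no matching jar exists.
# (list_built_jars is a hypothetical helper name.)
list_built_jars() {
    build_dir=$1/build
    status=1
    for jar in "$build_dir"/apache-cassandra-*.jar; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$jar" ] || continue
        basename "$jar"
        status=0
    done
    return "$status"
}
```

Running `list_built_jars ~/cassandra` after a successful build should list the SNAPSHOT jar; a non-zero exit status suggests the build did not finish.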

Step 4) Start Cassandra up. This one is easy if you do the sensible thing

  john@Lenny:~/cassandra$ cd ..
  john@Lenny:~$ cd cassandra/bin
  john@Lenny:~/cassandra/bin$ ./cassandra
  john@Lenny:~/cassandra/bin$ CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.deserializeLargeSubset (Lorg/apache/cassandra/io/util/DataInputPlus;Lorg/apache/cassandra/db/Columns;I)Lorg/apache/cassandra/db/Columns;
  CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.serializeLargeSubset (Ljava/util/Collection;ILorg/apache/cassandra/db/Columns;ILorg/apache/cassandra/io/util/DataOutputPlus;)V
  CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.serializeLargeSubsetSize (Ljava/util/Collection;ILorg/apache/cassandra/db/Columns;I)I
  …
  INFO [MigrationStage:1] 2019-02-06 21:26:26,222 ColumnFamilyStore.java:407 - Initializing system_auth.role_members
  INFO [MigrationStage:1] 2019-02-06 21:26:26,234 ColumnFamilyStore.java:407 - Initializing system_auth.role_permissions
  INFO [MigrationStage:1] 2019-02-06 21:26:26,244 ColumnFamilyStore.java:407 - Initializing system_auth.roles
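
Since ./cassandra hands your prompt back while the node is still starting, it can help to wait until the node answers before moving on. Here is a small sketch of a generic retry loop; `retry` is a hypothetical helper name, and the attempt count and delay are arbitrary. I am assuming that ./nodetool status exits non-zero until it can reach the node.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
# (retry is a hypothetical helper name.)
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 1
}
```

From the bin directory, `retry 30 2 ./nodetool status && echo "Cassandra is answering"` would poll every two seconds for up to a minute.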

We seem to be up and running. It's time to try some things out:

Step 5) Have fun

We will start out making sure we are up and running by using nodetool to connect and display a cluster status. Then we will go into the CQL shell to see something new. It is important to note that since you are likely to have nodetool and cqlsh already installed on your host, you need to use the ./ in front of your commands to ensure you are using the 4.0 version. I have learned the hard way that forgetting the ./ can result in some very real confusion.


  john@Lenny:~/cassandra/bin$ ./nodetool status
  Datacenter: datacenter1
  =======================
  Status=Up/Down
  |/ State=Normal/Leaving/Joining/Moving
  -- Address Load Tokens Owns (effective) Host ID Rack
  UN 127.0.0.1 115.11 KiB 256 100.0% f875525b-3b78-49b4-a9e1-2ab0cf46b881 rack1
  
  john@Lenny:~/cassandra/bin$ ./cqlsh
  Connected to Test Cluster at 127.0.0.1:9042.
  [cqlsh 5.0.1 | Cassandra 4.0-SNAPSHOT | CQL spec 3.4.5 | Native protocol v4]
  Use HELP for help.
  cqlsh> desc keyspaces;
 
  system_traces system_auth system_distributed system_views
  system_schema system system_virtual_schema
 
  cqlsh>
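
Given how easy it is to forget the ./ and end up talking to a different install, one way to double-check is to pull the server version out of the cqlsh banner. This sketch assumes the banner keeps the format shown above; `banner_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Read a cqlsh banner line on stdin and print the Cassandra version in it.
# Expects the form "[cqlsh ... | Cassandra <version> | CQL spec ... ]".
# Prints nothing if no such line is seen.
# (banner_version is a hypothetical helper name.)
banner_version() {
    sed -n 's/.*| Cassandra \([^ |]*\) |.*/\1/p'
}
```

If your cqlsh prints the same banner for SHOW VERSION, which I believe it does, then `./cqlsh -e 'SHOW VERSION' | banner_version` should print 4.0-SNAPSHOT for this build.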

We got a nice cluster with one node and we see the usual built-in keyspaces. Well um… not exactly. We see two new keyspaces: system_virtual_schema and system_views. Those look very interesting. In my next blog post, I'll be talking more about Cassandra's new virtual table facility and how very useful it is going to be someday soon. I hope.
