Cassandra RandomPartitioner Tokenizing

I’m building a new cluster, and after the initial setup I needed to work out my tokens. As we’re told in https://wiki.apache.org/cassandra/Operations:

Token selection:

Using a strong hash function means RandomPartitioner keys will, on average, be evenly spread across the Token space, but you can still have imbalances if your Tokens do not divide up the range evenly, so you should specify InitialToken to your first nodes as i * (2**127 / N) for i = 0 .. N-1. In Cassandra 0.7, you should specify initial_token in cassandra.yaml.
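
As a rough sketch, the wiki's formula can be written out directly in Python. The function name wiki_tokens is made up for illustration, and integer division (//) is assumed for the wiki's "/" so that every token stays a whole number.

```python
def wiki_tokens(n):
    """Evenly spaced tokens per the wiki formula: i * (2**127 / N) for i = 0 .. N-1.

    wiki_tokens is a hypothetical helper name, not part of Cassandra.
    """
    step = 2 ** 127 // n  # assumption: integer division, so tokens are whole numbers
    return [i * step for i in range(n)]


if __name__ == "__main__":
    # One line per node, in the form the value would take as initial_token.
    for node, token in enumerate(wiki_tokens(4)):
        print("node %d: initial_token: %d" % (node, token))
```

With this form the first node always gets token 0 and every later node sits one fixed step further along the token space.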

Here’s a nice simple code snippet to figure out your RandomPartitioner tokens based on the size of your cluster:

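#!/usr/bin/python
# Print evenly spaced RandomPartitioner tokens for a cluster of N nodes.
# Usage: ./tokenizer.py N

import sys

def tokens(nodes):
    # Token i of N is i * (2**127 - 1) / N; // keeps the result an integer
    # on both Python 2 and Python 3.
    for i in range(1, nodes + 1):
        print(i * (2 ** 127 - 1) // nodes)

# nodes = int(raw_input("How many nodes? "))  # interactive alternative (Python 2)
nodes = int(sys.argv[1])
tokens(nodes)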

For a six-node cluster, the output should look like this:

# ./tokenizer.py 6

28356863910078205288614550619314017621

56713727820156410577229101238628035242

85070591730234615865843651857942052863

113427455640312821154458202477256070484

141784319550391026443072753096570088105

170141183460469231731687303715884105727
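
A quick way to sanity-check a token list like the one above is to look at the gaps between neighbouring tokens, which should all be nearly equal. The sketch below recomputes the six tokens with the same arithmetic as the script and compares the gaps. The helper names script_tokens and gaps are hypothetical and not part of the original script.

```python
def script_tokens(nodes):
    """Tokens computed as i * (2**127 - 1) // nodes for i = 1 .. nodes.

    Hypothetical helper that returns the values instead of printing them.
    """
    return [i * (2 ** 127 - 1) // nodes for i in range(1, nodes + 1)]


def gaps(tokens):
    """Distance between each pair of consecutive tokens."""
    return [b - a for a, b in zip(tokens, tokens[1:])]


if __name__ == "__main__":
    spacing = gaps(script_tokens(6))
    # A small spread means the nodes own nearly equal slices of the token space.
    print("largest gap minus smallest gap: %d" % (max(spacing) - min(spacing)))
```

Because of integer rounding the gaps will not always be identical, but they should never differ by more than 1.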
