cassandra

Load test Cassandra – The native way – Part 2: The How

In the previous post we talked briefly about why we need to load test a service. In this post we’ll pick up where we left off and look at how to load test Cassandra. So, let’s get started with this amazingly powerful tool: cassandra-stress.


Things to keep in mind:

  • A load test should simulate the real-world scenario, so it is very important to keep the setup as close as possible to the one in production. It is highly recommended to use a separate node/host in proximity to the cluster for load testing (e.g. deploy the load test server in the same region if your deployment is in AWS).
  • Do not use any node from the cluster itself for load testing. Since cassandra-stress comes bundled with the Cassandra distribution, it is tempting to run the tool directly on one of the nodes. Don’t: cassandra-stress is a heavyweight process that can consume a lot of JVM resources and, in turn, cloud your node’s performance.
  • Keep in mind that cassandra-stress is not a distributed program, so the load for the whole cluster is generated from a single host. Make sure memory on that host is not a bottleneck; I would recommend a host with at least 16 GB of memory.

How to use cassandra-stress:

Step 1 : The configuration file

The configuration file tells cassandra-stress how to prepare the keyspace, the table and the data for the load test. We need to configure a handful of properties that define the keyspace, the table, the data distribution for the test and the queries to test.

  • keyspace: Keyspace name
  • keyspace_definition: Define the keyspace
  • table: Table name
  • table_definition: Define the table
  • columnspec: Column distribution specifications
  • insert: Batch ratio distribution specifications
  • queries: A list of queries you wish to run against the schema

# Keyspace name
keyspace: keyspace_to_load_test

# The CQL for creating a keyspace (optional if it already exists)
keyspace_definition: |
  CREATE KEYSPACE keyspace_to_load_test with replication = {'class': 'SimpleStrategy', 'replication_factor' : '3'}

# Table name
table: table_to_load_test

# The CQL for creating a table you wish to stress (optional if it already exists)
table_definition: |
  CREATE TABLE table_to_load_test (
    id uuid,
    column1 text,
    column2 int,
    PRIMARY KEY((id), column1))

### Column Distribution Specifications ###
columnspec:
  - name: id
    population: GAUSSIAN(1..1000000, 500000, 15) # Normal distribution to mimic the production load
  - name: column1
    size: uniform(5..20) # Anywhere from 5 characters to 20 characters
    cluster: fixed(5)    # Assuming 5 distinct column1 values (rows) per partition
  - name: column2
    size: uniform(100..500) # Size distribution between 100 and 500

### Batch Ratio Distribution Specifications ###
insert:
  partitions: fixed(1) # We are just going to touch a single partition with an insert
  select: fixed(1)/5   # We would want to update 1/5th of the rows in the partition at any given time
  batchtype: UNLOGGED  # Use unlogged batches

#
# A list of queries you wish to run against the schema
#
queries:
  queryForUseCase:
    cql: select * from table_to_load_test where id = ? and column1 = ?
    fields: samerow
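
To make the columnspec comments above a bit more concrete, here is a minimal Java sketch of what a uniform(5..20) size distribution amounts to: each generated value gets a length picked evenly between 5 and 20 characters. The class and method names are made up for illustration, and this is not how cassandra-stress generates data internally; it only mirrors the idea.

```java
import java.util.Random;

// Illustration only: mimics the idea behind `size: uniform(5..20)`.
// UniformSizeSketch and randomText are hypothetical names, not part of cassandra-stress.
class UniformSizeSketch {

    // Returns a random lowercase string whose length is uniform in [min, max].
    static String randomText(Random rnd, int min, int max) {
        int length = min + rnd.nextInt(max - min + 1); // both bounds inclusive
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rnd.nextInt(26)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // fixed seed so runs are repeatable
        for (int i = 0; i < 5; i++) {
            String value = randomText(rnd, 5, 20);
            System.out.println(value.length() + " chars: " + value);
        }
    }
}
```

Running it prints five sample values, each between 5 and 20 characters long, which is roughly the kind of text the column1 spec asks for.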

Now that we have this configuration file ready, we can use it to run our load test with the cassandra-stress tool. Let’s see how to run the tool now.


Step 2 : Command options

The cassandra-stress tool comes bundled with your Cassandra distribution download. You will find it in apache-cassandra-<version>/tools/bin/. You can learn more about the available options through the tool’s help option. In this post I will go through an example and show you how to run the tool.

cassandra-stress user profile=stresstest.yaml duration=4h 'ops(insert=100,queryForUseCase=1)' cl=LOCAL_QUORUM -node <nodelist separated by commas> -rate threads=450 throttle=30000/s -graph file="stress-result-4h-ratelimit-clients.html" title=Stress-test-4h -log file=result.log

Let’s go over the options I used one by one to understand what they mean. This is by no means a comprehensive explanation. I would highly recommend giving the documentation a good read to learn more about these options.

  • user: Tells cassandra-stress to run the load test against a user-specified schema.
  • profile: Location of the configuration (YAML) file.
  • duration: Duration for which the load test should run.
  • ops: Operations defined in the YAML file to include in the load test. In our example these are insert and queryForUseCase.
  • cl: Consistency level for the operations.
  • node: Nodes in the cluster.
  • rate: Number of threads and the peak ops/sec limit.
  • graph: Graphical report of the run. Specify the file name and the title of the report.
  • log: Log file name.
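
As a rough sanity check on the numbers in the command above, the following Java sketch works out the operation mix and the upper bound on operations for the run. It assumes the ops(...) weights act as a plain ratio and that the throttle is actually reached for the whole duration; the class and method names are made up for illustration.

```java
// Back-of-the-envelope arithmetic for the example command.
// Assumes the ops(...) weights act as a plain ratio and the throttle is reached throughout.
class StressMixSketch {

    // Share of all operations that belongs to one weighted operation.
    static double share(int weight, int totalWeight) {
        return (double) weight / totalWeight;
    }

    // Upper bound on operations issued at a given throttle over a duration.
    static long maxOperations(long opsPerSecond, long durationSeconds) {
        return opsPerSecond * durationSeconds;
    }

    public static void main(String[] args) {
        int insertWeight = 100;   // insert=100
        int queryWeight = 1;      // queryForUseCase=1
        long throttle = 30_000;   // throttle=30000/s
        long duration = 4 * 3600; // duration=4h, in seconds

        int total = insertWeight + queryWeight;
        System.out.printf("insert share: %.4f%n", share(insertWeight, total));
        System.out.printf("query share:  %.4f%n", share(queryWeight, total));
        System.out.println("max operations in 4h: " + maxOperations(throttle, duration));
    }
}
```

In other words, roughly 99% of the operations are inserts, and the run can issue at most 432 million operations in four hours.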

It is as simple as this. The tool will now run for the duration specified and output a detailed report on the run.

I hope you found this helpful and would certainly be delighted to answer any question regarding this.  

cassandra

Load test Cassandra – The native way – Part 1: The Why

Load testing is an imperative part of the software development process. The idea is to test a feature/service in a prod-like environment under a realistic high load for an unusually long time frame, to gain confidence that the service will not bail out on us during critical times. Quite logical, isn’t it? In this series, I’ll go through my very brief experience load testing a schema in Cassandra. So let’s get started right away!

With micro-service architecture being the norm at almost every turn in software development, it is worth spending time talking about how to load test a micro-service. Is it going to be different from testing a monolithic service? Since we say that we test a near prod-like setup, does it mean that I have to spend a whole lot and set up the same number of nodes that the prod cluster has? And what if I have some kind of auto-scaling setup? These were a few questions I had when I had to load test a micro-service. The answer is quite simple mathematics: extrapolation. We simulate the load on one node and then extrapolate the result. This, however, may not be accurate, as a few things might be left out of the equation, like network bandwidth, disk I/O, etc. It is also essential to load test the load balancer to get a clear picture.
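
The extrapolation idea can be sketched in a few lines of Java. Everything here is an assumption for illustration: the measured per-node throughput, the node count, and especially the efficiency factor, which is just a stand-in for the things a single-node test cannot see (network bandwidth, disk I/O, load balancer overhead).

```java
// Sketch of the extrapolation idea: measure one node, then scale up.
// All numbers and the efficiency factor are made-up assumptions for illustration.
class ExtrapolationSketch {

    // Estimated cluster throughput from a single-node measurement.
    // efficiency in (0, 1] discounts for bottlenecks the single-node test cannot see.
    static double estimateClusterThroughput(double perNodeOpsPerSec, int nodes, double efficiency) {
        if (nodes < 1 || efficiency <= 0.0 || efficiency > 1.0) {
            throw new IllegalArgumentException("nodes must be >= 1 and efficiency in (0, 1]");
        }
        return perNodeOpsPerSec * nodes * efficiency;
    }

    public static void main(String[] args) {
        double measured = 1_200.0; // hypothetical requests/sec one node sustained
        int prodNodes = 8;         // hypothetical production node count
        double efficiency = 0.8;   // hypothetical discount for shared bottlenecks
        System.out.println("Estimated cluster capacity: "
                + estimateClusterThroughput(measured, prodNodes, efficiency) + " req/s");
    }
}
```

The estimate is only as good as the efficiency guess, which is exactly why load testing the load balancer and the shared infrastructure still matters.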

But wait! The above method works fine as long as each service has just one responsibility. How about load testing a scenario where the architecture is supposed to perform only if it’s part of a cluster? What do we do if these processes talk to each other and gossip among themselves? There are many big-data architectures like this, and one such service is Cassandra. Fortunately, there is a tool that comes bundled with Cassandra for this very purpose: cassandra-stress.

cassandra-stress was initially developed as an internal tool by the developers of Cassandra to load test the internals of Cassandra. Later, a mode was added to the tool to let Cassandra users test their own schema.

I definitely wouldn’t claim that cassandra-stress is the only way to achieve this. In fact, load testing Cassandra was possible well before this tool was generally available. My online research suggested that the next most popular choice was a JMeter plugin. I chose cassandra-stress for the obvious reason that it’s a native tool that comes bundled with Cassandra and has a pretty easy learning curve.

Let’s go over how to configure your own load test using the cassandra-stress tool in another post.

maven reactor
build-tools, java

Maven reactor: A smart way to a modular project structure

Usually, I avoid anything that involves XML processing or XML configuration. So, I wasn’t a big fan of Maven either when I started using it. Had I had any good alternative for building projects, I would undoubtedly have leaned towards it. Now, I do understand that Gradle is out there giving Maven very tough competition. But I feel it still has a lot of distance to cover; Maven got an awesome head start, and I don’t think Gradle will replace it, even with a lot of new frameworks supporting it (Android, Spring, etc.). I was quite amazed to learn how much Maven can do to ease the life of a programmer.

We could go on and on if I start talking about Maven. But I would like to share one feature I particularly like: the reactor.

It is often recommended to keep your projects small and concise, for obvious reasons. But usually, we find one huge project or a bunch of small standalone projects that depend on each other. Even if we divide a huge project into multiple small and cohesive projects/libraries/modules, we still have the overhead of manually making sure that the projects are built in the right order so that the right dependency is picked up. Many projects end up growing enormously because of this extra overhead on the developer when building the project.

Maven does have a smart way to manage the modules for us, without us having to make sure that the modules in the project are built in the right order. Let’s see how that is done.

The way a reactor project works is that you set up a top-level pom that manages all your modules. This is usually called the parent-pom. Every module that is part of the project is just another simple Maven project that inherits from this parent-pom. Along with this, you also need to tell the parent-pom what its children/modules are. This ensures Maven does all the magic for you while it’s building your project.

Structure of a Maven reactor project


That is all you need to do. Let’s now take a look at how to define your parent-pom.

Parent-pom:


<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.indywiz.springorama.reactor</groupId>
    <artifactId>maven-reactor-parent</artifactId>
    <packaging>pom</packaging>
    <version>1.0-SNAPSHOT</version>
    <modules>
        <module>maven-reactor-app</module>
        <module>maven-reactor-util</module>
    </modules>
</project>


If you compare this pom with a traditional pom file, the differences are the following.

  • Packaging is set to pom instead of jar/war. This is because your parent-pom is just a Maven entity that manages your modules; it is not a project that produces any artifact for you.
  • The modules tag. This tag defines all the projects that the reactor has to manage.

Keep in mind that the order in which you define your modules does not matter; we will get to that part at the end.

Now let’s look at the module-pom.

Module-pom:


<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>maven-reactor-parent</artifactId>
        <groupId>com.indywiz.springorama.reactor</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>
    <artifactId>maven-reactor-util</artifactId>
    <name>maven-reactor-util</name>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
            <version>3.7</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>


So, in this example, the first module is just a util library where I am using the commons-lang3 library from Apache. One other thing to note is that we do not need to specify the groupId and the version in this pom. They are inherited from the parent-pom.

Now, I would like to use this module as a dependency in my second module. The second module’s pom is similar to the first module’s. I just add the first module as a dependency to it.


<dependency>
    <groupId>com.indywiz.springorama.reactor</groupId>
    <artifactId>maven-reactor-util</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>

Now, all you have to do is to build the parent pom and see the magic happen.

Maven reactor build result


So, what just happened was that, when we built the parent pom, the reactor part of the build kicked in. Maven checked which modules come under this project, built the dependency graph, dynamically figured out that module 2 (the app module) depends on module 1 (the util library), and built the util module before it started building the app module.

I deliberately reversed the order in which I defined the modules in the parent pom. If you check the parent-pom’s modules tag, the app module is defined before the util module. I did that on purpose to show that the order in which we define them does not matter. The Maven reactor will figure out the right order to build these projects irrespective of the order in which they are defined in the parent pom.


<modules>
    <module>maven-reactor-app</module>
    <module>maven-reactor-util</module>
</modules>
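
To see why the declaration order does not matter, here is a small Java sketch of the general idea: a depth-first topological sort over the module dependency graph. This is not Maven’s actual implementation, and the class and method names are made up; it only shows how a build order can be derived from the dependencies alone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: a depth-first topological sort over module dependencies.
// This is not Maven's code; ReactorOrderSketch is a hypothetical name.
class ReactorOrderSketch {

    // modules: module name -> names of the modules it depends on, in declaration order.
    static List<String> buildOrder(Map<String, List<String>> modules) {
        Set<String> done = new LinkedHashSet<>();
        Set<String> inProgress = new LinkedHashSet<>();
        for (String module : modules.keySet()) {
            visit(module, modules, done, inProgress);
        }
        return new ArrayList<>(done);
    }

    private static void visit(String module, Map<String, List<String>> modules,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(module)) {
            return;
        }
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Cyclic dependency involving " + module);
        }
        for (String dependency : modules.get(module)) {
            if (modules.containsKey(dependency)) { // only order modules that are part of the build
                visit(dependency, modules, done, inProgress);
            }
        }
        inProgress.remove(module);
        done.add(module); // added only after everything it depends on
    }

    public static void main(String[] args) {
        Map<String, List<String>> modules = new LinkedHashMap<>();
        // Same declaration order as the parent pom: app first, util second.
        modules.put("maven-reactor-app", Collections.singletonList("maven-reactor-util"));
        modules.put("maven-reactor-util", Collections.<String>emptyList());
        System.out.println(buildOrder(modules)); // prints [maven-reactor-util, maven-reactor-app]
    }
}
```

Even though the app module is declared first, the util module comes out first in the build order, because the order follows the dependency edges rather than the declaration order.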

I hope you enjoyed this post. I’d be happy to hear your feedback. You can check out the complete example on GitHub here.

java, libraries

Lombok – A must have library to spice up your Java

Java, unfortunately, has a lot of unwanted verbosity when it comes to creating classes. While newer languages on the JVM compete with each other on how little boilerplate code you have to write, Java is still not even close to these competitors.

While we sit and complain about the language’s inability to cut down on unwanted ceremony, Project Lombok has given us a workaround to make our lives a little bit easier.

Alright, I am a Java programmer about to write a simple model class for my CRUD operations. I will have to go through the following steps to create the model.

  1. Create the model class.
  2. Define the attributes for the model with the right scope. (Since it’s good practice to encapsulate attributes properly, I will have to set the access specifier to private and expose the attributes through getters and setters. Personally I’m not a big fan of this.)
  3. Create the default constructor and parameterized constructors based on my needs.
  4. Optionally, override toString(), equals() and hashCode() if needed.
  5. Once I am done with the above, I might think of adding a nice builder pattern, so that I have a fluent way to populate the attributes.

This has to be repeated for every model, which can be very painful if we have many of them, and things will surely get annoying when a model class has to change. Clearly, we don’t have to do such things in modern languages like Scala, Groovy or Kotlin. These languages give you getters, setters, etc. for free. It would be awesome if Java also came up with something like this to avoid unnecessary boilerplate code when creating classes. Well, Lombok solves exactly this for us. It silently generates all the boilerplate code by integrating with the build tool and the IDE, so you can focus on specifying what you need and on the business logic.

If you look at the steps we followed above to create a model class, everything except creating the class and its attributes (which we have to do in any language anyway) could be avoided: specifying access specifiers, creating getters and setters, and overriding a bunch of obvious methods. Lombok provides an awesome way to achieve this through annotations.

Below is a model from the Spring-Integration post where I used Lombok annotations.


package com.indywiz.springorama.springintegration.model;

import javax.persistence.Entity;
import javax.persistence.Id;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.ToString;

@Data
@Entity
@Builder
@ToString
@NoArgsConstructor
@AllArgsConstructor
public class Person {
    @Id
    private Long personId;
    private String personName;
    private String personPhoneNumber;
}


This is a very simple model where I have just three fields, couple of JPA annotations and a bunch of Lombok annotations. The interesting thing to see is what Lombok generated for me when compiling the class.


package com.indywiz.springorama.springintegration.model;

import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class Person {
    @Id
    private Long personId;
    private String personName;
    private String personPhoneNumber;

    public static Person.PersonBuilder builder() {
        return new Person.PersonBuilder();
    }

    public Long getPersonId() {
        return this.personId;
    }

    public String getPersonName() {
        return this.personName;
    }

    public String getPersonPhoneNumber() {
        return this.personPhoneNumber;
    }

    public void setPersonId(Long personId) {
        this.personId = personId;
    }

    public void setPersonName(String personName) {
        this.personName = personName;
    }

    public void setPersonPhoneNumber(String personPhoneNumber) {
        this.personPhoneNumber = personPhoneNumber;
    }

    public boolean equals(Object o) {
        if (o == this) {
            return true;
        } else if (!(o instanceof Person)) {
            return false;
        } else {
            Person other = (Person)o;
            if (!other.canEqual(this)) {
                return false;
            } else {
                label47: {
                    Object this$personId = this.getPersonId();
                    Object other$personId = other.getPersonId();
                    if (this$personId == null) {
                        if (other$personId == null) {
                            break label47;
                        }
                    } else if (this$personId.equals(other$personId)) {
                        break label47;
                    }
                    return false;
                }
                Object this$personName = this.getPersonName();
                Object other$personName = other.getPersonName();
                if (this$personName == null) {
                    if (other$personName != null) {
                        return false;
                    }
                } else if (!this$personName.equals(other$personName)) {
                    return false;
                }
                Object this$personPhoneNumber = this.getPersonPhoneNumber();
                Object other$personPhoneNumber = other.getPersonPhoneNumber();
                if (this$personPhoneNumber == null) {
                    if (other$personPhoneNumber != null) {
                        return false;
                    }
                } else if (!this$personPhoneNumber.equals(other$personPhoneNumber)) {
                    return false;
                }
                return true;
            }
        }
    }

    protected boolean canEqual(Object other) {
        return other instanceof Person;
    }

    public int hashCode() {
        final int PRIME = 59;
        int result = 1;
        Object $personId = this.getPersonId();
        result = result * PRIME + ($personId == null ? 43 : $personId.hashCode());
        Object $personName = this.getPersonName();
        result = result * PRIME + ($personName == null ? 43 : $personName.hashCode());
        Object $personPhoneNumber = this.getPersonPhoneNumber();
        result = result * PRIME + ($personPhoneNumber == null ? 43 : $personPhoneNumber.hashCode());
        return result;
    }

    public String toString() {
        return "Person(personId=" + this.getPersonId() + ", personName=" + this.getPersonName() + ", personPhoneNumber=" + this.getPersonPhoneNumber() + ")";
    }

    public Person() {
    }

    public Person(Long personId, String personName, String personPhoneNumber) {
        this.personId = personId;
        this.personName = personName;
        this.personPhoneNumber = personPhoneNumber;
    }

    public static class PersonBuilder {
        private Long personId;
        private String personName;
        private String personPhoneNumber;

        PersonBuilder() {
        }

        public Person.PersonBuilder personId(Long personId) {
            this.personId = personId;
            return this;
        }

        public Person.PersonBuilder personName(String personName) {
            this.personName = personName;
            return this;
        }

        public Person.PersonBuilder personPhoneNumber(String personPhoneNumber) {
            this.personPhoneNumber = personPhoneNumber;
            return this;
        }

        public Person build() {
            return new Person(this.personId, this.personName, this.personPhoneNumber);
        }

        public String toString() {
            return "Person.PersonBuilder(personId=" + this.personId + ", personName=" + this.personName + ", personPhoneNumber=" + this.personPhoneNumber + ")";
        }
    }
}


This is how the class would have looked if I had to write it manually without Lombok. With only about two dozen lines of code with Lombok, I get getters and setters, equals() and hashCode(), the builder pattern, a constructor with all three attributes, a default constructor and a toString() method that includes all the class attributes. I could have gotten a lot more by just adding a few more Lombok annotations.
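
To give a feel for the calling side, here is a small usage sketch. So that it compiles without Lombok or JPA on the classpath, it uses a trimmed, hand-written stand-in for Person (setters and the public constructors are left out for brevity, and Objects-based equals/hashCode replace the generated ones). With the Lombok-annotated model, the calls in main would read the same apart from the class names, which are made up here.

```java
import java.util.Objects;

// Usage sketch. PersonUsageSketch.Person is a hand-written, trimmed stand-in that mirrors
// part of the API Lombok generates for the Person model; it is not the generated code.
class PersonUsageSketch {

    static class Person {
        private final Long personId;
        private final String personName;
        private final String personPhoneNumber;

        private Person(Long personId, String personName, String personPhoneNumber) {
            this.personId = personId;
            this.personName = personName;
            this.personPhoneNumber = personPhoneNumber;
        }

        static Builder builder() {
            return new Builder();
        }

        Long getPersonId() { return personId; }
        String getPersonName() { return personName; }
        String getPersonPhoneNumber() { return personPhoneNumber; }

        @Override
        public boolean equals(Object o) {
            if (o == this) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(personId, other.personId)
                    && Objects.equals(personName, other.personName)
                    && Objects.equals(personPhoneNumber, other.personPhoneNumber);
        }

        @Override
        public int hashCode() {
            return Objects.hash(personId, personName, personPhoneNumber);
        }

        @Override
        public String toString() {
            return "Person(personId=" + personId + ", personName=" + personName
                    + ", personPhoneNumber=" + personPhoneNumber + ")";
        }

        static class Builder {
            private Long personId;
            private String personName;
            private String personPhoneNumber;

            Builder personId(Long personId) { this.personId = personId; return this; }
            Builder personName(String personName) { this.personName = personName; return this; }
            Builder personPhoneNumber(String phone) { this.personPhoneNumber = phone; return this; }

            Person build() {
                return new Person(personId, personName, personPhoneNumber);
            }
        }
    }

    public static void main(String[] args) {
        // Fluent construction through the builder, then the generated-style accessors.
        Person person = Person.builder()
                .personId(1L)
                .personName("Jane Doe")
                .personPhoneNumber("555-0100")
                .build();
        System.out.println(person);
        System.out.println(person.getPersonName());
    }
}
```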

Installing Lombok:

Installing Lombok is very straightforward. You need to let the IDE and the build tool know that you are using Lombok. The rest is all done for you by Lombok.

Letting the IDE know:

  1. Download the Lombok jar from here.
  2. Either double-click the downloaded jar file or run the following command from the location where you downloaded the jar.
    java -jar lombok.jar
  3. Lombok will automatically scan your system for IDE installations and ask your permission to install the Lombok plugin for you. If it cannot find the installations in their default locations, you can also specify the location where your IDE is installed.
  4. Then click the Install button and you are set.

Letting your build tool know:

For maven:

Add the following dependency to your pom file.


<dependencies>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.16.20</version>
        <scope>provided</scope>
    </dependency>
</dependencies>

For gradle:

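Add the following to your build.gradle file. (The lombok block is an extension provided by a Lombok Gradle plugin, so this assumes such a plugin is already applied in your build.)


lombok {
    version = "1.16.20"
    sha256 = ""
}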

And you are all set to use Lombok in your development environment. May you have all the power that Lombok annotations bring forth. To see what other annotations Lombok offers, take a look at the official documentation here.

java

Install and manage multiple versions of java on your Mac OS X gracefully with jenv

The Only Thing That Is Constant Is Change.
― Heraclitus

With Oracle opting to release a new Java version every six months (more info on this here), it’s obvious that we will end up with multiple Java versions on our machines. The obvious next challenge is to manage these installations without messing up the Java setup on our local machine.

TL;DR: I have split this post into three parts. Feel free to jump to any part as per your needs.

Part 1: How to install homebrew and homebrew-cask

Part 2: How to install java using homebrew.

Part 3: How to manage multiple java installations using jenv.


Part 1:

Install Homebrew and Homebrew-cask:

There is an awesome way for Mac users to install and manage their Java installations gracefully. Before getting to how to manage multiple versions of Java, let’s look at how to install Java on Mac OS X.

IMHO, if you are going to develop Java apps, or for that matter do any programming on your Mac, it’s almost mandatory to have the homebrew tool installed on your machine.

If you do not have the tool installed yet, please visit homebrew’s webpage to learn how to install homebrew on your local machine.

Homebrew is to Mac OS what yum is to Linux: it’s a package manager.

Verify that you have correctly installed homebrew by running the following command.

[Screenshot: verifying the Homebrew installation]

Also, while you are at it, install homebrew-cask by running the following commands. (Visit this place to see other interesting ways to install cask).

~> brew tap phinze/homebrew-cask
~> brew install brew-cask

Now you have all the power to install awesome tools from homebrew.


Part 2:

Install Java thru Homebrew

Now all you have to do is to run the following command in your terminal.

Step 1: Verify if you have a java version:

~> brew cask info java8

brew cask info

Observe that the output shows that java8: 1.8.0_162-b12 is not installed.

Step 2: Install java:

~> brew cask install java8

install java8 using brew

You have now successfully installed java on your Mac.


Part 3:

Install jenv to manage multiple versions of java on Mac OS X:

Alright, now that you have Java, let’s say that six months from now a new Java release comes out. You do not want to upgrade your projects yet, but you still want to try out the new and cool language features.

Managing multiple Java versions can be a nightmare and requires some effort. Luckily, an awesome tool called jenv comes to our rescue. Let’s look at how to manage multiple Java versions with it.

jenv is a utility that manages multiple versions of Java and lets you switch between them with ease.

If you have made it this far, I assume that you have installed homebrew on your machine, so let’s get started right away and install jenv.

All you need to do to install jenv is run the following commands.

~> brew install jenv
~> echo 'eval "$(jenv init -)"' >> ~/.bash_profile

If you see the following result after your brew install jenv command, then you have successfully installed jenv on your machine.

jenv installation confirmation

Run the following command to add java versions for jenv to manage for you.

~> jenv add /Library/Java/JavaVirtualMachines/jdk1.8.0_162.jdk/Contents/Home/

That’s it, your java version can be managed by jenv now.

PS: You might run into problems with the following error when trying to run the above command.

ln: /Users/your_username/.jenv/versions/oracle64-1.8.0.162: No such file or directory

If you encounter this error when adding your Java version, all you need to do is create the .jenv directory with a versions directory inside it in your home directory (mkdir -p ~/.jenv/versions) and run the add command again.

Once the jenv add command succeeds, you should see a message like this:

Add java version to jenv

Boom! done.

If you have multiple java installations on your machine, you would have to add all the java installations to jenv.

jenv provides you different commands to switch java versions based on your needs.

To list all the java installs managed by jenv run:

~> jenv versions
* system (set by /Users/vranganathan/.jenv/version)
1.8
1.8.0.162
9.0
9.0.4
oracle64-1.8.0.162
oracle64-9.0.4

To configure a version:

# Configure globally on your machine
~> jenv global 1.8

# Configure locally per directory
~> jenv local 1.8

# Configure per shell
~> jenv shell 1.8
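
A quick way to confirm that a switch took effect is to ask the running JVM what it is. The following minimal Java sketch prints the version and home directory from standard system properties; the class name is made up for illustration.

```java
// Prints which Java runtime is actually running this code.
// Handy after `jenv global`, `jenv local` or `jenv shell` to confirm the switch took effect.
class WhichJava {

    // Short description of the running JVM, built from standard system properties.
    static String describeRuntime() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

Save it as WhichJava.java, compile it with javac and run it from the directory or shell where you configured the version; the printed java.home should point at the JDK that jenv selected.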

There are a lot more features that jenv offers. You can go through the documentation on their GitHub page.
