Load test Cassandra – The native way – Part 1: The Why

Load testing is an essential part of the software development process. The idea is to exercise a feature or service in a prod-like environment under a realistically high load for an unusually long time frame, to gain confidence that the service will not bail out on us at critical times. Quite logical, isn’t it? In this series, I’ll go through my admittedly brief experience load testing a schema in Cassandra. So let’s get started right away!

With micro-service architecture being the norm at almost every turn in software development, it is worth spending time talking about how to load test a micro-service. Is it going to be different from testing a monolithic service? Since we say that we test against a near prod-like setup, does that mean I have to spend a whole lot and set up the same number of nodes as the prod cluster has? And what if I have some kind of auto-scaling setup? These were a few of the questions I had when I had to load test a micro-service. The answer is quite simple mathematics: extrapolation. We simulate the load on a single node and then extrapolate the result to the number of nodes in the cluster. This, however, may not be accurate, as a few things might be left out of the equation, like network bandwidth, disk I/O, etc. It is also essential to load test the load balancer to get a clear picture.
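To make the extrapolation idea concrete, here is a minimal sketch in Python. The function names, the `scaling_efficiency` fudge factor (a stand-in for whatever the single-node test leaves out, such as network bandwidth or disk I/O) and the sample numbers are all assumptions of mine for illustration; none of them come from a real measurement.

```python
import math


def extrapolate_cluster_throughput(per_node_ops_per_sec: float,
                                   node_count: int,
                                   scaling_efficiency: float = 1.0) -> float:
    """Estimate cluster throughput from a single-node load test result.

    scaling_efficiency is a hypothetical factor in (0, 1] that discounts
    the ideal linear estimate for overheads the single-node test misses.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if not 0.0 < scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    return per_node_ops_per_sec * node_count * scaling_efficiency


def nodes_needed(target_ops_per_sec: float,
                 per_node_ops_per_sec: float,
                 scaling_efficiency: float = 1.0) -> int:
    """Estimate how many nodes are required to sustain a target load."""
    if per_node_ops_per_sec <= 0:
        raise ValueError("per_node_ops_per_sec must be positive")
    if not 0.0 < scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    effective_per_node = per_node_ops_per_sec * scaling_efficiency
    # Round up: a fraction of a node still means one more node.
    return math.ceil(target_ops_per_sec / effective_per_node)


if __name__ == "__main__":
    # Illustrative numbers only: one node sustained 2000 ops/s in the test.
    print(extrapolate_cluster_throughput(2000, node_count=6, scaling_efficiency=0.75))
    print(nodes_needed(10000, per_node_ops_per_sec=2000, scaling_efficiency=0.75))
```

With an efficiency of 1.0 this is plain linear extrapolation; anything lower is a guess at how much the real cluster falls short of that ideal, so treat the result as a rough estimate and not as a guarantee.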

But wait! The above method works fine as long as each service has just one responsibility. How about load testing a scenario where the architecture is supposed to perform only when it’s part of a cluster? What do we do if these processes talk to each other and gossip among themselves? There are many big-data architectures like this, and one such service is Cassandra. Fortunately, there is a tool that comes bundled with Cassandra for this very purpose: cassandra-stress.

cassandra-stress started out as an internal tool, created by the developers of Cassandra to load test the internals of Cassandra. Later, a mode was added to the tool to let Cassandra users test their own schemas.

I definitely wouldn’t claim that cassandra-stress is the only way to achieve this. In fact, load testing a Cassandra cluster was possible well before this tool was generally available. My online research suggested that the next most popular choice was a JMeter plugin. I chose cassandra-stress for the obvious reasons: it’s a native tool that comes bundled with Cassandra, and it has a pretty gentle learning curve.

Let’s go over how to configure your own load test using the cassandra-stress tool in another post.
