This post is a quick tutorial on how to use Scala to interact with a Cassandra database, using the phantom library. phantom is written in Scala and acts as a wrapper on top of cassandra-driver (the Java driver for Cassandra). It offers a concise and typesafe way to write and design Cassandra interactions in Scala. In this post, I will be sharing my 2 cents for people getting started with this library.
Prerequisites
Before getting started, I would like to make a few assumptions to keep this post as short as possible:
- You know what Cassandra is.
- You have Cassandra running locally or on a cluster for this tutorial.
- You have basic knowledge of Scala and sbt.
In this post, let's write a simple database client for accessing User information. For simplicity's sake, our User model has just id, firstName, lastName and emailId as its fields. Since Cassandra is all about designing your data model around your query use cases, we have two use cases for accessing the data and hence define two tables in Cassandra.
The tables would have the following schema.
Make sure your database has this schema configured before proceeding further.
Step 0: Add the phantom library to the dependencies
Let us start by creating a simple sbt Scala project and adding the following dependency to libraryDependencies in the build.sbt file.
"com.outworkers" % "phantom-dsl_2.12" % "2.30.0"
Step 1: Define your model class
In Cassandra, it is totally acceptable to denormalize the data and have multiple tables for the same model. phantom is designed to allow this. We define one case class as the data holder and several model classes that correspond to our different use cases. Let us go ahead and define the model and case classes for our User data model.
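The data holder could look something like the sketch below. Only the field names come from the description above; the field types (a UUID id and plain String fields) are my assumption.

```scala
import java.util.UUID

// Data holder shared by every table that stores a user.
// Field types are assumed: a UUID id and plain String fields.
final case class User(
  id: UUID,
  firstName: String,
  lastName: String,
  emailId: String
)
```

The same case class is reused by every table model, which is what lets us keep one User type in the application even though the data lives in two tables.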
Now, we define the tables that correspond to the tables in Cassandra. This allows phantom to construct the queries for us in a typesafe manner.
The following are the use cases for which we would need clients:
- To access the data using the user id (user_by_id table).
- To access the data using the user's first name (user_by_first_name table).
We create appropriate models that reflect these tables in Cassandra and define low-level queries using phantom-dsl, so that phantom constructs the queries for us and we don't have to write any CQL statements in our application.
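As a rough sketch of what such table models can look like, here is a version written from my memory of the phantom 2.x API, so do check it against the version you use. The class names, the method names (findById, findByFirstName, save) and the Cassandra column names (id, first_name, last_name, email_id) are all assumptions of mine, as is the choice of id as the clustering column in the second table.

```scala
import java.util.UUID
import scala.concurrent.Future
import com.datastax.driver.core.ResultSet
import com.outworkers.phantom.dsl._

final case class User(id: UUID, firstName: String, lastName: String, emailId: String)

// Models the user_by_id table: one partition per user id.
abstract class UserById extends Table[UserById, User] with RootConnector {
  override def tableName: String = "user_by_id"

  // Columns are declared in the same order as the fields of User,
  // on the assumption that phantom derives the row mapping from that order.
  object id extends UUIDColumn with PartitionKey
  object first_name extends StringColumn
  object last_name extends StringColumn
  object email_id extends StringColumn

  // SELECT ... WHERE id = ? returning at most one row.
  def findById(userId: UUID): Future[Option[User]] =
    select.where(_.id eqs userId).one()

  // INSERT of a whole User record.
  def save(user: User): Future[ResultSet] =
    store(user).future()
}

// Models the user_by_first_name table: partitioned by first name,
// with id assumed to be the clustering column.
abstract class UserByFirstName extends Table[UserByFirstName, User] with RootConnector {
  override def tableName: String = "user_by_first_name"

  object id extends UUIDColumn with PrimaryKey
  object first_name extends StringColumn with PartitionKey
  object last_name extends StringColumn
  object email_id extends StringColumn

  // SELECT ... WHERE first_name = ? returning every matching row.
  def findByFirstName(firstName: String): Future[List[User]] =
    select.where(_.first_name eqs firstName).fetch()

  def save(user: User): Future[ResultSet] =
    store(user).future()
}
```

Both classes stay abstract on purpose: they only become usable once they are bound to a connection, which is what the next step takes care of.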
You can access the state of the project until this step in GitHub here.
Now that we have our models and low level queries defined, we need to design an effective way to manage our session and database instances.
Step 2: Encapsulate database instances in a Provider
Since we interact with multiple tables for a single model, User, phantom provides a way to encapsulate these database instances in one place and manage them for us. This way, our implementations won't be scattered around and will be effectively managed with the appropriate session object.
So, in this step, we define an encapsulation for all the database instances by extending phantom's Database class. This is where we create instances for the models we defined in the above step.
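A minimal sketch of this encapsulation, again written from my memory of the phantom 2.x API, might look as follows. The names UserDatabase's members (userById, userByFirstName) and the trait name UserDbProvider are my own, and the two table models are compact stand-ins for the ones from Step 1 with assumed column names.

```scala
import java.util.UUID
import com.outworkers.phantom.connectors.CassandraConnection
import com.outworkers.phantom.dsl._

final case class User(id: UUID, firstName: String, lastName: String, emailId: String)

// Compact stand-ins for the table models from Step 1 (queries omitted).
abstract class UserById extends Table[UserById, User] with RootConnector {
  override def tableName: String = "user_by_id"
  object id extends UUIDColumn with PartitionKey
  object first_name extends StringColumn
  object last_name extends StringColumn
  object email_id extends StringColumn
}

abstract class UserByFirstName extends Table[UserByFirstName, User] with RootConnector {
  override def tableName: String = "user_by_first_name"
  object id extends UUIDColumn with PrimaryKey
  object first_name extends StringColumn with PartitionKey
  object last_name extends StringColumn
  object email_id extends StringColumn
}

// One place that owns every table instance for the User model.
// Mixing in connector.Connector binds each table to this connection's session.
class UserDatabase(override val connector: CassandraConnection)
    extends Database[UserDatabase](connector) {
  object userById extends UserById with connector.Connector
  object userByFirstName extends UserByFirstName with connector.Connector
}

// Anything that mixes this in has to supply a concrete `database` value.
trait UserDbProvider extends DatabaseProvider[UserDatabase]
```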
Notice that we also defined a trait extending DatabaseProvider[UserDatabase] in the above snippet. We will discuss how this trait is useful in the next step.
You can access the state of the project until this step here.
Step 3: Orchestrate your queries using DatabaseProvider
Now that you have your model and database instances in place, exposing their methods directly may not be a good design. What if you need some kind of data validation or data enrichment before calling them? To be specific, in our case we need to write to two tables whenever a User record is created. To serve such purposes, we need an extra layer of abstraction over the queries we defined. This is the exact purpose of the DatabaseProvider trait.
We define our database access layer (like a service) by extending the DatabaseProvider trait and orchestrate our low-level queries so that the application can access data with the right abstraction.
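Here is a sketch of such a service layer under the same assumptions as before (phantom 2.x API recalled from memory; all class, method and column names are mine). I am also assuming that the implicit execution context needed by the for-comprehension comes from the phantom dsl import.

```scala
import java.util.UUID
import scala.concurrent.Future
import com.datastax.driver.core.ResultSet
import com.outworkers.phantom.connectors.CassandraConnection
import com.outworkers.phantom.dsl._

final case class User(id: UUID, firstName: String, lastName: String, emailId: String)

abstract class UserById extends Table[UserById, User] with RootConnector {
  override def tableName: String = "user_by_id"
  object id extends UUIDColumn with PartitionKey
  object first_name extends StringColumn
  object last_name extends StringColumn
  object email_id extends StringColumn
  def findById(userId: UUID): Future[Option[User]] = select.where(_.id eqs userId).one()
  def save(user: User): Future[ResultSet] = store(user).future()
}

abstract class UserByFirstName extends Table[UserByFirstName, User] with RootConnector {
  override def tableName: String = "user_by_first_name"
  object id extends UUIDColumn with PrimaryKey
  object first_name extends StringColumn with PartitionKey
  object last_name extends StringColumn
  object email_id extends StringColumn
  def findByFirstName(name: String): Future[List[User]] =
    select.where(_.first_name eqs name).fetch()
  def save(user: User): Future[ResultSet] = store(user).future()
}

class UserDatabase(override val connector: CassandraConnection)
    extends Database[UserDatabase](connector) {
  object userById extends UserById with connector.Connector
  object userByFirstName extends UserByFirstName with connector.Connector
}

trait UserDbProvider extends DatabaseProvider[UserDatabase]

// The service layer: the only API the rest of the application should see.
trait UserService extends UserDbProvider {

  // Writes the same record to both tables, one insert after the other.
  // Validation or enrichment of `user` would go here, before the writes.
  def createUser(user: User): Future[Unit] =
    for {
      _ <- database.userById.save(user)
      _ <- database.userByFirstName.save(user)
    } yield ()

  def findById(id: UUID): Future[Option[User]] =
    database.userById.findById(id)

  def findByFirstName(firstName: String): Future[List[User]] =
    database.userByFirstName.findByFirstName(firstName)
}
```

Note that the two inserts in this sketch run sequentially and are not atomic: if the second one fails, the first table already has the record.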
You can see that we were able to combine both inserts (to user_by_id and user_by_first_name) in one method call. We could also have done some validation or data transformation here if we wanted to.
You can access the state of the project until this step here.
Step 4: Putting everything together
We are all set to use the database service we have created so far. Let's see how this is done.
We start out by creating our CassandraConnection instance. This is how we inject our Cassandra configuration into phantom and let it manage the database session for us.
Here we are using Cassandra installed locally, hence we used ContactPoint.local. In a real-world deployment we would use ContactPoints(hosts) instead.
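To make the two options concrete, here is a small sketch of both connection styles. The keyspace name user_keyspace and the host addresses are placeholders I made up, so substitute your own. As far as I know, phantom only opens the session lazily, so merely defining these values should not try to reach Cassandra.

```scala
import com.outworkers.phantom.connectors.{CassandraConnection, ContactPoint, ContactPoints}

object Connections {
  // Placeholder keyspace name; use the keyspace that holds your tables.
  val keyspace: String = "user_keyspace"

  // Local development: a single Cassandra node on localhost.
  lazy val local: CassandraConnection =
    ContactPoint.local.keySpace(keyspace)

  // Cluster: seed the driver with several contact points (placeholder addresses).
  val clusterHosts: Seq[String] = Seq("10.0.0.1", "10.0.0.2")

  lazy val cluster: CassandraConnection =
    ContactPoints(clusterHosts).keySpace(keyspace)
}
```

Either value can then be passed to the constructor of the Database subclass from Step 2.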
Then we create our database encapsulation instance and the service object.
Now, it is as simple as just calling a bunch of methods to test out if our database interactions work.
You can find the final state of the project here.
phantom-dsl comes bundled with quite a few constructs, like Database, DatabaseProvider, etc. But in my opinion, this is what makes it more than just another Scala DSL library. Through these design constructs, phantom implicitly promotes good design for the people using it.
Hope you found my rambling useful.