Apache Spark and Cassandra and SQL

This is a short intro to getting started with Apache Spark and Cassandra, running SQL queries against the Cassandra tables.

Note that I am not running a Spark cluster; I am running in "local" mode. To me this is really convenient: no Spark master and workers to run for something this small. So for playing around with Spark and Cassandra this is really good.

I am using Scala and SBT.

Something I struggled hard with was getting the dependency versions right. It is crucial that you do not do what I did at first: using version 1.5.2 of Spark together with 1.5.0 of the Spark Cassandra Connector will NOT work. I constantly got exceptions with java.lang.NoSuchMethodException; it was incredibly frustrating to try out version after version.

build.sbt
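Roughly along these lines (a minimal sketch, not the exact original file; the project name and Scala version are assumptions, adjust them to your setup):

```scala
// build.sbt -- minimal sketch for Spark + Cassandra in local mode.
// Matching versions matter: a 1.5.x connector goes with a 1.5.x Spark.
name := "spark-cassandra-test"

version := "1.0"

scalaVersion := "2.11.7"

val sparkV = "1.5.2"
val sparkCassandraConnectorV = "1.5.0"

libraryDependencies ++= Seq(
  "org.apache.spark"   %% "spark-core"                % sparkV,
  "org.apache.spark"   %% "spark-sql"                 % sparkV,
  "com.datastax.spark" %% "spark-cassandra-connector" % sparkCassandraConnectorV
)
```

The connector pulls in the DataStax Cassandra Java driver transitively, so you normally do not need to declare it yourself.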

A small Scala program to show how it works

SparkTest.scala
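Something like this (a sketch, assuming a local Cassandra on 127.0.0.1 with a keyspace "test" and a table "kv"; both names are placeholders, swap in your own keyspace and table). It uses the Spark 1.5 SQLContext API and runs Spark in local mode, so no cluster is needed:

```scala
// SparkTest.scala -- run Spark SQL over a Cassandra table, all in local mode.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SparkTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("SparkTest")
      .setMaster("local[2]") // local mode, two worker threads
      .set("spark.cassandra.connection.host", "127.0.0.1")

    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Expose the Cassandra table as a DataFrame via the connector's data source.
    val df = sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "test", "table" -> "kv"))
      .load()

    // Register it as a temp table so we can query it with plain SQL.
    df.registerTempTable("kv")
    sqlContext.sql("SELECT * FROM kv").show()

    sc.stop()
  }
}
```

Run it with `sbt run` while Cassandra is up; `show()` prints the query result to the console.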

The output…

2 thoughts on “Apache Spark and Cassandra and SQL”

  1. Well, I think you should make sure that

    val sparkV = "1.5.0"
    val sparkCassandraConnectorV = "1.5.0-RC1"
    val cassandraDriverV = "3.0.0-rc1"

    has sparkV set to version 1.5.2:

    val sparkV = "1.5.2"
    val sparkCassandraConnectorV = "1.5.0-RC1"
    val cassandraDriverV = "3.0.0-rc1"

    If you go to the Maven repository, there is a Spark version 1.5.2:
    http://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11/1.5.2

    Then go to the Maven repository for the Cassandra Spark Connector: there is no version 1.5.2 there, but there is a version 1.5.0:
    http://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11

    According to this page
    https://github.com/datastax/spark-cassandra-connector

    the Cassandra Spark Connector 1.5 should work with any 1.5.x Spark version, so that combination should work.

    If you cannot get that to work, I suggest you upgrade to 1.6. But perhaps you are not allowed to… then it is tough…

    Good Luck !
