Zeppelin has become one of my favourite tools in my toolbox. I do a lot of design work for Cassandra and in Scala, and even though I love Cassandra, there are times when things just get too complicated for the CQL command line, while creating a small project in IntelliJ seems like too much hassle. In those cases, using Zeppelin to try things out is just perfect. So this page is a how-to with some useful cookbook recipes.
Setting Up Zeppelin
I use Docker, where things are so much easier, and I picked v0.8.0 because I never got 0.8.2 to work for some reason.
Download and Start Cassandra
docker pull cassandra:3.11
docker run --name Cassandra3 -p 9042:9042 cassandra:3.11
Download and Start Zeppelin
Download Zeppelin image
docker pull apache/zeppelin:0.8.0
Start Zeppelin on port 8080
docker run -p 8080:8080 --name zeppelin apache/zeppelin:0.8.0
The -p flag has the form -p hp:cp
hp = Host Port, the port on your local machine
cp = Container Port, the port inside the Docker container, which is the one Zeppelin exposes
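If port 8080 is already taken on your machine, only the host side of the mapping needs to change. Here is a minimal sketch, assuming a hypothetical helper function called start_zeppelin (it is not part of Docker or Zeppelin) and 8081 as an example of a free host port:

```shell
# Hypothetical helper: start Zeppelin on a chosen host port.
# The container port stays 8080, because that is the port Zeppelin listens on.
start_zeppelin() {
  host_port="${1:-8080}"   # fall back to 8080 when no argument is given
  docker run -p "${host_port}:8080" --name zeppelin apache/zeppelin:0.8.0
}

# Example (assumed free port): Zeppelin would then be at localhost:8081
# start_zeppelin 8081
```

The rest of this page assumes the default 8080:8080 mapping.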
Go to localhost:8080 in your web browser and you should see the Zeppelin start page.
Configure Zeppelin
Find out the IP address of Cassandra in your Docker network. As the inspect output shows, the IP address is 172.17.0.3.
$ docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "355be8072aafa87bafa8de19d00af597746039000d27e9245e2464fa54bf81a8",
        "Created": "2020-04-03T14:23:57.446760383Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "ceda1cebea87ee7244f00d5e88292ff76fc46142627ed4064e0b98cd92f728a3": {
                "Name": "zeppelin",
                "EndpointID": "2cc39278d16db811bc593945adcc4a7ae2d0e5409a98c1ddf0d548bcf0b7052a",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            },
            "f772b8c66fe3729bd00e2bd9d2e50472ec40b1e8047796f8f69db6ecee6a77ae": {
                "Name": "cassandra3",
                "EndpointID": "23fde4a184ca9456ddec164616c4603f6ee8f3c310e21cb7c4409d350d7c3fd6",
                "MacAddress": "02:42:ac:11:00:03",
                "IPv4Address": "172.17.0.3/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
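Reading through the whole network dump works, but the address can also be pulled out directly. The sketch below uses docker inspect with a Go template. container_ip is a hypothetical helper name, and the container name you pass in must match exactly what docker ps shows. Addresses on the default bridge network are not guaranteed to stay the same after containers are restarted, so it is probably worth re-checking after a restart.

```shell
# Hypothetical helper: print the IP address of a container, given its name.
container_ip() {
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Example: container_ip Cassandra3
```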
Set up IP address for Cassandra in the Spark Interpreter
Open the Interpreter settings page and go to the section on “Spark”
Now add a row that says
spark.cassandra.connection.host : 172.17.0.3
Now also edit the Dependencies
You can do this in two ways: either specify the Maven artifact with its version, or download the JAR file(s) to disk and copy them into the Docker container. I had to do the latter due to some issue with my network.
You need these two libraries:
- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11/2.0.12
- https://mvnrepository.com/artifact/com.twitter/jsr166e/1.1.0
On each page, click the JAR link to download the file, then copy it into the Docker container with
docker cp spark-cassandra-connector_2.11-2.0.12.jar zeppelin:/zeppelin/interpreter/spark/dep/spark-cassandra-connector_2.11-2.0.12.jar
docker cp jsr166e-1.1.0.jar zeppelin:/zeppelin/interpreter/spark/dep/jsr166e-1.1.0.jar
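If you prefer the command line over clicking through the browser, the download-and-copy steps above could be scripted roughly like this. It is only a sketch: it assumes both JARs are available from Maven Central under the standard repository layout at repo1.maven.org, that curl is installed, and that the container is named zeppelin as above. maven_jar_url and install_jar are hypothetical helper names.

```shell
# Hypothetical helper: build the Maven Central URL for group, artifact, version.
maven_jar_url() {
  group_path=$(printf '%s' "$1" | tr '.' '/')   # com.twitter -> com/twitter
  printf 'https://repo1.maven.org/maven2/%s/%s/%s/%s-%s.jar\n' \
    "$group_path" "$2" "$3" "$2" "$3"
}

# Hypothetical helper: download one JAR and copy it into the Zeppelin container.
install_jar() {
  jar="$2-$3.jar"
  curl -fsSL -o "$jar" "$(maven_jar_url "$1" "$2" "$3")" &&
    docker cp "$jar" "zeppelin:/zeppelin/interpreter/spark/dep/$jar"
}

# Example:
# install_jar com.datastax.spark spark-cassandra-connector_2.11 2.0.12
# install_jar com.twitter jsr166e 1.1.0
```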
Set up IP address for Cassandra in the Cassandra Interpreter
cassandra.hosts : 172.17.0.3