Apache Cassandra is a high-performance, open-source NoSQL database engine that provides fault tolerance, linear scalability, and consistency across multiple nodes. Given its distributed architecture, Apache Cassandra handles huge volumes of data with Dynamo-style replication: replicas of the data are stored on several nodes in a cluster, which provides high availability and no single point of failure.
Apache Cassandra is ideal for IoT applications where massive amounts of data are collected. It also comes in handy in social media analytics, messaging services, and retail applications.
Companies that make use of Apache Cassandra include Netflix, Facebook, Cisco, Hulu, Twitter, and many more.
In this article, you will learn how to install and configure Apache Cassandra on Ubuntu 20.04 and Ubuntu 18.04.
Step 1: Installing Java on Ubuntu
Installation of Apache Cassandra begins with checking whether Java is installed. Specifically, the Cassandra 3.11 release installed in this guide runs on Java 8, so OpenJDK 8 is the version to install. Installing a different version is likely to give you errors during configuration.
To check whether Java is installed, run the command:
$ java -version
If Java is not yet installed, the terminal reports that the java command was not found.
To install OpenJDK, execute the following apt command.
$ sudo apt install openjdk-8-jdk
Once again, confirm that Java is installed by running the command.
$ java -version
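The same check can be scripted, which may be handy when preparing several nodes. The sketch below assumes that java -version prints a quoted version string to stderr and that Java 8 reports itself as 1.8.x (for example "1.8.0_292"); the helper names java_version_string and is_java8 are illustrative only.

```shell
# Extract the quoted version string from `java -version`, which writes to stderr.
java_version_string() {
    java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}'
}

# Succeed only for Java 8, which reports its version as 1.8.x.
is_java8() {
    case "$1" in
        1.8.*) return 0 ;;
        *)     return 1 ;;
    esac
}

if command -v java >/dev/null 2>&1 && is_java8 "$(java_version_string)"; then
    echo "Java 8 detected"
else
    echo "Java 8 not detected; install openjdk-8-jdk"
fi
```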
Step 2: Installing Apache Cassandra on Ubuntu
With Java installed, we will proceed to install Apache Cassandra. First, install the apt-transport-https package so that apt can access repositories over the HTTPS protocol.
$ sudo apt install apt-transport-https
Next, import the GPG key using the following wget command.
$ wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
Then add Apache Cassandra’s repository to the system’s sources list file as shown.
$ sudo sh -c 'echo "deb http://www.apache.org/dist/cassandra/debian 311x main" > /etc/apt/sources.list.d/cassandra.list'
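The 311x field in the repository line is the release series: the major and minor version followed by x, so 311x selects Cassandra 3.11.x. As a small sketch, the line can be generated from the series name; the function name cassandra_repo_line is hypothetical, and any series other than 311x is an assumption that should be checked against the Apache download pages before use.

```shell
# Build the apt source line for a given Cassandra release series.
# The series is the major and minor version followed by "x" (311x = 3.11.x).
cassandra_repo_line() {
    printf 'deb http://www.apache.org/dist/cassandra/debian %s main\n' "$1"
}

# Print the line used in this guide.
cassandra_repo_line 311x
```

To write the result to the sources list instead of printing it, the output can be piped to sudo tee /etc/apt/sources.list.d/cassandra.list.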
Before installing Apache Cassandra, update the package list.
$ sudo apt update
Then install the NoSQL database using the command:
$ sudo apt install cassandra
Usually, Apache Cassandra starts automatically. To confirm its status, run the following command:
$ sudo systemctl status cassandra
An Active: active (running) line in the output confirms that Cassandra is up and running as expected.
Additionally, you can verify the status of your node by running the command:
$ sudo nodetool status
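In the nodetool status output, a node line that begins with UN is in the Up/Normal state. Since Cassandra can take a little while to come up after the service starts, a script may want to poll for that state. The following is a minimal sketch: the function names, the 12 attempts, and the 5-second delay are arbitrary assumptions, and nodetool may need to be run with sudo on your system as in the command above.

```shell
# Succeed if the nodetool status output on stdin lists at least one node
# in the Up/Normal state, which nodetool abbreviates as "UN".
has_up_normal_node() {
    grep -q '^UN'
}

# Poll nodetool until a node is Up/Normal or the attempts run out.
# 12 attempts, 5 seconds apart; both values are arbitrary choices.
wait_for_cassandra() {
    attempts=0
    while [ "$attempts" -lt 12 ]; do
        if nodetool status 2>/dev/null | has_up_normal_node; then
            return 0
        fi
        attempts=$((attempts + 1))
        sleep 5
    done
    return 1
}
```

Nothing runs until the function is called, for example with wait_for_cassandra && echo "node is up".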
To log in to Cassandra from the terminal, launch the CQL shell with the command:
$ cqlsh
Step 3: Configuring Apache Cassandra on Ubuntu
Apache Cassandra configuration files are stored in the /etc/cassandra directory, whilst data is stored in the /var/lib/cassandra directory. Start-up options can be tweaked in the /etc/default/cassandra file.
Cassandra’s default cluster name is ‘Test Cluster’. To change this to a more meaningful name, log in to Cassandra.
$ cqlsh
To set the cluster name to your own preference, run the command shown below. In this case, we are setting the cluster name to ‘Tecmint Cluster’.
UPDATE system.local SET cluster_name = 'Tecmint Cluster' WHERE KEY = 'local';
Exit the prompt by typing:
EXIT;
Thereafter, open the cassandra.yaml file as shown:
$ sudo vim /etc/cassandra/cassandra.yaml
Search for the cluster_name directive and edit the cluster name accordingly, so that it matches the name set in cqlsh.
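If you prefer not to edit the file by hand, the same change can be sketched with sed. This assumes the directive sits at the start of a line in the form cluster_name: 'Test Cluster'; the function name set_cluster_name is hypothetical, and the demonstration works on a temporary sample file rather than the real configuration.

```shell
# Replace the cluster_name line in a cassandra.yaml-style file.
# $1 = path to the file, $2 = new cluster name.
# Assumes the name contains no "/", "&", or single-quote characters.
# A backup of the original is kept next to the file with a .bak suffix.
set_cluster_name() {
    sed -i.bak "s/^cluster_name:.*/cluster_name: '$2'/" "$1"
}

# Demonstrate on a temporary sample file, not on /etc/cassandra/cassandra.yaml.
demo=$(mktemp)
printf "cluster_name: 'Test Cluster'\nnum_tokens: 256\n" > "$demo"
set_cluster_name "$demo" "Tecmint Cluster"
grep '^cluster_name:' "$demo"
```

For the real file under /etc/cassandra, the sed command would need to be run with sudo.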
Save and exit the configuration file, then restart the Cassandra service.
$ sudo systemctl restart cassandra
Once the service is back up, you can log in with cqlsh again to confirm the cluster name.
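The restart and confirmation can also be scripted. The sketch below is an assumption-laden outline rather than a required procedure: it flushes the system keyspace before restarting, a step many guides recommend so that the renamed cluster is written to disk, and then queries system.local with cqlsh -e. The function names and the 30-second pause are arbitrary.

```shell
# Succeed if the cqlsh output on stdin contains the expected cluster name.
output_has_cluster_name() {
    grep -qF "$1"
}

# Flush, restart, and confirm the new cluster name ($1).
# Defined as a function so that nothing runs until it is called.
apply_cluster_rename() {
    # Persist the updated system.local row before the restart.
    sudo nodetool flush system || return 1
    sudo systemctl restart cassandra || return 1
    # Arbitrary pause to give Cassandra time to accept connections again.
    sleep 30
    cqlsh -e "SELECT cluster_name FROM system.local;" | output_has_cluster_name "$1"
}
```

Calling apply_cluster_rename "Tecmint Cluster" returns success only if the query output contains the new name.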
And that concludes the topic on the installation of Apache Cassandra on Ubuntu 20.04 and Ubuntu 18.04.