How are secondary indices really stored?
This is based on the DataStax article found here: https://www.datastax.com/blog/2016/04/cassandra-native-secondary-index-deep-dive
Let’s just create a simple table:
CREATE TABLE customer (
    id int PRIMARY KEY,
    city text  -- column indexed below; its type is assumed, other columns are not shown
);
Or visualized as a table:
If we then create an index like this:
CREATE INDEX customer_city_idx ON customer (city);
This results in just a “normal” table, only hidden. In it, the column we created the index for becomes the partition key, and the partition key of the original table becomes the clustering key.
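To make the inversion concrete, here is a minimal Python sketch of the idea, not of Cassandra's actual storage code. The sample rows and the helper name `build_index` are made up for illustration; the only thing taken from the text above is that the indexed value acts as the partition key and the original partition key acts as the clustering key.

```python
from collections import defaultdict

# Hypothetical sample rows of the "customer" table: id (partition key) -> city.
customer = {1: "Oslo", 2: "Bergen", 3: "Oslo"}


def build_index(rows):
    """Invert the table: indexed value -> sorted list of original partition keys."""
    index = defaultdict(list)
    for customer_id, city in rows.items():
        index[city].append(customer_id)
    # Model the clustering key by keeping the ids sorted inside each index partition.
    return {city: sorted(ids) for city, ids in index.items()}


city_index = build_index(customer)
print(city_index["Oslo"])  # [1, 3]
```

Looking up a city in `city_index` yields the ids of the matching customers, which is roughly the job the hidden index table does for a query on the indexed column.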
With some data, the “customer” table would look like this:
And the index, which is itself a “table”, would then look like this:
In a cluster, the data of the source table is distributed over the nodes using the Murmur3 partitioner. The index table is also distributed, BUT not by its own partition key: each index entry is stored on the same node as the source-table row it points to.
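The consequence of this co-location can be sketched in Python as well. This is a toy model under stated assumptions: the cluster size `NODES`, the sample rows, and the function names are hypothetical, and MD5 is used as a stand-in because Murmur3 is not in the Python standard library. It only illustrates that each node indexes its own rows, so a lookup by city may have to consult more than one node.

```python
import hashlib
from collections import defaultdict

NODES = 3  # hypothetical cluster size


def node_for(partition_key):
    """Stand-in for token assignment; Cassandra uses Murmur3, this sketch uses MD5."""
    digest = hashlib.md5(str(partition_key).encode()).hexdigest()
    return int(digest, 16) % NODES


# Hypothetical sample rows: id -> city.
customer = {1: "Oslo", 2: "Bergen", 3: "Oslo", 4: "Oslo", 5: "Bergen"}

# Each node holds its share of the base table plus an index over only those rows.
node_rows = defaultdict(dict)
node_index = defaultdict(lambda: defaultdict(list))
for customer_id, city in customer.items():
    node = node_for(customer_id)
    node_rows[node][customer_id] = city
    node_index[node][city].append(customer_id)


def find_by_city(city):
    """Ask every node's local index and merge the answers."""
    hits = []
    for node in range(NODES):
        hits.extend(node_index[node].get(city, []))
    return sorted(hits)


print(find_by_city("Oslo"))  # [1, 3, 4]
```

Note that `node_for` is only ever applied to the customer id, never to the city: the index entries simply stay wherever their source rows landed.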