The Curious Case of Cosmos DB

Kunal Sale
6 min read · Jan 25, 2021

Recently, I got a chance to work on Cosmos DB, the NoSQL database offered by Microsoft Azure. Initially it was fun writing CRUD operations on the DB. Once things got complex, I had to take out the magnifying glass to get a closer look at the DB. Often the concepts and the solutions to the problems were interesting and only increased my curiosity to learn more. So, I wanted to share some details that can help someone who wants to choose Cosmos DB, or someone who is stuck.

Data Modelling (Normalise or Denormalise)

Cosmos DB is a document-based NoSQL database, which is different from a relational database. In a relational database, we try to normalise the data by breaking an entity into discrete components. In Cosmos DB, we should denormalise the data, that is, keep everything related to a particular entity in the same document, because Cosmos DB doesn't support joins across different containers the way a relational database does across tables.

Suppose there is an entity “Person” whose components are “Address” and “ContactDetail”. In a relational database, all three would sit in separate tables associated through the primary key-foreign key mechanism. The picture is different in Cosmos DB: there will be a single “Person” document containing nested arrays for “Address” and “ContactDetail”.

{
  "first_name": "Kunal",
  "last_name": "Sale",
  "address": [{
    "city": "Mumbai",
    "state": "Maharashtra"
  }],
  "contact": [{
    "phone": "+1111111111",
    "email_id": "something@abc.com"
  }]
}

It’s all about the RUs

RU stands for Request Unit, the unit used to measure the throughput of a container. All resource usage, like CPU, IOPS (input/output operations per second) and memory, is rolled up into RUs. Azure charges for Cosmos DB based on RU consumption, billed on an hourly basis. By now you must have understood how important RUs are, since their consumption is directly proportional to the money you pay. There are two modes for setting the RUs.

  1. Manual mode:

You can set the RUs for Cosmos DB manually and Cosmos DB will give you throughput as per the RUs you have provisioned. In this mode, you need to know exactly what your consumption rate and load on the DB are. If you miscalculate the RUs and get more requests than you expected, be prepared for the 429 Too Many Requests error.

Let’s take an example. Say you have set the throughput to 400 RU/s and, for simplicity, each 1 KB document write costs 1 RU, so Cosmos DB can handle 400 write requests per second. If the load is now 500 requests, 400 requests will be served directly but the remaining 100 have to wait and retry in order to get a chance.
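If you stick with manual throughput, your client code has to back off and retry when it gets throttled. Below is a minimal sketch of such a retry loop using the @azure/cosmos JavaScript SDK; the endpoint and key environment variables, the database and container names, and the back-off delays are all illustrative, and recent SDK versions already retry throttled requests a few times on their own before surfacing the 429.

// Minimal 429-retry sketch (assumes @azure/cosmos v3; names and delays are illustrative)
const { CosmosClient } = require("@azure/cosmos");

const client = new CosmosClient({
  endpoint: process.env.COSMOS_ENDPOINT,
  key: process.env.COSMOS_KEY
});
const container = client.database("demo-db").container("people");

async function createWithRetry(doc, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await container.items.create(doc);
      console.log(`Write consumed ${response.requestCharge} RUs`);
      return response.resource;
    } catch (error) {
      if (error.code !== 429 || attempt === maxAttempts) throw error;
      // Throttled: wait before trying again (simple exponential back-off)
      await new Promise((resolve) => setTimeout(resolve, 100 * 2 ** attempt));
    }
  }
}

createWithRetry({ id: "1", first_name: "Kunal", last_name: "Sale" }).catch(console.error);

Here, auto-scale comes into the picture.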

  2. Auto-scale mode:

In auto-scale mode, you don’t have to worry about calculating RUs. Cosmos DB scales the RUs up when there is load on the DB, so the 429 error will not occur as long as you stay within the configured maximum. You might wonder: if it scales the RUs up, does it also scale them back down once consumption is low? The answer is yes. Also, auto-scale pricing considers the highest RU/s consumed in a given hour, so you will be charged more only during the peak hours and most of the time you will be billed near the lower bound.

Now, the question is which one to choose?

You can set RUs on the database as a whole or on each container separately. Keep the throughput at the container level: set manual RUs for the containers whose consumption you know is low, and for the containers whose consumption rate you cannot predict exactly but expect to be on the higher side, simply enable auto-scale mode and let Cosmos DB handle it.
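As a rough sketch, both cases can be provisioned from the @azure/cosmos SDK when the containers are created. The database and container names below are illustrative, and the throughput/maxThroughput options are the ones I'd expect in recent v3 releases, so double-check them against the SDK version you use.

// Per-container throughput sketch: manual RUs for predictable containers,
// autoscale for the spiky ones (names are illustrative; assumes @azure/cosmos v3)
const { CosmosClient } = require("@azure/cosmos");

async function provision() {
  const client = new CosmosClient({
    endpoint: process.env.COSMOS_ENDPOINT,
    key: process.env.COSMOS_KEY
  });
  const { database } = await client.databases.createIfNotExists({ id: "demo-db" });

  // Low, predictable traffic: fixed (manual) 400 RU/s
  await database.containers.createIfNotExists({
    id: "audit-logs",
    partitionKey: { paths: ["/tenantId"] },
    throughput: 400
  });

  // Spiky, hard-to-predict traffic: autoscale between 10% of the maximum and 4000 RU/s
  await database.containers.createIfNotExists({
    id: "orders",
    partitionKey: { paths: ["/customerId"] },
    maxThroughput: 4000
  });
}

provision().catch(console.error);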

Don’t be lazy when choosing the partition key!

The performance of a container largely depends on its partition key. Cosmos DB distributes the data into logical partitions using the partition key, and documents with the same partition key are stored in the same logical partition. Using this mechanism, Cosmos DB keeps data operations cheap. Don’t confuse partitioning with indexing. Choosing an efficient partition key is very important for performance.

A few tips on using the partition key to improve performance:

  • Try to write queries that contain the partition key, because Cosmos DB can then route the query to a single partition and fetch the documents with minimum response time.
  • Choose a partition key that is (close to) unique throughout a container, as it will distribute the data evenly across partitions. But this isn’t possible every time, as you may not find a property that is unique in every container.

That’s okay. To achieve efficient throughput even when the partition key is not unique, you have to evaluate your use case and use a combination of keys (a synthetic partition key) to keep your queries optimised.
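For instance, if no single property is both high-cardinality and present in your hot queries, one common approach is a synthetic partition key that concatenates two properties. The sketch below is illustrative (the property names, values and container are made up, and it assumes the code runs inside an async function with a container object from @azure/cosmos): it stores such a key and then passes it in the query so the query stays scoped to a single logical partition.

// Synthetic partition key sketch: the container is assumed to be created with
// the partition key path "/partitionKey" (all names here are illustrative)
const person = {
  id: "42",
  first_name: "Kunal",
  city: "Mumbai",
  state: "Maharashtra"
};
// Combine two properties into one partition key value, e.g. "Maharashtra-Mumbai"
person.partitionKey = `${person.state}-${person.city}`;
await container.items.create(person);

// Include the partition key in the query and in the options so Cosmos DB can
// route it to a single logical partition instead of fanning out to all of them
const querySpec = {
  query: "SELECT * FROM c WHERE c.partitionKey = @pk AND c.first_name = @name",
  parameters: [
    { name: "@pk", value: "Maharashtra-Mumbai" },
    { name: "@name", value: "Kunal" }
  ]
};
const { resources } = await container.items
  .query(querySpec, { partitionKey: "Maharashtra-Mumbai" })
  .fetchAll();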

Note: The partition key cannot be changed for a container after it is created. If you are planning to change the partition key, you have to create a new container and migrate the data into it. Hence, don’t be lazy; choose your partition key wisely.

Beware of Concurrency

In a traditional database there are locks, which means an operation waits until it gets the lock for the particular data; this keeps the data consistent. Cosmos DB doesn’t lock the document on which an operation is currently happening. If there are concurrent write operations on the same document, there is no assurance about which copy will override the one in the database.

Let’s take an example:

{
  "first_name": "John",
  "last_name": "Doe",
  "employee_id": 1,
  "mobile_number": "9999811222",
  "city": "Mumbai",
  "state": "Maharashtra"
}

Suppose there are 3 requests (R1, R2 and R3) which have read the above document at time T0. They modified the document according to their business logic: R1 changed the mobile_number to “9212121219”, R2 changed the city to “Bangalore” and R3 changed the state to “Karnataka”. Now they write their copies back to the DB in the order R1, R2 and at last R3. Then, the result will be

{
  "first_name": "John",
  "last_name": "Doe",
  "employee_id": 1,
  "mobile_number": "9999811222", ← R1 updated this, but it was overridden by R3
  "city": "Mumbai", ← R2 changed this to "Bangalore", but it was overridden by R3
  "state": "Karnataka" ← Only R3's change got reflected, because R3's copy didn't have the changes made by R1 and R2
}

So, this is a major issue if the load is high and there are concurrent transactions on a particular document at a given time.

Solution: Optimistic Concurrency (_etag, the unsung hero)

When a document is created in a container, a few system-defined properties are added besides id. _etag is one of them, and it changes every time the document is updated. It can be used to prevent the data from being overridden by concurrent write operations.

It can be achieved by modifying the upsert operation as given below:

// Update the document, sending its current _etag so Cosmos DB can detect a conflict
// (these helpers assume `this.container` is a Cosmos Container, e.g. inside a repository class)
async function update(doc) {
  return this.container.items.upsert(doc, {
    accessCondition: {
      type: "IfMatch",
      condition: doc._etag
    }
  });
}

// Fetch the latest copy of the document for the given id
async function fetch(id) {
  const querySpec = {
    query: `select * from c where c.id = '${id}'`
  };
  const { resources } = await this.container.items.query(querySpec).fetchAll();
  return resources[0];
}

// Somewhere inside an async caller:
await update(objectToBeUpdated)
  .catch(async (error) => {
    if (error.code === 412) {
      // The document changed since we read it: fetch the latest copy...
      const fetchedObject = await fetch('123');
      // ...re-apply our changes to it, then call update again with the fresh _etag
      return update(fetchedObject);
    }
    throw error;
  });

In the above code, accessCondition is the object added to the upsert options, and it sends the current object’s _etag. While updating the document in the container, Cosmos DB checks whether the _etag on the stored document is the same as the _etag sent in the upsert. If it doesn’t match, it returns a 412 Precondition Failed error, which simply means that the document has already been updated by someone else. In the catch block, you have to fetch the latest document, re-apply your changes, and call the upsert again with that document.

Let me summarise the points:

  1. Try to keep related data in a single document if possible (applicable only to NoSQL).
  2. Keep an eye on your RUs, as they are what affect your pocket. 😆
  3. Choose the partition key wisely, as it will make your queries faster.
  4. Optimistic concurrency is a mechanism to handle concurrent write operations and avoid losing data.
