To SQL or to NoSQL

Matthew MacFarquhar
4 min readApr 21, 2023

--

Over the past few weeks I have been watching videos, reading articles (and asking ChatGPT) what the main differences between these two Database Management Systems are. In this article I will talk about the differences and discuss some quick questions to ask yourself when picking which one you should use in your project.

I want to preface this comparison by saying in many cases, your project will use both. Neither of these are one size fits all which is why it is so nice that they can both live together in harmony.

An Example of Using Both in Harmony

Here is an example of where you would use both of these. Suppose I am a bank and I need to maintain very highly structured customer data (i.e. I cannot have customers without SSN and I cannot have customers with lists of addresses while others have one) I might also want to use some of SQLs powerful features such as Transactions (to ensure a modification is made successfully across tables and if not, then roll back) and rely on SQLs ACID guarantee to ensure that my customers account data is stable and correct.

However, my bank also released an app to help my customers track their spending. Receipt data is very inconsistent across different vendors (some may be missing addresses, some may not itemize the bill, some may not even have the vendor name!) putting this into a SQL table would be a nightmare full of NULL columns. Also, the sheer amount of receipt data we will need to collect would require a MASSIVE computer (because SQL can only scale — effectively — vertically). For these reasons, we would probably want to leverage a noSQL database for our receipt management side of our business.

Quick Questions

Now, we are going to go through some quick questions about your use case to determine if SQL or noSQL would fit more easily into your use case.

Do you have a lot of data?

And I mean a LOT like over 10TB worth, if so then noSQL is very likely to be your winner, the reason is that SQL can only scale vertically (i.e. get a bigger computer) while noSQL can scale horizontally (a lot of small computers). At some point, the size of computer you need to store all your data will become prohibitively expensive — or might not exist at all. The reason SQL cannot scale horizontally is because of its relational nature, the Queries and joins that SQL allow you to do means that the query must run on all of your data (which means it all has to be in the same place).

Do you have highly structured data?

If you have some requirements for your data to all have the same structure, then SQL is your best bet, you could mimic this in noSQL by enforcing a “schema” on write but there is no built in force stopping you from changing up the schema like there is in SQL.

Do you have irregular structured data?

noSQL will work best here, I can store partial data and different data types in a single noSQL database with no issue. Since noSQL data is essentially just json we can have really rich object structures as well like maps and lists and JSON objects.

Do you need to query your data with SQL Query Language?

This should be a no brainer…

Do you need to operate on multiple entries with a single transaction?

In this case, SQL wins out as a single transaction can update multiple rows (entries) and if it fails, all of these changes are rolled back. In noSQL there is no such rollback guarantee, half the entry updates could fail while the other half succeed and unless you explicitly preform a rollback, your transaction will be in this in between state.

Need lots of concurrent connections?

noSQL is the champ here as a single SQL server will have a limit on concurrent connections (since it is just one computer) while our noSQL servers have virtually no limit.

Some extra stuff

  • SQL is better if you don’t want repeated data across different tables (since it leverages joins and foreign keys) in noSQL it is very common to duplicate the same data in different tables.
  • SQL is really good for complex queries and reading your data, while noSQL is faster at writing data (since there is no strict schema to validate).
  • In SQL you would have multiple tables for customer, savingsAccount, checkingAccount in noSQL you can dump all of that together into on entry in the database.

I hope this article has helped you in your decision of SQL vs noSQL. But more importantly, I hope it has helped you realize that — like peanut butter & jelly — they are better when used together.

--

--

Matthew MacFarquhar
Matthew MacFarquhar

Written by Matthew MacFarquhar

I am a software engineer working for Amazon living in SF/NYC.

No responses yet