Skip to content

"SQL or NoSQL," that's the question

Share on twitter
Share on linkedin
Share on email
Share on whatsapp
SQL or NoSQL", that's the question

One of the most important moments in most projects we face today is deciding where and how to store data. The success of our project will largely depend on this choice, whether it is right or wrong.

We currently have a wide variety of databases. We have the traditional SQL databases, the NoSQL, as well as the more modern, and less known newSQL, next to them we can also find the object oriented ones, in memory, hybrids, and a long etcetera.

There is not a perfect database for all applications, so we must analyze which one will be the best for our application, choosing one or another will depend on the needs of the service you want to offer.

In this post we will focus on the choice between SQL or NoSLQ which, in a way, is to decide between relational or non-relational databases. The rest of the solutions in the market are either based on the relational model, like the newSQL that try to improve the performance of the SQL, or based on the non relational model, like the object oriented ones.

The following questions can help us focus on the choice between SQL or NoSQL: What size of data do we need to manage? What is the nature of the data? What response time do we need? What types of queries and services do you need to provide? How will we scale up data and query growth? Do we need high availability, fault tolerance, data integrity?

Let's see the main characteristics of each one of them to know their advantages and disadvantages and thus be able to answer the questions we have.

Main features of relational databases

  • They are based on an entity-relationship model.
  • They require a known and predefined scheme, which makes them very inflexible to change.
  • They use the SQL language and JOIN operations for the data query.
  • They guarantee the properties ACID.
    • Atomicity: Atomicity implies that every transaction is "all or nothing". If any part of the transaction fails, the whole operation fails and the database remains unchanged.
    • Consistency: Consistency is the property that ensures data integrity. Any insertion, modification or deletion of data will change the database from a valid state to another valid state. From this new state all new queries will show the new changes.
    • Isolation: Isolation implies that any transaction cannot affect others. It requires that concurrent transactions be executed separately. Most databases offer different levels of isolation to reduce the load of locks.
    • Durability: Durability is the property that guarantees the persistence of the data. Once an operation has been confirmed, the changes applied are permanent, even if the database falls.
  • Vertical scalability. They are usually scaled by more powerful hardware, which can involve heavy investments.
  • A very high number of requests can generate bottlenecks, due to the consistency and isolation, slowing down the whole system.

Main characteristics of the non-relational databases

  • They do not require a predefined scheme. As they are not tied to a scheme they offer a lot of flexibility.
  • They do not usually use SQL as a language for data queries.
  • The data is usually unformatted to avoid JOIN operations, which are usually not allowed.
  • They do not guarantee the ACID model, but are more in line with the BASE.
    • Basic Availability. The system continues to work when some part fails thanks to a distributed structure and replication of information.
    • Soft State. Not all parties have to return the same values at the same time.
    • Eventual Consistency. Consistency occurs eventually.
  • Horizontal scalability. They are designed to be scalable across multiple low-cost servers.
  • Capable of storing and managing large volumes of data with high performance and availability at the cost of loss of consistency and isolation.

Main differences

The fundamental difference is that NoSQL databases do not use the entity-realization model, while SQL databases are based on it.

Relational databases are the best known and most used. Composed of tables, defined by structured fields and relations, they are a robust but not very flexible solution. We all know the implication of changing the relational model plus the consequent modifications that will have to be made in the applications that are based on its known data structure.

In contrast, NoSQL databases are a more open and flexible solution to different types of information, but data integrity can be compromised by poor transactional support, which is why they are generally less robust.

CAP Theorem

According to this theorem a distributed system cannot simultaneously guarantee consistency, availability and partitioning of data, but can only have two of the three characteristics at the same time.

  • Consistency: Consistency implies that all nodes see the same information at the same time.
  • Availabilitty: Availability ensures that every request, both read and write, always receives a response.
  • Partition Tolerance: Data partitioning allows the system to continue working even if some nodes fail or are shut down.

Taking the theorem to our choice of database we will have to choose two of the three characteristics that best suit the needs of our system, and we can obtain the following combinations:

  • CA: Consistency and availability. Partitioning is not a requirement.
  • CP: Consistency and partitioning of data at the expense of availability.
  • AP: Data availability and partitioning. Consistency is not important.

The SQL databases are CA. Consistency is provided by ACID properties and availability can be achieved by clustering.

In the case of NoSQL we can find the other two possibilities depending on the type of database. We will comment on the different types of NoSQL in a new post soon.

Conclusions

If the data is always structured and invariable, we must ensure the ACID properties, and/or if we do not require much scalability, the best option will be SQL.

On the contrary, when the data do not fit in a relational model, are very variable with little or no structure, the volume of information grows very quickly and/or vertical scalability is not feasible in costs, we should propose a NoSQL solution.

During the last years, the war between SQL and NoSQL has been open, although some considered that NoSQL would win the battle by making SQL disappear, the truth is that nowadays both are still needed, each one to solve different problems, each one with its pros and cons.

It is not surprising that it is increasingly common to find applications where SQL and NoSQL coexist to provide solutions to our projects. This is what is known as Polyglot Persistence, so perhaps it is time to "smoke the pipe of peace".

Share the article

Share on twitter
Twitter
Share on linkedin
LinkedIn
Share on email
Email
Share on whatsapp
WhatsApp

A new generation of technological services and products for our customers