Upskill/Reskill

Jun 13, 2022

To SQL or not to SQL, that is the question

Kenneth Kabiru

4 minutes

When Prince Hamlet opens his soliloquy with the phrase “To be, or not to be”, he is debating between living and dying. While this article will not delve into this morbid subject, the decision to use a relational or non-relational database for your application has far reaching consequences. Hopefully, by the time you’re done reading this, you’ll be in a better position to decide whether to SQL or not to SQL.

What is SQL?
Standard Query Language (SQL) is a programming language that was developed in the 1970s by a team from IBM Corporation. It was popularly known as SEQUEL, which is how some of us still pronounce it. The formulation of SQL can be traced to a paper by Dr. Edgar Codd on “A Relational Model of Data for Large Shared Data Banks”, which is considered the definitive model for Relational Database Management Systems (RDBMS). Codd sought to apply Relation Theory to data management in an attempt to abstract how data is organized from its users.

Similar to how SQL was developed to solve a problem, the popularity of noSQL has grown because of certain limitations with SQL. The needs of Web 2.0 companies such as Facebook have marked an increase in infrastructure and data needs; this has led to the term ‘Big Data’. Big Data refers to data that has more variety, velocity, and volume, popularly known as the 3 Vs.

For example, as of 2020, Twitter was averaging approximately 500 million tweets each day, which translates to around 6,000 tweets per second. The magnitude of the 3 Vs of this data might not be comprehensively handled by an RDBMS. Big data is an asset for an organization, but the advantage it offers is greatly determined by how fast this value can be derived from it.

This realization prompted large firms such as Amazon and Google to develop noSQL solutions like Cassandra, DynamoDB, and BigTable. Two more Vs of Big Data have emerged over the recent past: veracity and value. The 5 Vs have increased demand for non-relational databases, with an emphasis on consistency, performance, and reliability.

Relational Databases

SQL databases/ RDBMS connect data from several tables using keys. A key is a unique identifier that can be used to identify a row from another table. For example, in a STUDENTS table, the key in the tutor column would link to a row in the TUTORS table. Therefore, using the key can be used to retrieve details such as name, age, etc of the tutor. In the TUTORS table, this key might not necessarily be the Primary Key, but it is required to meet the unique constraint e.g. an email. In the STUDENTS table, the key is referred to as a Foreign Key and it establishes the relationship between records in different tables.

In addition to the existence of relationships between tables, SQL databases enforce a static and predefined schema. The schema outlines data configuration for example, a certain column should always be an INTEGER or a table only has 4 columns.

Examples of relational databases include Oracle, MySQL, Microsoft SQL Server, SQLite, & PostgreSQL.

Properties of Relational Databases

Stores data in tables
Strict and pre-defined schema
Queries performed using SQL

Advantages of Relational Databases

Referential Integrity: RDBMS achieves consistency through the use of primary and foreign keys. On the other hand, data integrity and consistency is achieved using constraints; for example, deleting a primary record would not be successful if there are foreign keys referencing it. Noteworthy, SQL allows for different rules in such scenarios: ON DELETE CASCADE i.e. delete related objects, ON DELETE RESTRICT i.e. deletion only allowed if the foreign key is referenced in the same operation, ON DELETE PROTECT i.e. do not allow deletion of referenced objects, and, ON DELETE SET NULL i.e. set Foreign Key to null.

ACID Compliance: Atomicity, Consistency, Isolation, and Durability (ACID) properties ensure database correctness and consistency.

Normalization: You can implement this technique in SQL databases to minimise data redundancies. This involves breaking down large tables into smaller ones.

Ability to write complex SQL queries, especially when performing data analysis.

Disadvantages of Relational Databases

Scalability: SQL databases scale vertically. As more data is collected and stored, the computing power and hardware will also need to be increased. In addition, significant processing power is needed to achieve complex queries on relational tables.

Rigidity: RDBMS require strict compliance with a pre-defined schema. Any alterations, for example from the data source, would require significant changes in the data model. This would also be implemented on already existing rows.

Non-Relational Databases

NoSQL databases differ from RDBMS in their lack of structure. Non-relational databases are schema-free, which offers flexibility when adapting to changing data models. These databases are optimized for the data type they are storing, instead of focusing on relationships between various records. The four common storage models adapted by noSQL databases are key-value, document, graph, and column-oriented.

Key-Value Stores

Uses a hash table where a unique key references an item. The keys can be organised in various logical groups that allow for better readability. An example of a key-value store is Redis.

Document Stores

These databases use the key-value model, but differ from key-value stores as they offer data encoding. Common encodings include JSON, BSON, and XML. In addition, querying can be done using both the data and the keys. Examples in this category include MongoDB and ElasticSearch.

Graph Databases

Edges and nodes are used to represent data in the form of a directed graph structure. These stores are popularly used in social networking applications where the focus is more on the relationship between objects instead of the objects themselves. Examples include Neo4J and Graph Base.

Column-Oriented Stores

In RDBMS, data is stored in rows, whereas this noSQL model stores data in columns. This approach offers fast reads and writes as rows for a single column are saved as a single disk entry. Examples of column stores include Cassandra and BigTable.

Advantages of noSQL Databases

Flexibility: non-relational databases are schema-free and hence lack the rigidity that RDBMS try to enforce.
Horizontal Scaling: This allows for speedy querying by adding more machines for example cloud servers.
Less Maintenance Costs: RDBMS require trained SQL professionals to manage and maintain them. However, noSQL databases offer easier data distribution and automatic repair that make administration less demanding in terms of workforce.

Disadvantages of noSQL Databases

Lack Data Consistency and Integrity: noSQL databases do not implement ACID properties. To achieve similar standards, custom logic has to be implemented, which results in increased complexity.
Lack of a Standard Language: Unlike RDBMS, where SQL is used to write queries, each noSQL database has its own unique query language; this makes it difficult to shift between such databases. For example, a developer that has worked with MySQL would find it easier to use PostgreSQL in contrast to switching from Redis to ElasticSearch.

To SQL… or not?

Both RDBMS and noSQL databases were developed to solve problems in data management. Therefore, it would be incorrect to say that one type of database is superior over the other. The decision on which type to use should be driven by business needs and the data being stored. For example, a school that has a consistent population of 300 students might prefer a SQL database, whereas a social media application would opt for noSQL to achieve the 5 Vs. The important factor to consider though, is not only the current business needs but also future projections. If at some point in the near future, rapid scaling will be required, it might be advisable to go with a noSQL database.

The Table below compares SQL and noSQL databases based on 6 common metrics:

Want to be part of the Andela Community? Then join the Andela Talent Network!

With more than 175,000 technologists in our community, in over 90 countries, we’re committed to creating diverse remote engineering teams with the world’s top talent. And our network members enjoy being part of a talented community, through activities, benefits, collaboration, and virtual and in-person meetups.

All you need to do to join the Andela Talent Network is to follow our simple sign-up process.

Submit your details via our online application then…

Complete an English fluency test – 15 minutes.

Complete a technical assessment on your chosen skill (Python, Golang, etc.) – 1 hour.

Meet with one of our Senior Developers for a technical interview – 1 hour.

Visit the Andela Talent Network sign-up page to find out more.

If you found this blog useful, check out our other blog posts for more essential insights!