More than 2.5 quintillion bytes of data are created every day, and this is only expected to grow from there according to the latest DOMO report . By the end of 2020, it is estimated that 1.7MB of data will be created every second for every person on Earth. Can you see the importance of databases?
Companies have invested literally billions of dollars in infrastructure to not only store but also search, update and delete this simply absurd amount of data.
So let’s understand what they are, what they are for and what the main types of databases we currently have at our disposal are, and discover which is the ideal database for your next project.
What is a database?
In its simplest definition, a database is simply a way to store and organize information. Ideally, this information is stored in an orderly fashion to aid in search.
You can imagine a database as if it were a library of books, each book is organized according to some type of category, be it by genre, author, language in which it was written, etc.
And each library has an organization system that depends on factors such as the size of the library, what kind of books it stores, how many visitors it receives, and so on.
Databases are very similar structures. Choosing the best type for your project and organizing your data in the best possible way will make all the difference.
Today we have two main types of databases:
- relational database.
- non-relational databases (NoSQL).
To talk about these two, I need to introduce what SQL is.
What is SQL?
From English SQL (Structured Query Language), or in good Portuguese Structured Query Language.
It is a programming language used to create a connection between data stored in a Relational Database Management System (RDBMS).
Yes, SQL is a programming language, but not a general-purpose one like Java, PHP, Python, and others. SQL is a language specifically for communicating with databases.
SQL syntax resembles English, which makes it simple to write, read, and interpret.
An example of SQL code would be like this, this code retrieves all names and emails from the Users table, for example.
SELECT username, email FROM Users ;
Many RDBMSs use SQL (and variations of SQL) to access data in tables.
An RDBMS is a program that allows you to create, update, and manage relational databases.
MySQL, for example, is an RDBMS.
Relational database
A relational database is a type of database. It stores and provides access to data that is related to each other. Hence the name ‘relational’. These relationships are built using tables, similar to this one:
Id | username | password | |
---|---|---|---|
1 | user1 | user1@example.com | !*&JGs1ˆ& |
2 | user2 | user2@example.com | ($%ˆgˆ8r*7 |
3 | user3 | user3@example.com | $%rtg&nd |
- Each column represents information referring to a piece of data, for example in the email column you will have all emails relating to all users.
- In each line you have the complete information of each user, for example, in line 2 we have all the data of user2 (username, email and password).
In a relational database, each row in the table is a record with a unique ID called a primary key.
It is from this primary key that we can create relationships between data and tables.
For example, we can have a second table with addresses. And we can generate a relationship between them through the user ID.
Id | address | country | userID |
---|---|---|---|
1 | Main Avenue, 500 | Brazil | 1 |
2 | Second Avenue, 300 | Portugal | 2 |
3 | Third Ave, 100 | Mozambique | 3 |
With this second table it is simple to recognize that the address with Id 1 belongs to the user with Id 1, the address with Id 2 belongs to the user with Id 2 and the address with Id 3 belongs to the user with Id 3.
If we wanted to retrieve user1’s address through an SQL query, we could do it like this:
SELECT * FROM Address WHERE userId = 1 ;
If you liked this structure and way of organizing data, I have excellent news for you: there are several free options that you can choose to work with. I will list the main ones below.
The main Relational Database Management Systems
MySQL
According to them, “MySQL is the most popular Open Source SQL database management system”. Supported by Oracle.
As main advantages of MySQL I can mention:
- Ease of use, unmodified SQL.
- Cheap, and offers a free community version.
- Ensures data security with exceptional data protection features.
- Large developer community.
I used MySQL in this project where I showed how to connect Python to MySQL.
Difficulties in scalability and poor performance involving tables with a lot of data can be cited as some of the weaknesses of this system.
PostgreSQL
PostgreSQL is also an open source RDBMS, but it is not controlled by any corporation.
PostgreSQL provides many of the same benefits as MySQL. It is easy to use, inexpensive, reliable, and has a large developer community.
PostgreSQL typically performs worse than other RDBMSs, such as MySQL, and although it has been used more in recent years, PostgreSQL lags behind in terms of popularity.
OracleDB
Many of the world’s leading banks run on Oracle, as Oracle offers a strong mix of robust, pre-integrated business technologies and software, including features designed specifically for banks.
Oracle offers a free version called Oracle Database Express Edition (XE).
The downside is perhaps the high cost. But you are paying for high performance, excellent customer service, and many other benefits, and the high cost may be justified.
SQL Server
Microsoft also offers RDBMS. With a free entry-level version, which by the way has exactly the same name as its competitor, Express.
It also focuses on large enterprise applications with large infrastructure.
Like the Oracle system, it is also not open source.
The downside is also the high cost and the fact that it is necessary to invest in good hardware to run the latest versions.
SQLite
SQLite is a popular open source SQL database. It can store an entire database in a single file.
One of the most significant advantages this provides is that all data can be stored locally without the need to connect your database to a server. Widely used in mobile applications (Android and iOS).
Non-relational database (NoSQL)
NoSQL databases (also known as “not only SQL”) are not made up of tables, they store data differently than relational tables.
They appeared in the late 2000s, when the cost of storage dropped dramatically. Gone are the days when you had to create a complex and difficult-to-manage data model simply for the purpose of reducing data duplication.
NoSQL databases come in a variety of flavors based on their data model. The main flavors are document, key-value, wide-column, and graph. They provide flexible schemas and scale easily with large amounts of data and high user loads.
A common misconception is that NoSQL databases or non-relational databases don’t store relational data well. NoSQL databases can store relational data – they just store it differently than relational databases do.
In fact, when compared to a SQL database, many find that modeling relationship data in a NoSQL database is easier than in a SQL database because related data does not need to be split across tables.
NoSQL data models allow related data to be nested within a single data structure.
What are the Types of Non-Relational Databases?
Over time, four main types of NoSQL databases have emerged:
- Document-oriented database
- Key value database
- Column family database.
- Graph database
Let’s examine each type.
Document-oriented database
NoSQL document databases store data in structures similar to JSON (JavaScript Object Notation) objects. Something like this:
{
"_id" : ObjectId( "5ea3f99880aeb46f07d550fe" ),
"id" : 1 ,
"password" : "pbkdf2_sha256$150000$2WAquqg5uny4$9mVb1OqdkvvRubzHay0I3EzODn7i37icn6GQVHLqtiY=" ,
"is_superuser" : true ,
"username" : "debug" ,
"first_name" : "Debug" ,
"last_name" : "Everything" ,
"email" : 'admin@debugeverything.com',
"date_joined" : ISODate( " 2020-04-25T08:49:28.513Z" )
}
Each document contains pairs of fields and values. Values can typically be a variety of types, including things like strings, numbers, booleans, arrays, or objects.
Due to their variety of field value types and powerful query languages, document databases are great for a wide variety of use cases and can be used as a general-purpose database.
Key value database
A Key-Value NoSQL database stores data as a set of Keys and Values, also known as an associative array, organized in rows.
An example of what an electronics store database might look like.

Key-value databases are great for use cases where you need to store large amounts of data but don’t need to perform complex queries to retrieve it. A common use case is storing user preferences or caches.
Column family database
In this type of NoSQL database, the columns in each row are contained within that row. Here’s an image to explain it better:

Each column family is a container of rows in a table. The key identifies the row that consists of multiple columns.
Rows do not need to have the same number of columns nor contain the same information. Columns can be added to any row at any time without having to add it to other rows.
Columnar databases are great for when you need to store large amounts of data and you can predict what your query patterns will be.
This type of database is often used to store user profile data.
Graph database
Nodes and relationships are the essential constituents of this type of database. A node represents an entity. A relationship represents how two nodes are associated.

Graph databases use nodes that contain lists of relationship records. These records represent relationships with other nodes, which eliminates the (time-consuming) search and comparison operations found in relational databases.

The relationships between nodes are predetermined, that is, they are not determined at the time of the query.
Top NoSQL Databases
MongoDB
The most famous of the non-relational databases.
MongoDB stores data in flexible, JSON-like documents, which means that fields can vary from document to document and the data structure can change over time.
Additionally, you will receive benefits like full index support, high availability across WANs and LANs along with easy replication, horizontal scalability, rich document-based queries, flexibility in data processing and aggregation along with proper training, support and consultation.
MongoDB is free and open source.
Redis
Redis, which stands for Remote Dictionary Server, is a fast, open-source, in-memory key-value storage database.
One of the main attractions of using Redis is its speed. Redis data is kept in memory, whereas other databases store data on disk or SSDs, which allows Redis to access information much faster. Redis will only store data on disk when it needs to be persisted.
Cassandra
Cassandra If you are looking for high availability and scalability without compromising on performance, then the Cassandra database is the ideal one for you.
Apache Cassandra is a free and open source, distributed software that uses the wide column standard.
Cassandra is highly scalable and high-performance, designed to handle large amounts of data across many servers.
Neo4j
Neo4j is an open-source, native NoSQL graph database that provides an ACID-compliant transactional backend for your applications.
It is referred to as native because it efficiently implements the property graph model down to the storage level. This means that data is stored exactly as you store it, and it uses pointers to navigate and traverse the graph.
Conclusion
I believe we were able to clarify the differences between SQL and NoSQL.
Most projects are still based on this first model. However, large companies are already migrating some of their databases to NoSQL systems.
It may become a new standard in the future, or it may continue to co-exist with SQL. Only time will tell.
Don’t forget to subscribe to receive all new and exclusive content!
See you next time.
Deixe um comentário