Information is knowledge, and knowledge is the key to success.
Relying on memory alone is not a very good idea. People realized this some 30,000 years ago, and since then we have witnessed the evolution of the storage medium: from stone to papyrus, to paper, and then to bits and bytes.
Groundwork: magnetic tapes and flat files
Historically databases have gone through an amazing evolution.
The groundwork was laid with magnetic tape, which was invented in 1928 and first used for computer data storage in 1951, and later with the magnetic disk drive that IBM introduced in 1956.
In the 1960s there were predecessors of the modern database, in which data was maintained in flat files.
In 1960, SABRE (Semi-Automated Business Research Environment) was developed by IBM and American Airlines to automate reservation bookings.
As you can imagine, there was no standardization at that time. Database users had to know how the information was stored in the system if they wanted to use the data, which meant that a company had to hire developers who were familiar with a specific product and knew how to store and retrieve data in that system.
Edgar Codd and the relational model
So while the digitization of data storage was an important milestone, it still wasn’t a breakthrough.
The real game-changing event happened in 1970, when the IBM researcher Edgar F. Codd proposed the relational model of data.
That eventually reduced the knowledge required to access data, because users could describe the data they wanted in terms of tables rather than in terms of how it was physically stored.
Those who saw a real opportunity in this became very successful. Oracle, for example, then known as Relational Software, Inc., released the first commercial RDBMS in 1979. It was a system that implemented some basic SQL queries and simple joins.
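To give a feel for what "basic SQL queries and simple joins" look like in practice, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for a relational engine, and the customers and bookings tables, their columns, and the sample rows are hypothetical, chosen just for illustration.

```python
import sqlite3


def run_join_demo():
    """Create two hypothetical tables in memory and join them."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.executescript(
        """
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE bookings (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            flight      TEXT NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO bookings VALUES
            (10, 1, 'AA100'),
            (11, 1, 'AA200'),
            (12, 2, 'AA300');
        """
    )
    # The query states WHAT we want (each customer with their flights),
    # not HOW the rows are laid out on disk.
    rows = conn.execute(
        """
        SELECT c.name, b.flight
        FROM customers AS c
        JOIN bookings AS b ON b.customer_id = c.id
        ORDER BY c.name, b.flight
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, flight in run_join_demo():
        print(name, flight)
```

The point of the sketch is that the caller never touches files or record offsets; the join is expressed declaratively and the engine decides how to execute it.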
SQL and Peter Pin-Shan Chen’s ER
IBM began developing System R in the mid-1970s, but it wasn’t commercially available right away. System R was the first implementation of SQL, and it marked the beginning of standardization.
To let data designers work at a more abstract level, Peter Pin-Shan Chen proposed a new data model in 1976 called Entity-Relationship, or ER.
A preliminary SQL standard was published in 1984, and with the emergence of personal computers came desktop DBMSs such as Paradox, dBASE, and Access, which attempted to abstract away the complexities of database querying with the introduction of Query By Example (QBE).
Internet
From then on, the evolution of the database and the advent of the Internet led to exponential growth of the data industry.
This opened up a lot of opportunities for data-driven businesses, governments, laboratories, and corporations. Not only could they store the data, they could also organize it, manage it, and retrieve it so fast that decision making became more efficient. The standardization of the data access interface, SQL, was the key to this evolution, and nowadays you can hardly find a business analyst who doesn’t know SQL.
Big Data and ML
As the amount of stored data grew and Big Data was born, so did the need for scaling, and this made NoSQL databases very popular.
With them it became easy to handle huge amounts of data and to scale out at low cost.
Relational databases focus on consistency and have traditionally scaled only vertically, meaning you have to buy more memory and faster CPUs, and eventually you hit the limits of physics.
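The alternative to buying a bigger machine is to spread the data over many small ones, which is the approach many NoSQL systems take. The following is a rough sketch of that idea, assuming simple hash-based partitioning; the ShardedStore class and shard_for function are hypothetical names, the "shards" are plain in-memory dictionaries standing in for separate servers, and real systems add replication and rebalancing on top.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store; each dict stands in for one cheap server."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value):
        # Only the shard that owns the key is touched.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=4)
    for i in range(1000):
        store.put(f"user:{i}", {"visits": i})
    # Show how many keys each shard ended up holding.
    print([len(shard) for shard in store.shards])
    print(store.get("user:42"))
```

Each read or write touches a single shard, so capacity grows by adding shards rather than by upgrading one machine. The price is that operations spanning several shards, such as joins or multi-key transactions, become harder, which is part of why relational systems leaned toward vertical scaling.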
With the ability to store large amounts of data, terabytes and even petabytes of it, new possibilities arose. Technologies that had been put on hold, like machine learning, data mining, and knowledge discovery, came roaring back and started to flourish. Some of these concepts were developed in the seventies and earlier, but had been shelved because no database could handle the amount of information they needed to succeed.
Other types of databases
Let’s note that there are other types of databases, such as graph databases, which store data and its interconnections as graph structures made of nodes, edges, and properties. These databases allow the user to traverse the graph elegantly, but for a long time they lacked a standardized query language comparable to SQL. So you can see that the biggest reason relational databases became such a successful product was the invention and implementation of a structured query language.
Of course, RDBMSs are not the perfect solution for everything, but the existence of SQL makes them a very attractive solution for many problems.
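To make the idea of nodes, edges, and properties concrete, here is a small sketch of a graph traversal in plain Python. It does not use any real graph database; the NODES and EDGES dictionaries, the "knows" edge type, and the traverse function are all hypothetical and only illustrate the data model.

```python
from collections import deque

# Nodes carry properties; edges carry a target node and their own properties.
NODES = {
    "alice": {"age": 34},
    "bob": {"age": 41},
    "carol": {"age": 29},
    "dave": {"age": 52},
}
EDGES = {
    "alice": [("bob", {"type": "knows"}), ("carol", {"type": "knows"})],
    "bob": [("dave", {"type": "knows"})],
    "carol": [("dave", {"type": "manages"})],
    "dave": [],
}


def traverse(start, edge_type=None):
    """Breadth-first traversal from start, optionally following one edge type."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for target, props in EDGES.get(current, []):
            if edge_type is not None and props["type"] != edge_type:
                continue  # skip edges of other types
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


if __name__ == "__main__":
    print(traverse("alice"))
    print(traverse("carol", edge_type="knows"))
```

Here the traversal logic lives in application code. A declarative query language lets the user state the question and leaves the traversal to the engine.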