When I graduated from college in the late 1990s, it was just in time to enjoy the Y2K crisis. If you recall those fun times, then you are old enough to appreciate this blog. I graduated with a Management Information Systems (MIS) degree, which is a cross between Computer Science (CS) and Business Management, and though I was stronger in CS than in Business Management, I survived. One class straddling both disciplines that I did fairly well in was Database Theory, which taught the basics of relational database management systems (RDBMSs). We learned everything from appropriate table structures, primary keys, and foreign keys to basic modeling methods. It is also where we first heard the term SQL, or “sequel”.
SQL stands for Structured Query Language and is backed by a collection of standards, although each database vendor appears to implement them slightly differently. Even though SQL always differs a little depending on whether you are using MySQL, Oracle, DB2, or some other tool, if you are good at writing SQL and understand the database model, you can adapt quickly and get whatever data you require.
To SQL or Not to SQL?
The database paradigm has changed. There are now document databases, columnar databases, NoSQL, graph databases, Hadoop, Spark, and many different massively parallel processing (MPP) platforms appearing all the time. They all offer great benefits for use cases that just don’t work well with traditional RDBMSs.
Big data platforms offer a way to process large volumes of diverse data faster than we could have imagined in 1999, when most IT people had to understand SQL to meet business needs. Today, you need to know many more platforms and environments to take advantage of all the capabilities and benefits that big data vendors are promising.
Can those of us who have relied on SQL-compliant systems survive, or do we have to learn Scala, Python, R, Java, or whatever the next cool language and platform requires?
There are tools like Hive and Impala that let you use your SQL skills to find and access data on Hadoop platforms and data lakes, but they all come with limitations. You can apply only so many functions to the data: the distinct functions that SQL has always supported. Of course, you can write user-defined functions, but then you quickly get into programming.
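To make the point about user-defined functions concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module as a stand-in for a SQL engine, not Hive or Impala themselves, and the table, the sample rows, and the domain_of function are all hypothetical names invented for illustration. The idea is only to show how quickly "just SQL" turns into writing code once a built-in function is not enough.

```python
import sqlite3


def domain_of(email):
    """Hypothetical UDF: return the lowercased domain of an email address."""
    if email is None or "@" not in email:
        return None
    return email.rsplit("@", 1)[1].lower()


# In-memory database with a small, made-up customers table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("ann@Example.com",), ("bob@example.com",), ("cy@other.org",)],
)

# Register the Python function so it can be called from SQL by name.
conn.create_function("domain_of", 1, domain_of)

# The query itself is still plain SQL, but the logic lives in Python.
rows = conn.execute(
    "SELECT domain_of(email) AS domain, COUNT(*) AS n "
    "FROM customers GROUP BY domain_of(email) ORDER BY domain"
).fetchall()
print(rows)
```

The SQL stays familiar, but the interesting part of the work has moved into a general-purpose language, which is exactly the slope described above.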
Is Python in Your Future?
Where SQL systems fall short is when you begin applying the new machine learning processes to your data, or when you want to take advantage of large volumes of streaming data and query that data in-line. Yes, for those of us who like RDBMSs, data is not always at rest.
Times have changed and so must our skills. I have started to learn Python, as it is a simpler language to use than Java and many machine learning methods are supported in Python. In the coming 5 to 10 years, every information worker or IT support person will have to understand how to use machine learning, or at least how to support it. You will need to support MPP systems like Hadoop and Spark at some stage for data processing. Machine learning will be key to supporting data-driven decision making and to getting the insights needed to win your market.
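For a taste of what that looks like, here is a small sketch of about the simplest predictive model there is: a straight-line fit by ordinary least squares, written in plain Python with no libraries. The monthly sales figures are made up, and fit_line is a hypothetical helper name; real projects would more likely reach for an established machine learning library, so treat this only as an illustration of the idea.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample data: sales for months 1 through 5.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)

# Use the fitted line to forecast month 6.
forecast = slope * 6 + intercept
print(forecast)
```

A GROUP BY can tell you what sales were last month; even a model this small is answering a different kind of question, namely what they are likely to be next month.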
SQL Is Dead, Long Live SQL
There is still a strong need for SQL, as I see approaches such as data vaults evolve and become widely popular in the NoSQL and HDFS/storage spaces. There will always be structured systems, such as ERP and CRM systems, that require structured data warehouses. But when your manager asks you to forecast the future and its business implications, or to find highly automated optimization solutions that get smarter over time, you might want to stop looking in the same old places. You might have to start looking at all the data available in its most natural forms and find solutions that are more predictive, prescriptive, and even intelligent. So, while SQL and RDBMSs, like the mainframe, will exist for many, many years, the tide is shifting towards tools for real-time analytics, where SQL currently falls short.