The following is the fourth and final in a series of four blog posts on the evolving roles, skills and functions played by business intelligence and data professionals. The first three blog posts on the changing roles of BI Professionals are listed below:
- From analysts to scientists
- Why managers think beyond BI skills when hiring analytics talent
- The perfect candidate
We’ve seen in the first three posts on the changing role of BI professionals the growing value organizations now place on data and data professionals. Along with this, we’ve illustrated the importance (and, in most cases, the necessity) of building a team — either internally, or outsourced — instead of relying on one or two individuals.
But there’s little value in engaging a team of data professionals if your organization doesn’t have the data infrastructure to back them up.
Aside from being a hindrance to attracting the best talent, the lack of modern data infrastructure can handcuff a data team by not allowing them access to the latest visualization tools, APIs, data types, deep analysis and other benefits (while also forcing them to deal with performance slowdowns, thanks to the growing amounts of data).
Enter the modern data platform. It extends a traditional data warehouse into a system that includes a data lake, built-in ETL (extract, transform, load) and support for advanced analytics and machine learning. It provides the flexibility needed for advanced data modelling, while also giving everyday users self-serve data analysis capability.
After all, according to the SAS survey mentioned in our third blog post, most data professionals regularly exhibit “high levels of stress.” Fifty-five per cent indicate they feel very stressed. And if you’re lucky enough to hire a top data professional, asking them to use an old and very limited system won’t help in this regard.
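To make the ETL idea mentioned above a little more concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular platform implements ETL: the sample CSV, the `orders` table and the function names are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a source system.
RAW_CSV = """order_id,region,amount
1,east,100.50
2,west,
3,east,49.50
"""

def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with no amount and convert column types."""
    return [
        (int(r["order_id"]), r["region"], float(r["amount"]))
        for r in rows
        if r["amount"]
    ]

def load(conn, records):
    """Load: write the cleaned rows into a (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

def run_etl(text):
    """Run the three steps against an in-memory SQLite stand-in."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn

conn = run_etl(RAW_CSV)
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

Once the data is loaded, an everyday user only needs the final query to get totals by region. That is the self-serve side of the platform; the extract, transform and load steps happen behind the scenes.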
Why a modern data platform?
Two main trends are currently driving organizations towards data infrastructure modernization:
1) the evolution of technology and data, including more data, unstructured data and more data types
2) the increasing demand from regular users for self-serve data access.
Traditional data warehouses weren’t built to service these demands and simply can’t keep up. More data and more data types, more users doing analytics, the proliferation of self-serve analytics tools, Agile development methods (requiring the use of APIs), the need to support data science activities and poor performance are all major reasons why many companies are retiring traditional data warehouses in favour of modern platforms.
For organizations looking to modernize, there are two major options: Hadoop and the cloud.
Both bring agility, scalability and affordability, along with the on-demand ability to access semi-structured and unstructured data (from internal or external sources).
The modern data platform: Hadoop
Hadoop can handle semi-structured or unstructured data, and is ideal for the batch processing of big data. Creating a modern data platform on Hadoop can save your organization money on hardware while reducing the load on your relational database system.
But there are several significant drawbacks to using Hadoop. You must build your system from the ground up, so engineering costs are typically high. DataFlair has detailed other limitations: Hadoop is not well suited to small datasets, has relatively slow processing speeds and is relatively difficult to use.
If the scale of your project justifies the engineering cost, however, Hadoop can be a good option for your data infrastructure modernization.
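For readers curious about what batch processing on Hadoop looks like, its classic programming model is MapReduce: a map step emits key-value pairs, the framework groups them by key, and a reduce step aggregates each group. The sketch below imitates that flow in plain, single-process Python. It does not use any Hadoop API, and the log lines and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical log lines standing in for files on a distributed file system.
LOG_LINES = [
    "2019-01-01 GET /home",
    "2019-01-01 GET /pricing",
    "2019-01-02 GET /home",
]

def map_phase(line):
    """Map: emit a (page, 1) pair for the page requested in one log line."""
    page = line.split()[-1]
    yield page, 1

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)

def run_job(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

page_hits = run_job(LOG_LINES)
print(page_hits)
```

On a real cluster the map and reduce steps run in parallel across many machines. That is where the scale comes from, and also why the engineering effort is higher than this toy version suggests.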
The modern data platform: Cloud
Cloud data warehouses from providers such as Amazon Web Services, Google Cloud Platform and Microsoft Azure have become steadily more popular (in terms of both IaaS and DBaaS) since Amazon’s offering debuted in 2012.
These services remove much of the administration and engineering burden, while offering almost unlimited opportunities to scale. Cloud-based technologies that separate storage from compute are generally reported to be cheaper to run and faster to scale than the alternatives.
No matter which route you take, a modernized, cloud-based data platform is key to enabling your BI professionals to get the most out of your data. A modern platform ingests multiple data sources (both external and internal), supports data storage and a data lake, and performs batch and real-time processing, all while supporting a wide range of users throughout the organization.