By Piet Loubser
Over the past couple of months, I’ve had the opportunity to speak to numerous Chief Data Officers (CDOs), IT leaders, and business leaders on approaches to accelerate their data initiatives. Everyone seems to agree on the need for better data and the necessity of getting it into the hands of stakeholders sooner. It is also becoming increasingly clear that traditional approaches to data preparation via IT-led data integration, master data management, and data quality are no longer sufficient.
Traditional Approaches Will Not Scale
It was telling that a CDO from a global bank attending a recent Evanta CDO Inner Circle Dinner in New York suggested that “we have to completely rearchitect” to meet the business demands for relevant and fresh data at the speed of business.
- Asking IT is Dead: Over the past 15+ years, the standard business approach to obtain data for a new report was to ask IT. This typically led to a back-and-forth exchange between the business requestor and the developer trying to match the requirements with a data extract; often, the business side would receive the desired data several weeks later. This method may have been satisfactory in a pre-internet world, but it fails miserably in today’s rapid business environment.
- Self-Service BI Failed: In recent years, IT’s response to this bottleneck was to essentially toss data over the fence to the business side, who would then put it into Tableau, Qlik, Excel, or another visualization tool. Two exceedingly unfortunate results occurred:
  - The 80% of effort that IT developers had spent on data preparation, shaping, and cleaning shifted over to business analysts, who were using tools never designed for these tasks.
  - Everything done by business analysts became siloed data artifacts that could neither be shared with other users nor governed, leading to chaos.
Data Preparation Emerges as a Clear Category
Over the past 5+ years, a number of vendors have delivered a new breed of data management software focused on data preparation. While data has always gone through some form of preparation, the difference now is that modern data preparation tools are designed specifically for business users. Empowering business users with self-service data preparation solves both problems stated above: 1) there is no longer a need to ask IT to develop refined data sets, and 2) business users now have a purpose-built tool to profile, shape, clean, and publish the data to their BI or visualization tool of choice. This eBook on the 4 styles of data preparation explains more.
Data Preparation Beyond Analytics
As the bank CDO’s comment at the Evanta event suggests, a new architectural vision is emerging. The recent Market Guide for Data Preparation Tools report from Gartner suggests that “by 2022, data preparation will become a critical capability in more than 60% of data integration, analytics/BI, data science, data engineering and data lake enablement platforms.” In the May 2019 Forrester report titled Big Data Fabric 2.0 Drives Data Democratization, Principal Analyst Noel Yuhanna writes that “businesses are reporting that integrating data from silos to support real-time insights has become a nightmare,” and that “without a big data fabric strategy, organizations will likely spend more time and effort ingesting, integrating, curating, and securing data insights.”
Data preparation platforms as part of your fabric can power more than just ad hoc analytical requests. At the Gartner Data and Analytics Symposium in Orlando, Cox Automotive, Nationwide Insurance, and AdhereHealth joined me in a session to discuss their uses of Paxata’s self-service data preparation solution. Read my event blog post here.
With Paxata, Nationwide Insurance now has more than 20 different projects or initiatives, ranging from financial reporting to creating customer masters. Cox Automotive is using it to power its data quality program. And AdhereHealth is onboarding medical and health-related data from a broad range of external sources to drive personalized medication adherence insights.
You can read more Paxata customer case studies on our website. Vanguard organizations are indeed rearchitecting to gain the ability to manage and leverage their data as an asset. By making an enterprise-grade data preparation platform part of their data fabric, forward-thinking organizations are now able to accelerate their analytics and data science initiatives, and the same data preparation can be used to accelerate application consolidation and migration initiatives.
Gartner, Market Guide for Data Preparation Tools, Ehtisham Zaidi, Sharat Menon, 17 April 2019
Forrester, Big Data Fabric 2.0 Drives Data Democratization, Noel Yuhanna, Gene Leganza, Elizabeth Hoberman, 9 May 2019