IBM has launched a data preparation solution designed to help clients improve their DataOps processes so they can have data ready for AI more quickly and efficiently.
Data preparation is an integral step in building machine learning and predictive models, but it is also cumbersome and time-consuming. Citing an often-used statistic that data scientists commonly spend 80% of their time on data prep, IBM says the new solution is designed to help clients transform raw datasets by formatting, structuring, and enriching them for analytic processing and standard reporting.
Jointly developed with data prep software provider Trifacta, the new InfoSphere solution is engineered to work in conjunction with clients’ existing data environments, including data lakes.
Organizations are looking to leverage data for strategic decision making, but analytics, machine learning, and AI initiatives are often slowed by poor data quality, inefficient data preparation processes, and a lack of governance, said Adam Wilson, CEO of Trifacta. The new IBM collaboration will allow organizations to accelerate data preparation for self-service analytics in a governed and centrally managed environment, he noted.
IBM says the new InfoSphere solution includes a dashboard for visualizing the data prep process, including tracking of data quality and lineage (where the data originated and where it has been). Clients can then move the cleaned datasets into the business analytics tool of their choice.
The solution resides on top of a client’s data lake or data warehouse and provides automated transformation capabilities. Through a self-service user interface, business users as well as data scientists can access, explore, prepare, and enrich datasets for analytics. In addition to data prep, the tool is designed to allow users with varying levels of technical expertise to generate business-ready insights.
For more information, go to www.ibm.com/analytics.