Today, data is increasingly seen as the fuel of the business, rather than its byproduct.
As a result, the old adage “garbage in, garbage out” couldn’t ring truer when it comes to maximizing the value of machine learning in the enterprise, according to Steve Zisk, senior product marketing manager of RedPoint Global, which provides a customer data platform and engagement hub.
Zisk presented a session at Data Summit 2018, titled “Garbage in, Garbage Out: Why Data Quality Is the Lifeblood of Machine Learning.”
Machine learning is worthless if it’s fueled by bad data, according to Zisk, who covered why simply collecting massive amounts of data isn’t enough to extract value from machine learning technology; what’s real and what’s hype when it comes to machine learning; and best practices for using machine learning to predict, identify patterns, and optimize processes for reaching customers effectively.
According to Zisk, the market imperative for using great data to fuel machine-learning-driven customer engagement has never been higher. Organizations that get this right will be winners, and the rest will struggle. Demand for excellence in customer engagement is driven by customers themselves.
Zisk noted that:
- 19% of potential revenue lifts are generated by data-driven personalization (Forrester)
- 5x higher retention rates are generated by personalized engagement (Forbes)
- 6-7x higher conversion rates are generated by contextually relevant messages (Experian)
- 89% of enterprises expect to compete on the basis of customer experience (Gartner)
The growth of data is being fueled by:
- Velocity - Cheap universal internet and simplified interfaces
- Variety - Ubiquitous smartphones, cheap IoT devices, and monitors everywhere
- Volume - Consumers have more than one device on at all times
- Veracity - Matching devices, people, and personas in the face of privacy and compliance
All this data is fueling artificial intelligence applications, but poor data quality can cause problems, and the old adage “garbage in, garbage out” applies in this context. According to Zisk, machine learning is required for personalization, and machine learning in turn requires accurate, robust data management. Problems occur when data has missing values, poor categories, inconsistent formatting, and erroneous values; when data is noisy, with conflicting values, misleading information, too much detail, or poor collection practices; when data is sparse, with very few actual values or too many fields/variables; and when data is inadequate, with incomplete collection, biased sampling, or over-correlation.
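Some of the data problems described above, such as missing values and inconsistent formatting, can be surfaced with a quick profile before any model is trained. The following is only a minimal sketch and was not part of Zisk's session: the function names, the list of "missing" markers, and the sample records are all assumptions made for illustration.

```python
# Hypothetical data-quality profile; names and thresholds are illustrative only.

# Assumed markers that count as "missing" once trimmed and lowercased.
MISSING_MARKERS = {"", "n/a", "null", "none"}


def is_missing(value):
    """Return True for None or for strings that match an assumed missing marker."""
    if value is None:
        return True
    return isinstance(value, str) and value.strip().lower() in MISSING_MARKERS


def profile_quality(rows):
    """Report, per field, the share of missing values and the number of
    extra spellings that collapse once case and whitespace are normalized."""
    fields = sorted({key for row in rows for key in row})
    report = {}
    for field in fields:
        values = [row.get(field) for row in rows]
        present = [v for v in values if not is_missing(v)]
        raw = {str(v) for v in present}
        normalized = {str(v).strip().lower() for v in present}
        report[field] = {
            "missing_ratio": (len(values) - len(present)) / len(values),
            "inconsistent_spellings": len(raw) - len(normalized),
        }
    return report


# Made-up sample records, not real customer data.
sample_rows = [
    {"email": "a@example.com", "state": "MA"},
    {"email": "", "state": "ma"},
    {"email": "b@example.com", "state": " MA "},
    {"email": None, "state": "NY"},
]

report = profile_quality(sample_rows)
for field, stats in report.items():
    print(field, stats)
```

In this made-up sample, half of the email values are missing and the state field carries three spellings of the same value. A profile like this only flags candidates for cleanup; deciding how to repair them still depends on the business problem.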
Personalized recommendations enable organizations to match individual customer behaviors and preferences to products and offers without unwanted assumptions about “similar customers,” and to draw on customer history and any known preferences revealed by recent purchases, responses to advertisements, and survey results. The approach is customizable for both offer and product recommendations and greatly improves response rates.
Key points to remember, said Zisk, are:
- It’s not just the amount of data, it’s what’s in the data that counts
- Think about the problem before just throwing data at it
- ML can augment your capabilities, efficiency, and performance
- Manage your expectations
- Keep updating your models
Many presentations from Data Summit 2018 have been made available for review at www.dbta.com/DataSummit/2018/Presentations.aspx.