It has been a whirlwind time for data managers and their enterprises, and the innovations reshaping data operations show no signs of slowing down. AI and advanced analytics are changing the game, of course, as are the myriad technologies now available for managing and extracting business value from the data flowing through organizations. Here is what industry experts tell BDQ they see emerging and what we can expect in the months and years to come.
AI for Advanced Analytics
Advanced analytics has been part of the enterprise data landscape for decades, but the rise of AI, particularly generative AI (GenAI), promises to accelerate that progress at a geometric rate. “GenAI enhances enterprise analytics tools by improving their usability while synthesizing information in a consumable way,” said Asa Whillock, VP and GM of AI and machine learning at Alteryx. AI also makes advanced analytics accessible to a far wider base of users. “Previously, analytics tasks required technical experts with years of experience to execute,” Whillock explained. “Though no-code and low-code interfaces have made analytics easier to use, GenAI now shifts this paradigm further by opening these tools to non-data professionals. They can now perform sophisticated data analytics tasks using natural language.”
AI also “substantially improves automation quality for analytics and can be applied across the entire data analytics lifecycle—from ETL to data preparation, analysis, and reporting,” said Whillock. “The application of GenAI will clear away the logjams that have hamstrung fruitful returns on enterprise analytics tools and deliver data-driven decision making.”
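In practice, this natural-language pattern typically means a model translates a user’s question into a query that runs against governed data. Below is a minimal Python sketch of the idea; the `llm_to_sql` helper is hypothetical, standing in for whatever GenAI service an organization actually uses, and the in-memory SQLite table is a toy substitute for a real warehouse.

```python
import sqlite3

def llm_to_sql(question: str, schema: str) -> str:
    """Stand-in for a GenAI call that translates a natural-language
    question into SQL. A real implementation would prompt an LLM with
    the question and the table schema; this canned query is purely
    for illustration."""
    # Hypothetical: an actual system would call a model API here.
    return "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"

# Toy in-memory dataset standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 120.0), ("West", 90.5), ("East", 45.0)])

question = "What are total sales by region?"
sql = llm_to_sql(question, schema="sales(region TEXT, amount REAL)")
for row in conn.execute(sql):
    print(row)  # e.g. ('East', 165.0), ('West', 90.5)
```

The design point worth noting is that the model only generates the query; execution, access control, and the data itself stay inside the database.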
The challenge with this emerging technology is governance—in particular, a need for “policies and guardrails surrounding GenAI,” Whillock continued. “Companies that seek to employ GenAI to improve their analytics must abide by responsible AI guidelines to guarantee its proper use.”
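What such guardrails look like in code varies widely, but a common first layer is scrubbing obvious PII before data ever reaches an external GenAI service. The sketch below is illustrative only, not a description of Alteryx’s tooling; production guardrails pair pattern matching with far more robust detection such as named-entity recognition and allow/deny lists.

```python
import re

# Illustrative patterns only; real guardrails use broader detection.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Mask anything matching a PII pattern before the text is sent
    to an external GenAI service."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

prompt = "Summarize: contact jane.doe@example.com, SSN 123-45-6789."
print(redact(prompt))
# Summarize: contact [EMAIL REDACTED], SSN [SSN REDACTED].
```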
Data executives and professionals also need to consider a lack of readiness around GenAI. “Organizations struggle to prepare their employees with adequate training,” Whillock said. Alteryx’s own data shows that 29% of businesses report that a lack of skilled talent is holding them back from scaling GenAI across the organization. Meanwhile, 19% of organizations using GenAI “don’t offer any mandatory AI training,” he stated.
Blockchain for AI Governance and Transparency
Whether blockchain is a technology that has fallen off the hype cycle or a long-term information foundation remains a matter of speculation. One industry observer, however, sees its potential to provide much-needed traceability and transparency for AI algorithms and systems. “Blockchain is the most effective tool to ensure data sovereignty and govern its use,” said Henry Guo, VP of AI product management at Casper Labs.
Blockchain enables “a time-stamped and tamper-proof ledger—ultimately a certifiable audit trail—for data,” Guo explained. “This is a critical capability for AI governance, as blockchain can help set and enforce parameters around what data is or isn’t used to power AI applications. With virtually all enterprises looking to adopt AI and funnel their data into these applications, blockchain will be critical in shaping how data is governed far into the future.”
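Guo’s core claim, that each entry can be time-stamped and made tamper-evident by chaining hashes, is easy to illustrate. The Python sketch below is a toy hash chain, not a distributed blockchain; it shows only why altering any past record is detectable.

```python
import hashlib
import json
import time

def append_entry(chain: list, record: dict) -> None:
    """Append a time-stamped entry whose hash covers the previous
    entry's hash, so any later tampering breaks the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"ts": time.time(), "record": record, "prev": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})

def verify(chain: list) -> bool:
    """Recompute every hash; a single altered record is detected."""
    for i, entry in enumerate(chain):
        body = {k: entry[k] for k in ("ts", "record", "prev")}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if digest != entry["hash"]:
            return False
        if i > 0 and entry["prev"] != chain[i - 1]["hash"]:
            return False
    return True

ledger = []
append_entry(ledger, {"dataset": "customers_v3", "used_for": "churn_model"})
append_entry(ledger, {"dataset": "sales_2024", "used_for": "forecast"})
print(verify(ledger))                       # True
ledger[0]["record"]["dataset"] = "tampered"
print(verify(ledger))                       # False
```

A real blockchain adds distribution and consensus on top of this; the sketch isolates only the tamper-evidence property that makes the audit trail certifiable.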
Blockchain is uniquely suited “to meet strict data governance demands,” Guo added. “It can be tailored to the demands of specific organizations, teams, and use cases; automate the enforcement of internal and external data standards; and provide trusted data provenance security across the end-to-end AI stack. It also provides failover mechanisms for enhanced control, allowing enterprises to detect issues in application performance and security and seamlessly revert to past versions to mitigate damage.”
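The enforcement and failover ideas Guo mentions can be sketched as well. The names and the data standard below are hypothetical: a gate rejects records that violate the standard before they enter the audit trail, and a revert helper falls back to an earlier state. In a real system this logic would live in smart contracts layered on the tamper-evident chain shown above, not in application code.

```python
# Hypothetical data standard: records must name an approved dataset
# and an owner before they may enter the audit trail.
APPROVED_DATASETS = {"customers_v3", "sales_2024"}

def enforce_and_append(trail: list, record: dict) -> bool:
    """Reject non-compliant records, mimicking automated,
    smart-contract-style enforcement of a data standard."""
    if record.get("dataset") not in APPROVED_DATASETS or "owner" not in record:
        return False
    trail.append(record)
    return True

def revert_to(trail: list, n: int) -> list:
    """Failover: fall back to the first n entries of the trail."""
    return trail[:n]

trail: list = []
print(enforce_and_append(trail, {"dataset": "customers_v3", "owner": "ml-team"}))  # True
print(enforce_and_append(trail, {"dataset": "shadow_dump", "owner": "x"}))         # False
print(revert_to(trail, 0))                                                         # []
```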
The challenge is that blockchain technology, and its adoption by enterprises, has advanced in fits and starts. “It’s yet to shed its long-held ties to Bitcoin and cryptocurrency,” said Guo. “Deep-seated misunderstandings on what blockchain is and what use cases it can serve remain a persistent challenge. Plus, there’s a lack of blockchain-specific expertise.”
Synthetic Data for Security and Privacy
The rise of GenAI and large language models (LLMs) has heightened security and privacy concerns about exposing private or sensitive data. An emerging solution that may alleviate these concerns is training AI models on synthetic or anonymized data instead.
This use of synthetic data “enables secure, collaborative innovation by mitigating privacy risks and enhancing AI applications,” according to Amie Richards, senior manager for technology and experience at West Monroe, writing in a recent analysis. “Synthetic data is data that is not the result of manual collection, measurement, or observation but instead is manufactured by systems, simulations, or models using the statistical properties of the real thing.”
With the rise of GenAI and advanced LLMs, “companies are increasingly looking to synthetic data as a way to train AI on similar information without compromising individuals’ privacy,” Richards stated.
With synthetic data, “choosing between the data you have and the data you need could be a thing of the past,” Richards added. Synthetic data is becoming attractive due to “regulations related to data privacy, as well as the cost of data breaches, the need to protect privacy and security related to training new AI and ML [machine learning] models and increasing expectations about data sharing. Data doesn’t need to be ‘real’ to train models, create simulations, or revise projections.”
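Richards’ definition, data manufactured from the statistical properties of the real thing, can be made concrete with a deliberately naive sketch: fit simple per-column statistics on private records, then sample new rows from those statistics alone. All names and numbers below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# "Real" data we cannot share: ages and plan tiers of customers.
real_ages = np.array([34, 41, 29, 52, 38, 45, 31, 47])
real_plans = np.array(["basic", "pro", "basic", "enterprise",
                       "pro", "pro", "basic", "enterprise"])

# Fit simple statistical properties of each column.
mu, sigma = real_ages.mean(), real_ages.std()
plans, counts = np.unique(real_plans, return_counts=True)
plan_probs = counts / counts.sum()

# Manufacture synthetic rows from those properties alone; no real
# individual's record is reproduced.
n = 5
synthetic_ages = rng.normal(mu, sigma, size=n).round().astype(int)
synthetic_plans = rng.choice(plans, size=n, p=plan_probs)
for age, plan in zip(synthetic_ages, synthetic_plans):
    print(age, plan)
```

Note that this per-column approach discards correlations between fields; practical synthetic data generators model the joint distribution so that relationships in the real data survive.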
At the same time, synthetic data ultimately can’t take the place of the real data needed to build an AI-driven organization. “Don’t use synthetic data to get around solving real data problems,” said Richards. “To fully leverage AI, it’s essential to establish a solid data foundation. If real data is poorly structured or insufficient, synthetic data can be used to predict trends and identify outliers without introducing biases—but it’s not a replacement for solving real data challenges.”