Experts Discuss Expanding Your Data Science and Machine Learning Capabilities

Surviving and thriving with data science and machine learning means not only having the right platforms, tools and skills, but identifying use cases and implementing processes that can deliver repeatable, scalable business value.

The challenges are numerous, from selecting data sets and data platforms, to architecting and optimizing data pipelines, to training and deploying models.

As a result, new solutions have emerged to deliver key capabilities in areas including visualization, self-service and real-time analytics. Along with the rise of DataOps, greater collaboration and automation have been identified as key success factors.

DBTA held a webinar featuring Gaurav Deshpande, VP of marketing, TigerGraph; Paige Roberts, open source relations manager, Vertica; and Ugo Pollio, director of presales and customer success, Delphix, who discussed new technologies and strategies for expanding data science and machine learning capabilities.

According to Gartner, “By 2025, graph technologies will be used in 80% of data and analytics innovations, up from 10% in 2021, facilitating rapid decision making across the enterprise.” TigerGraph provides advanced analytics and machine learning on connected data, Deshpande said.

With TigerGraph, users can connect all datasets and pipelines, analyze connected data, and learn from connected data. Graph combined with AI produces smarter, richer data; supports deeper, smarter questions; offers more computational options; and provides explainable results, Deshpande explained.
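The idea of "asking deeper questions of connected data" can be illustrated with a small, generic graph traversal. This is a plain-Python sketch, not TigerGraph's GSQL; the fraud-style vertex names are hypothetical:

```python
from collections import deque

def neighborhood(edges, start, hops):
    """Return all vertices within `hops` edges of `start` in an undirected graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand past the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen - {start}

# Hypothetical fraud-detection data: customers linked through shared accounts
edges = [
    ("cust_a", "acct_1"), ("cust_b", "acct_1"),  # two customers, one account
    ("acct_1", "dev_x"), ("cust_c", "acct_2"),
]
print(sorted(neighborhood(edges, "cust_a", 2)))  # → ['acct_1', 'cust_b', 'dev_x']
```

A two-hop query like this surfaces relationships (a second customer on the same account) that a row-at-a-time lookup would miss, which is the kind of question graph platforms are built to answer at scale.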

Transforming data into business assets is not just an option; competing on data is the new style of business, Roberts noted.

Vertica offers unique value in the machine learning space, she said: proven infrastructure with no additional hardware or software to manage, open integration with other solutions, and fewer people needed to set it up and maintain it.

Vertica machine learning is fast and scalable, leveraging MPP infrastructure and a scale-out architecture for high-performance parallel processing. Users can manage resources across users and jobs. And the solution uses a familiar SQL interface: users can prep data and manage, train, and deploy machine learning models using simple SQL calls, or work through familiar data science tools such as Python and Jupyter. Machine learning functions can also be integrated with visualization tools via SQL.
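As a rough sketch of what "ML through simple SQL calls" looks like, the helpers below assemble training and scoring statements around Vertica's documented LINEAR_REG and PREDICT_LINEAR_REG functions. The model, table, and column names are hypothetical, and in practice the statements would be executed over a database connection (e.g., via the vertica-python client) rather than printed:

```python
def train_sql(model, table, response, predictors):
    """Build a Vertica in-database linear-regression training statement.

    LINEAR_REG trains the model inside the database; no data leaves the cluster.
    """
    return (f"SELECT LINEAR_REG('{model}', '{table}', "
            f"'{response}', '{predictors}');")

def predict_sql(model, table, predictors):
    """Build a scoring statement using the trained model."""
    return (f"SELECT PREDICT_LINEAR_REG({predictors} "
            f"USING PARAMETERS model_name='{model}') FROM {table};")

# Hypothetical churn model over customer tenure and spend
print(train_sql("churn_model", "customers", "churned", "tenure, spend"))
print(predict_sql("churn_model", "new_customers", "tenure, spend"))
```

The point of the in-database approach is that training and prediction run where the data already lives, so the same SQL interface serves data prep, modeling, and the visualization tools downstream.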

Digital transformation demands app data plus environments, Pollio said. An API-first programmable infrastructure provides:

  • All apps, all clouds: efficient sync from on-prem and across multi-cloud
  • Comprehensive data operations programmable via APIs
  • Data compliance and security with referential integrity
  • Real data for production-like environments
  • Data immutability via time machine for reliability, resiliency, safety
  • Version control to manage data with code
  • Space efficiency via data virtualization for speed, cost savings

Delphix lives in the center of data science, Pollio noted. Delphix provides an automated DevOps data platform. The platform masks data for privacy compliance, secures data from ransomware, and delivers efficient, virtualized data for CI/CD and digital transformation.

Delphix provides essential data APIs for DevOps, including data provisioning, refresh, rewind, integration, and version control. Delphix supports all apps from mainframe to cloud native across the multi-cloud, including SaaS, private, and public clouds.
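Operations like refresh and rewind being "programmable via APIs" might look like the sketch below. The endpoint paths, payloads, and `session` interface are hypothetical illustrations of an API-first data platform, not Delphix's actual API:

```python
# Hypothetical REST-style wrappers for DevOps data operations.
# `session` is any object exposing a post(path, json=...) method,
# e.g. an authenticated HTTP client.

def refresh_dataset(session, dataset_id):
    """Refresh a virtual dataset from its source (hypothetical endpoint)."""
    return session.post(f"/datasets/{dataset_id}/refresh", json={})

def rewind_dataset(session, dataset_id, snapshot_id):
    """Rewind a dataset to an earlier snapshot, the 'time machine'
    pattern described above (hypothetical endpoint)."""
    return session.post(f"/datasets/{dataset_id}/rewind",
                        json={"snapshot": snapshot_id})
```

Wrapping data operations in callable endpoints like these is what lets a CI/CD pipeline provision, refresh, or roll back test data as code, the same way it manages application builds.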

An archived on-demand replay of this webinar is available here.