MarkLogic’s Gary Bloom on What to Do Before You Start an AI Project

Oct 25, 2017

By Joyce Wells

The way MarkLogic CEO Gary Bloom sees it, interest in artificial intelligence is soaring: Everyone wants to talk about it and everyone wants to apply intelligence for better insights and better decisions. But there is just one problem.

“In our industry as technologists, the market seems to go through one hype cycle after another and what is interesting is that AI is an example of that after the big move toward Hadoop,” said Bloom, whose company provides a NoSQL database. According to Bloom, “Hadoop was great for analytics. It allowed you to consolidate data to look at things. But you couldn’t really run your business off it. And so as we leave that hype cycle, AI has kind of taken off. But AI as a technology was in the market close to 30 years ago. It is not really new. What is new is the level of promotion it is getting.”

Garbage in, garbage out

Despite the marketing campaigns and high enthusiasm for AI’s potential, Bloom says people will become disillusioned by the challenges that inevitably arise when they skip the prerequisites. Chief among these is unifying their data into a comprehensive view, which requires onboarding the data, integrating silos, establishing data lineage and governance, and keeping terms consistent.

“Let’s take a life sciences clinical trial. If I can’t trace my AI answer back to the data that contributed to that answer, I probably can’t use my findings,” Bloom says. “The history of the data, where it came from, who had access to it, who modified it—all those things are critical. And that is what we have been focused on here at MarkLogic: giving customers the ability to integrate data from silos—not just to look at their data and analyze it but to actually run their business on their data. I can assure you that AI is something they want to do in the future. I am not criticizing AI as a trend but you have to do the prep work to consolidate the data and get an integrated view, have lineage, and have governance, and then load the data into your AI engine to support what you are doing.”

Similarities with compliance

The same is true for compliance with regulatory mandates such as GDPR, MiFID II, the Dodd-Frank Act, and HIPAA. What all of them have in common, Bloom says, is that they require companies to have a comprehensive view of their data with unified governance, good controls, and knowledge of the history of the data. “It is almost becoming a standard way to meet these regulatory compliance issues.” The good news, says Bloom, is that the work a company has done to consolidate its data and create that common view for regulatory compliance can also be used to run the business and to feed the AI engine.

“AI as an engine that adds smarts to data and analyzes data and self-teaches is definitely possible given the processor capacity that is available today,” acknowledges Bloom, adding that if companies have prepared data for regulatory compliance, they have also prepared it for running the business and for machine learning and AI.

If you can’t get the data in, you can’t be intelligent

But with the massive marketing effort behind AI, it is necessary to act with caution. Like Hadoop and other newer technologies, says Bloom, AI may offer advantages, but without security controls, visibility controls, data governance, and lineage, the sought-after benefits can be difficult to achieve.

“I think we are seeing the exact same thing in the AI world where a rush of people and investment is going into AI,” says Bloom, noting that, on the back side of that wave of optimism, there can be bitter disillusionment when the resources devoted to a project fail to deliver results. “We are in the early innings of the game and we are going to see a lot of people strike out that get too aggressive too quickly—and in particular, those that think it is a cure for all data issues. The AI engine is only as smart as the data you put into it.”