A new way to organize data and derive insights from it is emerging: the knowledge graph. For publishers, content is the most important data asset. However, journal articles (and other content) are expressed in formats ideally suited for transmittal and display, not as data for analysis.
Creating a knowledge graph to sort through this data, put it in order, and glean insights will be the topic of a Data Summit 2019 presentation by Bob Kasenchak, director of business development at Access Innovations, Inc., USA, titled “From Structured Text to Knowledge Graphs: Creating RDF Triples From Published Scholarly Data.”
The sixth annual Data Summit conference will be held in Boston, May 21-22, 2019, with pre-conference workshops on May 20. Registration is now open.
“I’m speaking about graph databases and specifically what I want to focus on is how publishers can leverage the structured information they already have to build knowledge graphs,” Kasenchak said.
In scholarly publishing, journal articles are written in XML, a structured format for delivering information, he explained.
This format is designed for transmission and display, but 900,000 articles in XML do not make a good database. They form a content repository, and a content repository is not well suited to querying.
Because the data is fielded, it can be extracted from the XML and put into RDF triples to create a knowledge graph, Kasenchak said.
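As a rough illustration of that idea, and not a description of Kasenchak's or Access Innovations' actual process, the sketch below reads one fielded XML record and emits subject-predicate-object triples. The element names, the sample record, the DOI, and the helper functions `article_to_triples` and `to_ntriples` are all hypothetical. The predicates borrow Dublin Core terms purely as an assumption, and authors and subjects are kept as plain text literals for simplicity.

```python
import xml.etree.ElementTree as ET

# Hypothetical article record; the element names are illustrative only
# and do not follow any particular publisher schema.
SAMPLE_XML = """
<article>
  <doi>10.0000/example.1</doi>
  <title>Graphs in Publishing</title>
  <author>A. Author</author>
  <author>B. Writer</author>
  <subject>Knowledge graphs</subject>
</article>
"""

DCTERMS = "http://purl.org/dc/terms/"

# Assumed mapping from each XML field to a predicate URI.
FIELD_TO_PREDICATE = {
    "title": DCTERMS + "title",
    "author": DCTERMS + "creator",
    "subject": DCTERMS + "subject",
}


def article_to_triples(xml_text):
    """Turn one fielded XML record into (subject, predicate, object) tuples."""
    root = ET.fromstring(xml_text)
    # The article's DOI (made up here) serves as the subject of every triple.
    subject = "https://doi.org/" + root.findtext("doi").strip()
    triples = []
    for child in root:
        predicate = FIELD_TO_PREDICATE.get(child.tag)
        value = (child.text or "").strip()
        if predicate is None or not value:
            continue  # skip unmapped or empty fields
        triples.append((subject, predicate, value))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples-style lines with plain string literals."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(article_to_triples(SAMPLE_XML)))
```

Once many records are converted this way and loaded into a graph store, questions that span articles (for example, every article by a given author on a given subject) become queries over the triples rather than searches through individual XML files.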
“For publishers, their largest data asset is their content but their content isn’t treated like data, it’s treated like content,” Kasenchak said. “The idea is that they can derive a great amount of value out of their content as a data asset.”
This will be Kasenchak’s first time at Data Summit, and he’s looking forward to seeing what other people in the field are doing around semantic technologies and taxonomies.
“I do a lot of work for publishers and I’m looking forward to finding out what people are doing with semantics with corporate products,” Kasenchak said.
He predicts that attendees will continue to crowd around machine learning, cognitive computing, and AI sessions.
“I’m interested in new database technologies and what’s going on with neural nets, cognitive computing, machine learning, and AI because lots of people talk about it but it’s not clear to me that it’s actually being implemented in all the places that it is,” Kasenchak said. “They have the potential to be extremely disruptive.”
Kasenchak will present his Data Summit session on Wednesday, May 22, at 3:00 pm.
For more information about Data Summit 2019, and to register, go here.
To review the Data Summit program, go to https://www.dbta.com/DataSummit/2019/Program.aspx