Unlocking the Potential of Big Data in a Data Warehouse Environment

<< back Page 2 of 3 next >>

5. Search engines. Search engines have been around for a long time, and they can operate on unstructured data as well as structured data. The problem is that a search engine still needs data to have context before a search can produce sophisticated results. Search engines can produce limited results on unstructured data, but sophisticated queries remain out of reach. The missing ingredient is the context of the data, which is not present in unstructured data.

So data warehousing has arrived at the point where it is possible to include big data in its realm. But in order to include big data, it is necessary to overcome a very basic problem: the data found in big data is void of context, and without context, it is very difficult to do meaningful analysis on the data.

While it is possible that data warehousing will be extended to include big data, unless the basic problem of achieving or creating context in an unstructured environment is solved, there will always be a gap between big data and the potential value of big data.

Deriving context, then, is the major forthcoming issue for data warehousing and big data. Without the ability to derive context for unstructured data, big data has limited uses. So exactly how can the context of text be derived, especially when that context cannot be derived from the text itself?

Two Ways to Derive Context for Unstructured Data

In fact, there are two ways to derive context for unstructured data. Those ways are “general context” and “specific context.” General context can be derived by merely declaring a document to be of a particular variety. A document may be about fishing. A document may be about legislation. A document may be about healthcare, and so forth. Once the general context of the document is declared, then the interpretation of text can be made in accordance with the general category.
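As a rough illustration of general context, the sketch below declares a document's category and then interprets ambiguous terms according to that category. The category vocabularies, the term meanings, and the function names `interpret` and `annotate` are all hypothetical, invented here for illustration; they are not taken from the article or from any particular product.

```python
# A minimal sketch of "general context": the document's category is declared
# up front, and ambiguous terms are interpreted according to that category.
# The vocabularies below are hypothetical and purely illustrative.
CATEGORY_VOCABULARIES = {
    "fishing": {
        "bass": "a freshwater fish",
        "line": "fishing line",
    },
    "legislation": {
        "bill": "a proposed law",
        "line": "a line item in a statute",
    },
    "healthcare": {
        "bill": "an invoice for medical services",
        "line": "an intravenous line",
    },
}


def interpret(term, general_context):
    """Return the meaning of a term under the declared category, or None."""
    vocabulary = CATEGORY_VOCABULARIES.get(general_context, {})
    return vocabulary.get(term.lower())


def annotate(text, general_context):
    """Pair each recognized word in the text with its category-specific meaning."""
    annotations = []
    for word in text.split():
        token = word.strip(".,;:!?\"'").lower()
        meaning = interpret(token, general_context)
        if meaning is not None:
            annotations.append((token, meaning))
    return annotations


if __name__ == "__main__":
    sentence = "The bill was on the line."
    for category in ("legislation", "healthcare"):
        print(category, annotate(sentence, category))
```

The same sentence yields different annotations depending on the declared category, which is the point of general context: the text alone does not say which meaning of "bill" applies, but the declared category does.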

