Newsletters




Big Data

The well-known three Vs of Big Data - Volume, Variety, and Velocity – are increasingly placing pressure on organizations that need to manage this data as well as extract value from this data deluge for Predictive Analytics and Decision-Making. Big Data technologies, services, and tools such as Hadoop, MapReduce, Hive and NoSQL/NewSQL databases and Data Integration techniques, In-Memory approaches, and Cloud technologies have emerged to help meet the challenges posed by the flood of Web, Social Media, Internet of Things (IoT) and machine-to-machine (M2M) data flowing into organizations.



Big Data Articles

The term "big data" refers to the massive amounts of data being generated on a daily basis by businesses and consumers alike - data which cannot be processed using conventional data analysis tools owing to its sheer size and, in many case, its unstructured nature. Convinced that such data hold the key to improved productivity and profitability, enterprise planners are searching for tools capable of processing big data, and information technology providers are scrambling to develop solutions to accommodate new big data market opportunities.

Posted May 23, 2012

Organizations are struggling with big data, which they define as any large-size data store that becomes unmanageable by standard technologies or methods, according to a new survey of 264 data managers and professionals who are subscribers to Database Trends and Applications. The survey was conducted by Unisphere Research, a division of Information Today, Inc., in partnership with MarkLogic in January 2012. Among the key findings uncovered by the survey is the fact that unstructured data is on the rise, and ready to engulf current data management systems. Added to that concern, say respondents, is their belief that management does not understand the challenge that is looming, and is failing to recognize the significance of unstructured data assets to the business.

Posted May 09, 2012

Zettaset has announced SHadoop, a new security initiative designed to improve security for Hadoop. The new initiative will be incorporated as a security layer into Zettaset's Hadoop Orchestrator data management platform. The SHadoop layer is intended to mitigate architectural and input validation issues that exist within the core Hadoop code, and improve upon user role audit tracking and user level security.

Posted April 12, 2012

RainStor, a provider of big data management software, is joining with IBM in the big data market. RainStor will work with IBM to deliver a solution that combines IBM's enterprise-class, Hadoop-based product, InfoSphere BigInsights, with RainStor's Big Data Analytics on Hadoop product to enable faster, more flexible analytics on multi-structured data, without the need to move data out of the Hadoop environment. According to the vendors, the new combined solution can reduce the TCO for customers by significantly reducing physical storage, and also improving the performance of querying and analyzing big data sets across the enterprise.

Posted March 27, 2012

At the Strata Conference today Calpont anounced InfiniDB 3, the latest release of its high performance analytic database. Designed from the ground up for large-scale, high-performance dimensional analytics, predictive analytics, and ad hoc business intelligence, the new release includes capabilities to capitalize on a variety of data structures and deployment variations to meet organizations' need for a flexible and scalable big data architecture.

Posted February 29, 2012

Composite Software has introduced version 6.1 of its Composite Data Virtualization Platform. The new release offers improved caching performance, expanded caching targets, data ship join for Teradata, and Hadoop MapReduce connectivity. Composite 6.1 also provides improvements to the data services development environment with an enhanced data services editor and new publishing options for Representational State Transfer (REST) and Open Data Protocol (OData) data services.

Posted February 17, 2012

RainStor, a provider of big data management software, has unveiled the RainStor Big Data Analytics on Hadoop, which the company describes as the first enterprise database running natively on Hadoop. It is intended to enable faster analytics on multi-structured data without the need to move data out of the Hadoop Distributed File System (HDFS) environment. There is architectural compatibility with the way Rainstor manages data and the way Hadoop Distributed File Systems manage CSV files, says Deirdre Mahon, vice president of marketing at Rainstor.

Posted January 25, 2012

The Oracle Big Data Appliance, an engineered system of hardware and software that was first unveiled at Oracle OpenWorld in October, is now generally available. The new system incorporates Cloudera's Distribution Including Apache Hadoop (CDH3) with Cloudera Manager 3.7, plus an open source distribution of R. The Oracle Big Data Appliance represents "two industry leaders coming together to wrap their arms around all things big data," says Cloudera COO Kirk Dunn.

Posted January 25, 2012

"Big data" and analytics have become the rage within the executive suite. The promise is immense - harness all the available information within the enterprise, regardless of data model or source, and mine it for insights that can't be seen any other way. In short, senior managers become more effective at business planning, spotting emerging trends and opportunities and anticipating crises because they have the means to see both the metaphorical trees and the forest at the same time. However, big data technologies don't come without a cost.

Posted January 11, 2012

The big data playing field grew larger with the formation of Hortonworks and HPCC Systems. Hortonworks is a new company consisting of key architects and core contributors to the Apache Hadoop technology pioneered by Yahoo. In addition, HPCC Systems, which has been launched by LexisNexis Risk Solutions, aims to offer a high performance computing cluster technology as an alternative to Hadoop.

Posted July 27, 2011

The rise of "big data" solutions - often involving the increasingly common Hadoop platform - together with the growing use of sophisticated analytics to drive business value - such as collective intelligence and predictive analytics - has led to a new category of IT professional: the data scientist.

Posted May 12, 2011

Google's first "secret sauce" for web search was the innovative PageRank link analysis algorithm which successfully identifies the most relevant pages matching a search term. Google's superior search results were a huge factor in their early success. However, Google could never have achieved their current market dominance without an ability to reliably and quickly return those results. From the beginning, Google needed to handle volumes of data that exceeded the capabilities of existing commercial technologies. Instead, Google leveraged clusters of inexpensive commodity hardware, and created their own software frameworks to sift and index the data. Over time, these techniques evolved into the MapReduce algorithm. MapReduce allows data stored on a distributed file system - such as the Google File System (GFS) - to be processed in parallel by hundreds of thousands of inexpensive computers. Using MapReduce, Google is able to process more than a petabyte (one million GB) of new web data every hour.

Posted January 11, 2010

Google introduced the MapReduce algorithm to perform massively parallel processing of very large data sets using clusters of commodity hardware. MapReduce is a core Google technology and key to maintaining Google's website indexes.

Posted September 14, 2009

Pages
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160

Sponsors