Hadoop

The Apache Hadoop framework for the processing of data on commodity hardware is at the center of the Big Data picture today. Key solutions and technologies include the Hadoop Distributed File System (HDFS), YARN, MapReduce, Pig, Hive, Security, as well as a growing spectrum of solutions that support Business Intelligence (BI) and Analytics.



Hadoop Articles

What's on the horizon for big data, analytics, and business intelligence as technology evolves faster and faster? In Data Summit's 2016 closing keynote, John O'Brien, principal analyst and CEO at Radiant Advisors, discussed how technology will evolve and grow in the future.

Posted May 26, 2016

Dynatrace, a digital performance software company, is teaming up with Pivotal to deploy its application monitoring solutions for the Pivotal Cloud Foundry (PCF) platform. The integration of Dynatrace with Pivotal Cloud Foundry will enable companies to take advantage of this acceleration by collecting analytics for applications running on PCF, allowing them to detect and act on performance shortcomings and optimize end-to-end transaction latencies.

Posted May 26, 2016

Alpine Data is making advancements to its Chorus platform, combining an integrated analytics platform with improvements that will accelerate the delivery of data. Chorus 6 now delivers capabilities that will help business leaders take the reins in assisting organizations in managing processes that connect machine learning to business behavior.

Posted May 26, 2016

Thanks to the cloud and other empowering technologies such as Hadoop and Apache Spark, we're at the tipping point for big data. These technologies now provide a path to big data success for companies who otherwise lack the specialized big data skills or heretofore proprietary (and expensive) infrastructure to do it themselves. As 2016 progresses, we'll see the broader market put big data capabilities to work and the benefits of big data will, in turn, spread beyond the privileged few companies that were early big data adopters.

Posted May 25, 2016

Data lakes are an alluring option for users with an enormous amount of information, yet questions remain regarding data accuracy, security, and relevancy. Three experts in the big data space, including Anne Buff, business solutions manager for SAS best practices at the SAS Institute, Abhik Roy, database solution engineer at Experion, and Tassos Sarbanes, data architect at Credit Suisse, participated in a roundtable discussion at Data Summit 2016 that focused on these questions and more regarding data lakes.

Posted May 25, 2016

With newer and newer big data sources exploding onto the scene, traditional data warehouses are being challenged. Satya Bhamidipati, director of business development of big data and advanced analytics at Oracle, discussed data mining and advanced analytics techniques that will enable the monetization of data during a session at Data Summit 2016.

Posted May 25, 2016

Oracle is introducing version 4.0 of its NoSQL database. First introduced in 2011, the Oracle NoSQL Database is a key-value database that evolved from the company's acquisition of BerkeleyDB Java Edition, a mature, high-performance embeddable database. Ashok Joshi, senior director of NoSQL, Berkeley Database, and Database Mobile Server at Oracle, outlined the key enhancements in the new release.

Posted May 25, 2016

EMC Corp.'s Enterprise Content Division (ECD) is releasing an upgraded version of its EMC InfoArchive platform, enhancing the ability to secure and leverage large amounts of critical data and content.

Posted May 25, 2016

Data Summit 2016 kicked off at the New York Hilton Midtown earlier this month with keynote presentations by Ben Wellington, the creator of I Quant NY, and Nicholas Chandra, vice president of Cloud Customer Success at Oracle.

Posted May 25, 2016

Trifacta, a provider of data prep software and tools, is introducing the Wrangler Partner Program, a global program that will support and enable partners to sell, implement, and innovate with Trifacta.

Posted May 24, 2016

Data Summit 2016, held in May in NYC, brought together IT managers, data architects, application developers, data analysts, project managers, and business managers to hear industry-leading professionals deliver educational presentations on industry trends and technologies, network with their peers, and participate in hands-on workshops. Here are 10 key takeaways from Data Summit 2016:

Posted May 23, 2016

Companies are facing a growing problem: Data is everywhere, clogging up systems and preventing enterprises from gaining meaningful insights. Data virtualization is a way to reduce data proliferation and ensure that all consumers are working from a single source.

Posted May 23, 2016

What does the Oracle DBA need to know about NoSQL? Charles Pack, technical director, CSX Technology, answered that question in a session at Data Summit 2016 titled "Oracle NoSQL for the Oracle RDBMS DBA." The Oracle NoSQL Database offers benefits and Oracle DBAs have the opportunity to add it to their portfolios, according to Pack, who covered where the NoSQL database fits in within the overall Oracle ecosystem. "It is not a replacement for all your databases. It is a piece in the puzzle," emphasized Pack.

Posted May 18, 2016

Syncsort is adding new capabilities to its platform, including native integration with Apache Spark and Apache Kafka. DMX-h v9 allows organizations to access and integrate enterprise-wide data with streams from real-time sources.

Posted May 18, 2016

SnapLogic is unveiling new updates to its SnapLogic Elastic Integration Platform that add the ability to integrate streaming data and power big data analytics in the cloud. The Spring 2016 release adds support for Apache Kafka, Microsoft HDInsight, and Google Cloud Storage, plus multiple enhancements that automate data shaping and management tasks.

Posted May 18, 2016

AtScale, Inc., which provides a self-service BI platform for Hadoop, has raised a Series B round of $11 million, bringing its total funding to date to $20 million. According to Bruno Aziza, chief marketing officer of AtScale, its platform is different from others in three key ways, making it applicable to use cases in an array of industries including healthcare, telecommunications, retail, and financial services.

Posted May 17, 2016

There are many ways to combine structured with unstructured data, explained Jana Mikovska, senior consultant at Raytion, and Sebastian Klatt, vice president of business development at Raytion. Their presentation at Data Summit 2016 focused on approaches and advantages of combining the two to uncover knowledge buried in unstructured information.

Posted May 16, 2016

What type of email garners the most attention? How can enterprises hook more customers and be in the know? Matt Laudato, senior manager for Big Data Analytics at Constant Contact, addressed these questions and more during Data Summit 2016, explaining how to use big data analytics to optimize email marketing campaigns.

Posted May 12, 2016

As enterprises search for ways to support modern data applications and keep up with the pace of the revolving door of new technologies and solutions, bottlenecks become more frequent and put a stop to application development.

Posted May 12, 2016

Ensuring data is governed properly is a hot topic as more tools and capabilities to analyze and gain insights become available. At the same time, data discovery is becoming more imperative as analysts must be able to move through the process with as little friction as possible.

Posted May 11, 2016

Big data is changing how we view the world, and necessitating new ways of handling data as well, according to Data Summit 2016 keynotes presented on Wednesday by Kalev Hannes Leetaru and George John.

Posted May 11, 2016

At Data Summit 2016, Robin A. Thottungal, the EPA's first chief data scientist, explained how the agency is becoming more data-driven and shared some of the challenges and the innovative solutions taken by the agency in implementing real-time monitoring of environmental parameters about the current state of our ecosystem.

Posted May 10, 2016

At the center of the new big data movement is the Hadoop framework, which provides an efficient file system and related ecosystem of solutions to store and analyze big datasets. The Hadoop ecosystem was addressed from two points of view in a session at Data Summit 2016. James Casaletto, principal solutions architect, Professional Services at MapR, presented a talk titled "Harnessing the Hadoop Ecosystem," and Tassos Sarbanes, mathematician / data scientist, Investment Banking at Credit Suisse, covered the advantages of HBase in a talk titled "HBase Data Model - The Ultimate Model on Hadoop."

Posted May 10, 2016

Despite the increasing focus on offering more access to more users in organizations, ad hoc querying of big data remains a problem for most, according to Jair Aguirre, data scientist at Booz Allen Hamilton, who presented a session at Data Summit 2016 titled "De-Siloing Data Using Apache Drill."

Posted May 10, 2016

IT and businesses don't always see eye to eye when it comes to overall goals within an enterprise. To address this glaring issue, Anne Buff, business solutions manager and thought leader for SAS Best Practices, a thought leadership organization at SAS Institute, discussed aligning data strategy goals at Data Summit 2016.

Posted May 10, 2016

Sinequa has announced the general availability of Sinequa ES Version 10. Powered by machine learning capabilities, the new version aims to deliver deep analytics of contents and user behavior, and offer information with continually improving relevance to users in their work environments. In order to achieve this advancement into the world of cognitive computing, with this new version, Sinequa has integrated the Spark platform in its distributed architecture and implemented machine learning algorithms on Spark within the core of its product.

Posted May 05, 2016

Say what you will about Oracle, it certainly can't be accused of failing to move with the times. Typically, Oracle comes late to a technology party but arrives dressed to kill.

Posted May 04, 2016

Oracle database migration can pose a variety of learning curve challenges. However, a platform does exist that can make the transition easier. In a recent DBTA webinar, Bill Brunt, product manager of SharePlex at Dell, discussed how users can reduce downtime, migrate at speed, eliminate risk, and validate success by tapping into SharePlex.

Posted May 03, 2016

The new name for Dell after it merges with EMC later in 2016 will be Dell Technologies. The new name was announced by Michael Dell, chairman and CEO of Dell Inc., at EMC World and in a letter to Dell team members.

Posted May 02, 2016

Qubole is announcing two major changes. It is releasing an open sourced version of its StreamX tool and forming a partnership with Looker.

Posted May 02, 2016

Enterprises are constantly searching for ways to capture, leverage, and analyze data effectively. However, bottlenecks can wreak havoc on the application development process.

Posted April 29, 2016

Enabled by a partnership with Pentaho, a Hitachi Group Company, and integration with Pentaho's Big Data Integration and Analytics platform, Melissa Data's data quality tools and services can now be scaled across the Hadoop cluster to cleanse and verify data center records.

Posted April 27, 2016

Cisco is launching an appliance that includes the MapR Converged Data Platform for SAP HANA, making it easier and faster for users to take advantage of big data. The UCS Integrated Infrastructure for SAP HANA is easy to deploy, speeds time to market, and reduces operational expenses, while giving users the flexibility to choose a scale-up (on-premises) or scale-out (cloud) storage strategy.

Posted April 27, 2016

Voting has opened for the 2016 DBTA Readers' Choice Awards. Cloud, in-memory, real-time, virtualization, SaaS, IoT - today, there are many opportunities for data-driven companies to take advantage of more data in more varieties flowing at greater velocity than ever before.

Posted April 27, 2016

Cloudera, provider of a data management and analytics platform built on Apache Hadoop and open source technologies, has announced the general availability of Cloudera Enterprise 5.7. According to the vendor, the new release offers an average 3x improvement for data processing with added support of Hive-on-Spark, and an average 2x improvement for business intelligence analytics with updates to Apache Impala (incubating).

Posted April 26, 2016

Neo Technology, creator of Neo4j, is releasing an improved version of its signature platform, enhancing its scalability and introducing new language drivers and a host of other developer-friendly features.

Posted April 26, 2016

Along with an increasing flow of big data that needs to be captured and analyzed, IT departments today also have more solution choices than ever before. However, before making a solution selection, organizations need to understand their requirements and also evaluate the attributes of the possible tools.

Posted April 25, 2016

The COLLABORATE 16 conference for Oracle users kicked off with a presentation by Unisphere Research analyst Joe McKendrick, who shared insights from a ground-breaking study that examined future trends and technology among 690 members of three major Oracle users groups.

Posted April 25, 2016

The need for data integration has never been more intense than it has been recently. The Internet of Things and its muscular sibling, the Industrial Internet of Things, are now being embraced as a way to better understand the status and working order of products, services, partners, and customers. Mobile technology is ubiquitous, pouring in a treasure trove of geolocation and usage data. Analytics has become the only way to compete, and with it comes a need for terabytes—and gigabytes—worth of data. The organization of 2016, in essence, has become a data machine, with an insatiable appetite for all the data that can be ingested.

Posted April 25, 2016

GridGain Systems, provider of enterprise-grade in-memory data fabric solutions based on Apache Ignite, is releasing a new version of its platform. GridGain Professional Edition includes the latest version of Apache Ignite plus LGPL libraries, along with a subscription that includes monthly maintenance releases with bug fixes that have been contributed to the Apache Ignite project but will be included only with the next quarterly Ignite release.

Posted April 20, 2016

Dataguise, a provider of data security solutions, is making DgSecure available for the detection, monitoring, and protection of sensitive data across Amazon Web Services (AWS) Simple Storage Service (S3) and all Elastic MapReduce (EMR) platforms that use AWS S3.

Posted April 19, 2016

Sumo Logic, a provider of cloud-native, machine data analytics services, is unveiling a new platform that natively ingests, indexes, and analyzes structured metrics data and unstructured log data together in real time.

Posted April 18, 2016

Hortonworks is making several key updates to its platform and furthering its mission to be a leading innovator of open and connected data solutions by enhancing partnerships with Pivotal and expanding upon established integrations with Syncsort.

Posted April 15, 2016
