Hadoop

The Apache Hadoop framework for the processing of data on commodity hardware is at the center of the Big Data picture today. Key solutions and technologies include the Hadoop Distributed File System (HDFS), YARN, MapReduce, Pig, Hive, Security, as well as a growing spectrum of solutions that support Business Intelligence (BI) and Analytics.



Hadoop Articles

Trifacta, a provider of data wrangling software, is deepening technical integration with the Hortonworks Data Platform (HDP) and the industry's first certification for Apache Atlas, a data governance and metadata framework for Hadoop.

Posted June 23, 2016

For those who haven't encountered the term, the "trough of disillusionment" is a standard phase within the Gartner hype cycle. New technologies are expected to pass from a "peak of inflated expectations" through the trough of disillusionment before eventually reaching the "plateau of productivity." Most new technologies are expected to go through this trough, so it's hardly surprising to find big data entering this phase.

Posted June 22, 2016

The data manager now sits in the center of a revolution swirling about enterprises. In today's up-and-down global economy, opportunities and threats are coming in from a number of directions. Business leaders recognize that the key to success in hyper-competitive markets is the ability to leverage data to draw insights that predict and provide prescriptive action to stay ahead of markets and customer preferences. For that, they need to keep up with the latest solutions and approaches in data management. Here are 12 of the key technologies turning heads—or potentially opening enterprise wallets—in today's data centers.

Posted June 22, 2016

Announcing a new version of its "data lake in-a-box," Koverse, Inc. has released the Koverse Platform Version 2.0, which provides enhancements for organizations trying to extract value out of their investments in big data and analytics. With this introduction, the company says it is offering enterprises a 30-day guarantee to bring their data into production with a data lake capable of delivering insights for real-world organizational challenges.

Posted June 21, 2016

With the increase in data sources, data types, and data management platforms, new obstacles can also appear, creating difficulties in combining data for important insights. During educational presentations on industry trends and technologies, keynotes, discussions, and hands-on workshops at Data Summit 2016, the philosophies and technical approaches that can help organizations be successful at putting their data to work were addressed.

Posted June 17, 2016

What's on the horizon for big data, analytics, and business intelligence as technology evolves faster and faster? In the closing keynote at Data Summit 2016, John O'Brien, principal analyst and CEO at Radiant Advisors, discussed how technology will evolve and grow in the future.

Posted June 17, 2016

Hortonworks, Inc. is enhancing its Global Professional Services (GPS) program to support and enable Hortonworks Connected Data Platforms customers.

Posted June 16, 2016

Talend is "going all-in" with Amazon Web Services (AWS), now providing its entire product line on AWS cloud. The latest release of Talend Integration Cloud extends the company's ability to allow IT organizations to quickly "spin up and spin down" big data and data integration workloads running on Amazon Redshift or Amazon EMR.

Posted June 10, 2016

Cloudera is collaborating with Microsoft to build a new open source platform that will reduce the burden on application developers leveraging Spark. The two companies, together with other open source contributors, have built a new open source, Apache-licensed, REST-based Spark service called Livy, which is still in early alpha development.

Posted June 09, 2016

Data Summit 2016 in New York City drew IT managers, data architects, application developers, data analysts, project managers, and business managers. Analytics, search, machine learning, and IoT were some of the key topics of discussion in educational presentations on industry trends and technologies, keynotes, and hands-on workshops.

Posted June 08, 2016

Progress is releasing a new package of platforms that will enable enterprises to tap into the full potential of digital business. Progress DigitalFactory is a new cloud-based platform that provides a holistic, extensible solution for businesses to create omni-channel digital experiences.

Posted June 08, 2016

This week at Spark Summit, data management companies are rolling out new Spark integrations and support to enable their users to take advantage of the open source data processing framework. In addition, Databricks, the company founded by the team that created Apache Spark, has announced that the Databricks Community Edition (DCE) is now generally available.

Posted June 07, 2016

Emerging and newer vendors can offer fresh, innovative ways of dealing with data management and analytics challenges. Here, DBTA looks at the 10 companies whose approaches we think are worth watching.

Posted June 06, 2016

The IT landscape is always shifting and being contoured by external market forces and internal industry initiatives. Against this changing backdrop, each year, DBTA presents a list of 100 companies that matter in data, compelling us to pause and reflect on the market changes taking place.

Posted June 06, 2016

Teradata has introduced the Teradata Aster Connector for Spark, an integration of Apache Spark analytics with Teradata Aster Analytics. The connector enables pre-built analytics functions from both solutions to be executed from Aster Analytics, enabling anyone who can use Aster Analytics to also run advanced analytics on Spark without the need to learn or know Scala.

Posted June 06, 2016

NoSQL database technology vendor Couchbase has introduced a new Couchbase Spark Connector. According to Couchbase, the new Spark connector will enable businesses to gain business insights faster, enabling them to deliver better customer experiences through web, mobile and IoT applications.

Posted June 06, 2016

Today at Spark Summit, MapR Technologies is announcing a new enterprise-grade Apache Spark Distribution. "This is a Spark-focused distribution that combines Apache Spark with the real time, persistent, web-scale data layer of MapR," said Jack Norris, SVP, Data and Applications, MapR. The new Spark Distribution option for the MapR Converged Data Platform enables advanced analytics, including batch processing, machine learning, procedural SQL, and graph computation, and is a production-ready platform for Spark workloads on-premises and in the cloud.

Posted June 06, 2016

In the wide world of Hadoop today, there are seven technology areas that have garnered a high level of interest. These key areas prove that Hadoop is not just a big data tool; it is a strong ecosystem in which new projects coming along are assured of exposure and interoperability because of the strength of the environment.

Posted June 03, 2016

Open Text Corporation, a provider of Enterprise Information Management solutions, is entering into a definitive agreement to acquire Recommind, Inc., a leading provider of eDiscovery and information analytics. The purchase price is approximately $163 million. With this acquisition, Recommind's eDiscovery platform will complement OpenText's own enterprise information management (EIM) solutions.

Posted June 02, 2016

Oracle is introducing version 4.0 of its NoSQL database. First introduced in 2011, the Oracle NoSQL Database is a key-value database that evolved from the company's acquisition of BerkeleyDB Java Edition, a mature, high-performance embeddable database. Ashok Joshi, senior director of NoSQL, Berkeley Database, and Database Mobile Server at Oracle, outlined the key enhancements in the new release.

Posted June 01, 2016

Dynatrace, a digital performance software company, is teaming up with Pivotal to deploy its application monitoring solutions for the Pivotal Cloud Foundry (PCF) platform. The integration of Dynatrace with Pivotal Cloud Foundry will enable companies to take advantage of this acceleration by collecting analytics for applications running on PCF, allowing them to detect and act on performance shortcomings and optimize end-to-end transaction latencies.

Posted May 26, 2016

Alpine Data is making advancements to its Chorus platform, combining an integrated analytics platform with improvements that will accelerate the delivery of data. Chorus 6 now delivers capabilities that will help business leaders take the reins in assisting organizations in managing processes that connect machine learning to business behavior.

Posted May 26, 2016

Thanks to the cloud and other empowering technologies such as Hadoop and Apache Spark, we're at the tipping point for big data. These technologies now provide a path to big data success for companies that otherwise lack the specialized big data skills or heretofore proprietary (and expensive) infrastructure to do it themselves. As 2016 progresses, we'll see the broader market put big data capabilities to work, and the benefits of big data will, in turn, spread beyond the privileged few companies that were early big data adopters.

Posted May 25, 2016

Utilizing data lakes is an alluring option for users with an enormous amount of information, yet questions remain regarding data accuracy, security, and relevancy. Three experts in the big data space, Anne Buff, business solutions manager for SAS best practices at the SAS Institute, Abhik Roy, database solution engineer at Experion, and Tassos Sarbanes, data architect at Credit Suisse, participated in a roundtable discussion at Data Summit 2016 that focused on these questions and more regarding data lakes.

Posted May 25, 2016

With newer and newer big data sources exploding onto the scene, traditional data warehouses are being challenged. Satya Bhamidipati, director of business development of big data and advanced analytics at Oracle, discussed data mining and advanced analytics techniques that will enable the monetization of data during a session at Data Summit 2016.

Posted May 25, 2016

EMC Corp.'s Enterprise Content Division (ECD) is releasing an upgraded version of its EMC InfoArchive platform, enhancing the ability to secure and leverage large amounts of critical data and content.

Posted May 25, 2016

Data Summit 2016 kicked off at the New York Hilton Midtown earlier this month with keynote presentations by Ben Wellington, the creator of I Quant NY, and Nicholas Chandra, vice president of Cloud Customer Success at Oracle.

Posted May 25, 2016

Trifacta, a provider of data prep software and tools, is introducing the Wrangler Partner Program, a global program that will support and enable partners to sell, implement, and innovate with Trifacta.

Posted May 24, 2016

Data Summit 2016, held in May in NYC, brought together IT managers, data architects, application developers, data analysts, project managers, and business managers to hear industry-leading professionals deliver educational presentations on industry trends and technologies, network with their peers, and participate in hands-on workshops. Here are 10 key takeaways from Data Summit 2016:

Posted May 23, 2016

Companies are facing a growing problem: Data is everywhere, clogging up systems and preventing enterprises from gaining meaningful insights. Data virtualization is a way to reduce data proliferation and ensure that all consumers are working from a single source.

Posted May 23, 2016

What does the Oracle DBA need to know about NoSQL? Charles Pack, technical director, CSX Technology, answered that question in a session at Data Summit 2016 titled "Oracle NoSQL for the Oracle RDBMS DBA." The Oracle NoSQL Database offers benefits and Oracle DBAs have the opportunity to add it to their portfolios, according to Pack, who covered where the NoSQL database fits in within the overall Oracle ecosystem. "It is not a replacement for all your databases. It is a piece in the puzzle," emphasized Pack.

Posted May 18, 2016

Syncsort is adding new capabilities to its platform, including native integration with Apache Spark and Apache Kafka. DMX-h v9 allows organizations to access and integrate enterprise-wide data with streams from real-time sources.

Posted May 18, 2016

SnapLogic is unveiling new updates to its SnapLogic Elastic Integration Platform that add the ability to integrate streaming data and power big data analytics in the cloud. The Spring 2016 release adds support for Apache Kafka, Microsoft HDInsight, and Google Cloud Storage, plus multiple enhancements that automate data shaping and management tasks.

Posted May 18, 2016

AtScale, Inc., which provides a self-service BI platform for Hadoop, has raised a Series B round of $11 million, bringing its total funding to date to $20 million. According to Bruno Aziza, chief marketing officer of AtScale, its platform is different from others in three key ways, making it applicable to use cases in an array of industries including healthcare, telecommunications, retail, and financial services.

Posted May 17, 2016

There are many ways to combine structured with unstructured data, explained Jana Mikovska, senior consultant at Raytion, and Sebastian Klatt, vice president of business development at Raytion. Their presentation at Data Summit 2016 focused on approaches and advantages of combining the two to uncover knowledge buried in unstructured information.

Posted May 16, 2016

What type of email garners the most attention? How can enterprises hook more customers and be in the know? Matt Laudato, senior manager for Big Data Analytics at Constant Contact, addressed these questions and more during Data Summit 2016, explaining how to use big data analytics to optimize email marketing campaigns.

Posted May 12, 2016

As enterprises search for ways to support modern data applications and keep up with the pace of the revolving door of new technologies and solutions, bottlenecks become more frequent and put a stop to application development.

Posted May 12, 2016

Ensuring data is governed properly is a hot topic as more tools and capabilities to analyze and gain insights become available. At the same time, data discovery is becoming more imperative as analysts must be able to move through the process with as little friction as possible.

Posted May 11, 2016

Big data is changing how we view the world, and necessitating new ways of handling data as well, according to Data Summit 2016 keynotes presented on Wednesday by Kalev Hannes Leetaru and George John.

Posted May 11, 2016

At Data Summit 2016, Robin A. Thottungal, the EPA's first chief data scientist, explained how the agency is becoming more data-driven and shared some of the challenges it has faced and the innovative solutions it has adopted in implementing real-time monitoring of environmental parameters that describe the current state of our ecosystem.

Posted May 10, 2016

At the center of the new big data movement is the Hadoop framework, which provides an efficient file system and related ecosystem of solutions to store and analyze big datasets. The Hadoop ecosystem was addressed from two points of view in a session at Data Summit 2016. James Casaletto, principal solutions architect, Professional Services at MapR, presented a talk titled "Harnessing the Hadoop Ecosystem," and Tassos Sarbanes, mathematician / data scientist, Investment Banking at Credit Suisse, covered the advantages of HBase in a talk titled "HBase Data Model - The Ultimate Model on Hadoop."

Posted May 10, 2016
