Big Data Quarterly Articles



Data security is a growing concern in all enterprises, and since DBAs have long been the guardians of the data, their skillset can be applied to combating new risks and threats, writes Extreme Scale Solutions' Michelle Malcher in the latest article in a six-part series by editors of IOUG SELECT and Big Data Quarterly on "The Changing Role of the Modern DBA."

Posted January 22, 2018

Cybersecurity and the terms associated with it, such as "ransomware," "malware," and "hacker," have become part of our regular vocabulary. Concerns about protecting our identities and personal data are on the rise in our day-to-day lives. With the onslaught of news reports about data breaches and various hacks, awareness is more elevated now than it has been in years past.

Posted January 11, 2018

Emerging technologies are outpacing data governance at a rapid clip. Specifically, the rate of growth and development of emerging technologies in areas such as artificial intelligence (AI), the Internet of Things (IoT), and machine learning (ML) drastically exceeds the current speed and willingness of businesses to change their governance models to manage and protect their data and information assets. Unfortunately, the larger the delta becomes between the advancements in technology and the changes in governance, the greater the risks and losses for the business.

Posted January 09, 2018

As the use of cloud expands as a database platform and automated capabilities increase, there are new opportunities for DBAs to move beyond operations and emergency fire-fighting and instead get involved in more innovative, high-level tasks that bring additional value to their organizations. Experts weigh in on what's ahead.

Posted January 08, 2018

A lot has been written about why IoT is going to be absolutely essential to your company. If you are convinced as to the why but wonder about the how, pay attention. Based on all the lessons learned from my engagements around the globe, I am going to tell you how. It comes down to three basic phases: preparation, the "real stuff," and embedding it in the organization.

Posted January 08, 2018

In a world where new technologies are often presented to the industry as rainbows and unicorns, there is always someone in a cubicle trying to figure out how to solve business problems and just make these great new technologies work together. The truth is that all of these technologies take time to learn, and it also takes time to identify the problems that can be solved by each of them.

Posted January 05, 2018

As Paul Graham, founder of Y-Combinator, has put it: "Hacking and painting have a lot in common … [hackers and painters] they're both makers." And as modern-day artists, developers will help create the future of cognitive computing and AI. New technology has granted them more paints and more materials with which to paint and sculpt. It will allow them to create works of art never previously envisioned, that are powered by big datasets and cognitive systems. But they can't go it alone. Even da Vinci had his patrons. That leads to the next logical questions. How should organizations support their developers right now? And where should they place resources to ensure their developers reap the greatest benefits from cognitive technologies?

Posted January 04, 2018

The suddenness of the non-relational "breakout" created a lot of noise and confusion and—at least initially—an explosion of new database systems. However, the database landscape is settling down, and in the past few years, the biggest meta trend in database management has been a reduction in the number of leading vendors and consolidation of core technologies. Additionally, we're starting to see database as a service (DBaaS) offerings become increasingly credible alternatives to on-premise or do-it-yourself cloud database configuration.

Posted January 03, 2018

The expense to design and manufacture robots is rapidly declining. Big data, artificial intelligence (AI), machine learning, and deep learning, coupled with componentized software and near-unlimited processing power, are enabling new classes of robotics never possible before. The graphical processing unit has evolved far from its humble video game beginnings into a system of such processing power that a microsecond is considered to be unacceptably high latency. Large fortunes are being generated as the companies behind these technologies develop into entities that resemble the great figures and even empires of antiquity. When considering the technology industry as a whole, it is understandable if we wonder whether we are witnessing the inception of a new colossus.

Posted January 03, 2018

You're a large organization headquartered in the U.S. Your clients and customers live mainly in the U.S. You have no intention to expand beyond the U.S. What could you possibly need to know about new data rules in the EU? Everything, it turns out.

Posted January 02, 2018

With cyberattacks on the rise and the EU's new General Data Protection Regulation (GDPR) going into effect in 2018, there is a greater focus on data security and governance. Here six top IT leaders reflect on the changes taking place and offer their predictions for data security and compliance in 2018.

Posted December 13, 2017

Web scraping can be an invaluable skill to possess when working on data-related projects because many interesting analytics projects often start not with over-explored internal data, but with the treasure trove of information found on the web, according to authors, lecturers, and data scientists Seppe vanden Broucke and Bart Baesens. However, while the web holds a wealth of information, collecting and structuring web data can be a daunting prospect for many data practitioners, believes Baesens, who has written a new book on the topic with vanden Broucke, titled Web Scraping for Data Science with Python. Here, Baesens expands on the techniques and uses for web scraping.

Posted December 04, 2017
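The core task the book describes, turning unstructured web pages into structured data, can be sketched in a few lines of standard-library Python. This is an illustrative example only, not taken from the book; the HTML snippet and link paths are hypothetical, and in a real project the HTML would be fetched from a live URL.

```python
from html.parser import HTMLParser

# Minimal link extractor: pulls every href out of a page, the first
# step in structuring web data for analysis.
class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

# In practice this HTML would come from urllib.request.urlopen(url).read();
# a hard-coded snippet keeps the sketch self-contained.
html = ('<ul><li><a href="/report-2017.pdf">Report</a></li>'
        '<li><a href="/data.csv">Data</a></li></ul>')

parser = LinkExtractor()
parser.feed(html)
print(parser.links)  # ['/report-2017.pdf', '/data.csv']
```

Real-world scrapers layer rate limiting, error handling, and richer parsing on top of this, but the extract-and-structure loop stays the same.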

Databricks, provider of a Unified Analytics Platform and founded by the team that created Apache Spark, has partnered with Microsoft to expand the reach of its Unified Analytics Platform and address customer demand for Spark on Microsoft Azure.

Posted November 29, 2017

One of the biggest challenges IT professionals responsible for corporate data will face in 2017 comes from a law passed by the European Union due to take effect in 2018, the General Data Protection Regulation (GDPR). A metadata-driven approach is important for ensuring compliance with this new data privacy law.

Posted November 28, 2017

Protecting online systems has become an increasingly difficult job. By shifting your mindset from "if" to "when" a cyberattack happens, certain activities that may appear burdensome or tedious, and are sometimes even ignored, become relevant and important.

Posted November 28, 2017

Data integration can seem like a never-ending quest as organizations try to combine and access data from disparate applications and sources. But as we move beyond relational as the only DBMS type that matters and embrace NoSQL and Hadoop data platforms, data integration can become more challenging and require new tools and approaches to achieve success.

Posted November 28, 2017

No longer the stuff of science fiction, the business uses for cognitive computing, artificial intelligence, and machine learning today include fields as diverse as medicine, marketing, defense, energy, and agriculture. Enabling these applications is the vast amount of data that companies are collecting from machine sensors, instruments, and websites and the ability to support smarter solutions with faster data processing.

Posted November 13, 2017

AtScale is releasing a universal semantic platform for business intelligence (BI) with Microsoft Azure HDInsight, providing enterprises with faster time to insight. AtScale's new offer helps enhance Azure HDInsight customers' ability to quickly turn their Big Data lake into a highly available and performant analytical database.

Posted November 02, 2017

There's a surprising trick for greatly increasing the chances of real impact and true success with many types of machine learning systems: do the logistics correctly and efficiently. That sounds like simple advice, and it is, but the impact can be enormous. If the logistics are not handled well, machine learning projects generally fail to deliver practical value. In fact, they may fail to deliver at all. Yet carrying out this advice may not be simple at all.

Posted October 26, 2017

After years of ambiguous expectations about data governance, organizations now have a better handle on how these programs can help them manage the exponential growth of the data they generate. More importantly, there is an understanding about how the integration, organization, and alignment of that data can help meet or exceed business and technology goals.

Posted October 18, 2017

As companies grow increasingly data-centric in their decision making, product and services development, and their overall understanding of the world they work in, speed and agility are becoming critical capabilities. A common theme in big data and analytics today is "Industry 4.0," representing a new wave of technology that enables the automation necessary for scaling. There's compelling justification for this as companies seek to unlock business value from big data with two broad approaches: the democratization of data with greater access by more users, and the enablement of automation everywhere possible.

Posted September 20, 2017

The movement toward the instrumentation of everything and the democratization of data and analytics is resulting in more data flowing to more users, and is creating new challenges in data management.

Posted September 20, 2017

Over the last few years, organizations have shifted from using virtual data centers to creating private or hybrid IaaS clouds that allow authorized users to perform self-service provisioning of virtual machines. These environments have reduced administrative workloads, improved the user experience, and discouraged shadow IT, but they have also brought their own challenges. As virtualized environments increase in scale, management techniques have often become far less effective, making it difficult to keep track of virtual machines, their owners, and why the virtual machines were created in the first place.

Posted September 20, 2017

Companies today are spreading their applications across multiple clouds in a hybrid fashion. According to a recent IDC CloudView study among 6,000 IT and line-of-business executives whose organizations have adopted cloud technologies, 73% are implementing a hybrid strategy, which most defined as utilizing more than one public cloud in addition to dedicated assets.

Posted September 20, 2017

Many people are unsure of the differences between deep learning, machine learning, and artificial intelligence. Generally speaking, and with minimal debate, it is reasonably well-accepted that artificial intelligence can most easily be categorized as that which we have not yet figured out how to solve, while machine learning is a practical application with the know-how to solve problems, such as with anomaly detection.

Posted September 20, 2017

When it comes to visualizing data, there is no shortage of charts and graphs to choose from. From traditional graphs to innovative hand-coded visualizations, there is a continuum of visualizations ready to translate data from numbers into meaning using shapes, colors, and other visual cues. However, each visualization type is intended to show different types of data in specific ways to best represent its insight. Let's look at five of the most common visualization types to help you choose the right chart for your data.

Posted September 20, 2017

Businesses of all sizes across all industries are rapidly adopting digital transformation models that put data at the center of driving the business forward—as they should. However, putting data at the center of everything the business does can be risky without proper planning and rigorous management. Many companies have been wise to introduce data governance programs to protect corporate data assets and establish a framework for operational excellence when it comes to data management and use. Data governance emphasizes the enforcement of defined standards or policies and provides mechanisms for consistency and repeatable processes, but it is not enough to protect businesses in today's world of data.

Posted September 20, 2017

Nowadays, many firms are already using big data and analytics to manage and optimize their customer relationships. Both technologies can also be leveraged for one of a firm's other key assets: its employees. Many examples of HR analytics (also called workforce analytics) come to mind.

Posted September 20, 2017

While Vic Damone and Jane Powell wanted their eggs with a kiss in the 1950s musical Rich, Young and Pretty, in the near future, your kitchen might well know exactly how you want them thanks to the Internet of Things (IoT).

Posted September 20, 2017

Tick tock, tick tock: back and forth swings the privacy pendulum. While we in the U.S. continue to regress on issues of data privacy, the European Union (EU) is proceeding with bold steps to protect the privacy of its citizens. On May 25, 2018, the General Data Protection Regulation (GDPR) becomes the law of the land in the EU. It applies to any company that processes or holds data on EU residents, regardless of where it is located in the world. Popular applications such as Facebook, Twitter, and Airbnb are among the companies that will be directly impacted by this law. If you do business with EU residents, regardless of geographic location, this law directly applies to you.

Posted September 20, 2017

Qubole Data Service provides a single platform for ETL, reporting, ad hoc analysis, stream processing and machine learning. It runs on AWS, Microsoft Azure and Oracle Bare Metal Cloud, taking advantage of the elasticity and scale of the cloud, and also supports leading open source engines, including Apache Spark, Hadoop, Presto, and Hive.

Posted September 12, 2017

The Apache Arrow project is a standard for representing columnar data for in-memory processing, which has a different set of trade-offs compared to on-disk storage. In-memory, access is much faster and processes optimize for CPU throughput by paying attention to cache locality, pipelining, and SIMD instructions.

Posted September 12, 2017
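The trade-off the Arrow project targets can be illustrated without Arrow itself. The sketch below, which uses only Python's standard library and invented example data, contrasts a row-oriented layout (values scattered across per-record objects) with a columnar one (each column a single contiguous buffer, the layout that makes cache-friendly scans and SIMD possible on real hardware).

```python
from array import array

# Row-oriented: each record is a separate object, so a scan over one
# field hops between scattered memory locations.
rows = [{"price": 10.0, "qty": 2},
        {"price": 12.5, "qty": 1},
        {"price": 9.0,  "qty": 4}]

# Column-oriented (the Arrow idea): each column is one contiguous,
# typed buffer, which a CPU can stream through cache efficiently.
prices = array("d", (r["price"] for r in rows))  # contiguous doubles
qtys   = array("q", (r["qty"] for r in rows))    # contiguous ints

# An analytic scan such as "total revenue" now touches just two
# tightly packed buffers instead of many small objects.
total = sum(p * q for p, q in zip(prices, qtys))
print(total)  # 68.5
```

Arrow standardizes this columnar representation so different engines can share the same in-memory buffers without conversion.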

New multi-cloud capabilities extend discovery and dependency mapping of all assets beyond on-prem data centers to public and private clouds.

Posted September 12, 2017

As organizations increasingly move their data and applications from on-premise deployments to the cloud, the role of the DBA is also shifting. According to Penny Avril, vice president of product management, Oracle Database, the transition means that DBAs have the opportunity to move from being data custodians and keepers to taking on a more strategic role in their organizations. But, she says, the time to prepare for the new cloud reality is now.

Posted September 07, 2017

Evaluating new and disruptive technologies, as well as when and where they may prove useful, is a challenge. Against the rapidly evolving big data scene, this year, Big Data Quarterly presents the newest "Big Data 50," an annual list of forward-thinking companies that are working to expand what's possible in terms of collecting, storing, and deriving value from data.

Posted September 07, 2017

InfluxData provides an open source platform built for metrics, events, and other time-based data. Recently, Evan Kaplan, CEO of InfluxData, reflected on the future of databases and why time-series represents the next wave of databases for data—from humans, sensors, and machines.

Posted August 30, 2017

These days, end users—be they employees or consumers visiting a site—expect information delivered in seconds, if not nanoseconds. Applications tied into networks of connected devices and sensors are powering operations and making adjustments on a real-time basis.

Posted August 16, 2017

Barracuda Networks, a provider of cloud-enabled security and data protection solutions, has added the ability to replicate data from either an on-premises physical or virtual backup appliance to AWS. The new feature provides customers, resellers, and MSPs with greater flexibility and choice to protect their data from data loss and potential disasters, including security threats like ransomware. This adds an additional option for customers, in addition to the ability to replicate to the Barracuda Cloud.

Posted August 15, 2017

As data flows into businesses faster than ever before, time-to-insight and time-to-action are critical competitive differentiators, and the demand for fast access to information is growing.

Posted August 08, 2017

Dremio has announced its launch in the data analytics market with the availability of the Dremio Self-Service Data Platform. According to Dremio, its platform allows users to be independent and self-directed in their use of data, while accessing data from a variety of sources at scale.

Posted July 19, 2017

Pricchaa has released a free solution for detecting, encrypting, and monitoring sensitive data housed in the Amazon Web Services (AWS) Cloud.

Posted July 13, 2017

As data continues to impact every facet of every business, more and more Global 2000 companies are choosing NoSQL databases to power their digital economy applications.

Posted June 15, 2017

The Internet of Things continues to grow exponentially, disrupting markets and causing enterprises to grapple with more data than ever before. Without the right data management strategy, investments in IoT can yield limited results. DBTA recently held a webinar featuring John O'Brien, principal advisor and CEO of Radiant Advisors, and Vijay Raja, solutions marketing lead, IoT, at Cloudera, who discussed key drivers and patterns for IoT adoption across industries.

Posted June 15, 2017

The Apache Arrow project is a standard for representing data for in-memory processing. Hardware evolves rapidly. Because Apache Arrow was designed to benefit many different types of software in a wide range of hardware environments, the project team focused on making the work "future-proof," which meant anticipating changes to hardware over the next decade.

Posted June 14, 2017

Looker, provider of a cloud data platform, has announced Instant Insight, a new feature that allows users to analyze their data without waiting for help from an analyst.

Posted June 14, 2017

Attunity Ltd., a provider of data integration and big data management software solutions, is launching a new solution, Attunity Compose for Hive, which automates the process of creation and continuous loading of operational and historical data stores in a data lake.

Posted June 13, 2017

Addressing the rise of hybrid deployments, Hortonworks has introduced a new software support subscription to provide seamless support to organizations as they transition from on-premise to cloud. Separately, Hortonworks also announced the general availability of Hortonworks Dataflow (HDF) 3.0, a new release of its open source data-in-motion platform, which enables customers to collect, curate, analyze, and act on all data in real time, across the data center and cloud.

Posted June 12, 2017

Hadoop adoption is growing and so is the commitment to data lake strategies. Data security, governance, integration, and access have all been identified as critical success factors for data lake deployments.

Posted June 09, 2017

The demand for speed and agility are among the key drivers of the growing DevOps movement, which seeks to better align software development and IT operations. Yet, challenges still exist.

Posted June 07, 2017

With the recently unleashed WannaCry ransomware attacks that targeted computer systems globally fresh in attendees' minds, a number of Data Summit 2017 sessions looked at the need for smarter approaches to data governance and data security.

Posted May 24, 2017
