<< back Page 2 of 3 next >>

The Changing Role of the Modern DBA in a Big Data World

But, despite all that, Harrison said, there has never been a better time to be a data professional—providing you are ready to keep learning and adapting.  “In the same 10 years we’ve seen an explosion of significant new data stores—Hadoop, Spark, MongoDB, and Cassandra—just to mention a few.  Enterprises are trying harder than ever before to squeeze value from data assets to achieve mission-critical competitive advantage.”

DBAs aren't trained to identify on their own which grains of data matter to the business, nor are they expected to, McKendrick pointed out. “But they are in a position to help the business understand what nuggets of data are out there, and how they are relevant to the business.”

Everything can be sensored, monitored, metered, and measured from the lunchroom to the executive suite to the factory floor, but, said McKendrick, the most challenging aspect of the big data revolution is the ability to understand what nuggets of information can be found that are of material importance to the business.

With the emergence of big data, data governance has also become progressively more challenging as businesses become more dependent on data analytics. “Increased speed-to-value requirements mean we must apply metadata, information lifecycle management, quality assurance, and security to streaming data to meet near real-time data access requirements,” said Caserta. “As analytics becomes the backbone of every business application, very high data volumes, reduced latency, high concurrency and continuous integration and continuous deployment are all common objectives for any modern data organization.”
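To make Caserta's point concrete, here is a minimal Python sketch of what applying metadata, a retention policy, a quality check, and masking to records as they stream in might look like. It is an illustration under stated assumptions, not a description of any tool mentioned in this article: the field names, the 30-day retention window, the `orders-stream` source label, and the `govern` and `mask` helpers are all hypothetical.

```python
import hashlib
from datetime import datetime, timedelta, timezone

# All names and policies below are hypothetical, chosen only for illustration.
REQUIRED_FIELDS = ("customer_id", "amount")  # quality assurance: must be present
SENSITIVE_FIELDS = ("email",)                # security: masked before storage
RETENTION = timedelta(days=30)               # lifecycle: when the record expires


def mask(value):
    """Replace a sensitive value with a short one-way hash."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:12]


def govern(stream, source, now=None):
    """Yield one (accepted, result) pair per record, as each record arrives."""
    for record in stream:
        ts = now or datetime.now(timezone.utc)
        missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
        if missing:
            # Rejected records carry only the reason, never the raw payload.
            yield False, {"reason": "missing required fields", "missing": missing}
            continue
        clean = {
            key: (mask(value) if key in SENSITIVE_FIELDS else value)
            for key, value in record.items()
        }
        clean["_meta"] = {
            "source": source,
            "ingested_at": ts.isoformat(),
            "expires_at": (ts + RETENTION).isoformat(),
        }
        yield True, clean


events = [
    {"customer_id": 1, "amount": 9.5, "email": "a@example.com"},
    {"customer_id": None, "amount": 3.0, "email": "b@example.com"},
]
results = list(govern(events, source="orders-stream"))
```

Because `govern` is a generator, each record is checked, masked, and tagged the moment it arrives rather than in a later batch job, which is the near real-time constraint the quote describes.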

Governance is a growing problem, agreed Corey. “Who owns it and what are we collecting and why are questions organizations need to ask,” he said. “Today, many organizations are just collecting data as if it’s the Wild West. They are making it actionable in any way they can. As this data becomes more important to the business, proper safeguards must be taken. In addition, there is the need for data security to ensure the data is safe and accurate.”

Still, it is essential to remember that big data technologies simply provide a new way to manage and provide data. “The same challenges around integrity, scalability, and security exist,” said Reeves. “Every single challenge that we’ve overcome with the relational database management system (RDBMS) will be faced again with big data. We need those seasoned DBAs who hold a wealth of tribal knowledge between their ears learning these new big data platforms. Companies have invested a huge amount of time and money in developing them and they should be working to maintain that pattern recognition expertise.”

What New Skills Are Needed?

Clearly, the cloud is here to stay, and as a result, it is incumbent on DBAs to sharpen their cloud skills so they can oversee those services and be aware of cloud provider weaknesses.

“Most data projects are moving to the cloud, introducing a new set of skills to be learned by folks experienced in on-premise-only solutions,” said Caserta.  And many of the new skills that are important fall under the title of data engineer, he said. “Understanding how to spin up compute clusters, using Spark, and implementing LAMBDA and serverless architecture would be strategic skills for any system DBA to acquire.”

In addition, Caserta added, ETL is still as important as ever, so learning data integration using big data technologies is a critical path for any developer or application DBA. And, of all the languages, Python has gained the most popularity among data engineers and data scientists, said Caserta. “Specifically, PySpark, which exposes the Spark programming model to Python through the Spark Python API, has become so critical to building data pipelines in analytics platforms that we now test every candidate consultant for proficiency in PySpark as a prerequisite for employment.”


DBAs are the “sentinels of data availability,” said Reeves. “Just because you put your data in the cloud in a big data repository doesn’t mean all other problems go away. Having DBAs laser-focused on data is the second most important thing a company has—superseded only by the data itself.”

What is needed is for DBAs to join the rest of IT, Reeves added. “While 10 years ago, the idea of a full-stack developer was unheard of outside of five-person startups, now it’s an imperative. We also have system administrators who write far more code today than the shell scripts they wrote in the past. During that journey, they’ve picked up soft skills such as project management and incident response, and some even understand how business demands are met with their contributions.” Being a certified DBA with 20 years of experience with one company’s products is no longer sufficient, he added. “IT needs gifted generalists, not domain experts. After all, don’t we all just Google our error messages now?”


Far too few data professionals spend enough time honing their soft skills, such as public speaking, writing, collecting requirements, active listening, or influence, agreed Kevin Kline, principal program manager at SentryOne, Microsoft Data Platform MVP, and a founding board member of PASS. “I’ve seen at least an equal proportion of projects run aground because of poor people skills as I have seen crash due to bad technical decisions. So, no matter what technologies you master, you’ll never regret joining Toastmasters or attending a seminar on leadership and motivation.”
