As a data professional, you have heard the term "unstructured data." And you probably know what is meant by that term, as well. For those who do not, unstructured data is a general term used to define data that is not numbers, letters, and dates stored or viewed as rows and columns. But it is a horrible term. In fact, unstructured data is a lie. Let me tell you why.
Posted August 10, 2015
You'd be surprised at the variety of sages who have uttered pithy pieces of wisdom that prove useful in some way to DBAs. So with that in mind, let's review some of the could-have-been-DBAs through history by reviewing their own words!
Posted July 08, 2015
The landscape for database management systems is changing more rapidly these days than it has since the earliest days of the relational DBMS. Not only do we have an onslaught of NoSQL database systems in various forms (column, document, key/value, and graph databases), but we also see a burgeoning market for in-memory database management, where the DBMS relies on main memory instead of disk for data storage, management, and manipulation. But there is another "category" of DBMS evolving that is known as "NewSQL."
Posted June 09, 2015
Every now and then, somebody will raise the age-old question "How can I measure the effectiveness and quality of my DBA staff?" This can be a difficult question to answer. And it almost always hides the actual question that is begging to be asked, which is "How many DBAs do we need?"
Posted May 14, 2015
When databases are built from a well-designed data model, the resulting structures provide increased value to the organization. The value derived from the data model exhibits itself in the form of minimized redundancy, maximized data integrity, increased stability, better data sharing, increased consistency, more timely access to data, and better usability.
Posted April 06, 2015
One of the trickiest aspects of relational database management can be dealing with missing information. The standard method of representing missing information is to set the "value" to null.
Posted March 12, 2015
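To make the point about nulls in the entry above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The employee table and its columns are hypothetical, invented only for illustration, and other DBMS products may differ in the details. It shows two of the usual surprises: aggregate functions skip nulls, and a row with a null matches neither an equality predicate nor its opposite.

```python
import sqlite3

# Hypothetical table: bonus is nullable, meaning "not known" or "not applicable".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT NOT NULL, bonus INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 500), ("Bob", None), ("Cy", 0)],
)

# COUNT(*) counts rows; COUNT(bonus) skips rows where bonus is null.
total_rows, known_bonuses = conn.execute(
    "SELECT COUNT(*), COUNT(bonus) FROM employee"
).fetchone()

# Comparing a null to anything is neither true nor false,
# so Bob is returned by neither of the first two queries.
zero = conn.execute("SELECT name FROM employee WHERE bonus = 0").fetchall()
nonzero = conn.execute("SELECT name FROM employee WHERE bonus <> 0").fetchall()

# IS NULL is the only predicate that finds the missing value.
missing = conn.execute("SELECT name FROM employee WHERE bonus IS NULL").fetchall()

print(total_rows, known_bonuses, zero, nonzero, missing)
```

Note that a bonus of 0 and a missing bonus are different things here, which is exactly why the two must not be conflated in queries.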
When the data requirements of an organization change, the databases used to store the data must also change. Unfortunately, today's database systems do not make managing database change particularly easy.
Posted February 11, 2015
In 2014, the big data drumbeat continued to pound, major DBMS vendors expanded their product offerings, Microsoft hired a new CEO, and a range of new technology offerings were introduced. In retrospect, what stands out?
Posted January 07, 2015
Lock contention issues can be frustrating problems to investigate and debug. Before blaming the database system (as is the usual response), there are a few questions to consider that can help identify the cause of the locking problems.
Posted December 03, 2014
Designing an appropriate set of indexes can be one of the more troubling aspects of developing efficient relational database applications. Perhaps the most important thing you can do to ensure optimal application performance when accessing data in a relational/SQL database is to create the correct indexes for your tables based on the queries your applications use. Of course, this is easier said than done. With the caveat that appropriate index creation can be complicated, let's look at the top 10 steps to building the right indexes on your tables.
Posted November 12, 2014
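The idea in the indexing entry above, that indexes should be driven by the queries your applications actually run, can be sketched with SQLite, which ships with Python. This is only an illustration under stated assumptions: the orders table, the query, and the index name idx_orders_customer are all hypothetical, and the exact plan text varies by SQLite version and will look different in DB2, Oracle, or SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(n % 50, n) for n in range(1000)],
)

# The query the (hypothetical) application runs most often.
QUERY = "SELECT * FROM orders WHERE customer_id = ?"


def plan(conn, sql, params):
    """Return the detail text of SQLite's query plan (last column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index on customer_id, the optimizer has to scan the table.
plan_before = plan(conn, QUERY, (7,))

# Create an index that matches the predicate of the query.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, QUERY, (7,))

print("before:", plan_before)
print("after: ", plan_after)
```

Checking the access path before and after, rather than assuming the index is used, is the habit worth taking from this sketch.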
Check constraints are supported in all of the major DBMS products, including DB2, Oracle, and SQL Server. But they are too often ignored and not implemented in production databases. Properly defined, check constraints can provide enhanced data integrity without requiring procedural logic.
Posted October 08, 2014
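As a small illustration of the check constraint entry above, the following sketch uses SQLite through Python's sqlite3 module rather than one of the products named in the column. The employee table and its rules are hypothetical. The point is that the database itself rejects the bad row, with no procedural validation code in the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        salary INTEGER NOT NULL CHECK (salary > 0),
        status TEXT    NOT NULL CHECK (status IN ('ACTIVE', 'RETIRED'))
    )
    """
)

# A row that satisfies both constraints is accepted.
conn.execute("INSERT INTO employee VALUES (1, 50000, 'ACTIVE')")

# A row that violates the salary rule is rejected by the DBMS.
try:
    conn.execute("INSERT INTO employee VALUES (2, -10, 'ACTIVE')")
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(row_count)
```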
Recounting the evolution of IBM's relational database technology DB2, Craig S. Mullins goes back a full 3 decades in his latest column, starting with the wild and woolly 1980s and ending with DB2 10 and 11 for z/OS as the currently supported versions of DB2, which have added important new functionality.
Posted September 10, 2014
Programs that read database data can access numerous rows and are therefore susceptible to concurrency problems. To get around this issue, most major RDBMS products support read-through locks, also known as "dirty read" or "uncommitted read," to help overcome concurrency problems. When might you want to consider using dirty reads in your applications?
Posted August 05, 2014
Lists help us to remember, impose order upon tasks, and provide guidance. Some lists have taken on great importance in the world and in our lives. The Bill of Rights is such a list. So is the Ten Commandments. But although we use lists to administer, program, and maintain our databases, there really is no overarching list of "things you should (or should not) do" to ensure an effective, accurate, well-designed database. So here is the list of Ten Database Commandments.
Posted July 03, 2014
Good data quality starts with metadata. Accurate data definitions are required in order to apply the controls for compliance to the correct data. Without proper metadata definitions, it is impossible to apply regulatory compliance to data.
Posted June 11, 2014
Recursive SQL can be very elegant and efficient. However, because of the difficulty developers can have understanding recursion, it is sometimes thought of as "too inefficient to use frequently." But, if you have a business need to walk or explode hierarchies in your database, recursive SQL will likely be your most efficient option.
Posted May 08, 2014
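For readers who have not seen recursive SQL, here is a minimal sketch of walking a hierarchy, as described in the entry above. It uses a recursive common table expression in SQLite via Python's sqlite3 module; the org table and the names in it are made up for illustration, and the syntax differs slightly across DBMS products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE org (emp TEXT PRIMARY KEY, mgr TEXT)")
conn.executemany(
    "INSERT INTO org VALUES (?, ?)",
    [
        ("Alice", None),
        ("Bob", "Alice"),
        ("Carol", "Alice"),
        ("Dave", "Bob"),
        ("Eve", "Dave"),
    ],
)

# The seed query finds direct reports; the recursive part repeatedly
# joins back to org to find the reports of those reports.
reports = conn.execute(
    """
    WITH RECURSIVE reports(emp, depth) AS (
        SELECT emp, 1 FROM org WHERE mgr = ?
        UNION ALL
        SELECT o.emp, r.depth + 1
        FROM org AS o
        JOIN reports AS r ON o.mgr = r.emp
    )
    SELECT emp, depth FROM reports ORDER BY depth, emp
    """,
    ("Alice",),
).fetchall()

print(reports)
```

A single statement explodes the whole hierarchy, whatever its depth, which is the efficiency argument the column is making.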
Driving a wide range of applications, from operational applications such as fraud detection to strategic analysis such as customer segmentation, advanced analytics goes deeper than traditional business intelligence activities into the "why" of the situation, and delivers likely outcomes.
Posted April 04, 2014
Database disaster recovery must be an integral component of your overall business recovery plan. Just because your organization has not yet experienced a disaster, or is not in a high-risk area, does not absolve you from the need for contingency planning. Here is what all DBAs should do to be prepared.
Posted March 12, 2014
Metadata is required to make data usable by applications, yet many data professionals do not understand or simply ignore metadata management. Just think about the metadata you must manage on your iPod—or whatever MP3 player you use—to be able to successfully select the music you want to hear.
Posted February 10, 2014
2013 saw its fair share of important data-related acquisitions, with key players like Oracle, IBM, and SAP all busy buying. Last year also saw the announcement and release of a lot of great new DBMS technology. Let's look back over the most significant data and database-related events of the previous year.
Posted January 07, 2014
DB2 11 for z/OS brings with it a bevy of interesting and useful new features. They run the gamut from development to administration to performance to integration with big data. Even if your company won't be migrating right away, now that DB2 11 is out in the field and available, the time has come for all DB2 users to take some time to learn what this latest and greatest version of DB2 can do.
Posted December 04, 2013
Although the concept of NoSQL is relatively new, some of the players have actually been around for a while. Yet, the concept of what NoSQL means continues to change. With all that in mind, let's take a look at the world of NoSQL.
Posted November 13, 2013
A successful DBA must understand application development and the issues involved in programming and design. Here are some things that every DBA must know about application development and the design projects of their organization.
Posted October 09, 2013
DBAs need to make many different types of changes to a database over its lifetime. Some will be simple and easy to implement, others much more difficult and complex. It is the DBA's job to understand the best way to implement any type of database change, but often, simple changes become more difficult in the real world. Database change management tools help to make this job easier and are among the first tools acquired by many organizations when they implement a database of any size.
Posted September 11, 2013
Just about every company with a DBMS has that binder full of corporate and/or IT standards. That one over there in the corner with the cobwebs on it — the one that you only use when you need an excuse to avoid work. Okay, well, maybe it's not quite that bad. Your standards documents could be on the company intranet or some other online mechanism (but chances are there will be virtual cobwebs on your online standards manuals, too).
Posted August 07, 2013
Perhaps the most common question I am asked by readers is "How can I become a DBA?" The question is actually not as simple as it seems and there are many different aspects to the answer. Sometimes it is an application programmer who wants to become a DBA. Sometimes a DBA for one DBMS wants to support a different DBMS; for example, a SQL Server DBA wants to move to become an Oracle DBA. And other times someone with no IT background at all asks the question. The answer is different for all of these folks.
Posted July 09, 2013
Although storage management can be an afterthought for the DBA, it really shouldn't be. Storage issues are vitally important and, unless managed appropriately, they can be very costly. The cost of managing storage can be as much as 10 times higher than the initial cost of acquiring the storage, and the growth rate for disk storage was 37% between 1996 and 2007. Even so, it is common for storage-related issues to be relegated to the back burner by DBAs, but every database professional should understand modern storage basics.
Posted June 13, 2013
Data is not sedentary. Once data has been created, organizations tend to move it around to support many different purposes—different applications, different geographies, different users, different computing environments, and different DBMSs. Data is copied and transformed and cleansed and duplicated and stored many times throughout the organization. Different copies of the same data are used to support transaction processing and analysis; test, quality assurance, and operational systems; day-to-day operations and reporting; data warehouses, data marts, and data mining; and distributed databases. Controlling this vast sea of data falls on the DBA who uses many techniques and technologies to facilitate data movement and distribution.
Posted May 09, 2013
"Big data" and the impact of analytics on large quantities of data is a persistent meme in today's Information Technology market. One of the big questions looming in IT departments about big data is what, exactly, does it mean in terms of management and administration. Will traditional data management concepts such as data modeling, database administration, data quality, data governance, and data stewardship apply in the new age of big data? According to analysts at Wikibon, big data refers to datasets whose size, type and speed of creation make it impractical to process and analyze with traditional tools. So, given that definition, it would seem that traditional concepts are at the very least "impractical," right?
Posted April 10, 2013
When data professionals think about regulatory compliance, we tend to consider only data in our production databases. After all, it is this data that runs our business and that must be protected. So we work to implement database auditing to know who did what to which data when; or we tackle database security and data protection initiatives to protect our data from prying eyes; or we focus on improving data quality to ensure the accuracy of our processes.
Posted March 14, 2013
Enterprise developers these days are usually heads down, in the trenches working on in-depth applications using Java or .NET with data stored in SQL Server or Oracle or DB2 databases. But there are other options. One of them is FileMaker, an elegant database system and development platform that can be used to quickly build visually appealing and robust applications that run on Macs, Windows PCs, smartphones, and iPads.
Posted February 13, 2013
Each new year, this column looks back over the most significant data and database-related events of the previous year. Keeping in mind that this column is written before the year is over (in November 2012) to meet publication deadlines, let's dive into the year that was in data.
Posted January 03, 2013
A proper database design cannot be thrown together quickly by novices. A practiced and formal approach to gathering data requirements and modeling data is mandatory. This modeling effort requires a formal approach to the discovery and identification of entities and data elements. Data normalization is a big part of data modeling and database design. A normalized data model reduces data redundancy and inconsistencies by ensuring that the data elements are designed appropriately.
Posted December 06, 2012
Every organization that manages data using a DBMS requires a database administration group to ensure the effective use and deployment of the company's databases. And since most modern organizations of every size use a DBMS, most organizations have DBAs, or at least people who perform the on-going maintenance and optimization of the database infrastructure.
Posted November 13, 2012
As businesses push to reduce the data latency between analytical systems and operational systems, data warehouses begin to take on more of the character of a transactional system. For a data warehouse to deliver near real-time information, the choices generally are to update the warehouse more frequently or access data directly from operational systems. Either way, the push to reduce latency changes the nature of database performance to support the data warehouse. The main reason we created data warehouses in the first place was to separate resource-intensive analytical processing from shorter duration, but very frequent transaction processing. As the two worlds now come back together, the churn pressure on the database system can be significant.
Posted October 10, 2012
If you've worked with relational database systems for any length of time, you've probably participated in a discussion (argument?) about the topic of this month's column, surrogate keys. A great debate rages within the realm of database developers about the use of "synthetic" keys. And if you've ever Googled the term "surrogate key," you know the hornet's nest of opinions that swirls around on the topic. For those who haven't heard the term, here is my attempt at a quick summary: A surrogate key is a generated unique value that is used as the primary key of a database table; database designers tend to consider surrogate keys when the natural key consists of many columns, is very long, or may need to change.
Posted September 11, 2012
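Without taking a side in the surrogate key debate described above, the definition can be illustrated with a small sketch. It assumes a hypothetical customer table whose natural key (an email address) may need to change, and it uses SQLite's generated INTEGER PRIMARY KEY as the surrogate; other DBMS products would use identity columns or sequences instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# customer_id is the surrogate key: generated, meaningless, never changes.
# email is the natural key, still protected by a UNIQUE constraint.
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customer (customer_id))"
)

cur = conn.execute("INSERT INTO customer (email) VALUES (?)", ("pat@example.com",))
cust_id = cur.lastrowid  # the generated surrogate value
conn.execute("INSERT INTO orders (customer_id) VALUES (?)", (cust_id,))

# The natural key changes; no child rows need to be touched.
conn.execute(
    "UPDATE customer SET email = ? WHERE customer_id = ?",
    ("pat@example.org", cust_id),
)

joined = conn.execute(
    "SELECT c.email FROM orders AS o JOIN customer AS c USING (customer_id)"
).fetchall()
print(joined)
```

Keeping the UNIQUE constraint on the natural key is a design choice in this sketch; dropping it is one of the things critics of surrogate keys warn about.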
It is impossible to have missed the sweeping changes being thrust upon the data world due to regulatory compliance. But even if you've noticed, chances are that the sheer volume of regulations was too mind-boggling to fully digest. Compliance starts with the CEO, but it works its way down into the trenches, and impacts database administration. With that in mind, this month's column will offer a brief introduction to the regulatory landscape and its impact on database administration.
Posted August 09, 2012
Although data integrity is a pervasive problem, there are some data integrity issues that can be cleaned up using a touch of SQL. Consider the common data entry problem of extraneous spaces in a name field. Not only is it annoying, sometimes it can cause the system to ignore relationships between data elements.
Posted July 11, 2012
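The "touch of SQL" mentioned in the entry above might look something like the following sketch. It assumes the extraneous spaces are leading or trailing ones (interior spaces need different handling), and the customer and invoice tables are hypothetical. It runs against SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT)")
conn.execute("CREATE TABLE invoice (customer_name TEXT, amount INTEGER)")

# The customer name was keyed in with stray spaces around it.
conn.execute("INSERT INTO customer VALUES (?)", ("  Smith ",))
conn.execute("INSERT INTO invoice VALUES (?, ?)", ("Smith", 100))

MATCHES = (
    "SELECT COUNT(*) FROM customer AS c "
    "JOIN invoice AS i ON c.name = i.customer_name"
)

# Because of the spaces, the relationship between the rows is not found.
matches_before = conn.execute(MATCHES).fetchone()[0]

# Clean up only the rows that actually need it.
conn.execute("UPDATE customer SET name = TRIM(name) WHERE name <> TRIM(name)")
conn.commit()

matches_after = conn.execute(MATCHES).fetchone()[0]
print(matches_before, matches_after)
```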
Unless you plan for and issue regular COMMITs in your database programs, you will be causing locking problems. It is important for every programmer to issue COMMIT statements in all application programs where data is modified (INSERT, UPDATE, and DELETE). A COMMIT externalizes the modifications that occurred in the program since the beginning of the program or the last COMMIT. A COMMIT ensures that all modifications have been physically applied to the database, thereby ensuring data integrity and recoverability. Failing to code COMMITs in a data modification program is what I like to call "Bachelor Programming Syndrome" — in other words, fear of committing.
Posted June 13, 2012
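One common way to plan for regular COMMITs, as the entry above urges, is to commit after every so many modifications. The sketch below is a minimal illustration, not a prescription: the audit_log table, the load_in_batches function, and the commit_every value are all hypothetical, and the right commit frequency depends on your workload and restart requirements.

```python
import sqlite3


def load_in_batches(conn, rows, commit_every=1000):
    """Insert rows, issuing a COMMIT after every `commit_every` inserts."""
    inserted = 0
    for row in rows:
        conn.execute("INSERT INTO audit_log (id, msg) VALUES (?, ?)", row)
        inserted += 1
        if inserted % commit_every == 0:
            conn.commit()  # make the batch permanent and end the transaction
    conn.commit()  # commit the final, possibly partial, batch
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit_log (id INTEGER PRIMARY KEY, msg TEXT)")

loaded = load_in_batches(conn, ((n, "event") for n in range(2500)))
stored = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
print(loaded, stored, conn.in_transaction)
```

A single in-memory SQLite connection cannot show lock contention, so this only demonstrates the shape of the commit logic; the payoff comes on a shared DBMS where other programs are waiting on the same data.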
Every now and then some sage consultant will offer advice like "Large tables should be partitioned" or "Be sure to use static SQL for your applications with high volume transaction workloads." But how useful is this advice? What do they mean by large and high volume? Terms such as these are nebulous and ever changing. Just what is a large database today?
Posted May 09, 2012
An important aspect of database security is designing your applications to avoid SQL injection attacks. SQL injection is a form of web hacking whereby SQL statements are specified in the fields of a web form to cause a poorly designed web application to dump database content to the attacker. Stories abound in the news where SQL injection was used for nefarious purposes. Several high-profile cases over the past few years impacted government websites, Microsoft in the U.K., a Swedish election, PBS (the Public Broadcasting Service), and Lady Gaga's website (among many others).
Posted April 11, 2012
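To show what the SQL injection entry above is describing, here is a minimal sketch contrasting a query built by string concatenation with a parameterized one. The account table and the "form input" are invented for illustration, and the example uses SQLite via Python's sqlite3 module; the same principle applies to any DBMS and driver that supports parameter markers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (username TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# What an attacker might type into a web form field instead of a username.
form_input = "nobody' OR '1'='1"

# Poorly designed: the input becomes part of the SQL text, so the
# injected OR clause makes the predicate true for every row.
unsafe_sql = "SELECT secret FROM account WHERE username = '" + form_input + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Better: a parameter marker keeps the input as data, never as SQL.
safe = conn.execute(
    "SELECT secret FROM account WHERE username = ?", (form_input,)
).fetchall()

print(len(leaked), len(safe))
```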
As a DBA, it is almost inevitable that you will change jobs several times during your career. When making a job change, you will obviously consider requirements such as salary, bonus, benefits, frequency of reviews, and amount of vacation time. However, you also should consider how the company "treats" its DBAs. Different organizations place different value on the DBA job. It is imperative to your career development that you scout for progressive organizations that understand the complexity and ongoing learning requirements for the position.
Posted March 07, 2012
Many types of data change over time, and different users and applications have requirements to access data at different points in time. A traditional DBMS stores data that is implied to be valid at the current point in time; it does not track the past or future states of the data. For some, the current, up-to-date values for the data are sufficient. But for others, accessing earlier versions of the data is needed. Temporal support makes it possible to store different database states and to query the data "as of" those different states.
Posted February 09, 2012
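The "as of" idea in the entry above can be approximated by hand, which helps show what built-in temporal support automates. The sketch below is an emulation with explicit period columns in SQLite, not the temporal syntax of any particular DBMS; the price_history table, the price_as_of function, and the far-future end date are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each row is valid from valid_from (inclusive) to valid_to (exclusive).
conn.execute(
    "CREATE TABLE price_history ("
    "item TEXT, price INTEGER, valid_from TEXT, valid_to TEXT)"
)
conn.executemany(
    "INSERT INTO price_history VALUES (?, ?, ?, ?)",
    [
        ("widget", 10, "2011-01-01", "2011-07-01"),
        ("widget", 12, "2011-07-01", "9999-12-31"),  # current row, open-ended
    ],
)


def price_as_of(conn, item, as_of):
    """Return the price that was valid on the ISO date `as_of`, or None."""
    row = conn.execute(
        "SELECT price FROM price_history "
        "WHERE item = ? AND valid_from <= ? AND ? < valid_to",
        (item, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


past = price_as_of(conn, "widget", "2011-03-15")
current = price_as_of(conn, "widget", "2012-01-01")
before_any = price_as_of(conn, "widget", "2010-06-01")
print(past, current, before_any)
```

With real temporal support, the DBMS maintains the period columns and prevents overlapping periods for you; in this emulation, that burden falls on the application.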
At the outset of each new year, I devote an edition of my column to review the significant data and database-related events of the previous year. Of course, to meet my deadlines, the column is written before the year is over (this column is being written in November 2011), so please excuse any significant news that may have happened late in December.
Posted January 11, 2012
In a world replete with regulations and threats, organizations today have to go well beyond just securing their data. Protecting this most valuable asset means that companies have to perpetually monitor their systems in order to know who did exactly what, when, and how to their data.
Posted December 01, 2011
Being a successful database administrator requires far more than technical acumen and database knowledge. DBAs should be armed with a proper attitude as well as sufficient fortitude and personality before attempting to practice database administration. Gaining the technical know-how is important, yes, but there are many sources that offer technical guidance for DBAs. The non-technical aspects of DBA are just as challenging, though. So with that in mind, this month's column will offer 10 "rules of thumb" for DBAs to follow as they improve their soft skills.
Posted November 10, 2011
Every professional programmer (and DBA) should have a library of books on SQL fundamentals. There are many SQL titles to choose from, and a lot of them are very good. But this month's column will outline four SQL books that should be on every database professional's bookshelf.
Posted October 15, 2011
Cost containment is an important IT department goal in this day and age of financial austerity. Every decision regarding your computing environment must be weighed not only against the value it can deliver to your organization but also against its cost to procure, implement, and maintain. If a positive return on investment cannot be rapidly delivered, then the software (or hardware) won't be adopted.
Posted September 14, 2011
Often, when the business of data management frustrates me, I look for inspiration in what may seem at first glance to be odd places. For instance, I think the Lewis Carroll "Alice in Wonderland" books offer sage advice for our particular industry.
Posted August 11, 2011
Simplification is important in today's era of increasing complexity and ever-changing software environments. A key component of simplification is to remember the basics and apply some elementary rules and practices to your database environment. Many problems arise because we don't keep track of the things most of us already know. I stumbled upon the idea for this month's column after recalling Malcolm Gladwell's excellent business book Blink: The Power of Thinking Without Thinking. In this book, Gladwell offers up case studies and examples depicting the benefit of our "adaptive unconscious" - a powerful innate ability that provides us with instant and sophisticated information.
Posted July 07, 2011