"Big data" and the impact of analytics on large quantities of data is a persistent meme in today's Information Technology market. One of the big questions looming in IT departments about big data is what, exactly, does it mean in terms of management and administration. Will traditional data management concepts such as data modeling, database administration, data quality, data governance, and data stewardship apply in the new age of big data? According to analysts at Wikibon, big data refers to datasets whose size, type and speed of creation make it impractical to process and analyze with traditional tools . So, given that definition, it would seem that traditional concepts are at the very least "impractical," right?
Posted April 10, 2013
When data professionals think about regulatory compliance we tend to consider only data in our production databases. After all, it is this data that runs our business and that must be protected. So we work to implement database auditing to know who did what to which data when; or we tackle database security and data protection initiatives to protect our data from prying eyes; or we focus on improving data quality to ensure the accuracy of our processes.
Posted March 14, 2013
Enterprise developers these days are usually heads down, in the trenches working on in-depth applications using Java or .NET with data stored in SQL Server or Oracle or DB2 databases. But there are other options. One of them is FileMaker, an elegant database system and development platform that can be used to quickly build visually appealing and robust applications that run on Macs, Windows PCs, smartphones, and iPads.
Posted February 13, 2013
Each new year, this column looks back over the most significant data and database-related events of the previous year. Keeping in mind that this column is written before the year is over (in November 2012) to meet publication deadlines, let's dive into the year that was in data.
Posted January 03, 2013
A proper database design cannot be thrown together quickly by novices. A practiced and formal approach to gathering data requirements and modeling data is mandatory. This modeling effort requires a formal approach to the discovery and identification of entities and data elements. Data normalization is a big part of data modeling and database design. A normalized data model reduces data redundancy and inconsistencies by ensuring that the data elements are designed appropriately.
Posted December 06, 2012
Every organization that manages data using a DBMS requires a database administration group to ensure the effective use and deployment of the company's databases. And since most modern organizations of every size use a DBMS, most organizations have DBAs, or at least people who perform the on-going maintenance and optimization of the database infrastructure.
Posted November 13, 2012
As businesses push to reduce the data latency between analytical systems and operational systems, data warehouses begin to take on more of the character of a transactional system. For a data warehouse to deliver near real-time information, the choices generally are to update the warehouse more frequently or access data directly from operational systems. Either way, the push to reduce latency changes the nature of database performance to support the data warehouse. The main reason we created data warehouses in the first place was to separate resource-intensive analytical processing from shorter duration, but very frequent transaction processing. As the two worlds now come back together, the churn pressure on the database system can be significant.
Posted October 10, 2012
If you've worked with relational database systems for any length of time, you've probably participated in a discussion (argument?) about the topic of this month's column, surrogate keys. A great debate rages within the realm of database developers about the use of "synthetic" keys. And if you've ever Googled the term "surrogate key," you know the hornet's nest of opinions that swirls around on the topic. For those who haven't heard the term, here is my attempt at a quick summary: A surrogate key is a generated unique value that is used as the primary key of a database table; database designers tend to consider surrogate keys when the natural key consists of many columns, is very long, or may need to change.
Posted September 11, 2012
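As a rough illustration of the surrogate key summary above, the following sketch uses Python's built-in sqlite3 module. The customer table, its columns, and the helper functions are hypothetical and not taken from the column; the point is only that the generated key stays stable while the multi-column natural key changes.

```python
import sqlite3


def build_customer_table() -> sqlite3.Connection:
    """Create a hypothetical customer table keyed by a surrogate value."""
    conn = sqlite3.connect(":memory:")
    # customer_id is the surrogate key: generated by the DBMS, with no
    # business meaning. The natural key (country_code, tax_number) is kept
    # unique so duplicate customers are still rejected.
    conn.execute(
        """
        CREATE TABLE customer (
            customer_id  INTEGER PRIMARY KEY,
            country_code TEXT NOT NULL,
            tax_number   TEXT NOT NULL,
            name         TEXT NOT NULL,
            UNIQUE (country_code, tax_number)
        )
        """
    )
    return conn


def add_customer(conn, country_code, tax_number, name) -> int:
    """Insert a row and return the generated surrogate key."""
    cur = conn.execute(
        "INSERT INTO customer (country_code, tax_number, name) VALUES (?, ?, ?)",
        (country_code, tax_number, name),
    )
    conn.commit()
    return cur.lastrowid


def change_tax_number(conn, customer_id, new_tax_number) -> None:
    """Change part of the natural key; the surrogate key is untouched."""
    conn.execute(
        "UPDATE customer SET tax_number = ? WHERE customer_id = ?",
        (new_tax_number, customer_id),
    )
    conn.commit()
```

Any table that references customer_id would be unaffected by change_tax_number, which is the usual argument made in favor of surrogate keys. Keeping the UNIQUE constraint on the natural key is a design choice in this sketch, not something the column prescribes.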
It is impossible to have missed the sweeping changes being thrust upon the data world due to regulatory compliance. But even if you've noticed, chances are that the sheer volume of regulations was too mind-boggling to fully digest. Compliance starts with the CEO, but it works its way down into the trenches, and impacts database administration. With that in mind, this month's column will offer a brief introduction to the regulatory landscape and its impact on database administration.
Posted August 09, 2012
Although data integrity is a pervasive problem, there are some data integrity issues that can be cleaned up using a touch of SQL. Consider the common data entry problem of extraneous spaces in a name field. Not only is it annoying, sometimes it can cause the system to ignore relationships between data elements.
Posted July 11, 2012
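The extraneous-spaces problem mentioned above can be sketched with a single UPDATE statement. This is a minimal example, assuming SQLite via Python's sqlite3 module and a hypothetical employee table with a last_name column; the column itself may use different SQL, and other DBMSs can compare trailing spaces differently.

```python
import sqlite3


def make_employee_table() -> sqlite3.Connection:
    """Build a hypothetical table with some badly keyed names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (last_name TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO employee (last_name) VALUES (?)",
        [(" Smith",), ("Jones  ",), ("Brown",)],
    )
    conn.commit()
    return conn


def trim_names(conn: sqlite3.Connection) -> int:
    """Strip leading and trailing spaces; return the number of rows fixed."""
    # The WHERE clause limits the update to rows that actually need it.
    # In SQLite, 'Jones  ' and 'Jones' compare as different strings; a DBMS
    # that pads strings before comparing may need a length-based test instead.
    cur = conn.execute(
        "UPDATE employee SET last_name = TRIM(last_name) "
        "WHERE last_name <> TRIM(last_name)"
    )
    conn.commit()
    return cur.rowcount
```

Calling trim_names on the sample table changes only the two rows with stray spaces and leaves the clean row alone.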
Unless you plan for and issue regular COMMITs in your database programs, you will be causing locking problems. It is important for every programmer to issue COMMIT statements in all application programs where data is modified (INSERT, UPDATE, and DELETE). A COMMIT externalizes the modifications that occurred in the program since the beginning of the program or the last COMMIT. A COMMIT ensures that all modifications have been physically applied to the database, thereby ensuring data integrity and recoverability. Failing to code COMMITs in a data modification program is what I like to call "Bachelor Programming Syndrome" — in other words, fear of committing.
Posted June 13, 2012
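One way to picture the regular COMMITs described above is a batch program that commits after a fixed number of modifications. The sketch below assumes Python's sqlite3 module, a hypothetical audit_log table, and an arbitrary commit interval of 100 rows; the right interval for a real workload is not something the column specifies here.

```python
import sqlite3


def make_log_table() -> sqlite3.Connection:
    """Create a hypothetical table for the batch load."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE audit_log (message TEXT NOT NULL)")
    conn.commit()
    return conn


def load_in_batches(conn, messages, commit_every=100) -> int:
    """Insert every message, committing periodically; return the commit count."""
    commits = 0
    pending = 0
    for message in messages:
        conn.execute("INSERT INTO audit_log (message) VALUES (?)", (message,))
        pending += 1
        if pending >= commit_every:
            # End the unit of work so its changes become permanent and any
            # locks held on behalf of the transaction can be released.
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the final partial batch.
        conn.commit()
        commits += 1
    return commits
```

With 250 messages and the default interval, the loop issues two full-batch commits and one final commit for the remaining 50 rows.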
Every now and then some sage consultant will offer advice like "Large tables should be partitioned" or "Be sure to use static SQL for your applications with high volume transaction workloads." But how useful is this advice? What do they mean by large and high volume? Terms such as these are nebulous and ever changing. Just what is a large database today?
Posted May 09, 2012
An important aspect of database security is designing your applications to avoid SQL injection attacks. SQL injection is a form of web hacking whereby SQL statements are specified in the fields of a web form to cause a poorly designed web application to dump database content to the attacker. Stories abound in the news where SQL injection was used for nefarious purposes. Several high-profile cases over the past few years impacted government websites, Microsoft in the U.K., a Swedish election, PBS (the Public Broadcasting System), and Lady Gaga's website (among many others).
Posted April 11, 2012
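The attack described above can be sketched in a few lines. This example assumes Python's sqlite3 module and a hypothetical account table; lookup_unsafe builds its SQL by pasting form input into the statement, while lookup_safe passes the same input as a bound parameter, which is one common defense and not necessarily the only one the column recommends.

```python
import sqlite3


def make_account_table() -> sqlite3.Connection:
    """Create a hypothetical table holding data an attacker wants."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (username TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    conn.commit()
    return conn


def lookup_unsafe(conn, username):
    """Vulnerable: the input becomes part of the SQL text itself."""
    query = f"SELECT username, secret FROM account WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, username):
    """Safer: the input is bound as data and never parsed as SQL."""
    return conn.execute(
        "SELECT username, secret FROM account WHERE username = ?",
        (username,),
    ).fetchall()
```

Passing the form value `x' OR '1'='1` to lookup_unsafe turns the WHERE clause into a condition that is true for every row, so the whole table is returned. The same value passed to lookup_safe is treated as an odd-looking username and matches nothing.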
As a DBA, it is almost inevitable that you will change jobs several times during your career. When making a job change, you will obviously consider requirements such as salary, bonus, benefits, frequency of reviews, and amount of vacation time. However, you also should consider how the company "treats" their DBAs. Different organizations place different value on the DBA job. It is imperative to your career development that you scout for progressive organizations that understand the complexity and ongoing learning requirements for the position.
Posted March 07, 2012
Many types of data change over time, and different users and applications have requirements to access data at different points in time. A traditional DBMS stores data that is implied to be valid at the current point in time; it does not track the past or future states of the data. For some, the current, up-to-date values for the data are sufficient. But for others, accessing earlier versions of the data is needed. Temporal support makes it possible to store different database states and to query the data "as of" those different states.
Posted February 09, 2012
At the outset of each new year, I devote an edition of my column to review the significant data and database-related events of the previous year. Of course, to meet my deadlines, the column is written before the year is over (this column is being written in November 2011), so please excuse any significant news that may have happened late in December.
Posted January 11, 2012
In a world replete with regulations and threats, organizations today have to go well beyond just securing their data. Protecting this most valuable asset means that companies have to perpetually monitor their systems in order to know who did exactly what, when and how - to their data.
Posted December 01, 2011
Being a successful database administrator requires far more than technical acumen and database knowledge. DBAs should be armed with a proper attitude as well as sufficient fortitude and personality before attempting to practice database administration. Gaining the technical know-how is important, yes, but there are many sources that offer technical guidance for DBAs. The non-technical aspects of DBA are just as challenging, though. So with that in mind, this month's column will offer 10 "rules of thumb" for DBAs to follow as they improve their soft skills.
Posted November 10, 2011
Every professional programmer (and DBA) should have a library of books on SQL fundamentals. There are many SQL titles to choose from, and a lot of them are very good. But this month's column will outline four SQL books that should be on every database professional's bookshelf.
Posted October 15, 2011
Cost containment is an important IT department goal in this day and age of financial austerity. Every decision regarding your computing environment must be weighed not only against the value it can deliver to your organization but also against its cost to procure, implement, and maintain. If a positive return on investment cannot be rapidly delivered, then the software (or hardware) won't be adopted.
Posted September 14, 2011
Often, when the business of data management frustrates me, I look for inspiration in what may seem at first glance to be odd places. For instance, I think the Lewis Carroll "Alice in Wonderland" books offer sage advice for our particular industry.
Posted August 11, 2011
Simplification is important in today's era of increasing complexity and ever-changing software environments. A key component of simplification is to remember the basics and apply some elementary rules and practices to your database environment. Many problems arise because we don't keep track of the things most of us already know. I stumbled upon the idea for this month's column after recalling Malcolm Gladwell's excellent business book Blink: The Power of Thinking Without Thinking. In this book, Gladwell offers up case studies and examples depicting the benefit of our "adaptive unconscious" - a powerful innate ability that provides us with instant and sophisticated information.
Posted July 07, 2011
The DBA, often respected as a database guru, is just as frequently criticized as a curmudgeon with technical knowledge but limited people skills. Most programmers have their favorite DBA story. You know, those anecdotes that begin with "I had a problem ..." and end with "... and then he told me to stop bothering him and read the manual." DBAs do not have a warm and fuzzy image. This may have something to do with the nature and scope of the job. The DBMS spans the enterprise, effectively placing the DBA on call for the applications of the entire organization. Database issues can require periods of quiet reflection and analysis to resolve, so DBAs generally do not want to be disturbed. But their quiet time is usually less than quiet; constant interruptions to answer questions and solve problems are a daily fact of life.
Posted June 08, 2011
Design reviews are an important facet of the system development lifecycle for database applications. It is during the design review that all aspects of the database and application code are reviewed for efficiency, effectiveness, and accuracy. It is imperative that all database applications, regardless of their size, are reviewed to assure that the application was designed properly, efficient coding techniques were used, and the database is accessed and modified correctly and efficiently. The design review is an important process for checking the validity of design decisions and correcting errors before applications and databases are promoted to production status.
Posted May 12, 2011
What will you do when you find out you're about to acquire or consolidate with another firm or division? Are you aware of the risks you may be inheriting? What data is going to demand the highest availability? What IT regulations will you have to address and how do you know if existing controls already address them? Here are 10 "data health" checks you can conduct to help answer these questions before giving a green light to an M&A or consolidation.
Posted April 05, 2011
One of the ongoing goals of database administration is to minimize downtime and improve availability. If the DBMS is down, data cannot be accessed. If the data is not available, applications cannot run. And if your applications cannot run, your company is losing business. Lost business translates into lower earnings and perhaps even a lower stock valuation for your company. These are all detrimental to the business and therefore, the DBA is called upon to do everything in his or her power to ensure that databases are kept online and operational.
Posted March 09, 2011
One of the most fertile grounds for disagreement between database professionals is the appropriate usage of views. Some analysts promote the liberal creation and usage of views, whereas others preach a more conservative approach. When properly implemented and managed, views can be fantastic tools that help to ease data access and simplify development. Although views are simple to create and implement, few organizations take a systematic and logical approach to view creation. And therein lies the controversy. A strategic and reasonable policy guiding the creation and maintenance of views is required to avoid a muddled and confused mish-mash of view usage. Basically, views are very useful when implemented wisely, but can be an administrative burden if implemented without planning.
Posted February 02, 2011
As my regular readers know, toward the end of each year I devote an edition of my column to review the significant data and database-related events of the past year. Of course, to meet my deadlines, the column is written before the year is over (this column is being written in November 2010), so please excuse any significant news that may have happened late in the year!
Posted January 07, 2011
Assuring optimal performance of database applications starts with coding properly formulated SQL. Poorly written SQL and application code is the cause of most performance problems. As much as 75% of poor relational performance is caused by "bad" SQL and application code. But writing efficient SQL statements can be tricky. This is especially so for programmers new to a relational database environment or those who have never been trained to properly write SQL.
Posted November 30, 2010
SQL is the lingua franca for modifying and reading database data and any DBA worth his (or her) paycheck should be proficient in writing SQL queries. But SQL is a flexible and feature-rich language, so there are always things that can be learned - even by senior technicians. As such, this month's column discusses several interesting SQL queries that you can put in your bag of tricks for future use.
Posted November 09, 2010
If you are a DBA, or a database performance analyst, chances are that you deal with performance-related issues regarding your database systems every day of the week. But have you ever stopped for a moment and tried to define what you mean when you say "database performance"? Doing so can be a worthwhile exercise, if only to organize your thoughts on the matter. Think about it; don't we need a firm definition of database performance before we can attempt to manage the performance of our databases?
Posted October 12, 2010
Database systems require data files to store the data under management. These files, or data sets, reside on storage media. So storage management should be a key part of the database operations required of a database administrator (DBA). Unfortunately, storage is sometimes relegated to an afterthought; after all, don't we have storage administrators who deal with our disk arrays? But this way of thinking is misguided. To succeed, database administration and storage administration need to cooperate and work together.
Posted September 07, 2010
What is the most difficult thing about acquiring enterprise software? If you are like most IT technicians, your first inclination was probably something related to cost justification. Let's face it, enterprise software typically is very expensive ... and eventually, something will need to bring costs more in line with value.
Posted August 10, 2010
In this issue's column I'll be providing a fundamental introduction to database and database management concepts. Many of you may think that you understand the basic concepts and fundamentals of database technology. But quite a few of you likely do not, so please do not skip over this. First of all, what is a database? DB2 is not a database; neither are Informix, Oracle and SQL Server. Each of these is a DBMS, or Database Management System. You can use DB2 (or Informix or SQL Server) to create a database, but DB2, in and of itself, is not a database.
Posted July 12, 2010
Before we go any further, let me briefly answer the question posed in this column's title: "No Way!" OK ... with that out of the way, let's discuss the issue ... Every so often, some industry pundit gets his opinions published by declaring that "Database administrators are obsolete" or that "we no longer need DBAs." Every time I hear this, it makes me shake my head sadly as I regard just how gullible IT publications can be.
Posted June 07, 2010
Have you heard about stream computing? Basically, it involves the ingestion of data - structured or unstructured - from arbitrary sources and the processing of it without necessarily persisting it. Any digitized data is fair game for stream computing. As the data streams it is analyzed and processed in a problem-specific manner. The "sweet spot" applications for stream computing are situations in which devices produce large amounts of instrumentation data on a regular basis. The data is difficult for humans to interpret easily and is likely to be too voluminous to be stored in a database somewhere. Examples of types of data that are well-suited for stream computing include healthcare, weather, telephony, stock trades, and so on.
Posted May 10, 2010
As you work to protect your data in this day-and-age of data breaches and regulatory compliance, technology and software solutions to data and database security spring to the top of most people's minds. This is to be expected because, after all, most of our data is stored on computers so technology and software are required to protect the data from unauthorized access. This is a good thing: Technology is a crucial component of protecting your valuable business data. But it is not the only thing.
Posted April 07, 2010
The continuing acceptance and growing usage of Linux as an enterprise computing platform has enlivened the open source community. The term "open source" refers to software that users are free to run, copy, distribute, study, change, and improve. Often "open source" gets misinterpreted to mean free software. This is understandable, but the open source concept of free is closer to liberty than it is to no charge.
Posted March 04, 2010
Have you ever read those inserts that your bank, credit card providers, insurance company, mutual fund company, and others slip inside your statements and bills? We all get them. You know, those flimsy pieces of paper, printed in small type and written in convoluted English. I have started collecting them - sort of like baseball cards. But I doubt they'll ever be valuable. They are entertaining, though ... and disheartening.
Posted February 09, 2010
My whole career has been based on managing data and producing information and, as such, I am intrigued with the issue of information overload - or the perception that there is too much information. A former boss called me an information bottom-feeder because I always seemed to have a nugget of information or two that applied to her projects and quests. You see, I'm of the opinion that you can never have enough information - at least regarding those things you care about.
Posted January 11, 2010
As per my regular custom, this final DBA Corner column of the year is a review of the most significant data and database-related events of the year. Of course, to meet my deadlines, it is October 2009 as I write this, so please excuse any significant news that may have happened late in the year!
Posted December 14, 2009
How many times have you been surfing the web only to encounter a form that requests a slew of personal information before you can continue on? You know what I'm talking about. A company markets a white paper or poll results or something else that intrigues you, so you click on the link, and bang, there you are. You don't have the information you wanted yet, but if you just fill out this form then you'll be redirected to the information.
Posted November 11, 2009
As my regular readers know, I am an avid reader, especially of technology books. And every now and then I review some of the more interesting database-related books in the DBA Corner column.
Posted October 13, 2009
If you use an IBM z Series mainframe you've undoubtedly heard about zIIPs and zAAPs and other specialty processors. But maybe you haven't yet truly examined what they are, what they do, and why they exist. So, with that in mind, let's take a brief journey into the world of specialty processors.
Posted September 14, 2009
As the U.S. markets strive for a recovery in 2009, many IT managers are cringing at the thought of managing their data through what may be a record year of mergers and acquisitions. Managing an ever-increasing mountain of data is not a simple task in the best of times, but doing so while combining formerly separate entities during an economic slowdown can be a monumental challenge.
Posted August 14, 2009
Protecting the data in our enterprise databases is extremely important. But what exactly does that mean? Oh, at one level we have the database authorization and roles built directly into the DBMS products. You know what I'm talking about: GRANT and REVOKE statements that can be used to authorize access to database objects, resources and statements. Many organizations have adopted policies and products to migrate this type of security out of the DBMS and into their operating system security software.
Posted July 13, 2009
Before we even begin this month's column I had better define what I mean by a "black box." Simply put, a black box is a database access program that sits between your application programs and the DBMS. It is designed so that all application programs call the black box for data instead of writing SQL statements that are embedded into a program. The general idea behind such a contraption is that it will simplify application development because programmers will not need to know how to write SQL. Instead, programmers call the black box to request data. SQL statements become calls, and every programmer knows how to code a call, right?
Posted June 15, 2009
The economy is a wreck and things will likely get worse before they improve. Unemployment is even worse; almost 600,000 jobs were lost in January 2009, sending the unemployment rate to 7.6%, the highest it has been in 16 years. So many data professionals are out there looking for their next challenge … and more probably will be job hunting before the year is out.
Posted May 15, 2009
Although the most important aspect of DBA tool selection is functionality and the way it satisfies your needs, the stability of the vendor that provides the product is also important.
Posted April 15, 2009
Have you noticed that dynamic SQL is more popular today than ever before? There are a number of factors contributing to the success of dynamic SQL. Commercial off-the-shelf applications, such as SAP, Siebel, and PeopleSoft, utilize dynamic SQL exclusively. In many cases, too, dynamic SQL is the default choice for in-house application development.
Posted March 15, 2009