Data Modeling

Data Modeling and Database Design and Development – including popular approaches such as Agile and Waterfall – provide the basis for the visualization and management of business data in support of initiatives such as Big Data Analytics, Business Intelligence, Data Governance, Data Security, and other enterprise-wide data-driven objectives.

Data Modeling Articles

In writing a definition for an entity, an attribute, or any other element within a database design, the desired end is descriptive text that is clear, factual, and concise. Semantics are an ambiguous and often painful tool to employ. Balancing the need for clarity against the desire to avoid redundancy can be a juggling act that is hard to accomplish. One might not easily recognize what is complete, versus what is lacking, versus what has gone too far. Even so, if within a definition one finds oneself listing valid values and decoding each value's meaning, then one has likely already moved beyond what is "concise." Lists of values easily add bulk and verbiage to a definition, yet such lists do not usually increase the quality of the definition.

Posted November 13, 2012

The beauty of a truly wonderful database design is its ability to serve many masters. And good database designers are able to empathize with those who will use their designs. In business intelligence settings, three perspectives deserve consideration when composing designs.

Posted November 06, 2012

Software operates the products and services that we use and rely on in our daily lives. It is often the competitive differentiation for the business. As software increases in size, complexity, and importance to the business, so do the business demands on development teams. Developers are increasingly accountable to deliver more innovation, under shorter development cycles, without negatively impacting quality. Compounding this complexity is today's norm of geographically distributed teams and code coming in from third-party teams. With so many moving parts, it's difficult for management to get visibility across their internal and external supply chain. Yet, without early warning into potential quality risks that could impact release schedules or create long term technical debt, there may be little time to actually do something about it before the business or customers are impacted.

Posted October 24, 2012

CA Technologies has announced a major new release of the ERwin data modeling solution. This new release, the second in less than a year, provides a collaborative data modeling environment for managing enterprise data through an intuitive, graphical interface. It helps improve data re-use, optimize system quality, accelerate time-to-benefit, and enable appropriate information governance—key objectives for IT organizations serving companies in today's highly competitive and closely regulated markets.

Posted October 16, 2012

It is an understatement to say we're witnessing an example of Moore's Law — which states the number of transistors on a chip will double approximately every two years — as we seek to manage the explosion of big data. Given the impact this new wealth of information has on hundreds of millions of business transactions, there's an urgent need to look beyond traditional insight-generation tools and techniques. It's critical we develop new tools and skills to extract the insights that organizations seek through predictive analytics.

Posted October 16, 2012

An educational and interactive webcast will review the findings of the 2012 IOUG Test, Development and QA Survey and discuss the best practices and issues that it highlights. This IOUG study was conducted by Unisphere Research, a division of Information Today, Inc., and sponsored by IBM. Presented by Kimberly Madia, WW product marketing manager at IBM, and Thomas Wilson, president and CEO, Unisphere Research, the webcast will be held Thursday, September 27, from 12 - 1 PM CDT. Attendees of the webcast will receive a copy of the study report.

Posted September 26, 2012

It seems easy to fall into a state where projects and activities assume such a soft focus that routine takes control: one simply performs necessary tasks automatically, no questions are raised about what is moving through the work-life production line, and everyone is essentially asleep at the switch. Certainly, we may keep one eye open to ensure that, within a broad set of parameters, all is well; but as long as events are basically coloring inside the borders, we continue allowing things to just move along. In this semi-somnambulant state we can easily add columns to tables, or even add new entities, tables, triggers, and procedures to our databases, and then, at some point down the road, have someone turn to us and ask, "Why this?" or, "What does this really mean?" At that point, we surprise ourselves with the discovery that the only answer we have is that someone else told us it was what we needed; we do not really understand why it was needed.

Posted September 11, 2012

The whole world can be divided into two groups: splitters and lumpers. Design battles are waged across conference rooms as debates rage over whether to split or to lump. Splitters take a group of items and divide them into sub-groups and sub-sub-groups, occasionally going so far that each lowest level becomes a group of one. On the other side of the design fence, lumpers combine items until everything is abstracted into group objects covering very broad territory, such as a "Party" construct or, ultimately, an "Object" object. Within data modeling, arguments arise over whether to sub-type an entity, or lumping is discussed as the grain of a multidimensional fact is proposed. This debate underlies much of the decision-making involved in determining what domains to create within a data model. The split-versus-lump issue is ubiquitous and universal. The question arises across many kinds of choices beyond the entity definition, table grain, or domain grain mentioned in the previous examples; it is at the heart of deliberations over establishing functions, overriding methods, or composing an organizational structure.

Posted July 11, 2012

SAP has announced that the PowerBuilder Developers Conference will be held October 15-19, 2012, at the Venetian Resort Hotel in Las Vegas, concurrently with SAP TechEd Las Vegas 2012. The conference will be comprised of an opening keynote at SAP TechEd, followed by a PowerBuilder general session and PowerBuilder technical breakout sessions.

Posted June 27, 2012

In the dim, dark past of data warehousing, there was a time when the argument was put forward that "history does not change." It was posited that once a piece of data was received by the data warehouse, it was sacrosanct and nonvolatile. A fact record, once processed, was to remain unchanged forever. Dimensions, due to their descriptive nature, could be changed following the prescribed Type 1, 2, or 3 update strategies, but that was all. The expectation was that, by their very nature, fact tables would become huge, and in being huge would give update performance so poor that updates would be virtually impossible to enact.
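The Type 2 strategy mentioned above can be sketched briefly. This is a minimal illustration using SQLite, with an invented customer dimension: rather than overwriting a changed attribute, the current row is expired and a new version is inserted, so history survives. Table and column names here are hypothetical, not from the column itself.

```python
import sqlite3

# Hypothetical Type 2 slowly changing dimension: preserve history by
# expiring the current row and inserting a new version.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_key   INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id    TEXT NOT NULL,   -- natural (business) key
        city           TEXT NOT NULL,
        effective_date TEXT NOT NULL,
        end_date       TEXT,            -- NULL marks the current version
        is_current     INTEGER NOT NULL DEFAULT 1
    )
""")
conn.execute("""
    INSERT INTO dim_customer (customer_id, city, effective_date)
    VALUES ('C100', 'Boston', '2011-01-01')
""")

def apply_type2_change(conn, customer_id, new_city, change_date):
    """Expire the current version of the row, then insert the new one."""
    conn.execute("""
        UPDATE dim_customer
        SET end_date = ?, is_current = 0
        WHERE customer_id = ? AND is_current = 1
    """, (change_date, customer_id))
    conn.execute("""
        INSERT INTO dim_customer (customer_id, city, effective_date)
        VALUES (?, ?, ?)
    """, (customer_id, new_city, change_date))

apply_type2_change(conn, "C100", "Chicago", "2012-03-15")

rows = conn.execute("""
    SELECT city, end_date, is_current FROM dim_customer
    WHERE customer_id = 'C100' ORDER BY customer_key
""").fetchall()
print(rows)  # both the expired Boston row and the current Chicago row remain
```

A Type 1 change, by contrast, would simply be an UPDATE of the city column in place, destroying the prior value.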

Posted June 13, 2012

It seems only reasonable that what one person can do, others can learn. On the other hand, taking people through training does not usually result in the creation of great new database administrators (DBAs). It often appears as if those who are exceptional at the craft operate at higher levels as they dive into a problem. Can training alone provide folks with the attention to detail, the urge to keep digging, or the ability to recall minutiae that allow them to rise from simply holding the DBA title to becoming someone who is a great DBA? Or must the genetic potential exist first, such that one falls into the DBA occupation and astounds those around them? It is very hard to say with any degree of certainty whether great DBAs are made or born; yet again the battle between nature and nurture arises.

Posted June 06, 2012

Embarcadero Technologies has introduced a new version of its database management and development platform, DB Power Studio XE3, which offers enhancements to further improve the performance and availability of databases.

Posted June 06, 2012

Plans are underway for an event specifically focused on Sybase PowerBuilder and tools that will be separate from TechEd but held at the same time and location, according to Christine Weber, marketing manager, Events, at Sybase, an SAP company. There will also be close to 100 hours of sessions specifically focused on Sybase database and analytics products at SAP TechEd 2012, Weber tells 5 Minute Briefing. "It is a good portion and it is focused on the traditional kind of content that we have always done with Sybase." Sybase-specific content will include tips and tricks on how to use existing products, as well as previews of what's ahead in new product releases. Now that the call for papers has closed, Sybase is going through its approval process for the presentations. Well over 200 presentations were submitted - "a good problem to have," Weber notes.

Posted May 22, 2012

10gen, the company behind MongoDB, has announced its support for MongoDB with Node.js. This includes an official Node.js driver as well as commercial support from 10gen for MongoDB-backed applications developed with Node.js. Node.js joins the existing set of programming languages and environments 10gen supports, including Java, PHP, C#, Ruby, Python, C++, C, Perl, Scala, Haskell and Erlang. Launched in 2009 and sponsored by Joyent, JavaScript-based Node.js is designed to help developers build data-intensive, real-time applications that support large numbers of concurrent users and devices.

Posted May 01, 2012

Quest Software has released version 11.5 of its Toad for Oracle software, the flagship product in the Toad portfolio of productivity software for database developers, DBAs, and analysts. Drawing on community feedback from its two million users, Quest has introduced a number of new features and improvements, most notably a new social intelligence component. Toad for Oracle 11.5, Quest contends, will allow users to take advantage of the best ideas and practices from the community and further increase user productivity.

Posted April 25, 2012

Sybase has announced that Ford Motor Company will centralize all of its logical and physical modeling functions with SAP Sybase PowerDesigner, the data modeling software and metadata management solution for data, information, and enterprise architectures. The solution provides the capability to generate Data Description Language (DDL) for Ford Motor Company's database platforms, including leading databases such as Sybase ASE, DB2, SQL Server, Teradata, and Oracle.

Posted April 25, 2012

Solution development work is usually accomplished via projects, or a combination of programs and projects. This project perspective often leads to thoughts of documentation as project-owned. And while many documents are project-specific, such as timelines, resource plans, and such, not everything is project-specific. Unless projects are established in a fashion whereby each is very limited in scope to the creation or enhancement of a single application or system, specification and design documents belong to the final solution and not to the project.

Posted April 11, 2012

1010data, Inc., provider of an internet-based big data warehouse, has announced the launch of a new software tool that enables 1010data's customers to automatically segment and analyze huge consumer transaction databases and produce statistical models with specificity, even to the level of social groups, families and individuals. For the first stage of the launch, 1010data is making the tool available in an invitational beta release for retail, consumer goods, and mobile telecom companies. "In all consumer-driven industries, customers are demanding to be treated as individuals, not boomers, tweeners, or dinks - dual income, no kids," said Tim Negris, vice president of marketing at 1010data.

Posted March 22, 2012

Kalido, a provider of agile information management software, unveiled the latest release of the Kalido Information Engine, which helps organizations decrease the time for data mart migrations and consolidations. With this new release, customers will be able to import existing logical and physical models and taxonomies to build a more agile data warehouse. Enabling customers to take advantage of existing assets and investments "is going to dramatically reduce the time and the cost that it takes to bring together data marts into more of a data warehouse scenario," says John Evans, director of product marketing at Kalido.

Posted February 23, 2012

The cost for new development can often be easily justified. If a new function is needed, staffing a team to create such functionality and supporting data structures can be quantified and voted up or down by those controlling resources. Money can be found to build those things that move the organization forward; often, the expense may be covered by savings or increased revenue derived from providing the new services.

Posted December 01, 2011

It is not magic. Building a successful IT solution takes time. And that time is used in various ways: obtaining an understanding of the goal; mapping out what components are necessary and how those components interact; testing components and their interaction; and finally migrating those components into the production environment - otherwise known as analysis, design, development, testing, and deployment. Regardless of the methodology employed, these functions must always be addressed. Different approaches focus on differing needs and aspects. But any complete methodology must fill in all the blanks for accomplishing each of these tasks.

Posted November 10, 2011

When assembling a database design, one of the keys to success is consistency. There should be more than just similarity in the way things are named and in the manner in which tables or groups of tables are constructed; these elements should follow standards and practices that are documented and understood. If one relies on the idea that individual developers will simply look at existing tables and glean standards via osmosis as they add on or create new tables, then one does not actually have any standards at all.

Posted September 14, 2011

HP is offering a series of new software solutions designed to improve collaboration among application development and delivery teams. The new HP ALM software solutions include HP Service Virtualization 1.0, HP Application Lifecycle Intelligence (ALI), and HP Agile Accelerator 5.0. "Without a performance management system, it's difficult to measure success," Matthew Morgan, senior director of worldwide product marketing at HP Software, said at a press and blogger briefing at launch day for the product line. Application metrics "should be digitized and automated, and not sit on an Excel desktop."

Posted July 25, 2011

Rally, a vendor in Agile application lifecycle management (ALM) and a Gold-level member of Oracle PartnerNetwork, has announced that Rally Enterprise Edition 2011.2 has achieved Oracle Validated Integration with Oracle's Primavera P6 Enterprise Project Portfolio Management (EPPM) 8. The integration will allow project managers to more easily view and analyze critical data from Agile development teams, enabling them to adjust investment priorities in the organization's portfolio of IT projects.

Posted June 22, 2011

WhereScape, developer of WhereScape RED, an agile IDE for managing data warehouses, has introduced WhereScape 3D, a data-driven design tool for planning and reality-testing data warehousing and business intelligence projects. According to the vendor, using WhereScape 3D, organizations are able to reduce project risk by planning accurate, user-tested projects up front, in hours or days rather than weeks or months.

Posted June 13, 2011

Occasionally, one sees a data structure abomination. This atrocity involves an object of almost any type, in almost any database wherein the object has a start date but no end date. It is not that the finish date currently has no value and is null; it is that the end date does not even exist on the table structure. The stop date was never intended to exist. The object in question starts, but it doesn't ever end.
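The remedy the column implies can be sketched in a few lines: give the object an explicit, nullable end date, so "still active" is represented as an end date of NULL rather than being structurally impossible to express. The table and data below are invented for illustration, using SQLite.

```python
import sqlite3

# A hypothetical table that models both a start and a (nullable) end date.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE promotion (
        promo_id   INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        start_date TEXT NOT NULL,
        end_date   TEXT              -- NULL means the promotion is still open
    )
""")
conn.executemany(
    "INSERT INTO promotion VALUES (?, ?, ?, ?)",
    [(1, "Spring Sale", "2011-03-01", "2011-04-01"),
     (2, "Loyalty Bonus", "2011-05-01", None)],
)

# Rows active on a given date: started on or before it, and not yet ended.
as_of = "2011-06-08"
active = conn.execute("""
    SELECT name FROM promotion
    WHERE start_date <= ?
      AND (end_date IS NULL OR end_date > ?)
""", (as_of, as_of)).fetchall()
print(active)
```

Without the `end_date` column, the "active as of" question in the query above cannot even be asked; the object starts and, as far as the schema is concerned, never ends.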

Posted June 08, 2011

Alpine Data Labs, developer of Alpine Miner, a solution for big data predictive analytics, has received $7.5 million in Series A funding. In addition, after 15 months of product development, the company also announced its 10th production customer and its formal launch in the U.S. market. According to Anderson Wong, Alpine Labs CEO and co-founder, organizations cannot extract all the possible value from their data because it is growing faster than they can analyze it, they don't have enough resources with analytics expertise, and the tools they're using are too complex to get to the answers they need quickly.

Posted May 13, 2011

Oracle has submitted a proposal to the Eclipse Foundation to create a Hudson project in Eclipse and contribute the Hudson core code to that project. Hudson is an industry-leading open source "continuous integration" (CI) server that increases productivity by coordinating and monitoring executions of repeated jobs, making it easier for developers to integrate changes to the project and for users to obtain a fresh build.

Posted May 12, 2011

Dates are important. Without dates, how can anything be planned? However, due dates have been known to increase in importance in the delivery of software solutions. Sometimes the due date becomes such an overwhelming creature of importance that the date is more important than following best practices, more important than verifying that what is built is correct, more important than the solution team gaining a proper understanding of the work they are attempting to perform.

Posted April 05, 2011

Embarcadero has introduced DB PowerStudio for DB2 to provide DB2 developers and DBAs with a toolset that brings greater functionality and efficiency to their development, database administration and performance tuning tasks. According to Embarcadero, the new DB PowerStudio for DB2 combines several essential database tools into an aggressively priced suite that extends beyond the capabilities of IBM's database utilities. Following recent introductions of similar toolsets aimed at Oracle, SQL Server and Sybase ASE users, DB PowerStudio for DB2 meets the needs of both DB2 database developers and administrators, helping both improve the performance and availability of their databases.

Posted March 21, 2011

The understanding of object states and their transitions obviously is of great importance to the solution developers because as processes are built they will need to support each state and every possible transition. Additionally, knowledge of object states and transitions is of vital importance to the data modeler because the data must be persisted across each of those states, and often the state of an object needs to be easily identifiable. A data model minimally requires a reference table, along with the varying entities that reference that table (the foreign keys tracking an individual object's status). Specific states drive variations of required attributes or combinations of those attributes that apply to one state and not another. The logical definition of the database can identify these variations through the use of supertype/subtype constructs.
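The minimal pattern described above, a status reference table plus foreign keys on the entities that track state, can be sketched as follows. This is an illustrative example in SQLite with invented table and status names; a fuller model would also constrain which transitions are legal.

```python
import sqlite3

# A hypothetical status reference table and an entity whose foreign key
# tracks each individual object's state.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE order_status (
        status_code TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        status_code TEXT NOT NULL REFERENCES order_status(status_code)
    );
    INSERT INTO order_status VALUES
        ('NEW',     'Order received'),
        ('SHIPPED', 'Order shipped'),
        ('CLOSED',  'Order complete');
""")

# Valid states and transitions are accepted...
conn.execute("INSERT INTO customer_order VALUES (1, 'NEW')")
conn.execute("UPDATE customer_order SET status_code = 'SHIPPED' WHERE order_id = 1")

# ...but a move to an undefined state is rejected by the foreign key.
try:
    conn.execute("UPDATE customer_order SET status_code = 'LOST' WHERE order_id = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The supertype/subtype variations mentioned above would then layer state-specific attributes into separate subtype tables keyed to the common supertype.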

Posted March 09, 2011

CA Erwin Data Modeler r8, a solution for collaboratively visualizing and managing data across multiple systems, was announced by CA Technologies. Enhanced visualization and different ways to visualize data models in non-traditional formats is a main theme of the release, Donna Burbank, senior director of marketing, Data Management, CA Technologies, tells 5 Minute Briefing. The goal is to help ERwin, an established product with a 20-plus-year track record, to reach new audiences, in particular, business and non-technical users, as well as to provide improved workflow for traditional CA Erwin Data Modeler users.

Posted February 23, 2011

CA Technologies says it is now shipping a toolset for collaboratively visualizing and managing data across multiple systems. The product, CA Erwin Data Modeler r8, offers enhanced visualization and different ways to visualize data models in non-traditional formats. "We completely ripped out our old user interface and put in a new one," Donna Burbank, senior director of marketing, Data Management, CA Technologies, tells 5 Minute Briefing. With features like graphics, themes, fonts, and auto layout, she observes, "It is really more like a drawing tool on the front with the power of a data modeling tool behind it."

Posted February 22, 2011

A new toolset designed to streamline common and complex tasks associated with developing, administering and optimizing Oracle databases, has been announced by Embarcadero. The new toolset, called DB PowerStudio for Oracle, combines several of Embarcadero's most popular database tools into a suite that is intended to make it easier for Oracle developers and DBAs to shorten database development and administration times, and to improve the performance and availability of their databases. The toolset is available in a Developer Edition and a DBA Edition.

Posted February 17, 2011

Embarcadero is providing SQL Server developers and DBAs with a toolset designed to simplify many common and complex tasks associated with developing, administering and optimizing SQL Server databases. The new toolset, dubbed DB PowerStudio for SQL Server, combines several of Embarcadero's most popular database tools into a lower-cost studio offering. DB PowerStudio for SQL Server is available in two editions, a Developer Edition and a DBA Edition. Both editions include tools and capabilities including Rapid SQL, an integrated development environment that simplifies SQL scripting, query building, object management and version control in live databases or offline source code repositories. The overlapping Developer and DBA Editions reflect the converging roles of DBAs and developers in the database space, Scott Walz, senior director of product management for Embarcadero, tells 5 Minute Briefing.

Posted February 08, 2011

Oracle has announced a new version of its data modeling tool, Oracle SQL Developer Data Modeler 3.0. The new release supports collaborative development through new integration with popular open source version control software, as well as the incorporation of user-defined design rules and transformation scripts.

Posted February 02, 2011

Referential integrity helps manage data by enforcing validation between related entities. This enforcement follows the logical semantics behind the database design -- i.e., an employee can only work for an already defined department; a prescription can only be written by a health care practitioner with the proper authority. A foreign key on an Employee table rejects data when any attempt is made to insert or update a row with a department value that does not already exist as a department identifier within the Department table.
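The Employee/Department behavior described above can be demonstrated directly. This is a small sketch in SQLite (which, as an implementation quirk, enforces foreign keys only after the PRAGMA below is enabled); column names are illustrative.

```python
import sqlite3

# Department must exist before an Employee row may reference it.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Accounting');
""")

# An employee in an existing department is accepted...
conn.execute("INSERT INTO employee VALUES (1, 'Alice', 10)")

# ...but a row pointing at an undefined department is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Bob', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The same constraint also protects the parent side: deleting a department that still has employees fails unless a cascade or set-null action is declared.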

Posted February 02, 2011

When designing a system, an architect must conform to all three corners of the CIA (Confidentiality, Integrity, and Availability) triangle. System requirements for data confidentiality are driven not only by business rules but also by legal and compliance requirements. As such, data confidentiality (when required) must be preserved at any cost, irrespective of performance, availability, or any other implications. Integrity and Availability, the other two sides of the triangle, may allow some flexibility in design.

Posted January 07, 2011

Back in the 1970s, the ANSI/SPARC three-tiered model arose, foreshadowing a smooth intertwining of data and architectural design. The three-tier concept isolated the physical storage needs of data structures from the business's perception of those structures. The three levels comprised schemas labeled external, conceptual, and internal, with each level describing the data in focus from a different perspective.

Posted January 07, 2011

QSM Associates is shipping an updated version of the SLIM software lifecycle management solution, which incorporates support for Agile software development methodologies. A product of McLean, Virginia-based QSM, Inc., the new release, SLIM Suite 8.0, addresses development concerns with Agile methodologies, which emphasize iterative approaches working as small teams with business users.

Posted December 21, 2010

How does one know what one doesn't know? When evaluating what one knows, it is hard to know where to begin. The wise men say, "The more you know, the more you know you don't know." If one believes such commentary, what is known constitutes the tip of the proverbial iceberg. Databases have an easier time with such missing circumstances: if the rows of a database table are lacking referents, an outer join query filtering for NULLs can detail all the missing items. In developing and delivering projects, no such reference list exists for our minds to join against. Often, we do not know everything that needs to be done, particularly as a project starts. The difference between success and failure is not so much what one knows, but how one handles the gaps between what is known now and what needs to be known before one finishes.
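The outer-join-for-NULLs query the column alludes to is the classic "anti-join": list every reference row that no detail row points at. A small sketch in SQLite, with invented product/sale tables:

```python
import sqlite3

# Find reference rows (products) with no matching detail rows (sales):
# on the outer side of a LEFT JOIN, unmatched rows come back as NULLs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE sale (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL
    );
    INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Sprocket');
    INSERT INTO sale VALUES (100, 1), (101, 1), (102, 3);
""")

# Products that have never sold: the joined sale columns are NULL.
missing = conn.execute("""
    SELECT p.name
    FROM product p
    LEFT OUTER JOIN sale s ON s.product_id = p.product_id
    WHERE s.sale_id IS NULL
""").fetchall()
print(missing)
```

A project, unfortunately, has no "reference table" of everything that ought to exist, which is precisely the column's point.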

Posted November 30, 2010

The year 2010 brought many new challenges and opportunities to data managers' jobs everywhere. Companies, still recovering from a savage recession, increasingly turned to the power of analytics to turn data stores into actionable insights, and hopefully gain an edge over less data-savvy competitors. At the same time, data managers and administrators alike found themselves tasked with managing and maintaining the integrity of rapidly multiplying volumes of data, often presented in a dizzying array of formats and structures. New tools and approaches were sought, and the market churned with promising new offerings embracing virtualization, consolidation, and information lifecycle management. Where will this lead in the year ahead? Can we expect an acceleration of these initiatives and more? DBTA looked at new industry research, and spoke with leading experts in the data management space, to identify the top trends for 2011.

Posted November 30, 2010