Data warehousing: The next generation of usage

In the beginning, there were mainframes - computer behemoths that occupied whole rooms and buildings, churning, blinking, beeping and droning, storing information and eventually producing oversized green-and-white-striped printouts.

And although companies today use more contemporary tools such as desktops, laptops and hand-helds to get information into computers, those mainframes still whir in many back rooms.

When management needs reports on company activities that combine information from the mainframes and information from the servers, a need develops to harvest data from seemingly incompatible sources.

One of the most interesting developments in the computer technology arena is the concept of the data warehouse - a group of technologies that communicate with all of the data sources, pull all of the information from those sources together, and allow the user to create a query that accesses all of the information at once.

Accountants are primary facilitators in helping their clients move into the era of data warehouses and data mining, or pulling information from the warehouses. While most accountants themselves may not be computer professionals who can create a data warehouse for their clients, accountants are on the front lines, helping clients determine what information is useful to them to help with running their business.

On the forefront of companies that can benefit from data warehousing are companies that have multiple source systems. "If the company's running on one standard system, like enterprise resource planning or one standard custom solution that they put in their organization 10 years ago, there's very limited reason to move toward a data warehouse solution," said John Runte, senior manager and leader of the technology practice in the consulting practice at Virchow Krause & Co. "But companies that have been through several mergers and acquisitions, have acquired new companies, companies on old systems - it's tough to get that consolidated reporting. Typically, there needs to be more than one system within an organization to justify a data warehouse solution."

The data warehouse goes beyond the concept of a relational database, where information is stored in tables of rows and columns and then accessed based on the relationship between the items in the tables. With a relational database, all of the information is typically stored in the same place and in tables that were created to go together. With a data warehouse, typically, the information was originally stored on different computers, using different programs, and is not necessarily organized using traditional database techniques.

So many variables

Sometimes the reason a company needs a data warehouse is that information is stored on different and incompatible computer systems. At other times, the information is stored in a large variety of programs that are each well suited to a particular task, but that weren't designed to work together with other types of programs.

Incompatibilities in entering data from one program to another also prevent programs from easily sharing information. One department might enter product numbers, while another might use product names. One department might use nine-digit zip codes, while another department uses five-digit codes. One department might use employee initials, while another department uses full names or employee numbers. When consolidating information from independent companies, the likelihood of incompatible information is far greater.

The data warehouse is set up to take advantage of Online Analytical Processing, or OLAP, a type of database software that provides a user interface for communicating needs, or queries, and very quickly retrieving the required multidimensional information from diverse sources.

"In big companies, you don't have a corporate database, you have hundreds of databases," explained Benjamin Taub, CPA, founder and chief executive of Ann Arbor, Mich.-based Dataspace Inc., a consulting firm that specializes in business intelligence, data warehousing and data integration. "In a big car company, for example, we'll have a sales company that tracks each of our customers, what we sold them. We'll also have a distribution system that shows where the car came from, and a warranty system with warranty claims by customer. If we want to know what is the likelihood of a customer to re-buy from us if he's had x number of warranty claims, you can't really tell that from having three separate systems - you need to tie all of that data together."

OLAP provides the user with the tools to perform this type of drilling down through the various data systems at a very high speed.

The data warehouse is constructed as a back-end database that collects all of the data from all requested sources, makes the data compatible and accessible, and has a front-end user interface that is designed for ease of use.

"One of the really hot things right now is the concept of management dashboards," explained Taub. "There may be a series of graphs or dials on the screen, showing month-to-date sales versus budget, or versus last year to date, or labor utilization. As a business, you're setting goals, and hopefully they are measurable goals, and the dashboard shows you where you are in relation to your goals."

With the dashboard feature, the user sees alerts and warnings when a specific business parameter occurs. The dials have indicators that resemble the speedometer on a car dashboard. "The tools the executive sees are very user-friendly, very easy to navigate," said Virchow Krause's Runte. "For example, in a distribution type of company where service levels are extremely important, if service levels drop beneath a certain threshold, there's a series of certain warning bells that go off with reports and alerts that go out to the management team. The concept is a much more proactive warning that goes off, rather than seeing it on a month-end report."

With regard to the cost of a data warehouse system, Taub estimated that 20 percent of the cost is spent on original development, and the rest of the cost is incurred as the system is developed. Much of the data warehouse is custom programming with companies that have several different types of computers and systems handling their data.

"Fifteen years ago, everything was custom, and data warehouse projects were big, long, expensive projects," said Runte. "The tools have evolved to have packages where literally there's no custom programming at all."

The newer a company's computer system is, and the more coordinated the data storage is, the more likely it is that an off-the-shelf program will at least get a company started on a data warehouse. Then, as the company starts to explore possible uses for the information, customization can be added to the original program.

Factors that affect cost include the quantity of data, the number of users, the complexity of data, the number and complexity of the source systems, and how clean the data is, i.e. whether or not the data will have to be restructured before it can be used in the data warehouse. Taub provided an example of a hospital system that requested a name in one field of an entry screen, but nurses were entering an "x" in the field. While they may not have needed the field for their purposes, the data was incomplete and names needed to be entered before the data could be used properly in the data warehouse.

Cost is also affected by the scope of the project, how much the company is trying to achieve, and the number of reports that the company wants to obtain.

Often what happens is that, once the data warehouse is implemented and users actually see how it works and begin to realize its potential, requests increase for additional alerts and reports and information.

Know why you want it

Data analysts agree that a company should have a clear idea of what it wants to accomplish with the data before acquiring a data warehouse. The American Institute of CPAs' Center for Excellence in Financial Management suggests that the following questions be asked before implementing a data warehouse for the first time:

* What is the purpose of the data warehouse?

* What is the long-term versus short-term vision of how the data warehouse will look and how it will be used?

* How will the data warehouse satisfy needs that have been identified?

* Who will use the data warehouse? Analysts? Strategic planners? Senior executives? Will customers or suppliers be allowed access?

* After the data warehouse is implemented, what measurements will be used to determine its success?

Taub suggested that companies should try to analyze how they want a particular business process to look in the future, and what needs to be done in order to achieve that process. If the answer is that to get there, a data warehouse is needed, then it's easier to justify the investment.

"The correct way to build a data warehouse is to focus not on the technology and not on the data, but on supporting a business process that you want to change - what am I expecting to gain by having this new business process?" said Taub.

Return on investment can be difficult to measure with a data warehouse, particularly when the benefits are expected to continue on an ongoing basis. "You can certainly justify the return on investment if you did time studies of every time a manager asked an analyst a question and they went out and did the activity and you logged that time," suggested Runte. "You would be able to come up with a cost of how you did it before, versus the new way, the data warehouse way, and you should be able to justify the savings."

"But," he added, "we have not found the magic answer in terms of putting a benefit on good information within the organization."

Not just your company's data

Frequently, a data warehouse will contain information not created by the company, such as demographic or census information, or product information created by a source outside the company. That information is accessible with the company data for purposes of comparison and analysis, or for dissemination.

James A. Rock, a Washington-based IT systems analyst, described a data warehouse that he worked on, in which a company's information from activities around the country was stored in a data warehouse along with demographic information and political districts. The company was able to produce reports based on its transactions with constituents in each political district and then provide that information to elected officials from those districts for purposes of lobbying.

For a large company with hundreds of thousands of transactions, the man-hours necessary to create that type of information would have been prohibitively expensive, but with the information stored in the company's data warehouse, a simple query retrieves the necessary information and sends reports on a regular basis to the elected officials.

Steve VanArsdale, a writer, programmer and consultant specializing in the area of data warehouses, stated that all data processing to date has been a prelude. "The computer systems industry is entering a new state in which computing will finally deliver the promise to business. The data warehouse will manage data in ways that will produce new knowledge, new intelligence, new efficiencies and profits for businesses."

For reprint and licensing requests for this article, click here.
Technology
MORE FROM ACCOUNTING TODAY