What we know reflects who we are.
In business, this means the data coming out of your research office are not really cut-and-dried facts. Rather, they are multi-layered concoctions, with each statistic resting on perspectives that make it the most telling expression of a point of view. A good data analyst knows how to unearth these perspectives and relate them to the specific questions that need to be answered.
Remember the first big decision you worked on? Your boss knew the result would have a big effect on your organization. Before proceeding, she wanted to know, "What are the facts?" It seemed like a straightforward request. With all the data collected at company expense and all those books of statistics in the library, surely anyone could produce the facts so a decision could be made. But when you got to work, you quickly discovered that the "facts" you had did not fit the questions you were asked. To succeed, you returned to your boss and made sure you knew exactly what the question was.
Good analysts recognize that data are not facts. Facts provide the answers to specific questions. Assumptions underlie each table, chart, or graph, and these assumptions shape the apparent effect of data on our thinking. Facts answer questions; without the questions, all we have are data, and data are not good for anything until they are turned into facts.
The first two charts that accompany this article are drawn from the Bureau of Transportation Statistics and the Federal Highway Administration. They present "simple facts" used by auto makers, city planners, energy analysts, and consumer-products companies. The first map shows the percent of households across the 50 states that do not have an automobile. The second one shows trends in gas consumption since 1936.
The first map gives the impression that large parts of the United States have a substantial proportion of households that do not own cars. One state, New York, stands out as having more households without cars than those with cars.
The first line chart shows a seven-fold increase in gas consumption since 1936. Even on a per-capita level, the rise is still more than 300 percent. At the same time, the efficiency of cars (measured by the number of gallons it takes to go 10,000 miles) has apparently remained about the same, and the amount of gas used per car has gone up only slightly.
These charts raise more questions than they answer because they seem to run counter to commonplace beliefs. After all the emphasis on efficiency, can gas consumption really be increasing? And is it really true that vast areas of the U.S. have large numbers of households without cars? We will come back to these questions later.
THE PACE OF CHANGE
It's easy to criticize a particular presentation because it does not reflect our own assumptions. But the real mistake is to use facts that were gathered to answer one question to try to answer a different question. It's hard to handle ambiguity when we just want to present the facts, unless we recognize and accept the context that shapes how we use data. First of all, picking what to collect and what to leave out of the raw data limits the information we can present. Gathering data is costly in money, time, and attention.
Second, the timing of the collection and reporting of facts reflects our beliefs about how fast events move. If we think that changes occur rapidly, we might ask for monthly, weekly, daily, or even hourly reports. Some investors even follow the minute-by-minute fluctuations in the Dow Jones Industrial Average, because they know that timing a stock sale down to the right minute can net them a few extra dollars.
If the change to be measured is slow, we might use last year's or last decade's numbers to estimate where we are today. For example, medical researchers often rely on datasets that are compiled over several decades and reflect patterns of living or habits prevalent at that time. However, because they are focused on the relationship between habits and health, the findings are still current and meaningful for their purposes.
Third, the questions we ask stand for hidden inquiries: the kinds of analytical information we need. For example, statisticians at the Census Bureau and other public agencies are usually focused on general statistical information. They look for series that can benchmark a wide variety of comparisons. They have a tough time guessing what new twist a policymaker might want from their results. But business decision makers, and many government people who have to implement real programs, need more focused information. They don't want to know about migration streams. Instead, they want to know where their next shopping center should go. In most cases, government data only provide the denominators, the general trends against which a specific data user can benchmark work. Intimate understanding of the market and the policy problem, and deep appreciation of conditions or tastes, are essential.
THE TAILOR'S FINGER
Ed Spar is executive director of the Council of Professional Associations on Federal Statistics and a veteran data analyst. He remembers the time he did a project for a clothing manufacturer. This household-name company wanted to know if a new line of coats would sell. Ed's company at that time, Market Statistics, was called on to do a demographic analysis of the potential market. By combining test and population data, he turned out a careful analysis predicting the size of the potential market.
After Ed's work was complete, he made a presentation to the management. They were grateful for the information. Yet Ed later learned that the real decisions about which coats to make were in the hands of the senior partner who had been with the business since its inception. He sat in the back room of the executive offices making decisions with his thumb and index finger as he tested the feel of the cloth. The accumulated wisdom of years of practice was as important as the data Ed had carefully compiled.
In fact, data can only take us so far. They show the general trend, but an organization's decisions are specific. That is why judgment educated by facts is at the heart of decision-making.
How analysts state the facts reflects the roles they play in organizations. That's because particular responsibilities shape their judgments and influence the questions they ask. In large companies, a department's particular perspective shapes the way it uses data and turns numbers into facts. Sometimes groups risk losing a larger company-wide perspective.
Headquarters wants to collect data and present facts to support the whole business. Yet individual departments often end up interpreting data to reward and support the results they need for their unique efforts in the same marketplace. For example, consider the strategic decisions about data collection and reporting made by large companies like American Express. These data must measure efforts both within individual departments and across the whole business.
Companies that market credit cards often divide their staff into two separate divisions: one works to increase the number of subscribers, and the other to convince more establishments to accept cards. The first judges success by the number of new cardholders, and the second defines success by the number of businesses that decide to accept cards.
However, the profits of a company like American Express also depend on the number of people and establishments that remain as subscribers or continue to accept the card. That's because AMEXCO's profits depend on use. This sometimes leads to conflicts between the needs of the individual divisions and those of the entire company. At American Express, this kind of conflict was often expressed as a different statement of the facts.
The top executives at companies like American Express work to establish common views that become company standards. Recognizing the problems inherent in data produced by competing departments, AMEXCO brought in management consultants to help improve communication. Both divisions at American Express "recognize the value of combining their separate perceptions into one unified view," according to consultant Craig Buxton. As he puts it, his goal is to ensure that American Express "values what they measure and measures what they value."
Buxton showed management that the measures they used overly valued financial results, thereby making it hard to see how profits were actually earned. Reports of short-term outcomes overshadowed information on customer and establishment satisfaction. Management was impressed by studies showing that the number of locations accepting the card was closely tied with increased profitable card use.
Research done by the American Express division that develops business services helped to change the focus of performance measurement. Previously, AMEXCO assessed how profits were related to the types of cards in use and the proportion of establishments in an area accepting the card, which they call density. Both use and penetration (the number of establishments accepting the card in a given geographic area) generate income and costs. The studies showed, however, that in areas where heavy card users are concentrated, the density of businesses accepting the card is key to increased profits.
Executives at American Express moved quickly to convince competing corporate groups to share an integrated set of facts focused on enhancing overall profits. They redefined the meaning of profitable users to focus on the "life cycle" of card and establishment use. Thus, net income over time became the most highly sought-after fact.
Real business decisions turn on the way factual questions are asked. In the American Express example, it was possible to ask if more people are members, or if more businesses accept the card. Alternatively, an analyst could ask if card use generates the optimal levels of long-term overall profitability. Each question implies a different set of facts, because each asks similar data a distinct set of questions.
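The point can be made concrete with a small sketch. The account records below are invented for illustration (they are not American Express data), and the names `Account`, `new_accounts`, and `lifetime_net_income` are hypothetical. The same records yield three different "facts," depending on which division's question is asked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Account:
    kind: str                        # "cardholder" or "establishment"
    joined_this_year: bool           # did the account sign up this year?
    net_income_by_year: List[float]  # invented yearly net income figures


# Invented records, used only to show how the question changes the "fact."
ACCOUNTS = [
    Account("cardholder", True, [40.0]),
    Account("cardholder", True, [-25.0]),           # signed up, then cost money
    Account("cardholder", False, [30.0, 45.0, 50.0]),
    Account("establishment", True, [10.0]),
    Account("establishment", False, [60.0, 80.0]),
]


def new_accounts(accounts, kind):
    """Division-level question: how many new sign-ups of this kind?"""
    return sum(1 for a in accounts if a.kind == kind and a.joined_this_year)


def lifetime_net_income(accounts):
    """Company-level question: what net income did all accounts earn over time?"""
    return sum(sum(a.net_income_by_year) for a in accounts)


print(new_accounts(ACCOUNTS, "cardholder"))     # question of the cardholder division
print(new_accounts(ACCOUNTS, "establishment"))  # question of the establishment division
print(lifetime_net_income(ACCOUNTS))            # question of the whole company
```

In this toy data, most of the net income comes from accounts that neither division would count as a new sign-up, which is the kind of gap a company-wide measure is meant to expose.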
ANOTHER VIEW OF GAS
To understand how this might work, let's look back at our two initial charts. What if we changed the questions we were asking these data? How would it change the appearance of the graphs and the facts as we know them?
The implied question behind the map of the 50 states is, which states have large concentrations of households without access to an automobile? The answer to this question would be valuable when decision-makers are allocating resources to states. Federal transportation programs that target state governments for assistance might want the data presented in this way, because it shows at a glance the states where mass-transit programs would be more important. In business, car-rental agencies might use the U.S.-by-state map to understand the regions and states where their business could target local residents in addition to out-of-town travelers.
The second map shows what happens when we change the question that we are asking about the concentration of carless households. The implied question for this figure is, which counties have large concentrations of households without access to an automobile?
These facts would be useful for planners who are targeting services or programs to specific local markets. Presented in this way, the data could be useful to local mass-transit planners and car-rental agencies. They would also be useful to businesses that provide retail services, such as supermarkets.
Both maps use the same data. However, they are startling in their apparent differences. Which one is "right"? It may be tempting to conclude that the county map has the "right" answer because it is more specific. But that conclusion misses an important point. The state map is more "right" whenever resources (federal money, business staff, rental cars, or taxis) are doled out on a state-by-state basis. The county map is "right" when resources are targeted to specific counties.
The map that is right is the one that matches the way resources are allocated or decisions are made. The particular perspective of your organization probably determines the "unit" of allocation. If the central headquarters disperses money, people, or material to regional or state offices, the state map may actually show the impact that allocations based on the concentration of households without cars would have. Smaller businesses and state or local governments would find the county map most telling.
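A minimal sketch shows how one dataset can produce two different pictures. The county figures below are invented for illustration and are not Bureau of Transportation Statistics or Census numbers; the function names are hypothetical. The only thing that changes between the two results is the unit of aggregation.

```python
from collections import defaultdict

# Invented records: (state, county, households, households_without_a_car).
COUNTIES = [
    ("State A", "Urban County", 1_000_000, 550_000),
    ("State A", "Rural County", 1_000_000, 50_000),
    ("State B", "County X", 400_000, 60_000),
    ("State B", "County Y", 600_000, 90_000),
]


def carless_share_by_county(records):
    """Share of carless households, answered at the county level."""
    return {
        (state, county): carless / households
        for state, county, households, carless in records
    }


def carless_share_by_state(records):
    """The same records, summed to states before the share is taken."""
    totals = defaultdict(lambda: [0, 0])  # state -> [households, carless]
    for state, _county, households, carless in records:
        totals[state][0] += households
        totals[state][1] += carless
    return {state: carless / households
            for state, (households, carless) in totals.items()}


by_county = carless_share_by_county(COUNTIES)
by_state = carless_share_by_state(COUNTIES)
print(by_state)   # State A looks uniformly high
print(by_county)  # the high share sits almost entirely in one county
```

In this toy example the state-level view shades all of State A at 30 percent, while the county-level view shows one county at 55 percent and the other at 5 percent. Neither calculation is wrong; each answers a different allocation question.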
Using the same approach to examine the two line charts shows how another kind of assumption shapes the facts about gas consumption. The first line chart shows what has happened to gas usage since the Depression. The second one shows the same data benchmarked from the oil crisis of the 1970s. Once again, important differences in the "facts" arise from the standard against which comparisons are made.
The 1936 chart shows that gas usage has increased dramatically since the Depression. It also shows that during World War II, when rationing was in effect, the amount of gas used per car declined, but the efficiency of gas use (the amount per mile traveled) eroded. This could have happened because cars were getting older.
A different picture emerges when gas use is benchmarked to 1978. This approach shows that the amount of gas used since 1982 has increased, and that consumption by 1995 was nearly 15 percent higher than it was in 1978. Also, consumption per person and consumption per mile declined after 1978. Since 1990, consumption per car increased more than per person or per mile.
Once again, neither one of these charts is right or wrong. They both use the same data. Planners who want to track the long-term standing of the United States in using energy will certainly find the 1936 chart more useful. Those concerned with estimating the amount of gas consumers will use today and next year will most likely find the 1978-to-present chart more telling.
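The effect of the benchmark year can be sketched in a few lines. The series below is invented for illustration, with values chosen only to echo the ratios described in the text; it is not the Federal Highway Administration data, and the function name `rebase` is hypothetical.

```python
# Invented consumption series (arbitrary units), keyed by year.
GAS_USE = {1936: 20.0, 1978: 122.0, 1982: 110.0, 1995: 140.0}


def rebase(series, base_year):
    """Express every value as an index where the base year equals 100."""
    base = series[base_year]
    return {year: value / base * 100 for year, value in series.items()}


since_1936 = rebase(GAS_USE, 1936)
since_1978 = rebase(GAS_USE, 1978)

# Same data, two benchmarks: a sevenfold rise against 1936,
# but a rise of roughly 15 percent against 1978.
print(round(since_1936[1995]))     # index relative to 1936
print(round(since_1978[1995], 1))  # index relative to 1978
```

Dividing by a different base year changes no underlying number, yet it changes the story the chart appears to tell, which is why the choice of benchmark should follow from the question being asked.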
In a new gas crisis, showing how America responded to gas shortages in the 1970s would help businesses plan for increased costs and reduced economic activity. On the other hand, long-term investors can see that events like the gas shortages of the 1970s have only a limited impact, for example, on investments made by young people for their retirement.
THE LIMITS OF FACTS
The examples in this article illustrate how the facts we take from data critically depend on the questions we bring to data. Does this mean that there are no mistaken analyses and no correct ones?
Not at all. Rather, the examples suggest a standard each of us can use when assessing the meaningfulness of a data-based presentation. We can ask if the underlying assumptions contained in the data match the business decisions that confront us. If they do, we are safe in using the data as facts. If they do not, we need to transform the data into a presentation that answers our questions or look for more pertinent data.
Our examples illustrate the point that the questions people ask, which reflect their values and needs, shape what they think are facts. Thus, when using statistics, it is important to understand how the implied questions shape what we believe are the facts.
It is also important that we understand the difficulty statisticians face when they are asked to provide answers. The technical decisions statisticians make are never entirely objective. They respond to the questions implied by their own perspectives, roles, and assignments. If users ask different questions but use the same tables, they may be mistaken about the facts. That is one reason why it is helpful to go back to the original data for many analyses.
This does not mean that general-purpose statistics are useless. Luckily, there are many kinds of analysis that need to have facts addressing the same or similar questions, and for these, general-purpose statistics like the U.S. census are a good solution. But we need to be aware that general-purpose statistics are for general use only when they can answer questions that imply the same assumptions.
Looked at this way, the key skill in statistical work is twofold: first, knowing what the limits of each dataset are; and second, knowing how to make appropriate use of data to obtain the answers to the particular questions we have.
TAKING IT FURTHER
For more perspectives on the topic of this article, see Brent D. Slife and Richard N. Williams, What's Behind the Research? Discovering Hidden Assumptions in the Behavioral Sciences (Sage Publications, 1995); Nicholas Eberstadt, The Tyranny of Numbers: Mismeasurement and Misrule (The AEI Press, Washington, DC, 1995); Theodore M. Porter, Trust in Numbers: The Pursuit of Objectivity in Science and Public Life (Princeton University Press, 1995); and Daniel Melnick, "Organizational Perspectives and the Federal Statistics Agenda: How What We Know Reflects Who We Are," in the proceedings of the Seminar on Statistical Methodology in the Public Service (Office of Management and Budget Statistical Policy Working Paper, August 26, 1997). Additional information about transportation statistics can be obtained from the Bureau of Transportation Statistics Web site, http://www.bts.gov, or by calling (800) 853-1351. BTS also has a fax-on-demand service at (800) 671-8012. Single copies of printed reports and CD-ROMs are free. Mail requests can be sent to the Bureau of Transportation Statistics, 400 Seventh Street, SW, Room 3430, Washington, DC 20590. Contact the author at Dan Melnick Research, P.O. Box 57233, Washington, DC 20037-7233.