The FlowingData blog is required reading for anyone into data visualization. Part inspiration and part hands-on tutorial, the site brings to light examples that tell simple, elegant stories with complex data. Its author is Nathan Yau, a UCLA PhD candidate in statistics. Last fall he released his first book, "Visualize This," which is full of beautiful visualizations and code snippets for the geeks who'd like to make more. He sat down with AdAge's resident stats geek Matt Carmichael, virtually, for a Q&A.
AdAgeStat: Data reporting and visualization are all the rage in (print and online) news media these days. This is good in many ways, but often leads to a lot of malpractice. What news organizations do you think are doing it well, and are there some that you feel are overdoing it?
Mr. Yau: The New York Times does it best. WSJ and Washington Post also often do a good job. As a whole, news orgs are getting better, and I don't think anyone is overdoing it. If anything, they're not doing enough, as data journalism grows.
There are many online media companies, however, that overdo it in the sense that they make a lot of big, flashy graphics that have little to no content. I tend to avoid anything with an embed code at the bottom.
AdAgeStat: Can visualizations make it easier for journalists/pundits/advertisers, etc. to "lie with numbers?"
Mr. Yau: Sure, in the same way that it's easy to lie with words and pictures. Visualization is another way to tell stories, so there are, of course, going to be people who tell incorrect stories. The good news, though, is that with the growing availability and accessibility of data, it's usually straightforward to verify facts. My rule of thumb is to never trust a chart that doesn't specify its source, and I think people in general are getting better at spotting the B.S.
AdAgeStat: In journalism, there's now a lot of pressure to wear many hats in budget-strapped newsrooms. Many writer-type folks are working on their own graphics. Is that a bad idea? Should we leave it to the pros or are there some simple rules that can keep us in line?
Mr. Yau: I love that journalists are putting on multiple hats. And I love it when analysts put on their journalist hats. The process of understanding a dataset is iterative, where you start with questions, look to the data, and then ask more questions. Rinse and repeat. When someone has both storytelling and analysis skills, it's much easier to do that back and forth, and those who understand data make better graphics.
But if I were to pinpoint one thing, it'd be to think critically about your data. Where did that data come from? How was it derived? How does it relate to the physical world?
AdAgeStat: Your book includes a lot of helpful code examples. How intertwined are graphics, design and coding these days? What's a good balance of skills? Does this question need to be answered in a Venn diagram?
Mr. Yau: Again, this comes back to iteration. The best graphics come from those who have multiple skills. That's not to say that those who only know a subset can't make good graphics, but knowing code, design and analysis lets you ask more questions, and ultimately, find more answers. Programming and analysis, which go hand-in-hand these days, let you explore larger, more complex datasets; and design helps you frame your results in a way that's understandable to a larger audience.
AdAgeStat: How important is visualization to storytelling today?
Mr. Yau: Very important. Especially online, where stories are limited to a browser window. Obviously not every story needs a data graphic, but when you have a lot of data, it's much easier to show what's going on vs. trying to explain every detail. With interactive graphics, readers can study the data on their own.
AdAgeStat: In the industry I cover (advertising), marketers (clients) often feel overwhelmed with the data available to them. Agencies love to churn out data, or as they say, "actionable intelligence." So for the two sides of the agency/client relationship: If I'm a client getting pitched by a potential new agency, or an agency offering some new creative work, what should I watch out for? And if I'm an agency, how can I make my case more credibly and honestly?
Mr. Yau: If you're a client, you're looking for people who know data. Visualization is often mistaken for an exercise in graphic design, but actually, graphic design is a part of visualization. You end up with pointless, fluffy visualization when data is like an afterthought in the design process. As an agency, you should have an interest in that entire process.
AdAgeStat: While a well-crafted table might be the best way to convey a set of data properly, can there be value to creating a visualization that's eye-catching as a design element? In other words, as journalists, we often hear that our readers don't want just a big table, they want to be able to quickly visualize things in an eye-catching, unique way. Is that a fallacy?
Mr. Yau: Yes, but it depends on your audience and purpose. If it's an ad whose only purpose is to draw attention, then obviously there needs to be some flair. In a lot of these cases though, I don't think visualization is the medium you're looking for. If, however, you're writing a story for The New York Times, a color-coded table could be exactly what you need.
AdAgeStat: What are some tips for telling a story visually to an audience that's less data-inclined than, say, you or I are?
Mr. Yau: This goes back to those earlier links, but you have to get inside the head of your readers. Assume they know nothing about the data that you've been looking at for a while, and explain the details. Label axes, highlight the interesting points, and explain what's there. Don't make it the reader's job to figure out why a graphic is there in the first place.
AdAgeStat: Do you have some favorite/least-favorite visualization types?
Mr. Yau: I don't have a favorite method, but I do appreciate simplicity, especially when the data displayed is complex.
This is the second in a series of AdAgeStat Q&As with data-visualization experts. Earlier we spoke to Edward Tufte, the author of the ground-breaking "Visual Display of Quantitative Information." Previous Q&As have focused on researchers who have extensively studied pieces of the demographic puzzle: OMD's Erin Bilezikjian-Johnson about Millennials, Edward Glaeser about the Triumph of Cities, Leo Burnett's Carol Foley about drivers of human behavior, The Patchwork Nation's Dante Chinni about the role of geography in segmentation, Joel Kotkin about suburbs and immigration, Richard Florida about mega-cities, Paco Underhill about women, Euro RSCG's Rose Cameron about men, Tammy Erickson about Generation X, and behavioral economist Dan Ariely about how to use everyone's irrationality to your advantage.