Personal Data Mining


Profile from Daytum.
Thanks in part to the increasingly popular field of information visualization, data is starting to break its nerdy bonds and tell compelling, accessible stories—witness the recent Sprint "Anthem" spot from Goodby, Silverstein & Partners, which breaks down the world of mobile communication in numerical terms, all while maintaining a healthy level of entertainment. Now, personal data mining websites like Daytum and YourFlowingData are bringing that data-driven storytelling beyond experts, designers and computer scientists to everyday internet users via social media channels.

Both sites are tools for users to track and tabulate their behavior and render those data sets visually, with functionality that draws from social media habits. Daytum lets users build Facebook-like profile pages composed entirely of charts, text and graphs, while YourFlowingData uses Twitter for data entry. Daytum, which came out of beta for public use this month with more than 10,000 members, provides Web and mobile data input interfaces. Users customize their pages from there: they can choose to track any conceivable data and then select a visualization technique from a number of options. Since the site launched in 2008, Daytum users have measured everything from the quantity and brands of beer they consume, the miles they walk and the emails they receive to the last irrational fear they experienced.

YourFlowingData personal stats reader.
YourFlowingData, which is still in beta, visualizes data sets similarly, though it tracks only a limited set of behaviors. Hundreds of beta users (site creator Nathan Yau is hoping to take YFD public this fall) send direct Twitter messages to @yfd using predetermined keywords, a practice he calls "life-tagging." For example, a beta user tweets "gnite" at bedtime and "gmorning" upon waking to determine time slept. Yau started the site late last year with just weight and sleep stats tracking, but has expanded to entertainment, potty time, smoking, mood, and what he calls YFD pulse. Once tweeted in, all the data is plotted on a personal stats reader (above). Beyond simply visualizing the data, the site also incorporates server-side time series analysis to alert users to trends and patterns.
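The life-tagging arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not YFD's actual code: the message log format, the `sleep_durations` function and the sample timestamps are all assumptions, and only the "gnite" and "gmorning" keywords come from the site itself.

```python
from datetime import datetime, timedelta


def sleep_durations(messages):
    """Pair each "gnite" with the next "gmorning" and return the time slept.

    `messages` is a hypothetical list of (timestamp, keyword) tuples;
    the real service's data format is not described in the article.
    """
    durations = []
    bedtime = None
    for timestamp, keyword in sorted(messages):
        if keyword == "gnite":
            # A repeated "gnite" simply replaces the earlier bedtime.
            bedtime = timestamp
        elif keyword == "gmorning" and bedtime is not None:
            durations.append(timestamp - bedtime)
            bedtime = None
    return durations


# Made-up sample data for two nights.
messages = [
    (datetime(2009, 5, 1, 23, 30), "gnite"),
    (datetime(2009, 5, 2, 7, 0), "gmorning"),
    (datetime(2009, 5, 2, 22, 45), "gnite"),
    (datetime(2009, 5, 3, 6, 15), "gmorning"),
]

for slept in sleep_durations(messages):
    print(slept)
```

A "gmorning" with no preceding "gnite" is ignored in this sketch, which is one simple way to cope with the forgotten entries that any manual logging scheme is likely to produce.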

While both concepts are relatively easy to use and draw from established internet behaviors like filling in Web forms and micro-blogging, the user is ultimately responsible for data entry. When considering the mass appeal of personal data services, the question is how many people have the motivation to tweet every time they go to bed, or to record all the beers they drink and input them online the next hungover morning.

"I think there is certainly a potential for Daytum to be used in similar ways to Facebook and Twitter," says Nicholas Felton, who founded Daytum with interactive designer Ryan Case. "A lot of people already use Daytum to make structure of some of the more mundane information that might wind up on Twitter otherwise. Though, there are still very few people willing to note the shoes they wear each day."

"The state of self-surveillance right now is that there is no mass appeal," adds YFD's Yau. "It's for data junkies because, even though people can collect data and understand pie graphs, in the end, they still have to do their own analysis. The masses don't want to or know how to do that."

However, with the development of technologies embedded in everyday objects to track data (what's been dubbed "everyware"), personal data mining could become possible without any active participation from the user. For example, runners no longer need to record every mile they run at the gym themselves; with Nike+, their shoes can do it for them.

Daytum and YFD are among a number of personal data mining services already online. The most similar concept is Mycrocosm, from the Sociable Media Group at the MIT Media Lab. (Researcher Yannick Assogba was not available for comment.) Covering more niche audiences, Dopplr is a hub for user-generated travel itineraries, and Monthly Info and PMS Buddy help women track their menstrual cycles. Trixie Tracker is a tool for parents looking for patterns in their new baby's activity, and Bedpost can be used to chronicle sexual encounters.

From Felton's 2008 Annual Report.
Felton, a New York-based graphic designer, says that data is becoming an increasingly accessible way in which everyday people understand the world and themselves. Felton started down the personal tabulation road in 2005 with his own data profiles, the Feltron Annual Reports, above. (For more on his reports, everyware and information visualization, check out the upcoming June issue of Creativity Magazine.)

"I think data has also become a narrative mechanism that people understand now," he says. "It's like the MasterCard verbiage. The nut of the story is in the numbers. People all have Excel. What I do is take the output of Excel and make it look a little nicer. But it's also the stories. People are getting used to the fact that you can tell a story on a graph or make a narrative through infographics and are also willing to put in a little bit of effort to puzzle it out."

Beyond just capturing stories, YFD's Yau, the UCLA PhD candidate in statistics behind the popular data visualization blog FlowingData, thinks that data portraits might be able to teach us something new about ourselves. The idea for YFD grew out of Yau's personal experiment to use Twitter to identify factors that influence his weight.

"Let's say someone has tagged whenever they're happy and sad," he says. "YFD might find patterns in what makes a person happy and what makes him sad, like what that person ate, if they exercised, hung out with friends, or a combination of something completely unexpected. I think this is where personal data mining is headed: statistics for nonprofessionals, where people can collect data, explore with visualization, and setup their own analysis.

"Toss machine learning [the ability to identify trends from limited data] into the mix and there's some power in that. When people can really learn about themselves in a non-data way, when data becomes less disjointed from real life and more intertwined with the everyday, there will be huge appeal."
