Hedonometer

Happiness

It’s what most people say they want. So how do we know how happy people are? You can’t improve or understand what you can’t measure. In a blow to happiness, we’re very good at measuring economic indices, and this means we tend to focus on them. Our flagship instrument at hedonometer.org measures the happiness of large populations in near real time. We also link to applications of the instrument to literature, movies, and news.

Our Hedonometer is based on people’s online expressions, capitalizing on data-rich social media, and we’re measuring how people present themselves to the outside world. For our first version of hedonometer.org, we’re using Twitter as a source but in principle we can expand to any data source in any language (more below). We’ll also be adding an API soon.

So this is just a start — we invite you to explore the Twitter time series and let us know what you think.

Hedonometer Team

Hedonometer.org is based on the research of Peter Dodds and Chris Danforth and their team in the Computational Story Lab, including visualization by Andy Reagan, at the University of Vermont Complex Systems Center, and the technology of Brian Tivnan, Matt McMahon and their team from The MITRE Corporation. Many others have contributed to the research we've conducted using the instrument; their names appear near the bottom of this page.

Frequently Asked Questions

General

“Where does the word ‘hedonometer’ come from?”

The economist Francis Edgeworth coined the term in the late 1800s to describe "an ideally perfect instrument, a psychophysical machine, continually registering the height of pleasure experienced by an individual." [wikipedia]

“How are words assigned a happiness score?”

To quantify the happiness of the atoms of language, we merged the 5,000 most frequent words from each of four corpora: Google Books, New York Times articles, Music Lyrics, and Twitter messages. Because the lists overlap, the result is a composite set of roughly 10,000 unique words. Using Amazon’s Mechanical Turk service, we had each of these words scored on a nine-point scale of happiness: (1) sad to (9) happy. You can explore the average scores of each word on our words page, or download the entire list from the publication supplement here. On a few occasions, we've updated the word list to include new terms that were uncommon when the original survey was conducted.
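The merging step can be illustrated with a minimal Python sketch. It is not the code used to build the actual word list: the function names (`top_words`, `merge_top_words`) and the toy corpora are hypothetical, and tokenization is assumed to have happened already. It only shows why taking the top N words from four sources yields fewer than 4 × N unique words.

```python
from collections import Counter

def top_words(tokens, n):
    """Return the n most frequent words in a list of tokens."""
    return [word for word, _ in Counter(tokens).most_common(n)]

def merge_top_words(corpora, n):
    """Take the union of each corpus's n most frequent words."""
    merged = set()
    for tokens in corpora.values():
        merged.update(top_words(tokens, n))
    return merged

# Toy stand-ins for the four corpora (made-up data, for illustration only).
corpora = {
    "books":   "the the the of of love war".split(),
    "news":    "the the of of of market war".split(),
    "lyrics":  "love love love the the baby".split(),
    "twitter": "the the lol lol love of".split(),
}

# Top 2 words from each of 4 corpora: at most 8 words, fewer after overlap.
merged = merge_top_words(corpora, n=2)
```

Here the four top-2 lists share words such as "the", so the union holds 4 words rather than 8, in the same way that four lists of 5,000 words collapse to roughly 10,000.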

“What is being measured by the instrument?”

hedonometer.org currently measures Twitter’s Decahose API feed (formerly Gardenhose). The stream reflects a 10% random sampling of the roughly 500 million messages posted to the service daily, comprising roughly 100GB of raw JSON each day. Words in messages we determine to be written in English are thrown into a large bag containing roughly 200 million words per day. This bag is then assigned a happiness score based on the average happiness score of the words contained within. While "bag-of-words" approaches can be problematic for small collections of text, we have found the methodology to work well at large scale.
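As a rough illustration of the bag-of-words average, here is a short Python sketch. The function name `average_happiness`, the three-word score table, and the choice to skip words that have no score are assumptions made for the example; the real instrument works with the full scored word list and a day's worth of tweets.

```python
from collections import Counter

def average_happiness(tokens, scores):
    """Frequency-weighted mean happiness of the scored words in a bag.

    Words missing from `scores` are ignored (an assumption of this sketch).
    Returns None when the bag contains no scored words.
    """
    counts = Counter(t for t in tokens if t in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[word] * count for word, count in counts.items()) / total

# Hypothetical scores on the 1 (sad) to 9 (happy) scale.
scores = {"happy": 8.0, "sad": 2.0, "the": 5.0}

# One small "bag"; word order is irrelevant, only counts matter.
tokens = "the happy happy sad xyzzy".split()
day_score = average_happiness(tokens, scores)
```

With these toy numbers the bag scores (5.0 + 2 × 8.0 + 2.0) / 4 = 5.75. A handful of words like this can swing the average a lot, which is why the approach is better suited to very large bags.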

“If this is an instrument, it should have a knob somewhere.”

Is that even a question? Well, we do have a knob. It allows us to tune the relative importance of the most emotionally charged words by removing neutral words from consideration when determining the happiness of a given day: words whose average score lies within Δh_avg of the neutral score of 5 are excluded. It also allows us to remove words that receive widely varying scores when rated on Mechanical Turk; many profanities, for example, received average ratings between 4 and 6 due to the bimodal nature of their word score distribution. Details on the choice of Δh_avg = 1 can be found in figure 2 of the foundational publication for Hedonometer. We also mask a small set of words with average sentiment outside of the neutral range. These are words whose scores we determined to be inappropriate for the task, typically because they are highly context dependent.
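The effect of the knob can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the site's implementation: the name `apply_knob`, the example scores, and the masked word are hypothetical, and words sitting exactly on the boundary (scores of 4 or 6 when the knob is set to 1) are kept here.

```python
NEUTRAL = 5.0  # midpoint of the 1 (sad) to 9 (happy) scale

def apply_knob(scores, delta=1.0, masked=frozenset()):
    """Drop near-neutral words and explicitly masked words.

    A word is kept only if its average score is at least `delta` away
    from NEUTRAL and it does not appear in `masked`.
    """
    return {
        word: h
        for word, h in scores.items()
        if abs(h - NEUTRAL) >= delta and word not in masked
    }

# Made-up scores for illustration.
scores = {
    "joy": 8.2,       # strongly positive: kept
    "grief": 1.7,     # strongly negative: kept
    "table": 5.1,     # neutral: removed by the knob
    "dang": 4.6,      # mixed ratings averaging near neutral: removed
    "tricky": 2.5,    # outside the neutral range, but masked by hand
}

kept = apply_knob(scores, delta=1.0, masked={"tricky"})
```

Setting `delta=0` keeps every unmasked word, and larger values leave only the most emotionally charged words in the average.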

“What does the hedonometer say about people who don’t tweet?”

Tweets represent a non-uniform subsampling of all utterances made by a non-representative subpopulation of all people. However, there are hundreds of millions of people presently using the website to express their activities and interests, and as such it is an important social signal. According to Pew, 1 in 5 adult Americans use Twitter, including our current President as you may have heard.

“What about the demographics of Twitter? Aren’t they non-representative?”

Yes! And Twitter’s demographics have also changed over time. Nevertheless, we’re using Twitter as our initial data source for a few reasons:

We have found that our measure of happiness correlates very well with traditional surveys of well-being (see here for details).
Twitter provides a stern test for our instrument due to the enormous amount of data we receive and must process in real time.
We can focus in on Twitter communities (e.g., countries and cities) to gain a sense of what people are expressing.
Twitter is becoming an increasingly important collective, global media voice, and is thus an important story in itself, worthy of scientific analysis.

“Why does the day of Osama Bin Laden’s death have such a low happiness score?”

Many people presume this day will be one of clear positivity. While we do see positive words such as “celebration” appearing, the overall language of the day on Twitter reflected that a very negatively viewed character met a very negative end. It was a day of complex emotion which is best explored in the word shift for the day, rather than the single number of its average happiness.

“Where can I learn more about the hedonometer and other work being carried out by the Computational Story Lab?”

In our Computational Story Lab blog we describe research projects in which we use our hedonometer to characterize happiness variations with respect to geography, network topology, demographics, and socio-economic data. One example is a map of the US with cities colored by happiness.

For the full story of our hedonometer algorithm, please read our foundational paper describing its construction:

Temporal Patterns of Happiness and Information in a Global-Scale Social Network: Hedonometrics and Twitter.
PLoS ONE, 6, e26752, 2011. [pdf] [journal url]

Future

“What about languages other than English?”

We have scored 10 languages, revealing a universal positivity bias in human language. For more information, see our paper in PNAS. We have analyzed other languages as well, and are working to get sentiment time series online.

“How will you deal with context?”

We are currently developing a principled method to identify relevant phrases, for example to deal with the multitude of both positive and negative uses of profanity. We expect to be scoring phrases instead of words, where appropriate, in the near future.

“What about other emotions?”

We are currently building a large-scale database of word-based measures for emotions other than happiness and sadness such as fear, anger, and surprise. We intend to incorporate these emotions into future versions of the hedonometer.

Data and consulting

In addition to the raw data shown on this site, we are working to provide detailed analysis around brands, financial products, and US politics at Quokka Labs. More information is available on the website: quokkalabs.io.

Credits

UVM’s Computational Story Lab:

Peter Dodds, Chris Danforth, Jane Adams, Sharon Alajajian, Nicholas Allgaier, Thayer Alshaabi, Michael Arnold, Catherine Bliss, Eric Clark, Emily Cody, Ethan Davis, Todd DeLuca, Suma Desu, David Dewhurst, Danne Elbers, Kameron Decker Harris, Fletcher Hazlehurst, Sophie Hodson, Kayla Horak, Ben Emery, Mike Foley, Morgan Frank, Ryan Gallagher, Darcy Glenn, Sandhya Gopchandani, Kelly Gothard, Tyler Gray, Max Green, Laura Jennings, Dilan Kiley, Isabel Kloumann, Ben Kotzen, Paul Lessard, Ross Lieb-Lappen, Kelsey Linnell, Ashley McKhann, Andy Metcalf, Tom McAndrew, Sven McCall, Josh Minot, Henry Mitchell, Lewis Mitchell, Kate Morrow, Eitan Pechenick, Michael Pellon, Aaron Powers, Andy Reagan, John Ring IV, Abby Ross, Lindsay Ross, Aaron Schwartz, Anne-Marie Stupinski, Matt Tretin, Lindsay Van Leir, Colin Van Oort, Brendan Whitney, and Jake Williams.

The MITRE Team:

Brian Tivnan, Matt McMahon, Ivan Ramiscal, Mike Shadid, Pete Carrigan, Zach Furness, Zoe Henscheid, Garry Jacyna, Matt Koehler, and Karine Megerdoomian.

Many thanks and acknowledgments go to these people:

Mike Austin, Jim Bagrow, Josh Bongard, Josh Brown, Jim Burgmeier, Melody Burkins, Kate Danforth, Andrea Elledge, Maggie Eppstein, Bill Gottesman, Laurent Hebert-Dufresne, John Kaehny, Jim Lawson, Juniper Lovato, Aimee Picchi, Andrew Reece, Tony Richardson, Taylor Ricketts, Melissa Rubinchuk, John Tucker and Toph Tucker.

Thanks to Thiago Lins and David Peterman for helping identify events to annotate.

And special thanks go to Jonathan Harris and Sep Kamvar for their initial inspiration.