It looks like science has finally caught up with the age of social media. We now have an algorithm that one could feasibly call the breathalyzer of Twitter, and we've learned about it just in time for St. Patrick's Day.
Headed by lead author and graduate research assistant Nabil Hossain, scientists at the University of Rochester developed the algorithm to differentiate drunk tweets from sober ones, all in the name of, among other things, averting alcohol-related incidents and accidents -- and in some cases, death.
Using geotagged accounts in two very different locations -- NYC and Monroe County in upstate New York -- the team amassed a collection of roughly 11,000 geolocated tweets in total. This data was then sent to Amazon Mechanical Turk, a crowdsourcing marketplace where human workers "analyzed the tweets in more detail," according to the MIT Technology Review.
To figure out which tweets were sent by very inebriated humans and which were not, the algorithm picked up on certain words such as "home," "TV," and "bath" (example: "got home safe about to pass out in the TV and watch a bath") and used them as identifiers. (I'm sure "drunk" was probably on that list, too.)
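For the curious, here is a rough idea of what spotting cue words in a tweet could look like in Python. This is only a minimal sketch, not the Rochester team's actual model: the function name `cue_words_in` and the `CUE_WORDS` list are made up for illustration, with the list taken from the three words quoted above plus the guess that "drunk" belongs there too.

```python
import re

# Hypothetical cue list: "home", "tv" and "bath" come from the example tweet
# above, and "drunk" is a guess. The study's real features are not given here.
CUE_WORDS = {"home", "tv", "bath", "drunk"}


def cue_words_in(tweet):
    """Return the set of cue words found in a tweet, ignoring case."""
    tokens = set(re.findall(r"[a-z']+", tweet.lower()))
    return tokens & CUE_WORDS


example = "got home safe about to pass out in the TV and watch a bath"
print(sorted(cue_words_in(example)))  # ['bath', 'home', 'tv']
```

A real classifier would presumably weigh many such signals together rather than treat any single word as proof of anything.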
So why were tweets used? It turns out that they can be the most precise form of "self-reporting."
"Nearly all previous work on geolocating latent states and activities from social media confounds general discussions about activities, self-reports of users participating in those activities at times in the past or future, and self-reports made at the immediate time and place the activity occurs," explained Hossain in the study, reiterating how real-time updates on social media can provide a specific kind of accuracy regarding anything from eyewitness accounts to specific mental states.
"Activities, such as alcohol consumption, may occur at different places and types of places, and it is important not only to detect the local regions where these activities occur, but also to analyze the degree of participation in them by local residents," Hossain continued.
The scientists were also able to track the distance from a drunk tweeter's watering hole to their home -- again, using geolocation to find the two points where tweets were sent and to subsequently chart the distance. The study suggested that residents in Monroe County have a tendency to drink "further than a kilometer from home," while New York City has a higher density of drinkers, probably because it's a "crowded city" with "highly dense alcohol outlets."
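Charting the distance between two geotagged points is a standard calculation. The sketch below uses the haversine formula, a common way to get great-circle distance from latitude and longitude. It is an assumption that something like this would do the job; the article does not say how the researchers measured distance, and the coordinates are invented for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Made-up coordinates: a hypothetical home and a hypothetical bar.
home = (43.1566, -77.6088)
bar = (43.1610, -77.5900)

distance = haversine_km(*home, *bar)
if distance > 1.0:
    print(f"{distance:.2f} km: more than a kilometer from home")
else:
    print(f"{distance:.2f} km: within a kilometer of home")
```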
"We believe that such grids are regions of unusual drinking activities," added Hossain. "We can explore the social network of drinkers to find out how social interactions and peer pressure in social media influence the tendency to reference drinking."
In turn, this could also help prevent both accidents and fatalities related to alcohol consumption -- as the MIT Technology Review points out, roughly 75,000 alcohol-related deaths occur every year in the U.S. alone.
"Our results demonstrate that tweets can provide powerful and fine-grained cues of activities going on in cities," researchers say.
Photo: Hotel de la Paix Genève | Flickr