The wealth of information mined from social networks is often squandered as scientists fail to account for basic factors such as demographics and biases, according to a new study from Montreal's McGill University and Pittsburgh's Carnegie Mellon University.

In an article in the journal Science titled "Social media for large studies of behavior," the researchers state that data scientists often reach conclusions without factoring in variables that are routinely accounted for in other fields, such as statistics.

The study compares the fallacies in social media research to one of the most infamous errors in telephone polling, in which an undersampling of Harry Truman supporters resulted in the publication of the 1948 headline "Dewey Defeats Truman." Similarly flawed methodologies are plaguing research based on social networking data, the study concluded.

Before information from sites like Facebook and Twitter can yield truly meaningful insights, scientists must recognize that there are problems with the way social media data is typically analyzed, according to Derek Ruths, an assistant professor in McGill's School of Computer Science. For now, many of the insights gleaned from social media are about as reliable as the 1948 telephone poll.

"Rather than permanently discrediting the practice of polling, that glaring error led to today's more sophisticated techniques, higher standards, and more accurate polls," states Ruths. "Now, we're poised at a similar technological inflection point."

Each social networking platform has its own audience, in which one or more demographic groups are over- or underrepresented, yet researchers rarely account for those skews.

"Instagram, for instance, has special appeal to adults between the ages of 18 and 29, African-Americans, Latinos, women and urban dwellers, while Pinterest is dominated by women between the ages of 25 and 34 with average household incomes of $100,000," the study stated.
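One common statistical remedy for this kind of demographic skew is post-stratification, in which each respondent is reweighted so that the sample's group shares match those of the target population. The sketch below is illustrative only and is not taken from the study: the function name `poststratify`, the age groups, the population shares, and the responses are all hypothetical values chosen to show the arithmetic.

```python
from collections import Counter

def poststratify(sample, population_shares):
    """Return a weighted mean of the values in `sample`.

    sample: list of (group, value) pairs.
    population_shares: dict mapping each group to its share of the
    target population (hypothetical figures in this example).
    """
    counts = Counter(group for group, _ in sample)
    n = len(sample)
    # Weight for a group = population share / share of the sample.
    weights = {g: population_shares[g] / (counts[g] / n) for g in counts}
    total_weight = sum(weights[g] for g, _ in sample)
    return sum(weights[g] * v for g, v in sample) / total_weight

# Hypothetical platform sample that over-represents 18-to-29-year-olds:
# 80 younger users (60 answer "yes") and 20 older users (5 answer "yes").
sample = (
    [("18-29", 1)] * 60 + [("18-29", 0)] * 20
    + [("30+", 1)] * 5 + [("30+", 0)] * 15
)
# Assumed (made-up) shares of each group in the wider population.
population_shares = {"18-29": 0.25, "30+": 0.75}

raw = sum(v for _, v in sample) / len(sample)
adjusted = poststratify(sample, population_shares)
print(f"raw estimate: {raw:.3f}, adjusted estimate: {adjusted:.3f}")
```

With these made-up numbers the unweighted estimate is driven by the over-represented younger group, and the reweighted estimate moves toward the older group's answers. Reweighting only corrects for the groups a researcher can actually measure, so it does not address the other problems the study raises, such as bots or platform design.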

The study also concluded that research based on social media is often oversimplified and too subjective, as data scientists sometimes target issues that can be easily classified without the figures from a social network. Furthermore, data scientists often do not know how social networking platforms aggregate news feed content.

The very design of social networking platforms influences the way users behave, further skewing insights when it is not accounted for. For example, some users would share more on social networking platforms that cater to anonymity than they would on a site that pushes openness.

Spammers and bots also taint the samples collected by researchers, the study noted. With just a little bit of cash, users of social networking sites can pay reputation firms for positive comments and masses of followers.

"Many of these papers are used to inform and justify decisions and investments among the public and in industry and government," says Ruths.

ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.