VOXTER - Social Media research is biased

6 May 2016

Social Media research is biased: How to keep it social but get better data


Social media is the new go-to source of big data for researchers, and has great advantages for observing natural behaviour of large groups. Or does it? As with any research methodology, it’s important to understand where bias comes into the system, and whether the trade-off of quality vs. volume is worth it.

We outline 8 major sources of bias in social media below, each from an imaginary user's perspective, and then explain how our proprietary research software tackles these biases.

1. Selection bias

You go on Twitter, read a few things in your stream, and maybe retweet something, maybe post something.

Whose tweets are you reading? Friends, family, people with similar interests. However broad a research sample is, each micro-group is highly selective, and this influences the sorts of things people say and how they interact. That is over and above the fact that Twitter users are not (yet) demographically representative of the population, and it is difficult to say what sorts of people are likely to be the most active on Twitter.
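One common partial remedy for an unrepresentative sample is post-stratification: re-weighting respondents so that each demographic group counts in proportion to its share of the population. The sketch below is a minimal illustration under invented assumptions. The group labels, population shares and sample values are all hypothetical, and re-weighting only corrects for characteristics that are actually known, not for who chooses to be active.

```python
from collections import Counter


def poststratified_mean(sample, population_shares):
    """Weighted mean of values in `sample`, a list of (group, value) pairs.

    Each respondent is weighted by population share / sample share of
    their group, so over-represented groups count for less.
    """
    counts = Counter(group for group, _ in sample)
    n = len(sample)
    weighted_sum = 0.0
    weight_total = 0.0
    for group, value in sample:
        sample_share = counts[group] / n
        weight = population_shares[group] / sample_share
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical data: 1 = agrees with a statement, 0 = disagrees.
# The younger group makes up 80% of the sample but (we assume) 40% of the population.
sample = [("under_35", 1)] * 8 + [("35_plus", 0)] * 2
population_shares = {"under_35": 0.4, "35_plus": 0.6}

raw_mean = sum(value for _, value in sample) / len(sample)
adjusted_mean = poststratified_mean(sample, population_shares)
print(f"raw: {raw_mean:.2f}, post-stratified: {adjusted_mean:.2f}")
```

Because the over-represented group here is also the one that agrees, the raw mean overstates agreement relative to the re-weighted figure.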

2. Self-reporting bias in the system

You go on LinkedIn, look at a few posts, and profiles, learn a few tips, make a few comments.

Interacting for professional benefit is inherent in the system: people tread very carefully, praising and showing interest in each other, and there is very little dissent. Motivation, honesty and, in short, quality of response are very hard to gauge. Other social media networks have similar but less obvious attributes.

3. Self-reporting bias in your network

You are a teenager, you have one Facebook account for your friends, and another that’s visible to your parents and family.

Your own personal social network can profoundly affect what you say, and your persona online. The same person can be provocative in one setting and conservative in another. Which persona’s opinion should be trusted?

4. Engagement bias

You see a post by a family member that espouses political views you profoundly disagree with. Underneath that is a post of a panda sneezing. Which one do you engage with?

You are presented with a stream of content that you can skim very easily and choose what to engage with. This introduces bias in terms of content and engagement, and also leads to…

5. Polarization bias - the TripAdvisor effect

You’re going on holiday and check out TripAdvisor for reviews of your chosen hotel. How much do you take these reviews as a representative sample of guest experience?

Of course people are much more likely to post a review on TripAdvisor if they have had a very good or very bad experience. Maybe sets of reviews can be compared against others, but it is hard to generalise about a hotel. In the same way, different groups of people are more active on social media, and people express extremes of opinion. Extrapolation to public opinion as a whole is a significant exercise. See here for some interesting research.
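The effect is easy to reproduce in a toy simulation. The sketch below assumes a made-up distribution of guest experiences and made-up probabilities of posting a review for each star rating (higher for 1 and 5 stars). None of these numbers come from TripAdvisor or any real dataset; they only illustrate how self-selection reshapes what is visible.

```python
import random

# Assumed probability that a guest posts a review, by star rating.
# Guests with extreme experiences are assumed far more likely to post.
POST_PROBABILITY = {1: 0.30, 2: 0.05, 3: 0.02, 4: 0.05, 5: 0.30}


def extreme_share(ratings):
    """Fraction of ratings that are 1 or 5 stars."""
    return sum(1 for r in ratings if r in (1, 5)) / len(ratings)


def simulate(n_guests=10_000, seed=0):
    """Return (all experiences, posted reviews) for a simulated hotel."""
    rng = random.Random(seed)
    # Assumed true experiences: most guests have a middling stay.
    experiences = rng.choices([1, 2, 3, 4, 5], weights=[5, 15, 40, 30, 10], k=n_guests)
    posted = [r for r in experiences if rng.random() < POST_PROBABILITY[r]]
    return experiences, posted


experiences, posted = simulate()
print(f"extreme share among all guests:     {extreme_share(experiences):.0%}")
print(f"extreme share among posted reviews: {extreme_share(posted):.0%}")
```

Under these assumptions, very good and very bad stays make up a much larger share of the posted reviews than of the stays themselves, which is why extrapolating from the posted reviews alone is hazardous.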

6, 7, 8. Acquiescence bias, confirmation bias, and herd behaviour

A controversial political post appears in your stream with a large number of likes (maybe because it has a large number of likes!). Maybe it has a comment or like from a friend of yours. Does that affect your response?

It’s called social media, not anti-social media. Never mind any fundamental human and cultural propensity for agreement and socialisation, this is the underlying purpose of social media. Individual confirmation bias, the tendency to support views similar to your pre-existing beliefs, and acquiescence bias, the tendency to agree in general, are built into the system. This can lead to herd behaviour, which can cause gross distortions in data.

Can these issues be addressed?

At Voxter we understand the good and bad in social, interactive systems. On the one hand they are engaging and fun but on the other they are very difficult to pick apart and correct for imbalances. Our approach has been to keep the engagement and fun but to sacrifice some of the ‘social’ in the system to reduce bias at source:

  • Participants interact anonymously with avatars replacing names and identities.
  • Posts are not marked for how popular they are, which means that participants respond to content directly.
  • The interaction is designed to encourage a wide range of participant activities and to balance them, so that a participant can influence the discussion even without producing any new content.
  • Checks on participants’ influence on the conversation are in place for a more balanced distribution of influence.
  • Our participants can be sampled according to need, and at scale.
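To make the idea of a "balanced distribution of influence" concrete, one generic way to quantify how concentrated influence is in a discussion is the Gini coefficient: 0 means every participant has equal influence, and values near 1 mean a few participants dominate. The sketch below is an illustration only, not a description of how the Voxter software works. The influence scores are hypothetical (for example, the number of responses each participant's contributions attracted).

```python
def gini(scores):
    """Gini coefficient of a list of non-negative influence scores.

    Returns 0.0 when influence is perfectly even; the maximum for
    n participants is (n - 1) / n, reached when one holds all of it.
    """
    values = sorted(scores)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on sorted values with 1-based ranks.
    ranked_sum = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * ranked_sum / (n * total) - (n + 1) / n


# Hypothetical influence scores for four participants in two discussions.
balanced = [5, 5, 5, 5]
dominated = [0, 0, 0, 10]

print(f"balanced discussion:  {gini(balanced):.2f}")
print(f"dominated discussion: {gini(dominated):.2f}")
```

A measure like this could be tracked during a discussion to flag when a small number of voices start to dominate.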

Research into large-scale group interaction at the London School of Economics has led to our software design. We believe it strikes a better balance between quality and volume.


Contact Us

Please get in touch to discuss your project, find out more, or request a demo.

   07941 086 056



© 2019 d-Governance Ltd. All rights reserved.
Voxter is an MRS Company Partner