- A new machine learning algorithm detects fake and biased news by evaluating news sources rather than just individual claims.
- For new news articles, it achieved an accuracy of up to 70%.
- It requires only 150 articles to determine whether a source can be trusted.
Social media platforms have made it extremely easy for anyone to share and spread information on the internet. This has led to a proliferation of fake news, which is usually generated either to sway public sentiment and influence major events such as political elections, or to attract traffic and earn money by displaying ads.
While many tech giants are putting significant resources into building their own fake-news-detecting systems, researchers at MIT and Qatar Computing Research Institute believe that the best strategy to detect fake news is to focus on the news sources rather than just analyzing individual claims.
Using this approach, they have developed a new machine-learning-based method that determines whether a source is trustworthy. Rather than checking single stories, it characterizes an entire news outlet and predicts the factuality of its reporting.
How Does It Detect Biased News?
The idea behind the system is that if a site has published false information before, there is a good chance it will do so again. Analyzing the other content on such websites helps the system determine which sites are likely to publish fake news in the first place.
To reliably identify fake news, one can look for common linguistic features in an article, such as structure, complexity, and sentiment. For instance, most fake news uses emotional, subjective, and hyperbolic language.
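As a rough illustration of what such linguistic features might look like in practice, here is a minimal Python sketch. The word list `HYPERBOLIC_WORDS`, the function name `linguistic_features`, and the three features chosen are assumptions made for this example; they are not the feature set used by the researchers.

```python
import re

# Hypothetical mini-lexicon of emotional/hyperbolic words (illustrative only).
HYPERBOLIC_WORDS = {
    "shocking", "unbelievable", "outrageous",
    "disaster", "amazing", "horrific",
}


def linguistic_features(text: str) -> dict:
    """Compute a few toy style features for a single article."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    n_words = max(len(words), 1)  # avoid division by zero on empty text
    return {
        # Structure/complexity proxy: average number of words per sentence.
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        # Sentiment proxy: exclamation marks per word.
        "exclamation_rate": text.count("!") / n_words,
        # Hyperbole proxy: share of words found in the toy lexicon.
        "hyperbole_rate": sum(w in HYPERBOLIC_WORDS for w in words) / n_words,
    }


sample = "Shocking! This unbelievable disaster will ruin everything!"
features = linguistic_features(sample)
print(features)
```

A real system would rely on far richer signals than these, but the sketch shows how style can be turned into numbers a classifier can use.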
In this study, the researchers experimented with several features derived from:
- articles published by the target news source
- its Twitter account and Wikipedia page
- its URL structure
- the number of visitors it gets
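To make the idea of source-level features more concrete, the sketch below derives a few simple signals from a URL and merges them with other per-source signals into one record. Every feature name here (`host_length`, `num_hyphens`, `has_wikipedia_page`, `monthly_visitors`, and so on) is a hypothetical choice for illustration, not the study's actual feature list, and the example values are made up.

```python
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Derive simple structural features from a site's URL."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    return {
        "uses_https": parsed.scheme == "https",
        "host_length": len(host),
        "num_hyphens": host.count("-"),
        "num_digits": sum(c.isdigit() for c in host),
        "num_dots": host.count("."),
    }


def source_record(url: str, has_wikipedia_page: bool,
                  has_twitter_account: bool, monthly_visitors: int) -> dict:
    """Combine URL features with other (hypothetical) source-level signals."""
    record = url_features(url)
    record.update({
        "has_wikipedia_page": has_wikipedia_page,
        "has_twitter_account": has_twitter_account,
        "monthly_visitors": monthly_visitors,
    })
    return record


# Made-up example source; the numbers are placeholders, not real data.
record = source_record(
    "https://real-news-24.example.com/politics",
    has_wikipedia_page=False,
    has_twitter_account=True,
    monthly_visitors=12000,
)
print(record)
```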
They gathered data from the website Media Bias/Fact Check. With the help of human reviewers, this website rates the factuality and bias of nearly two thousand news sites, ranging from popular media outlets to thin content farms.
These data were fed to a machine learning model trained to classify sources the same way the human reviewers at Media Bias/Fact Check do. The model yielded impressive results: for new news articles, it achieved an accuracy of 65% at determining whether the article has a low, medium, or high level of factuality, and it was 70% accurate at determining whether the content is right-leaning, left-leaning, or moderate.
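For readers unfamiliar with how such accuracy figures are computed, the sketch below scores three-class predictions against reference labels and compares them with a simple majority-class baseline. The labels are invented for the example and the helper names `accuracy` and `majority_baseline` are assumptions; nothing here reproduces the researchers' data or model.

```python
from collections import Counter


def accuracy(true_labels: list, predicted_labels: list) -> float:
    """Fraction of predictions that exactly match the reference labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def majority_baseline(reference_labels: list, n: int) -> list:
    """Always predict the most frequent label; a floor to compare against."""
    most_common_label = Counter(reference_labels).most_common(1)[0][0]
    return [most_common_label] * n


# Invented factuality labels for ten sources (illustrative only).
true = ["low", "high", "medium", "high", "low",
        "high", "medium", "high", "low", "high"]
pred = ["low", "high", "high", "high", "medium",
        "high", "medium", "high", "low", "low"]

model_acc = accuracy(true, pred)
# For simplicity the baseline is fit on the same toy labels it is scored on.
baseline_acc = accuracy(true, majority_baseline(true, len(true)))
print(f"model: {model_acc:.0%}, majority baseline: {baseline_acc:.0%}")
```

With three possible labels, an accuracy figure is only meaningful relative to a baseline like this one, which is why the comparison is included.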
The researchers claim that the system needs only 150 articles to accurately determine whether a source website can be trusted. It could therefore help filter out fake news before it spreads too widely across the internet.
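One simple way to turn a sample of articles into a single source-level profile is to average per-article features over at most 150 articles. The sketch below assumes this averaging strategy and the helper name `source_profile` purely for illustration; the article does not say how the researchers aggregate articles.

```python
def source_profile(article_features: list, max_articles: int = 150) -> dict:
    """Average numeric per-article features over up to max_articles articles."""
    sample = article_features[:max_articles]
    if not sample:
        raise ValueError("at least one article is required")
    keys = sample[0].keys()
    return {key: sum(a[key] for a in sample) / len(sample) for key in keys}


# Two made-up articles with a single toy feature each.
articles = [{"hyperbole_rate": 0.1}, {"hyperbole_rate": 0.3}]
profile = source_profile(articles)
print(profile)
```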
The researchers are currently working to improve the system's accuracy and make it work in conjunction with conventional fact-checkers. If the system produces ‘weird or confusing’ outputs on a specific topic, manual fact-checking platforms could quickly review those results and decide how much weight to give to different perspectives.
The authors have also released an open-source dataset of almost one thousand news websites tagged with accuracy and bias scores. They plan to roll out mobile apps to help people step out of their political bubbles, and they will try to train the system to work with other languages as well. They also want to go beyond right/left bias and model other forms of bias that are more relevant to other regions.
Algorithms of this type could help people understand what bogus sites look like and the types of articles they tend to publish.