- Researchers have developed a new method, based on machine learning techniques, to identify fake users on social media platforms.
- The algorithm is based on the assumption that fake users tend to establish improbable links to other users.
- The results show that the algorithm can tell genuine users from malicious ones.
Identifying fake users has become a top priority for social networking companies, especially after Russia's targeted use of social sites to influence US elections and the companies' failure to safeguard user privacy.
Now, researchers at the University of Washington and Ben-Gurion University of the Negev have built a novel, generic unsupervised learning algorithm to locate fake users on social networking platforms like Twitter and Facebook.
The new algorithm is based on the assumption that fake users tend to create improbable links to other users in the network. The researchers incorporated a link prediction method into an anomaly detection model that doesn't need any prior knowledge of the graph.
How Does the Algorithm Work?
The researchers used the graph topology to build a novel generic method for detecting anomalous vertices in large, complex networks. The algorithm works in 2 key steps, both based on machine learning techniques.
- Create a link prediction classifier that estimates the probability of a link between two users.
- Create a new meta-feature set based on the features generated by the link prediction classifier.
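The two steps above can be sketched in a few lines. This is a minimal illustration, not the researchers' implementation: it assumes an undirected graph stored as a dict of neighbor sets, and it swaps the trained link prediction classifier for a plain Jaccard neighborhood-overlap score so that it runs with the standard library alone. The function names and the two meta-features (`mean_score`, `min_score`) are hypothetical and are not the seven features from the paper.

```python
from itertools import combinations
from statistics import mean


def build_graph(edges):
    """Store an undirected graph as {vertex: set of neighbors}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def jaccard_link_score(graph, u, v):
    """Step 1 stand-in: score a link by neighborhood overlap, in [0, 1].

    A trained classifier would go here; Jaccard is only an assumption
    made to keep the sketch dependency-free.
    """
    nu = graph[u] - {v}
    nv = graph[v] - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def vertex_meta_features(graph, u):
    """Step 2: aggregate the link scores of u's edges into per-vertex features."""
    scores = [jaccard_link_score(graph, u, v) for v in graph[u]]
    return {"mean_score": mean(scores), "min_score": min(scores)}


def rank_by_anomaly(graph):
    """Return vertices ordered from least to most probable links."""
    feats = {u: vertex_meta_features(graph, u) for u in graph if graph[u]}
    return sorted(feats, key=lambda u: feats[u]["mean_score"])


# Two tight groups of four users, plus "x", which links into both groups
# without sharing any neighbors with the users it connects to.
edges = list(combinations("abcd", 2)) + list(combinations("efgh", 2))
edges += [("x", "a"), ("x", "f")]
graph = build_graph(edges)
ranking = rank_by_anomaly(graph)
print(ranking[0])  # x
```

In this toy graph, every edge of `x` scores 0 because `x` shares no neighbors with `a` or `f`, so it ranks as the most anomalous vertex. An unsupervised detector in the spirit of the article would work on a richer meta-feature set than a single mean.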
The researchers proposed 7 new features that are expected to be effective predictors for detecting anomalies. To determine which of the new features have the most influence, they examined their importance using Weka's information gain attribute selection algorithm.
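Information gain measures how much knowing a feature reduces the uncertainty (entropy) about the class label. The sketch below computes it for a single numeric feature split at a fixed threshold. The threshold, feature values, and labels are made up for illustration; Weka discretizes numeric attributes in its own way, so this is a simplified picture of the idea rather than a reproduction of its algorithm.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the labels at values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (left, right) if part
    )
    return entropy(labels) - remainder


labels = ["fake", "fake", "real", "real"]
# Hypothetical feature that separates the classes perfectly at 0.5.
informative = [0.1, 0.2, 0.8, 0.9]
# Hypothetical feature that carries no information about the class.
uninformative = [0.1, 0.9, 0.1, 0.9]

gain_informative = information_gain(informative, labels, 0.5)      # 1.0 bit
gain_uninformative = information_gain(uninformative, labels, 0.5)  # 0.0 bits
```

Ranking features by this score puts the ones that best separate anomalous from normal vertices at the top.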
Reference: SpringerLink | doi:10.1007/s13278-018-0503-4 | BGU
They then conducted an extensive experimental evaluation on 3 types of complex networks: real-world networks with labeled anomalous vertices, real-world networks with simulated anomalous vertices, and fully simulated networks.
In total, they used 10 different networks, including Flixster, DBLP, Yelp, Academia.edu, ArXiv and Twitter.
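To picture what a simulated anomalous vertex might look like, the sketch below adds a vertex whose edges go to uniformly random existing vertices. The article does not spell out the exact simulation procedure, so the function name, the ring-shaped toy network, and the uniform sampling are all assumptions made for illustration.

```python
import random


def inject_random_vertex(graph, name, degree, rng):
    """Add vertex `name` with `degree` edges to uniformly random existing vertices.

    `graph` is an undirected graph stored as {vertex: set of neighbors}
    and is modified in place. Returns the list of chosen neighbors.
    """
    targets = rng.sample(sorted(graph), degree)
    graph[name] = set(targets)
    for t in targets:
        graph[t].add(name)
    return targets


rng = random.Random(42)  # fixed seed so the run is repeatable
# Toy network: a ring of 10 users, each linked to its two neighbors.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
chosen = inject_random_vertex(ring, "sim_0", 3, rng)
```

Because the injected vertex is known to be artificial, it can serve as a labeled positive when measuring how well a detector ranks anomalies.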
Red vertices are anomalous vertices, and red edges have the lowest probabilities of being fake.
The results show that the algorithm can detect both genuine and malicious users on real networks, including Twitter. It outperformed other anomaly detection techniques and, according to the developers, has potential for numerous applications, especially in the cybersecurity field.
What’s Next?
The developers plan to test the algorithm on other kinds of networks, such as weighted and bipartite graphs. They will also study what happens to the properties of the network when random edges and vertices are added.
They also intend to show how the same algorithm can be used to detect hijacked accounts on social platforms. It would also be interesting to see how large a Sybil attack would have to be before real and fake vertices can no longer be told apart.
For now, the researchers have published all code and data online, including the real-world datasets containing labeled fake IDs. Anyone can use them as an open framework to develop future vertex anomaly detection methods and compare outcomes.