Uncovering social spammers: social honeypots + machine learning

Kyumin Lee, James Caverlee, Steve Webb. DOI: 10.1145/1835449.1835522

Citations: 10
Web-based social systems enable new community-based opportunities for participants to engage, share, and interact. This community value and related services like search and advertising are threatened by spammers, content polluters, and malware disseminators. In an effort to preserve community value and ensure long-term success, we propose and evaluate a honeypot-based approach for uncovering social spammers in online social systems. Two of the key components of the proposed approach are: (1) the deployment of social honeypots for harvesting deceptive spam profiles from social networking communities; and (2) statistical analysis of the properties of these spam profiles for creating spam classifiers to actively filter out existing and new spammers. We describe the conceptual framework and design considerations of the proposed approach, and we present concrete observations from the deployment of social honeypots in MySpace and Twitter. We find that the deployed social honeypots identify social spammers with low false positive rates and that the harvested spam data contains signals that are strongly correlated with observable profile features (e.g., content, friend information, and posting patterns). Based on these profile features, we develop machine-learning-based classifiers for identifying previously unknown spammers with high precision and a low rate of false positives.
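As a rough illustration of how the two evaluation measures named in the abstract (precision and false positive rate) relate to a profile-feature classifier, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Profile` fields, the thresholds, the rule-based `predict_spammer` function, and the sample data are hypothetical and are not the feature set, classifier, or results from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Profile:
    # Hypothetical features; the paper's actual feature set is not reproduced here.
    url_ratio: float    # fraction of recent posts that contain a URL
    friend_count: int   # number of friends or followees
    is_spammer: bool    # ground-truth label (e.g., obtained via a honeypot)


def predict_spammer(p: Profile,
                    url_threshold: float = 0.8,
                    friend_threshold: int = 1000) -> bool:
    """Toy rule standing in for a trained classifier (thresholds are assumed)."""
    return p.url_ratio >= url_threshold and p.friend_count >= friend_threshold


def precision_and_fpr(profiles: Iterable[Profile]) -> Tuple[float, float]:
    """Return (precision, false positive rate) of predict_spammer on labeled data."""
    tp = fp = tn = 0
    for p in profiles:
        predicted = predict_spammer(p)
        if predicted and p.is_spammer:
            tp += 1
        elif predicted and not p.is_spammer:
            fp += 1
        elif not predicted and not p.is_spammer:
            tn += 1
        # False negatives do not enter either measure.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, fpr


# Made-up sample data, for demonstration only.
sample = [
    Profile(0.90, 2000, True),
    Profile(0.95, 1500, True),
    Profile(0.85, 1200, False),
    Profile(0.10, 50, False),
    Profile(0.20, 300, False),
    Profile(0.30, 5000, False),
    Profile(0.90, 100, True),
]

precision, fpr = precision_and_fpr(sample)
print(f"precision={precision:.3f}, false positive rate={fpr:.3f}")
```

In practice the threshold rule would be replaced by a classifier trained on the harvested profiles; the two measures are computed the same way regardless of which classifier produces the predictions.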
    • ...Though many techniques for controlling malicious users have been proposed, such as machine-learning based techniques [4], Sybil-defense schemes [5], and so on, large amounts of spam continue to plague the popular OSNs...
    • ...As reported in [4], spammers in Twitter repeatedly post tweets containing URLs of their affiliate web-sites, hence we use the number of repeated URLs in recent tweets to identify suspicious users...
    • ...Note that this feature is not a confirmatory test for spammers, since some promoters also repeatedly post the same URL to advertise their websites or services [4]...

    Saptarshi Ghosh et al. Spammers' networks within online social networks: a case-study on Twit...

    • ...Many anti-spam methods such as TrustRank [11], Bad-...
    • ...Corresponding algorithms include TrustRank [11] and BadRank [15], which need a seed set of Web pages with labels of trustiness or badness and propagate these labels through the link graph...

    Zhicong Cheng et al. Let web spammers expose themselves

    • ...One strategy is to detect spam users based on the relationships of the Twitter users [3][4], similar to the methods for discovering spam on other social network services [5][6][7]...

    Anqi Cui et al. Are the URLs really popular in microblog messages?

    • ...We filter out spammers, promoters, and other automated-script style Twitter accounts using features derived from Lee et al.’s work [14] on Twitter bot detection, so that the test set will consist of primarily "regular" Twitter users for whom location estimation would be most valuable...

    Zhiyuan Cheng et al. You are where you tweet: a content-based approach to geo-locating twit...

    • ...In addition to studying the social graph, recent work on social network spam uses machine learning to classify spam tweets [13], determine Twitter influence [2], and classify spam MySpace profiles [9]...

    Chris Grier et al. @spam: the underground on 140 characters or less
