A Learning Approach to Spam Detection based on Social Networks

Ho-Yu Lam, Dit-Yan Yeung

Citations: 7
The massive increase of spam is posing a very serious threat to email which has become an important means of communication. Not only does it annoy users, but it also consumes much of the bandwidth of the Internet. Most spam filters in existence are based on the content of email one way or the other. While these anti-spam tools have proven very useful, they do not prevent the bandwidth from being wasted and spammers are learning to bypass them via clever manipulation of the spam content. A very different approach to spam detection is based on the behavior of email senders. In this paper, we propose a learning approach to spam sender detection based on features extracted from social networks constructed from email exchange logs. Legitimacy scores are assigned to senders based on their likelihood of being a legitimate sender. Moreover, we also explore various spam filtering and resisting possibilities.
    • ...Other work has also attempted to group senders based on recipient [13,19,24]...

    Nick Feamster. Outsourcing home network security

    • ...SNARE bears some similarity to other approaches that classify senders based on network-level behavior [12,21, 24,27,34], but these approaches rely on inspecting the message contents, gathering information across a large number of recipients, or both...
    • ...Various previous work has also attempted to cluster email senders according to groups of recipients, often with an eye towards spam filtering [21,24,27], which is similar in spirit to SNARE’s geodesic distance feature; however, these previous techniques typically require analysis of message contents, across a large number of recipients, or both, whereas SNARE can operate on more lightweight features...

    Shuang Hao et al. Detecting Spammers with SNARE: Spatio-temporal Network-level Automatic...

    • ...Recent work [12],[13] has used features for the senders extracted from social networks in supervised learning or in creating white/blacklists...

    Yu-Sung Wu et al. Spam detection in voice-over-IP calls through semi-supervised clusteri...

    • ...The other group attempts to exploit non-content information such as email header, email traffic [5], and email social network [2][4][9][10][13] to filter spams...
    • ...In [13], certain network related features are extracted to characterize each user...
    • ...Based on this observation, both communication reciprocity (CR, first proposed in [10]) and communication interactive average (CIA, first proposed in [13]) are presented to capture this difference...

    Chi-yao Tseng et al. Incremental SVM Model for Spam Detection on Dynamic Email Social Netwo...

    • ...To tackle the problem of email spam many methods are presented, some of which are being used [6-10]...
    • ...[9] Authors in this paper propose a machine learning anti-spam method that studies the behaviour of an email server in order to distinguish legitimate user behaviour from spam user behaviour...
    • ...Phonetic String Matching [6] Pro Mail [7] Zombie Based [8] SMTP Logs Mining [9] Honey Spam [10] Memory-based [12] Keyword-based [13] Genetic programming [14] Bayesian-based [11] Black list [4]...
    • ...Another similar approach presented in [9] uses SMTP logs to analyse user profiles...
    • ...Five of those can be used in detecting spam that is not in English language, like [4, 7-10]...
    • ...Compared to content based approaches, meta-data based methods create user profiles to combat spam and are applicable to non-English language [7-9] except [4], which uses blacklisting approach to prevent spam...
    • ...However metadata based methods [7-9] can detect spam if they have spammers’ profile in their training data set...
    • ...Apart from this, techniques like Promail and SMTP Log [7, 9] are creating user profiles to detect spam, but based upon user profile interconnection it might be difficult to implement them in public email services...
    • ...Finally we also found that most of methods use supervised learning which require manual intervention and can be a maintenance issue compared to other 3 unsupervised methods [7-9]...

    Pedram Hayati et al. Evaluation of spam detection and prevention frameworks for email and i...
