By Pruthivi Raj Behera, Shreya Goel and Roshan S

Email is a hot target for spam.

As we all know, spam classification is a classic problem in the machine learning domain. Although it sounds easy, classifying messages as spam or legitimate poses a considerable challenge for machine learning beginners. Some of the challenges we faced include:

  • Length of messages and emails: An email averages 100–150 words, but some extend into thousands of words (for example, one training instance had a length of 8,300 words!). …
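To get a feel for how uneven message lengths can be, a quick word-count summary of the corpus helps. The sketch below is only an illustration: the `toy_messages` list and the helper names `word_count` and `length_summary` are hypothetical and not taken from our dataset or pipeline, and whitespace splitting is assumed as a rough stand-in for tokenization.

```python
from statistics import mean, median


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (a rough proxy for words)."""
    return len(text.split())


def length_summary(messages):
    """Return min, median, mean and max word counts for a list of messages."""
    counts = [word_count(m) for m in messages]
    return {
        "min": min(counts),
        "median": median(counts),
        "mean": mean(counts),
        "max": max(counts),
    }


# Hypothetical toy corpus: two short messages and one very long outlier.
toy_messages = [
    "win a free prize now",
    "meeting moved to 3 pm tomorrow",
    "word " * 8300,  # stands in for an unusually long email
]

summary = length_summary(toy_messages)
print(summary)
```

A large gap between the median and the maximum, as in this toy corpus, is the kind of skew that makes a single fixed input length awkward to choose.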

Pruthivi Raj Behera

Master's Student at IIIT Delhi.

