Concrete5, Advanced Comments and spammers


Over the last few months I've noticed a marked increase in spam comment posts. During that time I've periodically gone through and deleted the spam so the quality of my site doesn't suffer. As I've been very busy the last couple of months, I haven't had time to post any of the cool new stuff I'm working on, let alone analyze where the spam was coming from and figure out how to combat it.

During the last few days I've had more time to dedicate to solving the comment-spam problem. Along the way I learned something interesting about the Advanced Comments Concrete5 add-on in addition to spotting the spam source.

A bit of History:

I first noticed comment spam popping up on my blog about 18 months ago. Each BW article had hundreds of posts of dubious grammatical quality and I needed to find a way to stop this. The default Concrete5 captcha system was failing me, so I installed Advanced Comments, a free Concrete5 add-on that allowed me to use reCAPTCHA to guard my posts.

reCAPTCHA seemed to do the trick up until a couple of months ago. At that point I noticed a TON of spam comment posts that took a while to clean up. It's been a storm ever since then as I've had to deal with the junk comments. It looks like spammers have found a way past reCAPTCHA, and probably did some time ago. I guess my website only recently became a target, or I'd probably have noticed junk comments sooner.

At first I tried disabling comments on articles that didn't have any comments (to minimize entry points). Then I noticed that you can set an automatic 'Comments Closed' timeout using Advanced Comments. Once I activated that feature I was sure my spam problems were over, since most of the spam is concentrated on a dozen articles that are six months old or older.

 

Additional Legwork:

Unfortunately, I learned that while the web page may say 'Comments Closed', the spammers can still get through. Their tools don't use screen-scraping: instead, they make API calls directly, and the server still accepts those comments even though the 'Comments Closed' window has passed. So be warned: just because you and your users can't post additional comments on articles / blog posts through the page, someone else may be able to if your comment system doesn't also block this at the API level.

My next step in troubleshooting was to open up Firefox and use Firebug to see what happens when I post a comment. I found that an HTTP POST request was sent to the server whenever I 'posted' a comment. I also noticed that just about everything else (like browsing) uses HTTP GET requests. Armed with this knowledge I went to the command line and used this command to filter the httpd logs down to just the entries that involved HTTP POST requests:

grep '"POST ' /var/log/httpd/websiteName.net-access_log > PostRequests.txt

I then used vi to look through the log and see if there were any common IP addresses. While there were a few duplicates, what really caught my attention was the fact that most of the spam posts came from 96.47.225.*
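For anyone who would rather not eyeball the log in vi, here is a minimal sketch that tallies POST requests per /24 prefix. It assumes the Apache common or combined log format (client IP in field 1, the quoted request method in field 6) and IPv4 clients. The function name post_sources is my own invention. It matches on the method field rather than on the string POST anywhere in the line, so a GET for a URL that merely contains "POST" is not counted.

```shell
# Tally POST requests per /24 prefix in an Apache access log.
# Assumes common/combined log format: field 1 is the client IP,
# field 6 is the opening quote plus the request method.
post_sources() {
    awk '$6 == "\"POST" {
             split($1, o, ".")                       # break the IPv4 address into octets
             count[o[1] "." o[2] "." o[3] ".*"]++    # group by the first three octets
         }
         END { for (p in count) print count[p], p }' "$1" |
        sort -rn                                     # busiest prefixes first
}
```

Running something like post_sources /var/log/httpd/websiteName.net-access_log | head would list the busiest prefixes at the top, which should make a cluster of addresses from one network stand out quickly.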

I performed an ARIN.net IP address search and found that the majority of this spam traffic originates from IPTelligent, LLC, which is a Florida-based data center. Judging by the first five Google results for IPTelligent, LLC, it looks like other people have had the same problem:

[Screenshot: SpamTelligent.png]

 

Armed with this information I was able to blacklist the IPTelligent, LLC subnet without worrying about blocking any legitimate users. I went into Concrete5's Dashboard -> System Settings -> IP BlackList page and entered this to prevent those servers from posting spam:

96.47.224.*
96.47.225.*
96.47.226.*
96.47.227.*
96.47.228.*
96.47.229.*
96.47.230.*
96.47.231.*
96.47.232.*
96.47.233.*
96.47.234.*
96.47.235.*
96.47.236.*
96.47.237.*
96.47.238.*
96.47.239.*

 

No comment spam since blocking these ranges. I hope it lasts! :)

Follow-up: It looks like blocking those IP ranges in Concrete5 only blocks people from logging into my site, not from leaving comments.

I'm now adding iptables firewall rules to block ranges like this instead. Hope it works:

-A INPUT -s 96.47.224.0/20 -j DROP
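As a sanity check on that rule, the sketch below tests whether an address falls inside a CIDR block. The block 96.47.224.0/20 spans 96.47.224.0 through 96.47.239.255, which is the same sixteen ranges as the wildcard list, because 224 through 239 share the top four bits of the third octet. The helper names ip_to_int and in_cidr are mine and are not part of iptables. The code assumes plain dotted-quad IPv4 addresses without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1                 # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 falls inside CIDR block $2 (e.g. 96.47.224.0/20).
# The body runs in a subshell so its variables do not leak.
in_cidr() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")        # network part, before the slash
    bits=${2#*/}                      # prefix length, after the slash
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
)
```

For example, in_cidr 96.47.225.10 96.47.224.0/20 succeeds, while in_cidr 96.47.240.1 96.47.224.0/20 fails, so the one rule covers the ranges I listed by hand and nothing on either side of them.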

 

Follow-up 2 (July 2013): Blocking IP addresses gets old FAST. There are always more coming online. After getting hundreds of spam postings every night for a few nights in a row I was pushed over the edge and replaced my captcha system with Are you a Human?, which is a MUCH better captcha service and has reduced my spam comment volume to zero.