Scott on Writing

Musings on technical writing...

Comment Spam - Scripts or Brute Force?

I've always assumed that comment spammers are using scripts to spread their evil, evil comment spam.  My assumption is based on the following:

  1. Brute force comment spamming - actually visiting the site and entering a comment in by hand - is slow and inefficient.
  2. I personally know many bloggers who use CAPTCHAs on their site but leave commentAPI wide open, and their comment spam has plummeted to near zero.  CAPTCHAs, though, are no biggie if you are brute forcing the comment spam entry, so if CAPTCHAs are stopping people it must be because of screen-scraping type scripts.  (However, you'd think that it wouldn't be long before the bad guys smartened up and started using commentAPI to inject their spam.)

However, I am certain that a sizable percentage of comment spam is injected through brute force means: some poor slob taking time out of his life to visit a blog and post a comment in the hopes of improving his site's PageRank.  And some of these comment spams are getting more clever, addressing other comments so as to appear valid while hiding the spammy URL in the author's name portion.  For example, today a comment was added to my last blog entry by a Mr. Stephen Bauer, MD, who happens to be a noted Asperger's specialist.  Why he was commenting on my blog, I'm not sure, but his comment was definitely on topic.  He said:

I agree with "haacked". This topic cannot be stressed enough in today everchanging, fast-moving times. Andrew was dead-on with his MSDN example. That has hit me many times with them. Other culprits are the various "ASP" websites out there that change their URLs.

Keep it real. Err, keep it the same!

The problem (other than the fact that the name being used is clearly a fake)?  The URL linked to from Stephen Bauer, MD points to a linkfarm site.  This is an example of comment spam.  In fact, I'd wager the last line - “Keep it real.  Err, keep it the same!” - is a marker of sorts that this spammer can search for at a later date to see if I allowed such comment spam entries to exist on ScottOnWriting.NET.

I detest comment spammers more so than email spammers.  Sure, the volume with email spam is astronomically higher since the spammers have perfected their email spamming trade, but by the same token, the anti-spam tools for email have caught up as well - SpamBayes automatically keeps several thousand spam emails per month out of my Inbox.  The comment spam will get worse as the spammers perfect their trade, I'm sure, but hopefully we'll see a similar rise in comment spam-fighting tools.

posted on Thursday, April 14, 2005 11:49 AM

Feedback

# re: Comment Spam - Scripts or Brute Force? 4/14/2005 12:16 PM Ben Strackany

Heh, Scott, I hope you're not calling me a poor slob. :) I've been following your comment spam posts for a while & adding occasional comments.

I usually add my home page URL when I comment, which hosts my blog & some other pages. But it's not a trackback, which makes me wonder (from a philosophical POV mostly) where the line is between salient comments & comment spammers? Obviously if I use a fake name and/or link to a link farm, that's pretty spammy, but what about what I'm doing right now?

# re: Comment Spam - Scripts or Brute Force? 4/14/2005 12:19 PM Scott Mitchell

Ben, I don't mind people linking back to a homepage in a comment. If I did, there wouldn't be a URL field in the comment portion. But basically adding a comment that doesn't add to the quality of the discussion (as the comment spammer did - his post appeared as if he read the discussion, but added nothing) AND if it points to a site that's clearly there to either (a) scam the search engines or (b) make money selling assorted, all natural, male enhancement pills and creams... well, then I'm going to delete the comment.

# Re: Comment Spam - Scripts or Brute Force? 4/14/2005 1:41 PM haacked@gmail.com (Haacked)

I've noticed something similar with people who will post something short like:

"Good!"

An appeal to my vanity almost worked, but after deleting it, I kept getting the comment over and over again. "Good!" "Good!" "Good!"...
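
The repeated "Good!" pattern described above is the kind of thing a script can flag fairly cheaply. What follows is only a minimal sketch in Python, not anything a particular blog engine ships with: `normalize`, `RepeatTracker`, and the `max_repeats` threshold are hypothetical names and values chosen for illustration. It normalizes each comment body and counts how often the same text has shown up.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so that
    'Good!' and ' good ' are treated as the same comment body."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


class RepeatTracker:
    """Counts normalized comment bodies (hypothetical helper).

    A body seen more than `max_repeats` times is reported as suspicious.
    The default threshold is an arbitrary assumption, not a tuned value.
    """

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def is_suspicious(self, body: str) -> bool:
        key = normalize(body)
        self.seen[key] += 1
        return self.seen[key] > self.max_repeats


tracker = RepeatTracker(max_repeats=2)
incoming = ["Good!", "good", "Good!!", "Nice post about CAPTCHAs"]
flags = [tracker.is_suspicious(body) for body in incoming]
```

A counter like this only catches verbatim or near-verbatim repeats; a spammer who varies the wording slightly would slip past it, so it is best thought of as one cheap signal among several.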

# re: Comment Spam - Scripts or Brute Force? 4/15/2005 5:59 AM Richard Dudley

Scott,

There's a little of both available. On my DNJ blog, I would receive a lot of comment spam, and started investigating it a little. The category link is http://dotnetjunkies.com/WebLog/richard.dudley/category/1306.aspx. I found a software tool called Reffy (http://dotnetjunkies.com/WebLog/richard.dudley/archive/2004/09/11/25263.aspx) that automates blog spamming.

Also, I haven't kept these messages, but I have received solicitations from services that pay people (I'm guessing 3rd world at pennies a day) to visit blogs and post comments to get the links back to your site. I'm assuming that's why the comments are usually very short or unintelligible if in English. These services actually brag about their results. Next time I get one, I'll post it up.

# re: Comment Spam - Scripts or Brute Force? 9/28/2005 11:46 PM disha gupta

give me the samples of comment spams at rargdgak@yahoo.com if you want to get rid of the comment spams as i m working on a comment spam fighting tool

# re: Comment Spam - Scripts or Brute Force? 12/16/2006 3:52 PM Jeff Atwood

I think you may be exaggerating the impact of human-entered spam. One example does not make a pattern. In the last two years, I've seen maybe 6 of these on my site, total.

You can always complement CAPTCHA, which stops the 99.99% of comment spam that is automated, with something like Akismet, which is basically a distributed blacklist. That would catch any common URLs entered by human beings.
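
To make the blacklist idea concrete, here is a rough sketch in Python of checking a commenter's URL against a list of known spam domains. This is not Akismet's API or algorithm: the `BLOCKED_DOMAINS` set, the `.example` domains in it, and the helper names `domain_of` and `is_blocked` are all assumptions made up for illustration. A shared service would maintain and update such a list across many sites rather than hard-coding it.

```python
from urllib.parse import urlparse

# Hypothetical blocklist for illustration only.
BLOCKED_DOMAINS = {"linkfarm.example", "pills.example"}


def domain_of(url: str) -> str:
    """Return the lowercase host of a URL, without a leading 'www.'.

    Commenters often type a bare host name, so a scheme is added
    when one is missing before parsing.
    """
    if "://" not in url:
        url = "http://" + url
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_blocked(author_url: str, blocked=BLOCKED_DOMAINS) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = domain_of(author_url)
    return any(host == d or host.endswith("." + d) for d in blocked)


print(is_blocked("http://www.linkfarm.example/cheap-stuff"))
print(is_blocked("http://scottonwriting.net/"))
```

Because the check looks only at the URL field, it would apply equally to a comment typed in by hand and one posted by a script, which is the point of pairing it with a CAPTCHA.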


I am a Microsoft MVP for ASP.NET.
I am an ASPInsider.


