Spam-Tember

Those of you who are bloggers, I don’t know how much you look through your Spam Comments list. I delete spam without looking at it too much. But you must go to the list to click the button, so you can’t avoid seeing some of it. Sometimes there’s a new twist on the basic trick: “I’m a real comment! No, really, I am!! Please let me through!!!”

But most of it becomes familiar in a short time. You see the same comments vaguely praising your post without actually saying anything about it. Some of it makes you chuckle a little; some of it makes you despair. It’s a kind of constant background noise.

Then last September it seemed like there was a lot more spam than usual.

Now I can look back and see that, sure enough, there was a lot more spam in September. And it wasn’t a sign of rising spam, but of a “bumper crop” month (it was way down in October):

spam by month (7 months)

Spam counts for the last six months (plus so far this November).

A key question is whether this happened to everyone or just me (or some subset of everyone). I posted a lot in October, but not much in the months prior.

Did some new spammer organization come online and flood the web before the filters caught on? What’s interesting above is that October is also lower than June, July or August.

When you look at a year’s worth, September stands out even more:

spam by month (1 year)

One year’s worth of spam. A bumper crop in September!

There’s been a lot more spam this year, at least on this blog (I’m really curious what changed):

spam by year

Nearly four times as much spam in 2014 compared to 2013.

Almost 200,000 spam comments this year. I’m glad I don’t have to wade through comments sorting it all out myself!

I’ve come to trust the WordPress Spam Filter. I no longer worry about false positives — I checked for a long time and never found a legitimate comment in the Spam section. In over three years, no one has emailed me to complain I didn’t authorize their genuine comment.

The occasional false negative does show up, but often by the time I show up to moderate it, WP has already bagged and tagged it. Only rarely do I ever have to consign a comment to the spam heap (and thence to the trash heap).

So bravo to WordPress for keeping the spam canned!

I haven’t written about spam so far this year. [If you like, you can read the 2012 rant or the 2013 rant.] As I said in the lead, it becomes a kind of mostly similar background noise. Yet a few spam comments have caught my eye.

A while back there was a rash of spam that all followed a similar pattern:

Gosh, I wish I would have had that inoromatifn earlier!
I’m not quite sure how to say this; you made it exmterely easy for me!
Pin my tail and call me a doeykn, that really helped.
Thanks for sharing. Your post is a useful conuirbttion.
I’m imsepserd you should think of something like that.
Thanks for sttniarg the ball rolling with this insight.

What’s going on here is that spam comments need to be unique in hopes of slipping by filters.

You may have noticed how, if you accidentally post the same comment twice, WordPress rejects the comment, telling you that you appear to have already said that. That’s just a first line of spam defense.
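To show why a duplicate check alone is so easy to dodge, here is a minimal sketch of one. It is not WordPress's actual code: the class name `DuplicateFilter` and the normalization it applies (lowercasing, collapsing whitespace) are assumptions made up for this example.

```python
import hashlib


class DuplicateFilter:
    """Hypothetical exact-duplicate check; not how WordPress implements it."""

    def __init__(self):
        self._seen = set()  # fingerprints of comments already accepted

    @staticmethod
    def _fingerprint(text):
        # Assumed normalization: ignore case and runs of whitespace.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def accept(self, text):
        """Return True for a new comment, False for a repeat."""
        fp = self._fingerprint(text)
        if fp in self._seen:
            return False
        self._seen.add(fp)
        return True


f = DuplicateFilter()
first = f.accept("Thanks for sharing. Your post is a useful contribution.")
repeat = f.accept("Thanks for sharing.  Your post is a useful CONTRIBUTION.")
variant = f.accept("Thanks for sharing. Your post is a useful conuirbttion.")
```

Here `first` and `variant` are accepted while `repeat` is rejected. A single scrambled word is enough to produce a different fingerprint.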

The spam engine above intentionally misspells a word (differently each time) in order to create multiple comments from a single line of template text. The hope is that a short comment seems more typical of comments today. (Plus, the longer any spam comment is, the more likely it will “blow its cover” for not reflecting the post’s content.)
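I can only guess at how the engine actually works, but the samples above look like one long word with its letters shuffled after the first. A minimal sketch under that assumption follows; the function names, the six-letter minimum, and the fixed seed are all invented for illustration.

```python
import random
import re


def scramble(word, rng):
    """Shuffle every letter after the first; the set of letters is unchanged."""
    tail = list(word[1:])
    rng.shuffle(tail)
    return word[0] + "".join(tail)


def misspell_one(template, rng, min_len=6):
    """Return the template with one randomly chosen long word scrambled."""
    matches = list(re.finditer(r"[A-Za-z]{%d,}" % min_len, template))
    if not matches:
        return template  # nothing long enough to scramble
    m = rng.choice(matches)
    return template[:m.start()] + scramble(m.group(), rng) + template[m.end():]


rng = random.Random(2014)  # seeded only so the demo is repeatable
template = "Thanks for sharing. Your post is a useful contribution."
variants = {misspell_one(template, rng) for _ in range(20)}
```

Every entry in `variants` uses exactly the same characters as the template, just rearranged inside one word, so one line of boilerplate becomes many "different" comments.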

Some very cleverly quote the post’s title into the boilerplate, but still never mention any details about the post itself. Others post irrelevant, obviously random, text. These are all attempts to create “new” and seemingly relevant messages that escape filtering.

Speaking of “blowing its cover,” perhaps you’ve noticed a massive (384-line) spam comment that looks decidedly weird:

{
{I have|I’ve} been {surfing|browsing} online more than {three|3|2|4} hours today, yet I never found any interesting article like yours.
(((381 more lines)))

Some spammer hasn’t set up their engine properly or has defective code, because what’s being sent is the entire template! Which basically looks something like this:

{ comment#1 | comment#2 | comment#3 |}

Each comment looks something like this:

text text text {A|B|C} text text {D|E|F} text text {X|Y|Z} text text text

The basic idea is that the engine, every time it comes to a set of braces, picks one of the members inside. First it picks the comment, then inside the comment it picks various words to use. The fake example above generates 27 distinct comments:

text text text A text text D text text X text text text
text text text A text text D text text Y text text text
text text text A text text D text text Z text text text
text text text A text text E text text X text text text
23 more combinations
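The pick-one-per-brace idea can be sketched in a few lines. This is my own illustration, not the spammer's code, and it only handles flat templates (no braces inside braces), so the outer comment-picking layer is left out.

```python
import itertools
import re

# A group is a brace pair with no nested braces inside, e.g. {A|B|C}.
GROUP = re.compile(r"\{([^{}]*)\}")


def expand(template):
    """Return every comment a flat (non-nested) brace template can produce."""
    parts = GROUP.split(template)
    # re.split with one capture group alternates: literal, choices, literal, ...
    slots = [
        part.split("|") if i % 2 else [part]
        for i, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*slots)]


template = "text text text {A|B|C} text text {D|E|F} text text {X|Y|Z} text text text"
comments = expand(template)
```

Here `len(comments)` is 27, and the list starts with the A/D/X, A/D/Y, A/D/Z and A/E/X lines shown above. A real engine would presumably pick one combination at random rather than list them all.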

That 384-line template has 61 distinct comments in it, but because of the word substitutions, it can generate tons of individual comments. The first comment template — the first part of which you see quoted above — can generate 2304 distinct comments!
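Counting the variants doesn't require generating them: multiply the number of options in each brace group. The sketch below does that for the one line of the template quoted earlier. The helper name `count_variants` is mine, and it assumes flat (non-nested) groups.

```python
import math
import re

# A group is a brace pair with no nested braces inside.
GROUP = re.compile(r"\{([^{}]*)\}")


def count_variants(template):
    """Count outputs of a flat brace template without generating them."""
    return math.prod(len(group.split("|")) for group in GROUP.findall(template))


fragment = (
    "{I have|I’ve} been {surfing|browsing} online more than {three|3|2|4} "
    "hours today, yet I never found any interesting article like yours."
)
visible = count_variants(fragment)
```

The quoted line alone gives 2 × 2 × 4 = 16 variants. Since 2304 = 16 × 144, the rest of that first comment (not shown here) would have to supply the remaining factor of 144.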

Ah, the tricks they try. I saw some using letter substitution, using Unicode characters that resemble certain letters (e.g. “tɦat theƴ”) to try to create “new” comments.
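A filter could undo this trick by folding look-alike characters back to plain letters before comparing comments. This is a rough sketch, not how any real filter is known to work: the two-entry `LOOKALIKES` table is hand-made from the example above (a real table would be far larger), and both function names are hypothetical.

```python
# Tiny hand-made table for illustration only; a real one would be far larger.
LOOKALIKES = {
    "ɦ": "h",  # hooked h from the example above
    "ƴ": "y",  # hooked y from the example above
}


def fold_lookalikes(text):
    """Replace known look-alike characters with their plain ASCII letters."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)


def has_suspicious_word(text):
    """Flag words that mix ASCII letters with non-ASCII letters."""
    for word in text.split():
        letters = [ch for ch in word if ch.isalpha()]
        has_ascii = any(ch.isascii() for ch in letters)
        has_other = any(not ch.isascii() for ch in letters)
        if has_ascii and has_other:
            return True
    return False


folded = fold_lookalikes("tɦat theƴ")
flagged = has_suspicious_word("tɦat theƴ")
clean = has_suspicious_word("that they")
```

The mixed-alphabet check is crude: it would also flag legitimate words with accented letters, so it is a hint at best, not a verdict.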

Asking questions is a common trick. Many spammers have asked whether they need to learn HTML to blog, whether blogging requires a lot of work, or how I got my theme. I especially appreciate the irony of:

Do you have a spam problem on this website; I also am a blogger,…

One that very nearly slipped through, which I very nearly approved as a valid comment, went like this:

We’re a group of volunteers and starting a new scheme in our community. Your site provided us with valuable info to work on. You have done an impressive job and our entire community will be grateful to you.

Except that I saw the same comment, or ones just like it, on other pages and on my other blogs!

I’ll leave you with two more that caught my fancy. This one almost seemed legit at first:

Pleased meet up with you! I am Preston and my wife doesn’t take pleasure in at several. My family lives in Nebraska. Taking good care of animals is his profession but he’s always wanted his own home based business. What he really enjoys doing is golf and he’ll be starting something else along with them. I’m not competent at webdesign anyone might want to check my website: Mia Parking

Which appears to be Miami Parking or something. Why they’d even need to spam is beyond me, but whatever. But I was impressed by Preston and his wife and their strange story.

And finally, this cute little gem:

It’s going to be ending of mine day, but before end I am reading this enormous piece of writing to increase my know-how.

Enormous!

And — as usual — it will be fun to see what spam this post attracts! 😀

About Wyrd Smythe

The canonical fool on the hill watching the sunset and the rotation of the planet and thinking what he imagines are large thoughts.

20 responses to “Spam-Tember”

  • Hariod Brawn

    I don’t seem to be able to see those spam graphs that you display here W.S. – is there something I’m overlooking can you tell me?

    • Wyrd Smythe

      Hey Hariod. Go to your Dashboard and look in the upper left for “Akismet Stats”.

      • Hariod Brawn

        Duh! What an idiot – so obvious! Anyway, I only started in April this year, but from May through to August inclusive I averaged 1,285 spam p.c.m. September jumped to 3,764. October 11,293. First 12 days of this month 7,258.

        I’m pretty oblivious to spam really, other than that I empty the folder every day – and I never check what’s in there. Is there any reason why I should pay attention either to the levels or to the content of spam do you think?

      • Wyrd Smythe

        So you saw a big jump in September and a bigger one in October. At the rate November is going, it could be bigger still. Apparently “they” have found you, my friend. It just took them a few months to notice you. My average last year was just over 4,000 — in 2014 it’s at 12,500 YTD. (In 2012 it was well under a grand.)

        I don’t know that the levels are meaningful other than high volume means more spam. There are reports of false positives (non-spam comments tagged as spam), so some check their spam folder. I did for a long time, never saw anything, and now I tend to just delete the spam. Hopefully I haven’t deleted any earnest comments. No one has ever alerted me to having done so.

      • Hariod Brawn

        By the way, I have noticed that almost all of my spam – say, 95% – is responding to an image as displayed in my gallery and not to an article of text. That is totally the opposite of what happens with bona fide comments wherein very few comments are made in the gallery.

      • Wyrd Smythe

        You could try removing that image, even if just for a while. I’ve had specific posts attract large volumes of spam, so I’ve disabled comments on those posts for a while. When I’ve re-enabled comments, the volume doesn’t seem to resume on that post.

      • Hariod Brawn

        “You could try removing that image” The thing is, it’s loads of different images that attract the spam, not a few in particular. But tell me W.S. – why is it a bother in any case if Akismet catches everything (my accuracy rate is 99.98%)?

      • Wyrd Smythe

        Ah, I understand. No, not much you can do, but — as you say — it’s really not a problem.

      • Hariod Brawn

        Excellent – you had me worried there for a moment; especially as SAP seemed concerned about it too.

  • SelfAwarePatterns

    I chopped my spam volume (as well as drive by trolling) dramatically by cutting off comments on posts over 30 days old. I seldom had a valid comment on that older material, and it cut the spam tab down from hundreds every time I looked at it to a few dozen.

    I’ve seen a handful of valid comments end up in spam. The affected commenter almost always let me know and I was able to quickly fish their post out. But I have found a couple of my regulars’ comments there before.

    That said, it’s rare enough that if I let the spam tab grow too large, I don’t hesitate to just empty it.

    • Wyrd Smythe

      Heh, I noticed you shut off comments on older posts. I understand the logic. Only thing is, when I discover a new blog, I often poke around in the back files for interesting posts. And, being me, I usually want to comment. For example, I wanted to comment on your Ender’s Game post (and one other — can’t recall what it was right now).

      Absolutely right about the older posts attracting spam, though! On occasion I’ve had one post that seemed like honey to spam flies. I shut off comments for a while and then re-enabled them, and the high volume of spam on that particular post didn’t resume. It would be interesting to know how they pick their targets.

      • SelfAwarePatterns

        I’ve toyed with the idea of keeping comments on for older posts, but requiring authentication. I’m hesitant because I do have some commenters who don’t authenticate, and I’d hate to lose them. But maybe requiring that they have a WordPress, Twitter, Facebook, or Google+ account isn’t much of a burden.

        But it seems like spam would cease to be an issue with that arrangement.

      • Wyrd Smythe

        It’s probably not a big deal. I’m likely an outlier (seems like I almost always am). Have you received other comments about closed past posts?

      • SelfAwarePatterns

        I have actually, although it’s definitely not an avalanche. You’re the third person to say something in the months since I cut off older posts. And the idea of completely eliminating spam has some appeal.

      • Wyrd Smythe

        Does it have to be 30 days? Maybe 60 (or even 90) gives people a better chance to catch up. I go through long periods where I don’t do much blogging, and periods where I don’t do much blog reading (and a lot goes unvisited). Then I binge for a period to catch up. (But again: Outlier, I)

        Still, new people do come along from time to time, and they may well follow a search link to an older post. I guess it’s a question of how likely they are to comment and how critical it is to allow it versus how big a pain spam is or how big a pain authorization is. I can say that the older a post is, the less taken aback I’d be about it being closed.

        [shrug] I dunno. Spam hasn’t risen to the pain level for me, so my opinion is pretty sideline.

  • Doobster418

    My two biggest spam months this year were June, with about 17,500 and October, with more than 26,000. September saw only about 13,000. But it appears that November will be a record breaker. It’s only the 12th of the month, and I’ve already got 17,355. At this rate I’ll hit close to 40,000 by the end of this month. I do try to clean out my spam every day because if I let it go for two or three days, several hundred will accumulate there.

    And, believe it or not, I do remove them one at a time because some of them can be quite humorous.

    • Wyrd Smythe

      Interesting. June was the first high month this year for me (about the same count as you), but then July and August both topped it (about 23K each). September was my whopper — yours was October. And your November may look like my September.

      I suppose as they go through their lists of blogs we each get our turn being shot at.

      It is fun to read them sometimes (although my eyes glaze after a while). What’s interesting is to see how their attempts evolve over time.

  • dianasschwenk

    I also get spammers trying to sell me stuff. I have found a handful of legitimate comments in my spam, some from people who have commented numerous times, so I still check my spam folder. 🙂
    Diana xo

  • Wyrd Smythe

    Ha! Just deleted two spam comments on this ancient post that’s about how lame spammers are. Funny!
