Jack Vinson, who has a great blog, posted a comment to my blog on comment spam. Dr. Vinson wrote:
Lumpy- The name change thing is a good start. I wouldn't mind more details on what you did to rearrange the order in the templates.
I assume with the plug-ins you will look into MT-Blacklist and the new SpamLookup. I've been using MTB for a while, and it seems to catch the majority of the garbage that is thrown my way by the robots. Its statistics on my site say that 9700 things have been blocked, and 300 have been moderated since earlier this year. Of those 300, most have been spam as well. So it isn't perfect.
SpamLookup is newer, and people have been saying good things about it.
(Lumpy converted his URLs to hyperlinks)
I sent him a hasty reply and informed him that I would write more on it soon. Hence today's post. First, more detail on what I initially did to my comment template and index template to deal with this issue.
The first thing I did was rename my comment script and trackback script. This effectively blocks some spambots. Unfortunately, a well-written bot will get by this. Nonetheless, it is a start and a quick fix.
To do so, all you need to do is change the name of "mt-comments.cgi" to "somethingelse.cgi" and update the "mt.cfg" file to match. (Of course, you can rename it to anything you wish.)
The configuration file is installed in the same directory as the comment script. You should look for a line that reads:
CommentScript mt-comments.cgi
Change that line to:
CommentScript somethingelse.cgi
You would do the same for the trackback script; of course, it needs its own unique name.
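The configuration edit above can also be scripted. What follows is only a sketch in Python: the function name `rename_script` is hypothetical, and it works on the text of the configuration file, assuming each setting sits on its own line as a directive followed by a value, as in the lines shown above.

```python
def rename_script(config_text, directive, new_name):
    """Return config_text with the value of `directive` replaced by new_name.

    Assumes each setting is a line of the form "Directive value".
    Any line that does not match that shape is left untouched.
    """
    out = []
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == directive:
            line = f"{directive} {new_name}"
        out.append(line)
    return "\n".join(out)


sample = "CommentScript mt-comments.cgi"
print(rename_script(sample, "CommentScript", "somethingelse.cgi"))
```

This only changes the configuration text. The script file itself still has to be renamed on the server to the same new name.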
Next, I moved on to the layout of my index file. I did the following:
- Change the order of what would normally be "post a comment" and "trackback". However, this will only fool dumb bots.
- Do the same thing on your actual comment templates.
- I also did something else. I later changed it back once I was having success with the plug-ins, but I have since altered my preview template to demonstrate to my readers what I did. (Dr. Vinson could not see what I did because I had changed it back.) Bots target certain items. The text in your template is, more often than not, not one of them. Humans read the text. I tried to take advantage of this. I swapped the labels on the "post" and "preview" buttons, so that each button did exactly the opposite of what a bot would expect, and I informed the humans of this in my text. I suspect some bots went in circles, for my preview script was called over 1200 times.
- Lastly, I added "dummy" fields. I added a field to each of my forms which does nothing. This will also confuse bots but not humans.
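The dummy field described above does nothing on its own. One possible extension, which is an assumption of this sketch and not something described above, is to have the server reject any submission that fills the dummy field in, since a human who cannot see the field or is told to leave it blank will send it back empty. The field name `website2` and the function name are hypothetical.

```python
def looks_like_bot(form, trap_field="website2"):
    """Return True if the dummy (trap) field came back with a value.

    `form` is a plain dict of submitted field names to values.
    A missing or blank trap field is treated as a human submission.
    """
    return bool(form.get(trap_field, "").strip())


human = {"author": "Jack", "text": "Nice post", "website2": ""}
bot = {"author": "x", "text": "buy now", "website2": "http://spam.example"}
print(looks_like_bot(human), looks_like_bot(bot))
```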
None of the above is the "optimal" solution, but I suspect that these tactics worked for two reasons:
- These bots seem to be written with the minimal amount of effort possible. If they can be written to simply follow a set pattern, it is likely that that is the way they will be written. (It is much simpler to write code that says "make two rights and then a left" than "if the first street is comment lane, go down it; otherwise continue on until you find it and then go down it".)
- My site templates, although heavily modified, were originally the free templates from blogstyles.com. The free templates from that site are very popular. When I originally tweaked the templates and style sheets, I never rearranged the order of buttons and such. Simple changes confused simple bots.
After I left the above changes in effect for about three weeks, I moved on to plug-ins, which are a better solution.
There are currently two plug-ins that combat spam running on my server. One is very effective, and no blogger who allows comments should do without it. The other is also effective, although I am not sure I agree with its method.
MTBlacklist is the most logical plug-in to start with.
First of all, it allows for comment moderation, which means that comments are not visible on your site until they are approved, unless you allow registered users to post comments unapproved. Comment approval alone is not enough. Imagine having to click delete for 1000 spam posts. Some type of auto-deletion based on keywords and other variables is also needed.
MTB offers that as well. It will auto-delete comments from known spammers. This saves the user much time.
Bayesian filters work for e-mail spam because spammers use similar tactics and keywords. So why wouldn't similar tactics work for comments?
They do. MTB offers keyword filtering, and it seems very effective as well.
Spamhaus.org estimates that about 80% of all e-mail spam originates from 200 known spam organizations. So blocking known spammers off a good list helps with e-mail spam. The same holds very true for comment spam.
Any blacklist, however, is only as good as the list it uses. It is very likely that spammers will switch IP addresses and change the names they use, and that new spammers will arrive. The list will need some type of updating. MTB allows the user to add to the list AND report to a centralized list so that all MTB users can be made aware. All MTB users can thereby pool resources and check this centralized list automatically. I feel this works very well.
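To make the idea of a shared list plus local additions concrete, here is a rough sketch in Python. It is not MTB's code. It assumes that list entries are treated as regular expressions, and the entries, the variable names and the function `first_match` are all made up for illustration.

```python
import re


def first_match(comment_text, patterns):
    """Return the first blacklist pattern found in the comment, else None."""
    for pattern in patterns:
        if re.search(pattern, comment_text, re.IGNORECASE):
            return pattern
    return None


# Hypothetical entries: a list shared by everyone, plus my own additions.
shared_list = [r"cheap-pills\.example", r"online\s+casino"]
local_additions = [r"spammy-site\.example"]
active = shared_list + local_additions

print(first_match("Visit http://spammy-site.example today", active))
print(first_match("Thanks, the template tip worked for me", active))
```

A comment that matches any entry would be deleted or held for moderation; a comment that matches nothing passes through to the other checks.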
MTB also automatically blocks comments that contain a high number of URLs. This is a very effective tactic, for spammers are notorious for stuffing comments with links. The default setting is 5 URLs. I, personally, have found it very effective.
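A URL limit of this kind is easy to picture in code. The sketch below is my own illustration in Python, not the plug-in's implementation. It counts anything that starts with http:// or https://, and it assumes a comment is blocked once the count reaches the limit; whether the real plug-in blocks at exactly 5 or only above 5 is not something I can state.

```python
import re

# Anything that starts with http:// or https:// counts as one URL.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def too_many_urls(comment_text, max_urls=5):
    """Return True when the comment contains max_urls or more URLs."""
    return len(URL_PATTERN.findall(comment_text)) >= max_urls


spam = " ".join(f"http://site{i}.example" for i in range(6))
print(too_many_urls(spam))
print(too_many_urls("See http://example.com/ for details"))
```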
MTB also allows the user to stop comments on posts older than a certain number of days. This is also a good idea for most blogs. Once a spambot targets a certain entry, it will keep spamming it. My site, however, is mostly tech geek stuff. I often get legitimate comments on posts from months past. I, therefore, disable this feature.
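The age cutoff, and the option to switch it off, can be sketched like this in Python. The function name `comments_open` and the parameter `max_age_days` are hypothetical; passing `None` stands in for disabling the feature, as I do on my site.

```python
from datetime import date, timedelta


def comments_open(entry_date, today, max_age_days=None):
    """Return True if an entry should still accept comments.

    max_age_days=None means the age check is disabled entirely.
    """
    if max_age_days is None:
        return True
    return (today - entry_date) <= timedelta(days=max_age_days)


old_entry = date(2005, 1, 1)
today = date(2005, 6, 1)
print(comments_open(old_entry, today, max_age_days=30))
print(comments_open(old_entry, today))
```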
Speaking of disabling, the other feature any decent plug-in should have is the ability for the user to customize it for their needs. After all, it is possible that your blog is about Texas Hold'em poker. If that is the case, you would not likely consider comments about poker to be spam. MTBlacklist offers that. You can add or delete keywords, allow more or fewer URLs, and tweak any aspect of it.
Having used MTB for over a month now, I am very happy with it. I like it and am going to keep it.
MTBlacklist is a requirement for any Movable Type blogger who allows comments. I did not keep statistics on how effective it is at catching spam, but I am very happy with the results. If we go by the numbers Dr. Vinson mentioned, it blocked all but 3% of the spam comments to his site outright. I would say those are good numbers.
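For anyone who wants to check the 3% figure, the arithmetic is below in Python. It assumes the 9700 blocked and 300 moderated comments from Dr. Vinson's note together make up everything the plug-in handled; the function name is made up.

```python
def moderated_share(blocked, moderated):
    """Fraction of handled comments that were moderated, not blocked."""
    return moderated / (blocked + moderated)


share = moderated_share(blocked=9700, moderated=300)
print(f"{share:.0%}")  # 300 out of 10000
```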
About two weeks after installing MTB, I installed the "no follow" plug-in. I think the "no follow" plug-in was useful, for it attacks spamming at the root of the problem.
The no follow plug-in works by adding rel="nofollow" to the hyperlink tag. The MSN, Yahoo and Google bots that crawl the web will ignore such links for the purposes of spidering and ranking, and those two benefits are the reason comment spam occurs in the first place.
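As a rough picture of what such a plug-in does to a comment, here is a small Python sketch. It is not the plug-in's code. Note that the attribute value is written as one word, `nofollow`. The sketch uses a regular expression on the opening tag, which is good enough for an illustration but is an assumption on my part; a real implementation may well parse the HTML properly.

```python
import re

# Matches an opening <a ...> tag and captures its attributes.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)


def add_nofollow(html):
    """Add rel="nofollow" to every <a> tag that does not already set rel."""
    def fix(match):
        attrs = match.group(1)
        if re.search(r"\brel\s*=", attrs, re.IGNORECASE):
            return match.group(0)  # leave tags that already have a rel
        return f'<a{attrs} rel="nofollow">'
    return A_TAG.sub(fix, html)


print(add_nofollow('<a href="http://example.com/">my site</a>'))
```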
The "no follow" plug-in, however, is a mixed bag. I think it is effective, but it rather defeats the purpose of allowing legitimate people to post URLs that should be crawled.
The Six Apart Guide to Fighting Comment Spam likens fighting spam to combating shoplifting. I think this is a good analogy. The no follow attribute, however, is like stocking your store with items that no one would want to shoplift. I guess, in most respects, this works.
I doubt that people who wish to post legitimate comments will be deterred by no follow. Legitimate comments are content-based and content-related. The motivation is not site ranking.
There is a major downside, in my opinion. Now please do not take this the wrong way. I do blog because I enjoy writing. If one does not enjoy writing, blogin' is not a good medium of expression. However, I also desire people to find my blog, read my blog and comment on my blog. If someone posts a legitimate comment with legitimate links, those links should be crawled by the bots so that the web is properly indexed and page ranks are adjusted.
While on the subject of being "listed", I must comment on another item in the Comment Spam guide by Six Apart. They suggest that you do not list your blog with the many blog listing sites, because it is believed that spammers harvest blogs to comment spam from those sites. I started listing my site at all of the blog sites I could after I was already being spammed. I noticed no increase in the number of spams received. In fact, since MTB and no follow, they have decreased.
I do agree with making some aspects of your blog obscure: rearrange the forms, add extra fields to the forms, rename the scripts and include any human check you can think of... Most of all, be unique. If every store in the world has the same layout and product they will certainly be systematically laid bare by shoplifters.
However, hiding your store with its unique product in a cave, not advertising and not listing it in the phone book is certainly not the best marketing technique. I am keeping MTB but have already removed the no follow plug-in. I am going to wait a few weeks and try the SpamLookup plug-in and see how that works out.
My conclusion: both are effective, but no follow defeats one of the main reasons to blog. Blogin' is, for many, an effort to share ideas and network. This means bloggers must be, at the minimum, findable. This makes search engine ranking relevant and important.
In order to effectively combat spam and effectively blog, bloggers are going to have to figure out a way to block comment spammers and still promote their blogs. Let's face it, most of the bloggers out there want to be read. Obscurity in the blogosphere, although an option to cut down on spam, is not, in this author's opinion, a very logical one. It makes no more sense than obtaining an e-mail addy so odd nobody could guess it and then never giving it out.
Comment spam must be dealt with in a manner which cuts down on the time the blogger spends eliminating it. MTB achieves that. MTB is absolutely a plug-in that every MT user should have, and Six Apart should take measures to incorporate it into the next release.
