Plagiarism Today has an excellent article about spamblogs, the problems faced by Google/Blogspot, the spread of the problem to MSN Spaces, and why this is likely to be a trend:
"The bitter truth is that the Web is more vulnerable than ever to splogging, not because of clever spammers but because of ill-prepared hosts. While Google responded to pressure from the blogging world to do a better job policing its service (though the effectiveness of its response is up for debate), other hosts have not taken any clear steps and many are completely unable to handle the problems that they face now."
Yes. This has been a discussion topic on Winds following our (continuing) ban on blogspot.com in comments or trackbacks. Personally, I believe we're headed for a blog future in which owning your own domain will be the only viable option to avoid fairly widespread blacklisting. As the PT article notes:
"Being a successful Web host is no longer just about having the best features or good servers and easy to use tools. It's also about having an effective abuse policy that not only frees up precious resources for legitimate users, but makes you a good neighbor on the Web.
Simply put, no one wants to use a service that has a bad reputation or has even been blacklisted for generating too much junk and, in a Web where sharing information and ideas is critical to survival, being blacklisted, can be a death kneel to an otherwise sound service.... We just want to run our sites, search our data and read our favorite pages in peace.
However, it's up to the hosts to create that and, frankly, I don't think most are up for the challenge."
I tend to agree; here's the whole piece if you want to read it.
I see a future in which the free sites are training/experimental grounds, and the more all-inclusive ones like Blogspot or MSN Spaces are their own little gated communities, accepting each other's links but not accepted or accepting very much beyond that radius.
That's sad, but the absence of meaningful penalties or enforcement against spammers makes it more or less inevitable.








Frankly, it's a problem of trust. The Internet was built on a model of complete openness and trust, and as someone who was on it before the Morris worm and spam and so on, I can say it was wonderful. But that model was unrealistic once placed outside of a controlled-access environment: the world is simply filled with assh*les, and we all have to get used to that.
I believe that the Internet's life, as it currently is constituted, is quite limited. I think that there will be a successor with most of the following characteristics:
- Low-level transports (equivalent to TCP) featuring encryption as an option, and cryptographic signatures at each hop as a requirement, to provide privacy and non-repudiation (the latter being essential to stopping spammers)
- Trusted identity providers (coupled with webs of trust, this is a powerful way to white- and blacklist who gets to post, email you, etc.)
- High-level services (like email and net news and blogs) with unforgeable, unspoofable connection information; these would be coupled with the ability to block an identity from any transit through your network/system/application
- Distributed squelching of zombies (enabled by non-repudiation at the transport level)
- Webs of trust, combined with point-to-point transport-level encryption
- Smarter protocol filtering, particularly having all protocols inactive by default and only allowed by explicit agreement between trust partners
- More connectivity and less dependence on single routes (somewhat like how the Internet was originally configured, but probably without open routing)
By building these features into the communications stack itself, it is no longer necessary for every single site to be constantly fighting spammers and hackers - the network itself would remove the ability to hide (unless you trust anonymous connections, as some sites undoubtedly would), and would automatically squelch attacks or large-scale spamming. In other words, the network should be adaptive enough to take care of the vast majority of abuses that it does not make impossible.
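To make the web-of-trust idea above a little more concrete, here is a minimal sketch of the decision a receiving site might make: accept a sender only if it is reachable through a short chain of trust links and is not on a block list. The identities, the `max_hops` limit, and the function name are all made up for illustration, and the sketch assumes identities are already unforgeable, which is the hard part the list above hands to the transport layer.

```python
from collections import deque

def is_trusted(me, sender, trust_edges, blocked, max_hops=2):
    """Return True if `sender` is reachable from `me` through at most
    `max_hops` trust links, never passing through a blocked identity."""
    if sender in blocked:
        return False
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        identity, hops = queue.popleft()
        if identity == sender:
            return True
        if hops == max_hops:
            continue  # do not extend trust beyond the hop limit
        for friend in trust_edges.get(identity, ()):
            if friend not in seen and friend not in blocked:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return False

# Hypothetical identities: each key vouches for the identities in its set.
trust_edges = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "dave": {"erin"},
}
blocked = {"spambot"}

print(is_trusted("alice", "dave", trust_edges, blocked))   # two hops away
print(is_trusted("alice", "erin", trust_edges, blocked))   # three hops away
```

The hop limit is the knob: a site that wants to be more open raises it, and a site under attack lowers it or adds to the block list.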
Joe, those words are like music to my ears.
I'm developing a blog toolkit that works no matter what your domain is -- it's cross-domain. All the rest of the widgets lock you into a specific server. As the web becomes more "gated", there's still going to be a need for the domains to connect to one another, I believe. I can imagine groups of blogs -- flocks of blogs if you will -- all of people with similar interests. Tools are going to be needed to allow both the contributors and the readers to collaborate across domains.
The problem with securing the network is that the web is built on anonymous communication. This goal is antithetical to coordinated usage, however. So there has to be a solution in which both goals are reached to some degree. I don't want spam, but I don't want to restrict folks in China from seeing and interacting with the real world either.
It might all end up being team blogs. That seems to be very popular, and is a great way to keep fresh material coming out. Then perhaps the blog has some kind of admission test or registration that is controlled by the blog, not the hosting service. The hosting service could keep track of blacklists and such at the network level.
On a personal note, I've been getting a boatload of spam from a hotm@@l this week. My hosting provider is offering a Barracuda firewall for an extra three bucks a month, but I can't decide whether to get it or not. If I could turn trackbacks on, I might consider it. Does anybody have any experience with hardware firewalls/anti-spam devices and blogs?
Daniel, we don't...
And as folks might infer from the way he wrote it, (hot mail) is now on our blacklist for that very reason. Many of them were trackback spams, which means (since the service doesn't do web hosting) that they were EXPLICITLY SENT as nuisance spams. We've seen a sharp rise in that as well.
Another brick in the wall.
I might give it a shot and let you-all know.
Any sort of hardware help couldn't be that bad. As I understand it, some of those hardware spam/spyware filters are getting very sophisticated nowadays. For $3, I'll have them hook it up and turn off my blacklist and turn on trackbacks.
I'll let you guys know how it goes. Perhaps there is some help from all of this mess at the hardware level. I doubt it, but it doesn't hurt to try, right?
Trackbacks are a particularly easy thing for spammers to exploit. Joe, if you're interested, I have a small Python script and basic explanation here which shows you just how easy it is to send a trackback ping.
On that note, WordPress and its Akismet anti-spam plugin rock for preventing spams from trackbacks and comments. Movable Type is particularly vulnerable because of how badly exposed its trackback and comment scripts are to spammers.
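For readers who have not looked at the protocol, a rough sketch of what a trackback ping amounts to: as I understand the TrackBack specification, it is an ordinary form-encoded HTTP POST with `url`, `title`, `excerpt` and `blog_name` fields, and nothing in it authenticates the sender. The function name and the example.com/example.org addresses below are placeholders, and the sketch only builds the request rather than sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_trackback_ping(trackback_url, post_url, title="", excerpt="", blog_name=""):
    """Build (but do not send) a trackback ping as a form-encoded POST."""
    fields = {
        "url": post_url,        # the page claiming to link to the target post
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    }
    body = urlencode(fields).encode("utf-8")
    return Request(
        trackback_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"},
        method="POST",
    )

# Placeholder URLs; passing `ping` to urllib.request.urlopen would send it.
ping = build_trackback_ping(
    "http://example.com/trackback/5",
    "http://example.org/my-post",
    title="Hello",
)
print(ping.get_method(), ping.full_url)
print(ping.data)
```

Since every field is supplied by the sender and none is verified, the receiving blog has to do its own filtering, which is where tools like Akismet come in.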
It is not where you are located. It is whom you trust.
Software that integrates trust is the answer.
White lists as well as blacklists.
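A minimal sketch of how a whitelist and a blacklist might be combined when screening links in comments or trackbacks, assuming the whitelist takes precedence so that one known-good blog on a banned host can still get through. The domain names, the function names and the "moderate" fallback are illustrative choices, not anything a particular blog package does.

```python
from urllib.parse import urlsplit

def host_matches(host, domains):
    """True if host equals a listed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)

def classify_link(url, whitelist, blacklist):
    """Return 'allow', 'block', or 'moderate' for a submitted link."""
    host = (urlsplit(url).hostname or "").lower()
    if host_matches(host, whitelist):   # explicit trust wins
        return "allow"
    if host_matches(host, blacklist):
        return "block"
    return "moderate"                   # unknown hosts go to a human

# Hypothetical lists: one trusted blog on an otherwise banned host.
whitelist = {"goodblog.blogspot.com"}
blacklist = {"blogspot.com"}

print(classify_link("http://goodblog.blogspot.com/2005/post.html", whitelist, blacklist))
print(classify_link("http://cheap-pills.blogspot.com/", whitelist, blacklist))
print(classify_link("http://example.org/article", whitelist, blacklist))
```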
This is a subject I follow, and it seems to me that the primary impetus to colonize hosting services such as Blogspot is to evade the domain costs imposed by the efforts at domain banning. If you go back and read material a year or two old about junk, you will note one key theme was to increase junker costs by requiring them to cycle through domains, each of which costs money. What we are seeing now is a solution to that problem, where the junkers avoid the domain cycling costs by using someone else's domain.
What I have seen in a few places is a requirement for a credit card, even if it is a free service. This makes the transaction sufficiently heavyweight to reduce the problem to acceptable levels. I suspect that is more the wave of the future than gated communities.
MikeT;
There are some simple tweaks to MovableType that (in my observations, at least) greatly reduce the amount of trackback junk. Basically, the TB interface is changed to use the basename of the post, instead of a numeric identifier, which keeps junkers from "running the numbers". It doesn't stop the junk entirely, but it does seem to slow it down significantly.
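A toy illustration of why that tweak helps, under the assumption that the junker's script simply counts upward through post IDs. The base URL, the post IDs and the basenames are invented for the example and are not MovableType's actual URL layout.

```python
BASE = "http://blog.example/tb"   # hypothetical trackback endpoint prefix

# Hypothetical posts: numeric ID -> basename.
posts = {
    101: "splogs-and-hosts",
    102: "trackback-trouble",
    103: "webs-of-trust",
}

# Valid endpoints under each scheme.
numeric_endpoints = {f"{BASE}/{post_id}" for post_id in posts}
basename_endpoints = {f"{BASE}/{name}" for name in posts.values()}

# "Running the numbers": blindly trying consecutive IDs.
guesses = [f"{BASE}/{n}" for n in range(100, 105)]

def count_hits(guesses, valid_endpoints):
    """Count how many blind guesses land on a real trackback endpoint."""
    return sum(1 for g in guesses if g in valid_endpoints)

print(count_hits(guesses, numeric_endpoints))    # counting finds every post
print(count_hits(guesses, basename_endpoints))   # counting finds nothing
```

Basenames are still public, so anyone who crawls the blog can collect them; the change only removes the cheapest way of finding targets, which matches the "slows it down" observation.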
I'm not knowledgeable in this area of spamming, but I'm confused about how blogspot could be used for trackback spam when they require a CAPTCHA entry for every new post.
#10 knox has a very valid point.
Since I started using the blog-spot provided Turing test I have gotten one spam bit in 6 months. Before that I was getting around 10 a day.
And even the 1 spam was manually entered. And it was hidden in a bunch of old spam I had not yet deleted.
Joe - perhaps the answer is a better test.
If we did web-of-trust I wouldn't have to obfuscate my blog URL (look at my blogparent).
knox;
One doesn't need to have any posts on blogspot to do trackback junk. The trackbacks are not sent from blogspot, but from other zombie machines via techniques similar to what MikeT describes.
I should not post this, but then the odds of a novice spammer wandering across this instead of some other of the millions of web pages that have it are low.
The way you defeat a CAPTCHA is to get the image and check the bitmap against a database. If there is a match, use that word. If there is not, put that into a list of CAPTCHAs to be decoded. Then, send out spam emails to the list you've already bought or harvested, offering free porn with no charge: all you have to do is go to the handily provided URL and type in this word (at which point insert the CAPTCHA). When you get a hit at the URL, which conveniently has a number identifying which CAPTCHA was used, you take the "secret word" entered, and put it into the database for that CAPTCHA.
Over time, you can get a pretty complete list of all of the CAPTCHAs used at a given site.
All for the cost of a web site (available free but for your time), a database (available free but for your time), a host to run the database ($1000 or less), a list of email addresses (you can use your database host to run a free harvesting bot), and a couple of porn CDs (less than $100 additional cost), or rip off free porn sites on the Internet already.
Jeff,
Neat trick.
Now what if the CAPTCHAs are changed at random for every post?
You would have to keep your posting window open and it would have to be non-timed (i.e., if no post in x minutes a new CAPTCHA is called for)
As I said - the one bit of spam I got looked like a guy being his own bot.
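A rough sketch of the per-post, timed CAPTCHA idea from the last few comments: every challenge gets a fresh random token, can be answered only once, and expires after a few minutes. The class name, the five-minute default and the injectable clock are my own assumptions, and the actual image generation is left out entirely.

```python
import secrets
import time

class ChallengeStore:
    """Issue single-use CAPTCHA challenges that expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock            # injectable so expiry can be tested
        self._pending = {}            # token -> (answer, expiry time)

    def issue(self, answer):
        """Register a new challenge and return its unguessable token."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = (answer, self.clock() + self.ttl)
        return token

    def verify(self, token, response):
        """Check a response; the token is consumed whether or not it passes."""
        entry = self._pending.pop(token, None)
        if entry is None:
            return False              # unknown or already-used token
        answer, expires = entry
        if self.clock() > expires:
            return False              # too slow: a new challenge is needed
        return secrets.compare_digest(
            answer.lower().encode("utf-8"),
            response.strip().lower().encode("utf-8"),
        )

store = ChallengeStore(ttl_seconds=300)
token = store.issue("kitten")          # the word shown in the image
print(store.verify(token, "Kitten"))   # first answer within the time limit
print(store.verify(token, "kitten"))   # same token reused
```

This narrows the window for a stored-answer database like the one described above. It does not by itself stop a person who relays answers in real time.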