Devon Young’s tech blog is gone (Been 86’d; surely a phrase, but I don’t know its meaning). This is sad, because there were a lot of interesting things there.
Now, the good news: thanks to Opera’s fantastic caching system, I can still read it. Opera does not refresh a page when re-opening a session, and this is GREAT (it is the only browser that does this, as far as I know). This is the second time this feature has helped me. It doesn’t even refresh if you close a tab and reopen it.
Now, thanks to Opera and with Devon Young’s approval, I decided to repost some of his posts.
Remove 75% of the spam on WordPress 2.1
original post
[…] The spambots weren’t using the form to post anything, they were just directly accessing the wp-comments-post.php file. Which surprised me. It didn’t surprise me that they were doing it. It surprised me that WordPress was allowing that. So I realized I had to create a small change in the script, that should’ve already been coded in by the WordPress coders before release. So at the start of the file, before any of the code that’s already there, I added this line:
if(strstr($_SERVER['HTTP_REFERER'],"devonyoung.com")) {
Then after all the code, I added these 4 lines:
}
else {
echo "You tried to bypass the form.";
}
These 4 lines of code reduced spam by about 75%. Yet some spambots are somehow still posting comments. So I checked really quick and discovered that the ones still getting their posts through aren’t using the comment form or the wp-comments-post.php. Huh? Now that surprises me. Anyone know how they can manage that? I know a little about WordPress, but not enough to know how this works. Likely, the spambots are exploiting a security hole or a bug in the system. I’m using WordPress 2.1.
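For comparison, here is a minimal Python sketch of the same idea (it is not WordPress code): refuse a comment unless the Referer header points at the blog. The function names and the `ALLOWED_HOST` constant are made up for illustration. The sketch also shows that a substring test like the `strstr` check above accepts any Referer that merely contains the domain, while comparing the parsed hostname is stricter. Either way, the header is sent by the client, so a bot can forge it.

```python
from urllib.parse import urlparse

ALLOWED_HOST = "devonyoung.com"  # assumption: the blog's own hostname


def referer_contains_domain(referer: str) -> bool:
    """Loose check, similar in spirit to the strstr() test."""
    return ALLOWED_HOST in referer


def referer_host_matches(referer: str) -> bool:
    """Stricter check: compare the parsed hostname, not a substring."""
    host = urlparse(referer).hostname or ""
    return host == ALLOWED_HOST or host.endswith("." + ALLOWED_HOST)


if __name__ == "__main__":
    samples = (
        "http://devonyoung.com/some-post/",       # hypothetical post URL
        "http://spam.example/?x=devonyoung.com",  # domain only in the query string
        "",                                       # no Referer sent at all
    )
    for ref in samples:
        print(repr(ref), referer_contains_domain(ref), referer_host_matches(ref))
```

The second sample passes the loose check but fails the strict one, which is the difference worth keeping in mind when filtering on this header.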
Spamproof email address, with CSS
I think this is a case of parallel invention, since Devon Young doesn’t mention any inspiration, but this had already been done by the Polish Literary Moose (interview).
Anyway, here it goes:
p a:before { content: "soy"; } p a:after { content: "devonyoung.com"; }
This address is selectable with Opera (and Opera only; in other browsers you will have to use your eyes). It is the only browser I know of that does this. That might be worth including in any Opera evangelism.
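If you want to produce the same kind of rule for another address, here is a small Python sketch. The function name is made up, the `p a` selector is simply copied from the rule above, and I am assuming that the link text itself supplies whatever sits between the two pieces (presumably the "@").

```python
def spamproof_css(local_part: str, domain: str, selector: str = "p a") -> str:
    """Build the :before/:after rules that show local_part and domain around a link."""
    for piece in (local_part, domain):
        # Keep the sketch simple: refuse characters that would need CSS string escaping.
        if '"' in piece or "\\" in piece or "\n" in piece:
            raise ValueError("unsupported character in address part")
    return (
        f'{selector}:before {{ content: "{local_part}"; }} '
        f'{selector}:after {{ content: "{domain}"; }}'
    )


if __name__ == "__main__":
    print(spamproof_css("soy", "devonyoung.com"))
```

Called with "soy" and "devonyoung.com", it gives back the rule quoted above.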
A Google limitation
I am copying the whole post here; that’s better.
As if I wasn’t having enough problems with their Feedfetcher bot. Apparently Google thinks www.devonyoung.com is equivalent to devonyoung.com, because it applied my www’s robots.txt to my entire domain.
I used Google’s automatic URL removal system to remove all references to my outdated www. subdomain from the Google database.
I clicked the link titled “Remove pages, subdirectories or images using a robots.txt file” and typed http://www.devonyoung.com/robots.txt in the appropriate textbox. That robots.txt has disallow: / in it, so no crawlers will bother with the defunct subdomain. Knowing that robots.txt lists what you want a spider to ignore, I expected that what was listed would be what was removed. What else could it mean?
This morning, I wake up to find that my entire devonyoung.com domain is missing from Google. Try it, search for it. It’s gone. You can’t even google my name to get it. So apparently Google cannot distinguish between a www. and the main domain itself.
That was irritating, but I thought that fixing this would be relatively easy. I just go in again and type http://devonyoung.com/robots.txt (without www.) in the appropriate textbox, right? No. This caused Google to give me a notice that every single thing at my main domain will further be gone (again?), and not just the things I list in the robots.txt as ignorable.
What makes this even more painful and annoying is this note at the URL Removal System:
all pages submitted via the automatic URL removal system will be removed from the Google index temporarily for six months
This is more than fairly irritating. For 6 months, my entire website won’t be listed in Google. I only wanted the outdated and defunct stuff removed, and I get burned.
Oh, one more thought on this subject. They never confirmed with me that I had any authority to delete all references to the domain from their database. That just sounds like a huge security risk. I imagine I could sign up using an anonymous Hotmail address and remove all references to Microsoft domains from Google’s database. I bet Google would hear about that.
Sorry for the rant, but I thought you might all want to know about this for SEO reasons.
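To illustrate the behaviour Devon expected, here is a Python sketch using the standard library’s `urllib.robotparser`, with one rule set per hostname. The rules for the bare domain are an assumption (I don’t know what his real robots.txt contained), the page URLs are invented, and the `allowed` helper is hypothetical. It says nothing about how Google’s removal tool works internally.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def make_parser(lines):
    """Build a parser from in-memory robots.txt lines (no network access)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


# One robots.txt per hostname.
RULES = {
    "www.devonyoung.com": make_parser(["User-agent: *", "Disallow: /"]),
    # Assumption: no rules at all on the main domain, so everything is allowed.
    "devonyoung.com": make_parser([]),
}


def allowed(url: str, agent: str = "*") -> bool:
    """Apply only the robots.txt that belongs to the URL's own hostname."""
    host = urlparse(url).hostname
    return RULES[host].can_fetch(agent, url)


if __name__ == "__main__":
    print(allowed("http://www.devonyoung.com/old-page"))  # defunct subdomain
    print(allowed("http://devonyoung.com/blog/"))         # main domain
```

With the rules kept apart like this, the `Disallow: /` on www. blocks the www. URLs only, and the main domain stays crawlable.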
NewsFire limitation
I used it for a while (now I use NetNewsWire). In case you don’t know, it is a cool-looking RSS reader for OS X (made by David Watanabe, nuff said).
Bottom line: NewsFire does not respect 301 responses (a 301 status code means “this file has moved permanently, so stop looking for it here; it is over there now”).
Oh, by the way: Google’s RSS bot (Feedfetcher) doesn’t give a shit either. Yep, Google sucks on this: it ignores my Redirect permanent.
Devon has a nice name for these rude bots: Deaf user agents.
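What a user agent with working ears could do fits in a few lines of Python. Everything here is hypothetical: `Subscription` and `handle_response` are names I made up, the feed URLs are invented, and this is not how NewsFire or Feedfetcher work internally. The idea is only that a permanent redirect overwrites the stored feed URL with the `Location` header, while a temporary one leaves it alone.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin

PERMANENT = {301, 308}       # "moved for good": remember the new address
TEMPORARY = {302, 303, 307}  # follow this time, keep the old address


@dataclass
class Subscription:
    feed_url: str  # the URL the reader will poll next time


def handle_response(sub: Subscription, status: int, location: Optional[str]) -> str:
    """Return the URL to fetch now; update the subscription only on a permanent move."""
    if location and status in PERMANENT | TEMPORARY:
        target = urljoin(sub.feed_url, location)  # Location may be relative
        if status in PERMANENT:
            sub.feed_url = target  # stop asking the old URL from now on
        return target
    return sub.feed_url


if __name__ == "__main__":
    sub = Subscription("http://devonyoung.com/old-feed.xml")  # invented URL
    handle_response(sub, 301, "http://devonyoung.com/feed/")  # invented URL
    print(sub.feed_url)
```

A deaf user agent is one that skips the `sub.feed_url = target` step and keeps hammering the old address.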