A quick idea with a quick implementation...

I’m slowly slogging my way through this DW code cleanup/conversion process…and since it’s brain-numbing at times, I’m allowing myself to take little tangents here and there.

Today’s tangent took me back to the semantic hacker challenge. I’m still not sure if I’ll enter (the deadline is in a few days), but if I do, my basic idea would be a semantics-based spam filter.

In fact, to play around with the concept, I’ve already implemented it halfway on this very blog.

Basically, when I publish a new post, a semantic signature is generated via the API they provide. I log this signature, and I also break out the labels they give me and auto-assign them as tags for the post. This is all step one.
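
As a rough sketch of the label-to-tag part of step one: the actual API response format isn't shown here, so this assumes the signature arrives as a list of (label, weight) pairs. The `signature_to_tags` name and the `min_weight` and `max_tags` knobs are hypothetical, made up for illustration.

```python
def signature_to_tags(signature, min_weight=0.1, max_tags=5):
    """Turn a list of (label, weight) pairs into a short list of post tags.

    The (label, weight) shape is an assumption, not the real API format.
    """
    # Keep only labels weighted at or above the cutoff.
    strong = [(label, weight) for label, weight in signature if weight >= min_weight]
    # Strongest labels first.
    strong.sort(key=lambda pair: pair[1], reverse=True)

    tags = []
    for label, _ in strong:
        tag = label.strip().lower()
        # Skip empty labels and case-insensitive duplicates.
        if tag and tag not in tags:
            tags.append(tag)
    return tags[:max_tags]
```

Lowercasing before the duplicate check keeps "Blogs" and "blogs" from becoming two separate tags.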

Step two relates to user comments. I basically do the same thing: when someone posts a comment, I use the semantic hacker API to generate a signature for that comment.

That’s as far as I got with it tonight (I didn’t have a ton of time because I was also doing my first live fantasy football draft of the year for much of the night!)…

The next step, step three, will involve matching the signatures. The basic idea is that a comment’s signature will have to match the post’s signature at some user-defined level. If it does, the comment will automatically appear on the site; if it doesn’t, it’ll be flagged as potential spam and need admin (my) approval before it appears.
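
Here is a minimal sketch of what that matching step could look like. It assumes each signature can be reduced to a dict of label to weight, and it uses cosine similarity as a stand-in for whatever match score the API actually reports. The `match_score` and `moderate` names and the 0.3 default threshold are hypothetical.

```python
import math


def match_score(post_sig, comment_sig):
    """Cosine similarity between two {label: weight} dicts.

    This is a stand-in for the real match calculation, which isn't shown here.
    """
    shared = set(post_sig) & set(comment_sig)
    dot = sum(post_sig[label] * comment_sig[label] for label in shared)
    norm_post = math.sqrt(sum(w * w for w in post_sig.values()))
    norm_comment = math.sqrt(sum(w * w for w in comment_sig.values()))
    if norm_post == 0 or norm_comment == 0:
        # An empty signature can't match anything.
        return 0.0
    return dot / (norm_post * norm_comment)


def moderate(post_sig, comment_sig, threshold=0.3):
    """Return 'publish' if the comment is on-topic enough, else 'hold' for admin approval."""
    if match_score(post_sig, comment_sig) >= threshold:
        return "publish"
    return "hold"
```

The threshold is the "user-defined level": raise it to hold more comments for review, lower it to let more through.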

If it works well enough, I might even take it to the next level and just auto-delete any comment that doesn’t match at or above the minimum level I set.

Overall I think it’s going to be a much nicer system for users than requiring a captcha…and I think it could be very useful. I’m just not sure what a business plan would be around this (part of the idea of the semantic hacker challenge is to create a business using the technology - since I don’t have that part right now I’m not sure it’s worth entering).

I probably won’t do much with it, but I also think this type of thing could be great to apply to reviews. If you could generate a signature for the item being reviewed (I’m thinking full text of books and articles) and then a signature for each review, you could use the match percentage to display the most relevant reviews first…
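
To make the review idea concrete, here is a small sketch that sorts reviews by how closely they match the item. It assumes signatures are dicts of label to weight and uses a weighted Jaccard overlap in place of the real match percentage. The `overlap` and `rank_reviews` names and the (review_id, signature) pairing are hypothetical.

```python
def overlap(sig_a, sig_b):
    """Weighted Jaccard overlap between two {label: weight} dicts (0.0 to 1.0).

    Assumes non-negative weights; stands in for the real match percentage.
    """
    labels = set(sig_a) | set(sig_b)
    shared = sum(min(sig_a.get(k, 0.0), sig_b.get(k, 0.0)) for k in labels)
    total = sum(max(sig_a.get(k, 0.0), sig_b.get(k, 0.0)) for k in labels)
    return shared / total if total else 0.0


def rank_reviews(item_sig, reviews):
    """Sort (review_id, signature) pairs so the most relevant review comes first."""
    return sorted(reviews, key=lambda review: overlap(item_sig, review[1]), reverse=True)
```

For example, given a book signature and three review signatures, `rank_reviews` returns the reviews in descending order of overlap with the book, so an off-topic review sinks to the bottom.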

Anyway that’s just some thinking out loud…I’m off to bed now as I have to go get a cavity filled at 7:00 am tomorrow (before work). :thumbsup:

This is the personal blog of Kevin Marshall (a.k.a. Falicon) where he often digs into side projects he's working on for digdownlabs.com and other random thoughts he's got on his mind.

Kevin has a day job as CTO of Veritonic and is spending nights & weekends hacking on Share Game Tape. You can also check out some of his open source code on GitHub or connect with him on Twitter @falicon or via email at kevin at falicon.com.

If you have comments, thoughts, or want to respond to something you see here I would encourage you to respond via a post on your own blog (and then let me know about the link via one of the routes mentioned above).