There has been a growing amount of chatter around the internet that Google search may have reached a tipping point… that the search results are becoming less and less useful as more and more low-quality/spam content finds its way toward the top of many search result pages.

I've long been fascinated by Google's approach, and the math (as well as the overall process/systems) they use to solve the problem of finding the "right" stuff throughout all the world's data. So this is a discussion I've been paying close attention to, and something I've actually been thinking about for a very long time (pretty much since my first interactions with Google search back in the early 2000s). So here's my opinion:
Google search isn't really broken… the key assumption is. Let me try to explain what I mean.

When Google was first conceptualized, the internet was clearly a different place. The information available, and the way it was available, mapped more closely to that of the academic world. So the approach of treating references to other web pages much like citations to other bits of work made a lot of sense – and clearly worked. I think the reason it mapped so well was that, at the time, most web pages were still created "by humans for humans". This made the idea, or assumption, that a link showing up on a web page was more-or-less a "vote" for that page completely reasonable (in fact, I would go so far as to say it was pure genius at the time).

But since that time the web – which has continued to explode in quantity of data – has become much more personal, a lot more social, and very much automated. And I think each of these things has played a part in the recent perception that "Google is broken, or at least breaking". Let's take a quick look at each in a bit more detail so I can further explain my thinking.

The Web has become more personal.

As the internet has grown, so has the number of people using and relying on it. And as that number has grown, so has the amount of data we each provide. Many of us now store our schedules, our contacts, our likes, our dislikes, and so much more throughout the various internet services we've come to rely on. As humans, when we give a little, we expect a lot… and we assume even more. In regards to the internet (and Google by extension), we expect that if we are sharing so much (personal) data with "it", "it" should know us. I expect that, like any of my "true" friends with whom I've shared endless bits of experiences and data, Google should know that when I say Giants, 99% of the time I mean the American football team, the New York Giants.
But this really isn't the type of "knowledge" that Google's algorithms are focused on (yet)… and I suspect (without any "real" inside knowledge) that they aren't storing as much data in a personal way as most people think (i.e. they certainly know what people are clicking on, but it's unlikely that they *really* pay attention to who is clicking on what… especially in any usable way right now).

The Web has become a lot more social.

When Google first came about, the majority of the web was a one-to-many platform (at least the web that Google was concerned with)… information sharing was more formal, with publisher-to-reader or teacher-to-student sorts of models. Conversations around content were more often longer, thought-out debates driven by specific intentions. Today's web has become much more about one-to-one, or at least many-to-many, conversations. This means conversations often occur more casually and more rapidly. Less thought is put into the majority of the conversations, and yet more data is actually shared. In today's web, if you find a link of quality worthy of a "vote", you are more likely to share that link via these casual conversations (e.g. Twitter or Facebook) than in longer form (e.g. blogging or other, bigger web pages).

The Web has become very much automated.

For many reasons, creating and publishing content on the web has become a lot easier. This has helped increase the amount of useful data on the web by leaps and bounds, but it has also helped increase the amount of data as a whole by orders of magnitude. Add in the fact that real money can be directly tied to getting to the top of Google's search results, and you quickly have a motivating reason to automate content generation and publication. So the goal becomes more about getting your links published than about getting your content actually consumed (by humans).
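To make the "link as a vote" idea from earlier a bit more concrete: the classic formulation behind Google's original ranking is PageRank, where a page's score is the sum of shares of the scores of the pages linking to it. The sketch below is my own toy illustration of that idea – the tiny graph, the damping factor, and the iteration count are all illustrative assumptions, not anything resembling Google's actual implementation.

```python
# A minimal sketch of the "links as votes" idea (classic PageRank).
# The graph, damping factor, and iteration count are illustrative
# assumptions for this post, not Google's real system.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with an even split
    for _ in range(iterations):
        # Every page keeps a small baseline score...
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # A page with no outbound links spreads its score evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                # ...and each outbound link passes along an equal share
                # of the linking page's score: the link is the "vote".
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical three-page web: "c" collects votes from both "a" and "b".
web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}
ranks = pagerank(web)
```

In this toy graph, page "c" ends up ranked above "b" because it receives votes from two pages rather than one – which is exactly why, back when pages were made "by humans for humans", counting links worked so well.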
Since you can now easily automate content creation, content publishing, AND content consumption (and actually make money from it)… the focus of many has shifted from trying to get content consumed (by humans) to trying to get Google to rank that content high in its results.

Open questions

There are, of course, other factors adding to the perception that Google is breaking… and there is a lot more to each of the things I mentioned above than just my quick overview of what I've been thinking about related to this "problem". But more than anything, what remains in my mind is a bunch of open, random questions and thoughts… here are a few:
Kevin has a day job as CTO of Veritonic and is spending nights & weekends hacking on Share Game Tape. You can also check out some of his open source code on GitHub or connect with him on Twitter @falicon or via email at kevin at falicon.com.
If you have comments, thoughts, or want to respond to something you see here I would encourage you to respond via a post on your own blog (and then let me know about the link via one of the routes mentioned above).