Bing, Google and “real time content” search
Lots of reports over the last couple-a days are hitting the wires on how Microsoft Bing and Google are attacking the problem of adding feeds from Twitter and Facebook into their respective search indexes. Cool. I personally want my stuff discovered and leveraged by folks the world over, as I am proud of and confident in the content I publish, be it my blog posts, my tweets, my videos, etc. To me as a blogger, this is a good thing, offering additional possibilities of exposure and distribution of my creations.
But I am still wrestling with the meaning of ‘real time’ in this context, e.g. how content sources like tweets and public Wall Posts meaningfully get ranked, get injected into these types of searches and ultimately yield better ‘discover-ability’ of my content.
But first, humor me. Perhaps I am slow, but I’m not understanding ‘real time search’. To me, ‘real time’ is indicative of push-based notification. In other words, an event happens and software triggers a process to push this event and its notification package to a recipient. Example: Someone tweets and my desktop tweeting software indicates to me via a pop-up that a new tweet has occurred…all in ‘near’ real time.
Search, however, is inherently post facto. It is a mechanism of forensics to help humans parameterize what they are looking for on the interwebs and the search tools diligently do their job to identify and produce results in a fairly consistent ‘page rank’-oriented way. Then you start the laborious process of hunting through the results to find the needle in the haystack and hope it’s relevant. In other words, an event happens, the search engine indexes this event (even if moments after it was created in near real time) and thus allows it to be discovered later via search. I think that our CEO Todd Vernon summed up three distinct categories for search really succinctly in his post on this subject of examining real time search…
- DISCOVERING: Something is happening, it may or may not be something I care about, and I don’t know it’s happening. I usually find out about it via some source, personal network, facebook, twitter, digg, CNN, etc. I don’t have a specific mission to know it before I find out about it.
- ALERTING: Something is happening and I knew ahead of time I wanted to know about it when it happens. I usually find out about these things via some source such as Google Alerts or Filtrbox. Alerts behave a little like a broad based search with asynchronous results that come back some day.
- SEARCHING: Something is happening or has happened or simply exists. I want to know more about it. I generally want the best answer, or the most recent answer. I may want the best recent answer, but that’s highly subjective and generally defaults back to my trust in the source as the tie breaker.
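For the code-minded, here is a minimal sketch of how I picture the difference between ALERTING and SEARCHING. Everything in it (the `TinyFeed` class and its `subscribe`, `publish` and `search` methods) is made up for illustration and is not how Bing, Google, Twitter or anyone else actually does it. The point is only that an alert is a standing query registered before the event, and it gets pushed to you, while a search is a query you run after the event against whatever has been indexed.

```python
from collections import defaultdict


class TinyFeed:
    """Toy feed (hypothetical) supporting standing alerts (push) and search (pull)."""

    def __init__(self):
        self.index = defaultdict(list)   # term -> posts containing that term
        self.alerts = defaultdict(list)  # term -> callbacks registered ahead of time

    def subscribe(self, term, callback):
        # ALERTING: I say what I care about before anything has happened.
        self.alerts[term.lower()].append(callback)

    def publish(self, post):
        # An event happens: index it, and push it to anyone already waiting.
        for term in set(post.lower().split()):
            self.index[term].append(post)
            for callback in self.alerts.get(term, []):
                callback(post)

    def search(self, term):
        # SEARCHING: after the fact, ask the index what it already knows.
        return list(self.index.get(term.lower(), []))


feed = TinyFeed()
pushed = []
feed.subscribe("cyclocross", pushed.append)
feed.publish("Muddy cyclocross race today")
feed.publish("Coffee ride tomorrow")
```

After this runs, `pushed` holds only the cyclocross post, which arrived without my asking again, while `feed.search("ride")` finds the coffee ride post only because I went looking for it afterwards.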
So, with the above established (and I’ll cease and desist on the real time versus retroactive search debate for the time being), I have some additional and more important concerns, mainly related to the sheer vastness of the twitterfaceverse and how anything relevant can be discovered from these fire hoses.
Assume someone searches for a term on Bing and then proceeds to the new Twitter timeline result set to poke around at the results. It’s a comprehensive result set at best. But how am I discovered? Again, I’m lost in the sea of Twitter chatter. And just as importantly (to me, anyways), how is my trusted network discovered in connection with me, and in context, during a search? I really want their words, thoughts and images discovered along with mine to help a searcher form an opinion.
To solve this, we took an approach here at Lijit that is quite orthogonal to the one outlined above. We’ve essentially taken the ‘internet’ and boiled it down to very explicit and succinct publisher-defined networks. Think of it as filtering the internet’s vastness by producing results based on absolute relevance between sets of trusted associations. If you’ve visited my site, you’re invariably a cycling geek like me. And if you’ve gone so far as to search my site, it’s likely you’ve done so with some strong intent and precision in what you are looking for. Lijit issues results based upon the term’s relevance in all my content and that of my network to provide a super tight snapshot of information we serve to the reader. And like all the content we dish up, my tweets and TwitPics will be displayed alongside the tweets, etc. from my network and their chatter about the same topic. We order and display this in very obvious ways, including thumbnails, to ease discovery.
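To make the ‘tight’ idea concrete, here is a rough sketch of what scoping a search to a publisher-defined network could look like. To be clear, this is not Lijit’s actual implementation: the `Item` type, the `network_search` function and the simple substring match are all assumptions made purely for illustration. The idea it shows is just that anything written by someone outside the publisher’s trusted set never makes it into the results, and the publisher’s own content comes first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One piece of content (hypothetical shape): a blog post, tweet, photo caption, etc."""
    author: str
    text: str


def network_search(term, items, publisher, network):
    """Return items matching `term`, limited to the publisher and their trusted network."""
    term = term.lower()
    trusted = {publisher} | set(network)
    hits = [
        item for item in items
        if item.author in trusted and term in item.text.lower()
    ]
    # Publisher's own content first. sorted() is stable, so the original
    # order is preserved within the publisher group and the network group.
    return sorted(hits, key=lambda item: item.author != publisher)


items = [
    Item("stranger", "cyclocross giveaway click here"),
    Item("friend", "Great cyclocross course this weekend"),
    Item("me", "My cyclocross race report"),
    Item("me", "Coffee ride tomorrow"),
]
results = network_search("cyclocross", items, publisher="me", network=["friend"])
```

Here `results` contains my race report followed by my friend’s post. The stranger’s tweet matches the term but is filtered out because that author is not in my network.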

Net-net, we’ve been doing this ‘real time social media content generation tool’ discovery for some time, along with a raft of online content sources, and have applied relevance to it all. Ultimately, when the discussion is all boiled down, comparing the results Lijit produces with what Bing or Google produce is apples and oranges, and at its core not a debate per se. One yields intentionally focused and ‘tight’ results, the other broad and ‘loose’ results. Both, however, have purpose and fulfill different search needs. But when pointed at the topic of discovering ‘real time’ micro-blogging content, I am not sure how the ‘broad’ assists in making discovery simple and efficient. These tools are so fundamentally rooted in their network (read: people to people) associations that sifting through the vastness of RTs, bit.ly and tinyurl links from anonymous and unknown authors, and knowing which result to trust and use, would be a mind-bending, overly complex and time-consuming task. But again, all this is strictly my opinion and how I use and value social network tools like Twitter.
So Tweet on. We’ll ensure it is found and understood when it’s searched for via your trusted Lijit search box on your site.






