Every year since 2004, Technorati has published the State of the Blogosphere report, which is considered the best up-to-date source of information about the size, motivations, and practices of long tail publishers.
This year, Lijit helped Technorati by supplying some information from within the blogs that make up the Lijit Network. Lijit performed the analysis on the raw data and only supplied the aggregate insights documented in this post.
The analysis centered on four areas of interest: Search Engine Referrals, Blogroll Promotion, the Impact of Twitter, and Advertising and Analytics.
Data for this post came from two primary sources, both collected directly by Lijit.
The first source of data was the ~11K active Lijit publishers that have the Lijit Search Widget installed on their publications. Lijit builds a unique search corpus for each publisher. This search corpus includes the publisher’s publication, his user-generated content, and the network of the publishers who influence the publisher (i.e., his Blogroll). This network of influencers results in a crawl footprint of over 2.5M publications that we actively index in order to maintain the search functionality on the 11K publisher sites. The second source of data used in this post comes from information gathered on those 2.5M sites in the extended network.
Data was reduced to something we refer to as the ‘typical publisher’. For some measurements, we omitted publishers from the sample when, in our opinion, they represented a singularity in the data that substantially masked the typical publisher. In addition, for some of the analysis points, we removed publications with fewer than 100 page views a day, and we point out where we did so. As page views drop into lower numbers, some of the data begins to skew and it becomes difficult to distinguish active from inactive publishers.
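The filtering described above can be sketched in a few lines of Python. This is only an illustration, not Lijit's actual method: the post does not say which statistic defines the ‘typical publisher’, so the median is an assumption here, and the field names (`name`, `daily_page_views`, `search_share`) and the sample values are hypothetical.

```python
from statistics import median

# Threshold taken from the text: publications under 100 page views a day are dropped.
MIN_DAILY_PAGE_VIEWS = 100

def typical_value(publishers, metric, min_page_views=MIN_DAILY_PAGE_VIEWS, excluded=()):
    """Return the median of `metric` over the publishers that pass both filters.

    publishers: list of dicts with 'name', 'daily_page_views', and the metric key.
    excluded: names of publishers removed by hand as outliers.
    Using the median is an assumption; the post does not name the statistic.
    """
    kept = [
        p[metric]
        for p in publishers
        if p["daily_page_views"] >= min_page_views and p["name"] not in excluded
    ]
    if not kept:
        raise ValueError("no publishers left after filtering")
    return median(kept)

# Made-up sample: "a" falls below the threshold, "d" is a hand-picked outlier.
sample = [
    {"name": "a", "daily_page_views": 40, "search_share": 0.31},
    {"name": "b", "daily_page_views": 250, "search_share": 0.26},
    {"name": "c", "daily_page_views": 900, "search_share": 0.28},
    {"name": "d", "daily_page_views": 500000, "search_share": 0.95},
]
print(typical_value(sample, "search_share", excluded={"d"}))
```

Keeping the threshold and the exclusion list as explicit parameters makes it easy to report, per measurement, which filters were applied.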
Search Engine Referrals
A typical site within the Lijit publisher network receives 27% of its page views from clicks on results in horizontal search engine result pages. As expected, the highest single source of referrals to the typical publisher site is Google at 23.5%. Yahoo and Bing were next, accounting for about 3.2% of referrals. Twitter and Facebook were nearly identical and total about 1.6% of traffic.
[Table: page view sources, with rows including Direct to Site and Site Self-References + Other Sites]
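As a rough illustration of how a referral breakdown like the one above could be computed from raw page view logs, the sketch below labels each page view by its referrer and converts the counts to percentages. The host-to-label mapping, the function names, and the sample URLs are all hypothetical; the post does not describe how Lijit classifies referrers.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical mapping from referrer domain to a source label.
SOURCES = {
    "google.com": "Google",
    "yahoo.com": "Yahoo",
    "bing.com": "Bing",
    "twitter.com": "Twitter",
    "facebook.com": "Facebook",
}

def classify(referrer, site_host):
    """Label one page view by where it came from."""
    if not referrer:
        return "Direct to Site"
    host = urlparse(referrer).hostname or ""
    if host == site_host or host.endswith("." + site_host):
        return "Site Self-Reference"
    for domain, label in SOURCES.items():
        if host == domain or host.endswith("." + domain):
            return label
    return "Other Sites"

def referral_shares(referrers, site_host):
    """Return the percent of page views attributed to each source label."""
    counts = Counter(classify(r, site_host) for r in referrers)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}

# Made-up referrers for a site hosted at example.org.
views = [
    "https://www.google.com/search?q=x",
    "",
    "http://example.org/post/1",
    "http://search.yahoo.com/search?p=y",
    "http://othersite.net/",
]
shares = referral_shares(views, "example.org")
for label, pct in sorted(shares.items()):
    print(f"{label}: {pct:.1f}%")
```

Matching on the domain suffix rather than the full host keeps subdomains such as `search.yahoo.com` in the right bucket.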
Lijit categorizes publications into 23 topical/vertical subject areas. The Tech vertical saw the highest percent of page views from search engine referrals at 41%. The remaining topical areas were fairly consistent with regards to percent referrals.
The percent of page views that come from search engine referrals is fairly constant across audience sizes. The exception is publications with fewer than 100 page views a day, which receive a slightly larger percent of page views from search engine referrals, at around 30%.
It’s unclear why smaller publications get a larger percent of page views from search engine referrals, but it may be linked to the ever-growing length of horizontal search engine queries. According to a Hitwise January 2009 Search report, over 50% of queries on the major horizontal search engines are now 3 terms or more. This suggests that as the average query string gets longer, more referrals get passed to smaller publications due to the specificity of the queries. This is a positive trend for smaller publishers.
Blogroll Promotion
Based on the 2.5M publications crawled by Lijit, the number of blogs in the average blogroll is 47, a surprisingly high number. Although not always a prominent feature on a publisher’s site, cross promotion of bloggers by other bloggers is clearly a significant factor in publication readership growth.
The typical publication within the Lijit network of 2.5M sites appears in 6.4 other Blogrolls. In other words, the typical blog is pointed to by 6.4 other blogs. The difference between a blog appearing in 6.4 other Blogrolls and pointing to an average of 47 other blogs is largely due to blogs pointing outside of the Lijit crawl footprint. The Blogosphere is a very large place.
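The gap between the two blogroll figures can be illustrated with a small link graph. Within a closed set of blogs, the mean number of inbound blogroll links must equal the mean number of outbound links that stay inside the set, so any difference comes from links that leave the crawl footprint. The sketch below uses made-up blogs and a hypothetical `blogroll_stats` helper; it is not Lijit's crawler logic.

```python
from collections import Counter

def blogroll_stats(blogrolls):
    """Return (average blogroll size, average inbound links from crawled blogs).

    blogrolls maps each crawled blog to the list of blogs in its blogroll.
    Targets that are not keys of `blogrolls` lie outside the crawl footprint.
    """
    crawled = set(blogrolls)
    avg_out = sum(len(set(links)) for links in blogrolls.values()) / len(crawled)
    in_counts = Counter(
        target
        for source, links in blogrolls.items()
        for target in set(links)
        if target in crawled and target != source
    )
    avg_in = sum(in_counts.values()) / len(crawled)
    return avg_out, avg_in

# Made-up footprint of three crawled blogs; v, w, x, y, z are never crawled.
blogrolls = {
    "a": ["b", "c", "x", "y"],
    "b": ["a", "z"],
    "c": ["a", "b", "w", "v"],
}
avg_out, avg_in = blogroll_stats(blogrolls)
print(f"average blogroll size: {avg_out:.2f}")
print(f"average inbound links: {avg_in:.2f}")
```

In this toy graph half of all blogroll entries point outside the crawled set, which is why the inbound average is half the outbound one.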
The Impact of Twitter
Publications with greater than 100 page views a day received on average 0.83% of their page views from Twitter referrals. This percent tracked very closely to Facebook referrals at 0.80%. Publications below 100 page views a day saw a higher percent of page views from Twitter referrals than Facebook referrals.
Besides horizontal search engines, Twitter is the largest driver of referrals to the typical publication.
Lijit Search aggregates a publisher’s user-generated content into search results that display on the publisher’s site. Aggregating this content around a publisher’s site creates a stronger brand association for the reader with that publisher and site.
The most common user-generated content source included within a Lijit Search profile is Twitter, which about 50% of Lijit publishers included in their Lijit Search results in 2009. This is a change from prior years: 26.6% of publishers included Twitter as a content source in 2007, and 42% did in 2008.
Twitter was by far the fastest growing content source to be included by Lijit publishers. Clearly, publishers embrace the micro-blog format. Going forward, Lijit intends to track the percent of publishers that use Twitter for blog post promotion as we suspect this number is quite high.
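For readers who want to see the growth rate behind the adoption figures quoted above (26.6%, 42%, and 50%), the short sketch below computes the year-over-year change in percentage points and in relative terms. Only the three percentages come from the post; the function and variable names are illustrative.

```python
# Percent of Lijit publishers including Twitter as a content source, as quoted in the post.
twitter_adoption = {2007: 26.6, 2008: 42.0, 2009: 50.0}

def yearly_changes(series):
    """Yield (year, point_change, relative_change_pct) for consecutive years."""
    years = sorted(series)
    for prev, cur in zip(years, years[1:]):
        points = series[cur] - series[prev]
        yield cur, points, 100.0 * points / series[prev]

for year, points, relative in yearly_changes(twitter_adoption):
    print(f"{year}: +{points:.1f} points ({relative:.1f}% relative growth)")
```

Adoption was still rising in 2009, but the gain was about 8 points compared with about 15 points the year before.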
Advertising and Analytics
As Lijit crawls the extended network of publications, we track the widgets and tags we find on those publications. For the first time, Quantcast overtook Google Analytics as the most frequent analytics tag found on publications. This is likely due to Quantcast tags being included in some publishing platform templates.
Comparing 2008 to 2009, there has been a 68% increase in the number of sites with Ad tags installed. This indicates to us that monetizing sites is high on the priority list of most publishers.
Last year, when we ran the analysis, Google Ad tags made up 67% of the Ad tags found. This year that percentage has dropped to 47%, indicating publishers are experimenting with other Ad networks. This is probably not an indication of publishers leaving Google but rather publishers trying other Ad networks and using Google at the end of the Ad rotation.
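The reading that publishers are not leaving Google can be checked with some back-of-the-envelope arithmetic. The sketch below rests on an assumption the post does not state: that the total number of Ad tags grew at the same 68% rate as the number of sites with Ad tags. Under that assumption, a falling share can still mean a rising absolute count. The function name is hypothetical.

```python
def google_tag_change(site_growth_pct, share_before_pct, share_after_pct):
    """Relative change (in percent) in the absolute number of Google Ad tags.

    Assumes total Ad tags grew at the same rate as sites carrying Ad tags,
    which the post does not state. Totals are indexed to 100 in the first year.
    """
    before = 100.0 * share_before_pct / 100.0
    after = (100.0 + site_growth_pct) * share_after_pct / 100.0
    return 100.0 * (after - before) / before

change = google_tag_change(site_growth_pct=68, share_before_pct=67, share_after_pct=47)
print(f"{change:.1f}%")
```

With the figures from the post, the indexed count of Google Ad tags goes from 67 to about 79, an increase of roughly 18% despite the drop in share.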
More Data to Come…
With an installed base of 11K active publishers and a crawl footprint of 2.5M publications, Lijit is becoming the de facto source of information from within publications. Starting in 2010, Lijit will publish a more comprehensive study of what’s happening inside the Blogosphere.