Yesterday I mentioned that WordPress was the most-used content management system among Technorati’s top 100 bloggers, and I wondered what versions of WordPress they were using.
As it happens, discovering the WordPress version is fairly simple most of the time, so I wrote a Perl robot to gather that information from the top 100 Technorati bloggers. Unlike CMSWire, I found only 29 of the 100 were WordPress blogs. (The difference of five blogs is probably due to the fact that CMSWire checked theirs by hand, while my check was automated.)
Most Top WordPress Bloggers Use the Latest Version
Below is a graph representing the WordPress versions used by the top 100 Technorati bloggers. As you can see, most of them are pretty up-to-date with their WordPress version, which is what I expected, since big-time bloggers can pay for webmasters to keep up with that stuff.
Next I wanted to find out what versions of WordPress bloggers in general (i.e., not just the top 100) were using. You can read below how I got my data, but here are the results, culled from over 5,000 blogs.
33% of Bloggers Use WordPress as Their Blogging Platform
The first graph below shows the breakdown of blogging platforms in general. As you can see, the plurality of blogging platforms, 45 percent, falls under “other.” That just means I didn’t go to great lengths to find out what the platform was, so it could be anything, even a well-disguised WordPress setup.
The next-largest slice of the pie, 33 percent, goes to various WordPress versions, and Blogspot at 17 percent rounds out the platforms with a sizable share.
The next two charts show the breakdown of WordPress versions. The first is more fine-grained, showing all point releases, and the second groups the same data by major release. The two series in each chart, “Theme” and “Feed,” reflect the two ways I acquired the data. I included both to show that their results were similar.
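The step from the first chart to the second is a simple roll-up. Here is a minimal Python sketch of it, assuming that “major release” means the first two components of the version string (so 2.3.1 counts under 2.3). The function names and the sample counts are made up for illustration and are not the survey data.

```python
from collections import Counter


def major_release(version):
    """Reduce a point release such as '2.3.1' to its major release, '2.3'.

    Assumption: the first two dot-separated components identify the
    major release.
    """
    return ".".join(version.split(".")[:2])


def group_by_major(point_counts):
    """Sum per-point-release counts into per-major-release counts."""
    grouped = Counter()
    for version, count in point_counts.items():
        grouped[major_release(version)] += count
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sample = {"2.3.1": 3, "2.3": 2, "2.2.3": 1, "2.0.11": 4}
    print(group_by_major(sample))
```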
What stands out is the fact that in general bloggers are doing a good job updating their WordPress versions. Interestingly, bloggers overall seem no less likely to keep up-to-date than the top bloggers. (The longer statistical tail of older versions is probably due to its coming from a larger dataset than that of the top bloggers.) And now that WordPress versions starting with 2.3 nag users to upgrade whenever newer versions are released, we can expect the tail to shrink in height.
The core WordPress developers seem set on including automatic upgrades in a future WordPress version, but I think the version data suggest that automatic upgrades are solving an insignificant problem. Especially considering how much core WordPress code would have to expand for this one feature (to judge by the automatic update plugin, it would be about a 13% increase) and also considering the numerous potential security complications, I’m convinced that core automatic upgrades are a bad idea. But that’s a topic for another post.
Appendix: How I Gathered the Data
The first thing I did was figure out how to coax version data from the blogs. For this I used two methods, shown in the data above as “Theme” and “Feed.” The “Feed” method requested the feed from each blog. If it got back an XML response, it looked for the “generator” tag. The “Theme” technique checked whether CSS stylesheets were served from a wp-content sub-directory, and it looked for the generator meta tag in the page header. The “Theme” method is slightly less reliable, both because themes vary so much and because some people (about 15%) remove the version information from their page headers.
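The two checks can be sketched in a few lines. This is not the original Perl robot, only a minimal Python illustration that works on text already downloaded. The function and class names are hypothetical. It assumes the usual WordPress output: an RSS generator element containing a wordpress.org URL ending in ?v=VERSION, an Atom generator element with a version attribute, and a meta generator tag whose content reads “WordPress VERSION”.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# A dotted version number such as 2.3 or 2.0.11.
VERSION_RE = re.compile(r"\d+(?:\.\d+)+")


def version_from_feed(feed_text):
    """'Feed' method: return the WordPress version from a feed, or None."""
    try:
        root = ET.fromstring(feed_text)
    except ET.ParseError:
        return None  # the response was not well-formed XML
    for elem in root.iter():
        # Drop any namespace prefix, e.g. '{http://www.w3.org/2005/Atom}generator'.
        if elem.tag.rsplit("}", 1)[-1] != "generator":
            continue
        text = elem.text or ""
        if "wordpress" not in (text + " " + elem.get("uri", "")).lower():
            continue  # some other platform's generator tag
        # Atom carries the version in an attribute, RSS in the URL text.
        match = VERSION_RE.search(elem.get("version", "")) or VERSION_RE.search(text)
        if match:
            return match.group(0)
    return None


class _ThemeSniffer(HTMLParser):
    """Collects the meta generator value and notes wp-content stylesheets."""

    def __init__(self):
        super().__init__()
        self.generator = ""
        self.uses_wp_content = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "generator":
            self.generator = attrs.get("content") or ""
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            if "/wp-content/" in (attrs.get("href") or ""):
                self.uses_wp_content = True


def detect_from_theme(page_html):
    """'Theme' method: return (looks_like_wordpress, version_or_None)."""
    sniffer = _ThemeSniffer()
    sniffer.feed(page_html)
    names_wordpress = "wordpress" in sniffer.generator.lower()
    match = VERSION_RE.search(sniffer.generator) if names_wordpress else None
    return (sniffer.uses_wp_content or names_wordpress,
            match.group(0) if match else None)
```

A page whose stylesheet lives under wp-content but whose meta tag has been removed comes back as WordPress with no version, which is the case that makes the “Theme” numbers a little less complete than the “Feed” ones.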
The next step was to figure out how to get a decent-sized dataset of regular bloggers. There are surprisingly few straightforward ways to do this. Technorati lets you browse blogs grouped into about twenty categories, each split into fifty pages of ten blogs. I wrote a script to grep the blog links from those pages, and it gave me about 5,300 unique blogs, which formed my dataset. The data are obviously biased towards blogs that send updates to Technorati, but considering that default WordPress installs ping Technorati via Pingomatic, that shouldn’t affect the results significantly. The data are also probably skewed towards blogs that are maintained (abandoned blogs are less likely to show up in Technorati), but that’s fine by me, as abandoned blogs are by definition going to have outdated versions of WordPress.
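About twenty categories times fifty pages of ten comes to roughly 10,000 listings, so the links need de-duplicating before they become a dataset. Below is a rough Python sketch of that step, not the original script. It assumes the listing pages are already downloaded as HTML strings, that every off-site http(s) link is a blog link, and that two URLs point at the same blog when host (ignoring a leading www.) and path match. The names and the skip_host parameter are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _LinkCollector(HTMLParser):
    """Collects the href of every anchor tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def blog_key(url):
    """Normalize a URL so trivially different links to one blog compare equal."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def unique_blogs(listing_pages, skip_host="technorati.com"):
    """Return the first-seen URL for each distinct blog linked from the pages."""
    seen = {}
    for page in listing_pages:
        collector = _LinkCollector()
        collector.feed(page)
        for href in collector.hrefs:
            parts = urlsplit(href)
            if parts.scheme not in ("http", "https"):
                continue  # relative links, mailto:, javascript:, ...
            host = (parts.hostname or "").lower()
            if host == skip_host or host.endswith("." + skip_host):
                continue  # the directory's own navigation links
            seen.setdefault(blog_key(href), href)
    return list(seen.values())
```

Keeping the first URL seen for each key means a blog listed in several categories is counted once.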