On Usage Figures

May 21st, 2011

Among the more eye-popping numbers associated with LinkedIn’s recent initial public offering is the 100,000,000 members it claims. What do those hundred million people do with their LinkedIn accounts? If they’re like me, they quietly ignore the endless spam but never quite get motivated enough to unsubscribe. Or maybe they occasionally click through a link returned by a Google search, only to discover the limp résumé of some sad sack looking to escape the Enterprise Rent-A-Car counter, not the super cool and attractive “Sean Takats” that they went to high school with and are stalking.

I’ve been thinking a lot about these kinds of numbers as the Zotero team prepares for a major summit this summer. In our first few years, we used to measure Zotero’s growth in terms of downloads, but we quit doing so well over a year ago, when that number was north of four million, having doubled from two million just a few months earlier. We stopped because downloads are never a very accurate measurement of adoption, and they are especially problematic for Zotero, which is available from a variety of repositories. Most users get our software from either zotero.org or addons.mozilla.org, but Zotero has also popped up elsewhere, mainly because we don’t restrict its distribution in any way. In the absence of any other metric, however, downloads are better than nothing, and Mendeley, for example, still uses downloads to arrive at its figure of 900K+ “people,” according to Ian Mulvany’s recent code4lib talk. And when it comes to commercial products like EndNote, we of course have no idea at all.

A second way to measure usage would be to tally user account registrations. Currently zotero.org hosts 620,000 accounts. Note that I say “accounts” and not “users.” Indeed there’s no reason to think that this figure is more than very slightly more reliable than the download count. Zotero was around for years before we even had server accounts, and we have never aggressively pushed users of Zotero to register accounts by confronting them with a sign-up form before offering the download. We think server accounts provide incredibly valuable functionality, but we also feel that it’s a little sleazy to try to co-opt people into signing up for something they don’t want. So the “real” number could be much higher! Among that mass of accounts, there are hundreds of thousands of real, active researchers but also, inevitably, countless spammers waiting to be weeded and dormant accounts sitting idle. Or maybe it’s much lower! But even if we were to pretend that all 620,000 accounts were tended to by highly motivated scholars, we would still be faced with a drop of roughly an order of magnitude when compared to downloads. A quick look at Mendeley’s people directory reveals a similar discrepancy: it lists fewer than 70,000 user accounts, which is nothing to sneeze at but of course well south of the download figure. How many accounts does RefWorks have? Again, we can’t know.
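For anyone who wants to check the arithmetic, here is a quick sketch of the downloads-to-accounts gap. The inputs are the round figures quoted above ("north of four million," "900K+," "fewer than 70,000"), so the ratios are ballpark at best, and the helper name `drop_factor` is mine, not anything from either project.

```python
def drop_factor(downloads: int, accounts: int) -> float:
    """How many downloads there are for every registered account."""
    return downloads / accounts


# Round figures quoted in the post; both sides of each ratio are approximate.
zotero = drop_factor(4_000_000, 620_000)
mendeley = drop_factor(900_000, 70_000)

print(f"Zotero: {zotero:.1f}x, Mendeley: {mendeley:.1f}x")
# prints "Zotero: 6.5x, Mendeley: 12.9x"
```

Both land in the neighborhood of a tenfold drop, which is about as precise as these inputs allow.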

A final way would be to count how many people are running Zotero each day. Because Zotero automatically checks for updated translator code on a daily basis, we know that at least 275,000 instances of Zotero ran today. But wait a minute, what’s with this “instances” and “at least” business? Well, maybe some people are running more than one copy of Zotero on a single machine. We could account for unique IP addresses, which moves the number down slightly, but then we would ignore multiple instances of Zotero sharing a single public IP address. And of course, this figure only accounts for copies of Zotero that have automatic updates active, and that managed to connect to the internet. Other software vendors could presumably track sync activity or other metrics to arrive at analogous figures.
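To make the "instances" versus "unique IPs" distinction concrete, here is a minimal sketch of the two tallies. It is not how the Zotero servers actually count anything: the `UpdateCheck` record, the `daily_counts` function, and the assumption that each running instance sends exactly one update check per day are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class UpdateCheck(NamedTuple):
    """One hypothetical log record for a daily translator update check."""
    day: str  # e.g. "2011-05-21"
    ip: str   # public IP address the request arrived from


def daily_counts(checks: Iterable[UpdateCheck]) -> Dict[str, Tuple[int, int]]:
    """Return {day: (update_checks, unique_ips)}.

    Assumes one check per instance per day, so the first number stands in
    for "instances" and the second is the more conservative tally.
    """
    requests: Dict[str, int] = defaultdict(int)
    ips: Dict[str, Set[str]] = defaultdict(set)
    for check in checks:
        requests[check.day] += 1
        ips[check.day].add(check.ip)
    return {day: (requests[day], len(ips[day])) for day in requests}


# Two instances behind one campus address, plus one home user.
sample = [
    UpdateCheck("2011-05-21", "192.0.2.10"),
    UpdateCheck("2011-05-21", "192.0.2.10"),
    UpdateCheck("2011-05-21", "198.51.100.7"),
]
print(daily_counts(sample))  # {'2011-05-21': (3, 2)}
```

The sample shows the trade-off in miniature: counting requests says three, counting addresses says two, and neither can tell whether the shared address is one person with two copies or two people on one network.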

The basic moral of the story, if you haven’t already guessed, is that these numbers are all pure shit, though some are clearly worse than others. All we can do is provide an honest explanation of how they’re derived.
