Zotero Versus

Brian Croxall recently lit up the comment feed at the Chronicle with his ProfHacker comparison of “Zotero vs. Endnote,” where the debate centered mostly around issues of citation fidelity. As Fred Gibbs notes, however, “while citation formatting is one major reason to use bibliographic software, it isn’t necessarily the only or even primary reason, especially in the humanities.” Zotero’s citation functionality was always imagined merely as bait: by providing this labor-saving functionality, Zotero would encourage each user to move her research into what amounted to a fully searchable and shareable relational database that could be subjected to text mining and other analysis. Here researchers could begin to do truly remarkable and new things with their evidence.

A few commenters, as well as Fred, tried to shift the discussion toward the issue of cost and openness, and in particular to Zotero’s status as free/libre open source software (FLOSS). Many of Zotero’s most dedicated users have championed the software in the name of FLOSS, but this line of argument frequently falls on deaf ears, or even ears that are conditioned to reject FLOSS as somehow anti-market or anti-capitalist. From my perspective, FLOSS in and of itself is a fairly unpersuasive argument for using Zotero, akin to knee-jerk calls to “Buy American!” in the 1980s, when the USA still did some manufacturing. Buying American and using FLOSS might make one feel some sense of moral superiority, but at the end of the day can those feelings still paper over our sense of existential dread when faced with driving to work in our crumbling K-cars or cobbling together a dissertation with shitty research software?

Just as Zotero’s citation management functionality is a means to an end, so is licensing and developing the software as FLOSS. Far from just ideology, FLOSS has allowed Zotero to leverage relatively limited financial resources to outperform vastly larger and better funded competitors, old and new. Zotero’s annual operating overhead is only in the low six figures. This amount covers in-house development, outreach, and infrastructure costs. In comparison, EndNote and Mendeley each have operating costs that are an order of magnitude greater (or even more). And of course there’s an even higher, hidden cost for these platforms: the expectation of substantial profit, which necessarily impinges on sustainability.

Why should any researcher care about these issues? Defenders of Zotero have often voiced concerns about “lock-in” with proprietary, for-profit software. Users might find themselves unable to migrate their data out of one of these commercial solutions at some later date. But even if this worry were valid, and I don’t know that it is, lock-in in and of itself isn’t necessarily a bad thing. Who would complain about being locked in to the very best solution, particularly if that solution also didn’t cost any money?

Unfortunately, the closed, for-profit software option has never been the very best solution, and there’s no sign that that situation is changing. This isn’t ideology speaking; it’s history. EndNote has been derided for well over a decade for its 1990s interface and predatory “upgrade” cycles. New features come late or never, and the software has yet to embrace online research and collaboration. Mendeley, while far newer and theoretically nimbler, has likewise only slowly moved to provide the basic, core functionality that active, publishing researchers require. It’s entirely likely that “features” like journal abbreviations, citation page numbers, and subcollections will eventually make their way into Mendeley, or that EndNote will one day discover the internet, but the mere fact that these things haven’t yet transpired speaks volumes about the priorities of their parent corporations.

Because it’s FLOSS, Zotero has been able to add and refine features thanks to the contributions of hundreds of volunteer developers and the feedback of hundreds of thousands of users. The technological success of this model is undeniable: Zotero’s open-source citation engine, entirely rewritten by Zotero user Frank Bennett, and the thousands of user-contributed style files the engine uses have already been adopted by Mendeley and Papers, and a representative from XXXXX has expressed interest in doing the same.1 (Update: The individual who wrote regarding XXXXX and CSL clarifies that the communication was made in a personal capacity, not as a representative, and so I’ve removed the software’s name.)

And of course, there is no reason to think that any of these parties is acting out of ideological commitment. Indeed, if we look at how they publicly address FLOSS, we find ambivalence and disdain. Mendeley only admitted its use of Zotero code when confronted, and avoided any mention of the provenance of its citation styles for years. Frank’s citation processing engine, despite saving countless hours of development and support, earns faint praise. Papers likewise initially only confessed its planned use of the citation styles when probed on Twitter.

Liberating researchers from the constraints of commercial software development has been good for research, not for ideological reasons but for technical ones. It has also been extremely good for commercial competitors, who recognize the value in openly developed software. What’s not at all clear is whether attempting to put the genie back into this particular proprietary software bottle will sustain any of the remarkable momentum gained over the past few years, or whether innovation will instead be stunted or stifled in pursuit of illusory financial gain.

As commenters on Brian’s post noted, there is a real cost associated with moving between research software, and it’s inevitably in the interest of for-profit entities to keep those costs as high as possible. Right now the market won’t bear very high costs, but that’s largely thanks to Zotero, not because it’s free but because it’s FLOSS. EndNote, Mendeley, and the rest simply aren’t equivalent players, because the market that they’re squabbling over is checked in growth and in all likelihood doomed to decline so long as there is a strong FLOSS competitor.

  1. To my knowledge, not a single publication style file has ever been contributed by a non-Zotero user.
  • Bruce

    The “sustainability” question is an interesting one. For commercial players the question becomes “what do we do if people don’t pay us to use our software?” For primarily grant-funded FLOSS projects like Zotero, the equivalent is “what happens when the grants dry up?”

  • Rintze Zelle

    Zotero vs. the World

    I think that when it comes to choosing a reference manager, Zotero’s open source nature is very effective in attracting (or creating) power users, as it allows them to add or improve features to scratch their own itches (and often learn new skills doing so). Of course, as these features get incorporated, this also benefits “regular” users. Zotero proved an important nurturing ground for the Citation Style Language, and the Zotero user community contributed most of the CSL styles in existence. Similarly, without Zotero Frank Bennett would probably not have undertaken his monumental effort to create the first CSL 1.0 processor. CSL, the collection of CSL styles, and Frank’s CSL processor have since found their way into other tools (Mendeley uses all three, Papers uses its own, closed source, CSL processor). In this respect, the open source nature of Zotero is stimulating innovation in the entire field of reference management software, arguably much more so than its closed source alternatives.

    And because grant funding takes popularity into account, the choice of regular users to use an open source reference manager can have significant impact on future innovation.

    If we limit ourselves to EndNote, Mendeley and Zotero, we see very different business models. EndNote has a very traditional (and proven to be successful) business model: licensing software on a per-user or institutional basis. In contrast, Mendeley and Zotero currently both offer premium accounts with increased online storage capacity. However, while Mendeley relies heavily on private equity, and is exploring institutional subscriptions, advertisement, and monetizing content, Zotero’s non-commercial model is based on grant money and donations.

    As a Zotero user, a question in which I’m very much interested is whether institutions can be convinced to financially support open source tools like Zotero (another intriguing example is Octave, an open source alternative to Matlab). While many institutions are faced with budget cuts, financially supporting open source tools would allow those tools to compete much more effectively with their commercial counterparts. If, as a result of such investments, institutions can drop subscriptions to those commercial tools, this might very well be an effective cost-cutting measure in the long run.

  • FLOSS projects have typically been deemed more sustainable because they presumably offer the possibility of a softer landing in the event that the project implodes. In other words, if the source is or has already been open, it will be easier for others to pick up the pieces and continue, or at least to extract their data intact and not via some terrible, lossy format like RIS.

    I don’t find this argument particularly convincing, since it relies on a doomsday scenario that’s exceedingly unlikely to occur with established, commercial players. Instead, what makes FLOSS, at least in the case of Zotero, appealing from a sustainability perspective is the dramatically lower overhead faced by the project. We spend a ton of money on infrastructure, but we have relatively low development costs because we can rely on our development community and on the integration of other GPLv3 tools.

    This relative economy, in turn, is essential when it comes to sustainability. Zotero is already at the point in terms of storage sales where it can pay for infrastructure and a decent amount of development indefinitely. Where grants will remain important in the short term is for huge changes in the code that will be even more transformative than Zotero Everywhere. But in terms of keeping the project humming along, we’re already in great shape.

    In contrast, something like Mendeley requires millions of dollars per year, and on top of this, there is the expectation that revenue will significantly exceed this overhead, at least in order to attract a buyer. That’s an extremely difficult position for any project to be in, and one that Zotero fortunately does not have to face. Yet at present Mendeley is working with a significantly smaller and less established user base and, for the moment, precisely the same business model. It has been a great model for us, but it’s clearly not going to sustain them.

  • Rintze Zelle

    For Zotero, is there anything holding back the periodic release of a) overhead costs (infrastructure and development) and b) income via grants and storage sales? (or is this information already available somewhere?)

    I think institutions would switch to Zotero more easily if you could convince them with numbers that Zotero is in good shape. More information on the roadmap (which grants are used for what) would help as well (plus it would help in attracting more developers).

  • Steve

    At the end of the day, regardless of the open/closed nature of any project’s source code, a market with competition is better for *all* users than one without. I think all the products mentioned (as well as others that aren’t) are doing great things to benefit researchers in general, and in time, things will only get better (as they have proven to in recent history). And that, I think, is the point.

  • I’ll certainly agree with you that what you call “the point” does fairly describe Zotero, and I’m relieved to hear you say the same about your employer, Mendeley (if I may). Where I’ll disagree, perhaps only because I’ve been on the using end of these tools for a very long time now, is that all of these products are “doing great things to benefit researchers in general.” My colleagues and I see cynical moneymakers who are milking a runaway train of institutional money that could be far better spent elsewhere. Ask yourself, how much innovation was there in this space, even when there was market competition, until there was a critical mass of developers and users to produce new code and functionality that could be used by everyone, including Mendeley? Do you remember how lamentable your own citation functionality was until you began using CSL and Zotero plugin code, and how much it has improved yet again now that you’re using citeproc-js? We’re talking about the absolute core, sine qua non functionality of all these products, now inarguably best-in-class, and it’s entirely founded on three tightly interwoven and purely open projects. And what about Zotero’s open translation architecture? How valuable was that and the translators in developing your own system of ingesting online content? No one will ever know, nor will they ever be able to share the benefits of your own developments in this space. It’s absolutely your right to look at open source as a one-way street, but it’s disingenuous to pretend that it’s an equitable relationship.

    It bears reminding that Zotero is and always has been a research project in its own right. It’s part of the research projects division which I direct at the Center for History and New Media. As a result, the project remains 100% focused on the needs of practicing researchers and academics because, well, that’s not only my job, it’s also me. Mendeley still claims to be run by researchers, but that’s less true by the day, isn’t it? That would be like Dan Cohen and me saying that Zotero is run by college guys. I mean, we were, once upon a time, but not today. I suppose I failed to make a persuasive enough case in my piece, but I’ll say it again: these distinctions are not merely semantic or ideological. They have a real bearing on the kinds of output and the quality of experience for individual users and for the research community as a whole, not least of all users of Mendeley.

    I’m really happy that you wrote here because based on what I can glean about your role and what I’ve seen you write in the past, you’re not at all in the cheerleading business, and so I’m extremely interested in what you have to say. Thanks for stopping by.

  • Nathan

    There’s a case to be made for ‘Zotero et’, not just ‘Zotero versus’. Alongside the competition between brand-name reference management mega-apps, there proliferates an ecological community of synergetic mini-apps, none of which depends on the others nor entirely replaces the others. For a few years I’ve been using Zotero in combination with zot2bib and BibDesk. BibDesk is my primary reference management application (for various reasons that I won’t detail here), and zot2bib is a Firefox extension that essentially allows one to use Zotero as a web scraper for BibDesk. (BibDesk also has its own built-in web scrapers and connections to external databases, so it could be used without Zotero.) Because BibDesk stores its data in BibTeX files, it directly integrates with the TeX typesetting system and all of the associated bibliographic software, such as biber, biblatex, bibsql, librarian, and so on. All of this is free open-source software. Zotero et…
