The Semantic Web, circa 1934

June 17th, 2008

The Times has a great story today by Alex Wright on Paul Otlet’s early efforts to create a network of information akin to today’s Web. In spite of bloviating along the lines of “The hyperlink is one of the most underappreciated inventions of the last century” (Kevin Kelly, quoted for the article, apparently both asleep during the technology boom and never having read his own magazine, Wired), Wright’s piece treats Otlet’s work surprisingly fairly and is sensitive to the promise and limits of his analog approach. On the delivery side, Otlet imagined amalgamating the cutting-edge media technology of the day: telephone, radio, television. The glue for all this data would be the laborious human-directed cataloging and organization of information.

Of course there is a much longer history to the attempt to forge universal networks of information. To a historian of France, Diderot and D’Alembert’s Encyclopédie springs to mind. Spanning 28 volumes of text and plates, published over the course of two decades, and including nearly 80,000 entries, the Encyclopédie introduced readers to the cross-reference (the most underappreciated invention of the eighteenth century?) and also explicitly and implicitly connected them to the relevant texts of the day, either through cited references or outright plagiarism.

The success of the Encyclopédie stemmed as much from the print technology it exploited as from the extraordinary individuals who participated in the project. Over 140 individuals contributed articles. Some were experts in their fields, while others were generalists attempting to synthesize a wide range of knowledge. A single contributor, the chevalier de Jaucourt, produced over 17,000 articles, averaging over eight per day. Yet even in the eighteenth century, this massive endeavor could not keep pace with knowledge production. Wikipedia of course today brings a far larger population of contributors to bear, but it effectively frames the problem no differently, simply applying twentieth-century technology to an eighteenth-century problem.

With Diderot and D’Alembert’s Encyclopédie and Otlet’s Mundaneum, we get the sense of historical actors confronting a coming tsunami in human knowledge. Both the eighteenth century’s explosion in printing and literacy and the early twentieth century’s new media challenged existing taxonomies of knowledge. What’s missing from today’s efforts, hinted at by the Times piece, is the human element. The old Stanford-era Yahoo was limited but extremely useful because human beings created and populated the taxonomy by hand. Google is today almighty, but it’s essentially a dumb interface, and as the corpus of digital media continues to mushroom we’re as likely to be rickrolled or googlewhacked as to find the information we seek. It remains to be seen to what extent machine learning and data mining can identify and weave together semantic meaning in digital media.
