Chunky or smooth?

While thinking about the current discussion of the proposal to merge several of Wikipedia's content policies into a new policy, Attribution, I had the gut feeling that the core content policies (verifiability, no original research and neutral point of view) are better treated as separate concepts that are nevertheless applied in conjunction with one another, rather than joined together. As I thought about it, the best explanation I could find for this reaction was that the concepts operate in different ways, and thus each needs its own space in order to operate properly.

The verifiability policy, as it is currently called, focuses on discrete "chunks" of content. The basic idea is that it should be possible for any reader to find the material in an extant reliable source. I'll call this the chunky level. The neutral point of view (NPOV) policy, on the other hand, operates at a higher level: it is concerned with what is done with these verifiable chunks of material, and how they are put together. The core concept there is that, looking at the final article, all significant views on a subject should be presented fairly, in accordance with their prevalence (that is, not giving undue weight to any given view). The neutrality of individual chunks isn't what matters; the overall impression is. I'll call this the smooth level.

The prohibition on original research sits somewhere in between these two in terms of the way in which it operates. It applies to individual "chunks" of content, in that each must not be original thought, but it also works on a broader level by prohibiting original research by synthesis. It's not a small picture or big picture thing: it's everywhere, at every level, from every angle.

There is undoubtedly some overlap between the policies on verifiability, NPOV and no original research, but I don't think that's inherently a problem. Nor do I think that, where it does become a problem, it can be solved by merging the policies, because the policies operate in different ways.

Merging the NPOV and no original research policies, say, would lessen the force of the prohibition on "chunk-style" original research by focusing on the overall picture painted by the chunks when put together. Similarly, merging the verifiability and no original research policies - the thrust of the attribution proposal - would lessen the force of the prohibition on original research by synthesis by focusing on the "chunk" level and not on the way that the chunks are used.

I think that the best way forward would be to merge elements of the policies and guidelines on sourcing into the verifiability policy, rename it the "attribution" policy (the name "verifiability" is often misunderstood), and maintain the other core content policies separately. Naturally, where overlap or bloat becomes a significant problem, the policies need to be trimmed, and I think this is where efforts need to focus from now on.

How do they use us? Let me count the ways

A recent blog post from Steven Aftergood, analyzing the use of Wikipedia by the United States Government in certain intelligence products, has been attracting some attention, with responses ranging from amused to angry. Aftergood approved of the use of Wikipedia "and other unorthodox sources" by the intelligence community, but gave the standard caveat on the inherent unreliability of some content:

"The relatively new attentiveness of U.S. intelligence agencies to Wikipedia and other unorthodox sources (including [...]) seems like a healthy development. Of course, like any source and moreso than some, Wikipedia cannot be used uncritically.

Last December, according to another OSC [Open Source Center] report, a participant in an online jihadist forum posted a message entitled 'Why Don't We Invade Wikipedia?' in which 'he called on other participants to consider writing articles and adding items to the online Wikipedia encyclopedia.... and in this way, and through an Islamic lobby, apply pressure on the encyclopedia's material.' "

The references to Wikipedia content identified by Aftergood don't actually amount to much, but they do provide an insight into the manner in which content is used.

In one instance that Aftergood identifies, Wikipedia was used as a source for the names of two children of a terror suspect, although the existence of the children was already known. This is an example of what seems to me to be one of the main types of use of Wikipedia as a source in "serious" contexts: using the content as "flavour", to add something to, or fill in gaps in, existing information. Another example is one of the earliest uses of Wikipedia as a press source, a 2003 Daily Telegraph article which touches on the Suez crisis, and then offers a link to the Wikipedia article on the crisis for readers unfamiliar with the subject to learn about it.

Another significant use in such contexts is to demonstrate the prevalence of an idea or a certain piece of knowledge in mainstream society or culture. A good example of this is the first use of Wikipedia as a source by a court, the use of the article "explorer" on the German Wikipedia by the German federal patent court (the Bundespatentgericht), in a trademark case, to show how the word has come to enter the German language. This is a little disturbing, given that part of the ideological underpinning of the project is that it shouldn't change existing thought on a topic, merely summarise the existing state of affairs, but fortunately it seems that this type of use is relatively rare.

A third category of use, one which I find particularly interesting, is the use of Wikipedia content ahead of other potentially available sources for the particular clarity or roundedness of the content. One example is the Federal Court of Australia's reference to the article "reality television" in a 2005 case to obtain a broad definition of reality TV. Another example is Australian politician Danna Vale's use of the English Wikipedia article on totalitarianism in a 2005 speech to parliament; the concept of totalitarianism was (probably) well known and understood to all present, but Vale turned to Wikipedia for a good expression of certain aspects of the concept. Yet another is the California Court of Appeal's reference to no fewer than eleven Wikipedia articles in its decision in Apple v Does. Indeed, this category of use seems to be very common among the instances of courts referencing Wikipedia (that is, when they're not criticising counsel for referencing Wikipedia).

I say that this last type of use is particularly interesting because it is probably closest to the main intended use of Wikipedia content: as a starting point, as a source that someone turns to first to get a quick understanding of the basics of a topic, and to obtain a starting point for further investigation. In short, it's using Wikipedia as an encyclopaedia.

There has been plenty of work so far in amassing lists of instances of Wikipedia being used as a source in various settings. These lists stand to be an excellent resource for anyone researching the impact of Wikipedia today in terms of how it is used, and I for one would be very interested in seeing some research in this area. Understanding how Wikipedia is used, particularly in different contexts - such as in the press, in academia or in the courts - will be crucial for guiding the future direction of the project.

Vive le roi

A hot topic on the mailing list currently is a discussion about the nature of Jimbo's role on the English Wikipedia. In the discussion, I suggested that the constitutional monarchy is probably the best model for the project at the moment in terms of its governance structure. To understand why I think this is the best model, it's important to understand some Wikipedia history.

I should note at the beginning that I've only been around since October 2004, but over my time with the project I've developed what I think is a fairly good understanding of what went on before.

Originally, Jimbo exercised many important functions on the project, along with a few other select individuals, notably Larry Sanger (whose precise role is still subject to much debate, and has been since at least 2002). Gradually, power devolved, as other functionaries appeared to exercise various functions. Believe it or not, there were no sysops in the beginning; this feature wasn't added to the software until the beginning of 2002 or so, if memory serves me correctly (although I can't find a source for that currently). Here's the earliest list of sysops that I could find. The first sysops were developers like Brion, and people like Jimbo, and the functions devolved from there to be exercised currently by 1149 people.

The best example of devolution of authority is the creation of the Arbitration Committee at the end of 2003. Before the ArbCom was put in place, Jimbo performed the functions transferred to it, namely arbitration of serious disputes, including the authority to ban users. The often overlooked Mediation Committee was established at the same time, for the same purpose.

Jimbo no longer exercises these and other functions exclusively or regularly, though he reserves the right to do so. At the beginning of 2004, no one knew how the ArbCom would work out, and Jimbo reserved the right of executive clemency with respect to the ArbCom's decisions, and even reserved the right to dissolve the ArbCom if necessary.

Jimbo's role looks like a horrible, poorly-defined mess, but viewed from a constitutional history perspective, it seems fairly straightforward. Jimbo once exercised many functions, which are defined essentially by use: the functions that he had, such as arbitration, were the ones he exercised. These functions are now exercised by other functionaries, governed by their own policies, although Jimbo still has the potential to exercise them. Jimbo retains what are essentially reserve powers, to be used in extraordinary circumstances, while the day-to-day exercise of power is governed by the equivalent of a constitution (the arbitration policy, for example). Pressure from the community will serve well enough to force constitutional conventions on Jimbo's use of authority. As long as the conventions are not breached, everything's peachy.

I don't think it matters that his role isn't clearly defined. There's a doctrine in constitutional law that prerogative powers can diminish or even disappear entirely simply by not being exercised over a long period of time, which I think could be well applied to Wikipedia.

The last point I want to make is that while Jimbo's functions have been devolved, he remains a respected leader within the community, just as he was at the beginning of the project. It's a role just like that of Betty in many countries today, and it's a path well-trodden by other leaders of growing communities, in the open software field for example.

In summary, I'm comfortable with Jimbo's functional role not necessarily being rigidly defined, because I can appreciate the way it has diminished over time, and probably will continue to do so, and I can appreciate that not being rigidly defined can help this process. His leadership role is a function of his service, past leadership and the trust invested in him by many members of the community. Moreover, it is largely distinct from his functional capacity: Jimbo will remain a leader for as long as he continues to lead well, even while his functional role declines.

So to all the American editors clamouring for a constitution to define Jimbo's role: chill out and enjoy the Westminster System.

Experts and credentials

One of the hot topics lately, particularly in the wake of the Essjay incident, has been the idea that Wikipedia might consider introducing a system for validating credentials claimed by Wikipedia editors. One of the main reasons for such an idea was laid out by Jimbo in May 2005:

"[P]eople wonder, and not unreasonably, who we all are. Why should the world listen to us about anything? People think, and not unreasonably, that credentials say something helpful about that... although we never want Wikipedia to be about a closed club of credential fetishists, there's nothing particularly wrong with advertising that, hey, we are *random* people on the Internet *g*, but not random *morons* after all."

For as long as I've been editing - and probably for longer - there's been strong opposition to any form of credentialism on Wikipedia, for two main reasons, one cultural and one structural.

The cultural reason is that the community is strongly egalitarian, in that it essentially accords status by the quality (and, yes, the quantity) of work done by each editor. The Meta page on edit counting makes the excellent observation that Wikipedia (and the other Wikimedia projects too) is a gift economy, which is a type of economy distinct from a barter or market economy:

"A gift economy is an economic system in which goods and services are given without any agreement for immediate or future compensation. This differs from a barter economy - in which there is an immediate or expected quid pro quo... Typically, a gift economy occurs in a culture which emphasizes social or intangible rewards for generosity: karma, honor, loyalty or other forms of gratitude."

The structural reason is that the skills needed to create good content for Wikipedia are essentially aggregatory skills, deriving from Wikipedia's nature as a tertiary source. It is not about original thought, but about bringing together established thought from multiple - and indeed all - existing sources.

This is both a good thing and a bad thing. Opposing credentialism as a means of preserving the fundamental structural qualities of Wikipedia as a tertiary source is undoubtedly the right approach, as is operating a culture that rewards contribution. But this tends to ignore a large and very significant body of people: the readers of Wikipedia.

Readers I will define as those people who consume Wikipedia content without necessarily contributing to it; they exist outside of the culture and structure of Wikipedia and thus come to Wikipedia with completely different values. Many readers will be from cultures that place value in credentials as recognition of skills developed and knowledge acquired, and many will be from environments that are structured to respect and rely on credentials.

I don't think that these two worlds are incompatible. We can acknowledge the credentials of our editors without stepping over into credentialism. Under a verification system, readers who are engaged enough to check out who has been writing the articles they are reading would be able to evaluate the compilers of the information in front of them in the same way that policies for verifiability and sourcing allow them to evaluate the information itself.

Acknowledging credentials would be a good first step in bridging a gap between two cultures.

