Sunday 31 August 2008

How Collective Wisdom Shapes Business, Economies, Societies and Nations (and Wikipedia articles)

Alaska Governor Sarah Palin was selected as presumptive Republican presidential candidate John McCain's running mate on Friday, and her Wikipedia article has seen a predictable explosion in editing activity. From the article's creation in 2005, up until the announcement on Friday, the article had been edited something like 900 times. Since then, however, it has been edited nearly 2000 more times.

What's more interesting is how the article was edited before the announcement was made. Ben Yates mentions this NPR story detailing edits made to the page by a user called Young Trigg, who may or may not have been Palin herself (or someone on her staff). But Young Trigg was not the only person editing the article.

The Washington Post reports on some analysis done by "Internet monitoring" company Cyveillance, which found that Palin's article was edited more heavily in the days leading up to the announcement than any of the articles on the other prospects for the nomination. A similar pattern emerged in relation to the articles on the frontrunners for the Democratic vice-presidential nomination: Joe Biden's article was edited more heavily than the other potential picks in the leadup to his selection as Obama's running mate last week.

Also similar were the types of edits being made: both Palin's and Biden's articles saw many footnoting and other accuracy-type edits in the leadup to the announcements of their selection. As a final piece of intrigue, the editors making these edits to Palin's and Biden's articles were far more likely to also be actively editing McCain's and Obama's articles respectively than were the editors of the articles on the other potential nominees.

There are at least two explanations for these patterns. The first is that the two campaigns, knowing full well who the nominees would be, were editing the articles in advance of the announcement to ensure that they were accurate (or to take the cynical view, to ensure that they were favourable), aware that Wikipedia would be one of the major sources of information for the public - and for journalists and campaign staff too - following the announcements.

The alternative is more interesting, to my mind. Cyveillance, which did the analysis, usually does its data mining for commercial clients, collating disparate sources of public information to predict financial and commercial events before they are publicly announced. Wikipedia may be performing exactly the same function: a variety of editors collating disparate pieces of information in a far more powerful way than any individual could. It's already (un)conventional wisdom that the betting markets are as good as or better than opinion polls at predicting elections: a basic application of the efficient market hypothesis. In a similar way, high profile, highly edited Wikipedia articles like these are the marketplace of the information economy.

Saturday 23 August 2008

Userpage Google envy

Brianna Goldberg, a Canadian journalist with the National Post, wrote on Friday about her efforts to become the number one Google result for her name. Her quest was sparked by discovering that the Wikipedia user page of another Brianna Goldberg was ensconced in the top spot.

The journalist Goldberg obtained advice from search engine optimization experts on methods for advancing her ranking, but still had difficulty displacing the userpage. Moreover, the article on the journalist comes in second to the userpage in results from Wikipedia. Wikipedia user pages are certainly highly visible: every time you sign an edit, you're creating a link to your user page.

This relates to a discussion from last month on the mailing list about whether user pages (and certain other types of pages) should be indexed by search engines at all. The Wikimedia sites already instruct search engines not to crawl certain pages, including deletion debates, requests for arbitration pages and requests for adminship, but there have regularly been calls for more types of pages to be restricted (see here for example).
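For readers who haven't seen how this kind of crawler restriction works, here is a minimal sketch in Python. It assumes the restriction is done with robots.txt-style rules. The rules, paths and the helper name is_crawlable are made up for illustration and are not the actual Wikimedia configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real Wikimedia robots.txt.
# Each Disallow line asks well-behaved crawlers to skip paths with that prefix.
RULES = """\
User-agent: *
Disallow: /user/
Disallow: /deletion-debates/
"""


def is_crawlable(path, rules=RULES, agent="ExampleBot"):
    """Return True if a crawler honouring `rules` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/wiki/Sarah_Palin", "/user/Example", "/deletion-debates/2008"):
        print(path, is_crawlable(path))
```

Rules like these are only a request: a crawler that ignores them can still fetch the pages, so excluding user pages this way would hide them from the major search engines rather than make them private.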

So, should user pages be blocked from search engine crawlers?

Monday 18 August 2008

US court groks free content licensing

The US Court of Appeals for the Federal Circuit handed down an interesting and significant decision on Wednesday, which could have a number of valuable implications for the validity of free content licences.

The case, Jacobsen v Katzer, was about software for interfacing with model trains. Robert Jacobsen is the leader of the Java Model Railroad Interface project (JMRI), which releases its work under the Artistic License 1.0; Matthew Katzer (and his company Kamind Associates) produce commercial model train software products. It was alleged that either Katzer or another employee of Kamind took parts of the JMRI code and incorporated them into Kamind's own software, without identifying the original authors of the code, without including the original copyright notices, without identifying the JMRI project as the source of the code, and without indicating how the original JMRI code had been modified.

Jacobsen sought an interlocutory injunction, arguing that since Katzer and Kamind had breached the Artistic License, their use of the JMRI code constituted copyright infringement. However, the District Court considered that Jacobsen only had a cause of action for breach of contract, not for copyright infringement, and because of this Jacobsen could not satisfy the irreparable harm test (in the case of copyright infringement, irreparable harm is presumed in the 9th Circuit), and was not entitled to an injunction.

Jacobsen's appeal to the Court of Appeals was against this preliminary finding. An assortment of free content bodies (including Creative Commons and the Wikimedia Foundation) appeared as amici curiae in the case, submitting an interesting brief containing a number of arguments that the Court of Appeals seemed to agree with.

The legal issue at stake in the appeal concerned the difference between conditions of a contract and ordinary promises (covenants, in US parlance). If a term in a contract is a condition, then the promisee has a right to terminate the contract. In the context of a copyright licence, if someone using the licensed material breaches a condition of the licence, they are then open to a copyright infringement action (unless they have some other legal basis for using the material). Contract law will still hold someone responsible for breaching a contractual promise, but the remedies are different, and as was the issue here, it's much harder to get an interlocutory injunction.

Whether or not a term is a condition is a matter of construction, and depends on the intention of the parties. In answering the question of whether the relevant terms were conditions, the Court of Appeals made a number of important observations which are applicable to free content licences generally.

The first observation was that, just because with free content licensing there is no money changing hands, it is not the case that there can be no economic consideration involved. The Court recognised several other forms of economic benefit which free content licensors derive from licensing their works:

"There are substantial benefits, including economic benefits, to the creation and distribution of copyrighted works under public licenses that range far beyond traditional license royalties. For example, program creators may generate market share for their programs by providing certain components free of charge. Similarly, a programmer or company may increase its national or international reputation by incubating open source projects."

This is a really significant observation for the court to make, because there are some major ideological barriers that seemed to get in the way of the District Court on this point. Even though free content licensing is all about authors dealing with their economic rights under copyright, free content is all too often viewed as non-economic. Just because free content doesn't fit in with the traditional royalties-based system, it does not mean that there are not real economic motives involved.

The second observation was made in the context of the general rule (applicable in that jurisdiction) that an author who grants a non-exclusive licence effectively waives their right to sue for copyright infringement. If the relevant terms were conditions, then they would be capable of serving as limitations on the scope of the licence, which would negate this rule. The Court said that:

"[t]he choice to exact consideration in the form of compliance with the open source requirements of disclosure and explanation of changes, rather than as a dollar-denominated fee, is entitled to no less legal recognition."

Again, this seems to be an important point in terms of getting over psychological hurdles. The District Court was clearly hung up on the terms in the Artistic License allowing users to freely distribute and modify licensed material; it focused on the breadth of the freedoms granted. In doing so it overlooked that while the License did grant broad freedoms, it clearly circumscribed them. The Court of Appeals understood what the District Court did not: that releasing material under a free licence is not the same as giving it away.

The heart of the decision was of course about the particular wording in the Artistic License. The use of the phrase "provided that" in the Artistic License was significant, because such wording usually indicates a condition under Californian contract law. Further, the requirement that any copies distributed be accompanied by the original copyright notice - a relatively common term - also typically indicates a condition.

In the end, the Court of Appeals decided that the relevant terms were conditions, and that Jacobsen had a copyright infringement action open to him. Since the District Court didn't assess Jacobsen's prospects of success on the merits, the Court of Appeals remanded the injunction application to it for consideration. Given that Katzer and Kamind apparently conceded that they did not comply with the Artistic License, Jacobsen would seem to have a good chance of getting his injunction, and later of succeeding at the merits stage.

Though much turned on the particular wording here, the reasoning behind the assessment of the terms can easily be applied to other free content licences, as can the recognition of the economic motives involved in free content licensing, motives which, though non-traditional, are both legitimate and worthy of protection by the law. Independent of any value as a binding precedent, this case is a magnificent example of a court really appreciating the vibe of free content.