Thursday, 4 February 2010

Hollywood v the Internets

Australian copyright law has a new landmark decision as of this morning, with Justice Cowdroy of the Federal Court of Australia handing down his decision in the Roadshow Films v iiNet Limited case, in which the misleadingly named Australian Federation Against Copyright Theft (AFACT) sued iiNet, Australia's third-largest ISP, alleging copyright infringement. The case is significant in several ways, both for ISPs and for operators of Internet services in Australia.

AFACT is a consortium of Hollywood movie studios who alleged that iiNet customers infringed copyrights owned by them in certain films by distributing copies via the BitTorrent file sharing protocol, and that iiNet itself had infringed by authorising its customers' infringements. AFACT had engaged an anti-piracy software firm to track the transmission of films over BitTorrent by IP addresses allocated to iiNet, and had then sent notices to iiNet warning them of the infringements and requesting that the ISP take action against the customers concerned. iiNet argued that it had not authorised any infringements. It also argued that privacy provisions in telecommunications legislation prevented it from acting upon any notices sent to it, and alternatively that it was protected from litigation by safe harbour provisions in copyright legislation.

Cowdroy J today held that while iiNet customers had infringed copyrights owned by AFACT members, iiNet had not authorised these infringements, for three reasons:


  1. that one can distinguish "the provision of the 'means' of infringement compared to the provision of a precondition to infringement";
  2. that any scheme for acting on AFACT notices would not constitute a relevant power or a reasonable step available to prevent infringement (within the meaning of s 101(1A) of the Copyright Act, which sets out factors that must be considered in assessing authorisation); and
  3. that iiNet did not sanction or approve of copyright infringement by its customers.


Cowdroy J held that the means of infringement in this situation was the BitTorrent system (the protocol, trackers and clients) and not iiNet's network, thus distinguishing classic authorisation cases such as University of New South Wales v Moorhouse (involving a university library that provided photocopiers for the use of library patrons) as well as more recent Internet-centric cases such as Universal Music v Sharman License Holdings (in which Sharman was found to have authorised infringements via its Kazaa file-sharing software, having both refrained from taking steps to prevent infringement and actively encouraged it).

Distinguishing the ultimate means of infringement from mere preconditions in this way injects some clarity into the test for authorisation, which has largely revolved around degrees of control and encouragement (Cowdroy J's second and third reasons mentioned above go to this classic test). This approach was obviously advantageous for iiNet. However, for operators of services such as wikis and social-networking sites, it would seem to make it more likely that they would be found to have authorised copyright infringements by their users, since they provide the means of infringement, such as a file upload facility or the ability to edit pages.

Without authorisation, AFACT's case failed; however, Cowdroy J went on to consider iiNet's other defences anyway, in anticipation of an appeal (which would seem highly likely). He held that iiNet would not have been protected by s 112E of the Copyright Act, which protects telecommunications providers from being held to authorise infringement merely through providing the telecommunications service used to carry out the infringement. However, he found that iiNet would have been protected by the safe harbour provisions in the Copyright Act (s 116AA ff) because it had a "reasonably implemented" policy for dealing with repeat infringers.

These safe harbour provisions were based on the United States' OCILLA safe harbour provisions, although the American provisions extend to "online service providers" (including website operators) while the Australian ones are limited to "carriage service providers", that is, ISPs themselves. To my knowledge this is the first case to seriously address these provisions, and Cowdroy J notably drew on American OCILLA jurisprudence in doing so. It thus seems that the safe harbour provisions will provide reasonably strong protection for ISPs, although with the legislation in its current form, this is of little comfort to online service providers.

The decision is significant in the context of Australian copyright law, and will be a boon for ISPs operating in Australia. However, for online service providers (such as operators of wikis), the substance of the decision will only serve to underline how precarious their legal position in Australia is, compared with that of their American counterparts, when it comes to copyright infringement by users of their services. They are not protected by safe harbours, and a "means"-based test for authorisation may well be worse than the more traditional control/encouragement test, if indeed it replaces it (it may merely augment it).

The silver lining, however, may be in Cowdroy J's rhetoric. His discussions of AFACT's nature and objectives, of its arguments and trial conduct, and of its attempt essentially to foist upon iiNet a positive obligation to protect its members' copyright interests, are enlightening. Robert Corr extracts some choice quotes here. Following last year's even more significant landmark decision by the High Court of Australia in the epic IceTV case, there would seem to be a healthy desire, in certain quarters of the legal community, to re-evaluate some of the more extreme trajectories in Australian copyright law.

Saturday, 30 January 2010

What happens to unreferenced BLPs?

Those of us who live other than under rocks will no doubt be aware of the latest controversy over Wikipedia's approach to biographies of living persons (BLPs), concerning the deletion last week of a large number of BLP articles that had been tagged as unsourced and had not been edited for more than six months. The deletions sparked a giant administrators' noticeboard discussion, a request for arbitration and now a request for comments on how to proceed from here.

At the crux of the dispute is how seriously the project is to take the modified standards that it has adopted with respect to biographies of living persons.

Debates of this sort are usually run along inclusionist/deletionist lines, but really the more important philosophical dichotomy when it comes to BLPs is between eventualists and immediatists. Wikipedia on the whole favours an eventualist perspective - facilitated by the almost immeasurably large potential pool of labour out there - but the BLP policy is essentially a localised switch to immediatism: unsourced material needs to be sourced post-haste, or else removed.

Conceptually it's an elegant and attractive approach. But a major flaw with it is our attraction to eventualism. We just can't shake it off.

This category, and its many subcategories, tracks BLP articles that have been tagged as not having any sources. At the time of writing there are over 47,000 of them, some having been tagged as long ago as December 2006. Evidently any sense of urgency has passed those by. The backlogs mount until they approach the point where individual editors have difficulty comprehending the problem, let alone working to address it. Frustration builds at the inevitable inertia, until something radical happens, like these mass deletions.

Is this view accurate? Is the problem of unsourced BLPs really out of hand? We can try to answer these questions by looking at the way the backlog has been managed.

Unfortunately, the data available for this purpose is somewhat limited. Database dumps older than the 20 September 2009 dump are currently not available due to maintenance. However, that September dump, together with dumps from 28 November 2009 and 16 January this year (shortly before the deletions started), offers three data points with which to commence.
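
The underlying counting is straightforward: tally the membership of each monthly subcategory at each data point. I worked from the dumps, but for a rough current snapshot the same per-month tallies can be pulled from the live MediaWiki API. A minimal sketch follows (the "Unreferenced BLPs from <Month> <Year>" naming pattern is an assumption here, and historical figures of course still require the dumps):

    # Sketch only: current sizes of the monthly unreferenced-BLP subcategories,
    # via the MediaWiki API. Historical numbers still come from database dumps.
    import requests

    API = "https://en.wikipedia.org/w/api.php"

    def category_size(month):
        """Return the current member count of one monthly subcategory (assumed naming)."""
        params = {
            "action": "query",
            "prop": "categoryinfo",
            "titles": "Category:Unreferenced BLPs from %s" % month,
            "format": "json",
        }
        pages = requests.get(API, params=params).json()["query"]["pages"]
        page = next(iter(pages.values()))
        return page.get("categoryinfo", {}).get("size", 0)

    months = ["October 2006", "November 2006", "December 2006"]  # extend through August 2009
    print(sum(category_size(m) for m in months))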

The monthly subcategories from October 2006 to August 2009 inclusive were common to all three dumps. The total number of articles in these categories declined from 50,715 in September to 43,655 earlier this month, a 13.9% fall. However, over the same period, the total in all subcategories through December 2009 rose from 50,715 to 51,301, a 1.2% increase. At least over this period, new additions outweighed articles being removed from these categories.

It should be noted that some of these additions are due to previously tagged but unsorted articles being sorted into the monthly subcategories. In fact, ten of the thirty-five subcategories common to all three dumps saw their numbers increase since September. The following graph shows the change in the monthly category totals over the roughly four months between the September and January dumps:


Without analysing the actual changes in the lists of articles in these subcategories, it isn't possible to tell whether the sorting process is merely outweighing the normal reductions from articles being referenced or deleted or whether, as I suspect, there are genuinely fewer reductions in the subcategories that are no longer recent but not yet the oldest. This can be the subject of further inquiry.

What we can say now is that the total number of unreferenced BLPs is showing real decline for the first time in at least four months, possibly longer. It seems to have been the shock of mass deletions that spurred people into action, either to fix or to delete these articles. Hopefully the shock will last long enough for a significant reduction to be achieved.

Tuesday, 13 October 2009

WikiReader

Openmoko, a group which produces and distributes an open-source mobile phone environment, as well as phones to run it, has released the WikiReader, a dedicated device for reading Wikipedia. The WikiReader has a 240 by 200 pixel touchscreen and uses a compressed, text-only version of Wikipedia stored on a microSD card. Users can subscribe to receive quarterly updated copies on a new microSD card, or download the updates for free.

There are many implementations out there for reading Wikipedia on mobile devices, but to my knowledge this is the first dedicated Wikipedia reading device. However, beyond the inherent simplicity that a dedicated device provides, it's difficult to see many advantages to the WikiReader over other options.

One of the major advantages of Wikipedia is its up-to-the-minute coverage, and as an offline device (even with quarterly updates) the WikiReader loses this advantage. Mobile online access to Wikipedia has not been the best in the past, but the Wikipedia mobile portal has received plenty of tender loving development recently and is now quite decent, even on older devices. Aside from this mobile web interface, there are also dedicated Wikipedia reading apps for devices such as the iPhone.

Naturally not everyone has mobile internet access, or is always in a location where it is available, so offline methods are essential for many people. But there are plenty of implementations available for other devices, such as Encyclopodia, for the iPod family, or a TomeRaider format Wikipedia ebook.

Of course, the convenience of Wikipedia has been central to its success, and the convenience of a dedicated device may outweigh its disadvantages. It will be difficult for the WikiReader to succeed, however, when there is so much more flexible competition out there.

Wednesday, 1 July 2009

Arbitration Committee mail traffic

Some brief traffic statistics on the Arbitration Committee's mailing list:


  • a total of 14,692 messages were received by the list from January through June this year
  • an average of 81 messages were received each day
  • this is more than foundation-l (4,473), wikien-l (4,015) and wikitech-l (2,924) combined over the same period, with change left over
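
For anyone checking the arithmetic, a quick sketch (January through June 2009 is 181 days):

    # Quick check of the figures above.
    messages = 14692                      # Arbitration Committee list, January-June 2009
    days = 31 + 28 + 31 + 30 + 31 + 30    # 181 days in that period
    print(round(messages / days))         # ~81 messages per day
    others = 4473 + 4015 + 2924           # foundation-l + wikien-l + wikitech-l
    print(messages - others)              # 3280 messages of change left over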

Conclude from this what you will.

Monday, 29 June 2009

All Quiet on the Waziri Front

There's an interesting piece in the New York Times today on investigative journalist David Rohde - who was kidnapped in Afghanistan last year and escaped last week from his captors in Waziristan, in north-western Pakistan - and on the efforts to extend the media blackout on news of the kidnapping to his Wikipedia article.

The blackout was orchestrated by the New York Times Company and was said to have involved forty international news agencies, from NPR to al-Jazeera. NYT personnel "believed that publicity would raise Mr. Rohde's value to his captors as a bargaining chip and reduce his chance of survival", the story says, quoting Rohde's colleague Michael Moss as saying "I knew from my jihad reporting that the captors would be very quick to get online and assess who he was and what he’d done, what his value to them might be".

Along with staff at other news agencies, NYT personnel contacted Jimmy Wales too, who passed the matter along to a small group of administrators who reverted mentions of the kidnapping and protected the article a number of times over the following months. Michael Moss also apparently edited the article to emphasise Rohde's Pulitzer Prize-winning work on the Srebrenica massacre, as well as his work on Guantanamo Bay, believing that if his captors read the article they might view him as more sympathetic towards Muslims.

Jimbo acknowledges in the NYT piece that the matter was made easier by the lack of reliable sources reporting the kidnapping - a consequence of the blackout - which meant that the biographies of living persons policy could operate to keep any references to the kidnapping out of the article. The policy, of course, was originally intended to keep fabricated material out of articles, but it worked equally well to assist the blackout in this case.

The ethics of the blackout have come into question following Rohde's escape. NPR reported Poynter Institute journalism ethics lecturer Kelly McBride as saying "I find it a little disturbing, because it makes me wonder what else 40 international news organizations have agreed not to tell the public". Dan Murphy at the Christian Science Monitor says that the question of whether the press has a double standard in keeping quiet about their own while regularly reporting on other kidnappings will likely become part of the debate. Greg Mitchell, the editor of industry journal Editor & Publisher, details that organisation's internal debates and ultimate decision to adhere to the blackout. Mitchell raises a potential competing public interest argument: that information about events such as kidnappings in a certain area could, in some cases, help protect the public (though the average NYT reader doesn't spend much time near Kabul; it might, however, help protect other journalists).

On the Wikipedia front, this is an interesting biographies of living persons case because every aspect of it involves journalists, who as a profession develop, apply and teach a whole suite of ethical principles governing their work, principles that many have suggested Wikipedia ought to adapt or learn from.

It's often true that hard cases make bad policy, and it is so here: the kidnapping was said to have been reported by an unnamed Afghan news agency, and apparently by the Italian agency Adnkronos too; the existence of reliable sources on the matter (which I cannot verify due to absent or broken links) throws into doubt the legitimacy of enforcing the blackout on Wikipedia.

This may well put a wedge between two similar but distinct camps of support for the biographies of living persons policy: those who believe that such articles should be written from a "do no harm" perspective, and those who have a similar sympathy but only go so far as supporting a strict, immediatist adherence to ordinary content policy (instead of the typical eventualist stance), and no further.

Sunday, 17 May 2009

New tools

A couple of new tools I've put together that people might find some use for:


  • Admin activity statistics: shows some statistics on how many admins have used their tools at all over various timeframes, and on how many actions are taken by each active admin over various timeframes. Works on any Wikimedia project.
  • Per-page contributions: like [[Special:Contributions]], but shows contributions just to a particular page. Works on any Wikimedia project. I've already found it quite useful in several arbitration cases, especially for users who have made a large number of edits, or for pages which have been edited many times.
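
The per-page contributions idea is simple enough to sketch against the MediaWiki API, where prop=revisions can be filtered to a single user. The following is an illustration only, not the tool's actual implementation; the page and user names are placeholders:

    # Sketch only: fetch one user's edits to one page via the MediaWiki API.
    import requests

    API = "https://en.wikipedia.org/w/api.php"

    def per_page_contributions(page, user, limit=50):
        params = {
            "action": "query",
            "prop": "revisions",
            "titles": page,
            "rvuser": user,          # restrict the page history to this user's revisions
            "rvlimit": limit,
            "rvprop": "ids|timestamp|comment",
            "format": "json",
        }
        data = requests.get(API, params=params).json()
        page_data = next(iter(data["query"]["pages"].values()))
        return page_data.get("revisions", [])

    # Placeholder page and user names, for illustration only.
    for rev in per_page_contributions("Example", "ExampleUser"):
        print(rev["revid"], rev["timestamp"], rev.get("comment", ""))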

The image below is one of the graphs produced by the admin activity tool; it shows how many admins have performed at least one administrative action over various timeframes on the English Wikipedia:

Monday, 20 April 2009

More bug statistics

Last November I put together some simple charts with the information from the weekly bug statistics that are automatically generated for the wikitech-l mailing list. There are now thirty-two weeks of data available, so here are some updated charts.

The distribution of resolution types seems to have stayed more or less the same over time, continuing the pattern seen in the original charts:


However, there are some changes in the other graph, which is based on information about the number of bugs each week. It shows the number of new, reopened, assigned and resolved bugs each week (using the scale on the left) and the total number of open bugs (in blue, using the scale on the right):


While there is still the same rough correlation between the number of new bugs and the number of bugs resolved each week, there is also a steady trend upwards in the total number of open bugs. Indeed, the total has risen nearly 20% since October last year.
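
For anyone who wants to draw this kind of two-axis chart themselves, a minimal sketch follows. The weekly figures in it are illustrative placeholders only, not the actual wikitech-l numbers:

    # Sketch of the chart described above: weekly counts on the left axis,
    # running total of open bugs on the right. Data below are placeholders.
    import matplotlib.pyplot as plt

    weeks = list(range(1, 9))
    new_bugs      = [120, 135, 110, 140, 150, 130, 145, 155]   # placeholder values
    resolved_bugs = [100, 120, 115, 125, 130, 120, 135, 140]   # placeholder values

    open_bugs, total = [], 3000          # assumed starting backlog, for illustration
    for n, r in zip(new_bugs, resolved_bugs):
        total += n - r
        open_bugs.append(total)

    fig, left = plt.subplots()
    left.plot(weeks, new_bugs, label="new")
    left.plot(weeks, resolved_bugs, label="resolved")
    left.set_xlabel("week")
    left.set_ylabel("bugs per week")
    left.legend(loc="upper left")

    right = left.twinx()                 # second y-axis for the running total
    right.plot(weeks, open_bugs, color="blue", label="total open")
    right.set_ylabel("total open bugs")

    plt.title("Weekly bug activity (illustrative data)")
    plt.show()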

So what are the consequences of so many bugs being opened but not dealt with? The following chart, generated by Bugzilla directly, shows the distribution of the "severity" parameter of all currently open bugs:


It shows that three-fifths of open bugs have severity given as "enhancement", essentially meaning that they're feature requests, entered into Bugzilla for tracking purposes, rather than being true bugs. A further 13% are marked "trivial" or "minor", and nearly a quarter "normal"; only 3% are "major".

So while the number of unresolved bugs is steadily rising, most of these are either feature requests or only minor bugs. Still, the backlog is steadily getting worse - a reminder that new volunteer developers are constantly needed to get involved in improving MediaWiki.