Sunday, 17 May 2009

New tools

A couple of new tools I've put together that people might find some use for:


  • Admin activity statistics: shows some statistics on how many admins have used their tools at all over various timeframes, and on how many actions are taken by each active admin over various timeframes. Works on any Wikimedia project.
  • Per-page contributions: like [[Special:Contributions]], but shows contributions just to a particular page. Works on any Wikimedia project. I've already found it quite useful in several arbitration cases, especially for users who have made a large number of edits, or for pages which have been edited many times.

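The per-page contributions idea can also be approximated directly against the MediaWiki API, which supports filtering a page's revision history by user. Below is a minimal sketch that just builds the query URL (the `rvuser` parameter exists in the API, though exact behaviour can vary between MediaWiki versions, and the tool itself may well work differently):

```python
# Sketch: approximate the per-page contributions tool by asking the
# MediaWiki API for one user's revisions of a single page.
# This only constructs the query URL; fetching it is left to the caller.
from urllib.parse import urlencode

def per_page_contribs_url(wiki, title, user, limit=50):
    """Build a MediaWiki API query URL for one user's edits to one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvuser": user,        # restrict revisions to this user only
        "rvlimit": limit,
        "rvprop": "timestamp|comment|ids",
        "format": "json",
    }
    return f"https://{wiki}/w/api.php?" + urlencode(params)

url = per_page_contribs_url("en.wikipedia.org", "Main Page", "Example")
```

Because this works through the standard API, the same sketch applies to any Wikimedia project by swapping the hostname.
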
The image below is one of the graphs produced by the admin activity tool; it shows how many admins have performed at least one administrative action over various timeframes on the English Wikipedia:

Monday, 20 April 2009

More bug statistics

Last November I put together some simple charts with the information from the weekly bug statistics that are automatically generated for the wikitech-l mailing list. There are now thirty-two weeks of data available, so here are some updated charts.

The distribution of resolution types seems to have stayed more or less the same over time, continuing the pattern seen in the original charts:


However, there are some changes in the other graph, which is based on information about the number of bugs each week. It shows the number of new, reopened, assigned and resolved bugs each week (using the scale on the left) and the total number of open bugs (in blue, using the scale on the right):


While there is still the same rough correlation between the number of new bugs and the number of bugs resolved each week, there is also a steady trend upwards in the total number of open bugs. Indeed, the total has risen nearly 20% since October last year.
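
The rising backlog follows directly from simple bookkeeping: each week the open total grows by the new and reopened bugs and shrinks by the resolutions, so even a small weekly shortfall compounds. A minimal sketch of that arithmetic, with invented weekly figures rather than the actual wikitech-l numbers:

```python
# Sketch of the open-bug arithmetic behind the chart:
#   open_next = open + new + reopened - resolved
# The weekly figures below are invented for illustration only.
def run_backlog(start_open, weeks):
    """weeks: list of (new, reopened, resolved) tuples; returns the open total after each week."""
    totals = []
    open_bugs = start_open
    for new, reopened, resolved in weeks:
        open_bugs += new + reopened - resolved
        totals.append(open_bugs)
    return totals

# Resolutions roughly track new reports, but fall slightly short each
# week, so the total creeps steadily upwards.
totals = run_backlog(3000, [(120, 10, 110), (130, 8, 120), (125, 12, 115)])
# totals -> [3020, 3038, 3060]
```
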

So what are the consequences of so many bugs being opened but not dealt with? The following chart, generated by Bugzilla directly, shows the distribution of the "severity" parameter of all currently open bugs:


It shows that three-fifths of open bugs have severity given as "enhancement", essentially meaning that they're feature requests, entered into Bugzilla for tracking purposes, rather than being true bugs. A further 13% are marked "trivial" or "minor", and nearly a quarter "normal"; only 3% are "major".

So while the number of unresolved bugs is steadily rising, most of these are either feature requests or only minor bugs. Still, the backlog is fairly steadily getting worse, a reminder that it's constantly necessary for new volunteer developers to become involved with improving MediaWiki.

Thursday, 19 March 2009

Parts of Wikipedia blacklisted in Australia

The Australian Communications and Media Authority (ACMA) has added whistleblower website Wikileaks to its secret website blacklist. This comes after Wikileaks published a recent version of the blacklist, which includes Wikipedia pages, in addition to various religious websites and the site of a Queensland dentist.

In February an anti-censorship activist submitted a Wikileaks page (containing a copy of Denmark's secret blacklist!) to ACMA's online complaints facility, as a test of ACMA's guidelines. ACMA blacklisted that page, satisfied that it was "prohibited content" or "potential prohibited content" under the relevant legislation. However Wikileaks then published details of the report, including the correspondence, and then published a leaked copy of the ACMA blacklist from last August. Following this, ACMA blacklisted the entire Wikileaks site.

As of the time of writing, it does not seem possible to access Wikileaks from Australia, so I do not know what is on the leaked blacklist. But media reports indicate that, in addition to the intended targets of child porn sites, there is a substantial minority of other sites blacklisted, including some Wikipedia pages, YouTube videos, and online gambling sites, as well as a few bizarre examples in a tuckshop management company and an animal carer group.

The responsible minister, Senator Stephen Conroy, has denied that Wikileaks' list is the real thing, and one of the ISPs involved in the mandatory internet filtering trial has backed that up, saying that it is not the same as the list supplied to them recently.

Yet whether Wikileaks' list is accurate or not, the attention now being paid to the practices of ACMA in relation to the blacklist has at least exposed the risk to educational sites like Wikipedia posed by similar censorship systems. The ACMA blacklisting scheme is designed to dovetail with Australia's existing content classification system (for films, television etc) by defining "prohibited content" to mean content classified as RC (refused classification) or X 18+ by the Classification Board (and also R 18+ content to which unrestricted access is allowed, and under certain circumstances, M 15+ content).

This system has been criticised in a number of ways, not least because Internet content is subject to the film and television classification rules, rather than the rules for publications (with the result that, for example, a printed newspaper and a newspaper website showing the same material will be treated differently, depending on which version is classified first). Nevertheless, the Classification Board has extensive experience in content classification, and, as it is a singular organisation whose decisions are subject to review, is at least broadly consistent in its application of the guidelines.

The blacklisting scheme goes further, however, and allows ACMA to blacklist not only content which has actually been classified, but also "potential prohibited content", that is, unclassified content which it believes would ultimately be prohibited if it were classified. In practice, this means that ACMA bureaucrats - whose decisions are not subject to the same process of review, and are not even guaranteed to be made in the same way and applying the same process as the Classification Board - can blacklist sites if they think there is a "substantial likelihood" that the content would be prohibited.

Under the National Classification Code (PDF), classification not only depends on what the content depicts, but on the manner in which it is depicted. Relevantly for Wikipedia, educational materials covering subject matter like sexuality will likely be treated differently than other genres of material depicting the same subjects. With this parallel ACMA scheme, there is no guarantee of consistency, no guarantee the code will be correctly applied and no prospect of review. Thus, the public's access to legitimate educational content, such as Wikipedia articles, is subject to the whims of ACMA bureaucrats.

A related problem is that the ACMA blacklist is the basis of the aforementioned proposed mandatory internet filtering scheme in Australia, which aims to filter the Internet at the ISP level. Depending on the way such a scheme (if it is actually instituted, which seems unlikely at this time) is actually implemented by ISPs, we may end up with a situation in which access to Wikipedia is widely blocked, as happened recently in the UK.

Sunday, 8 March 2009

Maryland court rejects identification subpoena

Zebulon Brodie, a franchisee for Dunkin' Donuts, sued Independent Newspapers (operator of the Newszap.com classifieds and forums website) and three pseudonymous members of the site for defamation and conspiracy to defame, after the three participated in a forum thread in which the cleanliness of the store was critiqued.

The liability of Independent Newspapers (IN) was fairly easily resolved: the trial judge found that the company, as the provider of an "interactive computer service", could not be treated as the publisher or speaker of the forum postings due to s 230 of the Communications Decency Act, and as such could not be liable in defamation for the postings' contents. This provision has protected a range of service providers from liability for defamation and similar actions, including the Wikimedia Foundation itself.

However, the liability of the three pseudonymous users is a different story, and it was this issue that has been contentious in the case. The Newszap website required users to register before using the forums, and Brodie sought, by way of a subpoena, to compel IN to identify a total of five pseudonymous users who had participated in the forum thread. In turn, IN sought motions to quash the subpoena, and for a protective order to be issued; however, the trial judge rejected those motions, and ordered IN to identify the users.

The Maryland Court of Appeals overturned that order in a decision published this week (PDF). The basis for the decision was that three of the users did not make any comments that were actionable in defamation, and the other two, though they did make arguably actionable remarks, were not actually named as defendants in Brodie's original complaint (and by the time the case had proceeded to that point, any action against the two was barred by limitations provisions).

Though the case was thus resolved on an essentially procedural point, the Court of Appeals nevertheless went on to discuss the underlying question of when anonymous or pseudonymous users in such situations should be identified, and offered some guidance to lower courts.

All seven judges agreed on four steps that should be undertaken by courts considering defamation actions involving anonymous or pseudonymous defendants, where disclosure is sought:


  1. require the plaintiff to make efforts to notify those defendants of any subpoena or application to disclose their identity - in the context of Internet forums, by posting a message there;
  2. allow those defendants reasonable opportunity to oppose the application;
  3. require the plaintiff to clearly identify the speech said to be actionable in defamation; and
  4. determine whether the plaintiff has advanced a prima facie case against those defendants.

However, four judges comprising the majority went further, and added a fifth step that courts should undertake: if all the other requirements were satisfied, the court should weigh the strength of the prima facie case against the anonymous or pseudonymous defendants' First Amendment rights.

First Amendment jurisprudence concerning free speech has tended to recognise that an author's decision whether or not to disclose their identity may be protected as much as the content of their speech itself. In practice, this has translated into, for example, the Supreme Court of the United States striking down a local ordinance that required anyone soliciting door-to-door (in that case, Jehovah's Witnesses) to identify themselves and obtain a permit before doing so. While anonymity, like any other aspect of the right to free speech, does not protect speech which is defamatory, the majority were keen to point out that anonymous or pseudonymous posters have a right "not to be subject to frivolous suits for defamation brought solely to unmask their identity." In their view, the additional balancing test, beyond the prima facie requirement, was necessary to give adequate protection to this right. A lower standard of protection, in their view, "would inhibit the use of the Internet as a marketplace of ideas, where boundaries for participation in public discourse melt away."

The three judges who dissented as to the need for the balancing test were of the view that the prima facie requirement provided sufficient protection of First Amendment rights, given that they are already taken account of in the ordinary law of defamation. Judge Adkins, writing for the minority, cautioned that "the majority decision invites the lower courts to apply, on an ad hoc basis, a 'superlaw' of Internet defamation that can trump the well established defamation law."

The case is an interesting example of the way in which computer services providers who are protected by section 230 nevertheless have a significant role to play in legal processes that reach past them to target users of their services. The court also placed emphasis on ensuring that anonymous or pseudonymous users have an opportunity to participate in legal processes before their identity is disclosed. As a consequence, providers are not merely passive targets for subpoenas, nor must they be zealous defenders of all users of their services; rather, they have an important mediative role.

Thursday, 22 January 2009

Small screen$

Correction

Angela subsequently provided a correction for this post; the error was in the cited publication:
"The new platform has nothing to do with Nokia as far as I know. I've sent a correction to the author of that article. The Foundation already has a branding deal with Nokia but that's not related to this."

Angela Beesley (former member of the Wikimedia Foundation Board of Trustees, and current chair of the Advisory Board), has indicated during a talk at Linux.conf.au that the Foundation will be announcing a new mobile platform for Wikipedia later in the year. According to ZDNet Australia, the platform is currently under development and will be licensed to Nokia.

There are already a number of iPhone apps for reading Wikipedia, whether online or offline, including plenty of good free apps. In addition to dedicated apps, there is a solid specialised Wikipedia mobile interface, and there have been efforts to make the regular web interface of Wikimedia wikis more practicable in mobile browsers.

It's good news, however, to hear that a dedicated Wikipedia interface is on its way for one of the closed mobile platforms. What's also interesting is that the platform will be licensed to Nokia, which makes it sound as if there's some commercial arrangement involved, particularly given that Angela's talk also touched on the recent success of the fundraiser, and the possibility of alternative sources of funds, such as selling physical versions of content like books and posters.

Licensing a branded mobile platform strikes me as an interesting potential revenue stream. It reminds me of Mozilla's arrangement with Google: something that benefits users, and also allows money to be made without compromising on principles (unlike, say, introducing ads).

Tuesday, 25 November 2008

The human touch

Google has recently released SearchWiki, a set of tools for annotating Google search results. It's a rather dramatic change to the main search page, accessible by anyone logged into a Google account.

The interesting thing is that it seems like a personal version of Wikia Search (personal in the sense that only your alterations change the order of results, although you can see comments from everyone), though earlier comparisons were made more to link sharing sites like Digg.

So, might the emergence of this tool mean that Google, unlike so many others, is not underestimating the potential of Wikia Search?

Tuesday, 18 November 2008

Bug statistics

Since the beginning of September, the bug tracker for MediaWiki has been sending weekly updates to the Wikitech-l mailing list, with stats on how many bugs were opened and resolved, the type of resolution, and the top five resolvers for that week. With eleven weeks of data so far, some observations can be made.

The following graph shows the number of new, resolved, reopened and assigned bugs per week (dates given are the starting date for the week). The total number of bugs open that week is shown in blue, and uses the scale to the right of the graph:


The total number of open bugs has been trending upwards, but only marginally, over the past couple of months. It will be interesting to see, with further weekly data, where this trend goes.

It also seems that the number of bugs resolved in any given week tends to go up and down in tandem with the number of new bugs reported in that week. Although there is no data currently available on how quickly bugs are resolved, I would speculate that most of the "urgent" bugs are resolved within the week that they are reported, which would explain the correlation.
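
That speculation could be tested, at least roughly, by correlating the weekly new and resolved counts. Here is a minimal sketch using a hand-rolled Pearson correlation (standard library only); the weekly counts are invented for illustration, not the actual wikitech-l figures:

```python
# Sketch: how closely do weekly resolved counts track weekly new bug
# reports? Pearson's r near 1 indicates the two series move in tandem.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

new_bugs = [90, 140, 110, 180, 100, 130]   # hypothetical weekly new reports
resolved = [80, 130, 105, 170, 95, 120]    # hypothetical weekly resolutions

r = pearson(new_bugs, resolved)
# An r close to 1 would be consistent with most bugs being resolved in
# the same week they are reported, though it would not prove it: the
# weekly totals alone say nothing about which bugs were resolved.
```
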

Note also the spike in activity in the week beginning 6th October; this was probably the result of the first Bug Monday.

The second graph shows the breakdown of types of bug resolutions:


The distribution seems fairly similar week on week, with most resolutions being fixes. It's interesting to note that regularly around 25% to 35% of bug reports are problematic in some way, whether duplicates or bugs that cannot be reproduced by testers.

The weekly reports are just a taste of the information available about current bugs; see the reports and charts page for much more statistic-y goodness. And kudos to the developers who steadily work away each week to handle bugs!