Wikipedia:Wikipedia Signpost/2012-05-21/Technology report

Technology report

On the indestructibility of Wikimedia content

WMF wiki content now almost indestructible


The content of Wikimedia wikis has recently moved significantly closer towards indestructibility, it was announced this week by WMF developer and data dumps specialist Ariel Glenn.

Masaryk University, in the Czech Republic, is one institution now mirroring Wikimedia dumps.

Specifically, data from all Wikimedia wikis is now being successfully replicated to three non-WMF sites around the globe: C3L in Brazil, Masaryk University in the Czech Republic and the servers of Your.org in the United States. Each site holds ("mirrors") at least five monthly snapshots ("dumps") of the publicly available wikitext-based content of all of the many hundreds of Wikimedia wikis. Your.org also hosts a copy of all previous dumps and will hold a single snapshot of all publicly viewable media. Moreover, Glenn reports, "getting the bugs out of the mirroring setup [has made it] easier to add new locations" and to provide the latest snapshots to already established mirrors. As reported at the time, the first dump mirror came online in October last year, but this is the first time so many have been available concurrently.

Increasing the number of mirrors, which is made possible by the free licensing of Wikimedia wikis, helps to ensure that content is sufficiently accessible and geographically diverse to survive natural and artificial disasters. While multiple websites do host live copies of the English and other major Wikipedias, smaller wikis do not enjoy such protection, so dump mirroring is particularly useful for safeguarding their content. The same was once true of the English Wikipedia, whose 2001 articles were long thought to be lost until old backups were uncovered in December 2010. Theoretically, dump mirrors could also offer better download speeds at times of peak usage, but that is unlikely to be a primary use case for Wikimedia wikis.

Of course, not everyone is so concerned about the possibility that Wikimedia's content might be destroyed in the immediate future, dump mirrors or no dump mirrors. As WMF Lead Platform Architect Tim Starling commented in a 2011 discussion of forking Wikipedia, "the chance of [WMF financial collapse] appears to be vanishingly small, and shrinking as the Foundation gets larger. If there was some financial problem, then we would have plenty of warning and plenty of time to plan an exit strategy. The technical risks (meteorite strike etc.) are also receding as we grow larger". That discussion focussed rather less on the technical aspects of making Wikimedia content indestructible, and more on allowing separate communities to emerge if Wikimedia communities broke up.

In brief

Signpost poll
Bugzilla
You can now give your opinion on next week's poll: Which of the following do you consider the greatest threat to Wikipedia?

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.

  • 1.20wmf3 starts being deployed: The third "mini-deployment" of the MediaWiki 1.20 series has now been launched on a host of wikis including Wikimedia Commons. The relatively low-profile deployment, equivalent to two weeks' worth of development, brings with it a combined total of nearly 150 bug fixes and new features, including the option of duplicating metadata such as author and source across a batch upload undertaken via Commons' UploadWizard, and an "associated namespace" checkbox to make filtering a watchlist easier. The "title" of a diff page (the text that displays as a tab name in most modern browsers) will now include the word "diff" to make identification easier. Most deployments went smoothly, although the Wikibooks sites had to be temporarily reverted due to a single major bug with PDF generation. At the time of writing, 1.20wmf3 has just been deployed to the English Wikipedia; barring the discovery of major bugs, it will be deployed to all remaining wikis (non-English Wikipedias) later in the week.
  • Wikipedia ads?!?: An uptick in the prevalence of malware injecting adverts into Wikipedia was noted this week on the Wikimedia blog. One example, called "I want this", poses as a legitimate browser add-on before proceeding to embed genuine adverts around a Wikipedia article, making it appear as though Wikipedia itself were promoting the product or service described. Similar behaviour has been noted on the part of internet service providers in countries such as China, where intercepting web-page requests is legal. "Rest assured", wrote Philippe Beaudette, WMF Director of Community Advocacy, and Erik Moeller, Vice President of Engineering and Product Development, the authors of the blog post, "You won’t be seeing legitimate advertisements on Wikipedia. We're here to distribute the sum of human knowledge to everyone on the planet—ad-free, forever."
  • WMF announces new hires: The past fortnight has included a number of new WMF engineering-related hires. Subramanya Sastry, a developer and dedicated nonprofit and volunteer worker in his spare time, joins as a senior features software engineer, to work initially on the Visual Editor project. British Wikimedian, long-time editor and indeed founder member of the English Wikipedia's Arbitration Committee James Forrester joins the same team as a technical product analyst; Vibha Bamba joins the Editor Engagement Experiments (E3) team as an interaction designer, having held a similar role at Yahoo; and the E3 team will be benefiting from the work of self-taught programmer Ori Livneh for the foreseeable future.
  • Universal Language Selector featured on Wikimedia blog: As mentioned in the Signpost a fortnight ago, developers from the Localisation Team have now begun work on a Universal Language Selector, which featured on the Wikimedia blog this week. Ideas for the tool, which enables easier switching between different interface languages, first came to prominence back in October 2010. However, the news was not so good for progress on the NewPagesFeed extension, which had to be temporarily turned off this week after a bug caused it to show content not intended for display to random users browsing with Internet Explorer (bug #36968).
  • One bot approved: 1 BRFA was recently approved for use on the English Wikipedia:
  1. JYBot, modifying, adding and removing interwiki links.
  At the time of writing, 16 BRFAs are active. As usual, community input is encouraged.