Wikipedia:Wikipedia Signpost/2009-01-31/Orphans
Large portion of articles are orphans
Almost 30% of Wikipedia articles are "orphans", with few or no incoming links from other articles, according to WikiProject Orphanage. Based on an analysis by JaGa from January 24, 2009, that figure includes 133,515 articles with zero links from other articles and another 92,031 linked only from lists or chronology pages. A total of 533,411 articles have links from only one or two articles (excluding lists and chronology pages); these are also classified as orphans by WikiProject Orphanage. Only 42,936 articles have been tagged with the {{orphan}} template. By JaGa's count there are 2,575,308 articles when disambiguation pages are excluded (compared to 2,700,000+ counted by Special:Statistics).
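The "almost 30%" figure can be roughly reproduced from the counts quoted above. The short sketch below assumes the three groups (zero links, list-only links, one or two links) do not overlap and can simply be added; the article does not state this explicitly, so treat the result as a sanity check rather than JaGa's own calculation.

```python
# Counts as reported in the article (JaGa's analysis, January 24, 2009).
zero_links = 133_515        # no links from other articles
list_only = 92_031          # linked only from lists or chronology pages
one_or_two = 533_411        # linked from only one or two articles
total_articles = 2_575_308  # articles, excluding disambiguation pages

# Assumption (not stated in the article): the three groups are disjoint.
orphans = zero_links + list_only + one_or_two
share = orphans / total_articles

print(f"{orphans:,} orphans, {share:.1%} of articles")
# prints: 758,957 orphans, 29.5% of articles
```

Under that assumption the total comes to about 29.5% of articles, which matches the headline figure.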
The distribution of links per article is a characteristic long-tail distribution that approximately demonstrates the Pareto principle: articles with 50 or more links make up 20% of all articles but account for 84% of all links. JaGa's list of the top 5000 articles by link count shows that many of the very top articles are ones commonly linked from templates, such as biography, geographic coordinate system, list of sovereign states, and music genre. Major nations are also among the most-linked articles; United States holds the top spot, with 16% of all articles linking to it.
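For readers who want to check a Pareto-style claim against their own link data, here is a minimal sketch. The function name `top_share` and the toy list of link counts are hypothetical and are not taken from JaGa's data; only the threshold of 50 links comes from the article.

```python
def top_share(link_counts, threshold):
    """Return (fraction of articles with >= threshold incoming links,
    fraction of all incoming links that those articles receive)."""
    total_links = sum(link_counts)
    if not link_counts or total_links == 0:
        return 0.0, 0.0
    heavy = [c for c in link_counts if c >= threshold]
    return len(heavy) / len(link_counts), sum(heavy) / total_links


# Hypothetical toy data: incoming-link counts for ten articles.
counts = [0, 0, 1, 2, 3, 5, 8, 12, 60, 400]

article_frac, link_frac = top_share(counts, threshold=50)
print(f"{article_frac:.0%} of articles hold {link_frac:.0%} of links")
```

In this toy data, two of the ten articles meet the threshold and together receive 460 of the 491 links, the same kind of skew the article describes.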
The long-tail distribution of links is consistent with a 2008 academic study of the network structure of Wikipedia, which showed that, like networks of scientific publications, Wikipedia linkage demonstrates preferential attachment and appears to be a scale-free network (see earlier story). That study focused on red links and the creation of new articles, and follow-up work showed a troubling trend that may also help explain the large magnitude of the orphan problem revealed by JaGa's data. Computer scientist Diomidis Spinellis showed that while Wikipedia was growing exponentially from 2003 to 2006, there was a stable average rate of 1.8 links to "incomplete" articles (red links and stubs) per non-stub article, but that rate had declined to 1.4 by early 2008. This indicates that linkage patterns became more "top-heavy" and articles were relatively less likely to point to undeveloped articles. Orphaned articles tend to be stubs, and because they have few related articles linking to them, they are likely to remain underdeveloped for longer than well-linked stubs.
Partly to blame may be a pernicious trend noted by User:Raul654, James F. and others: contrary to the red links guideline, red links are frequently being removed for aesthetic reasons. The 2008 linkage study showed that new articles tend to be created soon after the first link pointing to them. Red links thus drive growth and allow new articles to avoid orphan status right from the start.
Discuss this story
Hello. I just read your article on Orphan articles, and wasn't sure whether there was a dedicated place to comment - so I came here in the meantime. Mainly just to thank you for noting that there seems to be a massive anti-red link campaign in many quarters: I've noticed it myself, even to the point of people editing out links I'd deliberately left for bizarrely-overlooked important articles-to-be-written (there are still some glaring omissions on many of the topics I'm interested in). I suspect the trend is symbiotically linked, however, to practices of over-linking. e.g. the contrary tendency to link every other word in an article whether or not it has any bearing on the subject to hand. (In fact, frequently this seems to become "particularly if not".) There have been specific drives to remove the linking of dates (with some validity - they all-too-often become trivia magnets of dubious relevance in many cases), and related 'unnecessary' links - which has further leaked over into removing very-necessary links because they look similar to those elsewhere deemed unnecessary. I think that the removal of red links, or a drive to stem their creation, can be seen to be hand-in-hand with those types of push. Sometimes. Similarly, on the same/other hand, mass-creation of red links is another common "problem" - it can either (some say) cast doubt on the notability of a subject by stating/implying that there are no obvious references to it anywhere here, or else suggest that the editor is over-zealous in their own interpretation of what might be eventually considered sufficiently notable (i.e. assuming that every "best boy" and "grip" in a film's cast & crew list will ultimately warrant their own separate page). After which slight rambling, all I really wanted to say was "Thank You" for trying to reassert the significant benefits and usefulness of red links, and for highlighting why they are important, necessary and worthwhile. ntnon (talk) 00:32, 2 February 2009 (UTC)
Why "portion" rather than "proportion"?
Would it be possible to have a "random orphan article" link in the navigation column? Jackiespeel (talk) 15:10, 2 February 2009 (UTC)
Relation between number of links and number of accesses?