User:Aquaticonions/Beyond the Neutral Point of View

From Wikipedia, the free encyclopedia


Beyond the Neutral Point of View: Register, Ideology, and Community Among English-Language Wikipedia Editors

by Aquaticonions

A thesis submitted to the Undergraduate Programs in Anthropology and Linguistics in partial fulfillment of a Bachelor of Arts degree at [REDACTED]

Introduction

“[I]f you want to really navigate the truth via Wikipedia, you have to dig into those ‘history’ and ‘discuss’ pages hanging off of every entry. That's where the real action is, the tidily organized palimpsest of the flamewar that lurks beneath any definition of ‘truth’” (Doctorow 2006).

For a new user, learning to edit Wikipedia is presented as primarily a matter of learning a series of rules, techniques, and perhaps values. The knowledge to be mastered is laid out in a collection of “meta” pages, which take the form of impersonal lists of policies, or sometimes hands-on tutorials. Perusing these pages, one gets the impression that successfully editing Wikipedia is chiefly a matter of finding the right guideline or principle to be applied in any particular instance, and following it correctly. However, upon closer examination, it is not clear that editing is (or ought to be) typically done in this rote manner, nor that users are actually inculcated into editor-dom in this way. Among the many guidelines new users are expected to read are a few which seem to run counter to Wikipedia’s apparent emphasis on rules. The guideline page entitled “Be bold” tells new users to err on the side of making more edits, even if some of them inevitably turn out to violate other policies. Most strikingly, a page describing one of the five “pillars” or “fundamental principles” of editing, entitled “Ignore all rules,” contains only the following sentence: “[i]f a rule prevents you from improving or maintaining Wikipedia, ignore it.” As one delves further into Wikipedia, it becomes increasingly clear that even these most fundamental policies and tenets are sites of contestation more often than of consensus on the website. Yet, it often seems that this very contestation brings people and ideas together, perhaps even more than it drives them apart, forming what Jemielniak (2014) describes as a “community of dissensus.” Still, the work that Wikipedia editors undertake is deeply interactional, ideological, and (meta)semiotic, in a way that their own conceptions of that work do not always fully capture.

So, amidst these contradictions, what tasks are actually involved in editing Wikipedia, and how do users come to know how to do them “correctly,” especially when credentialed authorities are not present (as is typically the case)? What happens when users fail, or disagree about what is correct? What part do the many meta-, inter-, hyper-, and paratextual components of Wikipedia, such as talk pages and guidelines, play in this process? How does the notion of “consensus” and the rules that define it shape individuals’ perception and participation? How is the often-discussed “Wikipedia community” understood, produced, and bounded? Lastly, what can be said about the actual contents of Wikipedia’s articles in light of these processes?

This thesis asserts that the community and knowledge aspects of Wikipedia are inseparably intertwined, because their reproduction is mediated through the very same semiotic processes. Through these reproductive processes, certain ways of writing, thinking, seeing, and interacting are continually conventionalized, while others are devalued or erased. This leads users to be recruited to (or rejected from) in-group-ness, and their contributions to be ratified (or not), through discursive enactments of ideology.

In this thesis, I study Wikipedia primarily through discourse analysis of text artifacts from across the website. Through this analysis, I hope to take up objects of study such as ideology, linguistic register, and community which, by virtue of being emergent phenomena, are impossible to look at directly. An understanding of these factors and their interaction helps me to generalize about the nature of participation on Wikipedia, as well as the stakes that this social system has for the content of the website’s often-accessed “content pages,” also known as “articles.” My goal is not to directly evaluate the accuracy or fairness of these articles’ contents, but rather to understand the “backstage” discursive processes which underlie them.

My data center around three case studies. Each of these begins with a single “site,” a discussion thread on a particular talk page. Each of these sites involves some sort of infelicity, disagreement, or “breach,” since these types of events and others’ reactions to them are well established as moments when otherwise covert attitudes or ideologies come to the fore and can be studied (Garfinkel 1967). My analysis in each case then extends far beyond the initial thread, to text artifacts elsewhere on the talk page, other pages on Wikipedia which are referenced or hyperlinked in the thread, or even sites elsewhere on the Internet or in the “real world” which are discursively connected to it in some way. My intention is that each of these case studies brings to light a different aspect of Wikipedia’s social world. In drawing these connections and analyzing their significance, I model my approach on Gal & Irvine’s (2019) notions of a semiotic “centerpiece” as a site of ideological work, and thus as a starting point for ethnographic analysis.

My first case study treats a journalist’s reporting on happenings within the Wikipedia community, and its uptake among the community itself. My analysis of this case highlights the role of register and vision, arguing that users learn how to interact on (and conceive of) Wikipedia in a particular and limited way, as a result of explicit and implicit corrections by more experienced editors. In my second case study, I examine a short exchange on the talk page for “Genetic Engineering.” Here, I argue that editing Wikipedia is predicated on ideological “black boxes” which underlie both the aforementioned limited vision of editors, and the apparent incommensurability of their perspectives with those of some outsiders. My final case study starts with a discussion on the talk page associated with the article about the Cherokee Nation. In my discussion of this site and the network of texts surrounding it, I show how register competence (and ideologies about such competence) are key factors in the ratification of people and ideas on Wikipedia, and that the resulting inclusion and exclusion often falls along lines of identity, leading to what is known among Wikipedians as “systemic bias.” Together, these cases show how Wikipedia editors resolve the website’s central contradictions through discursive and ideological work, and the unintended consequences that this work can have.

Background

What is Wikipedia?

Among Wikipedia editors, “Wikipedia” can refer to the internet domain named “wikipedia.org,” to the body of text contained therein, to the group of people (mostly volunteers) who maintain and edit that body of work, and occasionally to the group of (mostly paid) bureaucrats within the Wikimedia Foundation who have a certain degree of managerial control over the website. Furthermore, “Wikipedia” used in the singular can also refer to the subset of Wikipedia (considered as a website, body of work, or community) in a particular language. One could speak of “the Spanish Wikipedia,” “the Amharic Wikipedia,” or even “Wikipedias” in the plural if one wishes to discuss more than one. Wikipedias in different languages are fairly autonomous, but tend to share at least basic organizational principles, due to both the shared oversight of the Wikimedia Foundation and the significant crossover between them due to the high proportion of multilingual editors (Jemielniak 2014).[1] This study only examines data from the English Wikipedia. When I reference “Wikipedia,” it can be understood to mean “the English Wikipedia,” though I believe that the majority of the claims I make hold for most other Wikipedias as well.

Despite this apparent ambiguity, users (and non-users) commonly reference “Wikipedia” without clarifying exactly which of these meanings they intend. Wikipedia is frequently understood to be all of these things simultaneously. Rather than a website, a collection of interconnected texts, and a community of people that all share the same name, Wikipedia is often “natively” understood as an assemblage of all of these things. In my analysis, it would be wrong to start from the assumption that, say, Wikipedia as a community and as a collection of texts are two wholly separate things, when Wikipedians themselves do not always make such a distinction. Of course, it would also be wrong to reject the distinction out of hand. I am mostly interested in Wikipedia as a group of people (which, I claim below, constitutes a community) and as a collection of written language, and my unmodified usage of the term reflects that. If I intend a distinct or more specific meaning that is not obvious from context, I say so explicitly.

Wikipedia (the website) has a complex internal hierarchy of pages, which are themselves categorized into various named types. Each individual page is labeled as one type or another, either in a banner at the top of the page, or in the title itself. Content, guideline or policy, user, and talk pages are the most common types, and make up the bulk of my data. Content pages are the informational articles, the only kind of pages that non-editors usually read on the website. These are also known as “(Wikipedia) articles,” a term I also use to refer to them. Guidelines and policies delineate rules about style, organizational protocol, or behavior for editors.[2] User pages contain the profiles, contributions, and achievements of individual Wikipedians. They mostly consist of information curated by the Wikipedian in question, but may also include notes from other users. If content pages are a Goffmanian “frontstage,” intended to be consumed by non-editors and editors alike, all these other page types can be likened to a “backstage” world, intended for editors only (although anyone can access them) (Goffman 1956). All of these page types are housed on Wikipedia (the website), under the domain name en.wikipedia.org.

Associated with every page of one of the other page types listed above is what is called a “talk page.” Such talk pages, which occupy a place of special importance in Wikipedia’s aforementioned “backstage,” are the main focus of this thesis. A talk page can be reached by clicking a tab labeled “talk,” towards the top of a given page. Talk pages are spaces explicitly dedicated to meta-level discussion of the topic, contents, and upkeep of the content, policy, user, or other page they are associated with. For instance, the article entitled Cheese has an associated talk page, entitled Talk:Cheese, where a user could go to propose that a new section be added to the Cheese article, or to ask a question about whether a certain edit would be appropriate. Talk pages are the most common sites for consensus discussion on Wikipedia, although the guidelines around consensus also apply to deliberations that occur on other message boards on the website. As I demonstrate below, users understand each of these page types (content, talk, guideline, etc.) to be characterized by at least one unique linguistic register. However, I also hope to show that the reality of register usage on Wikipedia is somewhat more complex.

Hypertext on Wikipedia

Much of the linguistic content of Wikipedia is inter- and meta-textual. If intertextuality is a phenomenon in which texts (pieces of semiosis which are understood to be discrete in some way) reference, allude, or orient to one another (Bauman 2004:4), we can term meta-textuality a special case of intertextuality in which certain texts (in this case, pages), or even entire classes of texts, are understood or prescribed to be “about” other individual texts, or groups thereof. This metatextuality is metapragmatic (Silverstein 1976) when that “aboutness” concerns the pragmatics of another text, and it is metasemantic when it concerns that text’s semantics. For example, as discussed above, a given talk page is metatextually linked to a certain other page by default, taking as its object the content, most often the semantics or pragmatics, of another page. Similarly, a particular guideline page might metapragmatically detail how to write articles which are biographies of living people, or, as I detail below, how to participate in a talk page. Furthermore, there are many other connections between pages on Wikipedia which are more particular, and which are not necessarily relations of “aboutness.” This more generic inter- or para-textuality—in the sense of pointing to or indexing texts “around” it without being particularly “meta”—on Wikipedia manifests in various ways, and is usually mediated through hypertext.

Hypertext is a technology of central importance to Wikipedia, as well as to the internet’s functioning in general. It allows for multiple digital documents, websites, or pages within a single website to connect to one another through the use of hyperlinks. A hyperlink is a stretch of text within a digital text artifact, which a user can select with a mouse click (or screen tap) in order to access another “linked” artifact. For example, this is a hyperlink that leads to the anarchist news website CrimethInc. By default, hypertext appears in a distinct color (visible in this paper as well as on Wikipedia) and sometimes with an underline (although Wikipedia largely eschews this, as do I). By convention (across the internet, not just on Wikipedia), the text within or immediately surrounding a hyperlink usually contains some description of where the hyperlink leads, but it need not. Semiotically, hyperlinks can be thought of as maximally mediated referential indexicals, which signify their object (the target page) through a “black box” of code and communication between servers. Not unlike a street-sign or a personal pronoun, the hyperlink functions like a set of directions leading to the target page. However, unlike most deictics, the actual process of “navigation” is opaque to the viewer. By not just pointing to a text artifact but actually bringing one to it when clicked, hyperlinks function not just as the street sign, but also as the street.[3] Furthermore, the many stylistic possibilities that are allowed for in the formatting of a hyperlink, along with the technology’s central place in the art and politics of editing Wikipedia, mean that hyperlinks can also serve as non-referential social indexicals (sensu Silverstein 1976). As I show below, the usage and formatting of hyperlinks by Wikipedia editors carry a variety of social-indexical meanings, reflecting the norms and values of the community.

It is worth understanding the qualitative centrality of hypertext’s aforementioned “mechanical” functionality. The density of hyperlinks on Wikipedia, especially within content pages but elsewhere as well, is a signature element of the website’s user experience.[4] Their pervasiveness contributes to a particular “feel” that is characteristic of navigating Wikipedia. This feel seems to encourage users and visitors to engage with the website’s content in a particular way, a qualitative experience that is hard to put into words. So, alongside my attempts to describe its hermeneutic and interactional effects, I reference pages on Wikipedia through hyperlinks throughout this essay, following similar conventions to the ones that I would use if it were written on Wikipedia itself. In addition to making my ethnographic material more easily accessible, I hope that this move helps the reader to experience how connections between pages and ideas are mediated through hypertext on Wikipedia (demonstrating, as well as describing, a sort of poetics of the hyperlink). As I argue, this mediation through both the technology of the hyperlink, and the conventions that surround its use on Wikipedia, plays an important role in shaping patterns of interpretation and interaction among Wikipedia editors.

Existing research on Wikipedia

While Wikipedia has not been treated in the linguistic-anthropological literature, there is much work on the topic elsewhere in the social sciences. For the most part, these studies have very different aims and methods from mine, but their findings are nonetheless useful complements to my own account.

Perhaps the richest source of ethnographic information about Wikipedia is Dariusz Jemielniak’s Common Knowledge? An Ethnography of Wikipedia (2014). This is the only book-length ethnographic study of Wikipedia that I am aware of. Written for a non-academic audience, it gives an insider’s account of navigating the internal political world of the Wikipedia community. Jemielniak, whose background is in economics and organization studies, focuses largely on the formal, institutional aspects of the website, and tends to understand local social happenings among Wikipedians as emergent from this structure. My own analysis takes an opposing theoretical tack, with the fine-grained semiosis of linguistic interaction between Wikipedians as its jumping-off point. Nonetheless, this text is a useful source of basic facts about the history and structure of the Wikipedia community, as well as well-considered critical discussions of user hierarchies and their bearing on the creation of texts.

As Jemielniak understands it, Wikipedia embodies numerous contradictions between top-down and bottom-up forms of social organization, and the accompanying ideologies which justify them. He describes this tension in terms of “anarchy” and “bureaucracy,” terms I avoid (especially the former) due to their contested status in the social scientific literature (though see Graeber (2015) for a salient treatment of the relationship between the two). I have already mentioned his idea of Wikipedia as a “community of dissensus,” in which disagreement and argument are essential to the maintenance of both community and content on the website. Despite my reservations about his method of locating and analyzing contradiction and dissensus, this basic analytic is of use in untangling the place of individuals, collectives, and written rules in the conflicts I examine in each of my case studies. Furthermore, his comprehensive description of the multiple formal and informal hierarchies that structure Wikipedia’s editor community provides essential context for many of the interactions I examine.

In Blogs, Wikipedia, Second Life, and Beyond: From Production to Produsage (2008), media scholar Axel Bruns highlights Wikipedia as an example of “produsage” (a portmanteau of “production” and “usage”), a type of online content which is simultaneously produced and consumed by its users, often without a strong distinction between the two. This framework has certain similarities to the linguistic anthropological perspective I take, influenced by Gal & Irvine (2019), which sees individuals’ simultaneous interpretation and production of signs as fundamental to all instances of semiosis. However, it is on these very grounds that this perspective brings into question Bruns’ implication that Wikipedia and other “new media” are somehow unique in their supposed “interactive” nature; linguistic anthropologists truly do mean that all semiosis is bivalent in this way (Nakassis 2016:233-6). Still, the idea of produsage reminds us that, unlike many other sites (online or off), these chains of uptake and production occur in pursuit of a (perhaps loosely defined) common, “productive” goal.

There is a significant body of literature which attempts to review the factual accuracy of information on Wikipedia’s content pages. Most famously, a high-profile study in Nature pitted Wikipedia against Encyclopedia Britannica, finding the two comparable in terms of their representation of facts (Giles 2005), though Britannica (2006) later rebutted this claim. A number of similar studies have been conducted in the years since then, using various metrics to assess Wikipedia’s content for accuracy. These studies have often come to quite different conclusions, and no scholarly consensus has yet been reached (Mesgari et al 2015). Below, I refrain from opining on questions of aggregate factual accuracy, for which these more quantitative methodologies are better suited. Instead, I hope to shed light on the discursive processes which motivate the presentation of information on Wikipedia, rather than merely evaluating these processes’ results.

Finally, there exist a handful of short anthropological accounts of the Wikipedia editing process itself. Most of these (Bjork-James 2020; Gieck et al 2016; Gloor et al 2015; Pharao Hansen 2016) comment and elaborate on the well-documented “systemic biases” on Wikipedia, in accordance with the user base’s demographic trends.[5] However, unlike these studies, I do not begin with “bias” as a self-evident analytic category. Rather, I attempt to identify ideologies in use among Wikipedia users, and uncover the ways in which their enactment produces and contests notions of “bias” and “neutrality.”

Meet the Wikipedians

The human subjects of this study are known on Wikipedia as “editors,” or sometimes “users.” As implied by these terms’ reference to individuals’ actions of “editing” or “use,” they are commonly understood to include anyone who engages in editing Wikipedia at all, regardless of experience or social status. “Wikipedians” is another term used among the community to refer to editors. While no formal distinction is made between “Wikipedian” and “user” or “editor,” and “Wikipedian” is often used in the same broad sense as those terms (see WP:Wikipedians), this pseudo-demonym is also sometimes employed to denote only those users who have proven themselves to be trusted members of the “Wikipedia community” (a construct I explore in the pages that follow). My usage of “Wikipedian” reflects this narrower meaning of the term, while I use “user” and “editor” to refer to anyone who edits the website. Both of these categories are contrasted with people who access the website simply to read articles and not to edit, whom I refer to as “non-users,” or “readers” (reflecting my focus on Wikipedia’s backstage; they could, of course, be seen as “users” from a frontstage perspective).

Another relevant category of people on Wikipedia is “new users.” “New user” is an ill-defined subcategory of editors who are perceived, or perceive themselves, not to be very knowledgeable (yet) about Wikipedia and how to edit it. Much of this study focuses on new users as they navigate talk page and other “consensus” discussions. I do not attempt to strictly define who counts as a new user; rather, I attempt to show how new users are constructed and identified discursively.

Wikipedia also has a role system, through which certain users are elected to hold special privileges and responsibilities that regular users do not have. The four main categories are administrators (AKA admins), bureaucrats, arbitrators, and stewards (WP:ADMINH). While the role system plays a large part in shaping the social world of Wikipedia (Jemielniak 2014), accounting for this sort of formalized power structure is not the focus of this essay. The particular ethnographic cases I have chosen do not involve users with any of these roles. As I noted in the introduction, I hope to show how power is enacted on Wikipedia even absent credentialed or titled authorities.

Demographically, Wikipedia editors are overwhelmingly male, white, Anglophone, and well-educated. As many as 91% of editors are men, and 61% have at least a bachelor’s degree (Jemielniak 2014). Wikipedia does not directly collect statistics on race, but national origin data show that Wikipedians overwhelmingly hail from global north, white-majority countries like the United States, the Commonwealth countries, and western Europe. This skewing effect is probably heightened further by the fact that editors from outside of these regions are more likely to edit Wikipedia in languages other than English, making the English Wikipedia even more racially, geographically, and linguistically homogeneous than the website as a whole. The average age of a Wikipedia editor is 27 (Jemielniak 2014), and 59% of users are between the ages of 17 and 40.

I myself am a Wikipedia editor. I have had an account since 2015, and in that time I have made around 200 edits, a fairly low number. I consider myself to be a member of the “Wikipedia community,” though I am quite inexperienced in comparison to many other users. I also often act as a “lurker” on edit logs and talk pages, watching other editors’ interactions without intervening. So, I approach this project having had some experience as a participant in the discursive processes I describe, and much more as an observer. I originally became interested in writing about language on Wikipedia while using the website to research native plants in the area where I grew up. I noticed that many articles on particular plants had sections detailing their traditional uses by Indigenous people, but that these practices were all described in the past tense. Noting that many of these uses are still practiced, I began to correct these passages, using the present tense or a more tense-neutral construction. I discussed this project with other users on WikiProject Indigenous peoples of North America, including the user Philkon (Phil Konstantin), with whom I would later correspond personally. His observations and experiences as a Wikipedia editor were one starting point for researching my third case study.

Epistemology of Wikipedia

Ideology

Ideologies, as I define them, are necessary and necessarily limited interpretive frameworks for understanding the world which make some facts important while casting others aside, and lead individuals to ask certain questions while neglecting others (Gal and Irvine 2019). Following the sense in linguistic anthropology which defines ideology as situated beliefs (implicit or explicit) about some topic, object, or idea (say, “language”; Silverstein 1979), my usage of this term reflects its neutral, rather than negative, valence, which is likewise most common in the (non-Marxist) literature (Woolard 1998). Especially relevant to this study are language ideology (Silverstein 1979; Woolard 1998) and media ideology (Gershon 2010), ideologies which take “language” and (certain) “media” as their objects. Language ideologies predispose one towards conceiving of language in general, or particular linguistic varieties or practices, as having particular characteristics and capabilities. Similarly, media ideologies concern what different media are for, and what one can do with them. Media ideologies are shaped by the process of remediation, discussed below.

This project also discusses political ideologies, which I hold to be similar in kind to ideologies of language and media. Present on Wikipedia are not only political ideologies that exist in the “real world” like liberalism or socialism, but also ones that are internal to the website itself. These named ideologies with explicitly laid out beliefs (the most famous being deletionism and inclusionism) are not especially relevant to the case studies I discuss, but reflect important and fundamental divides among Wikipedians. However, not all ideologies (among Wikipedians or in general) are explicit in this way. Indeed, some are implicit or covert (Hill 1998). Such ideologies are less sources of conflict than instances of (local) hegemony, or pervasive and essentialized ideological agreement in alignment with power (Woolard 1985). On Wikipedia, users are acculturated to ideologies (of language, politics, media, and beyond) that make certain ways of thinking, (inter)acting, and contributing seem natural. Meanwhile, non-users who read Wikipedia approach the website with their own, often very different, ideologies. The ideological differences and conflicts that arise from these distinctions are the focus of this paper.

Community

Though the notion of a “Wikipedia community” is commonly referenced on the website, it is not obvious exactly who can be counted a member of this community, and what it is that ties its members together. So, I do not take the existence of such a community for granted. Rather, I intend to demonstrate how community (non)membership is constructed discursively in different contexts, and subsequently leveraged in order to achieve other social ends. I ultimately argue that this is a key aspect of the processes by which “consensus” decisions are reached. In doing so, I draw on a few distinct theoretical perspectives on “community.” In addition to the “community of dissensus” (Jemielniak 2014) discussed above, the notions of “community of practice” (Lave & Wenger 1991) and “speech community” (Silverstein 1996) are both found to be productive (if not perfect fits) when applied to Wikipedia. If the former is defined as a group bound together by some sort of shared practices, the latter is a special case of the former in which those practices are linguistic (though it need not involve a single language). All three of these analytics share the benefit of not relying on agreement between members in establishing or defining a community; any theory that does so would be a poor fit. Jemielniak takes this a step further by arguing that disagreement is not just compatible with a community’s existence, but constitutive of it in this case. Viewing Wikipedia as a community of practice reminds us that Wikipedians are united by a common task, and seeing it as a speech community reminds us that this task is not just mediated by language, but itself linguistic in nature. The latter is also useful in contextualizing shared linguistic practices found among editors (discussed below). Of course, Wikipedia is quite dissimilar to the paradigmatic community of speech or practice, not least because of its size and lack of physical co-presence among members. 
I draw on all three of these analytic tools in my analysis, as each one provides unique insights.

Citation and the “black box”

The reigning understanding (or ideology) of knowledge on Wikipedia holds that it is always produced elsewhere. In theory, the website acts as a secondary or tertiary source, not unlike a textbook or print encyclopedia, aggregating truths that have already been produced (or, perhaps, discovered) by some other person, group, or apparatus. (This is why contributors are “editors,” not “authors.”) In practice, however, editing Wikipedia often involves disagreement and interpretive work that cannot be reduced to simple representation, collation, or curation. Still, a fairly explicit ideology among Wikipedia editors perceives the website as unproblematically reflecting ideas whose legitimacy has already been established elsewhere. But much like experimental science, editing Wikipedia is an intrinsically social activity, guided and reproduced by means of a totalizing ideology of knowledge.

Of the many policy and guideline pages, three are designated as the “core content policies.” These are “neutral point of view” (which I discuss later), “verifiability”, and “no original research”. The “no original research” policy states:

“outside of Wikipedia, original research is a key part of scholarly work. However, Wikipedia editors must not base their contributions on their own original research. Wikipedia editors must base their contributions on reliable, published sources” (WP:OR).

Multiple things can make a source “reliable.” Many sources, such as newspapers or non-academic publishers, are determined to be reliable (or not) via a consensus process (discussed below). An editor wishing to cite a source of this type can check the relevant noticeboard to see whether it has already been discussed, and if it has, proceed based on that earlier decision. In contrast, peer-reviewed scholarly sources are understood to be reliable by default (WP:V). Secondary academic sources, like meta-analyses, are considered preferable to primary ones, since further interpretive work has already been performed by an expert (WP:OR).

Interestingly, the perceived near-inherent reliability of academic scholarship is attributed to the peer review process (WP:SCHOLARSHIP). The “official” view of scientific knowledge on Wikipedia seems almost totally uninterested in a mechanistic justification for knowledge (i.e. one which holds that laboratory-scientific methodology provides unmediated access to objective truth; see Daston & Galison 1992). In fact, the relevant policies do not even reference the experimental process, and they gloss over the details of peer review. The most detailed explanation for peer review’s efficacy characterizes it as “a professional structure… for checking or analyzing facts, legal issues, evidence, and arguments. The greater the degree of scrutiny given to these issues, the more reliable the source” (WP:V). Wikipedia policy does not explain what such “scrutiny” entails, why it works, or how peer review involves more of it than, say, the fact-checking department at a newspaper.[6]

“Truth” is not a privileged concept within Wikipedia’s policies; “reliability [of sources]” and “verifiability [of claims]” are employed instead, since they are thought to be less ambiguous, less contentious, and easier to apply (WP:V; WP:NOTTRUTH).[7] Of course, this does not mean that individual Wikipedia editors do not have varied opinions about these issues. For instance, enough users periodically request that the website’s policy be changed to emphasize “truth” over “reliability” and “verifiability,” that this idea has its own section on a list entitled “perennial proposals.” Typically, Wikipedia policies do not explicitly prescribe boundaries between acceptable and unacceptable ideas. Rather, they describe the conditions under which those decisions should be made. Additionally, their official status and hegemonic nature allow users who reference them to correct the opinions and actions of others.

Thus, peer review in the eyes of Wikipedia editors can be thought of as a “black box,” a process “about which they need to know nothing but its input and output” (Latour 1987:3). More relevant to my thesis, I argue that the results of past consensus discussions on Wikipedia itself, like policies or the reliability of sources, themselves function similarly. As Latour demonstrates, black boxes in experimental science are created through massive amounts of discursive work over long periods of time. On Wikipedia, the process is much the same; ideas have been deemed relevant or irrelevant, and practices have been deemed productive or harmful, through thousands of deliberations and conflicts over the course of two decades. Once established, decisions like these can be extremely difficult to overturn (Jemielniak 2014). In Case Study 2, I dig deeper into a process that plays a major role in the genesis and reproduction of black boxes on Wikipedia: citational practices, including the citation both of outside sources already deemed reliable, and of previous consensus decisions within Wikipedia itself.

Two kinds of citation are of primary interest to my argument. First, the common practice of referencing policies, usually with hyperlinks, in consensus discussions (such as those found on talk pages) is found to be a locus of social indexicality, and thus a key site where ideologies are enacted, and social differentiation occurs. Especially important in this case is the intertextual gap, or the difference between the “original” instance of a text and its recontextualized form (Bauman 2004:7). Bauman argues that this gap’s “calibration” helps to produce and define the boundaries of genres with respect to one another; in this essay, I proceed from the notion that registers (which have a great deal of conceptual overlap with genres) are delineated similarly. Second, I discuss content page citational practices, which are the most common intertexts connecting Wikipedia to articles, books, web pages, and other texts beyond the website itself. This involves the use of footnotes to reference particular claims back to their sources outside of Wikipedia, much like in academic writing.

Language of Wikipedia: the “Talk Page Talk” Register

Across its many component pages and page types, Wikipedia can be understood to be organized in terms of (linguistic) registers. In a linguistic-anthropological framework, register is a “reflexive model” consisting of a repertoire of associated signs (linguistic or otherwise), their corresponding social-indexical values, and ideologies which connect the two in various ways. These disparate elements which make up a model, and which one must understand in order to use or recognize the model, are known as “norms” (Agha 2007:8). These models exist foremost in the minds of interactants, who may instantiate them, or conjecture that another interactant has instantiated them, to varying degrees of success, explicitness, and normativity. Many register norms on Wikipedia are laid out explicitly in guidelines, while others are not. Familiarity with these norms (i.e. membership in Agha’s “domain of recognition”) allows one to judge others’ adherence to them (or lack thereof) (Agha 2005:45). Furthermore, users commonly make explicit metadiscursive (or even hypertextual) references to another’s adherence to—or deviation from—the norms of a particular register (most commonly, those intended to be used on content and talk pages). Instances of both discursive orientation to register models and metadiscursive policing of such orientation are important data, which help us to understand the attitudes (i.e. ideologies) that they presuppose and entail. From this understanding, an analysis which attends to indexical order (Silverstein 2003) allows me to identify the stakes that the events in question hold for issues such as ideology, community, and identity.

Although this essay uses discourse analysis to study register phenomena, it is not a diachronic study of enregisterment or any other similar process, though such a study is an area of future research interest. Each of my three case studies is a primarily synchronic view into the deployment of register, in conjunction with inter-, meta-, para-, and hypertextual elements, in particular instances which highlight the way that these processes reproduce both ideology and social structure. This project is more concerned with the broad social effects of registers in use than with their sociohistorical trajectories, although it also does not neglect the way that these two processes are closely intertwined.

The way that Wikipedia users, especially experienced ones, talk (or rather, type) on talk pages constitutes one linguistic register. This register, which I call “Talk Page Talk,” is socially encoded through a collection of signs and ideologies, which together constitute a reflexive model (Agha 2007). Talk Page Talk regiments linguistic usage in the context of the inherently metapragmatic talk page. One of its main functions is ensuring that talk page discussions concern the language of the relevant article (which has its own register norms), and that they do so correctly. So, it can be understood as a meta-register: a register about other linguistic registers.

Many aspects of the reflexive model of Talk Page Talk are prescribed explicitly on guideline pages such as WP:TPG. Here, we find that many signs indexing “correct” talk page usage are (meta)orthographic, including proper usage of hyperlinks and indentation. The guidelines make scant explicit reference to lexical elements, but in practice, we find abbreviations and other jargon, like “COI” (Conflict of Interest), “NPP” (New Page Patrol), and “enwiki” (English Wikipedia), to be common elements of the register’s repertoire as well. The guidelines for talk pages prescribe not only how to say things on talk pages, but also the sorts of things that should and should not be said. This variety of explicit metapragmatic prescriptions includes “Stay on topic” and “No personal attacks.” There is no indication that Wikipedians typically draw a clear distinction between these two sorts of metasigns, instead treating them as similar in kind, and as similarly important elements of the reflexive model.

In fact, prescriptions like these are not just metasigns, but meta-metasigns, since talk pages (and other non-content pages) and their associated registers are themselves inherently metapragmatic. This can be observed in the metasemantic norms discussed above; “Stay on topic” and “No personal attacks” remind users not only what to talk about and what not to, but that the correct topic of discussion is the linguistic material within particular content pages. Additionally, these prescriptions function iconically as a reminder that the work of editing Wikipedia is largely a matter of thinking and commenting about pre-existing texts. As such, very high degrees of metalinguistic recursion are not uncommon on talk, policy, and guideline pages. Just as talk pages are regimented by guideline pages, those guideline pages have their own associated talk pages, and other guideline pages which explain how to edit both of them. For instance, one banner begins, “[t]his page documents an English Wikipedia behavioral guideline,” above a page entitled “Talk Page Guidelines,” which outlines proper linguistic practices on talk pages, which are themselves metacommunicative (WP:TPG). This banner is an example of talk about talk about talk about talk (or perhaps meta-meta-metacommunication), and is just one click away from any given talk page.

In addition to a repertoire of semiotic elements, the Talk Page Talk register also has a particular social range and domain. A register’s social range includes its indexical focus (the person, group, or thing which is the subject of its indexical value), image (the stereotypic information that is evoked about the indexical focus through the register’s use), and value (which can be positive or negative, or potentially ambivalent) (Agha 2007). In normative usage of Talk Page Talk, the primary indexical focus is the person employing the register. Almost without exception, this person is indexed as a user of Wikipedia. The image typically indexed by the register’s use is that the user is an experienced editor, and on the second order, that their opinion is to be trusted. Therefore, usage of the register is positively valued among Wikipedians.

An additional, and perhaps more trivial, indexical focus of Talk Page Talk points not to a person or group at all, but to the event (or perhaps “location”) of its own use. Just as certain registers might index that a speaker and their interlocutors are enacting a classroom or a courtroom, adherence to the norms of Talk Page Talk indexically indicates that the interactional situation at hand is, in fact, that of a talk page discussion. Lastly, certain tropic usages of Talk Page Talk can, through contrast, take an addressee as an indexical focus, often with a negative value. This is discussed further below.

A register’s social domain includes its domain of recognition (the set of people who recognize the register in use) and its domain of fluency (the set of people who are capable of felicitously employing it) (Agha 2007). In the case of Talk Page Talk, both of these categories are rather narrow. The domain of fluency is composed entirely of Wikipedians, especially ones who are in fact experienced users. Thus, there is at least an approximate alignment between the register’s indexical image regarding the capabilities of its fluent users, and their actual capabilities. However, this correlation is far from absolute. For instance, a fairly new user could convincingly perform Talk Page Talk (or at least, a close approximation of it) by adhering closely to the guidelines listed above which delineate its repertoire, and/or by closely mimicking the (meta)linguistic practices of other users. While the Wikipedia community expects most inexperienced users not to behave this way (WP:BITE), it is well within the realm of possibility that someone could master Talk Page Talk and other registers faster than they gain the practical skills indexically associated with such mastery.

Talk Page Talk’s domain of recognition is only slightly wider than its domain of fluency, a distinction that highlights the gap between “new users” and more experienced ones.[8] In addition to users who are themselves fluent in the register, the domain of recognition also includes users with enough knowledge of the guidelines or experience reading talk pages that they can tell the difference between a fluent and non-fluent user in context, but perhaps not enough to consistently perform it themselves. To a lesser extent, it also includes very new users or non-users who recognize Talk Page Talk as a distinct register based on its characteristic jargon and other repertoire features that set it apart, without detailed knowledge of those features’ indexical (or even denotational) meanings. In place of such detailed indexical awareness, these users might draw on outside knowledge, such as stereotypes about Wikipedia itself, in order to associate the register with qualia such as “technical,” “impenetrable,” “bureaucratic,” or perhaps “geeky.” This reminds us that domains of recognition and fluency are not absolute; rather, individuals can orient themselves to them by degrees. Users can be more or less capable of employing or recognizing Talk Page Talk based on any number of factors. Even the most experienced Wikipedian might well slip up, and even be corrected by another user. We have seen that users’ fluency in Talk Page Talk, their ability to recognize it, and their experience editing Wikipedia do indeed all tend to be correlated, but perhaps not as closely as the average Wikipedian might expect based on indexical stereotypes. It is in these gaps, ambivalences, and ambiguities that the Talk Page Talk register’s potential for effecting social differentiation arises.

Talk Page Talk should be understood with respect to a somewhat wider set of practices and ideologies on Wikipedia known as “consensus.” Consensus, which I mentioned above, is a common mode of backstage interaction on Wikipedia, governed by its own set of norms, many of which are made explicit on dedicated guideline pages. In this paper, the term refers to the very particular process known as “consensus” among Wikipedians. This is distinct from consensus in its usual sense, and claims I make about consensus on Wikipedia do not necessarily apply to consensus processes writ large.[9] It occurs not only on talk pages, but also on message boards and other pages on Wikipedia that are designated for direct discussion between editors. Additionally, an implicit sort of consensus is understood to be frequently reached through the process of collaboratively editing articles or guideline pages, even if the users involved do not converse directly (WP:Consensus). In addition to the many norms (and their associated policies) that are specific to talk pages, users who participate successfully in talk page discussions must align to the norms of consensus. As a result, the (meta)linguistic practices which occur on message boards and other sites of explicit consensus discussion on Wikipedia bear strong similarities to Talk Page Talk, although they are not identical.

Remediation, the process by which shared understandings and practices around new kinds of media draw on norms associated with older media (Gershon 2010), plays an important role in defining register on Wikipedia as well. From the website’s name, a portmanteau of wiki[10] and encyclopedia, the two most obvious instances of remediation on Wikipedia can be observed. But beyond wikis and encyclopedias, many forms of media (and thus, registers) contribute to the ongoing remediation of Wikipedia. A guideline page entitled “What Wikipedia is not” tells us that Wikipedia is not “a dictionary… a soapbox… a blog, web hosting service, social networking service, or memorial site… a manual, guidebook, textbook, or scientific journal,” or “a newspaper,” just to name a few examples. It goes on to metapragmatically explain the difference between each of the aforementioned registers and those common on Wikipedia (including content, talk, and user pages). For instance, we learn that, unlike a blog, “Wikipedia articles use formal English and are not written in Internet posting style.”[11] This entire page is an interestingly reflexive explication of register and media ideology among the Wikipedia community. Wikipedians valorize or stigmatize particular lexical items, orthographic practices, grammatical constructions, and even entire registers based on their stereotypical association with other sorts of media, and incorporate them into their own usage (or take care to avoid them) accordingly. Blogs, textbooks, dictionaries, and other media all act both as foils against which Wikipedia defines the sorts of language that its users ought to employ, and as sources from which its contributors draw both informational content and formal conventions.

In what follows, I trace out the language and epistemology of Wikipedia through three case studies. As noted above, my goal in doing so is to determine the way that register models, “black boxes,” and ideological notions of community membership are invoked, cited, and contested in the process of generating “consensus” among Wikipedians.

Case Study 1: Pete Buttigieg

From time to time, certain “behind-the-scenes” events on Wikipedia attract attention beyond the limits of the website, and spill out into the “real world.” One such case occurred during the 2020 presidential election, when investigative journalist Ashley Feinberg published a Slate article about a Wikipedia user with apparent ties to then-candidate Pete Buttigieg’s campaign, who had edited a large number of articles about, or connected to, Buttigieg. In the article, she connects the account to a person, Neehar Garg, who attended the same high school as Buttigieg, and was possibly involved in one of his earlier political campaigns. She also divulges that the Buttigieg campaign denied any connection with Garg or the account, despite apparent evidence to the contrary from Garg’s editing history. Feinberg concludes:

The evidence seems overwhelming that, despite the campaign’s repeated denials, Pete Buttigieg has had knowledge of, and at least some active participation in, the maintenance of his Wikipedia presence. This is not a crime. It’s just a deeply weird thing to deny. (Feinberg 2019)

This article gained significant traction on social media, which is where I first encountered it in late 2019 or early 2020. Feinberg’s telling of the story fits one of several common media narratives about political bias, conflicts of interest, and paid editing on Wikipedia. For instance, it practically invites comparison to the now-defunct Twitter page CongressEdits, which tweeted automatically every time Wikipedia was edited from within the houses of Congress until its suspension in 2018.

Inevitably, popular off-site discourses like the one that began with this article make their way back to Wikipedia. A link to the Slate article can be found towards the top of the talk page for the content page entitled “Pete Buttigieg,” along with links to four other news pieces that reference Feinberg’s, under the heading “[t]his article has been mentioned by multiple media organizations” (hyperlinks original). This heading is built using a template provided on a guideline page, which provides protocols for acknowledging and handling press coverage of Wikipedia. Media coverage of Wikipedia is common enough for a detailed set of protocols about it to exist on the website, but still rare enough that it tends to generate at least some degree of interest, excitement, or occasionally even anger, among Wikipedians.

Archived on the “Pete Buttigieg” talk page are two discussion threads about the Slate article. With one exception, the participants in both discussions appear largely uninterested in the implications of Feinberg’s reporting with respect to large questions about the trustworthiness or political bias of Wikipedia, or even of the content page in question. (In my experience, these questions are much more commonly treated in detail by journalists and academics than by users on the website itself). Rather, they are mostly concerned with the accuracy of Feinberg’s representation of the fine points of Wikipedia’s internal operations, or the practical effects that her article’s popularity might have on the process of maintaining the “Pete Buttigieg” page.

Among non-Wikipedians who are aware of Wikipedia, such as many journalists, schoolteachers, or internet users writ large, a pervasive (but not hegemonic) ideology holds that Wikipedia is not an unreliable source of facts per se, but rather is compromised or limited in particular ways. One common iteration of this idea is the claim that Wikipedia articles about political or other controversial topics are unreliable because they are subject to extensive conflicts of interest or other biases, while less controversial articles are more trustworthy. Another ideology holds that the text of Wikipedia articles themselves is not reliable, but that the non-Wikipedia sources cited within the text usually are. While some Wikipedia users may also agree with these perspectives, statements like these are more often taken by Wikipedians as overly simplistic generalizations which are not informed by the more complex reality of quality control on the website. Feinberg does not explicitly appeal to either of these narratives, but her article, much like CongressEdits, would be a welcome piece of confirming evidence for anyone who was already skeptical about the political information on Wikipedia.

In the example above, the effects of both “native” and “non-native” ideologies are apparent. For a non-Wikipedian, a question like “what implications does possible editing by political campaigns have for the accuracy of information on Wikipedia?” might seem to follow straightforwardly from Feinberg’s Slate article. (An answer to that question might even seem equally straightforward, rightly or not.) But for most of the twelve users who participated in this discussion in December of 2019, this broad concern about the website as a whole was subordinated to more technical, or perhaps even pedantic, ones:

The policy knowledge was definitely good, albeit not perfect. She states that COI editing is "strictly prohibited" on WP, when actually our policy merely "strongly discourages" it, and only outright prohibits undisclosed paid COI. But that's a minor quibble. Sdkb (talk) 08:31, 28 December 2019 (UTC) (Talk:Pete Buttigieg, hyperlinks original)

At the very least, any concerns that these users did have about Wikipedia being politically compromised were expressed somewhere other than this talk page, or kept to themselves. As explained above, users on talk pages (as well as, occasionally, elsewhere) conform to the model of Talk Page Talk by varying degrees, in accordance with their level of familiarity with the register, their attitudes towards the register’s indexical values, and their social goals within a particular event. In the case of the Pete Buttigieg talk page discussion, most users involved embody this register in a fairly normative way. Although many of them disagree with one another, their shared register usage puts them on a more or less equitable footing (Goffman 1979), framing the event such that the particular sorts of disputes that occur (namely, those regarding technical details) are made appropriate. The converse occurs when one user in the discussion appears not to participate in normative Talk Page Talk:

Just a comment - Are politically-charged editors (either from the campaign or for intentions of attacking another candidate) invading Wikipedia? I've seen a few contentious articles be put up to AfD, usually being bombarded by those in support of the person in question. Weird stuff, I guess this is my 1st time seeing something fishy like this Letmejustcorrectthatforyou (talk) 10:56, 21 December 2019 (UTC)

See the article Conflict-of-interest editing on Wikipedia. People with WP-articles are often interested in what those articles say, which is unsurprising. Gråbergs Gråa Sång (talk) 15:26, 21 December 2019 (UTC) (Talk:Pete Buttigieg, hyperlinks original)

The first comment, which is a top-level response[12] to the initial post of the discussion thread started by the user Catiline52, deviates from the model of Talk Page Talk in several ways. Aside from the jargon term “AfD” (Articles for Deletion), the user Letmejustcorrectthatforyou uses no register-characteristic lexical items, instead opting to use non-technical terms. For instance, “politically charged” is not a common descriptor within Talk Page Talk (or normative language on Wikipedia broadly); an experienced editor who wanted to be perceived as such would be more likely to use phrases like “political bias” or, as many others in the thread do, “COI” (Conflict of Interest). The lexical items “invading,” “bombarded,” and “fishy” seem similarly out of place on a talk page. These choices, especially “politically charged,” seem to more closely evoke a journalistic register, aligning them to a certain degree with Feinberg herself, rather than the other users in the discussion. Recall the guideline which states that “Wikipedia is not a newspaper” (WP:NOTNEWS). While it is impossible to know whether other users shared my own uptake and saw Letmejustcorrectthatforyou’s non-standard register usage as journalistic, it is apparent that register on Wikipedia is defined and bounded not just positively through the guidelines’ prescriptions, but also negatively through remediative comparison with other kinds of media. The use of bold text violates a prescribed orthographic norm as well (WP:SHOUT[13]). While perhaps not strictly part of the talk page register, the red (broken) link to their user page, indicating that they have not yet set it up, is another common index of lack of experience or technical Wikipedia knowledge. In fact, Jemielniak highlights the redlinked username as “an immediate sign of being either a novice or a rarity” which makes one “stand out and receive more scrutiny” (2014:25).

The semantic content of Letmejustcorrectthatforyou’s comment deviates from norms in several ways as well. First, their leap from the particular happenings described in Feinberg’s article to a more general statement about the political state of Wikipedia is extremely atypical for Talk Page Talk, and could even be seen as a violation of talk page guidelines such as “No meta” or “Stay on topic.” Additionally, and perhaps most straightforwardly, the assertion “this is my 1st time seeing something fishy like this” denotes (rather than indexes) their unfamiliarity with a particular kind of event on Wikipedia. Especially in the context of all the other co-occurring signs described above, this statement can easily be metonymically extended to denote a more general unfamiliarity with Wikipedia as a whole. Furthermore, their apparent concern about the reliability of “contentious articles” aligns with the very same popular ideologies regarding political information on Wikipedia discussed above. This further orients Letmejustcorrectthatforyou’s perceived perspective away from that of the Wikipedian community, and towards some outsider perspective. In particular, in combination with the aforementioned lexical choices such as “politically charged,” it indexes a further orientation towards the perspective of a journalist like Feinberg, whom other, apparently more experienced, users in the discussion such as Gråbergs Gråa Sång, overtly or implicitly criticize.

This is precisely the interpretive tack that Gråbergs Gråa Sång takes. (Note that they are the only user to respond to Letmejustcorrectthatforyou’s comment, perhaps itself an index of the infelicity of Letmejustcorrectthatforyou’s contribution, which otherwise occasioned a resounding silence on the talk page.) Their reply simultaneously demeans, dismisses, and corrects Letmejustcorrectthatforyou through a tropic invocation of the Talk Page Talk register. By linking to the (very long) Wikipedia article detailing the history of conflicts of interest on Wikipedia, they inform Letmejustcorrectthatforyou that the issue they had never seen before is in fact common and well-documented. This same link also acts both as a speaker-focal index of their relative experience through the use of a correctly formatted in-line hyperlink, and as an addressee-focal index of Letmejustcorrectthatforyou’s lack thereof. Gråbergs Gråa Sång’s description of the situation using more register-appropriate terms, like “conflict of interest” and “people with WP-articles,” indexes lexical competency in a parallel manner. Additionally, the (probably intentionally) obvious statement “[p]eople with WP-articles are often interested in what those articles say, which is unsurprising” seems to imply that it was silly for Letmejustcorrectthatforyou to be surprised by the existence of political conflicts of interest on Wikipedia at all, and inappropriate for them to bring it up on this talk page. All of these elements function dialogically with respect to the initial comment as a correction of its orthographic, lexical, and semantic content.

Ultimately, we see how Gråbergs Gråa Sång’s two-sentence reply indexically indicates what one should say on a talk page, how one should say it, and how Letmejustcorrectthatforyou’s initial comment failed on each of these counts. Corrections like these are common across Wikipedia talk pages, and in aggregate, they serve to regiment the ideologies and actions not only of the (usually) inexperienced editors who are corrected, but also other users who come upon the archived conversations. Individual corrections like this one encourage new users to perceive and interact with some particular aspect of Wikipedia differently than they did before, usually in a way that is quite practical, focused on the minutiae of editing rather than the big picture of the website’s functioning. Over time, these ideological “tweaks” allow a user to increasingly orient themselves, by degrees, towards the norms which are figured as typical of experienced editors. Thus, these same tweaks play a part in delineating the boundaries of the “Wikipedia community,” both in the moment as the new user is temporarily placed outside of the community or at its margin, and over time as the user gradually becomes perceived as a member, and potentially even plays an active role in boundary-making themself.

How might this conversation have gone differently? The guideline page “Please do not bite the newcomers” provides the clearest prescription for how interactions like this one ought to proceed. Depending on one’s interpretation, Gråbergs Gråa Sång could be seen to either conform to or violate guidelines such as “[m]oderate your approach and wording,” “[a]void sarcasm,” and “[a]void excessive Wikipedia jargon” (WP:BITE). In all likelihood, this ambiguity is intentional; at the very least, it is what makes their reply efficacious. Replying with outright hostility would not only have been less likely to meaningfully regiment Letmejustcorrectthatforyou’s (and other users’) future actions on talk pages, but also would have opened up Gråbergs Gråa Sång themself to criticism for being harsh and disproportionate, in violation of the guidelines. They probably also could have communicated similar information (both indexical and denotational) while also going out of their way to avoid seeming curt or dismissive, though perhaps not in just two short sentences. While instances of correction like this serve to regiment action and ideology and define boundaries, they sometimes look very different. Some editors are unusually kind, while others are unapologetically hostile. Users who make tweaks like these do so in a variety of contexts, and might themselves have varying degrees of technical knowledge, as may their interlocutors. The remaining case studies begin with similar corrections to the one detailed above, although the ensuing conversations take different trajectories, due to factors like these.

Case Study 2: Genetic Engineering

On September 7th, 2018, an anonymous Wikipedia user with the IP address 86.183.211.16, who had never contributed to the website before, made a long edit to the content page entitled “Genetic engineering.” The contents of the edit, in which the user criticizes genetic engineering, deviate from the register characteristic of Wikipedia content pages, and include only one citation of an outside source, whose relevance to their larger argument is tenuous. This new text was removed from the page (“reverted,” in emic language) a few minutes later by the more experienced user Plantsurfer. The anonymous user then left a series of notes on Plantsurfer’s user talk page, first to ask for help citing sources, but soon after with accusations of “censorship” and “ignor[ing]” their[14] requests. They also added a banner on top of the “genetic engineering” article alleging that its “neutrality… [was] disputed,” although Plantsurfer removed this as well.

Additionally, the user left the following note on the talk page for the “genetic engineering” article, under the heading “Look a little bit closer, see, roses really smell like: NPOV (4U)”:[15]

Hello about 1/10th of this article discusses the negatives of genetic engineering, and it has been placed right at the bottom of the article. Perhaps it'd be good to allow other people to submit information without the censorship? Cherry picking what edits are approved just because you disagree with the content could really be seen as a little bit fascist (on a good day). Either way, feel free to bury your heads in the sand and hide people's minds from the realities of the big bad world ... because either way, it ain't gonna save you! :) (Talk:Genetic engineering)

Plantsurfer deleted this note, citing “trolling,” but the anonymous user soon restored it. Plantsurfer did not attempt to remove it a second time, but instead replied to the note, writing:

No censorship is happening here. Anyone is free to contribute to this article provided the contribution complies with Wikipedia's core rules, notable among which are adherence to a neutral point of view, no original research and verifiability. (Talk:Genetic engineering)

Soon after, the anonymous editor stopped making contributions. Their IP address may have been blocked, or they might have simply given up.

This sort of breach is not terribly common on Wikipedia, but neither is it unheard of. Most deliberations are more civil and closer to uniform in register, and these sorts of interactions certainly play a large part in reproducing the website’s norms and ideologies as well. In the case described above, what sets apart the anonymous user is their unfamiliarity with (or in all probability, intentional disregard for) the policies, conventions, and attitudes that are hegemonic on Wikipedia. Their apparent aggression aside, the user was legitimately new to Wikipedia at the time of this interaction (unless they had previously contributed from a different IP address). They appear to learn more about how to edit the website as the conflict progresses, earlier on asking for help with a citation, but later installing a banner, which requires a small degree of technical editing knowledge. However, regardless of the extent to which more seasoned Wikipedians like Plantsurfer understood their unusual conduct as resulting from ignorance or hostility, it is notable that Plantsurfer[16] engages only with their conduct. Not once in the interaction between the two does Plantsurfer respond to the content of the anonymous user’s claim about genetic engineering. Rather, Plantsurfer labels the user’s initial edit, and subsequent comments on various pages, as “vandalism,” “trolling,” and a violation of all three core content policies.

Contrary to the anonymous editor’s claims, this probably cannot be chalked up to “censorship,” given that diverse and critical views of genetic engineering are productively debated elsewhere on the same talk page, and given attention both on the “genetic engineering” content page, and on another content page dedicated to controversies about genetically modified foods. Rather, Plantsurfer is unable to engage with the user or take them seriously, because neither seems interested in or capable of commensurating the ideological divide between them. While ideas and processes (such as, in this case, the guidelines themselves) may be black boxes for an experienced Wikipedian like Plantsurfer, the anonymous user has no reason to take them for granted.

Thus, Plantsurfer’s most natural response to their edits is to, through citation, express that they are operating under the wrong norms. In doing so, Plantsurfer achieves several things. The use of in-line hyperlink citations of Wikipedia policies is “perfunctory” in Latour’s sense (1987:34). By demonstrating knowledge of important documents across the website and the technical ability to refer to them using the correct formatting (as opposed to using an unformatted hyperlink, or referencing the policy without linking to it at all), Plantsurfer indexes her group membership as a Wikipedian and fluency with both its explicit rules and implicit norms of discourse. In contrast, the anonymous user is shown to be deficient in all of these areas, and is thus figured as an outsider, incompetent, and unserious. Plantsurfer does not, and need not, explain what it means to adhere to a neutral point of view, or why the anonymous user’s edit qualifies as vandalism. In fact, her appeal to NPOV dialogically rebukes the anonymous user’s invocation of the same policy in the heading of the discussion thread, implying that it, like their other attempts to adhere to Wikipedia’s norms, was incorrect. She employs these references, not primarily to teach or clarify, but “as so many signposts indicating… the technical resources that are under [her] command” (Latour 1987:36).

Plantsurfer’s (non)usage of deictics, compared to that of the anonymous user, reinforces this same contrast. While the latter employs the interrogative mood and makes multiple references to a plural “you,” Plantsurfer eschews person-indexing deixis altogether. This non-acknowledgement of the addressee helps her seem objective and above the fray, as does the phrasing of her references to policies as nomic statements in the simple present. By comparison, the anonymous user appears frantic and accusatory (if they did not seem that way already).

Relatedly, Plantsurfer’s citation of Wikipedia’s policies helps to reproduce the practices that allow for the production of knowledge on Wikipedia, by clearly demarcating correct practices from incorrect ones. By dismissing the anonymous user’s edits out of hand, Plantsurfer seems to make an example of them. Since talk page discussions are never deleted, exchanges like this one in aggregate, with their breaches and subsequent corrections, remain as reminders for other users who might happen upon them, of kinds of behavior that are not suitable for producing knowledge on Wikipedia. As mentioned above, the details of the policies that Plantsurfer cites are not of primary relevance; they are victims of the intertextual gap. These links also have an emblematic significance. Plantsurfer’s choice to characterize the anonymous user’s actions as violations of the core content policies (the three most important policies on the entire website), reflects their transgression not so much in quality as in degree. There are a number of other policies which would be equally- or better-suited to describing the incongruity between the user’s actions and those that are expected of Wikipedians. However, the fundamentalness of the core content policies to the ideologies and practices of Wikipedia acts as an indexical-icon for the anonymous editor’s perceived fundamental failure to enact those ideologies and practices. When viewed by other users, this interpretation, made credible by Plantsurfer’s indexing of her own relative expertise, makes the anonymous user’s actions into not just a failed attempt to edit Wikipedia, but the very paradigm for what not to do on the website.

As we have seen, the production of knowledge on Wikipedia parallels that of experimental science (and other similarly procedure-grounded institutional practices) in several ways. Like them, it relies heavily on ideas which, through the discursive work of either prior consensus on Wikipedia, or of outside fact-checkers and peer reviewers, come to function as black boxes for those who participate in it. These ideas, as heuristics or touchstones of a wider ideology, are strategically deployed in the continual battle to keep Wikipedians’ community of practice coherent and discrete, and to bring it closer to a perceived goal. Often, these deployments come in the form of citations and references, which carry with them a wide range of indexical associations. This further contextualizes what I called ideological “tweaks” in the first case study; the corrections are achieved through this indexical process of citation, and the “big picture” questions that users are encouraged to ignore are prior consensus decisions that have become black boxes. Yet, many who participate in these practices and share these ideologies would deny or minimize the discursive or social nature of their work, instead characterizing it as a project of mere representation. These ideas and practices, which are all so essential to Wikipedia’s functioning, are also closely intertwined with some of its largest problems.

Interactions like those discussed in this case study and the previous one are an essential part of Wikipedia’s functioning. Wikipedia’s policies make clear (and I would concur) that the website cannot become, or continue to be, a knowledge-producing entity without excluding certain ideas, practices, and thus also people. This exclusion, which often happens discursively through the consensus process, is also productive, creating new categories and policing boundaries between them. The case of Plantsurfer and the anonymous user is a comparatively low-stakes example of this. But the question of exactly what and who should make up the milieu of Wikipedia is highly fraught, especially in cases of what Wikipedians call “systemic bias,” which I address below. Conclusions reached according to the ideology of consensus on Wikipedia are often highly contingent, and with a critical eye and the benefit of hindsight, sometimes turn out to be misguided or illegitimate. Due to Wikipedia’s sheer size, such decisions are frequently made by a small and random few who happen upon a particular discussion, or worse, by malicious groups with harmful agendas. This problem is further compounded by the inertia of existing consensus, a force that seems to become more powerful the more important or broad-reaching a given consensus decision is (Jemielniak 2014). Yet, a powerful ideology holds that this process yields results which are by default good (and sometimes even binding), but also minimizes the social particularities of the process itself.

Here, then, are two contradictions within Wikipedia as an apparatus. First, despite its aesthetic of “openness” or universality, ignoring the “big picture” of the website, as explained above, and excluding people and ideas (implicitly or explicitly), both seem to be centrally important to Wikipedia’s everyday functioning. Second, the very practices and ideologies which are most essential to Wikipedia’s concrescence can, at the same time, be sites of frustrating arbitrariness and outright injustice. Wikipedians are not naive to these contradictions. On the contrary, much of the ideological work that they engage in, including that which I discuss in this paper, is concerned with justifying or resolving them.

Case Study 3: Cherokee Nation

The talk page for Wikipedia’s article about the Cherokee Nation is nearly as long as the article itself. It is full of clarification questions, deliberations about content, and debates about errors, dating back more than a decade. In June 2009, a discussion began under the heading “Name of the tribe.” At this site, several Wikipedians deliberate about whether the associated content article should be titled “Cherokee Nation” or “Cherokee Nation of Oklahoma.”[17] Users Philkon[18] and Uyvsdi argue for the former designation, while a third user with the screenname Chuck Hamilton, who had himself added many of the references to the “Cherokee Nation of Oklahoma” prior to the discussion, makes a case for the latter. On their user pages, Philkon and Uyvsdi both identify themselves as enrolled tribal citizens (Philkon as a citizen of the Cherokee Nation in question, while Uyvsdi does not specify a tribe), while Hamilton does not. The deliberation concludes that all references to the tribe should be changed to “Cherokee Nation,” a change which Uyvsdi applied soon after the discussion. Today, the page follows the same convention, with a single note explaining that “Cherokee Nation of Oklahoma is sometimes used.”

The users in this talk page discussion do not all succeed at using Talk Page Talk normatively, but they all orient to it to varying degrees. As I have noted, (meta)communication on Wikipedia involves frequent references to different sorts of intertexts across and beyond the website. For instance, in the aforementioned tribal name discussion, Chuck Hamilton claims that removing “of Oklahoma” from the name “would not only be misleading but a violation of NPOV [neutral point of view]” since the other two recognized tribes also consider themselves to be Cherokee nations. Uyvsdi disagrees, arguing that “[c]alling a tribe by its current name is hardly a violation of NPOV,” and that while some people object to that name, they “are not in a position to name someone else's tribe” (Talk:Cherokee Nation). In doing so, Uyvsdi turns Hamilton’s accusation against him, citing the very same guideline to make the opposite point (and perhaps to accuse him of political bias). Eventually, it becomes clear that Hamilton is in the minority, and he concedes the issue.

Like many Wikipedia editing guidelines, NPOV is a site of frequent conflict. Disagreements about policies like NPOV play a major part in the production of boundaries on Wikipedia, but actual adherence to the policy’s details is rarely what is at issue. Though, in my opinion, Phil’s and Uyvsdi’s perspective conforms to it most closely in this instance, the NPOV policy at the time did not provide a totally clear-cut answer to the Cherokee Nation dispute, nor did the (perhaps more relevant) guideline page about article titles. According to those guidelines, official names hold no privileged status in the determination of an appropriate article title. Rather, titles should balance the goals of “recognizability, naturalness, precision, conciseness, and consistency” (WP:AT). The naming conventions for “ethnicities and tribes” did not yet exist, and up until recently, gave little specific information that would be relevant to the name of a tribal nation. Ultimately, the decision on an appropriate title is left up to consensus.

Wikipedia’s guidelines on article titles provide explicit naming conventions for government agencies and operas, and even include an entire page dedicated to the names of ancient Romans. Yet, until recently, this enormous collection of guidelines only incidentally addressed the naming conventions for sovereign tribal nations, on the aforementioned “ethnicities and tribes” page, which was written only in 2012. From its inception, this page failed to acknowledge that such nations are distinct from “tribes” in the ethnographic sense, or that they are political units as much as (or perhaps more than) ethnic ones (Barker 2005). This contrast highlights what has been identified among both Wikipedians and critics as the website’s Western-centric “bias.” It is also an example of semiotic erasure, a phenomenon in which an ideology obscures certain qualities or possibilities (Gal & Irvine 2019).

The guidelines for article naming seemed to presuppose (as many Wikipedians probably do) that there is a fairly neat divide between “ethnicities and tribes” on the one hand, and political institutions on the other, and thus fail to account for tribal nations, which have qualities of both (Barker 2005). Since tribal nations were unaccounted for in the guidelines, a main source of epistemic authority for editors, there is a sense in which they did not exist within Wikipedia’s ontology at all. Just as Phil Konstantin and his interlocutors encountered this ambiguity more than a decade ago, participants in various other deliberations since have referenced this guideline in a way that, I would argue, led to confusion and error. The very idea of “neutrality” on Wikipedia (and thus, of its opposite, “bias”) is contingent on a set of guidelines, which are themselves underlain by an (in this case, political) ideology that presumes a certain taxonomy of things and people. To even speak of neutrality, then, is misleading because it implies the possibility of an alternative state of affairs in which Wikipedia is somehow non-ideological. Ideologies, of course, are unavoidable because they are necessary interpretive frameworks; they can only be replaced with different (and perhaps, better) ones (Gal & Irvine 2019).

In fact, Wikipedia is not stuck with this particular ideology forever. Since starting to write this thesis, I raised the issue on WikiProject Indigenous peoples of North America, and a few editors have begun to restructure the “ethnicities and tribes” guideline page in order to better account for the political and social dimensions of Indigeneity with respect to article titles. This consensus process is ongoing at the time of writing, and I cannot predict where it will lead or if it will provoke backlash. Editing a relatively obscure set of guidelines like these is not as difficult as revising core policies, nor is it the same thing as actually changing editors’ minds, but it may well have a real impact on the presentation and reception of Indigenous topics on Wikipedia.

In the tribal name dispute, however, the actual content of the NPOV and article naming guidelines is hardly even discussed. Instead, the editors involved mostly allege that their detractors have violated them, and that they themselves have abided by them, making their references “perfunctory” much like Plantsurfer’s citation of policies in Case Study 2. Of course, this does not mean that both sides are equally right or wrong. However, it does speak to the size of the intertextual gap between the original text of guidelines like NPOV on the one hand, and their recontextualizations in talk pages on the other. I believe that this gap exists for the following reasons. Like other page types, Wikipedia guideline pages are stereotypically associated with their own register, which has its own repertoire, social domain, and social range, distinct from those of Talk Page Talk. As instantiations of this different register, long quotations of guidelines would seem stylistically out of place on a talk page, and would evoke more attention or emphasis than an editor would typically want. Additionally, in the case of long and complex policies, such quotations would have to be correspondingly lengthy and awkward. In practice, there is a pervasive norm for referencing policies and guidelines that sidesteps this issue, but also increases the perceived semiotic “distance” between the two. Rather than quoting, editors usually name the requisite policy and provide an in-line hyperlink to the guideline page itself (WP:TPYES).[19]


To naturalize and streamline this practice even further, a system of shortcuts has proliferated, allowing Wikipedians to link not only to entire guideline pages, which can be quite long, but also to individual policies within them, using a hyperlink which only names the policy it indexes in an abbreviated manner (see Figure 1). So, the details of the policy are at once easily at hand (since they can be accessed by clicking the hyperlink), and offset (since they are not actually present in a given talk page comment). This system has its benefits and drawbacks. While it allows for easy reference to information from a massive list, it also widens the gap between the original content of a guideline and its recontextualized, abbreviated form, to the point where one can invoke its name or abbreviation, while conveying little or none of what it is meant to signify. This has two major consequences. First, as seen above, Wikipedia’s policies can be more easily used to lend institutional legitimacy to any side of an argument, regardless of its actual relevance. Second, this system of abbreviations exacerbates the hierarchy based on differential technical knowledge and capabilities with respect to the repertoire of Talk Page Talk, which I have argued tends to exclude new, less-experienced, and perhaps minoritized editors.

Later in the discussion, Philkon justifies his position by stating that he has “talked with the Cherokee Nation Principal Chief Chad Smith about this issue,” and that, “[a]s an enrolled citizen of the ‘Cherokee Nation’... there is no ‘Cherokee Nation Of Oklahoma’”. A new user, Onopearls, then joins the discussion, and argues that Phil’s reference to his personal communications “constitutes WP:OR [original research] and isn't relevant to the discussion” (hyperlinks original). Uyvsdi disagrees, claiming that “this isn't a citation for an article; it's a discussion, so talking to highest [sic] elected official of the tribe in question is highly relevant.” The majority of editors involved in the discussion concur with Uyvsdi, and Phil’s “original research”[20] plays a role in the conclusion that they reach.

Today, the “no original research” guideline page explicitly states that the policy “does not apply to talk pages,” but this phrase has been added since the discussion took place. As they stood in June 2009, the guidelines were ambiguous about this issue. Like many disagreements on Wikipedia, the issue was left up to the interpretation and consensus of the users involved. The difference of opinion between the users described above can be understood as a difference of media ideology. The ideology enacted by Phil and Uyvsdi holds original research to be an important source of information, which happens to be inappropriate to cite on a content page. For them, the “no original research” policy is a prescriptive rule that applies in a specific media context, not a claim about knowledge at large. For Onopearls, however, original research is undesirable across contexts on Wikipedia, or possibly even a bad source of information in general. The change between 2009’s version of the policy and today’s, reflecting the former ideology, could be interpreted as a result of an intertextual consensus deliberation between (and about) talk and content pages. At the time of the tribal name discussion, such an analysis might argue, the website was smaller and newer, and the media ideologies of its contributors had had less time to coalesce. Through discussions and conflicts like the one described above, occurring within and between various sites across the website, Wikipedians approached a consensus as the registers characteristic of talk and content pages became more clearly defined with respect to one another. Eventually, this consensus was codified, and what was once a difference in personal ideology became instead a matter of policy. (The NPOV debate discussed above might exemplify a similar process at an earlier stage, since no consensus seems to have been reached yet.)

Policies can be understood as ideologies which have been codified and given some official status. In a strictly hierarchical context, this means that they are enactments of the ideologies of those with the most power, while in a more equitable or democratic medium, they would in theory reflect the ideologies which are most widely held. On Wikipedia, which, according to both expert observers and users themselves, is some unique kind of heterarchy comprising elements of both (Jemielniak 2014; WP:NOT), policy must be understood as a combination of the two. So, while the above narrative about the guidelines’ origins is not false, it is not complete either. The process of consensus which has led to both written policies and widespread (but uncodified) beliefs and practices across Wikipedia has naturalized particular ideological orientations. Like all ideologies, these do not treat every person or topic equally.

In this case study, we have seen how this issue manifests in the case of Indigenous people. The political realities of Indigeneity are mischaracterized and unaccounted for not only by Wikipedia’s policies and guidelines (which we have seen can be changed, at least in certain cases), but also by pervasive ideologies, of which the policies are only the most visible exponents. Views or critiques (both on Wikipedia and in the academy) which reify or take for granted the category of “systemic bias,” let alone the Wikimedia Foundation’s proposed solution of “knowledge equity” (see Bjork-James 2021), do not fully account for this phenomenon. Many of the problems identified as “bias,” such as disproportionate focus on certain article topics and the exclusion of minoritized editors, in fact emerge from, and are reproduced by, the discursive enactment of ideology in the practices of consensus and citation. Proposed solutions to these issues that fall under the “knowledge equity” umbrella, such as edit-a-thons and other attempts to reduce these gaps, do not address the root of the problem. Of course, affecting the ideological orientations of thousands of Wikipedia editors is a daunting task, and I do not have specific recommendations for how it might be achieved (though I believe that groups like the aforementioned WikiProject Indigenous peoples of North America are doing effective work in this area; see Pharao Hansen (2016) and Bjork-James (2021) for more suggestions). As I have shown, Wikipedia has proven itself to be at once a milieu in which novel ideologies frequently take shape and proliferate, and a place where preexisting ideas often go unquestioned. Unless the latter tendency wins out over the former, changes like this one might just be possible.

Conclusion

The invocations and uptakes of the Talk Page Talk register in the sites I have examined are microcosms of larger discursive phenomena on Wikipedia. While a complete sociohistorical account of Talk Page Talk is outside the scope of this paper, the synchronic effects of its use are apparent. Users are encouraged to attend to particular sorts of tasks and questions while neglecting others, and taught technical knowledge of Wikipedia nearly coterminously with a register that indexes such knowledge. To be clear, my claim is not a Whorfian one. Talk Page Talk does not simply encode or determine the knowledge and skills necessary to be a competent Wikipedia editor. Rather, linguistic practice, community, and technical proficiency (as well as editors’ ideological orientations towards all three of these factors) are continually and dialectically co-constructed on Wikipedia in a way that makes them hard to separate causally.

Underlying this process is a body of knowledge, including both “reliable” outside sources and prior internal consensus, which is itself rarely contested because it forms the very hegemonic ideological orientations upon which most contestation among Wikipedians is based. Though it can be understood to some extent as a “community of dissensus,” the Wikipedia community is also constructed through social differentiation that entails boundary-making and ultimately the exclusion of certain people, ideas, and practices which are understood not to presuppose these “black boxes.” These boundaries are constructed discursively as new consensus is established, often through citation and other indexical work. It is through this process that a community capable of maintaining and expanding the website is refined and reproduced. Yet, the very same phenomena give rise to what both insiders and critics consider to be Wikipedia’s largest problems. The consensus process is certainly vulnerable to capture by bad actors, but even absent ill intent, it unavoidably relies on unconsidered and pervasive ideological assumptions which, as I have shown, can be at odds with Wikipedians’ own goals of neutrality and reliability. When coupled with the practices of differentiation and exclusion described above, these assumptions likely play a role in the continued marginalization of minoritized groups and individuals on the website as well.

What bearing does this analysis have on the average reader’s experience of Wikipedia? One should remember that behind nearly every Wikipedia article is some sort of negotiation between deference to “reliable,” black-boxed epistemic authority on the one hand, and various individual editors’ ideological attitudes, predispositions, and goals on the other. While the results of this can be apparent in the text itself, the best way to understand the ideological work that has gone into the making of an article is to simply read the talk page and edit logs. While these pages are, perhaps intentionally, less accessible to the general public than the articles (constituting their own “black box” from a reader’s perspective), there is no substitute for delving into this “tidily organized palimpsest” (Doctorow 2006) if one wishes to critically engage with information on Wikipedia.

Sources Cited

Agha, Asif. 2005. “Voice, Footing, Enregisterment.” Journal of Linguistic Anthropology 15, no. 1: 38–59. [1].

Agha, Asif. 2007. Language and Social Relations. Cambridge: Cambridge University Press.

Barker, Joanne. 2005. Sovereignty Matters: Locations of Contestation and Possibility in Indigenous Struggles for Self-Determination. Lincoln: University of Nebraska Press.

Bjork-James, Carwil. 2020. “New maps for an inclusive Wikipedia: decolonial scholarship and strategies to counter systemic bias.” New Review of Hypermedia and Multimedia: 1-22.

Cunningham, Ward. 2014. “Wiki History.” WikiWikiWeb. [2].

Daston, Lorraine, and Peter Galison. 1992. “The Image of Objectivity.” Representations, no. 40: 81–128. [3].

Doctorow, Cory. 2006. “On ‘Digital Maoism: The Hazards of the New Online Collectivism’ by Jaron Lanier.” Edge: The Reality Club. [4].

Encyclopædia Britannica, Inc. 2006. “Fatally Flawed: Refuting the recent study on encyclopedic accuracy by the journal Nature.”

Gal, Susan, and Judith T. Irvine. 2019. Signs of Difference: Language and Ideology in Social Life. Cambridge: Cambridge University Press.

Garfinkel, Harold. 1967. Studies in Ethnomethodology. Englewood Cliffs, N.J.: Prentice-Hall.

Gershon, Ilana. 2010. Breakup 2.0: Disconnecting over New Media. Ithaca, NY: Cornell University Press.

Gieck, Robin, Hanna-Mari Kinnunen, Yuanyuan Li, Mohsen Moghaddam, Franziska Pradel, Peter A. Gloor, Maria Paasivaara, and Matthäus P. Zylka. 2016. "Cultural differences in the understanding of history on Wikipedia." In Designing Networks for Innovation and Improvisation, pp. 3-12.

Giles, J. 2005. “Internet encyclopaedias go head to head.” Nature 438, pp. 900–901. [5].

Gloor, Peter, Patrick De Boer, Wei Lo, Stefan Wagner, Keiichi Nemoto, and Hauke Fuehres. 2015. "Cultural anthropology through the lens of Wikipedia-A comparison of historical leadership networks in the English, Chinese, Japanese and German Wikipedia." arXiv preprint arXiv:1502.05256.

Goffman, Erving. 1959. The Presentation of Self in Everyday Life. Garden City, NY: Doubleday.

Goffman, Erving. [1979] 1981. “Footing.” In Forms of Talk, pp. 124–57. University of Pennsylvania Press.

Graeber, David. 2004. Fragments of an Anarchist Anthropology. Chicago: Prickly Paradigm Press.

Graeber, David. 2016. The Utopia of Rules: On Technology, Stupidity, and the Secret Joys of Bureaucracy. Brooklyn, NY: Melville House Publishing.

Hill, Jane H. 1998. “Language, Race, and White Public Space.” American Anthropologist 100, no. 3: 680–89. [6].

Jemielniak, Dariusz. 2014. Common Knowledge?: An Ethnography of Wikipedia. Stanford University Press.

Latour, Bruno. 1987. Science in Action: How to Follow Scientists and Engineers through Society. Cambridge, MA: Harvard University Press.

Lave, Jean, and Etienne Wenger. 1991. Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press.

Mesgari, Mostafa, Chitu Okoli, Mohamad Mehdi, Finn Nielsen, and Arto Lanamäki. 2015. “‘The Sum of All Human Knowledge’: A Systematic Review of Scholarly Research on the Content of Wikipedia.” Journal of the Association for Information Science and Technology 66. doi:10.1002/asi.23172.

Nakassis, Constantine. 2016. Doing Style: Youth and Mass Mediation in South India. Chicago: University of Chicago Press.

Parker, Jeff. 2001. “A Poetics of the Link.” Electronic Book Review. [7].

Pharao Hansen, Magnus. 2016. "Writing Irataba: On Representing Native Americans on Wikipedia." American Anthropologist 118, no. 3: 541-553.

Silverstein, Michael. 1979. “Language structure and linguistic ideology.” In The Elements: A Parasession on Linguistic Units and Levels. Edited by Paul Clyne, William F. Hanks, and Carol L. Hofbauer. Chicago: Chicago Linguistic Society.

Silverstein, Michael. 1996. “Encountering Language and Languages of Encounter in North American Ethnohistory.” Journal of Linguistic Anthropology 6, no. 2: 126–44. [8].

Silverstein, Michael. 2003. “Indexical Order and the Dialectics of Sociolinguistic Life.” Language & Communication 23: 193–229. [9].

Woolard, Kathryn A. 1985. “Language Variation and Cultural Hegemony: Towards an Integration of Sociolinguistic and Social Theory.” American Ethnologist 12: 738–748.

Woolard, Kathryn A. 1998. “Language Ideology as a Field of Inquiry.” Introduction. In Language Ideologies: Practice and Theory, pp. 3–20. New York, New York: Oxford University Press.

  1. ^ See this source also for a comparative account of the English and Polish Wikipedias, and Gloor et al. (2015) for a study that looks at the English, Mandarin, Japanese, and German Wikipedias side by side.
  2. ^ Guidelines and policies are technically two separate categories on Wikipedia, where policies are generally understood to be more fundamental or big-picture, or to reflect a more robust consensus among the community. However, one page entitled “The difference between policies, guidelines and essays” admits that the distinction is “obscure” and not agreed upon by the community. I use the two terms interchangeably.
  3. ^ Aside from its importance to the basic “mechanical” workings of the internet, hypertext has a long history of literary and poetic usage (Parker 2001), showcasing its potential for indexical value and inclusion in the repertoires of register phenomena.
  4. ^ Certain cultural touchstones are quite literally reliant on Wikipedia’s norms of hypertext usage. These include shared experiences like the “Wikipedia rabbithole” (a long Wikipedia-reading session in which one learns about a paradigmatically obscure topic, where clicking links to further articles on the subject is figured as “going deeper”) as well as popular games like the Wiki Game (in which players “race” from one article to the next, navigating only by clicking hyperlinks).
  5. ^ These sources largely corroborate my own findings, suggesting that certain prevailing ideologies among Wikipedians undervalue or erase individuals, groups, topics, and forms of knowledge which appear to deviate from these unmarked categories.
  6. ^ Another fact of note is that this judgment about peer review holds across disciplinary lines, meaning that scholarly work in all disciplines (including the “hard” and social sciences as well as the humanities) is theoretically treated the same on Wikipedia.
  7. ^ In fact, this explicit rejection of “truth” means that the word is absent from the repertoire of Wikipedia’s various technical registers, making its usage in many contexts an index of outsider status and/or non-adherence to the ideology of verifiability.
  8. ^ The presence of new users should not be underestimated. A significant portion of activity on Wikipedia at any given time involves somewhat or completely inexperienced users. Fifty percent of registered users have only ever made a single edit, and 95 percent have made ten or fewer (WP:Wikipedians). (Additionally, a large number of edits are made by unregistered users, since it is possible to contribute without making an account at all.) Some of these inexperienced editors are certainly trolls or vandals, while others might join just to correct a single error, and yet others might intend to learn the metaphorical ropes and become active and knowledgeable users. New users of all sorts can be found on nearly every edit log or active talk page, and form an essential part of the Wikipedia ecosystem. New users (both as figures of personhood and actual actants) are central to two of my three case studies.
  9. ^ See Graeber (2004) for a general account. The features of consensus on Wikipedia that I focus on (such as the effects of prior consensus decisions, the minutiae of interactional semiosis, and specific manifestations of power) are particular to Wikipedia rather than common or universal characteristics of the type Graeber identifies.
  10. ^ The word wiki originates with WikiWikiWeb, an early wiki-style website. Its name derives from the Hawaiian word wiki, meaning “quick” (Cunningham 2014).
  11. ^ Exactly what is meant by “formal English” is itself the subject of many guidelines.
  12. ^ A top-level response is a direct response, rather than one lower down in the thread. The help page about talk pages states that “[e]ach comment should be indented one more level than the comment it replies to, which may or may not be the preceding comment” (Help:Using talk pages § Indentation). In this example, Letmejustcorrectthatforyou’s comment is a reply to Catiline52, and Gråbergs Gråa Sång’s comment is a reply to Letmejustcorrectthatforyou. Notably, formatting this system of indentation must be done entirely by hand, in contrast to other online platforms like Twitter, Tumblr, or Reddit, which format similar comment threading systems automatically.
  13. ^ Interestingly, this guideline metaphorically extends the inherently metapragmatic verb “shout” to include the use of bold, all-caps, or large-font text.
  14. ^ Neither user makes their gender public in this discussion or elsewhere on Wikipedia. For clarity, I have opted to refer to Plantsurfer with the pronoun “she” and the anonymous user with “they.”
  15. ^ NPOV refers, of course, to the “neutral point of view.” It is interesting that the anonymous user, seemingly quite unfamiliar with norms of interaction on Wikipedia, knew this piece of jargon. I am unsure of the intent behind “4U.” I suspect it is intended as an abbreviation of “for you,” but I do not know what they meant by it.
  16. ^ Aside from Plantsurfer, the only other person to engage in this conflict was a user named Dennivitch, who left a short response to the anonymous user’s talk page comment, debating the accuracy of their usage of the word “fascist” (Talk:Genetic engineering).
  17. ^ The nation in question, whose legal name is the Cherokee Nation, is the largest of three federally recognized Cherokee tribes, the other two being the Eastern Band of Cherokee Indians and the United Keetoowah Band of Cherokee Indians. This official name is the only one used by tribal and federal government authorities. According to Philkon, the same name is commonly used by the tribe’s citizens and most other people, while its political detractors are most likely to refer to it as the Cherokee Nation of Oklahoma (Konstantin, personal correspondence, 2020).
  18. ^ Philkon is Phil Konstantin, with whom I have had personal correspondence.
  19. ^ Users in the tribal name discussion do not always provide such hyperlinks, perhaps because they were a less common part of the Talk Page Talk register in 2009, or because they were not all aware of the convention.
  20. ^ While it is not discussed explicitly, the question of exactly why Phil’s claim even counts as “original research” highlights an important nuance of Wikipedia policy. This has to do with what counts as “research” for Wikipedians. The phrase “original research” is meant to contrast with “reliable, published sources.” Phil’s reference to Principal Chief Smith is certainly not “original” in the sense that it defers to someone who could be considered an authority. However, the fact that he cites private correspondence which is not published or public (and thus, not verifiable) rather than, say, a press release, makes it “original” in the sense which is relevant on Wikipedia.