User:FactOrOpinion/Draft SPS RfC

From Wikipedia, the free encyclopedia

This RfC is to determine the consensus about (1) whether the current explanation of "self-published" in WP:SPS generally serves us well, perhaps with small improvements, or whether it should be revised in some significant way, and (2) how editors interpret "self-published," in order to help us revise the explanation if needed.

[place signature here]

Note: this RfC is solely about WP's interpretation of "self-published." It is not trying to assess whether a source is reliable, independent, primary, biased, etc., or whether its use is due or needs to be attributed, as these aspects are distinct from whether the source is self-published.

RFCBEFORE discussions took place here (a disagreement about whether material published by GLAAD is self-published), here (a more general discussion of what "self-published" means), and here (an RfC: "Should grey literature from advocacy groups and other similar orgs always be considered WP:SPS and therefore subject to WP:BLPSPS?"). However, disagreements about the interpretation of "self-published" go back much further than the RFCBEFORE discussions; these examples from 2020 (here, here and here) and 2021 circle around many of the same issues. Notes from previous discussions (below) is an attempt to summarize key issues raised in one or more of these discussions.

RfC questions

WP:SPS explains the meaning of "self-published" with text in the body, supplemented by text in a longer footnote. The explanation as a whole consists of a link to the mainspace article on self-publishing, multiple examples, the statement "Self-published material is characterized by the lack of independent reviewers (those without a conflict of interest) validating the reliability of the content," and three quotes mentioning self-published material.

Full text of WP:SPS's explanation of "self-published"

Body:

Anyone can create a personal web page, self-publish a book, or claim to be an expert. That is why self-published material such as books, patents, newsletters, personal websites, open wikis, personal or group blogs (as distinguished from newsblogs, above), content farms, podcasts, Internet forum postings, and social media postings are largely not acceptable as sources.

Footnote:

Self-published material is characterized by the lack of independent reviewers (those without a conflict of interest) validating the reliability of the content. Further examples of self-published sources include press releases, the material contained within company websites, advertising campaigns, material published in media by the owner(s)/publisher(s) of the media group, self-released music albums, and electoral manifestos:

  • The University of California, Berkeley, library states: "Most pages found in general search engines for the web are self-published or published by businesses small and large with motives to get you to buy something or believe a point of view. Even within university and library web sites, there can be many pages that the institution does not try to oversee."
  • Princeton University offers this understanding in its publication, Academic Integrity at Princeton (2011): "Unlike most books and journal articles, which undergo strict editorial review before publication, much of the information on the Web is self-published. To be sure, there are many websites in which you can have confidence: mainstream newspapers, refereed electronic journals, and university, library, and government collections of data. But for vast amounts of Web-based information, no impartial reviewers have evaluated the accuracy or fairness of such material before it's made instantly available across the globe."
  • The "College of St. Catherine Libraries Guide to Chicago Manual of Style" (DEKloiber, December 1, 2003) states, "Any site that does not have a specific publisher or sponsoring body should be treated as unpublished or self-published work."

Question 1: Consider issues such as whether the characterization of "self-published material" is apt, whether the explanation of "self-published" (link + examples + characterization + quotes) reflects consensus practice, whether the explanation provides sufficient guidance, and whether editors agree on how to interpret it. Which option best represents your view?

a) The explanation might benefit from small improvements, but it serves us well and we should keep it.
b) The explanation is problematic in some significant way(s), and we should figure out how to revise it.

If your answer is (a), propose small improvements if you want. If your answer is (b), please identify the main problem(s).

Question 2: The previous discussions show consensus that some classes of publications are self-published and other classes of publications are not self-published. But for a sizeable swath of publications, consensus is unclear. Options a-c describe three views from the previous discussions. Which view best captures the kinds of sources that you'd say are/aren't self-published? If an option represents your view pretty well but not exactly, just say how you'd modify it:

a) Self-published sources are those where there is no barrier to one or a few people (not organizations) publishing what they want, perhaps by paying some entity to publish, print, or host it. Examples include open wikis, internet forum posts, personal websites, music released by its creator(s), and preprints. Someone other than the writer/creator(s) may provide feedback or editing (e.g., an author hires an editor), but this other person cannot block publication. Everything else — including material published by diverse organizations — is not self-published.
b) No barrier materials are self-published. Sources are also self-published if they're published by an organization and the content is about the organization itself (e.g., "About us" text, an annual investor report, marketing material), even if these have been reviewed by someone who could have blocked publication. Everything else is not self-published. (Note: the fraction of an organization's publications that are about the organization itself can vary a lot from one organization to another.)
c) Material from "traditional" publishers (e.g., newspapers, books from a standard publishing company, peer-reviewed journals) is not self-published unless it's about the organization itself. Everything else is self-published, including material published by other kinds of organizations and any no barrier materials hosted by the traditional publisher (e.g., reader comments on a news article).
d) None of the above. Please describe your view, aiming for a description such that most of the time, other editors would say that it provides effective guidance for determining whether a given source is or isn't self-published.

Note: If the meanings of "no barrier" materials, "organization itself" materials, and "traditional" publishers aren't clear enough, there is more info in the Notes from previous discussions (in the sections titled Categories of publishers, General areas of consensus, and Areas where consensus is unclear). The Table below also provides illustrative examples.

Since there are two questions, your response might look like one of these examples:

  • 1a, 2b: I think things are pretty good right now and that a conflict of interest always exists if an employee is writing about their employer.
  • 1b, 2c: I think the existing explanation is confusing. The Collins Dictionary is right, and most of the time, unless you're using a publishing company, the material is self-published. Government publications are a gray area.

Additional information

Table illustrating differences in how the current explanation and options 2a-c categorize example sources as SPS or not

Table

| Type of source | Current explanation | Option 2a | Option 2b | Option 2c |
|---|---|---|---|---|
| Personal social media post | SPS | SPS | SPS | SPS |
| Co-authored vanity press book | SPS | SPS | SPS | SPS |
| Government hearing transcript | SPS | maybe SPS (speakers=authors, government=publisher or only printer?) | maybe SPS (might also depend on the topic) | SPS |
| Foundation's webpage about its grant to an artist, plus artist's biographical info | | not SPS | a mix | SPS |
| Politician's campaign material | | not SPS | SPS | SPS |
| Coca-cola.com corporate website | | not SPS | SPS | SPS |
| Business press release distributed through PRWeb | | not SPS | SPS | SPS |
| Advocacy non-profit's "About us" info | | not SPS | SPS | SPS |
| Advocacy non-profit's report on an anti-LGBTQ+ activist | | not SPS | not SPS | SPS |
| Free university webinar about AI | | not SPS | not SPS | SPS |
| University department's faculty listing | | not SPS | SPS | SPS |
| National government publication re: its own defense capabilities | | not SPS | SPS | SPS |
| CIA World Factbook | | not SPS | not SPS | SPS |
| UN Intergovernmental Panel on Climate Change report on mitigation | | not SPS | not SPS | SPS |
| Job openings at the New York Times, on NYT website | | not SPS | SPS | SPS |
| Ofsted annual report about a local school (such reports are required by UK law) | | not SPS | not SPS | may not be SPS (publishing arm of the government?) |
| Learned society's membership info | | not SPS | SPS | SPS |
| Learned society's peer-reviewed journal | not SPS | not SPS | not SPS | not SPS |
| New York Times news article | not SPS | not SPS | not SPS | not SPS |
| Breitbart News article (Breitbart is blacklisted) | may be SPS, do their editors validate reliability? | not SPS | not SPS | not SPS |
| Live on scene (unscripted) TV news | probably SPS, as an editor can't check in advance | not SPS | not SPS | not SPS |
| Music album released by Sony | characterization doesn't apply | not SPS | not SPS | not SPS |
| Book released through a small press | would depend on the genre (fiction, memoir, poetry, etc.) | not SPS | not SPS | not SPS |
  • The table shows that for some examples, all of the options classify the example as SPS or all of the options classify the example as non-SPS. These are examples of classes where editors' consensus is clear. The table also shows that for some examples, options vary in whether they classify the example as SPS or not. These are examples from the swath of publications where consensus is unclear.
  • Several of the cells for the current explanation are blank, as I've heard different editors express different opinions (for example, here are a number of quotes that WhatamIdoing collected from the WT:V archives where experienced editors stated contradictory views about whether Coca-Cola.com is self-published). Come to your own conclusion about each of those cells: SPS, not SPS, or perhaps "the example's description doesn't provide enough information to know." You may want to say something about your conclusions in your !vote, especially if you chose answer 1a.
Notes from previous discussions

Notes

Sorry if this feels too long to read (though it's a lot shorter than reading the preceding discussions!). People raised lots of issues, and this is my imperfect attempt to capture the most salient. I've tried to remain neutral in the sense of including people's varied perspectives; however, specific views below may not be neutral, as people sometimes had strong views.

Categories of publishers

Some editors distinguished among different categories of publishers:

  • Natural persons (humans, as contrasted with organizations).
  • Organizations such as newspaper and magazine publishers, television broadcasters, non-vanity book publishers, publishers of peer-reviewed journals, record labels representing lots of artists. Some people call these "traditional" publishers, characterizing them as being in the business of publishing.
  • Organizations such as advocacy groups, universities, learned societies, think tanks, corporations, international non-governmental organizations, intergovernmental bodies, museums, foundations, charities, labor unions, and political campaigns. Some people call these "non-traditional" publishers. While they may publish quite a bit in the context of their main mission, they wouldn't describe themselves as primarily being in the business of publishing.
  • Governments might be sui generis. They have huge variations in size, differ both across and within countries, and they publish quite diverse types of materials.

Depending on how you interpret "self-published," a single publisher might publish a mix of self-published and non-self-published material. You might also conclude that some publishers have an arm that functions like a "traditional" publisher and another arm that doesn't (e.g., a government's publishing office versus its defense department, a professional society's peer-reviewed journal versus its advocacy arm).

General areas of consensus

There seems to be consensus about the self-publishing status of some kinds of publications:

  • Materials like the following are self-published: personal websites, personal or group blogs (as distinguished from newsblogs), social media posts, wikis, preprints, reader comments on websites, music/games released under the creator's own label, internet forum posts, vanity press books, patents, unscripted podcasts published by the podcaster, individual Substacks, Forbes.com "contributors" material, Kindle Direct Publishing books, user reviews, a paid promo in a newspaper, and personal YouTube videos. There is no barrier to the creator(s) publishing — or paying someone else to publish, print or host — what they want, even if it's sometimes removed after the fact via post-publication moderation (e.g., a tweet that's removed for violating X's terms of service). Sometimes the author is one person, and other times, two or more people are authors (e.g., co-authored research), but corporate authors aren't included. In most cases, there is no editor, but if there is an editor, the editor cannot prevent publication. (In the RfC, I called these "no barrier" materials for ease of reference, though that term wasn't used in previous discussions.)
  • Most material from "traditional" publishers is not self-published, though there might be a few exceptions (see below). These publishers sometimes host no barrier materials (e.g., a newspaper article is not self-published, but reader comments on the article are, even if they are moderated).

Areas where consensus is unclear

Previous discussions have not resolved whether the following kinds of material are always/sometimes/never self-published, and if it's "sometimes," what features distinguish the self-published materials from the non-self-published ones:

  • Material from non-traditional publishers and governments.
  • Material where the content is about the organization itself, even if it's edited by an employee who can block publication, and even if the organization mostly publishes non-self-published material. Examples include marketing material for the organization's products (including advertisements, where a newspaper or TV station effectively serves as a vanity press), press releases, political campaign material, annual investor reports, "About us" text, advertising rate info, information about employees, information about how to exchange a product, a government's explanation of how to use its services, and information about employment with the organization. (In the RfC, I've called these "organization itself" materials for ease of reference, though that term wasn't used in previous discussions.)
  • Material published by a "traditional" publishing company, but written by someone with control over publication (e.g., the company's owner, the journal's editor).

Words with multiple interpretations, and dictionary definitions of "self-published"

WP:V states that "Source material must be published, on Wikipedia meaning made available to the public in some form," with a footnote adding "This includes material such as documents in publicly accessible archives as well as inscriptions in plain sight, e.g. tombstones." In contrast, The Chicago Manual of Style considers some documents in public archives to be unpublished. "Publisher" can mean "any entity that publishes," or instead be limited to "an organization in the business of publishing." Some editors use "publisher" when referring to a printer (e.g., of a dissertation) or a host/platform (e.g., a social media site, Kindle Direct Publishing); other editors say that "publisher" is distinct from "printer" and "host/platform." The word "author" can also be used in different ways. "Author" might mean "the human being(s) who created the work," or instead be used in a way that includes corporate authors. For material published by an organization, someone's interpretation of "author" may depend on whether the person who wrote it is named. Thus, in a discussion, the intended meaning of a word may be ambiguous, and participants' interpretations may differ.

Dictionary definitions of "self-publish(ed)" include:

  • "issued directly to the public by the author rather than through a publishing company" (Collins)
  • "to publish (a book) using the author's own resources" (Merriam-Webster)
  • "to arrange and pay for your own book to be published, rather than having it done by a publisher" (Cambridge)
  • "to publish or issue (one's own book or other material) independent of an established publishing house" (Dictionary.com)
  • "publish (a piece of one's work) independently and at one's own expense" (Oxford American)
  • "publish by oneself or with one's own money" (American Heritage)
  • "That is or has been published by oneself; chiefly spec. (of a book or other work) prepared and issued for distribution or sale by the author" (Oxford English)

In the definitions that use "author," it's ambiguous whether it's meant to include corporate authors or only natural persons. Some definitions highlight (1) whether the author pays for the work's publication, some highlight (2) whether the author uses a "publishing company" or "established publishing house," and some highlight both. Although (1) and (2) intersect, they're not the same; for example, if material is written by an employee and published by the employer, the material is not self-published according to the first (unless you treat the employer as a corporate author), but may be self-published according to the second. Self-published material need not involve a cost, as with social media or wikis.

Other considerations

In reasoning about what is or should be considered self-published, people drew on diverse considerations, and a single person's reasoning often involved several considerations. Below are additional facts/opinions/questions that various people introduced. A single paragraph may include contradictory claims from different people:

a) Overview / use: The meaning of "self-published" has significant implications for which sources can be used for WP content, especially for content about living persons. BLP content is currently sourced to materials that some people consider self-published (e.g., material published by advocacy groups, universities, and governments); maybe some editors are misinterpreting WP:SPS, or maybe there's a gap in the policy. At times, editors' assessments of whether a specific source is or isn't self-published seem to be based on whether they do or don't want the source/POV to appear in an article, an appeal to consequences. This seems especially likely to occur with contentious topics (e.g., gender, politics). The many debates about what is/isn't self-published show that the current explanation doesn't work well enough. A clearer explanation would help reduce the time and energy spent in these disputes, and would help us determine whether some article content needs to be removed as a BLPSPS violation, or whether a source that some people thought was excluded under BLPSPS can actually be used. Some worry that narrowing the interpretation of SPS would create a BLP "minefield." Although Wikipedia is not a bureaucracy (WP:NOTBURO), many editors quote WP:SPS and WP:USINGSPS when debating whether a given source is/isn't SPS, and new(ish) editors also turn to the policy and the essay to figure out what they're supposed to do, so we want these texts to be clear and to represent consensus.
b) The explanation as a whole: The characterizations of "self-published" in the WP:SPS footnote and WP:USINGSPS differ. Some would like the essay's characterization to replace the one in WP:SPS, and others disagree. The WP:SPS characterization is overly broad, and some (or many) things are characterized as self-published when they should instead be characterized as non-self-published. Alternatively, the WP:SPS characterization is overly narrow, and some (or many) things are characterized as non-self-published when they should instead be characterized as self-published. We should use a dictionary definition, not "wikijargon." Dictionary definitions are easy to apply to some kinds of sources (e.g., books), but WP editors use many kinds of sources, and it may be unclear how a dictionary definition would categorize these other kinds. Outside of WP, people often use "self-published" only for no barrier materials.
c) Reviewer: The footnote characterization refers to a reviewer. Depending on the source, it may be hard to know whether material is reviewed by someone, and if so, whether that reviewer is in a position to block publication. Some think that an organization can be assumed to have a sufficient review process based on features such as size and positive reputation. Others think that whether an organization has a sufficient review process cannot be assumed, and it has to be demonstrated with an explicit editorial structure.
d) Conflict of interest: The footnote characterization refers to conflict of interest. COI is distinct from bias. How do we assess whether a conflict of interest exists? Is one of the interests always "reliability" (which WP never actually defines, though it is linked several times to a "reputation for fact-checking and accuracy"), and if so, what is the other interest that might or might not be in conflict? For example, is it the interests of the reviewer's employer, and if so, how do we determine what those are? Does a reviewer always have a COI when checking content about the reviewer's employer, but seldom otherwise? Is there always a COI if the author and reviewer both get paid by the same entity? If "conflict of interest" remains in the characterization, should it be linked to the mainspace COI article?
e) Reliability: The footnote characterization refers to validating the reliability of content. It may be hard to know whether a reviewer is assessing the reliability of the material; a reviewer might instead only be checking things like grammar and organization. The reliability of a source depends on what WP statement you want to source to it. Whether self-published material is likely to be more reliable than non-self-published material depends in part on one's interpretation of "self-published." Even if self-published sources are less reliable on average, the characterization conflates "self-published" and "reliable," when a source might be one, the other, both, or neither. Policies highlight the presumed overlap of self-published status and non-reliability in several ways. For example, this is why most SPS cannot be used as sources, and why the EXPERTSPS and ABOUTSELF exceptions exist. WP:SPS appears in a section titled "Sources that are usually not reliable," the current characterization refers to the lack of an independent editor "validating the reliability of the content," and an early ArbCom conclusion said "A self-published source is a published source that has not been subject to any form of independent fact-checking ..." (A bit of history: the text from that ArbCom quote was introduced into WP:RS in 2006, and although there was text in WP:V about self-published sources at that point, the explanation of "self-published" was limited to examples, where the examples were all no barrier materials. The first text about SPSs was introduced into WP:V earlier in 2006. There was no equivalent to BLPSPS. The first text about SPSs was introduced into WP:BLP in late 2005, and there too, the examples were limited to no barrier materials. The current WP:SPS footnote (characterization + examples + quotes) was introduced in 2011. It was initially a footnote and only became a reference note in 2023.)
f) More on reliability: Sources might be creative work (e.g., music, games, fictional books/TV shows/movies, poetry). Most of the time, they're probably used as sources for statements about their own content or structure, so it's often not critical to assess whether they are or are not self-published; even if they're considered self-published, the ways they're used on WP would often fall under ABOUTSELF. Still, the current characterization doesn't work for them, as they're generally not sources in which reliability would/could be assessed; thinking about what would lead you to say that a creative work is/isn't self-published may be helpful in thinking about how you interpret "self-published" more generally. It's also unclear what it means for a reviewer to validate the reliability of things like opinion pieces and interviews.
g) Other features: In assessing whether something is self-published, some people consider who is responsible for distribution and marketing, and who is responsible for legal matters such as copyright, liability, licensing, and contracts, though the legal responsibilities might vary by country.
h) Examples: The examples in WP:SPS would provide better guidance if some of them were removed (i.e., they're not examples of self-published material) and/or if some other examples were added (e.g., examples of material that isn't self-published, examples that better illustrate where the border is for self-published / not-self-published). Maybe we shouldn't give a characterization of the sort that appears in the footnote, and we should focus on giving lots of examples: adding some examples of non-self-published sources (identifying them as such), and adding some self-published examples that are less obvious.

Other things people mentioned, not about the characterization or examples of "self-published" per se:

  • The footnote includes a few quotes from sources, and depending on the RfC results, it may be time to update those.
  • About 10–15% of the sources listed in WP:RSP are currently identified as "self-published." Depending on the RfC's results, more or fewer RSP entries might be correctly described as self-published, so we may want to discuss the description of some individual entries later.
  • It's OK to leave the source quotes in a reference note, but the characterization should be moved into the body of the WP:SPS section.
  • We might think about changes to the WP:SPS text that aren't about the explanation of "self-published" per se. For example, should the EXPERTSPS text be modified to allow a group to qualify as "expert" in its field if academic and/or mainstream sources regularly treat the group as having expertise? (That may already be consensus practice, or perhaps people don't consider these to be self-published.) Should the text say that WP content sourced to EXPERTSPSs should always be attributed?
  • WP:ABOUTSELF and WP:BLPSELFPUB allow the use of self-published material in some cases, but it's unclear what is meant by "third party" in point 2 (e.g., if one considers a university's website to be SPS, can it be used for information about a professor, or does BLPSELFPUB preclude that?).

This section is space for questions, in case people have any

Responses
