Wikipedia talk:Overcategorization/Archive 10


What constitutes overcategorization of a biography

I think the article on Eugen Relgis has way too many categories (over 70). While one doesn't wish to set hard and fast numerical rules, are there any statistics on articles by how many categories they have? (Something that one could look at to say that 70 is typical of biographies, or that it is perhaps a bit much.)

This document would be more helpful if it gave some sort of statistical view or some objective guidance to editors in what constitutes overcategorization. Thanks. Zodon (talk) 07:14, 28 April 2012 (UTC)

For instance, if it indicated median number of categories, and some percentile (e.g. 80th, 90th), or a bar chart of number of articles vs. number of categories, so one could tell what are unusual/outliers, and what are typical. Zodon (talk) 06:10, 30 April 2012 (UTC)
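As a rough illustration of the kind of summary requested above, here is a minimal Python sketch. It assumes that a list of per-article category counts has already been gathered somehow; the sample numbers are invented for illustration and are not real Wikipedia data, and the function names (percentile, summarize, is_outlier) are hypothetical. The percentile uses the simple nearest-rank method, which is only one of several common definitions.

```python
from statistics import median


def percentile(sorted_counts, p):
    """Nearest-rank percentile (p in 1..100) of an ascending list."""
    n = len(sorted_counts)
    # Integer ceiling of p * n / 100, clamped so the rank is at least 1.
    rank = max(1, (p * n + 99) // 100)
    return sorted_counts[rank - 1]


def summarize(category_counts):
    """Return the median and the 80th/90th percentiles of the counts."""
    counts = sorted(category_counts)
    return {
        "median": median(counts),
        "p80": percentile(counts, 80),
        "p90": percentile(counts, 90),
    }


def is_outlier(n_categories, summary):
    """Flag an article whose category count exceeds the 90th percentile."""
    return n_categories > summary["p90"]


# Hypothetical sample: category counts for ten biographies (invented data).
sample_counts = [3, 5, 8, 8, 10, 12, 15, 18, 22, 70]
summary = summarize(sample_counts)
print(summary)
print(is_outlier(70, summary))
```

With a summary like this, an editor could see at a glance whether a count such as 70 sits near the typical range or far outside it, without the guideline having to set a hard numerical limit.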

Noticeboard

Currently, category-related discussions tend to be spread out over the talkpages of WP:CFD; WP:CAT, WP:NCCAT; WP:CLS; and elsewhere. A while back, it seemed to me that having a category-related noticeboard might be nice, so I cobbled one together. Recently, some helpful person added a notice on WP:CAT about it. So at this point, I welcome others' thoughts on this. What do you think about it, and if positive, how and where do you think we should notify others of its existence? - jc37 06:57, 14 May 2012 (UTC)

Overcategorisation from orders, decorations and medals categories?

I've started a discussion at Wikipedia talk:Categorization#Overcategorisation from orders, decorations and medals categories?. --Paul_012 (talk) 17:48, 14 June 2012 (UTC)

American people by occupation by state

I am requesting comment on a proposal to clarify the scope of the American people by occupation by state category tree. I would highly appreciate comments or suggestions about how we could attempt to address the issues that exist. Thank you, -- Black Falcon (talk) 22:02, 25 July 2012 (UTC)

New RfC about Categorization of persons

Please see Wikipedia:Requests for comment/Categorization of persons: "Should we categorize people according to genetic and cultural heritage, faith, or sexual orientation? If so, what are our criteria for deciding an identity?" Thank you, IZAK (talk) 02:34, 30 August 2012 (UTC)

A tweak to SMALLCAT

WP:SMALLCAT should be tweaked, because a strict reading of the words "will never have more than a few members" makes the guideline useless in most cases, including the examples cited. The word "never" should be replaced with something less absolute, such as "improbable" or "unlikely".

The use of "never" means that the guideline doesn't even cover the existing examples, which have been in place since at least 2008. We can't say for certain that the number of Catalan-speaking countries will never grow; but we can say that it is highly unlikely. Similarly, it is theoretically possible that the two surviving Beatles could embark on a series of marriages followed by quickie divorces, so we can't say "never"; but it is highly unlikely that Beatles wives will expand.

I became aware of this in this current CfD, where an editor read the word "never" literally. I think that a literalist reading is inappropriate per WP:NOTBURO, but a rewording would avoid further confusion.

I propose changing "never" to "unlikely", and also to add some sort of time qualification. Something may happen in the future, but there is no point in having an underpopulated category for years because the situation may change in a few decades. For example, Category:Surgeons born in the 21st century would expand significantly from the 2030s onwards, but there'd be no point in retaining it until then just to hold one child prodigy.

My suggested rewording is:

"will never have more than a few members" → "are unlikely to have more than a few members within the next year".

I made the change, but will self-revert pending discussion. --BrownHairedGirl (talk) • (contribs) 09:46, 27 September 2012 (UTC)

As the literalist editor mentioned, I thought I should share my thoughts. I was working under the presumption that SMALLCAT was written the way it was written because it was meant to leave categories that, while currently containing only one or two entries, have the potential for expanding at some point in the near future. This is particularly true in the "People from (town, state)" category tree, within which I think the use of this guideline is misplaced. There is nothing inherently preventing a large number of people from a particular place from becoming notable in a short period of time, as there is in the provided examples on the guideline page, or the example provided by BrownHairedGirl. I'd suggest that said category tree falls into the specific exception listed within the guideline, being "part of a large overall accepted sub-categorization scheme".
The question at hand, really, is how we intend SMALLCAT to apply. Is it meant to apply broadly to any category that currently has minimal membership and whose potential to expand falls below some reasonable threshold; or is it meant to apply more narrowly, specifically to categories with some inherent factor preventing their expansion to larger numbers? This change would only be necessary if the guideline is meant to mean the former. For myself, I prefer the latter, but accept that I may be in the minority. -Dewelar (talk) 15:33, 27 September 2012 (UTC)

Decent points. I changed the text from "will never" to "can not". This should leave the question of time (now or in the future) indeterminate, which was/is, I believe, the intent, based upon MUCH prior examples at CfD. As always, I welcome further discussion on this though. - jc37 16:38, 27 September 2012 (UTC)

And I have changed to "do not"... I think this is best presented in the present tense. Things can always change in the future (While unlikely, Paul McCartney could suddenly join a polygamist sect, and marry 100 more women... which would change the acceptability of the exampled "Beatles wives" as category). The key is whether the population is currently too small and unlikely to grow. Blueboar (talk) 18:24, 27 September 2012 (UTC)
I didn't use "do not", because we often have a situation at CfD where there are other already existing pages which may be appropriate for the category, but are not currently in the cat. "do not" would leave that all open to wikilawyering.
With that in mind, I chose "can not", and think that should be restored, unless there are further concerns, or other better phrasing. - jc37 18:36, 27 September 2012 (UTC)
I concur with Jc37 that "can not" is the better phrasing. "Do not" is inherently about the present state. I have no problem having a category into which numerous articles yet to be written will eventually fit. "Can not" speaks to the feasibility of the category and its applicability. --Lquilter (talk) 20:19, 27 September 2012 (UTC)
OK... I have edited to: Avoid categories that, by their very definition, do not have more than a few members and are unlikely to grow in the near future, unless... This allows for categories that could grow in the near term (ie there are articles that exist or are likely to be written that could fit the category)... without setting some arbitrary time limit. I will stop being bold now, feel free to amend or revert. Blueboar (talk) 20:51, 27 September 2012 (UTC)
I initially removed "...in the near future..." since there is no deadline. But reading how it looks now, it occurred to me that we could switch it from a negative (have not more than) to a positive (only have). Though I do think that there may still be better ways to phrase this.- jc37 21:32, 27 September 2012 (UTC)

I have reverted the recent changes. Please can we discuss this and try to reach a consensus, rather than have editors imposing their own changes while discussions are underway? Thanks. --BrownHairedGirl (talk) • (contribs) 00:14, 28 September 2012 (UTC)

I thought we were above... but ok, what are your thoughts on what has been discussed so far? - jc37 00:19, 28 September 2012 (UTC)
Let me give an example of why I think we should include "unlikely to grow in the near future" ... take the potential category "African-American US Presidents"... at the moment, this potential category has a population of just one. Now, it is quite likely that there will be other African-American US Presidents at some point in the future... so the category is likely to grow... eventually.
However, given the length of Presidential terms of office and the possibility that future Presidents might serve two terms... it is going to take a LONG time before this potential category gets to a point where it would list even five articles (Even if every future President is African-American, it will take a minimum of 20 years before we get to this point... and, realistically, it is likely to take much longer).
So... both currently and for a while yet to come, we should avoid the category "African-American US Presidents". It would be a SMALLCAT. That does not mean we will never create this category... "there is no deadline"... which means we can periodically reassess the situation. We can always create the cat at some point in the future, when we have a useful number of articles (and potential articles) to populate it. But, for NOW, we should avoid the cat. Blueboar (talk) 13:56, 28 September 2012 (UTC)

Well, given the outcome of the deletion discussion above, and the comments here, I can only conclude that there is absolutely no rhyme or reason for the application of this guideline, at least in part because nobody can agree -- and only a brave few are even willing to discuss -- how it's meant to be applied. I'll keep that in mind going forward and just not bother to weigh in on this sort of thing in the future. It's not worth the energy. -Dewelar (talk) 21:21, 1 October 2012 (UTC)

The "Published list" (WP:OC#TOPTEN) section

In a recent CfD discussion, WP:OC#TOPTEN was (unsuccessfully) used to argue that a list of top-selling drugs was suitable as a WP category. TOPTEN has an example, but that category was deleted in 2006. Are there any categories that do list the "top #" of something (and are valid per WP:OC#ARBITRARY etc.)? (The closest I've found is Category:Number-one singles.) If not, then this section should be deleted - maybe with a bit of it and the shortcut merged into the ARBITRARY section. DexDor (talk) 22:04, 10 October 2012 (UTC) (updated - I should have read the lead of the page) DexDor (talk) 06:38, 11 October 2012 (UTC)

What constitutes a "Non-defining characteristic"

A recent CfD (See - Wikipedia:Categories_for_discussion/Log/2012_December_8#Category:Freemasons) centered on the issue of whether the category was or was not a "Non-defining characteristic". It was closed with no consensus. Given this lack of consensus... I think we need some discussion on what this guideline actually means when it talks about a "Non-defining characteristic"... what constitutes a "Non-defining characteristic" and what constitutes a "defining characteristic"? Can we come up with some examples of each, so editors can better determine which is which? Blueboar (talk) 13:52, 3 January 2013 (UTC)

The guidance does need clarification - I've been thinking about it since an earlier CFD. For categorization of individual people I don't think "defining" is a useful concept (except, possibly, as shorthand for some clear principles). I've put some notes describing WP categorization at User:DexDor/Categories. DexDor (talk) 06:46, 4 January 2013 (UTC)
  • Be careful when it comes to categories, as they work both ways - 1) Freemasonry may be a 'defining' characteristic of a person, but also, 2) the people are 'defining' of Freemasonry/lodges in a location. Ephebi (talk) 17:09, 6 January 2013 (UTC)
NOTE: Let's not make this a review of the Category:Freemasons discussion... that merely happens to be the CFD discussion that inspired me to ask the question. That particular CFD discussion ended in "no consensus", and I think the reason why that discussion did not come to a clear consensus is that the guideline does not really explain what a "non-defining characteristic" actually is. I think that lack of clarity needs to be fixed, and looking at a few examples will help us to fix it. Again, my goal is to have clearer guidance, not to re-argue a closed CFD. Blueboar (talk) 19:02, 6 January 2013 (UTC)
Anybody trying to categorize an article about a person should be looking at WP:COP rather than at WP:DEFINING. I propose we add the following to WP:DEFINING:
It can be difficult to apply the principle of "defining characteristics" to articles about individual people - more specific guidance about categorizing these articles can be found at Wikipedia:Categorization of people.
DexDor (talk) 22:47, 6 January 2013 (UTC)
Hmmm... I note that WP:COP says: "Categorize by those characteristics that make the person notable"... it uses the example of a film actor who holds a law degree, and says the person should be categorized as a film actor, but not as a lawyer unless his or her legal career was notable in its own right... I think this is hinting at what I have been saying ... and may help us to clarify non-defining characteristics with people. Let's say we were dealing with Category:Boy Scouts. Lots of people were Scouts, and so on its surface it would be a very large category... but very few people are notable because they were scouts. When you restrict the cat to the very small population of people where involvement in the Scouts was what made them notable it may become an overcategorization. Blueboar (talk) 23:19, 6 January 2013 (UTC)
Re: but very few people are notable because they were scouts. The other side of (non-) definition is whether scouting is notable because these people were members... particularly so if scouting was important to their development, then they should be categorised as Category:Boy Scouts. Remember these categories work both ways. Ephebi (talk) 11:31, 8 January 2013 (UTC)
By that argument, the example in WP:COP is wrong... the film actor should be categorized with Category:Lawyers, since lawyering is made notable because there are famous film actors who are lawyers. Sorry, I just don't buy it. Blueboar (talk) 16:49, 8 January 2013 (UTC)
If I can be allowed to step back from examples and attempt to define terms, I think there are two interrelated issues here:
  1. The definition of what constitutes a defining characteristic for the purposes of categorisation. To use the term from this page, the standards of "definingness". (I believe the technical term here is wikt:definitionality, but that may be too technical to be used on the page.)
  2. The way in which those standards should be applied to the categorisation of people.
In theory, point 2 should be entirely contingent on point 1. In practice, the categorisation of people has led to some truly vehement edit warring, so consensus implies a need for its own strong set of standards. For my part, biographical articles are some of the most obviously overcategorised articles we have; a referable standard here would probably greatly help.
Our present standards are rather circular. WP:DEFINING lists a number of rules of thumb, but ultimately says that disputed cases should be referred to WP:CFD. For our purposes, it would be helpful if the page itself gave a more extensive definition.
There are two key approaches to categorisation that are relevant to our definition:
  • Bottom up: what categories would article Foo fall under? (Is the Category/Characteristic:Bar defining of Foo?)
  • Top down: what articles and sub-categories would fall within Category:Bar? (What defines the limit of Barness? Should Foo be included in, or excluded from, the category?)
Categorisation work tends to use both top down and bottom up approaches, depending on circumstances and editor preference (e.g. much individual categorisation work is bottom up, though some editors approach whole categories at once; whereas Cfd usually operates top down, with bottom up questions sometimes being raised). Our definition of wikt:definitionality ought to encompass both approaches. We need our definition to operate as the glue which allows these approaches to meet in the middle. --Andrewaskew (talk) 03:08, 9 January 2013 (UTC)
(Replied to my own point to separate out issues). My own stab at the beginnings of a definition.
Definitionality is the degree to which a characteristic or category is fundamental to the description of the contents of that category, and the degree to which the differing contents would be seen as interrelated by that definition. If a category has a high degree of definitionality for an article, the article should be placed within that category or one of its subcategories. Aristotle is human, so he falls within Category:Ancient Greek philosophers which falls within Category:Ancient philosophers which falls within Category:Ancient people by occupation which falls within Category:Ancient people which falls within Category:People by period which falls within Category:People by time which falls within Category:People which falls within a number of categories which fall within Category:Humans.
(As I said, preliminary definition.) --Andrewaskew (talk) 03:08, 9 January 2013 (UTC)

I've raised WP:DEFINING at WP:VPP

See Wikipedia:Village pump (policy)#Remove WP:DEFINING. Ryan Vesey 01:00, 10 February 2013 (UTC)

Rename section from "Non-defining" to "Defining"?

Why is the section named "Non-defining characteristics" rather than "Defining characteristics"? The shortcut is WP:DEFINING. The first sentence is: "One of the central goals of the categorization system is to categorize articles by their defining characteristics." Using the negative in the title should only be done if there is a compelling reason ... but I don't see such a reason. Instructions are always clearer to readers when phrased positively. Any objections to renaming the section title to "Defining characteristics"? --Noleander (talk) 14:55, 20 February 2013 (UTC)

It goes to the purpose of the guideline... The guideline is essentially a list of things that constitute overcategorization (and thus should be avoided when thinking about creating a category). A "Defining" characteristic is not an overcategorization, while a "Non-Defining" characteristic is. Blueboar (talk) 16:16, 20 February 2013 (UTC)
Doh! I just figured that out and was going to remove my post :-) --Noleander (talk) 16:24, 20 February 2013 (UTC)

Help needed with draft RfC on WP:DEFINING

I'm planning on proposing an RfC regarding the WP:DEFINING guideline. A draft of the RfC is in my sandbox. I'd appreciate it if editors with interest in WP:DEFINING could take a look at the draft and offer suggestions for improvement. Thanks. --Noleander (talk) 15:42, 22 February 2013 (UTC)

Category for basketball players from Portland, Oregon

You are invited to join a discussion to determine if Category:Basketball players from Portland, Oregon is needed when Category:Sportspeople from Portland, Oregon and Category:Basketball players from Oregon already exist. The discussion is located here.—Bagumba (talk) 22:04, 26 February 2013 (UTC)

Proposal to change to WP:COP#N

WP:COP#N could be changed as shown below.

Categorize by [deleted: those characteristics] [added: the occupation(s)] that make the person notable: [deleted: Apart from a limited number of categories for standard biographical details (in particular year of birth, year of death and nationality) an] [added: An] article about a person should be categorized [added: in terms of occupation (i.e. those categories below Category:People by occupation)] only by the reason(s) for the person's notability. For example, a film actor who holds a law degree should be categorized as a film actor, but not as a lawyer unless his or her legal career was notable in its own right. Many people had assorted jobs before taking the one that made them notable; those other jobs should not be categorized.

The main advantage of the new wording is a clearer distinction between WP:COP (which is mainly about how to place bio articles in the categories that exist) and WP:OC (which is mainly about which categories should exist). It also avoids the ambiguous "standard biographical details" which is discussed in the RfC above. Any comments ? DexDor (talk) 06:37, 7 March 2013 (UTC)

It looks like you are proposing to remove the "standard biographical details" exception but those categories are widely accepted by the community (in fact the RfC above shows that the "standard biographical details" should probably be expanded to include college-alumni and memberships such as Eagle Scouts). Also, thoughts on process: (1) your suggestion should probably be posted on the WP:COP Talk page (or at least put a note there directing readers here); (2) maybe it would be better to post this idea as a subsection within the above RfC, to keep the discussion co-located? --Noleander (talk) 14:25, 7 March 2013 (UTC)
I would oppose limiting COP#N to occupations. Both Albert Einstein and Jack Benny played the violin, but neither were professional violinists... I hope we would agree it would be an overcategorization to place Einstein in Category:Violinists. On the other hand Benny is categorized as such, which is appropriate because it formed an important part of his comedy routine. He is notable for playing the violin (badly). Blueboar (talk) 14:52, 7 March 2013 (UTC)
@DexDor: I guess I'm not sure what your primary goal is. You cannot be suggesting limiting categorization of persons to only occupations ... because there are many counterexamples that make that not feasible. Is your primary goal to limit WP:OVERCATEGORIZATION so it only applies to testing for the existence of categories (not membership within a category)? If so, that is part of the RfC proposal above, but a couple of editors pretty strongly opposed that limitation. --Noleander (talk) 15:14, 7 March 2013 (UTC)
You might be able to get round this by saying

Include Categories such as one or more occupations that make the person notable ...

When proposing changes to the guidelines don't forget that the system of categorisation works two ways. Thus, while BLP might have a view on what makes people worthy of categorisation, those editing an institution to which the person belongs may also think that the person helps define that institution. Take the example in the post above: while being an alumnus of Harvard may not have been defining for the person, it may have been defining for Harvard. And above all, please remember that the editors that add content to the pages are doing this because they think it's important - we need to ensure that we assist the real editors, and not let category obsessives define the article. Ephebi (talk) 20:09, 8 March 2013 (UTC)
We are bordering on OR here... we can not use categorization to "define" an institution by the people who are/were members. That is the job of sources. Blueboar (talk) 01:36, 9 March 2013 (UTC)

Unusually-specifically-targeted small category guideline clarifications

Usually I wouldn't do this, but: I'm reverting the edits of Bearcat (talk · contribs) on 2013-03-18; they need to have consensus before being re-added. This part of the change, in particular, seems to have changed the threshold on one point of detail and created an unusually-specific new point, minutes after Bearcat responded to someone citing this guideline. The relevant CfD, Wikipedia:Categories for discussion/Log/2013 March 14#Category:Pope Francis, is still open, and it appears that, even though Bearcat may not have intended it, the newly-modified guideline alters the legitimacy of majority arguments in a CfD in process, and could even be cited by a closing administrator to override the bulk of consensus in the CfD. (In that CfD, Bearcat appears to be for deletion, and I'm for keeping.) The total guideline change, when also including Blueboar's clarification, is this paragraph. My specific problems with this:

  1. "Realistic potential" was changed to "documentable potential", which moves the goalposts from common sense or consensus, to allowing anyone to veto a category by arguing about citations not existing yet, no matter how obvious the category's imminent growth is. This seems specifically tailored to undermine Category:Pope Francis.
  2. This new guideline was added: "However, this exemption does not require that such a category must always be kept — in some cases, CFD may still opt to upmerge it until a few more of its articles have actually been created." What does "CFD may still opt" mean? The consensus can "still opt", or a closing admin can "still opt" no matter what the discussion was? The way that was worded would have left it open to a very flexible interpretation right about the moment Wikipedia:Categories for discussion/Log/2013 March 14#Category:Pope Francis is closed.
  3. "This exemption also should not be cited in defense of a category whose population prospects are purely speculative in nature, such as an eponymous category for a political figure who has not yet demonstrated the need for an eponymous category." Wow, that's an awfully specific example, isn't it? Anyone want to guess if there happens to be a category like that currently open for discussion and going towards keep based on WP:SMALLCAT?

I think you get the gist of it. It appears that there should be some discussion before these kinds of guideline edits happen. Whether or not they are intended as new guidelines or as some kind of already-existing interpretation, they appear to run up against the guideline interpretations implied by consensus so far at Wikipedia:Categories for discussion/Log/2013 March 14#Category:Pope Francis. --Closeapple (talk) 00:00, 21 March 2013 (UTC)

Comment: Wikipedia:Categories for discussion/Log/2013 March 14#Category:Pope Francis was closed as keep 6 hours after my message here. I think the discussion and result there speak for themselves about what kind of categories could end up unnecessarily deleted under this guideline change. There are a few classes of people that, in modern times, no matter how suddenly they ended up with that status, have such an effect that the probability of their categories being unpopulated is very slim. I'd put popes, British monarchs, and U.S. presidents in that class, for example. But recent U.S. presidential candidates (Category:Al Gore, Category:John McCain, Category:Mitt Romney) and the current heir to the British throne (Category:Charles, Prince of Wales) already have categories, so it may be that the concept I'm advocating here is valid but moot for most classes of these very influential people, and that it just happens that popes are the class of people who would most likely fit this concept without having already had categories beforehand. I'm trying to think of others. --Closeapple (talk) 08:18, 21 March 2013 (UTC)
For the record, I want to stress that what happened here is that a guideline was cited in favour of a situation that it was never meant to be applicable to in the first place — thus making it clear that the current wording of the guideline is inconsistent with the actual details of the consensus that it actually documents, and that somebody was misinterpreting what it's actually meant to cover. (And I know quite well what it was meant or not meant to cover, because I'm the person who added the original text in the first place, following a different instance of somebody misinterpreting it as automatically precluding all small categories under any circumstances whatsoever.) Trying to improve a guideline's clarity about its original intentions is not the same thing as "moving the goalposts"; the location of the goalposts hasn't changed, it just became clear that the guideline wasn't communicating clearly enough where they already are.
Whatever reasons there may have been to either keep or delete the Pope Francis category, SMALLCAT is not one of them either way — because that's not the type of situation that either the rule or its "exemption" were ever meant to apply to in the first place. The basic rule is about defined sets that can never grow, like "The Beatles' wives" — and the "exemption" was always meant to be strictly about occupational categories, like Category:Presidents of New Republic or Category:Mayors of North Cityville, which already have a specific and concrete and fully known reason to keep that overrides the category's current temporary smallness (such as the fact that the current occupant will eventually have a successor even if he's the first and only occupant so far, or that there have already been 50 other occupants of the position who meet WP:POLITICIAN and just haven't actually been written about yet, and on and so forth.) The intention was not to alter the consensus at Category:Pope Francis, but merely to point out the reasons why SMALLCAT is and should be irrelevant to the question either way. Its exemption was never meant to be interpretable as "any category that could potentially be expandable in ways we don't know yet"; it was always meant specifically and exclusively for situations where the category is expandable in ways that are already known. That's not "moving the goalposts"; it's "better clarifying where the goalposts already are".
And the updated text was not meant to necessarily preclude the existence of eponymous categories, either — merely the citation of that particular criterion as an argument either way, because neither section of the criterion was ever meant for that type of situation in the first place. Eponymous categories are meant to be evaluated in terms of whether they pass or fail OC#EPONYMOUS, and not in terms of whether they pass or fail SMALLCAT — and, in fact, as currently written SMALLCAT actually undermines or even invalidates EPONYMOUS in unintended ways, because you can always make a case that absolutely any eponymous category for absolutely anybody at all might be expandable in the future and should therefore be kept under SMALLCAT's prospect-of-expansion "exemption". And that's why a better clarification of SMALLCAT's scope and purpose is needed — because as written, it's not specific enough about what it covers and what it doesn't.
I was not, for the record, opposed to keeping Category:Pope Francis — note, for example, that I didn't actually cast a vote either way, and that in the end, the category got itself past SMALLCAT just by adding some already-existing articles without even needing a speculative exemption after all. What I was objecting to was that some of the reasoning, on both sides of the discussion, was flawed (for instance by appealing to this criterion instead of presenting a case properly based on passing or failing EPONYMOUS.) Bearcat (talk) 00:17, 23 March 2013 (UTC)

About Award recipients as a criterion of overcategorization

Isn't it a bit vague, given that there are so many exceptions in Category:Award winners?--Inspector (talk) 03:33, 28 January 2013 (UTC)

The one thing that can be agreed upon by almost everyone is that not every award should have a category for its recipients. Where the disagreement begins is what are the categories for awards that should exist, and so far the only way that has been dealt with is case-by-case. Just because a category exists doesn't mean consensus would support its existence, so you have to be careful about figuring out whether a given category has been discussed before or not. Good Ol’factory (talk) 03:57, 28 January 2013 (UTC)
I'm in favour of removing the "See also Category:Award winners" bit as it may encourage people to look there, find a category for a minor award and create a category for an even more minor award. We could also provide more guidance by changing the paragraph to something like:
Exceptions include Category:Nobel laureates and Category:Academy Award winners (these are internationally recognised as being the most prestigious awards in their fields).
DexDor (talk) 23:26, 28 January 2013 (UTC)
  • I really see no reason why we should have any award categories. I think the Nobel categories would be better just as lists. It would cut down on category clutter. We should limit categorization to what people did, not include awards they received at all. It would also make a much clearer rule. We could then apply the same to recipients of medals and other distinctions that seem to be allowed way more than awards. John Pack Lambert (talk) 01:05, 3 February 2013 (UTC)
Are you in favour of the change I proposed above - as a step in the right direction ? DexDor (talk) 08:36, 3 February 2013 (UTC)
I would love to see something similar for albums and songs, too, with all the categorization by gold and platinum certification in a multitude of countries. See 21 (Adele album)...ugh! --StarcheerspeaksnewslostwarsTalk to me 00:38, 9 February 2013 (UTC)

The example given in this section is Potential Presidential Candidates, to which "Wikipedia is not a crystal ball" clearly applies. But what about categorizing officially nominated candidates in major elections? Ottawahitech (talk) 14:24, 2 May 2013 (UTC)

Once someone is officially nominated, then they can be categorized as a candidate. It is no longer a crystal ball situation, and they are no longer just a "potential" candidate. Of course there is the question of whether the person is notable enough for an article (the nominee of some obscure fringe party may not be)... but that is a different issue from categorization. Blueboar (talk) 15:21, 2 May 2013 (UTC)
So I see... looking at the CFD discussion, I suspect that the determining factor was that the candidates were already listed in the British Columbia general election, 2013 article. That can make a difference. There is often debate about whether it is more appropriate to listify, categorize or do both... and different projects have set different "rules" about it. The important thing to me is that the information is somewhere on Wikipedia. Don't worry... Users will find it. Blueboar (talk) 15:49, 4 May 2013 (UTC)


Award recipients

I support the guideline, but what are the criteria for still allowing an award recipient category? The current definition just mentions that there are a few exceptions:

People can and do receive awards and/or honors throughout their lives. In general (though there are a few exceptions to this), recipients of an award should be grouped in a list rather than a category.

-- SchreyP (messages) 22:16, 22 December 2011 (UTC)

  • The exception seems to be "the award is internationally recognized as the top award in the field, widely recognized outside the field, and seen to be a truly prestigious award". However I think we should go a step further and delete all awards cats. I really do not think there is a point in having a category for Nobel Prize winners. A list works just fine, and having awards cats just leads to people who get awards being in way more categories. John Pack Lambert (talk) 22:30, 1 February 2013 (UTC)
And that is exactly what has happened. But I think the community consensus would be towards having more categories rather than fewer. StAnselm (talk) 20:41, 11 March 2013 (UTC)
This conflicts with WP:CLN, which says that overlapping lists and categories can exist. A stated advantage of categories is "Good for exploratory browsing of Wikipedia.". When Category:Award winners exists, it seems appropriate that readers would want to drill down to winners via categories, and not be forced to go to a list. It would be more objective to limit categories by notable awards that have articles that meet WP:GNG or WP:LISTN, than to limit it further by "prestigious" awards that are reduced to WP:ILIKEIT or WP:IDONTLIKEIT debates.—Bagumba (talk) 17:59, 19 April 2013 (UTC)
The first paragraph of CLN says "... each method of organizing information ... is applied ... following the guidelines and standards that have evolved on Wikipedia for each of these systems." and the 4th para of CLN refers to WP:OC. It is therefore incorrect to say that CLN and OC are contradictory. WP:DEFINING says "One of the central goals of the categorization system is to categorize articles by their defining characteristics", award categories often contain articles that don't even mention the award - why, for example, is the CERN article in Category:Blue plaques ? This can occur with other categories, but is particularly prevalent with awards (example). DexDor (talk) 21:51, 19 April 2013 (UTC)
Let's limit this thread to OC#AWARD specifically. Was the motivation to disallow most awards because some awards categories included members that were not verifiable? However, we have categories that sometimes have more damning BLPN issues with sexual orientation or religious beliefs, but we would not disallow those categories due to a few problems. Remove the uncertified members, don't eliminate categories.—Bagumba (talk)
I don't know the reasons (apart from what's on the linked CFDs), but verifiability isn't (IMO) the main problem. Take an article like Eiffel Tower - its WP:DEFINING characteristics (the things one expects to see in the lead) are being metal, a tower, a visitor attraction, in Paris etc; being a "Work designated as Historic Civil Engineering Landmark by the American Society of Civil Engineers" is not a defining characteristic (in fact, it's not even mentioned in the article). A similar category demonstrates another problem - only about 7% of the recipients of the award are in the category so anybody who is interested in what's received the award is much better off looking at the list article (which can be sorted in various ways, has explanatory information etc). DexDor (talk) 20:23, 20 April 2013 (UTC)
I take it that you are striking your earlier concern of "award categories often contain articles that don't even mention the award".—Bagumba (talk) 03:08, 26 April 2013 (UTC)
Categories being incomplete should not be the motivation to delete them. There is no deadline. The category should be completed, not deleted. Lists and categories can co-exist.—Bagumba (talk) 03:08, 26 April 2013 (UTC)
No, it appears that you have not understood my comments. One problem with award categories is that often for the person/thing receiving the award it is so irrelevant/trivial that it's not mentioned in the article (and certainly not in the lead) and hence the article shouldn't be in the category (the Eiffel tower example). Some awards are given to people/things that do not currently, and may never, have a WP article - hence a list is better as it can be complete. There are a few awards (e.g. Nobel Prizes) where neither of these concerns apply (i.e. the award is normally/always referred to in the lead/text and the recipient is always sufficiently notable to have a WP article). DexDor (talk) 05:23, 26 April 2013 (UTC)
  • User:DexDor: Your comment above is not a good reason to have Award categories removed from Wikipedia. If an article does not mention the award, then remove the article from the category. But why delete the whole category? Just my $.02. XOttawahitech (talk) 23:26, 11 June 2013 (UTC)

I dunno how to describe this (lol)

Let's say article X is the main article of category X. Is it overcategorization if article X and category X are included in category Y? Let's make it concrete: is it overcategorization if Liverpool F.C. is categorized at Category:Premier League clubs even though Category:Liverpool F.C. is also there? –HTD 05:10, 7 March 2013 (UTC)

No problem... categories can be inter-related without being hierarchical. Liverpool F.C. can be the primary article for Category:Liverpool F.C., and at the same time a subsidiary article within the Category:Premier League clubs. Both are appropriate categorizations. Blueboar (talk) 15:22, 7 March 2013 (UTC)
Thanks. It was not a sports team-related issue, but it's analogous. I've been involved in an edit war (not 3RR yet though lol) on this exact same issue, with the other party saying it was overcategorization. I guess categorizing Liverpool's academy team into the Premier League category is overcategorization. –HTD 16:34, 7 March 2013 (UTC)
Don't thank me yet... Analogies don't always work when you get down into the weeds of an actual dispute... To know whether a specific categorization is (or is not) over-categorization, we would really need to know the specific article and the specific categories in question. Blueboar (talk) 19:16, 7 March 2013 (UTC)
Sorry. Someone was removing Manila from Category:Capitals in Asia and Category:Cities in the Philippines since Category:Manila was already categorized there. –HTD 03:08, 8 March 2013 (UTC)
The relevant guidance is at Wikipedia:Categorization#Eponymous_categories, which basically says, you can choose - whether the eponymous cat and the article are both placed, or only one. In this particular case, I would say keep the article and category. However, there are many instances where only the category is kept. I would say, in cases where you have a bunch of articles that don't themselves have eponymous cats, in that case, all articles should be included.--Obi-Wan Kenobi (talk) 21:39, 12 June 2013 (UTC)

Comprehensive replacement proposal

  • All verified recipients, about whom we have articles, of an award that is of significance either as conferring or partially conferring notability, or as being a defining characteristic (interpreted broadly, as a milestone in a career or a special feature of a career), or that might reasonably be looked for as a group, should be included in both a list and a category. Exceptions and special considerations: (a) if the award is not considered itself notable, but is nonetheless defining or otherwise appropriate for a category, there should not be a list constructed in default of the article; (b) if nobody is willing to maintain a list, there should not be a list; (c) if the number of individuals is small, the list can be included in the article for the award; (d) verified recipients about whom we do not yet have an article but who are obviously notable intrinsically cannot be included in a category, but they should be included in the list.
The virtue of this proposal is that it should eliminate the need for almost all discussions over individual lists, or whether they apply to a particular article.
The concept behind this is that all classification and indexing schemes that people find helpful and that are maintainable should be used. It is better to over-index than under-index, and that a minority use a scheme is reason enough to keep it. (The underlying principle is that WP is an encyclopedia, and encyclopedias are meant to be useful. Anything useful that does not detract from our purposes is appropriate, if feasible.) I'm not concerned with category clutter--there's much worse clutter, such as the succession boxes and the like, which are much more obtrusive. Categories are, if anything, relatively invisible.
Lists have the virtue that they can include those who do not yet have articles, serving as a guide for where articles are needed, and can include identifying information, and will be found in searching.
The key virtue of categories is they are self-maintaining. Otherwise, I think they are mainly useful at present as a way of spotting inappropriate articles, for filling in the topics where we do not have lists, and, to some extent, for building a hierarchy and thus displaying related subjects. And, of course, because some people like them. I look forward eagerly to superseding the whole category system by proper indexing, but while we have them, we should use them as fully as makes sense, if people want to do the work. DGG ( talk ) 00:29, 11 May 2013 (UTC)
  • Completely disagree. And how on earth do you think categories are "self-maintaining"? People all the time place articles in categories that are in-apt, or take folks out of categories where they should be, or just have a typo that makes the category not work. It is practically impossible to police a category for inclusion or exclusion. That's one of the major problems with categories. Wikipedia's categories are not a "tagging" or "keyword" system, and they are not "self-maintaining". I don't know exactly what that means, but I know that categories are a real pain to maintain and much, much worse than lists, which can be followed by any particular editor. --Lquilter (talk) 03:13, 11 May 2013 (UTC)
  • and I love this one, the exception that swallowed the rule: "or might reasonably be looked for as a group". Seriously, every conceivable entity "might reasonably be looked for as a group". Can you imagine trying to delete anything with that category? If you think that award categories should never be deleted, you should just say so outright. Otherwise, I'd like to see you present a category that you think was correctly deleted. Category:Robot Hall of Fame inductees? Someone could reasonably look for it as a group. "partially conferring notability"? why not. --Lquilter (talk) 03:18, 11 May 2013 (UTC)
  • Comment I am not sure if this is where it should go, but I think we should go the other way, and get rid of all award cats. Lists are much easier to keep accurate. Awards are a dime a dozen and we do not need award cats. John Pack Lambert (talk) 03:24, 29 June 2013 (UTC)
  • Comment. Generally I also disagree with this approach, mainly because it's not based on any past consensus or broad agreement as to the direction we should go, and this issue has been discussed many, many times. Good Ol’factory (talk) 22:06, 30 June 2013 (UTC)

proposed change

I'd like to make the following change to this section Wikipedia:OC#Arbitrary_inclusion_criterion:

  • "Categorization by year, as a means of subdividing a large category, is an exception to this."

becomes

  • "Categorization by year, decade, century, or other well-defined time period (such as historical era), as a means of subdividing a large category, is an exception to this. When you create a categorization by time period, you should define the inclusion criteria clearly (e.g. 'This category is for politicians who were active in the 19th century' is not the same as 'This category is for politicians who were born in the 19th century')."

This change reflects current practice IMHO. Let me know your thoughts. --Obi-Wan Kenobi (talk) 14:12, 23 May 2013 (UTC)

No strong objection... I only have one question. Say we take a large category (Cat:X) and we sub-divide it by decades... (resulting in "Cat:X in the 1900s", "Cat:X in the 1910s", "Cat:X in the 1930s", "Cat:X in the 1940s", "Cat:X in the 1950s", etc.) now, suppose that X was very rare in the 1940s due to WWII. We might have only one single article that would fit in "Cat:X in the 1940s". Having a sub-category with only one article in it is overcategorization... even if the by decade subdivision works great for the rest of the categorization scheme. How would you deal with this? Blueboar (talk) 16:51, 23 May 2013 (UTC)
There's a special rule - when there's a series/pattern, small categories are allowed. It's somewhere in the guidance. --Obi-Wan Kenobi (talk) 17:00, 23 May 2013 (UTC)
I suppose... although I don't really see the need for the rule. A better solution would be to be a bit more flexible about "imposing" the pattern. You could avoid the one article cat by merging two decades together (resulting in either "Cat:X in the 1930s and 1940s" or "Cat:X in the 1940s and 1950s" - whichever made the most sense given the situation). I suppose that is my real question... I certainly agree that the "by year" sub-cats are not the only acceptable "time units" for categorization. The key is to allow whatever time units make sense, given the topic area. Blueboar (talk) 17:54, 23 May 2013 (UTC)
Yeah, of course, we should make sure the guidance is flexible, and if people want to create groupings-by-30 years, or whatever, that should be ok if it makes sense for the topic.--Obi-Wan Kenobi (talk) 17:56, 23 May 2013 (UTC)
 Done --Obi-Wan Kenobi (talk) 21:40, 12 June 2013 (UTC)

Unless there is good evidence that an annual category will be well-populated, it should be merged into one for the decade or even century. I would support half centuries, but not 30-year periods. Peterkingiron (talk) 16:11, 17 June 2013 (UTC)

Proposed reorganization of structure of page

To me the structure of the OCAT page is a bit unwieldy -- we have 19 subsections, only one of them has sub-subsections. I think we could restructure it to provide some super-groupings that might make the page more readable. So I'm proposing -- without any substantive changes to the specific examples -- a reorganization. However, a super-grouping would have some semantic implications, so that should be a topic of discussion too.

Here's the current layout
  1. Non-defining characteristics
  2. Small with no potential for growth
  3. Narrow intersection
  4. Mostly overlapping categories
  5. Arbitrary inclusion criterion
  6. Miscellaneous categories
  7. Eponymous categories for people
  8. People associated with
  9. Unrelated subjects with shared names
  10. Intersection by location
  11. Trivial characteristics or intersection
  12. Subjective inclusion criterion
  13. Non-notable intersections by ethnicity, religion, or sexual orientation
  14. Opinion about a question or issue
  15. Potential candidates and nominees
  16. Award recipients
  17. Published list
  18. Venues by event
  19. Performers by performance
    1. Performers by action or appearance
    2. Performers by role or composition
    3. Performers by performance venue
My proposal
  • Move some of the general points to the opening section. I think we have three over-arching principles that could be discussed more clearly in the opening sections. There is some overlap of course; I think that can be handled with "see also" references.
    • "Defining." Move some of the "defining" discussion to the opening section. Over and over, "defining" is cited in category discussions. Keep "defining" as a subheading though to provide examples. This is the core, I think, and is a "necessary" feature of categories.
    • "Categorization System Problems" - This is basically problems with inclusion criteria or features that make something that would be a perfectly good tag or keyword, nevertheless not work within Wikipedia's category system. Categories are binaries -- on or off -- and they work best as a result with clearly-defined topics that are fixed, so subjective/vague/overbroad aspects are not a good fit. Also, because a category has to be on or off, temporary & transitory aspects are not a good fit -- like opinions or statuses. While a topic that poses no categorization system problems is not a sufficient justification for a category, it is a necessary one.
A special type of "categorization system problems" is the use of "intersectional categories". "Intersections" applies to a lot of categories, but the reasons we frown upon intersections are not intersections, per se, but because they capture things that we don't want/need to capture for other reasons: Trivial, small/overly specific, or a nightmare to manage because there are too many potentially relevant intersections. We do support intersections for administrative purposes -- such as the numerous national & other (state/province) subdivisions of actual subjects that are used to split a large category into subcategories.

I'm bolding the existing subsections & including their number so it's easier to follow.

  • Non-Defining Characteristics
    • 1. Non-defining characteristics - (Some of the explanatory language is in the intro section, but still include examples that don't fit in any of the below subsections.)
    • 11. Trivial characteristics or intersection
      • 9. Unrelated subjects with shared names
    • 13. Non-notable intersections by ethnicity, religion, or sexual orientation
    • 5. Arbitrary inclusion criterion
    • 6. Miscellaneous categories - I propose renaming to "Catch-all" or remainder categories.
    • See also "Potential candidates and nominees"
    • "Associations" - These are all examples of A being associated with B. They are a sub-type of "non-defining".
      • 18. Venues by event
      • 19. Performers by performance
        • 19.1. Performers by action or appearance
        • 19.2. Performers by role or composition
        • 19.3. Performers by performance venue
      • 8. People associated with
    • Extrinsic associations - This would be associations made by third parties.
      • 16. Award recipients
      • 17. Published list


  • Categorization system problems
    • 12. Subjective inclusion criterion
    • 4. Mostly overlapping categories
    • 7. Eponymous categories for people - It's a categorization system problem mostly because it adds to category clutter, and encourages the application of eponymous tags on articles in a "by association" way.
    • 2. Small with no potential for growth - I'm not actually sure about this one. I'm not sure I buy it, for one thing, and I'm not sure the examples aren't really better explained by other reasoning.
    • Temporal/status-based - These pose problems for the categorization system because they could result in someone having all the subcategories of a category applied to them. Moreover, in order to be adequately descriptive they would often be unwieldy in length.
      • 14. Opinion about a question or issue
      • 15. Potential candidates and nominees (temporal/status-based; this is also likely a "non-defining characteristic")
    • Intersectional issues:
      • 10. Intersection by location
      • 3. Narrow intersection
      • See also Trivial characteristics or intersection
      • See also Non-notable intersections by ethnicity, religion, or sexual orientation


That's a start. Thoughts? --Lquilter (talk) 15:27, 22 May 2013 (UTC)

This seems reasonable and well thought out. Could you create a quick draft in your userspace so we can see how it might look all together? --Obi-Wan Kenobi (talk) 15:45, 22 May 2013 (UTC)

Speaking just for myself (as I've re-ordered this page even from when radiant! initially posted it : ) - Not a bad proposal. My main quibble would be that, for readability's sake, PERF and its subsections should stay at the bottom of the page. I'm also not sure of some of the things you've grouped together, but would welcome discussion. I think making a subpage as O-WK suggests would be a good idea: Wikipedia talk:Overcategorization/Draft Reorg or some such : ) - jc37 16:33, 17 June 2013 (UTC)