
Wikipedia:Bot requests/Archive 6

From Wikipedia, the free encyclopedia


Rambot demographics to past tense

The majority of the US location articles that were added by Rambot are written in the present tense. "As of 2000, the population is... the average income is... the majority of families have..." etc. These should be changed to past tense.

It's pretty straightforward to do - within the demographics section, replace "is" with "was" (10 instances); replace "are" with "were" (12 instances); replace "have" with "had" (4 instances). I've been doing this manually when I come across them (see, for example, Jasper, New York), but that'll take a while across 30,000 articles and it seems to me like a task ideally suited to a bot. Any offers? --OpenToppedBus - Talk to the driver 10:07, 3 February 2006 (UTC)

No, I'm pretty sure these should remain present tense. They are still describing the current world, just somewhat inaccurately. Superm401 - Talk 09:38, 20 February 2006 (UTC)
They are absolutely not describing the current world. They are explicitly describing what the situation was six years ago, at the time of the last census. They should be changed because a) except for the very smallest of communities, these figures are unlikely to still be accurate; b) even where the figures haven't changed, we don't know that - all that we know, verifiably, is what the situation was six years ago; c) it's simply bad grammar to say, "As of the census of 2000, there are 1,270 people... residing in the town"; d) these articles are currently inconsistent, as the intro is already in the past tense. Note that I am not suggesting that the "geography" section (also based on the census bureau figures) should be changed to past tense, as the area of the towns and the portion covered by water is unlikely to have changed. --OpenToppedBus - Talk to the driver 11:07, 20 February 2006 (UTC)
I had been changing these whenever I came across them; recently I changed a whole batch and have now started serious automated testing. Rich Farmbrough 14:44 11 March 2006 (UTC).
BTW all done. Rich Farmbrough 17:11 13 June 2006 (GMT).
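
The substitution described in the request can be sketched roughly as follows. This is a minimal illustration, not the script that was actually run: it assumes the section is introduced by a level-2 "== Demographics ==" heading and runs to the next level-2 heading, and the function name demographics_to_past_tense is hypothetical.

```python
import re

# Word-level substitutions named in the request.
TENSE_MAP = {"is": "was", "are": "were", "have": "had"}
WORD_RE = re.compile(r"\b(is|are|have)\b")

# Assumed layout: a level-2 "== Demographics ==" heading, with the section
# running until the next level-2 heading or the end of the page.
SECTION_RE = re.compile(
    r"(^==\s*Demographics\s*==[ \t]*\n)(.*?)(?=^==[^=]|\Z)",
    re.MULTILINE | re.DOTALL,
)

def demographics_to_past_tense(wikitext):
    """Rewrite is/are/have to was/were/had inside the Demographics section only."""
    def fix_section(match):
        heading, body = match.group(1), match.group(2)
        return heading + WORD_RE.sub(lambda m: TENSE_MAP[m.group(1)], body)
    return SECTION_RE.sub(fix_section, wikitext)
```

A blind word swap like this would also change sentences that should stay in the present tense, so the output would still need a human check.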

Convert star ratings to text

Was wondering if a bot could convert these star ratings (and their derivatives) in albums to plain text. Here is the discussion. Wikipedia_talk:WikiProject_Albums#Stars_to_text. Gflores Talk 06:14, 5 February 2006 (UTC)

Here are the conversions that need to happen:
Hope that helps. -- WB 06:20, 5 February 2006 (UTC)
I will set User:Tawkerbot to do this once I receive a bot flag. Tawker 02:26, 14 February 2006 (UTC)
Please see here for the reasons this didn't go ahead. Rich Farmbrough 01:58 10 March 2006 (UTC).

Referral ID spam remove bot

I've been removing referral IDs from outgoing links I've been able to find every now and then. I frankly don't like the idea that someone could make money from Wikipedia by sneaking these links into places where they even could be considered legit.

What I'd like this bot to do is to find links that contain a referral ID, strip it off and post a normal one that works just as well.

example:

http://www.amazon.com/gp/product/B000000W5L/sr=1-1/qid=1138522986/ref=pd_bbs_1/103-0299503-7272610?%5Fencoding=UTF8

becomes:

http://www.amazon.com/gp/product/B000000W5L/sr=1-1/qid=1138522986/

Obli (Talk) 23:11, 5 February 2006 (UTC)

That's a good idea. I did a quick search over the enwiki dump from 20060125 and there are about 1500 amazon links with /ref in them. Note that you can reduce the URLs even further. Your example can be reduced to:
http://www.amazon.com/gp/product/B000000W5L
Cmdrjameson 14:52, 6 February 2006 (UTC)
Someone should find the top sites with referral programs and write an algorithm to remove those as well, though I guess Amazon would be the major culprit.
Obli (Talk) 16:54, 6 February 2006 (UTC)
I've updated my scripts and I'm working my way through the Amazon links I've found. Hopefully it'll only take a few days. If you do find any other prominent sites with referrals, I'd be interested to hear about them. Cheers, Cmdrjameson 23:59, 6 February 2006 (UTC)
Cool, I thought it would require a lot more bureaucracy than that to get a thing like this done. Thanks! Obli (Talk) 07:58, 7 February 2006 (UTC)
OK I've finished with all the Amazon URLs I could find, and have moved on to allmusic.com. These are rather impressive; they can have a 10 (or so) character uid= component, along with a 128-character token=. There's about 3600 of them in enwiki. Cmdrjameson 13:25, 9 February 2006 (UTC)
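
The Amazon reduction discussed above could look something like this. It is only a sketch: it assumes product links always have the form /gp/product/ followed by a 10-character identifier, and the name strip_amazon_referral is made up for illustration; it is not Cmdrjameson's script.

```python
import re

# Assumption: product links look like .../gp/product/<10-character ID>/...
# Everything after the ID (path segments, ref=..., query string) is dropped.
AMAZON_RE = re.compile(
    r"(https?://www\.amazon\.com/gp/product/[A-Z0-9]{10})(?:[/?][^\s\]|]*)?"
)

def strip_amazon_referral(wikitext):
    """Reduce Amazon product URLs to their shortest working form."""
    return AMAZON_RE.sub(r"\1", wikitext)
```

The trailing part of the pattern stops at whitespace, "]" and "|", so the label of a bracketed external link is left alone.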

Ladnav bot, to reverse a vandal

Since fy: has been getting a higher load of real vandalism lately (as opposed to the occasional graffiti-editor), I'm looking for a way to revert the contributions of a specified anonymous user in a somewhat automated way. As the bot bit does have a function in this process, as done by administrators, I expected to find an actual bot to help them out, but I can't find it. If I'm blind, could someone point me in the right direction? If not, would someone be willing and able to write such a bot? The task seems pretty straightforward; it's just an awful lot of clicking when done by hand. 217.123.4.108 21:03, 7 February 2006 (UTC)

Janitor bot: classify Category:Cleanup by month by type of edit needed

The backlog on Category:Cleanup by month is getting out of control, with 1.3% of Wikipedia currently tagged for cleanup. In order to speed the cleanup process, I propose a janitor bot to move "cleanup" pages that belong in other maintenance departments elsewhere.

The bot would have the following proposed behaviors:

  • If {{cleanup}} is found on a list, replace it with cleanup-list.
  • If {{cleanup}} is found on a disambig, replace it with disambig-cleanup.
  • If less than MINWIKILINK % of the text contains hyperlinks within Wikipedia, replace {{cleanup}} with {{wikify}}. A proposed value for MINWIKILINK is 0.1%.

The first two tasks appear to be within the capabilities of current bots, and should be easy to accomplish. Going through the capabilities of current Wikipedia:Bots does not show the capability to determine the percentage of wiki links, but this is probably not a difficult task, as word counting and a repeated regexp search for /\[\[.*?\]\]/ should be all that's required.

These are the tasks that seem obviously automatable. Much of WP:CU requires human interaction; but at least we can figure out some of the human interaction that's necessary. Alba 00:39, 8 February 2006 (UTC)

The third task may be within the capacity of User:Gnome (Bot); more info will be posted on the bot page. The bot can do that now, but it is not completely tested and some of the code is lacking. :-) P.S. The Gnome bot is a C++/CLI bot. If interested, contact me on my talk page. :-) Eagle (talk) (desk) 22:08, 27 February 2006 (UTC)
User:Gnome (Bot) has been deemed capable of doing the task. It is currently undergoing modifications to be able to perform the task. (The bot was in existence before I mentioned it.) Now Alba and I are working on it. Eagle (talk) (desk) 21:44, 21 March 2006 (UTC)
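
For illustration, the MINWIKILINK check proposed above might be sketched like this. The proposal does not say exactly how the percentage is measured; this version assumes it means the share of words that sit inside [[...]] links. The function names are hypothetical and this is not the Gnome bot's code.

```python
import re

# Matches [[Target]] and [[Target|label]]; group 2 is the optional label.
WIKILINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")
MINWIKILINK = 0.1  # percent, the value proposed in the request

def wikilink_percentage(wikitext):
    """Percentage of words that are part of a wikilink (one reading of the proposal)."""
    linked_words = 0
    for match in WIKILINK_RE.finditer(wikitext):
        label = match.group(2) or match.group(1)
        linked_words += len(label.split())
    # Replace each link by its visible text before counting all words.
    plain = WIKILINK_RE.sub(lambda m: m.group(2) or m.group(1), wikitext)
    total_words = len(plain.split())
    return 100.0 * linked_words / total_words if total_words else 0.0

def retag_if_unwikified(wikitext):
    """Swap {{cleanup}} for {{wikify}} when the page falls below the threshold."""
    if "{{cleanup}}" in wikitext and wikilink_percentage(wikitext) < MINWIKILINK:
        return wikitext.replace("{{cleanup}}", "{{wikify}}")
    return wikitext
```

Template markup is counted as ordinary words here, which slightly lowers the percentage; a real bot would probably strip templates first.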

Convert interwiki redirects to softredirect

I created {{softredirect}} some time ago, and it seems to have been well received. It is supposed to be used instead of interwiki redirects, which do not work. However, it's hard to find the interwiki redirects without a bot. I'd like to ask for a bot to convert all interwiki redirects into uses of the {{softredirect}} template. --cesarb 01:25, 11 February 2006 (UTC)

That can be done, but are you looking for interwiki redirects in all namespaces, just the User namespace, Wikipedia, Help, the main namespace?--Orgullomoore 15:49, 11 February 2006 (UTC)
All namespaces. I created the template to be used on all namespaces except article, but since interwiki redirects shouldn't be done on articles, the category associated with the template will allow us to find and fix them. Also, the pseudo-namespace WP: (which is also in the article namespace) has interwiki redirects which should be converted. --cesarb 18:01, 11 February 2006 (UTC)

Category:Stock exchanges

Category:Stock exchanges appears to be populated by all the stock exchanges, most of which fall into the geographical categories of ...in Europe, ...in North America, ...in Asia, etc. According to WP:CAT, articles shouldn't be in a broad category that is already covered by a more specific one. What bots, if any, do mass recategorization like this? --Christopherlin 18:03, 11 February 2006 (UTC)

Common typo

Here's an easy one: phd, Phd, PhD and Ph.D should be changed into Ph.D. Kjaergaard 05:36, 17 February 2006 (UTC)

I think that Mathbot by Oleg Alexandrov can correct spelling errors already. -- King of Hearts | (talk) 00:06, 25 February 2006 (UTC)
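
As a rough sketch of the replacement requested above (not Mathbot's code; the name normalise_phd is hypothetical), a single regular expression can cover the four listed variants without doubling the full stop on text that is already correct:

```python
import re

# The four variants listed in the request, plus an optional trailing full stop
# so that "Ph.D." is left unchanged and "PhD." does not become "Ph.D..".
PHD_RE = re.compile(r"\b(?:phd|Phd|PhD|Ph\.D)\b\.?")

def normalise_phd(text):
    """Rewrite phd, Phd, PhD and Ph.D as Ph.D."""
    return PHD_RE.sub("Ph.D.", text)
```

Plurals such as "PhDs" are deliberately not matched. URLs and quoted titles containing "phd" would be changed too, so a real run would need to skip those.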

Robot wanted "id=toc" into "class=toccolours"

This was originally posted on WP:VPA, but User:Angela referred me here.

Does anyone have a robot that they could run which could change all occurrences of "id=toc" into "class=toccolours"? They both look the same to most folk, but id=toc hides the division from folk who have preferences for "contents turned off". And 99% of these are not tables of contents but are related items link boxes. Editing the 1% by hand would be easier than the 99%.

For an example, see Elisa Oyj and Template:Finnishmobileoperators which has had this change done. -- SGBailey 21:17, 18 February 2006 (UTC)


American Cities/Towns

It might be nice if American Cities/Towns were displayed as nicely as British towns are. E.g. on the right side are all the vital/geo stats, and the article explains the town in question. American towns currently get a red dot on a map of the state they are in, but non-American readers probably have no idea where the dot is in relation to the USA as a whole. In general, the system for American towns/cities/etc seems a bit American-centric. Maybe I'm off base. I wouldn't mind a discussion, as I feel I can learn quite a bit more about British towns than their American counterparts from the general info displayed.

Recategorising

The category Category:Eurovision Song Contest needs categorising better, but it's kind of a bit hard to do alone manually. I'm wondering if a bot could do it better. Here's what essentially needs to be done:

Thanks if anyone can help. Esteffect 00:18, 23 February 2006 (UTC)

I've added it to my bot's queue; give it a few days and I should be able to run the job. I think I can run it fairly quickly as it's only 100 pages (12 min on the bot's clock); I might just use AWB because it's faster for small jobs. Tawker 08:01, 2 March 2006 (UTC)

I'm wondering if I can find any of these bots somewhere. Right now I'm using pyWikipedia with XML Dump file.

  1. Bots that can get page titles from dump file if there is no interlanguage link in the articles.
  2. Bots that can find articles containing in-line interlanguage links ([[:fr:___]] [[:de:___]]). There are too many of these links in Thai WP such as this page th:LAOTSE (containing - en, fr, ja)
    • Then it might be good to get all the links in the articles so that someone else can create articles (or even stubs), similar to Wanted Pages
  3. Similar to above request, can I get the wanted pages for specific categories (including sub-cat)

--Manop - TH 20:56, 24 February 2006 (UTC)

Bot for COTW, AID, etc.

There should be a bot for automating the processes of WP:COTW, WP:AID, and other collaborations. It should count the number of votes, remove failed nominations, and do everything listed under Wikipedia:Collaboration of the week/Maintenance if the clock reaches 18:00 Sunday, so people can have more time to work on expanding the collaborations. -- King of Hearts | (talk) 00:05, 25 February 2006 (UTC)

I don't think it's a good idea for a bot to count votes based on human edits. If people use a different format, they accidentally (or purposely) sign twice, etc etc... What might be able to be done is that the winning articlename be put in a protected page, and the bot read that.--Orgullomoore 15:31, 26 February 2006 (UTC)


Moving slogans from infobox company template

In discussion at Template talk:Infobox Company there is an emerging consensus that the slogan field should be removed from the infobox, but we want to hang on to the data in the field. There are two things which you might be able to help us with: firstly, could a bot create a list of the pages which contain infoboxes with slogans, and secondly, would it be possible for a bot to remove the data and insert it, perhaps as a section just before 'see also' with a heading such as 'corporate branding'? (The first would be very useful, the second is still subject to the discussion outcome - just trying to get a feel for what can be done.) Many thanks Ian3055 22:09, 27 February 2006 (UTC)

Working on it (just the list, not actually editing). —Cryptic (talk) 14:25, 2 March 2006 (UTC)
Template talk:Infobox Company/Slogans. —Cryptic (talk) 18:02, 2 March 2006 (UTC)
Thanks! Its much easier to consider the problem now that we know how big it is... thank you Ian3055 22:38, 3 March 2006 (UTC)

Userbox and Userboxes bot

We need a bot to replace Userbox and Userboxes with Wikipedia:Userboxes. Thank you. --Fang Aili 22:06, 3 March 2006 (UTC)

Sounds like fun, I'll get right on it. — Mar. 3, '06 [22:13] <freakofnurxture|talk>

bot for adding box to an entire category of articles

Is there a bot for adding an info box to all articles in a category? I checked the Wikipedia:bots page, but I didn't see a bot that would do what I'm thinking.

Currently the dinosaur pages on WP are in bad shape. There are several hundred categorized dinosaur stubs that could use an infobox, but manually adding them might take some time. Can't a bot do all that work instead?--Firsfron 02:47, 6 March 2006 (UTC)

Putting the infobox in would be a cinch, populating it less so. Rich Farmbrough 19:44 8 March 2006 (UTC).


List of images used in an article

This suggestion is about a tool and not a bot, but I didn't know where to put it. I'm suggesting a tool that searches through a page's history and lists all the images that have been used, even if they were removed. This way, we could retrieve images that were replaced by better ones in the article but are still good for use. The only problem is that Wikipedia doesn't categorise images; that's why I think this tool would be useful. CG 21:22, 10 March 2006 (UTC)

A blacklist of the vandal's favourite pix might be useful... Rich Farmbrough 12:19 15 March 2006 (UTC).

the Crimea -> Crimea, the Ukraine -> Ukraine

Formerly, the names "Crimea" and "Ukraine" were used with the definite article. Today only Crimea and Ukraine (without the article) are considered correct. But "the Crimea" and "the Ukraine" can still be found in many wiki articles. IMHO removing "the" would be a good task for a bot. Don Alessandro 09:07, 12 March 2006 (UTC)


Is there a bot that can perform search functions to link Wikipedia articles that are relevant to a Wikibook? And, more generally, Wikicities, other areas of Wikibooks, Wiktionary (and, I know, it gets less likely that the heuristic would be able to discern what is relevant, but I'd rather delete bad links than put in all of my own), Google, Yahoo? Prometheuspan 18:18, 16 March 2006 (UTC)



I will look that up. As a side note, I had envisioned linking to as many other wikibooks as were relevant, and to as many wikipedia articles as possible. Prometheuspan 22:26, 28 February 2006 (UTC)

Unfortunately, I haven't heard of anything like this. --Derbeth talk 23:40, 16 March 2006 (UTC)


What about a bot that links references to religious texts to the appropriate section of that wikibook? This may need to be written for each book separately, but of particular interest to me would be the Jewish/Christian Bible and the Islamic Qur'an. If this would interest anyone, please contact me on my talk page! Andrewjuren 20:50, 24 March 2006 (UTC)


Huh. You'd think they would have something like an RSF or some such thing set up to link a new wiki to its parent networks like that. I have asked at the Wikipedia bot requests page. Is there somebody else or someplace else to go look? Prometheuspan 00:33, 17 March 2006 (UTC)

Retrieved from "http://en.wikibooks.org/wiki/User_talk:Prometheuspan"

Is this possible?

There has been discussion at Wikipedia talk:Categorization about repopulating some categories that had previously been depopulated after being divided into subcategories. One example is Category:American actors. There is a good deal of support for doing this. Before there was {{CategoryTOC}} it was necessary to break large categories into smaller subcategories, and there is a value in having these smaller categories. However, categories also serve as the master index of subjects and it is often frustrating to have to look in several subcategories to browse through the articles in a subject. A good example of this is Category:Film directors. The proposal is to keep the subcategories, but also have articles duplicated in parent (or grand-parent) categories up to the level of topic articles.

I am wondering if a bot could be created to run frequently (once a day?) which would go through a list of categories that should be duplicated in other categories and check to see if the duplications exist. If they do not, they would be added. I suspect that there will need to be a page created to discuss this duplication process (Wikipedia:Duplicated Categories?) and editing the list of duplications would probably have to be limited to admins. By having this bot, a person could add the lowest level category that applies and the category would also end up in the higher level categories. The bot would have to look at each article in the category and see if the higher level categorization exists, if it does not, the categorization would be added. For categories of people, the piping should be copied so that the article is alphabetized correctly.

Another bot might scan through the higher level lists and collate a list of articles that have not been put in any of the lower level subcategories.

I am just wondering if this is possible. There would have to be quite a bit of discussion about whether this should happen and how it will happen. First I want to know what is possible. Thanks. -- Samuel Wantman 10:21, 19 March 2006 (UTC)

Well, if you just want the bot to add a category (the higher up one) to every page in a list its trivial, any pywikipedia bot (including Tawkerbot) can do it. If you want the AI, that's getting into fuzzy logic and might be a little trickier -- Tawker 18:35, 19 March 2006 (UTC)
What do you mean by the AI? Are you talking about the piping? The bot can look at the categorization for the subcategory and use the same piping when adding the categorization in the parent category. As a test, would it be possible to duplicate the categorization of the articles in Category:American film actors so that they are also in Category:American actors? In this case the alphabetizing of each article in Category:American actors would be the same when copying the piping from Category:American film actors. I've been doing this with AWB and it takes quite a bit of work and there are hundreds of articles. Thanks. -- Samuel Wantman 09:17, 20 March 2006 (UTC)
By "AI" he means that the bot will run on its own with artificial intelligence. A pywikipediabot can't run on its own, as a human must tell it what to do, where to do it, etc. Fetofs Hello! 12:05, 20 March 2006 (UTC)

It seems to me that a better solution would be to get the MediaWiki software to display all subcat articles of a particular category. If this feature is introduced in the future, carrying out the category population with a bot would have been a waste.--Commander Keane 12:26, 20 March 2006 (UTC)

This is a common response that I've heard for about a year and a half. I am not convinced that this will ever happen, and I'm not sure it needs to happen. Usually, the higher level categories only have subcategories. Having them populated adds the flexibility of seeing the larger sets and the smaller ones. If having a large number of articles makes it difficult to see the set of subcategories, it is possible to split the category such as Category:Operas and Category:Opera. -- Samuel Wantman 21:43, 20 March 2006 (UTC)
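
The piping step discussed in this thread can be sketched on raw wikitext as follows. This is an illustration only: the function name add_parent_category is hypothetical, and fetching and saving pages (for example with pywikipedia) is left out.

```python
import re

def add_parent_category(wikitext, child, parent):
    """Add [[Category:parent]] after [[Category:child]], copying its sort key."""
    tag = r"\[\[Category:%s\s*(\|[^\]]*)?\]\]"
    if re.search(tag % re.escape(parent), wikitext):
        return wikitext  # already in the parent category, nothing to do
    match = re.search(tag % re.escape(child), wikitext)
    if match is None:
        return wikitext  # page is not in the child category
    sort_key = match.group(1) or ""  # e.g. "|Brando, Marlon", or empty
    new_tag = "[[Category:%s%s]]" % (parent, sort_key)
    return wikitext[:match.end()] + "\n" + new_tag + wikitext[match.end():]
```

For example, a page tagged [[Category:American film actors|Brando, Marlon]] would gain [[Category:American actors|Brando, Marlon]] on the next line, and a second run would leave the page unchanged.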

Spoken Wikipedia Project

Hi there! Those on the Spoken Wikipedia Project would like to explore using a bot to help with our work. Here are a couple of things that have come up in discussions with other project members:

RSS Feed Updater

Right now, we have a manually-updated RSS feed that lists new articles that have been recorded. That way, project members and casual listeners can find our new content easily. It would be great if we had a way to automate this, to save SCEhardt the work. Let me know if you're interested in that project.

Tagging

Currently, we use several tags for our project:

1. We have a tag that people use when they request an article to have read aloud and recorded.

2. We are discussing two tags that project members can add to the article's talk page:

  • One that replaces the request tag when a project member has volunteered to record the article
  • a related tag to use when a project member decides to record an article, but nobody has requested it for recording.

Once an article is recorded and uploaded:

3. We have a couple of tags that go on the article's page itself:

  • one for recordings that are a single file.
  • and one for long recordings that have been split into several parts for faster downloading
  • and one for recordings of a page summarised in another article

4. In addition to that, we add a tag to the article's talk page

  • A version for recordings where the article is unchanged (stable versions, for example)
  • And we have a version that is used when the article has changed--it provides a link to the old version that the recording is based on.

5. Additionally, we have a tag that goes on the Wikipedia:Featured articles page that alerts people that the article is available in an audio format, too.

6. Finally,

  • We add a tag for the article on Category:Spoken articles
  • But of course, there's a slight variation to the tag if the article was featured at the time of the recording.

So as you can see, we use between 4 and 6 tags for each recording. They all serve a good purpose: they promote the project, help organize our work, and make sure that people can find our recordings.

However, it's a lot of work to do this. Not all of these tags can be automated, but it seems to me that at least a couple could be. For instance, #4 might be. And it would be very useful if we could automate #6.

Again, if this idea is something you'd like to pursue, I can re-explain all of this and/or provide more details. Ckamaeleon ((T)) 02:41, 20 March 2006 (UTC)

Nothing? D'oh! Ckamaeleon ((T)) 21:35, 21 May 2006 (UTC)

AllyUnion's Bots

Several of AllyUnion's bots appear to have gone offline several days ago. It's only when the automated tasks that you are used to being done don't get done that you realize how much you depend on a bot. And this is currently the case. From AllyUnion's user page, it appears that he is mostly on wikibreak. I tried emailing him, but his email does not work. So I've left a message on his talk page. But if he's on break, who knows when he will see it.

So the next question becomes, how long do we wait until the bots are declared out of service, and how then can we get some other bots to pick up the duties. Specific bots that appear to be down include:

NekoDaemon being out of service is what brought this all to my attention, as CFD is one of my normal home playgrounds. But AFD bot appears to have an even more critical role. - TexasAndroid 15:00, 21 March 2006 (UTC)

I'll try to examine the exact behavior and write up a clone. Bear with me on this. — Mar. 21, '06 [23:14] <freakofnurxture|talk>

Anyone have a Transwiki bot?

I have been trying to clean up the cocktails articles, I have tagged about 90 articles for "move to wikibooks". Anyone have a bot that could transwiki them? They all have cocktail recipes in them, the majority are nothing but recipe. They'd need to end up in the wikibook Bartending. http://en.wikibooks.org/wiki/Category:Bartending_pages_needing_work Once transwikied, I could then clean up the wikipedia articles. --Xyzzyplugh 09:36, 26 March 2006 (UTC)

minor template fixes

A bot to make the following fixes to templates could do much, much good for Wikipedia (it'd avoid manual fixing of these things, at the very least, lol):

  • replace id="toc" with class="toccolours"
  • remove trailing/empty rows: many templates end with |-|}, which makes no sense, or include |-|-, which is equally nonsensical
  • replace <center> with align="center" + margin:0 auto; style declaration
  • Remove trailing </center> tags
  • Replace <br clear="all" /> with a clear:both; style declaration

Circeus 20:07, 31 March 2006 (UTC)
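
The first two fixes in the list above could be sketched as follows. This is a hedged illustration, not an existing bot's code: the name tidy_template is hypothetical, and it assumes that each row separator sits on its own line, as is usual in wiki table markup.

```python
import re

# id="toc", id='toc' or id=toc, but not longer ids such as id="tocbox".
TOC_ID_RE = re.compile(r"""\bid\s*=\s*(["']?)toc\1(?=[\s|>]|$)""")

# A bare "|-" line that is directly followed by another "|-" or by "|}"
# starts a row with no cells, so it can be dropped.
EMPTY_ROW_RE = re.compile(r"^\|-[ \t]*\n(?=[ \t]*\|[-}])", re.MULTILINE)

def tidy_template(wikitext):
    """Replace id=toc with class="toccolours" and remove empty table rows."""
    wikitext = TOC_ID_RE.sub('class="toccolours"', wikitext)
    return EMPTY_ROW_RE.sub("", wikitext)
```

As noted in the earlier id=toc request, a small share of these boxes are genuine tables of contents, so the list of changed pages would still want a human look.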

Bot needed for search and replace mission

See Talk:Voivodes of the Polish-Lithuanian Commonwealth#Bot help for details. Thanks!--Piotr Konieczny aka Prokonsul Piotrus Talk 03:27, 1 April 2006 (UTC)


Years -> Years'

We have lots of wars and we name them XX Years' War... which is close to XX Years War. I think the apostrophe is the more common way to do it... and the proper one... but both are used in some settings. Should this be bot-ted?

gren グレン 02:35, 2 April 2006 (UTC)

Yes, yes, yes. Please have a bot do this. It hurts me physically to witness the lack of apostrophes. —Nightstallion (?) Seen this already? 13:17, 5 April 2006 (UTC)
I'm looking into doing that with WP:AWB. I'll report here. Consider it done for now. --Ligulem 15:12, 5 April 2006 (UTC)
Done for the redirects Seven Years War, Thirty Years War and Hundred Years War. --Ligulem 12:22, 7 April 2006 (UTC)
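
For reference, the same replacement could be written as a small script. This is a sketch and not the AWB configuration Ligulem used; it is limited to the three war names mentioned above, and the name add_apostrophe is hypothetical.

```python
import re

# Only the three names mentioned in this thread; extend the list as needed.
YEARS_WAR_RE = re.compile(r"\b(Seven|Thirty|Hundred) Years War\b")

def add_apostrophe(text):
    """Rewrite 'Seven Years War' as 'Seven Years' War', and likewise for the others."""
    return YEARS_WAR_RE.sub(r"\1 Years' War", text)
```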

POV detector bot

I can envision the development of a bot that searches for phrases such as "is a great", "is a fantastic", "is a terrific", "is an awful" ..etc. that could indicate strong POV within the article text. If the phrase appears within quoted text, i.e. as dialog, then it would be excluded.--Hooperbloob 21:04, 2 April 2006 (UTC)

Barring exceptionally brilliant AI, the bot would not be able to conclude from the context whether the statement is truly POV or not. It would still have to be reviewed by a human, and you can already achieve this functionality by doing a google search for site:en.wiki.x.io "is a great". Good idea though! GT 05:06, 5 April 2006 (UTC)
Agreed, it would be too hard to perform automatically. I guess I wasn't considering its implementation as a stand-alone bot but perhaps as an add-on to an existing spelling type bot that editors could use when browsing articles. I did that exact Google search and others before putting this note here...lots of POV hits.--Hooperbloob 05:51, 5 April 2006 (UTC)
That is going to be a nightmare to implement reliably; there is no way it would autorevert like Tawkerbot2, it would have to compile lists and post them to a page somewhere. It's food for thought, I'll throw it out and we shall see -- Tawker 06:02, 6 April 2006 (UTC)
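
A list-only version along the lines discussed here might look like the sketch below. It uses the four phrases from the original suggestion and a simplifying assumption that quoted text is anything between two double quotes on the same line; the name find_pov_phrases is hypothetical.

```python
import re

POV_PHRASES = ["is a great", "is a fantastic", "is a terrific", "is an awful"]
POV_RE = re.compile("|".join(re.escape(p) for p in POV_PHRASES), re.IGNORECASE)

# Simplification: quoted text is anything between two double quotes on one line.
QUOTED_RE = re.compile(r'"[^"\n]*"')

def find_pov_phrases(text):
    """Return (line_number, phrase) pairs for POV phrases outside double quotes."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        unquoted = QUOTED_RE.sub('""', line)  # blank out quoted passages
        for match in POV_RE.finditer(unquoted):
            hits.append((number, match.group(0).lower()))
    return hits
```

The result is only a list of candidates for a human to review, in line with the comments above; nothing is edited or reverted.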

Western Reserve bot

Someone has put the following paragraph in a bunch of articles on Ohio townships:

"(Note: The U.S. Census Bureau counts township populations in the Connecticut Western Reserve as distinct from any municipalities located within the township. For populations of any municipalities within the township, please read the corresponding articles for those municipalities.)"

This is inaccurate. Some municipalities in Western Reserve townships, such as Cortland, Ohio, are independent from surrounding townships. Others, such as Newton Falls, Ohio, are part of the surrounding townships. This is no different than anywhere else in Ohio.

I'd like to see a bot that would delete the paragraph from all pages on which it is found. -- Mwalcoff 02:24, 4 April 2006 (UTC)

Copyvio bot

(There was User:Cobo of course, but that doesn't seem to have ever taken off.)

I find a lot of copyvios that have been dormant for months... and it seems like there are probably tons out there, if I can just check a few random articles and find one pretty fast. It seems like a bot with an organized approach would uncover thousands, and with an easy methodology... just select a few random 5-10 word phrases from the article, no punctuation, and search on Google, Altavista, etc for the exact phrase. The bot would make a list of any positive results. Of course it would have to ignore wikipedia mirrors. The odds of it listing a copyvio of something that is actually PD/GPL are low, in my experience, people are more interested in copying and pasting press releases, corporate bios, etc. than Project Gutenberg kind of stuff. But even still... that's where the human factor comes in.

The bot would just create a simple list of possible copyvios (with URLs), so it would be 100% non-invasive... humans (me, for example) would go through the list and handle as appropriate. The list could be stored in the bots userspace or wherever... I imagine it wouldn't be hard to drum up some people to go through it.

There are over a million articles now so it would be a lot of work and time... but afterwards it could perhaps monitor new articles (though that might be more difficult to implement). Also since it's not live, I'm not even sure it would need to be flagged as a bot... all it would do is upload a list eventually, or in installments perhaps.

Anyway, I'm not a programmer... so I have no idea how hard this would be to implement. But given that it's not live, it could presumably be written in any language, up to Visual Basic. I've been thinking about this for a while though, and I think it would make a very positive impact on Wikipedia, and our goal of creating a truly free encyclopedia. Thoughts? --W.marsh 22:37, 5 April 2006 (UTC)

We were talking about this idea for Tawkerbot2 (as another feature) but we've run into one big snag. There are thousands of WP mirrors out there and every one of them would screw up automated detection -- Tawker 06:03, 6 April 2006 (UTC)
I've found that a simple "-wikipedia" (or equivalent depending on search engine) in the search cuts a lot of them out... to the point where you tend to just be left with copyvios, if there are any. Another option is whitelisting the domains listed at Wikipedia:Mirrors and forks. --W.marsh 06:10, 6 April 2006 (UTC)
If it's a legit mirror or fork it would have a GFDL compliance notice and say it has content from Wikipedia in it. That might be our saviour, though this bot would just list on a page; no way would I want it auto-blanking -- Tawker 06:22, 6 April 2006 (UTC)
I agree. We wouldn't want automated blanking by now. Fetofs Hello! 13:33, 6 April 2006 (UTC)
Yeah, the whole point is doing a comprehensive job of pointing human copyvio hunters to all the probable needles in the haystack, so to speak. A bot shouldn't actually directly do anything with the articles. --W.marsh 14:10, 6 April 2006 (UTC)
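
The phrase-sampling step proposed in this thread can be sketched as below. It only builds the search strings; actually submitting them to a search engine is left out, since that depends on whichever service is used. The function names are hypothetical, and the "-wikipedia" suffix is the filter suggested above.

```python
import random
import re

def sample_phrases(article_text, count=3, min_words=5, max_words=10, rng=None):
    """Pick random runs of 5-10 consecutive words, with punctuation removed."""
    rng = rng or random.Random()
    # Stripping punctuation follows the proposal; note it also removes
    # apostrophes, which may stop an exact-phrase search from matching.
    words = re.sub(r"[^\w\s]", "", article_text).split()
    if len(words) < min_words:
        return []
    phrases = []
    for _ in range(count):
        length = rng.randint(min_words, min(max_words, len(words)))
        start = rng.randint(0, len(words) - length)
        phrases.append(" ".join(words[start:start + length]))
    return phrases

def build_queries(article_text, **kwargs):
    """Wrap each phrase in quotes and exclude pages that mention Wikipedia."""
    return ['"%s" -wikipedia' % p for p in sample_phrases(article_text, **kwargs)]
```

Hits for these queries would then go on a list for human copyvio hunters to check, as described above.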

Dictionary bot

Hello,

I am trying to find out if there is a bot created already that establishes a dictionary of terms found in a wiki site. Currently, there is no listing of definitions, and the terms end up being fairly convoluted at times.

I am able to generate a list of all the terms, and also to create definitions (manually), but I need to go back through the wiki site and create links from those terms to the dictionary. Further, some of the terms are too common to automatically replace using a bot, so they need to be removed from the "link creation" process, yet still remain in the dictionary. Is there something like this I can use as a bot base? Or is it simple to write?

Please note that this is for a mediawiki site. Is there a way that mediawiki can be set up to do this? (I have only glanced at the software docs, as I am not the admin).

Admins, you may email me with any responses. Thanks in advance!

Delfeld 04:37, 6 April 2006 (UTC)

You mean a list of every word on a page, I don't know how you would define terms over other words -- Tawker 06:04, 6 April 2006 (UTC)
-----
Tawker,
I don't mean every word on the page. All terms would be defined in a separate page - a dictionary page. Each term on this page would have an associated code - a simple "1" or "0", for example. This code could be listed anywhere in the definition, hidden or not. What this code would do is tell the bot, "Ok, this term is something to go through the wiki site and make into a link back here." or else would say, "Ok, don't make links of this term on the wiki site."
Does this make sense? I am not trying to glean all terms from the pages, but rather apply the dictionary terms to the pages.
Delfeld 21:52, 6 April 2006 (UTC)
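
A rough sketch of the linking step described above, assuming the dictionary has already been read into a mapping of term to flag (1 = link it, 0 = define only). The dictionary contents, the page name "Dictionary" and the function name are made up for illustration; existing [[links]] are matched first so they are left alone.

```python
import re

# Hypothetical dictionary: term -> flag (1 = create links, 0 = keep in dictionary only)
DICTIONARY = {"flux capacitor": 1, "widget": 1, "page": 0}


def link_terms(wikitext, dictionary, target="Dictionary"):
    """Link whole-word occurrences of flagged terms to the dictionary page."""
    canonical = {t.lower(): t for t, flag in dictionary.items() if flag == 1}
    if not canonical:
        return wikitext
    # Longest terms first so multi-word terms win over their substrings.
    terms = sorted(canonical, key=len, reverse=True)
    term_pattern = "|".join(re.escape(t) for t in terms)
    # First alternative swallows existing links so they are never re-linked.
    pattern = re.compile(r"\[\[.*?\]\]|\b(" + term_pattern + r")\b", re.IGNORECASE)

    def repl(m):
        if m.group(1) is None:
            return m.group(0)  # an existing link: leave untouched
        anchor = canonical[m.group(1).lower()]
        return f"[[{target}#{anchor}|{m.group(1)}]]"

    return pattern.sub(repl, wikitext)
```

Terms flagged 0 (too common to link) stay in the dictionary but are simply never added to the pattern.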

I've noticed by browsing some of the Wikipedia Chinese articles that they will often link to an English page which has no corresponding link back to the Chinese version, and even one or two Chinese pages with no link to the relevant English article. I came across this actually several times in a short period of time, and would guess it to be not all that uncommon.

It would be helpful if someone could create a bot to scan pages and follow the links to different language versions, and make sure that all of the different translations are linked up. (i.e. that if a page exists in 15 different languages on any given topic, that each of those 15 versions has links to all 14 others).

Aside from just making it easier to find content in multiple languages, this may also encourage users to contribute in more than one language if they know the article exists in a second language they are familiar with.

Any thoughts?

--Hughitt1 19:35, 6 April 2006 (UTC)

There are many bots that do this task, such as User:YurikBot, though they do not run on all wikipedias, I suspect that not many run on the chinese wiki. Details of the bot they use can be found at m:Using the python wikipediabot. Martin 19:42, 6 April 2006 (UTC)
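
To illustrate the check being asked for (every language version linking to all the others), here is a small sketch that works on data already fetched, not on the live wikis. The input format and function name are assumptions for this example only, and it is not how the existing interwiki bots are implemented; a real bot also has to cope with conflicting links, which this ignores.

```python
def missing_interwiki(links):
    """links maps a language code to the set of languages that version links to.

    Returns a dict mapping each language to the sorted list of links it lacks.
    """
    # The cluster is every language mentioned anywhere in the data.
    cluster = set(links)
    for targets in links.values():
        cluster |= targets
    missing = {}
    for lang in sorted(cluster):
        have = links.get(lang, set())
        need = cluster - {lang} - have
        if need:
            missing[lang] = sorted(need)
    return missing
```

For example, if zh links only to en, en links only to de, and de links to both, the function reports that zh lacks de and en lacks zh.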



I am hoping to recruit a bot to help with the daily archiving at WP:AfC. The task used to be done by User:Uncle G's 'bot, and there were some plans for User:ShinmaBot to take its place (along with some extra functions), but neither of their operators has been around recently. What's needed is three edits a day, shortly after 0000 UTC:

  1. move Wikipedia:Articles for creation/Today to [[Wikipedia:Articles for creation/YYYY-MM-DD]] (just the article, not the talk page). The date should be that of the day that's beginning, not the one that's ending.
  2. edit Wikipedia:Articles for creation/Today, remove the redirect, and replace it with a generic header like this one.
  3. edit Wikipedia:Articles for creation/List and add the day's archive to the top of the page. Monthly archiving can be left to humans.

Can anyone help? ×Meegs 18:13, 15 April 2006 (UTC)

I wrote a script, and requested approval at Wikipedia:Bots/Requests for approvals#Jitse's bot. -- Jitse Niesen (talk) 14:26, 24 May 2006 (UTC)
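
For reference, the three daily edits requested above can be planned with a few lines. This is only a sketch of the bookkeeping, not Jitse's script: the action names, the header placeholder and the format of the list entry are assumptions, and the actual move and edits would go through whatever bot framework is used.

```python
from datetime import datetime, timezone

BASE = "Wikipedia:Articles for creation"


def daily_plan(now):
    """Return the three edits to make shortly after 0000 UTC.

    `now` must be a timezone-aware datetime; the archive takes the date of the
    day that is beginning.
    """
    day = now.astimezone(timezone.utc).strftime("%Y-%m-%d")
    archive = f"{BASE}/{day}"
    return [
        ("move", f"{BASE}/Today", archive),                     # step 1
        ("replace_text", f"{BASE}/Today", "<generic header>"),  # step 2 (placeholder text)
        ("prepend", f"{BASE}/List", f"* [[{archive}]]"),        # step 3 (assumed entry format)
    ]
```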

Replace an image on several pages

User:Germen was using Image:Nl small.gif in his signature, which was deleted as a redundant image of Image:Flag of the Netherlands.svg. Could someone run a bot to replace all instances of the text [[Image:Nl small.gif|25px]] (articles) with [[Image:Flag of the Netherlands.svg|25px]]? Thanks! ~MDD4696 21:38, 20 April 2006 (UTC)

I'll do that. It's a simple task to do. Pegasus1138Talk | Contribs | Email ---- 06:06, 22 April 2006 (UTC)
I've run through the entire "what links here" list and got no hits for the image using the find and replace you specified, so somebody must have beaten me to it. I have no idea why anything is registering as linking to the image, though, when I cannot find any links that actually point to it (other than this page and the tasks page for my bot, of course). Pegasus1138Talk | Contribs | Email ---- 07:27, 22 April 2006 (UTC)
The list of file links at the bottom of Image:Nl_small.gif (link) is accurate; all of those talk pages (about 50-100) appear to have the image on them, which is what Mdd4696 wanted replaced. Keep in mind that "Whatlinkshere" doesn't work for image uses, and also any capitalisation or underscore issues your regex might have. Also keep in mind that the flag really doesn't need to be substituted: seeing the red image links isn't that distracting, and many of the uses are in archives anyway.--Commander Keane 07:50, 22 April 2006 (UTC)
I just didn't want to seem like an ass, deleting Germen's sig image without fixing it again. Thanks for your help guys. ~MDD4696 15:42, 22 April 2006 (UTC)

The catch-all regex you'd want for this might look something like this:

\[\[\s*[Ii]mage\s*\:\s*[Nn]l[\s_]+small\.gif\s*(?:\|\s*[0-9]+px)?\s*\]\]

Personally I would recommend just removing it rather than replacing, but due to a history of problems with that user, I shall not be accepting this task. — Apr. 23, '06 [10:28] <freakofnurxture|talk>
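
For anyone who does take it on, the regex above can be tried out in Python as follows. The pattern is copied unchanged from the comment above; the function name is only for illustration, and the replacement text is the one from the original request.

```python
import re

# Pattern copied from the comment above; tolerates case, spaces and underscores.
NL_FLAG = re.compile(
    r"\[\[\s*[Ii]mage\s*\:\s*[Nn]l[\s_]+small\.gif\s*(?:\|\s*[0-9]+px)?\s*\]\]"
)
REPLACEMENT = "[[Image:Flag of the Netherlands.svg|25px]]"


def replace_flag(wikitext):
    """Replace every use of the deleted image with the SVG flag."""
    return NL_FLAG.sub(REPLACEMENT, wikitext)
```

Passing an empty string instead of REPLACEMENT would remove the image rather than replace it.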

Song -> Album redirects

I'm not sure if this is worth the time and effort, but what about a bot that would parse the Track Listing sections of pages at List of albums and create redirects to the album article? For example, the bot would create redirects like this one for the tracks at Operation: Mindcrime#Track listing. It would also create redirects for lowercase variations of song titles. Thanks, TheJabberwock 22:26, 20 April 2006 (UTC)

Note to whoever undertakes this task: it would be advisable to create disambiguation pages in many cases. Thus, if the bot detects that the page it is about to create already exists as an {{R from song}}, it could determine the artist/band and the year associated with both songs and create a dab page, e.g.:

'''Foo at Tiffany's''' can refer to:
*[[Foo at Tiffany's (Steely Dan song)]], from the 1988 album ''[[Becker's Mom (album)|Becker's Mom]]''
*[[Foo at Tiffany's (Tupac song)]], from the 2007 album ''[[R U Still Buyin' Dis?]]''

Obviously the titles are fictitious, but the scenario is real. Each of these two links would then be created as a redirect to the article about the album on which the song was released. I'm thinking the above should only apply in the case of unrelated songs. Cover versions should, in my opinion, redirect to the original release, or be combined on the same line of the disambiguation page, if "Foo at Tiffany's" also refers to something else, such as a film. This would most likely require manual intervention.

In the event that the title does refer only to a song, and equally notable versions of the same song have been released by more than one artist, it should probably be created as a distinct article explaining this. — Apr. 29, '06 [10:35] <freakofnurxture|talk>
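
A small sketch of how the dab page text above could be generated once the bot has worked out the artist, year and album for each song. The field names are invented for this example, and detecting the clash (and deciding whether the songs are unrelated) is the hard part that is not shown.

```python
def dab_page(title, songs):
    """Build dab-page wikitext for unrelated songs that share a title.

    Each song is a dict with 'artist', 'year', 'album' and, optionally,
    'album_article' when the album's article title differs from its name.
    """
    lines = [f"'''{title}''' can refer to:"]
    for s in songs:
        article = s.get("album_article", s["album"])
        # Use a piped link only when the article title differs from the album name.
        album_link = f"[[{article}|{s['album']}]]" if article != s["album"] else f"[[{article}]]"
        lines.append(
            f"*[[{title} ({s['artist']} song)]], from the {s['year']} album ''{album_link}''"
        )
    return "\n".join(lines)
```

Each link on the generated page would then be created as a redirect to the corresponding album article.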

Daybar inclusion

Recently a template {{daybar}} was created to allow easy navigation through articles for individual days given in the format below: June 10, 2004, June 11, 2004 etc. (see template page for specifics). On those pages it can be seen in use. This template could be used as the standard format for navigating through such articles; however, it would be tedious to add it by hand. Hence I suggest a semi-automatic bot to add this template appropriately in the dates from January 1, 2003 till now. LukeSurl 16:29, 21 April 2006 (UTC)

Once you cull a list of articles (which would be the hard part) it would take a simple regex and ignore to do that. The regex would have it be put at the top of the articles and the ignore would prevent double placement on articles that already have it. An alternative to doing an ignore would be to use whatlinkshere to remove from the culled list all instances that already have it before running the regex. Pegasus1138Talk | Contribs | Email ---- 06:03, 22 April 2006 (UTC)
When adding this template to pages, the best wikitext is: {{daybar|{{subst:PAGENAME}}|xxxx}}. There are articles for most days; can't the ignore thing ignore nonexistent articles? -- Alfakim --  talk  13:42, 22 April 2006 (UTC)
I've made a list of days here. hope that helps someone. Martin 14:00, 22 April 2006 (UTC)
Thanks for the list, Martin! I hope Fetofsbot2 is doing it correctly, but I'll run it manually for the time being. Fetofs Hello! 14:26, 22 April 2006 (UTC)
How can I add those tags to articles that are not formatted correctly? AWB seems to have stopped at April... Fetofs Hello! 14:13, 23 April 2006 (UTC)
What do you mean by "not formatted correctly"? If you're talking about articles with bad titles, you can just manually specify the second parameter rather than using subst:PAGENAME.-- Alfakim --  talk  17:29, 23 April 2006 (UTC)
It seems like the problem was just in April. Is it only me, or does the page May 4, 2005 give a terrible typo in the Apache output (SpecialPage.ohp instead of SpecialPage.php)? Fetofs Hello! 20:00, 23 April 2006 (UTC)
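
The "regex and ignore" approach discussed above might look roughly like this outside AWB. It is a sketch only: the function name is made up, and the second template parameter is left as the same "xxxx" placeholder used in the comment above.

```python
import re

# Matches an existing {{daybar}} or {{daybar|...}} call, whatever the case or spacing.
DAYBAR_RE = re.compile(r"\{\{\s*daybar\s*[|}]", re.IGNORECASE)


def add_daybar(wikitext, second_param="xxxx"):
    """Put the template at the top of the page unless it is already there."""
    if DAYBAR_RE.search(wikitext):
        return wikitext  # the "ignore" step: avoid double placement
    return "{{daybar|{{subst:PAGENAME}}|" + second_param + "}}\n" + wikitext
```

Running it twice on the same text changes nothing the second time, which is the point of the ignore check.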

Linking

I just wanna make my page or article "Don't Wanna lose You" open when someone writes "DON'T WANNA LOSE YOU", or "Don't Wanna Lose You" or "don't wanna lose you". I mean with different capitalisation from the one I've used.—Preceding unsigned comment added by Charlie White (talkcontribs)

I've moved the page to Don't wanna lose you which is the standard form for this. As far as I can tell, the system now works to direct any form of capitalisation to that page LukeSurl 16:07, 23 April 2006 (UTC)

High Schools

Hello,

I don't know if this would be feasible, but I thought I would put it out there for discussion. It might be helpful (possible?) to create a bot that creates articles for all high schools. Just about every state has an article named "List of High Schools in XXXXX", XXXXX being the state, as well as a "Wikipedia Project Missing encyclopedic articles/High Schools/US/XXXXX", again XXXXX being the name of the state. The bot could (for instance) make an article with the name of the school and then the name of the city and state in parentheses (since a high school is often named the same as other high schools, e.g. Southside High School (Gadsden, Alabama) and Southside High School (Fort Smith, Arkansas)). The beginning article could be a stub that said only: "XXXXXXXXXX High School is a secondary education school located in CITY, STATE.", or something of that nature. It is my opinion that a lot of people who attend a high school, or alumni of a high school, would be much more apt to edit an existing article on their high school than create a brand new article on their high school (much more so than other articles on Wikipedia). (Cardsplayer4life 20:12, 23 April 2006 (UTC))

Something similar to this has been attempted before, see Wikipedia:Long term abuse/B-Movie Bandit. Horrible idea, I'm afraid. "Articles" (sub-stubs that is), especially those which are impossible to expand and impermissible to delete, should be avoided at all costs. — Apr. 24, '06 [01:24] <freakofnurxture|talk>
Perhaps I do not understand, 1) How does someone using several IPs to edit have something to do with this topic? and 2) How would the articles be impossible to expand? I have created articles in the past as stubs and they have been expanded. I was just thinking that there are only a few high schools that have articles now, and if you look at either of the 2 lists I mentioned for any of the states, you can see only about 10-20% of high schools have articles. It just seems as if there should be more high schools represented, and as it stands now, some are extremely well represented (have long articles), some have a little info, and some have absolutely no info. The types of articles that high schools fall into are different than other types of articles in that many people attended (and therefore theoretically have intimate knowledge of) said high schools, plus the types of people that will be searching for high schools will be much more likely to expand existing articles than even know how to go about creating a brand new article. This may not be the greatest idea, I admit, but the reasons you gave don't seem to apply to this request. If the concept was misunderstood, I can attempt to restate it in a different way. (Cardsplayer4life 02:06, 24 April 2006 (UTC))

It appears Wikipedia already has around 50,000 high school related pages[1], and a significant number of them are starved of content. I have a more useful suggestion, but it would require more effort to set up. A bot could scan the categories populated by the various {{*-school-stub}} templates, and for the articles we already have, e.g. Herbert Hoover High School (Des Moines), determine a [name, location] pair, form a search engine query, e.g. %2B%22Herbert+Hoover+High+School%22+%2B%22Des+Moines%22+-%22Wikipedia%22, produce a URL, e.g. [2], and post it to the article's talk page, to assist anybody who may have a keen interest in improving the article, but no idea where and how to find the info.

Further, with a little bit of fuzzy logic, it may be possible to ascertain some vital statistics (e.g. enrollment, year of establishment, mascot, name of current principal...) with a reasonable (but far from perfect) degree of accuracy, and enumerate them in the form of a bulleted list on the stub's talk page, including also a link to the page(s) from which the information was derived. A human could follow the links, verify the factoids, hopefully also locate additional information while visiting the various sites (some of them might be obscure news articles, for example), and then add the information to the article.

If somebody wanted to run a bot like that, and a critical mass of other people were willing to follow up on each result posting, this idea could be a successful operation, improve Wikipedia's coverage of non-notable schools, make the existing stubs worth keeping, and help reduce the community's animosity toward the WP:SCH project as a whole. If somebody wants to give this a serious attempt, I might be able to provide a bit of technical support. — Apr. 29, '06 [10:03] <freakofnurxture|talk>
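
As a starting point, the [name, location] step and the query string from the Herbert Hoover example above can be reproduced like this. The function names are invented for this sketch; which search engine the encoded query is appended to is left open.

```python
import re
from urllib.parse import quote_plus


def split_title(title):
    """Split 'School name (Location)' into a (name, location) pair, or None."""
    m = re.match(r"^(.*?)\s*\(([^()]*)\)$", title)
    return (m.group(1), m.group(2)) if m else None


def school_query(name, location):
    """URL-encode a query requiring both phrases and excluding "Wikipedia"."""
    raw = f'+"{name}" +"{location}" -"Wikipedia"'
    return quote_plus(raw)
```

Titles without a parenthesised location return None and would need a human to supply the location.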

The following appears to be a duplicate or very similar request to the one above. — Jun. 5, '06 [19:17] <freak|talk>

Could someone make a robot that would put on school information? 2 websites I would like you to use:
--Ksax 18:57, 5 June 2006 (UTC)

Is there a way of automatically placing the "featured article star" on the inbound "in other languages" links from other language wikipedias? Currently it is necessary to manually go through each linked language and add {{Link FA|en}} to the code, surely this could be done with a bot or something. Furthermore, the stars need to be removed when an article gets de-listed from FA status, this should also be done automatically. Is this even possible? Witty lama 04:18, 24 April 2006 (UTC)

Pywikipedia has a function for it, I think.--Orgullomoore 05:09, 24 April 2006 (UTC)
I am certain :) Fetofs Hello! 00:17, 25 April 2006 (UTC)
So how do we go about doing it? What has to happen to have this function enacted? Witty lama 07:18, 25 April 2006 (UTC)
You can read more on the Pywikipediabot. If you know someone who has the time to run it, read the bot policy and give it a go! Fetofs Hello! 23:02, 26 April 2006 (UTC)

arrondissment --> arrondissement

.. would be a useful spelling fix. Colonies Chris 22:47, 24 April 2006 (UTC)

I see what you mean. It's a big task, but hopefully a feasible one. Fetofs Hello! 00:24, 25 April 2006 (UTC)
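
The replacement itself is small; a sketch, with the function name made up for illustration. It keeps the case of the first letter and also catches plurals, while leaving the correct spelling alone.

```python
import re

# The misspelling drops the "e" between "ss" and "ment".
MISSPELLED = re.compile(r"\b([Aa])rrondissment")


def fix_arrondissement(text):
    """Correct 'arrondissment' to 'arrondissement', preserving capitalisation."""
    return MISSPELLED.sub(r"\1rrondissement", text)
```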

General Vandalism bot.

I think that when a page or more of text is deleted from a single article, a bot should undo the change. Since the standards of information that are put into the article are fairly high, anyone deleting a lot of this information is likely a vandal. For someone who is actually doing work, the bot should send them their alterations in its response (so if it's valid they don't have to retype it) and give a link to a human editor they can appeal to if their editing wasn't vandalism.

Meet Tawkerbot2 - it already does it and more :) -- Tawker 20:58, 26 April 2006 (UTC)
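
Purely to illustrate the rule as requested (this is not a description of how Tawkerbot2 works), the "page or more of text deleted" test could be as simple as a size comparison. The 3000-character figure for "a page" is an assumption picked for the example.

```python
PAGE_OF_TEXT = 3000  # assumed number of characters in "a page"; tune as needed


def looks_like_mass_deletion(old_text, new_text, threshold=PAGE_OF_TEXT):
    """True if the edit removed at least `threshold` characters overall."""
    return len(old_text) - len(new_text) >= threshold
```

A size check alone will also flag legitimate trims and merges, which is why the request includes an appeal route to a human editor.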

Inappropriate terminology bot?

Hi

There seem to be about 145 articles (Google search terms USS + Splashed) on US Navy ships of the Second World War that use the terms "splash", "splashing" or "splashed" as euphemisms for the shooting down by US forces of aircraft flown specifically by Japanese pilots, or crashes involving such aircraft. The wording appears to have been taken verbatim from the Dictionary of American Naval Fighting Ships. It could be argued that this terminology makes light of the deaths of the young men involved. If this argument is accepted, then is this something a bot could be used to fix?

PS My comment implies no endorsement of the Japanese military during WW2. I had a grand uncle who served in the Pacific. --Sf 16:19, 28 April 2006 (UTC)

I think there is a general consensus to avoid using bots for POV related issues. Even if the code being used had a 0% rate of error (which seems improbable in this case), I'd still recommend manual edits. — Apr. 29, '06 [09:19] <freakofnurxture|talk>
I also oppose this per freakofnurture due to POV issues, also even if a consensus is reached regarding this it's a job that would be better served being done by hand where a person can review each edit for accuracy rather than a bot. Pegasus1138Talk | Contribs | Email ---- 22:27, 1 May 2006 (UTC)