Wikipedia:Bots/Requests for approval/Yapperbot 4
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Naypta (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 22:22, Sunday, June 21, 2020 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): Golang
Source code available: https://github.com/mashedkeyboard/yapperbot-pruner
Function overview: Removes inactive users from lists (WikiProject membership, FRS, etc) and notifies them. Also renames users who have undergone a rename, and removes indefinitely-blocked users without notifying them.
Links to relevant discussions (where appropriate): WP:VPT#Is there a tool that...
Edit period(s): Weekly
Estimated number of pages affected: As many pages as transclude the configuration, plus user talk pages of removed users
Exclusion compliant (Yes/No): Yes; note that it is exclusion-compliant on user talk pages, but not on the actual list pages that its configuration is transcluded into. If a page itself wishes to opt-out of the pruning, it just needs to have the configuration template removed.
Already has a bot flag (Yes/No): Yes
Function details: Detects users listed on a signup list, membership list or other page, such as WP:FRS or a WikiProject members list, and periodically prunes those users according to a configuration. Configuration details are available on the setup page for it, but its on-wiki configuration basically works the same way as that of archiving bots: it tracks transclusions of a template with "phantom parameters", which the bot then detects and uses.
It'll remove inactive users after a timeframe since their last edit which is set in the on-page configuration, and will remove indefinitely-blocked users automatically after a grace period (defaults to 2 months, but is configurable on the page). It will also rename users who have undergone a rename.
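As a rough illustration of the rule described above, here is a minimal sketch in Go. It is not taken from the bot's source: the type names, field names and the `decide` function are all hypothetical, and the 2-month default grace period is approximated as 60 days.

```go
package main

import (
	"fmt"
	"time"
)

// Action is what this sketch decides to do with one list entry.
type Action int

const (
	Keep            Action = iota
	RemoveAndNotify        // inactive: remove and leave a talk page message
	RemoveSilently         // indefinitely blocked past the grace period: remove, no message
)

var actionNames = [...]string{"keep", "remove and notify", "remove silently"}

// PageConfig holds hypothetical per-page settings; these are not the
// bot's real template parameters.
type PageConfig struct {
	InactivityLimit  time.Duration
	BlockGracePeriod time.Duration
}

// ListedUser is a hypothetical summary of what would be looked up per entry.
type ListedUser struct {
	LastEdit     time.Time
	IndefBlocked bool
	BlockedAt    time.Time
}

// days converts a whole number of days into a time.Duration.
func days(n int) time.Duration { return time.Duration(n) * 24 * time.Hour }

// date builds a UTC midnight timestamp for the given calendar day.
func date(year, month, day int) time.Time {
	return time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
}

// decide applies the block rule first, then the inactivity rule.
func decide(u ListedUser, cfg PageConfig, now time.Time) Action {
	if u.IndefBlocked && now.Sub(u.BlockedAt) >= cfg.BlockGracePeriod {
		return RemoveSilently
	}
	if now.Sub(u.LastEdit) >= cfg.InactivityLimit {
		return RemoveAndNotify
	}
	return Keep
}

func main() {
	cfg := PageConfig{InactivityLimit: days(365), BlockGracePeriod: days(60)}
	now := date(2020, 7, 1)
	inactive := ListedUser{LastEdit: date(2018, 1, 1)}
	fmt.Println(actionNames[decide(inactive, cfg, now)])
}
```

Checking the block first means a long-blocked user is dropped without a message even if they would also count as inactive, which matches the "without notifying them" wording in the function overview.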
It currently supports four formats of user list, all of which correspond to regexes that are (a recurring theme among my bot tasks; say it with me) stored on-wiki!, so they can be changed and improved easily in the future if template parameters change, or if people want to use different formats.
Discussion
- Looks like a good idea to me. The only thing I would reconsider is the instant removal of indeffed users: we should probably give them some time to appeal (perhaps 1-3 months?). Best, --Mdaniels5757 (talk) 15:25, 22 June 2020 (UTC)[reply]
- @Mdaniels5757: This is a good point, and is a pretty easy change to the SQL it uses internally to look up blocks - gets out IDE - let me sort that :) Naypta ☺ | ✉ talk page | 15:46, 22 June 2020 (UTC)[reply]
- Done, see code change and updated documentation. The grace period is configurable per-page, with a default of 2 months. Naypta ☺ | ✉ talk page | 17:05, 22 June 2020 (UTC)[reply]
- Going to put a ping to MusikAnimal for their thoughts on the implementation, since they run a similar bot (though in a different scope). Primefac (talk) 18:38, 22 June 2020 (UTC)[reply]
- I have long thought about expanding my bot to work for any list. Glad to see someone is tackling this! The mechanics seem well thought out. I like that it goes by a template similar to archiving bots. Not only that, but the code is written in Go! Very cool :) Feel free to create Category:Wikipedia bots with Go source code published and put your bot in it.
- It would seem this bot is fundamentally different than the way AWBListMan works, but I'll share some pain points I ran into. The main challenge was keeping track of who's been notified, and when. While it's rare, sometimes Toolforge will hiccup or even the wiki goes into read-only mode and the task breaks halfway through. For this reason my bot kept a local cache of usernames and when they were notified. In your case you may ultimately be processing many more thousands of users than my bot ever did, so I would give thought into data persistence if you think you need it. You could of course simply check when the bot last edited the user's talk page on each run, or something along those lines, but in my case all the queries combined amounted to an already very slow task so I tried to find ways to speed it up wherever I could. Scalability may be more important in your case.
- I also seem to recall issues with checking for renames, but looking at my code I don't see anything particularly complicated about it. I went by the `metawiki_p.logging` table [1], which may be more foolproof than checking the redirect table here on enwiki (e.g. it's possible the user doesn't have a talk page yet, or they chose to redirect it to a different place in their userspace, who knows...)
- Hope this helps! I am not Go-savvy so I did not do a thorough code review, but at quick glance it seems well thought out. Thanks again for taking this on, — MusikAnimal talk 20:16, 22 June 2020 (UTC)[reply]
- Oh sorry, I think I misread. Your task doesn't wait for the user to resume activity during a 1 week grace period before removing them from the list. So my points about that can be ignored! Removing someone from a WikiProject list or the like isn't that big of a deal since they can just add themselves back, as opposed to the AWB list which can only be edited by admins. — MusikAnimal talk 20:22, 22 June 2020 (UTC)[reply]
- Heh, you pre-empted my message by about twenty minutes - teaches me to go off and do something, then reply without refreshing! Naypta ☺ | ✉ talk page | 20:40, 22 June 2020 (UTC)[reply]
- Cheers MusikAnimal! I'm actually relatively new to Go, but have a lot of experience with Ruby, so looking through your code is always nice
- AWBListMan has a slightly more complex job, as it needs to warn users in advance of their removal, as well as just removing them. At the moment, Pruner doesn't warn users of their removal before it does it, much less require that they have been warned, on the basis that, unlike the AWB list, I'm not expecting it to be immediately used on lists that require applications to be re-added. If anyone requests that feature in the future, I'd be happy to look into it, but for now it's not within the specification for the bot. That means it doesn't need persistent data storage as much; the worst that can possibly happen is a few inactive users not being notified of their removal, from a list that they would be able to add themselves back to on their return. Indeed, this is the way which manual pruning often works anyway; it's rare to see someone manually pruning a WikiProject membership list and manually notifying each user they remove for being inactive.
- I did consider the idea of using `metawiki_p.logging`, but I decided against that for performance reasons, mostly - which may well be the problems you're recalling. It would either require switching database to the metawiki database part way through, which would be a bit of a pain as Golang uses DSNs for connections, meaning I'd probably have to set up an entire new DB connection for it, or using the ridiculously expensive cross-database querying (to give an idea, my current redirect method finishes the query in 0.03s on a recently-renamed user, naming no names, whereas it takes 14.26s to finish on the same user using the cross-DB query!), which both present issues. That being said, it's an excellent point that the user might choose to redirect to a different place in their userspace - I'll chop off anything after a slash! I'm already excluding redirects that go anywhere other than userspace, as you may have seen - those users would get a standard inactivity message left on their old talk, or if that redirects someplace different, wherever it happens to redirect to.
- On a personal note, the MusikBots are a "wiki-household name" to me, and I've always been thoroughly impressed by your work with them :) Naypta ☺ | ✉ talk page | 20:40, 22 June 2020 (UTC)[reply]
- Thanks! Though I'm not going to pretend my Ruby code is any bit pretty :)
- Currently, you should be able to access any wiki's db through the same connection, though this is surely not future-proof. If that causes performance problems, especially of that severity, then I concur your approach is probably the sanest. The likelihood that someone doesn't have a talk page (red link) but would be on some list seems slim anyway. — MusikAnimal talk 21:08, 22 June 2020 (UTC)[reply]
- Question as someone who has her old name on some projects, will this automatically update to my "new" (probably about a decade now) name? I'd prefer not to have that for privacy reasons. Thanks! StarM 19:41, 22 June 2020 (UTC)[reply]
- @Star Mississippi: It uses username redirects to detect renaming, so if your old username redirects to your new one, it will update the name. If it doesn't, it won't update. It's effectively just flattening what MassMessage would already do there, and it's only affecting pages which explicitly opt in to it. Naypta ☺ | ✉ talk page | 19:44, 22 June 2020 (UTC)[reply]
- Thanks @Naypta:. It doesn't redirect, so we're all good and my concern is alleviated. I just wasn't sure if the project might opt in as this generally sounds like a good tool. Thanks for quick response. StarM 01:11, 23 June 2020 (UTC)[reply]
- Some users redirect or soft redirect their user page to some other page, most commonly their user talkpage but any other destination is possible in theory. Does the bot cope with this? Thryduulf (talk) 21:36, 27 June 2020 (UTC)[reply]
- @Thryduulf: Sure - to be more specific, it detects redirects from user talk pages of renamed users to any other page in user talk namespace. That is to say, if User talk:Example redirected to User talk:Example2, User talk:Example2/foo, User talk:Example2/foo/bar, etc, the bot would consider that `Example` had been renamed to `Example2`. If the user talk page was to redirect to any other namespace, the bot simply would not detect the rename, and would leave a message on whatever page the user talk page happened to redirect to, saying that the user had been removed for inactivity. The only situation in which a user could ever be incorrectly renamed would be if their user talk page redirected to the talk page of another user who was not them, which I would argue violates WP:TPG anyway as impersonating another editor. Naypta ☺ | ✉ talk page | 21:41, 27 June 2020 (UTC)[reply]
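The redirect rule in this reply can be sketched roughly as follows. This is only an illustration under stated assumptions, not the bot's actual code: the function name `renamedTo` is hypothetical, and the redirect target is assumed to arrive already split into a namespace number and a title without the namespace prefix (3 is MediaWiki's "User talk" namespace).

```go
package main

import (
	"fmt"
	"strings"
)

// nsUserTalk is MediaWiki's namespace number for "User talk".
const nsUserTalk = 3

// renamedTo returns the username implied by a user talk redirect target.
// It reports false when the target is outside the user talk namespace,
// in which case no rename is detected.
func renamedTo(targetNamespace int, targetTitle string) (string, bool) {
	if targetNamespace != nsUserTalk {
		return "", false
	}
	// Chop off any subpage part: "Example2/foo/bar" becomes "Example2".
	name := targetTitle
	if i := strings.IndexByte(name, '/'); i >= 0 {
		name = name[:i]
	}
	if name == "" {
		return "", false
	}
	return name, true
}

func main() {
	for _, title := range []string{"Example2", "Example2/foo", "Example2/foo/bar"} {
		name, ok := renamedTo(nsUserTalk, title)
		fmt.Println(title, "->", name, ok)
	}
}
```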
- {{BotTrial}} Whichever comes first. Primefac (talk) 18:27, 30 June 2020 (UTC)[reply]
- Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. 1 full run, per User_talk:Cyphoidbomb#Yapperbot_Pruner_-_trial_invite. Primefac (talk) 22:10, 30 June 2020 (UTC)[reply]
- Trial note: I observed an initial error where renamed users were not being correctly renamed, but rather having their names lopped off their listing! This has now been corrected, and you can see the bot working correctly here. Naypta ☺ | ✉ talk page | 20:38, 30 June 2020 (UTC)[reply]
- Trial note 2: Thanks to me being a bit of a klutz, I forgot to remove the editlimit initially when doing the runthrough of the WP India page. To try and make sure that the bot's run was as demonstrative as possible, I partially reverted the first pruning, which was edit-limited out of sending messages to users. This works pretty well, because the page is mostly in alphabetical order; however, in some cases, some users may have received two messages. Rest assured this is a consequence of me being a klutz, and not a bot error; this would not happen in normal operation, when the bot doesn't have an edit limit set. Sorry for the trouble! Naypta ☺ | ✉ talk page | 22:34, 30 June 2020 (UTC)[reply]
- Trial note 3: I noticed that a couple of users had not had their redirects picked up when they should have done. The problem here was that MediaWiki stores user records with spaces instead of underscores, but then stores page records with underscores instead of spaces, so users who have spaces in their names, and were on the list, and had also undergone a global rename, did not have their redirects picked up correctly. This affects a very small proportion of the users, but nonetheless, it is now fixed. Naypta ☺ | ✉ talk page | 22:46, 30 June 2020 (UTC)[reply]
- Update: Upon further investigation, 11 users were affected by this bug, all of whom I have now reinstated to the list. Naypta ☺ | ✉ talk page | 18:50, 1 July 2020 (UTC)[reply]
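The mismatch behind trial note 3 (usernames stored with spaces, page titles stored with underscores) comes down to normalising one form into the other before comparing. A minimal sketch of that idea, with hypothetical helper names that are not taken from the bot's source:

```go
package main

import (
	"fmt"
	"strings"
)

// toPageTitle converts a username as stored in user records ("Some User")
// into the form used for page titles ("Some_User").
func toPageTitle(username string) string {
	return strings.ReplaceAll(username, " ", "_")
}

// toUsername converts a page title ("Some_User") back into the form used
// for user records ("Some User").
func toUsername(pageTitle string) string {
	return strings.ReplaceAll(pageTitle, "_", " ")
}

func main() {
	listed := "Some User" // name as it appears on the list
	fmt.Println(toPageTitle(listed))
	fmt.Println(toUsername(toPageTitle(listed)) == listed)
}
```

Without the conversion, a lookup keyed on "Some User" would find no page titled "Some_User", which is consistent with the note that only users with spaces in their names were affected.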
- Trial complete. With thanks to Primefac for allowing an extension of the trial to run over the page, and Cyphoidbomb for adding the configuration template to that page! With the exception of the above mentioned issues, the bot run seems to have been successful. Naypta ☺ | ✉ talk page | 22:50, 30 June 2020 (UTC)[reply]
- Could you please post a link(s) to the relevant edits/sequence of edits? Primefac (talk) 17:38, 5 July 2020 (UTC) (please do not ping on reply)[reply]
- Of course! Here it is - you can see where I restarted the bot, as well, because it leaves another edit summary on the WPIN page. The only duplicate messages are sent after that point, as a consequence of that, and not of the bot itself. Naypta ☺ | ✉ talk page | 17:48, 5 July 2020 (UTC)[reply]
- The signature regex in formats.json is wrong. It should be `\d{1,2}` instead of `\d\d` to handle single-digit days. SD0001 (talk) 13:44, 2 July 2020 (UTC)[reply]
- @SD0001: You are, of course, absolutely right - thanks very much for pointing it out! Fixed Naypta ☺ | ✉ talk page | 13:47, 2 July 2020 (UTC)[reply]
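To show why the day quantifier matters, here is a small Go sketch comparing the two forms against signature timestamps. The surrounding pattern is a simplified stand-in written for this illustration; it is not the regex stored in formats.json.

```go
package main

import (
	"fmt"
	"regexp"
)

// Both patterns are hypothetical simplifications of a signature timestamp;
// they differ only in how many digits the day of the month may have.
var (
	strictDay   = regexp.MustCompile(`\d{2}:\d{2}, \d\d \w+ \d{4} \(UTC\)`)
	flexibleDay = regexp.MustCompile(`\d{2}:\d{2}, \d{1,2} \w+ \d{4} \(UTC\)`)
)

func main() {
	timestamps := []string{
		"13:44, 2 July 2020 (UTC)",  // single-digit day
		"22:10, 30 June 2020 (UTC)", // two-digit day
	}
	for _, ts := range timestamps {
		fmt.Printf("%q strict=%v flexible=%v\n",
			ts, strictDay.MatchString(ts), flexibleDay.MatchString(ts))
	}
}
```

MediaWiki signatures do not zero-pad the day, so a pattern requiring two digits would miss any signature from the 1st to the 9th of a month.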
- Approved. Minor changes to coding (i.e. errors like those mentioned above) can be made without discussion, but please discuss/request any major changes if they are needed. Primefac (talk) 21:14, 6 July 2020 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.