
Wikipedia talk:AutoWikiBrowser/Typos/Archive 5

From Wikipedia, the free encyclopedia

Improvements for WPCleaner?

Hi team! While I enjoy using WPCleaner, some articles take a long time to load and to save. Today I looked at the log after processing "List of television channels in Vietnam" and found many messages about the typo list. Can we improve these rules so that they run faster?

00:01:16.820 [Thread-59] INFO  PERF - Slow regular expression (List of television channels in Vietnam): Typo AWB Commercially(797922ms):(?<![a-z]+-)\b([cC])ommerciall?y-(?=[a-z]+(?:ble\b|ed\b|ful\b))(?![a-z]+-)
00:01:35.562 [Thread-60] INFO  PERF - Slow regular expression (List of television channels in Vietnam): Typo AWB premiere(94110ms):\b(?<=(?:film|movie)\s+)premier\b
18:02:14.403 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 34
\ball[-–]time(?<=\sof\s+all[-–]time)(?=(?:[,\.\)]|\s+(?:in|by)\s))
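For context on the warning above: the message comes from Java's regex engine, which wants every look-behind to have a maximum length it can work out, and `\s+` inside `(?<=...)` has none. Below is a minimal sketch, not the actual WPCleaner code, of one possible workaround: cap the quantifier inside the look-behind. The class name `BoundedLookBehind`, its helper methods, and the cap of 10 whitespace characters are all assumptions made for illustration. Whether the original rule is rejected can depend on the Java version, so the sketch reports that rather than assuming it.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; not part of WPCleaner or AWB.
final class BoundedLookBehind {

    // Rule text copied from the warning above (\u2013 is the en dash).
    static final String ORIGINAL =
        "\\ball[-\u2013]time(?<=\\sof\\s+all[-\u2013]time)(?=(?:[,\\.\\)]|\\s+(?:in|by)\\s))";

    // Assumed rewrite: \s+ inside the look-behind becomes \s{1,10}.
    // The cap of 10 is arbitrary; longer whitespace runs would no longer match.
    static final String BOUNDED =
        "\\ball[-\u2013]time(?<=\\sof\\s{1,10}all[-\u2013]time)(?=(?:[,\\.\\)]|\\s+(?:in|by)\\s))";

    // Returns true if java.util.regex accepts the pattern on this JVM.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Returns true if the pattern matches anywhere in the text.
    static boolean finds(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // The first result may differ between Java versions.
        System.out.println("original compiles: " + compiles(ORIGINAL));
        System.out.println("bounded compiles:  " + compiles(BOUNDED));
        System.out.println("bounded matches:   "
            + finds(BOUNDED, "the best of all-time, they said"));
    }
}
```

The same idea would presumably apply to the other rules that use `\s+\w+` in a look-behind, for example `\s{1,10}\w{1,40}`, at the cost of missing unusually long words or whitespace runs. Since AWB itself runs on .NET, where unbounded look-behind is accepted, any such change is a trade-off for the maintainers of the typo list to weigh.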
18:02:14.404 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 64
\bDep(artment|uty)\b(?<=(?:(?:\bAs|The|\s[a-z]+|[-–;,])\s+|\()\w+)\s+[cC]hair(m[ae]n|persons?|wom[ae]n)?\b(?=(?:\s+of\s+the(?:\s+[aA]dvisory)?\s+[bB]oard\b|\s+(?:a(?:fter|nd|t)|b(?:etween|y)|during|f(?:or|rom)|i[ns]|on|since|to|until|w(?:as|ith))\s|[,;\.\)])|\s+[a-z]+[,;\.\)]|\s+[io]n\s|\s+of\s+the\s+[a-z]|(?:\s+[a-z]+){3,}|['’´]s\s+[a-z])
                                                                ^
18:02:14.404 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 67
\b(\d+)[-\s]+and[-\s]+a[-\s]+half(?<=\s\d+[-\s]+and[-\s]+a[-\s]+half)[-\s]+(?=(?:centuries|d(?:ays|ecades)|feet|hours|m(?:i(?:l(?:es|lennia)|nutes)|onths)|s(?:e(?:asons|conds|mesters)|tars)|points|weeks|years)\b)
                                                                   ^
18:02:14.404 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 356
\bpurpose\s+built(?=\s+(?:arenas?|buildings?|c(?:ampus|lubhouse|om(?:munity|plex))|depot|f(?:ac(?:ilit(?:ies|y)|tor(?:ies|y))|o(?:otball|r))|g(?:a(?:llery|rage)|round)|location|m(?:osques?|useum)|new|offices?|premises|road|s(?:chool|et|ite|t(?:a(?:dium|ge)|ore|rip|udios?))|t(?:heatre|raining)|unit)\b)|purpose(?<=(?:,|\b(?:[aA]|first|its|new|[tT]he))\s+\w+)\s+built\b
                                                                                                                                                                                                                                                                                                                                                                    ^
18:02:14.406 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 53
\bDean(?<=[,\.]\s+\w+|\s[a-z]+\s+\w+|\bThe\s+\w+|\(\w+)(?<!Office\s[a-z\s]+\w+)\s+[oO]f\s+[sS](ciences?|tud(?:ents|ies))(?=(?:[,\.\)]|\s+(?:a(?:nd|t)|[a-z]{4,20}|for|in)))
                                                     ^
18:02:14.406 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 32
[-\s]+grand\b(?<=\bgreat[-\s]+\w+)[-\s]+(aunts?|child(?:ren)?|daughters?|fathers?|kids?|mothers?|n(?:ephews?|i(?:blings?|eces?))|parents?|sons?|uncles?)\b
                                ^
18:02:14.406 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 45
\bBoard(?<=[;,]\s+\w+|\s[a-z]+\s+\w+|The\s+\w+)\s+[mM]ember(s)?(?=[,;\)\.]|\s+[a-z])
                                             ^
18:02:14.406 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 33
\bBoard(?<=(?:[;,]|\s[a-z]+)\s+\w+)\s+[cC](hair(?:man|person|woman)?)(?=(?:[,;\)\.]|\s+[a-z]))
                                 ^
18:02:14.406 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 33
\bBoard(?<=(?:[;,]|\s[a-z]+)\s+\w+)\s+[pP]resident(?=[,;\)\.]|\s+[a-z])
                                 ^
18:02:14.406 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 24
\bmid\b(?<=\b[tT]he\s+\w+)[\s–]+(20\d0|1[4-9]\d0)['’;´???‘???`]?s\b
                        ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 70
\bAcademy\b(?<=\b(?:[aA]n?|[iI]ts|new|of|same|[tT]h(?:e|is)|\w+,)\s+\w+)(?!(?:\s+(?:for|o[fn])(?:\s+the)?)?\s+[\dA-Z])(?<![\w,]\s+An\s+\w+)
                                                                      ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 71
\beaster\b(?<=\b(?:[fF]ar[-–??—?\s]+easter\b|[nN]ear[-–??—?\s]+easter\b))
                                                                       ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 183
\bD(e(?:puty|sign|velopment)|istrict)(?<=(?:\(|(?:[,;]|[\s\(](?:[a-z]+|A[ns]|Current|Former|Its|The))\s+)\w+)\s+[dD]irector(s)?(?=\s+[a-z\(]|[,;\.\)])(?<!\b[A-Z][a-z]+\s+of\s+\w+\s+\w+)
                                                                                                                                                                                       ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 207
\bA(cting|dministrative|rtistic|ss(?:istant|ociate)|thletic)(?<=(?:\(|(?:[,;]|[\s\(](?:[a-z]+|A[ns]|Current|Former|Its|The))\s+)\w+)\s+[dD]irector(s)?(?=\s+[a-z\(]|[,;:\.\)])(?<!\b[A-Z][a-z]+\s+of\s+\w+\s+\w+)
                                                                                                                                                                                                               ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 191
\bM(a(?:naging|rketing)|edical|usic(?:al)?)(?<=(?:\(|(?:[,;]|[\s\(](?:[a-z]+|A[ns]?|Current|Former|Its|The))\s+)\w+)\s+[dD]irector(s)?(?=\s+[a-z\(]|[,;:\.\)])(?<!\b[A-Z][a-z]+\s+of\s+\w+\s+\w+)
                                                                                                                                                                                               ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 192
\bE(d(?:itorial|ucation)|ngineering|xecutive)(?<=(?:\(|(?:[,;]|[\s\(](?:[a-z]+|A[ns]|Current|Former|Its|The))\s+)\w+)\s+[dD]irector(s)?(?=\s+[a-z\(]|[,;:\.\)])(?<!\b[A-Z][a-z]+\s+of\s+\w+\s+\w+)
                                                                                                                                                                                                ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 64
\bD(epartment(?:['’´???‘???`]s)?)(?<=(?:\bThe|\s[a-z]+)\s+[\w'’]+)(?=\s+[a-z]+\s+[a-z\d]|[,;\.])(?!\s+of\s)
                                                                ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 67
\bCo\b(?<=(?:\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+|,\s+|\()\w+)-[dD]irector(s)?\b(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                                   ^
18:02:14.407 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 69
\bVice\b(?<=(?:\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+|,\s+|\()\w+)-[dD]irector(s)?\b(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                                     ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 42
\bDirector(s)?\b(?<=\b(?:[cC]o|[vV]ice)-\w+)
                                          ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 69
\bVice\b(?<=(?:\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+|,\s+|\()\w+)-[pP]resident(s)?\b(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                                     ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 43
\bPresident(s)?\b(?<=\b(?:[cC]o|[vV]ice)-\w+)
                                           ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 67
\bCo\b(?<=(?:\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+|,\s+|\()\w+)-[pP]resident(s)?\b(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                                   ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 67
\bCo\b(?<=(?:\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+|,\s+|\()\w+)-[fF]ounder(s)?\b(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                                   ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 29
\bFounder(s)?\b(?<=\b[cC]o-\w+)
                             ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 57
\bVice\b(?<=\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+\w+)-[cC]hair(m[ae]n|persons?|wom[ae]n)?(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                         ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 62
\bChair(m[ae]n|persons?|wom[ae]n)?\b(?<=\b(?:[cC]o|[vV]ice)-\w+)
                                                              ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 55
\bCo\b(?<=\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+\w+)-[cC]hair(m[ae]n|persons?|wom[ae]n)?(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                       ^
18:02:14.408 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 61
\bAdvisory\b(?<=\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+\w+)\s+[bB]oard(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                             ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 64
\bSupervisory\b(?<=\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+\w+)\s+[bB]oard(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                                ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 63
\bManagement\b(?<=\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+\w+)\s+[bB]oard(?=(?:[,;\.\)]|\s+[a-z\(]))(?!\s+of\s+Cabinet)
                                                               ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 73
\bE(ditorial|xecutive)\b(?<=\b(?:[a-z\d]+|Its|The|\w+['’´???‘???`]s)\s+\w+)\s+[bB]oard(?=(?:[,;\.\)]|\s+[a-z\(]))
                                                                         ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 76
\bForeword(?<=\b(?:A|a(?:nd)?|[hH](?:er|is)|new|[tT]he|[wW]ith|\w+[,;])\s+\w+)(?=(?:[,;:\)\.]|\s+(?:by|of|to|\w+,)\s))
                                                                            ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 70
\bCollege\b(?<=\b(?:[aA]n?|[iI]ts|new|of|same|[tT]h(?:e|is)|\w+,)\s+\w+)(?!(?:\s+(?:de|for|o[fn])(?:\s+the)?)?\s+[\dA-Z])(?<![\w,]\s+A\s+\w+)
                                                                      ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 73
\bUniversity\b(?<=\b(?:[aA]n?|[iI]ts|new|of|same|[tT]h(?:e|is)|\w+,)\s+\w+)(?!(?:\s+(?:at|for|o[fn])(?:\s+the)?)?\s+[\dA-Z?])(?<![\w,]\s+A\s+\w+)
                                                                         ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 46
\bCaptain(?<=\s(?:as?\s+\w+|its\s+\w+|to\s+\w+))(?=(?:[,;\.\)—])|\s+(?:and\s|in\s|–\s))
                                              ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 35
\bChief(?<=(?:\s[a-z]+|[-–;,])\s+\w+)\s+[eE](ntertainment|quipment|thics|x(?:ecutive|perimental))\s+[oO]fficer\b
                                   ^
18:02:14.409 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 80
\bChair(m[ae]n|persons?|wom[ae]n)?\b(?<=(?:(?:\bAs|The|\s[a-z]+|[-–;,])\s+|\()\w+)(?=(?:\s+of\s+the(?:\s+[aA]dvisory)?\s+[bB]oard\b|\s+(?:a(?:fter|nd|t)|b(?:etween|y)|during|f(?:or|rom)|i[ns]|on|since|to|until|w(?:as|ith))\s|[,;\.\)])|\s+[a-z]+[,;\.\)]|\s+[io]n\s|\s+of\s+the\s+[a-z]|(?:\s+[a-z]+){3,}|['’´]s\s+[a-z])
                                                                                ^
18:02:14.410 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 62
\bBoard(?<=\b(?:[aA]|[iI]ts|new|[tT]he|\w+['’´???‘???`]s)\s+\w+)\s+of\s+[tT]rustees(?=(?:[;\.\)]|\s+(?:a(?:nd|t)|c(?:haired|o(?:mposed|nsisting))|elected|for|i[ns]|made|t(?:hat|o)|w(?:as|h(?:ich|o))|with|–)\s))
                                                              ^
18:02:14.410 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 62
\bBoard(?<=\b(?:[aA]|[iI]ts|new|[tT]he|\w+['’´???‘???`]s)\s+\w+)\s+of\s+[dD]irectors(?=(?:[;\.\)]|\s+(?:a(?:nd|t)|c(?:haired|o(?:mposed|nsisting))|elected|for|i[ns]|made|t(?:hat|o)|w(?:as|h(?:ich|o)|ith)|–)\s))
                                                              ^
18:02:14.410 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 34
\b([cC])uidad\b(?<!Fabrice\s+Cuidad)
                                  ^
18:02:14.410 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 55
\bH(o(?:spital|tel))(?<=\b(?:[iI]ts|[tT]h(?:e|is))\s+\w+)(?=(?:[,\;\.\)\:]|['’´???‘???`]s\s|\s+[\(–]|\s+(?:a(?:fter|lso|n(?:d\s+[a-z]+\s+[a-z]+)?|re|[st])|b(?:ecame|u(?:ilding|rned)|y)|c(?:an|losed|omplex)|f(?:or|rom)|grounds|ha[ds]|[io]n\s+[a-z\d]+|i(?:ncluding|s)|manager|o(?:ffers|n|pened)|provides|receive[ds]|site|to|until|w(?:as|ere|he(?:n|re))|w(?:hi(?:ch|le)|i(?:ll|th)))\b))(?!\s+for(?:\s(?:an?\b|the\b))?\s+[A-Z])
                                                       ^
18:02:14.410 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 162
\bC(athedral|ent(?:er|re)|hapel|ity|lub|o(?:llege|m(?:mi(?:ssion|ttee)|pany))|on(?:sulate|vention)|o(?:rporation|un(?:cil|ty)))(?<=\b(?:[iI]ts|[tT]h(?:e|is))\s+\w+)(?=(?:[,\;\.\)\:]|['’´???‘???`]s\s|\s+[\(–]|\s+(?:a(?:cquired|fter|lso|n(?:d\s+[a-z]+\s+[a-z]+|nounced)?|re|s?)|b(?:e(?:fore|gan)|ut|y)|c(?:an|o(?:nducts|uld)|urrently)|during|established|f(?:or|rom)|h(?:a[ds]|osts)|is?|launched|ma(?:de|intains)|now|o(?:ffers|n\s+[a-z\d]+|perates|r)|receive[ds]|s(?:hould|upports)|t(?:he|o)|until|w(?:as|ere|hile|i(?:ll|th)|o(?:rks|uld)))\b))(?!\s+(?:for|o[fn])(?:\s(?:an?\b|the\b))?\s+[A-Z])
                                                                                                                                                                  ^
18:02:14.410 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 28
\bInstitute(?<=\b[tT]he\s+\w+)(?=(?:[,\;\.\)]|'s\s|\s+\(|\s+(?:a(?:fter|lso)|before|c(?:onducts|urrently)|during|from|h(?:as|osts)|is?|maintains|o(?:ffers|n\s+[a-z\d]+|perates)|supports|to|w(?:as|i(?:ll|th)|orks))\b))
                            ^
18:02:14.410 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 67
\bfull(?<=\b(?:[aA]|f(?:i(?:fth|rst)|ourth)|only|second|third)\s+\w+)\s+length\b
                                                                   ^
18:02:14.410 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 217
\bV(ice)\b(?<=\s+(?:a(?:cting|ppointed|s?)|be(?:en|c(?:ame|om(?:e|ing)))?|Democratic|elected|for(?:mer)?|h(?:er|i[ms])|i(?:ncumbent|s|ts)|n(?:amed|ew)|Republican|s(?:erving|itting)|t(?:heir|o)|U\.?S\.?|was|\w+'s)\s+\w+)(?<![A-Z][a-z]+\s+for\s+\w+)([-\s]+)[pP](residen(?:cy|t(?:ial|sial)?))(?=(?:[,\.;\)]|\s+[a-z]+))(?!\s+of\s)
                                                                                                                                                                                                                         ^
18:02:14.411 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 238
\bP(residen(?:cy|t(?:ial|sial)?))(?<=\s(?:a(?:cting|ppointed|s)|be(?:en|c(?:ame|om(?:e|ing)))?|Democratic|elected|for(?:mer)?|h(?:er|i[ms])|i(?:ncumbent|s|ts)|n(?:amed|ew)|Republican|s(?:erving|itting)|t(?:heir|o)|U\.?S\.?|was|\w+'s)\s+\w+)(?<![A-Z][a-z]+\s+for\s+\w+)(?=(?:[,\.;\)]|\s+[a-z]+))(?!\s+of\s)(?!\s+and\s+Vice[-–??—?\s]+President\s+of\s)
                                                                                                                                                                                                                                              ^
18:02:14.411 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 15
\b(\d+)(?<=\s\d+)[-—](\d+)(?=\s+(?:a(?:cademic|dvantage|gainst|t|way)|career|d(?:ef(?:eat|icit)|raw)|edge|final|game|home|in|l(?:ead|oss)|mark|o(?:n|ver(?:time)?)|r(?:ecord|out|un)|sc(?:hool\s+year|ore(?:line)?)|s(?:e(?:asons?|ries)|h(?:ootout|utout))|s(?:plit|tart|weep)|t(?:erm|ie|o|riumph)|upset|v(?:ictory|ote)|wi(?:n|th))\b)(?<!\b7\d7-\d+)
               ^
18:02:14.411 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 15
\b(\d+)(?<=\s\d+)[-—](\d+)(?=[,\.;\n\)])(?<!\b(?:Boeing|Columbia|Dash|LCCN|I(?:EC|NCITS|S[BS]N|SO(?:/IEC)?)|ANSI(?:/VITA)?|FIPS|N(?:ACA|[oO]\.?:?)|[nN]umber:?|#:?|P(?:art|ublication)|S(?:ection|/[nN]:?)|s/[nN]:?|VITA|Widow)\s+\d+[-—]\d+)(?<!\b(?:\d(?:[-—][02-9]\d|\d[-—][02-9]\d\d)|\1[-—]\1\b|7\d7-\d+))
               ^
18:02:14.411 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 41
\b[nN]orth(?<=(?:,|\b(?:[a-z]+|The))\s+\w+)(east|west)?[-/][sS]outh(east|west)?\b(?!\s+[A-Z])
                                         ^
18:02:14.411 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 41
\b[sS]outh(?<=(?:,|\b(?:[a-z]+|The))\s+\w+)(east|west)?[-/][nN]orth(east|west)?\b(?!\s+[A-Z])
                                         ^
18:02:14.411 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 40
\b[eE]ast(?<=(?:,|\b(?:[a-z]+|The))\s+\w+)-[wW]est\b(?!\s+[A-Z])
                                        ^
18:02:14.412 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 40
\b[wW]est(?<=(?:,|\b(?:[a-z]+|The))\s+\w+)-[eE]ast\b(?!\s+[A-Z])
                                        ^
18:02:14.413 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Illegal/unsupported escape sequence near index 1662
\b(?<=[\s\(]|\A)([aA])(?<=a|(?:[\.\n]\s\s?\s?|\A)A)\s\s?((?:[aA](?!\b|2\b|[aA]A?[aTA]?|b(?:ogado|rirse)|c(?:a(?:demiei|o)|ceptat?a?|estei|ordo|quis)|ddaswyd|ED|FN|f(?:ace|ecta)|j(?:outé|uns)|l(?:ba\b|calde|do\b|guien\b|ma)|LL|m(?:basadei|érica)|MD|[nN](?:\b|amnese|[dD]\b|daluc|G|ihila|tiga|ului)|OA|p(?:a(?:gar\b|recer\b)|ostar\b|robat\b)|quest|r(?:der\b|enys\b|matei\b|quitectura|te(?:\b|lor\b))|R[S\$]|s(?:\b|souvi)|t(?:\b|ahualpa\b|enuar|hair|lântida|riz|teint)|U[DS\$\£]|us(?:iàs|triei\b)|v(?:ançar|enida|ut\b)|WG|ZN)|[eE](?!\b|cologia|di(?:ção|l\b|tora\b)|gipto|GP|l(?:a\b|itei\b|las\b)|m(?:a\b|b(?:ajadora|oîté)|igracja|pezar\b)|n(?:core\sdu|ergia|f(?:lamm[eé]|rentar)|ga(?:gé|ñar)|loquecer|se[nñ]ar|tend(?:erse|u)\b|tra(?:da|[iî]n[eé]|r\b)|tre(?:na(?:dor|r))?|voyé)|qui(?:librista|p[ao])|RN|s(?:as?\b|c(?:a[dl]a|ola|u(?:char|ela|ltura|ridão))|fera|p(?:a(?:ldas|[nñ?])|erança))|st(?:a(?:\b|ciones|dos|r\b)|é|e(?:\b|ban)|o(?:\b|s\b)|ra(?:da|t[eé][gx]ia)|r(?:é[il]a|e(?:[il]a|llar)|uc?tura)|udia[nr])|TB\b|t(?:é|e(?:rna)?)?\b|[uU](?:[A-Za-z]{2}|\sde\b)|U[IR]|v(?:acuar|r(?:eilor|op))|w[abei]|x(?:ist[ée](?:ncia)?|p(?:ansão|eri[eê]ncia|osição|ressão))|xtranj)|h(?:aut[besu]|eir|o(?:rs\sd|ur(?:\b|[gs]|ly)))|[iI](?!\b|a(?:ij|[??]i)|DR\b|greja|[iI][iI]?[iI]?|l(?:\sraen\b|egal|ha\b)|LS|m(?:age[nm]\b|igração|magini\b)|n(?:\b|ceput|dia[’']?s\b|d(?:icação|ro\b|úst)|és|f(?:luência|orma?ní)|glat|icios|nei\b|quisição|s(?:t(?:ancias|itucí)|ulté)|t(?:e(?:gra(?:nte|rse)|ligência|r(?:preta|ven)[cç][aã]o)|imidade|ra\b)|v(?:asão|e(?:nté|stit)))|NR\b|QD\b|R(?:£|R\b)|s(?:\b|chia\b|la\b|te\b)|SK|ts?\b|u(?:bit(?:\b|-o\b)|d(?:ex\b|ice\b)|re\b)|[vx]\b|V(?:th|\b)|XC?\b|[\b\d])|M(?:D\b|VP\b)|[oO](?!\b|ax|b(?:a\b|?an|chodní|ra|tenu|?i|yv)|c(?:cidente|h(?:o|rany)|upat)|d(?:\b|e[c?]et\b)|este|f(?:\b|erecer)|ggi|hniv|ito\b|kol[íi]e?\b|l(?:ot\b|še\b|vidarte)|mnisciê|MR\b|nda\b|[nN](?:\s|[cçiIC][eaE]|[eE](?!g(?:\b|a\b|es|in)|i(?:da|[lr])|rous))|O\b|opa|p(?:éra|erador\b)|ra?\b|[rR](?:\b|a(?:[?s?]ului|z\b)|chest
r\b|d(?:em|inii)|fu\b|i(?:\b|lla))|S-9\b|s(?:asuna|curas|o(?:bnosti|na))|t(?:r[ao]\b|tobre)|u(?:\b|a[bcdglt]|ed|i|tro)|v(?:elha|iedo))|u(?!\b|[A-Z\dc?ek\:\.\-]|a(?:dim|h\b|in\b)|b[aiou]|d(?:ev\b|raw\b)|fo|g(?:a(?:li|nd)|x\b)|i(?:le\b|n|ro|tat\b)|jam|l(?:ak|u)|m(?:a\b|\b|?ní\b|r(?:ia|l))|n(?:(?:\s|a(?:ni|ry|s\b|\b|te\b))|d\b|e(?:\b|i\b|sco)|o[rs]?\b|s\b|uib?)|ni(?!d(?:[eol]|io)|gn|ll|m(?:ag|[bim]|p[aeloru])|n(?:au|[cd]|eb|[fghjk]|i[nt]|oc|[tnvsq])|r(?:ad|[kr]|on))|omo|p(?:azilas?|risin\b)|r[aeiolsuy]|s[aeiou]|s(?:b(?:net)?|d)\b|s\$|s(?:hape|t(?:ream|ed(?:es)?\b))|t(?:[aeiou]|r(?:anga|ic))|[vž]|yu\b|zs\b))[^\|\[\]\<\?\>\{\}\s]{0,29})(?<=\b(?:[\S\s]){1,49}(?<!\b(?:[aA](?:baten|c(?:ceso|o(?:mpañando|sa)|t)|cusa|d(?:herits|i[oó]s|misión|vanced)|eroporto|gus|menazan|[nñN][oO]|n(?:\b|dalucía|fibio|s\b)|prendiendo|?a|spirante|t(?:acó|ención)|u(?:menta|r|sf(?:\.|ührung)|torov)|xudar|y(?:údame|ud(?:ar|dó)))|[bB](?:a(?:rokiem|ttery)|enefician|ílé|usca(?:ndo)?)|[c?C](?:a(?:bellera|lle|mino|ntan?\b|r(?:retera|tas?))|a(?:sar|tegory\:?)|e(?:n(?:sura|tral?)|rcano)?|h(?:ama|lorophyll|romogranin?)|iclista|íny|lass|o\.|om(?:ienzan|p(?:any|o(?:sition|und)|r(?:ó|ometido))|unicações)?|o(?:n(?:certo|firma|oció|trat[aoó]|vocatoria)|sta)|u(?:arto|m)|yclosporine)|[dD](?:[áàâ?åeêèé]|a(?:lla)?|e(?:dicada|[nst]|nuncian|recho|seó|tienen|s(?:apareceu|p(?:edida|iden))|voción)|[iîìíòôóuùûú]|i(?:gas|le|recto|vision)|o(?:jmy|uble)|urante)|[eéEÉ](?:\b|cusa|insatzgruppe|jecutan?|l(?:e|le)|mpecé|n(?:frenta(?:rá)?|señ(?:ame|[óo])|t(?:onces|re(?:\b|gó|vista)))?|s(:quivel|t)?|t|x(?:itos|tradita(?:do)?))|[fF](?:a(?:cilitar|z)|e(?:menina|rmato)|i(?:ammanti|chó)|ormula|rente|u(?:[ií]|sil|tbol))|[gG](?:alega|eneral?|lorie|olpe|r(?:ade|oup)|uerra)|[hH](?:istorických|o(?:mena(?:gem|je|tge)|usle))|[iI](?:\b|l\b|n(?:forma?ii|na|te(?:gran|r(?:preta|vista)))|n(?:trodu(?:cción|zione)|vita(?:ción)?)|storie|terum)|[jJ](?:r|un(?:g|ior|to))|[kK](?:lavír|r(?:ál|tek)|u(?:ltúry|?átko))|[lL](?:abe|e\b|éka?i|ewis|i(?:gada|pid
|st)|íderes|iniers|le(?:ga(?:n?|r[aá]?)|va(?:n|sen))|u(?:i|xe))|[mM](?:a(?:nu|s|tar)|hic\b|\.I|[íi]nima\b|iedo|o(?:del|n(?:te|umento)|ro[?s]anu)|o(?:u(?:lin|nd)|vid[ao])|u(?:lt|sgos))|[nN](?:bsp|ei|iegan|ônibus|o(?:mbrar|tícia|us))|[oO](?:kina\}\}|cchio|lza|maggio|noare|riente|sob\b|t(?:ázky|ec\b)|u\b)|[pP](?:a(?:r[at]|s(?:ado|sou?))|e(?:ntru|r(?:ò|petua)|se)|i(?:etro|ù)|lan|o(?:int|nte)|r(?:o|ólogo|e(?:ludio|senta))|r[?u]myslu|ublicat|\.)|[qQ](?:\sand|u(?:ando|[ei]))|[rR](?:apó|e(?:c(?:ibe|ordando|usa)|ferencias|gião|i|torno)|isale|o(?:i|mân?|zší?ené))|?i\b|[sS](?:a(?:be|lve\b|tisface)|e(?:ason|cuestran|ine)|[eé]r(?:á|ie|vir)|i(?:c\s?(?:\|)?|de|ngle)|o(?:b(?:re)?|ciální|u)|p(?:ortiv?|rijin)|t(?:avební|yky)|u(?:b(?:ida|unit|(?:-)unit)|ma|p(?:le|plemento)|rt|stituye)|[\.é])|[tT](?:arda|áxi|he|o(?:da|or(?:no|turan))|r(?:azendo|en|i(?:buto|ple))|ype)|[uU](?:hlie|n[ade])|[vV](?:a(?:da|[is]?|mos?|riant|yas)|e(?:che|n(?:ce[rn]?|d[aeo]|g[ao]|t[ae]))|e(?:ta|z)|?(?:du?|rný)|i(?:agem|llena|ol(?:on?cello|u))|i(?:tamin|va(?:ce)?)|ojsko|ol(?:a(?:mos|r)|ta|v(?:amos|er(?:[aáé]|emos|te)?)|v(?:í|i(:?endo)?))|oy|uel(?:[aeu]\b|t[ao]\b|v[aeo]s?\b))|[wW]h[a?]nau|[yY]\b|[zZŽ](?:eit|ivoty))\W?\s?\s?[aA]\s?\s?\2))
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              ^
18:02:14.413 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 36
\b([oO])uts(ed|ing(?<!could\s+outsing))\b
                                    ^
18:02:14.413 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 67
\bUS(?:\s+(?:D\$?|\$)|D\$?|\$)(?<=(?:\b[a-z]+\s+|[\(,]\s*)US[\s\$D]+)(?:\s+(?: \s*)?| \s*)?\$?(?<!US\$)(?=\d)
                                                                   ^
18:02:14.413 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 162
([\s\(~])\$?\s?((?:\d+(?:\.\d+)?|(?:\d+,)+\d{3}(?:\.\d\d)?))((?:\s| )+(?:[mbMB]illion|[tT](?:housand|rillion))\b)?(?<![^\$\d](?:1[89]\d\d|20\d\d))(?<!\d\s+\d+)(?:\s| )*US[D\$]\b
                                                                                                                                                                  ^
18:02:14.414 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 86
\bthe\s+U\.?S\.?A\.?(?<!Church\s+in\s+the\s+U\.?S\.?A\.?|Girl\s+Scouts\s+of\s+the\s+USA)(?=(?:,|\s+(?:a(?:fter|nd|[st])|by|f(?:or|rom)|in|to|w(?:hen|ith))\s))(?!\s+for\s+Africa)
                                                                                      ^
18:02:14.414 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 31
\bphenomena\b(?<=\b[tT]his\s+\w+)
                               ^
18:02:14.414 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 112
(?<=\s+(?:a(?:re|s)|As|be(?:c(?:ame|om(?:e|ing))|en|ing)|is|not|to\s+be|w(?:as|ere))(?:\s+(?:al(?:l|so)|now))?\s+)\bapart\s+of\b
                                                                                                                ^
18:02:14.414 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 84
\b(early|late)[–??—?](?<=\b(?:[bB]y|[dD]uring|[fF]rom|[iI]n|of|to|[uU]ntil)\s+[a-z]+-)([12]\d{3})(?=[,\.\;])
                                                                                    ^
18:02:14.414 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 215
\b(?=[hH]igh)(?<!(?:[bB]ecause\s+of\s+(?:h(?:er|is)|its|their)|(?:achiev(?:e[ds]?|ing)|creat(?:e[ds]?|ing)|display(?:ed|ing|s?)|ha(?:s|ve)|ke(?:ep(?:ing|s?)|pt)|maintain(?:ed|ing|s?)|retain(?:ed|ing|s?)|with)\s+a)\s+)([hH])igh(?<![A-Z][A-Za-z]+\s+High|specified\s+High|the\s+High)\s+profile\b(?!,|\s+(?:a(?:nd|s)|for|in|of|to)\b)
                                                                                                                                                                                                                       ^
18:02:14.414 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 35
\bopen\s+air(?<=\b[aA]n\s+open\s+air)(?=\s+(?:a(?:mphitheat(?:er|re)|ren?a|uditorium)|bath|c(?:hurch|inema|ourtyard)|d(?:ance|isplay)|exhibition|festival|m(?:a(?:ll|rket|ss)|eeting|usic)|p(?:avilion|erformance|ool|roduction)|restaurant|s(?:ervice|hopping|ite|t(?:a(?:dium|ge)|ructure)|wimming)|theat(?:er|re)|venue))
                                   ^
18:02:14.414 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 112
\b([mM])ulti[–??—?\s]*([mb]|tr)illion[–??—?\s]+(dollar|euro|pound)\b(?<!ulti(?:[mb]illion-[a-z]+|trillion-[a-z]+))
                                                                                                                ^
18:02:14.415 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 68
\b([sS])word[–??—?\s]+play\b(?<!\band[–??—?\s]+[sS]word[–??—?\s]+play)
                                                                    ^
18:02:14.415 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 109
\b[bB]at?chelor['’´???‘???`]?s?['’´???‘???`]?\s+[dD]egree(s)?\b(?<=[a-z]\s+[bB]a[a-z´???’???`']+\s+[dD]egrees?)(?<!bachelor's\s+degrees?)
                                                                                                             ^
18:02:14.415 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 105
\b[mM]aster['’´???‘???`]?s?['’´???‘???`]?\s+[dD]egree(s)?\b(?<=[a-z]\s+[mM]a[a-z´???’???`']+\s+[dD]egrees?)(?<!master's\s+degrees?)
                                                                                                         ^
18:02:14.417 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 69
\b(centuri|decad)es\s+old\b(?<=\b(?:[aA]|[tT]he(?:ir)?)\s+[a-z]+\s+old)
                                                                     ^
18:02:14.418 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 300
\b(\d+[\d,\.]*|e(?:ight(?:een|y?)|leven)|f(?:i(?:ft(?:een|y)|ve)|o(?:rty|ur(?:teen)?))|hundred|nine(?:t(?:een|y))?|one|s(?:even|ix)(?:t(?:een|y))?|t(?:en|h(?:irt(?:een|y)|ousand|ree)|w(?:e(?:lve|nty)|o)))\b(?<=\b(?:[aA](?:dditional|n?)|first|[hH](?:er|is)|[iI]ts|second|th(?:eir|ird)|Their)\s+[\da-z]+)(?: |\s+)(?!member\s+[a-z]+s\b)(acre|bed|cylinder|d(?:ay|ecker|oor)|foot|g(?:a(?:llon|me)|oal)|h(?:o(?:le|rsepower|ur)|uman)|inch|lit(?:er|re)|m(?:an|e(?:mber|t(?:er|re))|i(?:le|nute)|onth)|ounce|p(?:a(?:ge|ssenger)|erson|o(?:int|und))|r(?:o(?:om|und)|unner)|s(?:e(?:a(?:son|t(?:er)?)|cond)|ong|t(?:age|ore?y))|ton|vote|w(?:eek|heel(?:e[dr])?|oman)|y(?:ard|ear))(?=[,\s]|-(?:deep|high|long|old|tall|wide)\b)(?!\s+(?:a(?:go|[st])|by|deep|for|high|i[ns]|long|o(?:f|ld)|t(?:all|here|o)|w(?:as|i(?:de|th)))\b)(?<!\b\d{4}\s+(?:game|s(?:e(?:ason|cond)|ong|t(?:age|ory))|vote))(?<![dD]uring\s+h(?:er|is)\s+one\s+season|told\s+h(?:er|im)\s+one\s+day|send\s+for\s+h(?:er|im)\s+one\s+day)
                                                                                                                                                                                                                                                                                                            ^
18:02:14.419 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 277
\b(\d+[\d,\.]*|e(?:ight(?:een|y?)|leven)|f(?:i(?:ft(?:een|y)|ve)|o(?:rty|ur(?:teen)?))|hundred|nine(?:t(?:een|y))?|one|s(?:even|ix)(?:t(?:een|y))?|t(?:en|h(?:irt(?:een|y)|ousand|ree)|w(?:e(?:lve|nty)|o)))(?: |[–??—?\s]+)(month|year)\b(?<= [\da-z]+(?: [a-z]+|\s+[a-z]+))(?=\s+(?:a(?:bsence|ffair|greement|ss(?:ignment|ociation))|b(?:a(?:n|ttle)|reak)|c(?:a(?:mpaign|reer)|ease[-–??—?]?fire|losure|o(?:m(?:a|petition)|ntract|urse)|ruise|ycle)|d(?:e(?:a(?:dline|l)|lay|ployment)|rought|uration)|e(?:ffort|n(?:gagement|listment)|x(?:hibit(?:ion)?|i(?:le|stence)|pedition|tension))|feasibility|g(?:ap|estation|uest)|h(?:i(?:atus|story)|ospital)|i(?:llness|n(?:cumbent|jury|ternship|vestigation))|j(?:ail|ourney)|l(?:ay[-–??—?]?off|ea[sv]e|ife[-–??—?]?span|o(?:an|ckout))|m(?:aintenance|i(?:litary|ssion)|o(?:dernization|ratorium))|notice|overhaul|p(?:artnership|eriod|lan|osting|r(?:ison|o(?:cess|fessional|gram(?:me)?|ject)))|r(?:e(?:c(?:overy|urring)|fit|gular|ign|lationship|s(?:earch|idency|tricted))|otation|un)|s(?:abbatical|cho(?:larship|ol)|e(?:ason|ntence)|iege|ojourn|p(?:an|e(?:aking|ll))|t(?:a(?:rter|y)|int|r(?:ike|uggle)|udy)|u(?:bs(?:cription|idy)|pen(?:ded|sion)))|t(?:e(?:nure|rm)|our|r(?:aining|eatment|i(?:al|p)|uce))|v(?:eteran|isit|oyage)|w(?:a(?:it(?:ing)?|r)|orkshop))\b)
                                                                                                                                                                                                                                                                                     ^
18:02:14.419 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 22
\bone\b(?<=\b[aA]\s+one)[–??—?\s]+night[–??—?\s]+stand\b(?<!one-night\s+stand)
                      ^
18:02:14.419 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 108
\b[aA]ssociate['’´???‘???`]?s?['’´???‘???`]?\s+[dD]egree(s)?\b(?<=[a-z]\s+[aA]s[a-z´???’???`']+\s+[dD]egrees?)(?<!associate\s+degrees?)
                                                                                                            ^
18:02:14.419 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 26
\babove(?<=\b[tT]he\s+above)\s+mentioned\b
                          ^
18:02:14.419 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 111
\b([dD]is|[eE]x|[iuIU]ndis)tin?[gq]i?ui?sh?(ab[il][a-z]*|e[drs][a-z]*|ing[a-z]*|ment[a-z]*)?\b(?<!tinguish[a-z]*)
                                                                                                               ^
18:02:14.420 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 59
\b([bB])illard(s)?\b(?<!\b(?:[A-Z][a-z]+ Billard|de Billard))
                                                           ^
18:02:14.421 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 22
\bNor(?<=\b[tT]he\s+Nor)thernmost\b
                      ^
18:02:14.421 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 22
\bSou(?<=\b[tT]he\s+Sou)thernmost\b
                      ^
18:02:14.421 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 22
\bEas(?<=\b[tT]he\s+Eas)ternmost\b
                      ^
18:02:14.421 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 22
\bWes(?<=\b[tT]he\s+Wes)ternmost\b
                      ^
18:02:14.421 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Illegal/unsupported escape sequence near index 109
\b([iI])n[aeiou]?d[aeiou]?[aeiou]?v[aeiou]?[aeiou]?d?[aeiou]?[dl]?[aeiou]?[aeiou]?l(?<!nd(?:avl|evel))(?!e[s\b]|l(?:e|os))[aeou]?(?<!ndividu[ae]l)([a-z-\´???’???`']{0,99})
                                                                                                             ^
18:02:14.421 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 45
\bapprox(?<=located\s+approx|situated\s+approx)\.?(?=\s)
                                             ^
18:02:14.421 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Look-behind group does not have an obvious maximum length near index 23
\s+,(?<=[A-Za-z\d\)]\s+,)\s?
                       ^
18:02:14.422 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New additions:
  Illegal/unsupported escape sequence near index 114
\b([fF]|[pP]ref|[rR]ef|[uU]nf)omat(?!\s+[mM]artin|[mM]artin)(t?(?:a(?:ble|nks?)|e(?:d?|rs?)|i(?:ngs?|on(?:als?|[s\b])|v(?:e(?:ly|s?)|ity))|k(?:ii|y)|or(?:ies|y)|s(?:k(?:ii|y))?))?
                                                                                                                  ^
18:02:14.422 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: medals:
  Look-behind group does not have an obvious maximum length near index 86
\b([bB]ronze|[gG]old|[sS]ilver)[-–??—?\s]+([mM]edal)[-–??—?\s]+winning(?<!\w+\s+\w+-\w+)(?=\s)
                                                                                      ^
18:02:14.422 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: medals:
  Look-behind group does not have an obvious maximum length near index 58
\bGold\s+[mM]edal(s)?\b(?<=\b(?:[a-z]+|Olympic)\s+\w+\s+\w+)(?=(?:[-,;/:/.\)])|\s+(?:a(?:nd|re|t)|by|for|game|i[ns]|with)\b)
                                                          ^
18:02:14.422 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: medals:
  Look-behind group does not have an obvious maximum length near index 60
\bSilver\s+[mM]edal(s)?\b(?<=\b(?:[a-z]+|Olympic)\s+\w+\s+\w+)(?=(?:[-,;/:/.\)])|\s+(?:a(?:nd|re|t)|by|for|game|i[ns]|with)\b)
                                                            ^
18:02:14.422 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: medals:
  Look-behind group does not have an obvious maximum length near index 60
\bBronze\s+[mM]edal(s)?\b(?<=\b(?:[a-z]+|Olympic)\s+\w+\s+\w+)(?=(?:[-,;/:/.\)])|\s+(?:a(?:nd|re|t)|by|for|game|i[ns]|with)\b)
                                                            ^
18:02:14.422 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: medals:
  Look-behind group does not have an obvious maximum length near index 49
\bGold\s+[mM]edalist(s)?\b(?<=\b[a-z]+\s+\w+\s+\w+)
                                                 ^
18:02:14.422 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: medals:
  Look-behind group does not have an obvious maximum length near index 51
\bSilver\s+[mM]edalist(s)?\b(?<=\b[a-z]+\s+\w+\s+\w+)
                                                   ^
18:02:14.423 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: medals:
  Look-behind group does not have an obvious maximum length near index 51
\bBronze\s+[mM]edalist(s)?\b(?<=\b[a-z]+\s+\w+\s+\w+)
                                                   ^
18:02:14.423 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: replace space by hyphen:
  Look-behind group does not have an obvious maximum length near index 70
\bpart[–??—?\s]+time\b(?<=\b(?:as|for|ha(?:[ds]|ve)|on)\s+a\s+\w+\s+\w+)(?!\s+unit)|\bpart\s+time(?=\s+(?:basis|employ(?:ees?|ment)|jobs?|st(?:aff|udents?)|work(?:ers?)?)\b)
                                                                      ^
18:02:14.423 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: replace space by hyphen:
  Look-behind group does not have an obvious maximum length near index 70
\bfull[–??—?\s]+time\b(?<=\b(?:as|for|ha(?:[ds]|ve)|on)\s+a\s+\w+\s+\w+)(?!\s+unit)|\bfull\s+time(?=\s+(?:basis|employ(?:ees?|ment)|jobs?|st(?:aff|udents?)|work(?:ers?)?)\b)
                                                                      ^
18:02:14.423 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: replace space by hyphen:
  Look-behind group does not have an obvious maximum length near index 163
\b((?:\d+,)?\d\d+)[–??—?\s]+seat(?=\s+(?:a(?:rena|uditorium)|black\s+box|concert|lecture|majority|restaurant|st(?:adium|udio)|theat(?:er|re))\b)(?<!\bto\s+\d+\s+\w+)
                                                                                                                                                                   ^
18:02:14.423 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: remove other hyphens (replace with space):
  Look-behind group does not have an obvious maximum length near index 32
\bsworn-in\b(?<!\b[aA]\s+sworn-in)
                                ^
18:02:14.423 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: Euphemisms:
  Look-behind group does not have an obvious maximum length near index 343
(?<=\b(?:brothers?|c(?:hild(?:ren)?|ousins?)|daughters?|f(?:athers?|riends?)|grand(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|He|h(?:e|usbands?)|mothers?|n(?:ephews?|ieces?)|parents?|s(?:he|isters?|ons?|pouses?|tep(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|tudents?)|[A-Z][a-z]+|She|[tT]hey|wi(?:fe|ves))\s+)(?:sadly\s+)?(?:pass(?:e([ds]))?\s+away|lose(s)?\s+(?:their|h(?:er|is)(?:\s+or\s+h(?:er|is)|[/\\]h(?:er|is))?)\s+li(?:fe|ves))(?! from earthly existence)
                                                                                                                                                                                                                                                                                                                                                       ^
18:02:14.424 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#New: calendar dates:
  Look-behind group does not have an obvious maximum length near index 195
\b(A(?:pril|ugust)|December|February|J(?:anuary|u(?:ly|ne))|Ma(?:rch|y)|November|October|September)(?<=\b(?:[aA]fter|and|[bB](?:e(?:fore|tween)|orn|y)|[dD]ied|[fF]rom|[oO]n|to|[uU]ntil|\w+,)\s+\w+)\s+([1-3]?\d),\s+([12]\d\d\d)(?=\s+\w)
                                                                                                                                                                                                   ^
18:02:14.424 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#Academic titles:
  Look-behind group does not have an obvious maximum length near index 77
\bVisiting(?<=\b(?:[aA][ns]|[a-z]+,?|[fF]ormer|[tT]enure(?:d|-[tT]rack))\s+\w+)\s+[pP]rofessor(?=[,;\.\)]|\s+(?:at|[a-z]{4,}|in|o[fn])\s)
                                                                             ^
18:02:14.424 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#Academic titles:
  Look-behind group does not have an obvious maximum length near index 50
\bLecturer(?<=\b(?:[aA]s?\s+\w+|is\s+\w+|was\s+\w+))(?=\s+(?:at\b|in\b|o[fn]\b))
                                                  ^
18:02:14.424 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#Academic titles:
  Look-behind group does not have an obvious maximum length near index 54
\bInstructor(?<=\b(?:[aA][ns]\s+\w+|is\s+\w+|was\s+\w+))(?=\s+(?:at\b|in\b|o[fn]\b))
                                                      ^
18:02:14.425 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#Academic titles:
  Look-behind group does not have an obvious maximum length near index 89
\bResearch(?<=\b(?:[aA][ns]|[a-z]+,?|[fF]ormer|[tT]enure(?:d|-[tT]rack)|[vV]isiting)\s+\w+)\s+[pP]rofessor(?=[,;\.\)]|\s+(?:at|[a-z]{4,}|in|o[fn])\s)
                                                                                         ^
18:02:14.425 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#Academic titles:
  Look-behind group does not have an obvious maximum length near index 110
\bA(djunct|ss(?:istant|ociate))(?<=\b(?:[aA][ns]|[a-z]+,?|[fF]ormer|[tT]enure(?:d|-[tT]rack)|[vV]isiting)\s+\w+)\s+[pP]rofessor(?=[,;\.\)]|\s+(?:at|[a-z]{4,}|in|o[fn])\s)
                                                                                                              ^
18:02:14.425 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#Academic titles:
  Look-behind group does not have an obvious maximum length near index 77
\bProfessor(?<=\b(?:[aA]|[fF]ormer|[tT]enure(?:d|-[tT]rack)|[vV]isiting)\s+\w+)(?=(?:[,;\.\)]|\s+(?:at\s|in\s|o[fn]\s)))
                                                                             ^
18:02:14.425 [Thread-3] WARN  o.w.api.data.Suggestion - Incorrect AWB pattern syntax in Wikipedia:AutoWikiBrowser/Typos#Academic fields:
  Look-behind group does not have an obvious maximum length near index 371
\bZ(o(?:o(?:archaeology|logy|semiotics|tomy)|roastrianism))(?<=\b(?:(?:(?:d(?:egrees?|octorates?)|graduat(?:ed?|ing)|instruct(?:ion|or)|lectur(?:e[dr]|ing)|major(?:ed|ing)?|[MB]\.?[AS]c?\.?|of\s+(?:Arts|Science)|Ph\.?D\.?|stud(?:ents?|ies)|tutor)\s+(?:in|of))|pr(?:acti[sc](?:e[ds]?|ing)|ofessor\s+(?:in|of))|read|stud(?:ie[ds]|y(?:ing)?)|t(?:aught|each(?:es|ing)?))\s+\w+)(?=(?:\s+(?:a(?:nd\s+an?|t)|by|from|in|with)\s|[,;\.\)]|\s+\())

Thanks! GoingBatty (talk) 00:06, 3 September 2022 (UTC)

Thanks, this is super useful! I hadn't crawled into the look-aheads and look-behinds at all, but I'll take a look. I'd been avoiding it because it required a full unfolding, and it wasn't clear whether that would be worthwhile. Clearly, it is! (I'll probably take a look this weekend. I'm trying to pretend that this isn't the kind of thing I do on a Friday night 🤣 )
Mason (talk) 00:35, 3 September 2022 (UTC)
Thanks GoingBatty for the report, and Mason for looking into it.
WPCleaner is coded in Java, and Java regular expressions are almost identical to the .NET ones used by AWB. However, the Java engine doesn't allow look-behind groups without a clear maximum length, so regular expressions that violate this restriction are ignored by WPCleaner with a warning in the log. They don't have any impact on analysis time (they are simply skipped).
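As a rough illustration of the restriction described above, here is a minimal Java sketch. The class and method names (`LookBehindCheck`, `tryCompile`, `matches`) are hypothetical and are not taken from WPCleaner. The `\s{1,9}` bound is also an assumption, chosen only to give the look-behind an explicit maximum length. Whether the original rule compiles may depend on the Java runtime in use, so the sketch only prints that result.

```java
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class LookBehindCheck {

    /** Compiles a rule, or returns empty (after logging) if java.util.regex rejects it. */
    static Optional<Pattern> tryCompile(String regex) {
        try {
            return Optional.of(Pattern.compile(regex));
        } catch (PatternSyntaxException e) {
            System.err.println("Skipping rule: " + e.getDescription()
                    + " near index " + e.getIndex());
            return Optional.empty();
        }
    }

    /** Returns true if the rule compiles and finds at least one match in the text. */
    static boolean matches(String regex, String text) {
        return tryCompile(regex).map(p -> p.matcher(text).find()).orElse(false);
    }

    public static void main(String[] args) {
        // Rule as listed in the log; it may or may not compile on a given runtime.
        String original = "\\babove(?<=\\b[tT]he\\s+above)\\s+mentioned\\b";
        // Hypothetical variant: \s{1,9} gives the look-behind an explicit maximum length.
        String bounded = "\\babove(?<=\\b[tT]he\\s{1,9}above)\\s+mentioned\\b";

        System.out.println("original compiles: " + tryCompile(original).isPresent());
        System.out.println("bounded compiles:  " + tryCompile(bounded).isPresent());
        System.out.println("bounded matches:   "
                + matches(bounded, "See the above mentioned rule."));
    }
}
```

A bounded quantifier changes the rule slightly (runs of more than nine whitespace characters no longer match), so it would need checking against AWB before being adopted.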
A few regular expressions are more problematic: the ones that take a very long time, apparently Commercially (797922ms) and premiere (94110ms) here. Some ideas:
  • Commercially, <Typo word="Commercially" find="(?<![a-z]+-)\b([cC])ommerciall?y-(?=[a-z]+(?:ble\b|ed\b|ful\b))(?![a-z]+-)" replace="$1ommercially "/> : maybe move the look-behind later in the expression, because it filters out almost nothing but takes time at each character in the text. Something like \b(?<![a-z]+-)([cC])ommerciall?y-(?=[a-z]+(?:ble\b|ed\b|ful\b))(?![a-z]+-) (start with the word boundary which filters a few things before) or even better something like \b([cC])ommercial(?<![a-z]+-([cC])ommercial)l?y-(?=[a-z]+(?:ble\b|ed\b|ful\b))(?![a-z]+-) (only use the look-behind once commercial has been found).
  • premiere, <Typo word="premiere" find="\b(?<=(?:film|movie)\s+)premier\b" replace="premiere"/> : same reasoning and same idea, something like \bpremier(?<=(?:film|movie)\s+premier)\b (only use the look-behind once premier has been found)
--NicoV (Talk on frwiki) 17:32, 3 September 2022 (UTC)
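To make the reordering idea above concrete, here is a small Java sketch that checks that the two placements of the look-behind find the same matches on a sample sentence. The class name, the sample text, and the `\s{1,5}` bound (standing in for the rule's `\s+`) are assumptions made for illustration. It does not measure speed; it only shows that moving the look-behind after the literal leaves the matches unchanged on this input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookBehindPlacement {

    // Look-behind is tried first, at every word boundary in the text.
    // \s{1,5} is an assumed bound standing in for the rule's \s+.
    static final Pattern LEADING =
            Pattern.compile("\\b(?<=(?:film|movie)\\s{1,5})premier\\b");

    // Look-behind is tried only after the literal "premier" has matched.
    static final Pattern TRAILING =
            Pattern.compile("\\bpremier(?<=(?:film|movie)\\s{1,5}premier)\\b");

    /** Collects the start offset of every match of the pattern in the text. */
    static List<Integer> matchStarts(Pattern pattern, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String text = "The film premier was in May. The premier of Quebec "
                + "attended the movie  premier.";
        System.out.println("leading:  " + matchStarts(LEADING, text));
        System.out.println("trailing: " + matchStarts(TRAILING, text));
    }
}
```

With the trailing form, the engine can discard most positions on the cheap literal test before it ever runs the look-behind, which is the saving the suggestion is aiming for.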
@NicoV: Thanks for the suggestions! I've changed the "premiere" rule and reran the article through WPCleaner, and the log no longer shows the issue. It appears Chris the speller added the "Commercially" rule in 2015, so let's see what input Chris can provide. Thanks! GoingBatty (talk) 19:09, 3 September 2022 (UTC)
Maybe you guys think I have gotten smarter in the last 7 years. I'm willing to try to find out. I will look into it and get back to you. Chris the speller yack 04:19, 4 September 2022 (UTC)
OK, I have reworked "Commercially"; it should no longer be the worst pig in that list. Chris the speller yack 20:21, 4 September 2022 (UTC)
Also, reworked "premiere". Chris the speller yack 20:32, 4 September 2022 (UTC)

Stupid question

Are the typo definitions stored independently of individual settings files? In other words, do I need to refresh typos when switching from one settings file to another (that hasn't been used in several months or years)? Dawnseeker2000 01:44, 20 September 2022 (UTC)

@Dawnseeker2000: the typo definitions are indeed independent; AWB loads the latest version of WP:AWB/T when you first click the "Regex typo fixing" checkbox, open a settings file, or select "Refresh status/typos" from the File dropdown menu.   ~ Tom.Reding (talkdgaf)  03:48, 20 September 2022 (UTC)

loading almost 5 min for article on wpcleaner

i assumed the issue is with awb, if not please let me know. 10th South Indian International Movie Awards took almost 5 min to load on wpcleaner. here is the relevant log : 08:23:33.512 [Thread-6] INFO API - GET http://en.wiki.x.io/w/api.php?continue=&prop=pageprops%7Cinfo&format=xml&action=query&generator=links&gpllimit=max&titles=10th South Indian International Movie Awards&ppprop=disambiguation&gplnamespace=0 08:23:33.513 [MW-5] INFO API - GET http://en.wiki.x.io/w/api.php?curtimestamp=1&continue=&prop=revisions%7Cinfo&inprop=protection&format=xml&rvslots=main&action=query&titles=10th South Indian International Movie Awards&rvprop=content|ids|timestamp 08:23:34.770 [Thread-6] INFO API - GET http://en.wiki.x.io/w/api.php?redirects=&continue=&prop=pageprops&format=xml&action=query&titles=Neelam Productions|News18|Pramod Panju|SIIMA|SIIMA Award for Best Actor (Telugu)|SIIMA Award for Best Actor in a Negative Role (Telugu)|SIIMA Award for Best Actor in a Negative Role ? Kannada|SIIMA Award for Best Actor in a Negative Role ? Malayalam|SIIMA Award for Best Actor in a Negative Role ? Tamil|SIIMA Award for Best Actor ? Kannada|SIIMA Award for Best Actor ? Tamil|SIIMA Award for Best Actress (Telugu)|SIIMA Award for Best Actress ? Kannada|SIIMA Award for Best Actress ? Tamil|SIIMA Award for Best Cinematographer (Telugu)|SIIMA Award for Best Cinematographer ? Kannada|SIIMA Award for Best Cinematographer ? Malayalam|SIIMA Award for Best Cinematographer ? Tamil|SIIMA Award for Best Comedian (Telugu)|SIIMA Award for Best Comedian ? Kannada|SIIMA Award for Best Comedian ? Malayalam|SIIMA Award for Best Comedian ? Tamil|SIIMA Award for Best Debut Director (Telugu)|SIIMA Award for Best Director (Telugu)|SIIMA Award for Best Director ? Kannada|SIIMA Award for Best Director ? Tamil|SIIMA Award for Best Female Debut (Telugu)|SIIMA Award for Best Female Playback Singer (Telugu)|SIIMA Award for Best Female Playback Singer ? Kannada|SIIMA Award for Best Female Playback Singer ? 
Malayalam|SIIMA Award for Best Female Playback Singer ? Tamil|SIIMA Award for Best Film (Telugu)|SIIMA Award for Best Film ? Tamil&ppprop=disambiguation 08:23:35.655 [Thread-6] INFO API - GET http://en.wiki.x.io/w/api.php?redirects=&continue=&prop=pageprops&format=xml&action=query&titles=SIIMA Award for Best Lyricist (Telugu)|SIIMA Award for Best Lyricist ? Kannada|SIIMA Award for Best Lyricist ? Malayalam|SIIMA Award for Best Lyricist ? Tamil|SIIMA Award for Best Male Debut (Telugu)|SIIMA Award for Best Male Playback Singer (Telugu)|SIIMA Award for Best Male Playback Singer ? Kannada|SIIMA Award for Best Male Playback Singer ? Malayalam|SIIMA Award for Best Male Playback Singer ? Tamil|SIIMA Award for Best Music Director (Telugu)|SIIMA Award for Best Music Director ? Kannada|SIIMA Award for Best Music Director ? Malayalam|SIIMA Award for Best Music Director ? Tamil|SIIMA Award for Best Supporting Actor (Telugu)|SIIMA Award for Best Supporting Actress (Telugu)|South Indian cinema|Sukumar Writings|Wayfarer Films Production&ppprop=disambiguation 08:23:36.177 [Thread-6] INFO API - GET http://en.wiki.x.io/w/api.php?continue=&prop=revisions&format=xml&rvslots=main&action=query&titles=Neelam Productions|News18|Pramod Panju|SIIMA|SIIMA Award for Best Actor (Telugu)|SIIMA Award for Best Actor in a Negative Role (Telugu)|SIIMA Award for Best Actor in a Negative Role ? Kannada|SIIMA Award for Best Actor in a Negative Role ? Malayalam|SIIMA Award for Best Actor in a Negative Role ? Tamil|SIIMA Award for Best Actor ? Kannada|SIIMA Award for Best Actor ? Tamil|SIIMA Award for Best Actress (Telugu)|SIIMA Award for Best Actress ? Kannada|SIIMA Award for Best Actress ? Tamil|SIIMA Award for Best Cinematographer (Telugu)|SIIMA Award for Best Cinematographer ? Kannada|SIIMA Award for Best Cinematographer ? Malayalam|SIIMA Award for Best Cinematographer ? Tamil|SIIMA Award for Best Comedian (Telugu)|SIIMA Award for Best Comedian ? Kannada|SIIMA Award for Best Comedian ? 
Malayalam|SIIMA Award for Best Comedian ? Tamil|SIIMA Award for Best Debut Director (Telugu)|SIIMA Award for Best Director (Telugu)|SIIMA Award for Best Director ? Kannada|SIIMA Award for Best Director ? Tamil|SIIMA Award for Best Female Debut (Telugu)|SIIMA Award for Best Female Playback Singer (Telugu)|SIIMA Award for Best Female Playback Singer ? Kannada|SIIMA Award for Best Female Playback Singer ? Malayalam|SIIMA Award for Best Female Playback Singer ? Tamil|SIIMA Award for Best Film (Telugu)|SIIMA Award for Best Film ? Tamil|SIIMA Award for Best Lyricist (Telugu)|SIIMA Award for Best Lyricist ? Kannada|SIIMA Award for Best Lyricist ? Malayalam|SIIMA Award for Best Lyricist ? Tamil|SIIMA Award for Best Male Debut (Telugu)|SIIMA Award for Best Male Playback Singer (Telugu)|SIIMA Award for Best Male Playback Singer ? Kannada|SIIMA Award for Best Male Playback Singer ? Malayalam|SIIMA Award for Best Male Playback Singer ? Tamil|SIIMA Award for Best Music Director (Telugu)|SIIMA Award for Best Music Director ? Kannada|SIIMA Award for Best Music Director ? Malayalam|SIIMA Award for Best Music Director ? 
Tamil|SIIMA Award for Best Supporting Actor (Telugu)|SIIMA Award for Best Supporting Actress (Telugu)|South Indian cinema|Sukumar Writings&rvprop=content 08:23:37.820 [Thread-6] INFO API - GET http://en.wiki.x.io/w/api.php?continue=&prop=revisions&format=xml&rvslots=main&action=query&titles=Wayfarer Films Production&rvprop=content 08:26:13.383 [Thread-6] INFO PERF - Slow regular expression (10th South Indian International Movie Awards): Typo AWB dies/died(143793ms):(?<=\b(?:brothers?|c(?:hild(?:ren)?|ousins?)|daughters?|f(?:athers?|riends?)|grand(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|He|h(?:e|usbands?)|mothers?|n(?:ephews?|ieces?)|parents?|s(?:he|isters?|ons?|pouses?|tep(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|tudents?)|[A-Z][a-z]+|She|[tT]hey|wi(?:fe|ves))\s+)(?:sadly\s+)?(?:pass(?:e([ds]))?\s+away|lose(s)?\s+(?:their|h(?:er|is)(?:\s+or\s+h(?:er|is)|[/\\]h(?:er|is))?)\s+li(?:fe|ves))(?! from earthly existence) 08:29:00.648 [AWT-EventQueue-0] INFO PERF - Slow regular expression (10th South Indian International Movie Awards): Typo AWB dies/died(143506ms):(?<=\b(?:brothers?|c(?:hild(?:ren)?|ousins?)|daughters?|f(?:athers?|riends?)|grand(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|He|h(?:e|usbands?)|mothers?|n(?:ephews?|ieces?)|parents?|s(?:he|isters?|ons?|pouses?|tep(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|tudents?)|[A-Z][a-z]+|She|[tT]hey|wi(?:fe|ves))\s+)(?:sadly\s+)?(?:pass(?:e([ds]))?\s+away|lose(s)?\s+(?:their|h(?:er|is)(?:\s+or\s+h(?:er|is)|[/\\]h(?:er|is))?)\s+li(?:fe|ves))(?! 
from earthly existence) 08:33:30.086 [AWT-EventQueue-0] INFO PERF - Slow regular expression (10th South Indian International Movie Awards): Typo AWB dies/died(144844ms):(?<=\b(?:brothers?|c(?:hild(?:ren)?|ousins?)|daughters?|f(?:athers?|riends?)|grand(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|He|h(?:e|usbands?)|mothers?|n(?:ephews?|ieces?)|parents?|s(?:he|isters?|ons?|pouses?|tep(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|tudents?)|[A-Z][a-z]+|She|[tT]hey|wi(?:fe|ves))\s+)(?:sadly\s+)?(?:pass(?:e([ds]))?\s+away|lose(s)?\s+(?:their|h(?:er|is)(?:\s+or\s+h(?:er|is)|[/\\]h(?:er|is))?)\s+li(?:fe|ves))(?! from earthly existence) 08:38:13.130 [AWT-EventQueue-0] INFO PERF - Slow regular expression (10th South Indian International Movie Awards): Typo AWB dies/died(144465ms):(?<=\b(?:brothers?|c(?:hild(?:ren)?|ousins?)|daughters?|f(?:athers?|riends?)|grand(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|He|h(?:e|usbands?)|mothers?|n(?:ephews?|ieces?)|parents?|s(?:he|isters?|ons?|pouses?|tep(?:child(?:ren)?|daughters?|fathers?|mothers?|parents?|sons?)|tudents?)|[A-Z][a-z]+|She|[tT]hey|wi(?:fe|ves))\s+)(?:sadly\s+)?(?:pass(?:e([ds]))?\s+away|lose(s)?\s+(?:their|h(?:er|is)(?:\s+or\s+h(?:er|is)|[/\\]h(?:er|is))?)\s+li(?:fe|ves))(?! 
from earthly existence) 08:40:22.657 [Thread-7] INFO API - POST http://en.wiki.x.io/w/api.php?action=edit&assert=user&basetimestamp=2022-09-20T20:07:55Z&bot=&format=xml&minor=&starttimestamp=2022-09-21T08:23:35Z&summary=v2.05 - Fix errors for CW project (Template value ends with break - Link equal to linktext - Spelling and typography)&tags=WPCleaner&text=...&title=10th South Indian International Movie Awards&watchlist=nochange 08:40:26.097 [MW-6] INFO API - POST http://chec.wiki.toolforge.org/cgi-bin/checkwiki_bots.cgi?action=mark&id=59&project=enwiki&title=10th South Indian International Movie Awards 08:40:26.108 [MW-7] INFO API - POST http://chec.wiki.toolforge.org/cgi-bin/checkwiki_bots.cgi?action=mark&id=64&project=enwiki&title=10th South Indian International Movie Awards 08:49:33.740 [AWT-EventQueue-0] INFO API - POST http://en.wiki.x.io/w/api.php?action=logout&format=xml, should I report separately if it takes more than 5 min to process? -jindam, vani (talk) 09:01, 21 September 2022 (UTC)

@Jindam vani: I've disabled that spelling rule. With the complicated lookbehind potentially being evaluated at every word boundary, I'm not surprised that some regex implementations choke on it. Courtesy ping to Smasongarrison (talk · contribs). -- John of Reading (talk) 09:24, 21 September 2022 (UTC)
hello User:John of Reading, that was super quick, thank you -jindam, vani (talk) 09:28, 21 September 2022 (UTC)
thanks for the ping! These lookbehinds are super pesky -- it has me thinking that we should rethink and limit their use as much as possible. Let me do some deep thinking on how to do it. Mason (talk) 11:42, 21 September 2022 (UTC)

Curly quotes

Per WP:MOS, I think we should replace all curly quotes (“ ” ‘ ’) with straight quotes (" '). 27 is my favorite number. You can ask me why here. 20:34, 24 October 2022 (UTC)

AWB will do that for you. Neils51 (talk) 08:29, 25 October 2022 (UTC)
Thanks! 27 is my favorite number. You can ask me why here. 19:00, 27 October 2022 (UTC)

"on" --> "in" in date expressions

Also, there's a problem with inappropriate conversion of "on" to "in" in cases such as "Average ratings calculated by chess-results.com based on August 2014 ratings" --> "Average ratings calculated by chess-results.com based in August 2014 ratings", Example: 41st_Chess_Olympiad. Colonies Chris (talk) 17:02, 1 November 2022 (UTC)

@Colonies Chris: I've adjusted the rule so that it won't damage "based on August 2014 ratings". I've thrown in a few other words from a similar rule of my own. There's still plenty of scope for false positives, too hard for a regular expression to catch, along the lines "an increase of an astounding 24 million dollars on August 2014 figures" -- John of Reading (talk) 18:05, 1 November 2022 (UTC)
Thanks. Colonies Chris (talk) 18:31, 1 November 2022 (UTC)

Fractions

One of the fraction conversion rules is changing ½ to 1⁄2, which should not be happening on chess-related pages, where ½ is always used to indicate the points allocation in a drawn game, per MOS:FRAC. Example: 40th Chess Olympiad. Colonies Chris (talk) 16:45, 1 November 2022 (UTC).

@Colonies Chris - I have disabled these rules, since they can't tell whether the article falls under one of the exemptions listed at MOS:FRAC. GoingBatty (talk) 19:03, 10 November 2022 (UTC)
Thanks. Colonies Chris (talk) 08:30, 11 November 2022 (UTC)

at the at the

I'm working my way through "at the at the" -"at the" and have got this down to 68 remaining. But it probably makes more sense to feed it into AWB. ϢereSpielChequers 11:39, 18 October 2022 (UTC)

Done. Neils51 (talk) 08:24, 25 October 2022 (UTC)
Ta muchly ϢereSpielChequers 19:46, 14 November 2022 (UTC)

a eu -> an eu

Can we remove the general fix "a eu" -> "an eu"? It is causing false positives with French text. Example diff. cc @Kudpung, MB, and Elinruby:. Thanks. –Novem Linguae (talk) 23:49, 9 November 2022 (UTC)

Novem Linguae, "a eu" is the French for "has had". I wasn't aware that AWB is supposed to be partly a translation tool. You may also wish to consider "dont" which in French means "of which" and not "don't". There are probably thousands more false positives, I can immediately think of dozens, but the effort should be to explain to users that mainspace is probably not the best place to dump an article in French (or any other language) into mainspace even if the intention is to translate it. Kudpung กุดผึ้ง (talk) 00:20, 10 November 2022 (UTC)
We do get quotations which aren't marked clearly enough for AWB to skip them. This occurs not only with other languages but with archaic formes of Engliſhe. We just click on the line to undo the fix (or skip RETF in JWB) and move on. A good compromise might be to require more letters after eu, so "a eusomething" is fixed but not "il a eu un ...". However, most "eu*" words take "a" – a Euro, a euphemism, a eureka moment – and I'm struggling to think of one that needs "an". Do we know which rule is doing this? Is it the enormous "A to An" regex which won't fit on my screen? Can we just remove [the eu part of] whatever rule is doing this? Certes (talk) 09:59, 10 November 2022 (UTC)
I suggest we seek some other remedy than scolding translators, as this vastly decreases the number of people willing to do that work. This particular diff is an edge case caused by problems in another tool, as previously discussed at great length with Kudpung (talk · contribs). However, I also think that Novem Linguae (talk · contribs) and Certes (talk · contribs) are correct; I cannot think of an instance where changing a->an before the string ‘eu’ would result in correct English, and having just expressed a willingness to improve tools, in general, perhaps I should just ask where I might find this regex? Possibly I can help, and would not make any changes without discussing them. It is a small part of the issue I was complaining about, but the small improvement of fixing this one small problem would nonetheless be an improvement. Elinruby (talk) 19:30, 17 November 2022 (UTC)
I don't think anyone's tracked down exactly which regex makes the change. I guessed at the "A to An" entry in WP:AWB/T#New additions (a larger and more complex regex than I ever saw in 30 years as a software professional) but regex101.com says neither "il a eu un" nor "a euphemism" match it. Certes (talk) 22:07, 17 November 2022 (UTC)
Agreed (and I've done some pretty tricky + and - lookaround expressions). Even short regex strings can be challenging, and it would be arrogant for anyone to say that one that long and complex properly handles the cases it was designed for, without corrupting ones it wasn't. And that's without even considering the unmaintainability of such a monstrosity by other users, even if it were "correct" under some limited set of conditions. Stuff like that is more about look-at-me bragging rights, than actually helping Wikipedia. Mathglot (talk) 22:14, 17 November 2022 (UTC)
Ping Novem Linguae (talk · contribs), MB (talk · contribs) Elinruby (talk) 19:32, 17 November 2022 (UTC)
@Mathglot: this is what I was asking you about Elinruby (talk) 20:02, 17 November 2022 (UTC)
One approach here, in my opinion, concerns the use of the {{lang}} template. As long as a eu, dont (or any other text, in any language other than English) is contained in a {{lang}} template, then *with proper coding* in AWB, the problems can be avoided. It would be unfair and impractical to require hundreds of regular expressions to be changed, just to deal with this; in my opinion, this is an AWB-wide issue, for all regular expressions that involve typos, therefore, the proper approach, in my opinion, is to make a Change Request to change the operation of AWB itself. What should happen, is, when an AWB regex is tagged as a typo (I don't use AWB so I don't know how that is done) then the code in AWB itself, should ignore cases embedded within a {{lang}} tag (unless the specified language is English, if that ever happens; possibly in copy/paste or translation from other Wikipedias into English). Meanwhile, users should be reminded to use {{lang}} for all foreign text, whose original raison d'etre was about metadata and this just provides another reason to do so. Mathglot (talk) 20:49, 17 November 2022 (UTC)

@Certes, Novem Linguae, Mathglot, and Elinruby: The change was indeed made by the "A to An" rule. You can check this by looking in the "Typos" tab at the bottom right of the AWB window to see which rules have just fired. The rule will change "a" to "an" before all words beginning with "E" or "e" except for some listed exceptions: [eE](?!\b|cologia ... |xtranj). One of the listed exceptions is that it will not change "euphemism" or any other word beginning with "eu" and at least two more letters: [uU](?:[A-Za-z]{2}|\sde\b). After checking the following word, the rule looks back at the preceding word to check for some foreign-language false positives; this stops it changing "il a eu un". The part that spots "il" is [iI](?:\b|l\b| ... |storie|terum)

I have added u\b to the list of exceptions so that in future the rule will change "a EU" but not "a eu" or "a Eu".

But, yes, this rule has false positives, and all AWB users should be checking every edit they make, and {{lang}} tags are a good idea. -- John of Reading (talk) 08:17, 18 November 2022 (UTC)
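
The exception logic described above can be sketched in Python. This is a heavily simplified illustration and not the actual "A to An" rule: it only models words beginning with E/e and the two "eu" exceptions quoted above, it omits the look-back at the preceding word (the part that spots "il"), and the names A_BEFORE_E and a_to_an are hypothetical.

```python
import re

# Simplified stand-in for the "A to An" rule, restricted to words starting with E/e.
# Exceptions modelled (both taken from the explanation above):
#   [uU][A-Za-z]{2}  "eu" plus at least two more letters (euphemism, Euro)
#   u\b              the bare word "eu" or "Eu" (lowercase u only, so "EU" is still changed)
A_BEFORE_E = re.compile(r"\b([aA]) (?=[eE](?![uU][A-Za-z]{2}|u\b))")


def a_to_an(text: str) -> str:
    """Change "a" to "an" before an E-word unless one of the exceptions applies."""
    return A_BEFORE_E.sub(r"\1n ", text)


if __name__ == "__main__":
    for sample in ["a EU directive", "il a eu un accident", "a euphemism", "a elephant"]:
        print(f"{sample!r} -> {a_to_an(sample)!r}")
```

In this sketch the `u\b` alternative alone is what protects "a eu" and "a Eu"; the real rule additionally checks the preceding word.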

thank you for the explanation and the fix for the specific error. I also agree that Mathglot (talk · contribs) has made a good case for the lang template. Since I frequently see quotes in other languages, I will see about applying it in such cases, if it prevents headaches over here. Elinruby (talk) 15:55, 18 November 2022 (UTC)

till -> until

In the phrase "glacial till", the word "till" should not be changed to "until". Glacial till is glacial sediment. example diff Kaltenmeyer (talk) 20:13, 13 November 2022 (UTC)

Should we remove this well-intentioned new rule altogether? Several other uses such as "to till the pasture" and "Gone till November" would be hard to detect and skip. Certes (talk) 22:03, 13 November 2022 (UTC)
Since this has been added, I've corrected dozens, maybe hundreds of "till"s and many/most are "till [date]". Allowing it to at least change it when followed by a number should be an improvement (although "till 20 acres" is still a problem). MB 22:54, 13 November 2022 (UTC)
There are about 1K+ exceptions, combinations/permutations of such as soil, glacial, plain, money, with around 50K entries. So unless some serious work is done around exception processing then would vote yes, for suspension. I couldn't find a discussion around the expunging of 'till'? Editors who want to see it removed could use the current regex in their own configs as they are more likely to check use syntax. Maybe it should be targeted as suggested by MB. Neils51 (talk) 23:51, 13 November 2022 (UTC)
Changing "till" to "until" before a number not followed by acre etc. sounds like a good compromise. Most occurrences of "till word" seem to be titles of works such as From Dusk till Dawn or land-related use such as "till plain". (Non-optimal regexp: \btill(?=\s+\d)(?!\s+\d+[-\s]+(acre|hectare)s?\b) -> until.) Certes (talk) 12:16, 14 November 2022 (UTC)
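
The suggested compromise can be tried out with a short Python sketch. It uses the regexp quoted above (with the group made non-capturing); the names TILL_BEFORE_NUMBER and fix_till are hypothetical, and Python's re module is only assumed to behave like AWB's regex engine for this pattern.

```python
import re

# "till" followed by a number, unless that number is followed by acre(s)/hectare(s).
TILL_BEFORE_NUMBER = re.compile(
    r"\btill(?=\s+\d)(?!\s+\d+[-\s]+(?:acre|hectare)s?\b)"
)


def fix_till(text: str) -> str:
    """Replace "till" with "until" only where the compromise rule matches."""
    return TILL_BEFORE_NUMBER.sub("until", text)


if __name__ == "__main__":
    for sample in [
        "open till 1990",
        "they till 20 acres",
        "glacial till",
        "From Dusk till Dawn",
    ]:
        print(f"{sample!r} -> {fix_till(sample)!r}")
```

Only the first sample is changed; the land-related and title uses are left alone because no digit follows "till" or because the acre/hectare lookahead blocks the match.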
I've not been editing for several days, and have come here because in the few i've done so far today i'm seeing a silly number of "till" > "until"s; why are we making this change? "Till" is not incorrect, more a matter of style or taste, isn't it? Was there any discussion about this change? Happy days ~ LindsayHello 12:50, 15 November 2022 (UTC)

My go-to reference for usage is Bryan Garner; here is his entry for till:

till; until. Till is, like until, a bona fide preposition and conjunction. Though less formal than until, till is neither colloquial nor substandard. It's especially common in BrE—e.g. ...

followed by several examples from BrE usage, and then:

And it still occurs in AmE—e.g. ...

followed by examples, and then:

If a form deserves a sic it's the incorrect 'til. Worse yet is 'till which is abominable, ...

followed by yet more examples of that monstrosity in reliable, printed publications. My take: get rid of it; it's neither helpful, nor correct. Mathglot (talk) 20:39, 17 November 2022 (UTC)

Not certain, Mathglot which "it" you are recommending we get rid of; if you mean the "till > until" change, i fully agree; when AWB editing it is the typo"fix" i most often have to unfix. Only thing is, it's a month since the last comment here, and the thing still is there. Any possibility that someone with the ability can/will remove it? Happy days ~ LindsayHello 11:06, 24 December 2022 (UTC)
 Done. Mathglot (talk) 19:15, 24 December 2022 (UTC)

Fiancée

I recently fixed about 100 instances of "Fianceé", mostly to "Fiancée" but a few to the masculine "Fiancé". I was thinking of adding something like

<Typo word="Fiancée" find="\b([fF])ianc[eé]é" replace="$1iancée"/>

That works in testing but has no effect in the actual WP:AWB/T file. It might be a useful addition if anyone can get it to actually do something! (I omitted the customary final \b for the benefit of JWB, which doesn't recognise "é" as a letter. It should be safe unless you know of a Mr. Fianceéwibble.) Certes (talk) 16:11, 25 November 2022 (UTC)

Double entry? Did some testing and what's working is this current entry;
 <Typo word="Fiancé" find="\b([fF])ianc[eè](e)?\b(?![^\s\.]*\.\w)" replace="$1iancé$2"/><!--avoid domains-->
Neils51 (talk) 21:31, 24 December 2022 (UTC)
Yes, that similar entry works. It's fixing Fiancee and Fiancèe (also without the final e); I'm trying to fix Fianceé and Fiancéé. Certes (talk) 22:32, 24 December 2022 (UTC)
Tried with unicode values and seems to work. Neils51 (talk) 12:49, 25 December 2022 (UTC)
Thanks. That should work in AWB. I've removed the final \b, partly to match "Fianceés" but mainly because JWB's \b matches only an A-Z boundary, which does not occur between é and space in text such as "...fianceé was...". Certes (talk) 13:05, 25 December 2022 (UTC)
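
A small Python sketch of the point about the trailing \b. It uses the rule proposed at the top of this section; the re.ASCII flag is only an approximation of the JWB behaviour described above (a \b that treats "é" as a non-letter), and the names FIANCEE, FIANCEE_ASCII_B and fix_fiancee are hypothetical.

```python
import re

# The proposed rule, with the customary trailing \b omitted.
FIANCEE = re.compile(r"\b([fF])ianc[eé]é")

# The same rule with a trailing \b. Under re.ASCII, \b only recognises
# [A-Za-z0-9_] as word characters, so there is no boundary between "é" and a space.
FIANCEE_ASCII_B = re.compile(r"\b([fF])ianc[eé]é\b", re.ASCII)


def fix_fiancee(text: str) -> str:
    """Move or remove the misplaced accent: Fianceé / Fiancéé become Fiancée."""
    return FIANCEE.sub(r"\1iancée", text)


if __name__ == "__main__":
    sample = "his fianceé was present"
    print(fix_fiancee(sample))
    # The ASCII-boundary variant finds nothing in the same text.
    print(FIANCEE_ASCII_B.search(sample))
```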

(Arch)nemesis rule

"Archnemesis" is not a typo on the wiki I'm editing -- how do I disable this rule?

<Typo word="Nemesis" find="\barch[–‑−—―\s]?nemesis\b" replace="nemesis"/>

Nightyb (t) 03:38, 21 January 2023 (UTC)

@Nightyb Hi there! Could you please help us understand why you want to disable this rule on the wiki you're editing? Are you stating that "archnemesis" is a valid word on the wiki you're editing? Thanks! GoingBatty (talk) 04:29, 21 January 2023 (UTC)
Hi, yes that's exactly what I'm saying! 🙂 Nightyb (t) 04:36, 21 January 2023 (UTC)
@Nightyb - Which dictionary or other source does your wiki use to define "archnemesis" as a valid word? GoingBatty (talk) 20:41, 21 January 2023 (UTC)
@GoingBatty -- The wiki I am editing pertains to a game that features content known as "Archnemesis". In any case, it does appear to be a legitimate word, as verified by its presence in multiple online dictionaries [1] [2]. Consequently, I believe it would be unjustified to label it as a typo, especially on the game's developer's part. I am happy to disable this rule locally if you know of a way to do it in AWB. -- Nightyb (t) 02:44, 22 January 2023 (UTC)
@Nightyb: Based on the Merriam-Webster link you provided, maybe we should change the rule to:
<Typo word="Archemesis" find="\b([Aa])rch[–‑−—―\s]nemesis\b" replace="$1rchnemesis"/>
What do you think? GoingBatty (talk) 02:49, 22 January 2023 (UTC)
@GoingBatty If I read that regex correctly, that rule only removes hyphens? That would suit my use-case perfectly, thanks. Nightyb (t) 02:58, 22 January 2023 (UTC)
@Nightyb The \s in the rule would also remove a space between "arch" and "nemesis". I've made the change to the typo rule. GoingBatty (talk) 03:07, 22 January 2023 (UTC)
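
For anyone wanting to check the revised rule locally, here is a minimal Python sketch. The character class is copied from the rule as quoted above; the names ARCH_NEMESIS and join_archnemesis are hypothetical, and Python's re module is assumed to treat this pattern the same way AWB does.

```python
import re

# Dash-like characters and whitespace between "arch" and "nemesis",
# copied from the rule as quoted in this thread.
ARCH_NEMESIS = re.compile(r"\b([Aa])rch[–‑−—―\s]nemesis\b")


def join_archnemesis(text: str) -> str:
    """Close up "arch nemesis" and dashed variants into "archnemesis"."""
    return ARCH_NEMESIS.sub(r"\1rchnemesis", text)


if __name__ == "__main__":
    for sample in ["his arch nemesis", "Arch–nemesis", "the Archnemesis league"]:
        print(f"{sample!r} -> {join_archnemesis(sample)!r}")
```

Because the separator is now mandatory, text that already reads "archnemesis" is never touched.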

Human typo-checking question

I’m certain this is the wrong place to ask, but I hope people here understand the problem: How would I go about correcting simple “typos” when half of them are likely to be correct?

Example: Act or Bill should only be capitalised when part of the title of some Act or Bill. When talking about them as common nouns, they should be lowercase. This is contrary to style for a lot of writers in legal professions, but obviously can’t be corrected by a simple "Act" -> "act" replacement through the entire article.

I’m happy to go through case by case, but I’d like to be able to queue up a whole category of articles through JWB or similar, rather than open each page one-by-one and have to set up multiple find-replace pairs one-by-one as I go. — HTGS (talk) 00:00, 21 January 2023 (UTC)

@HTGS In AWB, I could do a Wiki search (text) for Act insource:/the Act / and see 5,940 results. If JWB has a similar functionality, that should keep you busy for a while. Happy editing! GoingBatty (talk) 04:27, 21 January 2023 (UTC)
Yes, Cirrus search is easy in JWB (Setup → Generate → Wiki search). I'd do something like "the Act" insource:/ the Act[^a-z]/ to include "the Act." etc. and eliminate Tithe Act etc., but no search is perfect and there will still be false positives such as Signatories of the Act of Independence of Lithuania. I got 13,275 hits in mainspace plus a handful of portal pages. Certes (talk) 16:08, 21 January 2023 (UTC)
Thank you both, but what I’m looking for is a tool that allows you to review each found item—as you would have to in the standard wiki editor—but also queues up multiple articles in the way that JWB and AWB do, only without their replace-all default. Finding articles to queue isn’t my main concern.
This single-step find-replace click by click is important, because even in this case alone, as Certes points out, the search finds phrases like “Catholics were barred from the throne by the Act of Settlement 1701”, and there could be thousands of exceptions, so I certainly don’t want to start with the presumption of replacing every instance. Obviously I also want to replace instances of Act that aren’t preceded by the, and figuring out clever-enough searches is beyond me, so I’d be happier just manually clicking yes or no to each one. I’d also like the ability to load multiple different find-replace “typos”, in the way that the autobrowsers do currently (Act, Bill, etc).
Entirely possible that there isn’t a tool that suits this use case, so maybe I’ll just have to ask the AWB/JWB devs… or maybe over at the idea lab? — HTGS (talk) 22:01, 21 January 2023 (UTC)
I use JWB with the presumption that change is needed and simply press Skip or Save on each article. That seems efficient for me (if not my PC) even when I expect 90% of my list to be false positives. With the diff clearly presented, I find it easy to dismiss inappropriate suggestions such as the act of Settlement accurately. AWB behaves similarly, with the advantage that you can easily click away one loser amongst a page of good changes. Certes (talk) 22:59, 21 January 2023 (UTC)
@HTGS With AWB, you can queue up a list of articles. In the Options tab, you can choose whether to have general fixes on or off, and whether to have typo fixing on or off. In addition, in the Find and replace Normal settings, you can set up many of your own rules. For example, a simple rule could be to replace the Act with the act (each one containing a space at the beginning and the end), and make sure the "CaseSensitive" box is checked. You could also check the "Regex" box, and make a rule that would replace the Act([^a-z]) with the act$1 (again, each one containing a space at the beginning and the end).
There is more information on how to use AWB in its user manual. Hope this helps, and happy editing! GoingBatty (talk) 23:34, 21 January 2023 (UTC)
Even better: replace something like \bthe Act\b(?! of [A-Z]) by the act. Cirrus Search doesn't support word boundary checking, but AWB and JWB do with \b. Similarly, search can't look ahead, but AWB and JWB can with (?!…). Certes (talk) 11:29, 22 January 2023 (UTC)
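
The suggested find-and-replace pair can be tested outside AWB/JWB with a short Python sketch. The pattern is the one quoted above; the names THE_ACT and lowercase_act are hypothetical, and every hit would still need a human review as discussed in this thread.

```python
import re

# "the Act" as a whole phrase, unless followed by " of " and a capitalised word
# (which suggests a title such as "the Act of Settlement").
THE_ACT = re.compile(r"\bthe Act\b(?! of [A-Z])")


def lowercase_act(text: str) -> str:
    """Lowercase "Act" in "the Act" when it does not look like part of a title."""
    return THE_ACT.sub("the act", text)


if __name__ == "__main__":
    for sample in [
        "Under the Act, ministers may appoint inspectors.",
        "barred from the throne by the Act of Settlement 1701",
        "repealed by the Tithe Act",
    ]:
        print(lowercase_act(sample))
```

The \b at the start is what keeps "Tithe Act" from matching, since there is no word boundary inside "Tithe".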

"Taking one's life" is not a euphemism

This should probably be removed from the list of expressions to replace—not every figure of speech referring to suicide is a euphemism. Nothing about the phrase is intended to avoid the perceived force, stigma, or other negative connotations of the word "suicide" or the phrase "killed himself". This seems like an overly-broad categorization of ordinary English; an encyclopedia that expressly permits "died by suicide", which actually is euphemistic, should not be proscribing standard—and venerable—formulations that are not euphemisms. The expressions listed in MOS:SUICIDE are clearly not intended to be an exhaustive list of allowed formulations—hence the statement that "[t]here are many other appropriate, common, and encyclopedic ways to describe a suicide". "Taking one's life" is one of those ways; simple, straightforward, and perfectly encyclopedic; in no way is it a circumlocution. P Aculeius (talk) 12:02, 20 February 2023 (UTC)

Since you previously posted about this issue on my talk page, I have done my best to form somewhat of an opinion. I do agree with the addition to the list of expressions, simply because it is clearer. Yes, "to take one's own life" sounds nicer (in my opinion), but it is not as straightforward as "to commit suicide" or "to kill oneself". Can you explain why you defend the usage of "to take one's own life" over other, arguably clearer, options? Ideasmete (talk) 14:34, 20 February 2023 (UTC)
Too much to unpack here. First, I posted on your talk page as a courtesy because your name was attached to the edits. You said you didn't really care, and that I needed to take it up here instead. Now you're chasing me around Wikipedia undoing my edits to restore the original text, and telling me that I'm the one who needs consensus to undo controversial bot edits! That's not how Wikipedia editing policy works. If you want to make a controversial change, you're the one who needs to develop a consensus for it. I don't need to defend ordinary English; you need to show why it should be proscribed. You want to change Wikipedia's policy about describing suicide to say "there are only a couple of appropriate ways to describe it, and these are the only ones you're allowed to use", you take it up there. Kindly revert your changes until you have some kind of consensus to do exactly that. P Aculeius (talk) 16:24, 20 February 2023 (UTC)
@P Aculeius: I have disabled these re-spelling rules since the guidance at MOS:SUICIDE is not currently stable, having flipped back and forth during January. Any discussion about what MOS:SUICIDE should say belongs at the talk page there. Once the manual of style is agreed and stable again, then it might be safe to re-enable the spelling rules in some form. -- John of Reading (talk) 17:22, 20 February 2023 (UTC)
Thank you, much appreciated. P Aculeius (talk) 17:28, 20 February 2023 (UTC)

Exception for "state wide receiver(s)"

Here in this edit you can see "Penn State wide receivers" was changed to "Penn Statewide receivers". An exception for "state wide" when followed by "receiver(s)" would probably prevent this. — Nythar (💬-🍀) 12:29, 28 April 2023 (UTC)

@Nythar: Hopefully done with this edit without breaking anything. -- John of Reading (talk) 13:01, 28 April 2023 (UTC)

"Aberrant/Aberration" rule is matching "Abberath"

<Typo word="Aberrant/Aberration" find="\b([aA])b(?:ber?|e)ra([nt][a-z]+)(?<!Aberangell)\b" replace="$1berra$2"/>

Hi again! This rule also captures "Abberath" which is a valid word on the wiki I'm editing. (It's the name of an NPC in the game.) — Nightyb (t) 04:23, 20 February 2023 (UTC)

Here's my proposed change, which excludes both "Abberath" and "Aberangell", and improves performance:
<Typo word="Aberrant/Aberration" find="\b(a|A)b(?:ber?|e)ra((?:n|t)[i-z]+)\b" replace="$1berra$2"/>
https://regex101.com/r/rFQPcd/1 Nightyb (t) 02:22, 21 February 2023 (UTC)
It's been a while since I suggested this and there's been no feedback, so what happens now? Do I just change it myself? —Nightyb (t) 04:26, 2 May 2023 (UTC)
@Nightyb: Be bold and change it yourself. Thanks! GoingBatty (talk) 05:40, 2 May 2023 (UTC)
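
The proposed replacement rule can be checked against the names mentioned in this thread with a small Python sketch. The pattern is the proposed one from above; the names ABERRANT and fix_aberrant are hypothetical, and Python's re module is assumed to agree with AWB for this pattern.

```python
import re

# Proposed rule: the suffix after "n" or "t" must start in the range i-z,
# which excludes both "Abberath" (t + h) and "Aberangell" (n + g).
ABERRANT = re.compile(r"\b(a|A)b(?:ber?|e)ra((?:n|t)[i-z]+)\b")


def fix_aberrant(text: str) -> str:
    """Normalise misspellings such as "abberant" or "aberation"."""
    return ABERRANT.sub(r"\1berra\2", text)


if __name__ == "__main__":
    for sample in ["abberant", "aberation", "aberration", "Abberath", "Aberangell"]:
        print(f"{sample!r} -> {fix_aberrant(sample)!r}")
```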

"at the at the"

I've been working my way through "at the at the" changing them to "at the" and I'm not getting enough false positives to be a problem for AWB. Could this be added to the rules? ϢereSpielChequers 17:38, 2 July 2023 (UTC)

Done - Neils51 (talk) 20:54, 3 July 2023 (UTC)
Ta muchly ϢereSpielChequers 21:44, 3 July 2023 (UTC)

From year to year, twice

Any way to handle He was also mayor of Halifax from 1874–1876 and 1884–1885. better?
Right now it changes this to He was also mayor of Halifax from 1874 to 1876 and 1884–1885. -- Jonatan Svensson Glad (talk) 22:00, 28 August 2023 (UTC)

I've seen similar cases. We could do a negative lookahead for "and number", resulting in these cases not being changed. Otherwise it would have to be a full check for "from n1-n2 and n3-n4", repeating all the logic needed when n2 has fewer digits than n1, etc. Certes (talk) 22:46, 28 August 2023 (UTC)
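
The negative lookahead idea could look roughly like the Python sketch below. This is not the actual AWB rule, which is not quoted here: the pattern is a simplified, hypothetical stand-in that only handles two four-digit years joined by an en dash, and the names FROM_RANGE and fix_from_range are made up for illustration.

```python
import re

# Hypothetical simplified rule: "from 1874–1876" becomes "from 1874 to 1876",
# but not when the range is followed by "and <number>", which suggests a second range.
FROM_RANGE = re.compile(r"\bfrom (\d{4})–(\d{4})\b(?!\s+and\s+\d)")


def fix_from_range(text: str) -> str:
    """Expand "from YYYY–YYYY" unless another range follows after "and"."""
    return FROM_RANGE.sub(r"from \1 to \2", text)


if __name__ == "__main__":
    print(fix_from_range("He was also mayor of Halifax from 1874–1876 and 1884–1885."))
    print(fix_from_range("He was mayor of Halifax from 1874–1876."))
```

With the lookahead in place the first sentence is left unchanged rather than half-converted; a sentence such as "the vase dates from 1100–1200" would still be changed by this sketch.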

These replacements also create false positives, e.g. "the vase dates from 1100–1200" probably shouldn't be changed. Certes (talk) 22:48, 28 August 2023 (UTC)

Capitalisation of partial title of institution name

There is a capitalisation dispute at Royal Commission into the New South Wales Police Service, wherein Wikipedia:Manual_of_Style/Capital_letters#Institutions is used to determine the partial title of the same body. The dispute is over the interpretation of the rule, and a potential conflict with this AWB capitalisation bot that is converting all the initial caps of the partial title to lower case, and raising generic terms into initial upper case. However, the rule states that if a partial title is used, the same capitalisation must be used as for the full title.

My interpretation of this rule is therefore as below:
Full title: The Royal Commission into the New South Wales Police Service
Partial title: The Royal Commission
Partial title: The Commission

Each is referring to the same, specific body, and so I believe the same capitalisation should apply everywhere it is used. Disagreement has arisen wherein it has been claimed that when used in the third form, 'commission' becomes generic, and therefore no initial capital should apply. I disagree, on the grounds that the article is referring to the Commission, not a commission or generic commissions and I believe this conforms with the above rule. If one is talking about a royal commission, or royal commissions, or even just commissions in general, no capitalisation should be used, but if one is referring to the Royal Commission as a partial title for the Royal Commission into the New South Wales Police Service then initial capitals should be used.

Elsewhere in the article, reference is made to 'royal commissions' and 'a royal commission', e.g. "even by the standards of a royal commission" - both generic, and therefore both should be in lower case according to the same rule adduced above. The article Royal Commission into the New South Wales Police Service has been published since 2005, i.e. 18 years, with the capitalisation as described above, so if this bot is responsible for reverting such a longstanding text, then I believe it is the bot that needs revision, not existing articles such as this one, that obey the existing capitalisation rule adduced above. Chrisdevelop (talk) 16:38, 6 May 2023 (UTC)

@Chrisdevelop: Wikipedia's style for this is described at MOS:INSTITUTIONS. Following Wikipedia's style, the first would definitely be capped, the third would definitely be lower case. I understand your point that it is referring to a specific commission, but Wikipedia's style, which is common among publications, is to use lower case. Many institutions capitalize "the University" or "the Zoo" when talking about themselves, but not when talking about other universities or zoos. Since Wikipedia talks about almost everything, and is not affiliated with just one institution, it has chosen to always use lower case. (Though The Signpost capitalizes "the Board" when referring to the Wikimedia Foundation's Board of Trustees, so even Wikipedia doesn't follow its own style consistently.) The middle one technically should be lower case too, but sometimes that looks wrong to me for two-word or longer phrases, so I might change it or not.
P.S. This frequently comes up at WT:MOSCAPS. SchreiberBike | ⌨  19:10, 6 May 2023 (UTC)
@SchreiberBike: Thank you for weighing in. This article is about the Wood Royal Commission, so when using the short titles, ‘Wood Commission’, ‘Royal Commission’, or just ‘Commission’, it is ‘referring to itself’, as in the Wikipedia example you mentioned. In the current context, making the word 'commission' generic means it can be any commission (other commissions are named in the article), and it can also mean 'commission of a crime' or 'commission a composition', or ‘the commission’ as a homonymic concept, although of course, if you read the article it becomes clear. Such an interpretation doesn't however comport with the MOS rule adduced above. Please refer to this sentence in the rule: "Also treat as a proper name a shorter but still specific form, consistently capitalized in reliable generalist sources (e.g., US State Department or the State Department, depending on context)." Chrisdevelop (talk) 16:53, 7 May 2023 (UTC)
@Chrisdevelop: I'd say that if it's difficult to tell if commission refers to the Wood Royal Commission or to something else, the sentence should be rewritten. I can see that in some places capitalizing that way can be helpful, but the rule in Wikipedia, an encyclopedia about many commissions, universities, zoos, etc., comes as a response to people wanting to capitalize their "University", their "Zoo", etc. in the articles about those institutions. Also note that the examples you give above from MOS:INSTITUTIONS do not include "the Department". If you're not finding my explanation persuasive, feel free to ask at WT:MOSCAPS. Thank you. SchreiberBike | ⌨  20:34, 7 May 2023 (UTC)
@Chrisdevelop: What is the name of the "AWB capitalisation bot"? Thanks! GoingBatty (talk) 17:50, 8 May 2023 (UTC)
AutoWikiBrowser, the bot that was run on the Wood Royal Commission page. Chrisdevelop (talk) 18:08, 8 May 2023 (UTC)
@Chrisdevelop: AutoWikiBrowser (AWB) is the name of a piece of software used by both human editors and bots. Looking at the article history, it appears that human editor Neils51 was running AWB manually with the typo correction feature enabled. This discussion would be better suited for Wikipedia talk:AutoWikiBrowser/Typos. Bots using AWB do not perform typo corrections. GoingBatty (talk) 18:32, 8 May 2023 (UTC)

Discussion moved Neils51 (talk) 23:14, 8 May 2023 (UTC)

Using "the commission" doesn't imply it's generic, but that it's the object you're referring back to. Like if I was discussing the RMS Queen Mary, I'd refer back to it as "the ship" or "the ocean liner", rather than "the Ship" or "the Ocean Liner". Uppercasing the object makes great sense in marketing materials or event invitations, but in plain ol' descriptive encyclopedic text, I can't see a necessity. Stefen Towers among the rest! GabGruntwerk 20:17, 25 September 2023 (UTC)

"short distance overland" -> "short-distance overland"?

In Fort Wayne, Indiana, AWB wants to correct the sentence "The most important geographical feature of the area is the short distance overland between the Three Rivers system, which eventually flows to the Atlantic, and the Wabash system, which eventually flows to the Gulf of Mexico." by hyphenating "short-distance". I guess I'm a little befuddled at the language in the sentence. Maybe it should say "short distance over land"? Or is 'overland' a noun in this case by chance? Stefen Towers among the rest! GabGruntwerk 01:35, 1 October 2023 (UTC)

It should be "short distance over land", since "overland" is an adjective but is being misused as a noun phrase here. In a construction like "It was a short-distance overland journey" both hyphenated "short-distance" and fully compounded "overland" would be correct.  — SMcCandlish ¢ 😼  11:11, 8 October 2023 (UTC)
Thank you. Assuming they didn't intend to say "overland journey", I've copyedited the article in question to split 'overland'. Stefen Towers among the rest! GabGruntwerk 16:04, 8 October 2023 (UTC)

Needs to stop removing hyphens from compound modifiers

In "a well-received presentation" and similar constructions, the hyphen belongs there per MOS:HYPHEN.  — SMcCandlish ¢ 😼  11:08, 8 October 2023 (UTC)

I agree with SMcCandlish, this rule (listed under the section "New: remove other hyphens (replace with space)"):
<Typo word="well received" find="\b([wW])ell-received\b(?=\.|\s+(?:at\b|by\b|in\b))" replace="$1ell received"/>
seems to violate MOS:HYPHEN, specifically the section that states: "A hyphen is normally used when the adverb well precedes a participle used attributively (a well-meaning gesture; but normally a very well managed firm, because well itself is modified) and even predicatively, if well is necessary to, or alters, the sense of the adjective rather than simply intensifying it (the gesture was well-meaning, the child was well-behaved, but the floor was well polished)." This topic was thoroughly discussed on my user talk for those who are interested (User talk:Wikipedialuva/Archives/2023/October#Cleanup that is the opposite of cleanup). Thanks! Wikipedialuva (talk) 08:41, 11 October 2023 (UTC)
I think the intent of the correction was to fix something like "The plan was well-received by Bubba" which shouldn't have a hyphen. Definitely this should be reviewed to be rewritten to avoid false positives. Stefen Towers among the rest! GabGruntwerk 08:50, 11 October 2023 (UTC)
That rule does not change "a well-received presentation", since the "?=" clause checks for a following full stop or three specific prepositions. You can verify this using the current contents of User:John of Reading/X3. -- John of Reading (talk) 09:35, 11 October 2023 (UTC)
Not "a well-received presentation", but an example of the error is when AWB changed "The film was well-received by critics, although with the occasional reservation." to "The film was well received by critics, although with the occasional reservation." on the article The Hustler. Here is the diff: [3] Thanks! Wikipedialuva (talk) 10:25, 11 October 2023 (UTC)
Courtesy ping to Chris the speller who has worked on this rule previously. -- John of Reading (talk) 11:02, 11 October 2023 (UTC)
I think that SMcCandlish used to understand the difference in hyphenation (predicatively vs. attributively) stated in MOS:HYPHEN, so I am very surprised that he has complained about this instance. The MOS states "A hyphen is normally used ... and even predicatively, if well is necessary to, or alters, the sense of the adjective rather than simply intensifying it". When the case of "was well received by critics" is compared to the specific examples in MOS:HYPHEN, "the child was well-behaved, but the floor was well polished", it is much more like the "well polished" example. Chris the speller yack
Hmm. Well, people do monkey around with the MoS wording from time to time. I can't think of a good reason for "the film was well received by critics" or "the floor was well polished" in material that would write "the child was well-behaved". The notion of a split along those lines is apt to be confusing to most editors (and readers), and it presumes a level of linguistic-analysis expertise that will be missing from most people in both categories.  — SMcCandlish ¢ 😼  17:17, 11 October 2023 (UTC)
One difference in hyphenation between "the child was well-behaved" and "the film was well received by critics" is that the hyphen in "well-behaved" is so entrenched that (American, at least) dictionaries specify a hyphen, as it is somewhat of an idiom. But "well received by" has not gotten to that point and may never get there, as many kids are growing up with the punctuation on boxes of breakfast cereal as their model. Chris the speller yack 02:39, 12 October 2023 (UTC)
At any rate, that specific fix doesn't appear to be in error, per my example. Stefen Towers among the rest! GabGruntwerk 21:41, 11 October 2023 (UTC)
Also, I'm not saying our own Wikipedia article is a definitive source, but Compound modifier#Hyphenation of elements in English by my reading seems to back up not using a hyphen in that case. Stefen Towers among the rest! GabGruntwerk 22:23, 11 October 2023 (UTC)
Note that I'm not accusing the rule of having an error, just stating that a review of it may be warranted. Stefen Towers among the rest! GabGruntwerk 21:43, 11 October 2023 (UTC)
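
For readers who want to check the lookahead behaviour described above without AWB, here is a minimal Python sketch of the quoted rule. It assumes Python's `re` treats this pattern the same way AWB's .NET engine does (which should hold for this simple pattern), and it writes the `$1` replacement as `\1`. The function name `apply_rule` is made up for illustration.

```python
import re

# The find/replace pair quoted in this thread; .NET's "$1" becomes r"\1" in Python.
WELL_RECEIVED = re.compile(r"\b([wW])ell-received\b(?=\.|\s+(?:at\b|by\b|in\b))")

def apply_rule(text: str) -> str:
    """Apply the 'well received' typo rule to a piece of text."""
    return WELL_RECEIVED.sub(r"\1ell received", text)

samples = [
    # Attributive use: no full stop or at/by/in follows, so the rule leaves it alone.
    "a well-received presentation",
    # Predicative use followed by "by": the hyphen is replaced with a space.
    "The film was well-received by critics, although with the occasional reservation.",
    # Followed directly by a full stop: also changed.
    "The plan was well-received.",
]
for sample in samples:
    print(apply_rule(sample))
```

This only reproduces what the rule does; whether the predicative change is right is the MOS:HYPHEN question discussed in the thread.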

Nobold and Noitalic templates --> Normal

@Remsense Hi there! I see you added a rule to convert {{nobold}} & {{noitalic}} to {{normal}}. At first, I was going to suggest using WP:AWB/TR instead of this page. But then I looked at the instructions at Template:Nobold and Template:Noitalic, and don't see any mention of converting these templates. Is there consensus to make these conversions? GoingBatty (talk) 19:58, 12 October 2023 (UTC)

Understood, I'll broach it there before I imply it's normative with a rule like this. :) Remsense 20:11, 12 October 2023 (UTC)
Sounds good. Further, this would not be a typo rule at any rate per GoingBatty. Template replacements are a different AWB department. Stefen Towers among the rest! GabGruntwerk 20:18, 12 October 2023 (UTC)
I had an inkling, but I suppose the stakes for learning are never too high on Wikipedia. :) Remsense 20:32, 12 October 2023 (UTC)

re: de facto/jure

@Certes indeed! but according to MOS:FOREIGNITALICS, which seems air-tight enough, it should absolutely not be italicized just because it's a prepositional phrase borrowed from Latin and used in law. It gives the example that "etc." shouldn't, and provides the rule of thumb that it shouldn't be italicized if it's in multiple English dictionaries. I would reckon 'de facto' and 'de jure' have been in every English dictionary since maybe those were invented. Remsense 18:25, 12 October 2023 (UTC)

Discussion here! Remsense 18:32, 12 October 2023 (UTC)
Thanks for starting a(nother) discussion. I hope it will attract further informed opinions, as we both seem to have reasonable but incompatible cases. Certes (talk) 22:39, 12 October 2023 (UTC)
if you are so inclined and have a chance, i would be curious to hear yours! Remsense 23:29, 12 October 2023 (UTC)
Of the articles containing the expression "de facto", 2,391 use italics and 938 do not. Corresponding figures for "de jure" are 3,636 with italics and 1,655 without. Automated typo fixes are meant for changes that have universal or near-universal agreement. It looks as if 70% of editors prefer not to make this particular change. Certes (talk) 12:18, 13 October 2023 (UTC)
oh certainly, i think the auto typo fix was not correct on my part, to be clear Remsense 16:18, 13 October 2023 (UTC)
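
The "70%" figure can be checked from the counts quoted above. This is a small Python sketch of that arithmetic; note that the counts are of articles, so treating them as a proxy for editor preference is an assumption. The helper name `italic_share` is illustrative only.

```python
# Article counts quoted in the thread: (with italics, without italics).
counts = {
    "de facto": (2391, 938),
    "de jure": (3636, 1655),
}

def italic_share(with_italics: int, without: int) -> float:
    """Percentage of articles that italicise the phrase."""
    return 100 * with_italics / (with_italics + without)

for phrase, (ital, plain) in counts.items():
    print(f"{phrase}: {italic_share(ital, plain):.1f}% italicised")

# Combined share across both phrases.
total_ital = sum(ital for ital, _ in counts.values())
total_plain = sum(plain for _, plain in counts.values())
print(f"combined: {italic_share(total_ital, total_plain):.1f}% italicised")
```

Both phrases come out close to the 70% mark, which is well short of the near-universal agreement expected of an automated typo fix.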

I was just a little overconfident a bit ago when I tried to address false positive cases like "University of [[Louisville Cardinals]]" being "fixed" by lower-casing 'university', which is definitely not the desired result. Is there a way to accomplish this in the regex? Stefen Towers among the rest! GabGruntwerk 02:32, 28 October 2023 (UTC)

Note: My change worked in Find & Replace, but not typo correction. Apparently the typo correction code can't see [[ links, so I can't take them into account. Stefen Towers among the rest! GabGruntwerk 23:59, 28 October 2023 (UTC)

"Childrens"

Replacing "Childrens" with "Children's" can result in (equally) incorrect prose. See Special:Diff/1185384334 for an example diff, changing "childrens'" to "children's'".

The relevant rule looks to be

<Typo word="Children's" find="\b([cC]|[gG]randc|[sS]tepc)hild(?:er|re)ns(?:['’´ˈ׳᾿‘′Ꞌꞌ`]s?)?\b" replace="$1hildren's"/>

A modification of the regex to

\b([cC]|[gG]randc|[sS]tepc)hild(?:er|re)ns\b(?:['’´ˈ׳᾿‘′Ꞌꞌ`]s?)?

should fix this specific issue, but I'm not sure there's not some other issue it causes in turn and would thus like a sanity-check before I go modifying the regex myself. Credit to Aidan9382 for spotting the fix. Ljleppan (talk) 12:06, 16 November 2023 (UTC)

Thank you; that looks like a useful improvement. I've also tripped over this problem and had to fix children's' manually, but at least AWB found the problem. There is also the issue of She had two childrens, which this won't fix, but there can be no automated way to tell the two apart and again at least the problem is found and presented to the AWB user for manual improvement. Certes (talk) 18:36, 16 November 2023 (UTC)
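
To compare the two versions of the regex side by side, here is a short Python sketch. It assumes Python's `re` behaves like AWB's .NET engine for this pattern; the names `CURRENT`, `PROPOSED` and `fix` are made up for the example. The only difference between the two patterns is where `\b` sits relative to the optional apostrophe group.

```python
import re

APOS = "['’´ˈ׳᾿‘′Ꞌꞌ`]"  # apostrophe-like characters, copied from the rule
STEM = r"\b([cC]|[gG]randc|[sS]tepc)hild(?:er|re)ns"

# Existing rule: \b comes after the optional apostrophe group.
CURRENT = re.compile(STEM + "(?:" + APOS + r"s?)?\b")
# Suggested modification: \b moved before the optional apostrophe group.
PROPOSED = re.compile(STEM + r"\b(?:" + APOS + "s?)?")

def fix(rule: re.Pattern, text: str) -> str:
    """Apply a rule using the replacement from the typo list ($1 written as \\1)."""
    return rule.sub(r"\1hildren's", text)

text = "the childrens' books"
print(fix(CURRENT, text))   # the trailing apostrophe is left behind
print(fix(PROPOSED, text))  # the trailing apostrophe is consumed by the match
```

In the existing rule the `\b` after a trailing apostrophe fails when a space follows, so the engine backtracks to match only "childrens" and the stray apostrophe survives. With `\b` placed first, the apostrophe becomes part of the match and is replaced along with the word. As noted above, neither version can tell "childrens" used as a wrong plural from a missing possessive.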

"days latter"

Can we add "days latter" as a typo of "days later" please? There are about thirty of them. I've also checked for "months latter" and "years latter", but those seem rarer. ϢereSpielChequers 08:38, 4 December 2023 (UTC)

@WereSpielChequers: Added rule and fixed 26 typos with AWB, plus 2 manually. I did not see any instances of "weeks latter". GoingBatty (talk) 14:48, 4 December 2023 (UTC)
Thanks, that would have been much quicker for you than it would have been for me. I must ask Santa for a Windows machine so I can get AWB running again. ϢereSpielChequers 16:04, 4 December 2023 (UTC)
AWB runs with difficulty on some other OS. User:Certes/AWB on Ubuntu is outdated but may help. However, I just use JWB which does simple things just as well and can manage some of the more complex AWB stuff too. Certes (talk) 18:30, 4 December 2023 (UTC)

"C"-shaped, etc.

@Chris the speller: nice find! When you're done, could you create a rule for these?   ~ Tom.Reding (talkdgaf)  17:40, 4 January 2024 (UTC)

It's completely cool to clear the clutter. Yes, I will add a rule. Chris the speller yack 17:48, 4 January 2024 (UTC)
@Tom.Reding: No, I won't. What was I thinking? "AWB purposely avoids fixing typos in certain areas of the wiki-text. Typo fixing is prevented within: image names, template names and parameters, wikilink targets, text in quotations and italics": so it won't remove quotation marks. Chris the speller yack 18:30, 4 January 2024 (UTC)

Incorrectly changing en dashes to hyphens in violation of MOS:SUFFIXDASH

I was recently doing page clean-up with AWB, and AWB suggested changing "Academy Award–winning" (with an en dash) to "Academy Award-winning" (with a hyphen) on the article "Society of the Snow" (see this edit[4]). User Nardog noticed my edit and reverted it[5] citing MOS:SUFFIXDASH. I started a discussion about this (see: Wikipedia talk:Manual of Style#Clarification on the use of a hyphen or an en dash for "Academy Award winning"), and it appears all users who commented are in agreement that en dashes should be used in these situations and not hyphens. Thanks! Wikipedialuva (talk) 09:45, 27 January 2024 (UTC)

When using the typo fixes, it's important to not save it unless you know it is correct. The typo list should be thought of as suggestions that are usually correct but cannot foresee all scenarios. That said, I've probably saved this with a hyphen forgetting the guideline. At any rate, this may or may not be fixable in the typo rules because not all of these scenarios can be covered programmatically with Regular Expressions. My hunch is we can make it pass by more of these false positives than before. Stefen Towers among the rest! GabGruntwerk 09:55, 27 January 2024 (UTC)
@StefenTower: Thanks for looking into this. I want to acknowledge that I understand that the user is ultimately responsible for the edits made with AWB and that this erroneous edit was my fault for saving the edit. As was noted on MOS talk, violations of the MOS:SUFFIXDASH rule in regards to awards are extremely widespread on the project (extending into article titles, which I have been working on fixing). I also understand that it will be impossible to try to come up with every possible variation of this rule. I don't know if it would be appropriate or even possible to try to make a rule for some of the most common awards that cause the error or not. Thanks again for responding and looking into the issue! Wikipedialuva (talk) 10:24, 27 January 2024 (UTC)
I'm glad there is understanding of the use of AWB Typos and how it works but to the crux of the matter, perhaps if we didn't place the hyphen when "Award" is capitalized and preceded by another capitalized word, it would cut out most false positives. That should be a simple fix. Stefen Towers among the rest! GabGruntwerk 20:58, 27 January 2024 (UTC)
@Wikipedialuva  Done The simple fix I described is now complete. Stefen Towers among the rest! GabGruntwerk 00:47, 28 January 2024 (UTC)
@StefenTower: Fix works great! Thanks for your help! Wikipedialuva (talk) 07:56, 28 January 2024 (UTC)
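
The actual AWB rule is not quoted in this thread, so the following Python sketch is only a hypothetical illustration of the idea described above: convert "–winning" to "-winning", but skip the case where a capitalised "Award" is preceded by another capitalised word. The pattern, the function name `hyphenate` and the use of a callback instead of a lookbehind are all assumptions made for the example.

```python
import re

# Hypothetical stand-in for the AWB rule, which is not shown in the discussion.
# Group 1: an optional preceding capitalised word; group 2: the word before the dash.
DASH_WINNING = re.compile(r"\b((?:[A-Z]\w*\s+)?)(\w+)\u2013(winning)\b")

def hyphenate(text: str) -> str:
    """Replace 'X–winning' with 'X-winning', except for '<Capitalised> Award–winning'."""
    def fix(m: re.Match) -> str:
        preceding, word, suffix = m.groups()
        # MOS:SUFFIXDASH keeps the en dash after a multi-word name such as "Academy Award".
        if preceding and word in ("Award", "Awards"):
            return m.group(0)
        return f"{preceding}{word}-{suffix}"
    return DASH_WINNING.sub(fix, text)

print(hyphenate("the Academy Award\u2013winning film"))  # left unchanged
print(hyphenate("an award\u2013winning film"))           # en dash becomes a hyphen
```

A check like this only covers award names of the "Capitalised Award" shape; other multi-word names before a suffix would still need manual review.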

Backslash in regex name

Greetings! I'm recycling AWB regexes in a script that can efficiently scan database dumps on Linux. The use of backslashes in word="you'(d\ve\re\ll)_" creates some weird escaping issues. Any objection to renaming it "you'(d/ve/re/ll)_"? That seems more like standard slash usage to me anyway. Thanks, Beland (talk) 01:05, 29 January 2024 (UTC)

@Beland: That shouldn't cause any issues. I believe AWB itself doesn't use the names for anything. -- John of Reading (talk) 07:42, 29 January 2024 (UTC)
Beland  Done As John of Reading said, changing this won't impact AWB typo correction. I wish the rule name did display in the log, but alas it does not (I need to request that :) ). Stefen Towers among the rest! GabGruntwerk 08:43, 29 January 2024 (UTC)
@John of Reading and StefenTower: Thanks for the quick fix! -- Beland (talk) 20:17, 29 January 2024 (UTC)
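
The language of the scanning script is not stated, so this Python sketch is only an illustration of the kind of escaping problem such a rule name can cause, and of the rename that was made. The variable names are made up.

```python
# The old rule name, held in a raw string so the backslashes stay literal.
old_name = r"you'(d\ve\re\ll)_"

# The rename agreed in this thread: backslashes become forward slashes.
new_name = old_name.replace("\\", "/")
print(new_name)  # you'(d/ve/re/ll)_

# Why the backslashes are awkward: outside a raw string, "\v" is read as a
# vertical tab (one character) rather than a backslash followed by "v".
print(len(r"d\ve"), len("d\ve"))  # 4 3
```

Since AWB does not use the `word` attribute for matching, the rename changes nothing about which text the rule finds.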

"ua" was flagged as a typo of "uk" (sorry about the previous mistakes, im sleep deprived). I'm not sure if this was the typos or something else that flagged this though. I'm not seeing a regex that would have done that, just saw the attempted diff. DarmaniLink (talk) 02:10, 13 February 2024 (UTC)

@DarmaniLink: That was a general fix, not a typo fix. There is an entry in Wikipedia:AutoWikiBrowser/Template redirects that tells the software to replace instances of the redirect {{lang-ua}} with {{lang-uk}}. -- John of Reading (talk) 07:59, 13 February 2024 (UTC)
ah, i assumed lang-uk was british english, I really don't know why
should have checked that, sorry DarmaniLink (talk) 15:11, 13 February 2024 (UTC)

False Positive

Vice-President gets corrected to Vice-president, should be Vice President or vice president (I think) DarmaniLink (talk) 18:13, 11 February 2024 (UTC)

With respect to companies/organizations, I defer to them using a hyphen in the title per their choice. At the same time, titles like "Vice President of the United States" definitely don't have a hyphen. At any rate, Chris the speller created rules related to this, so maybe he has some thoughts here. In the meantime, feel free to skip any typo corrections you don't feel comfortable with, or do a manual edit if you so choose. Stefen Towers among the rest! GabGruntwerk 17:38, 12 February 2024 (UTC)
Stefen is correct. Note that in Europe the hyphen is generally used, while in the US it is less often used, but this has to be handled case by case. Chris the speller yack 02:06, 13 February 2024 (UTC)
Is it "Vice-president" or "Vice-President" for British English/Europe? DarmaniLink (talk) 02:09, 13 February 2024 (UTC)
Most resources seem to suggest that for 'Vice President' both are capitalized when used as a title. Neils51 (talk) 23:54, 13 February 2024 (UTC)

Typos restructuring

CoolieCoolster As this is a complicated tool rather than an article, any major restructuring needs to be discussed. Please use this topic to explain what you think ought to be done here. Also, if changes are to be done, they need to be done more piecemeal, so editors can readily see what is moving where. A lot of difficult work has gone into building the list over time, and we need to be extra careful. Stefen Towers among the rest! GabGruntwerk 04:35, 12 April 2024 (UTC)

Apologies, and noted. As it stands, the current structure of the list is highly disorganized, with many fixes lasting more than a year in a general section at the top of the page. This not only makes finding which issues have or haven't been addressed difficult, but also results in situations where the same issue may be covered twice via different means, such as km² via its own Unicode-to-sup-tag listing and one that also addresses m² and cm². I don't mean to remove recent additions from the top of the list, as I understand the importance of testing them extensively before they are integrated into the main lists, but I also think that it's important to sort articles into sections after the year time period has passed in order to maintain an effective organization scheme. In terms of organization, while the Capitalisation, Grammar, and other sections at the top do a pretty good job at dividing rules into groups (improving navigability, facilitating standardization, and making it easy to see what's not represented), ones with too many listings inevitably become bloated and should be divided into meaningful subsections. To illustrate my point, here's a restructured collection of the current sections, to be amended when the listings at the top are sorted:
   4 Typo list
       4.1 Recent additions
           4.1.1 Unsorted
           4.1.x Common subsections (TBD)
       4.2 Academia
           4.3.1 Academic titles
           4.3.2 Academic fields
           4.3.3 College degrees
       4.3 Capitalisation
           4.4.1 Brand names
               4.4.1.1 Colleges and universities
               4.4.1.2 Companies and organizations
               4.4.1.3 Products
               4.4.1.4 Technology
               4.4.1.5 Websites
               4.4.1.6 Unsorted
           4.4.2 Placenames (high-level)
               4.4.2.1 Continents and subcontinents
               4.4.2.2 Oceans
               4.4.2.3 Geographical proper names
           4.4.3 Placenames (low-level)
               4.4.3.1 Canada
               4.4.3.2 France
               4.4.3.3 United Kingdom
               4.4.3.4 United States (states)
               4.4.3.5 United States (cities)
           4.4.4 Time
               4.4.4.1 Calendrical proper nouns
               4.4.4.2 Holidays
               4.4.4.3 Epochs, ages and dynasties
           4.4.5 Society
               4.4.5.1 Cultures, languages, and ethnic groups
               4.4.5.2 Ethnicity & language
               4.4.5.3 Religious
           4.4.6 Unsorted
       4.4 Decapitalisation
           4.5.1 Medals
           4.5.2 Miscellaneous
       4.5 Misspellings
           4.5.1 A
           4.5.2 B
           4.5.3 C
           4.5.4 D
           4.5.5 E
           4.5.6 F
           4.5.7 G
           4.5.8 H
           4.5.9 I
           4.5.10 J
           4.5.11 K
           4.5.12 L
           4.5.13 M
           4.5.14 N
           4.5.15 O
           4.5.16 P
           4.5.17 Q
           4.5.18 R
           4.5.19 S
           4.5.20 T
           4.5.21 U
           4.5.22 V
           4.5.23 W
           4.5.24 X
           4.5.25 Y
           4.5.26 Z
       4.6 Accents and diacritics
           4.7.1 Proper nouns
       4.8 Formatting
           4.8.1 Calendar dates
           4.8.2 SI unit symbols
           4.8.3 Symbols and HTML entities
       4.9 Grammar
           4.9.1 Articles
           4.9.2 Contractions
           4.9.3 Replace space by hyphen
           4.9.4 Joined words
           4.9.5 Split words
           4.9.6 Duplicated words
           4.9.7 Redundant words
           4.9.8 Euphemisms
           4.9.9 Preposition usage
           4.9.10 Punctuation
           4.9.11 Remove hyphens after adverbs ending in -ly
           4.9.12 Remove other hyphens (replace with space)
       4.10 General rules
           4.10.1 Unsorted
           4.10.2 Beginnings
           4.10.3 Middles
           4.10.4 Endings
               4.10.4.1 A
               4.10.4.2 B
               4.10.4.3 C
               4.10.4.4 D
               4.10.4.5 E
               4.10.4.6 F
               4.10.4.7 G
               4.10.4.8 H
               4.10.4.9 I
               4.10.4.10 J–K
               4.10.4.11 L
               4.10.4.12 M
               4.10.4.13 N
               4.10.4.14 O
               4.10.4.15 P
               4.10.4.16 Q
               4.10.4.17 R
               4.10.4.18 S
               4.10.4.19 T
               4.10.4.20 U–V
               4.10.4.21 W
       4.11 Incorrect phrases
While it would make the TOC longer, I think it would make it much easier for people to find issues they'd like to address (using the RegEx replacement rules to search for articles that fulfill those criteria), make it easier for people to make new rules to fill in the gaps of existing ones (such as the 'cubed' rule I previously added on the basis of the 'squared' rule), and facilitate the adding of new rules (by categorizing similar rules together, it's easier to see what's missing). For instance, by grouping together university capitalization rules into their own subsection, people may think of additional entries to add to that specific section that they might not have otherwise in looking at a more generalized list.
Apologies again for making such a drastic edit without consultation; it just seems like the potential of RegEx typo fixing would be doubled if there were a greater deal of clarity and structure in the rule categorization scheme. CoolieCoolster (talk) 05:16, 12 April 2024 (UTC)
CoolieCoolster I'm looking through this and it's quite overwhelming. It would really help if you made a list of "change x to y" proposals, and possibly separating those out so they can be discussed individually. It's really difficult to grasp the value of this restructuring as a whole. What we had wasn't not working, and I can't yet see that the potential would be doubled with the proposed changes. Stefen Towers among the rest! GabGruntwerk 21:26, 12 April 2024 (UTC)
It's just my proposed solution; the overall problem is that while the page mentions sorting listings into sections after a year, it doesn't appear that that is occurring in an organized manner on a consistent basis, making it difficult to interpret what is or isn't present without using the browser's search function for individual words. I don't mean to be blunt, but given that an effective reorganization would involve moving hundreds of lines to new or existing sections, discussing the moving of any individual line would be missing the forest for its trees. While moving any one line has no inherent value on its own, and should only be done if it functions identically when sorted as it did initially, the value of the sorted list as a whole is the ability to see what functions are currently unaccounted for, particularly for common typos that one might not have considered otherwise.
My intention is to help, not harm the project, so until consensus on the matter is reached I'll stick to just organizing any replacement rules I add myself to subcategories of the existing New additions section. However, given that organization is already listed as being part of the list-making process, as long as new organization keeps rules above the General rules section to avoid rule interference, it seems a shame to forgo the benefits of a sorted list for the sake of avoiding change for change's sake. CoolieCoolster (talk) 21:56, 12 April 2024 (UTC)
There is no "avoiding change for change's sake" going on here. I asked for a detailed explanation of the specific x-to-y changes. That's all. I already have read your overall contention about the structure. I'm not inclined to agree to such massive changes unless and until I (and hopefully others) can understand that they make sense. I am not against change or improvement. Stefen Towers among the rest! GabGruntwerk 22:01, 12 April 2024 (UTC)
Need a project approach here. Firstly, is the proposal a good idea? Namely, create a redefined list and clean up entries. In the interests of consensus, yes. Sure, what's there works; however, it can be tricky to navigate and there is duplication. Next, how to approach it? It makes sense to me that a parallel (new) list is built on a subpage and material copied to it, removing the potential for harm to the active page. A list of editors willing to be involved to be obtained on that page and a work list created where editors can put their name against specific components and thus spread the load. A lot of the work may be sheer copy/paste, which if divvied up won’t be so onerous. I would suggest that any entries that in the interim are added or revised in the active list have a date stamp against them (that seems to be happening) and perhaps the editor’s guesstimate as to classification against the new list. There are other aspects that will need addressing (testing, etc.); however, at this stage there's not much point in writing a book. Neils51 (talk) 01:33, 13 April 2024 (UTC)
To make sure that there are enough people that both think list restructuring would be worthwhile and are willing to help with the restructure on a parallel page (enabling everyone involved to review it so that a consensus on a functional list can be established), I'll make a signup list below. Per my statements above, it won't involve making any modifications to the current list until consensus can be reached that the structure of the parallel list meets the needs of all users involved. I think having at least three people willing to work on the list would help in splitting the workload and ensuring that the structure is a product of consensus and not unilateral editing. CoolieCoolster (talk) 22:57, 15 April 2024 (UTC)
1. CoolieCoolster (talk) 22:57, 15 April 2024 (UTC)
2. Neils51 (talk) 09:03, 16 April 2024 (UTC)
3.
I have doubts most folks who use the typo list are looking at the structure. They're just pulling the whole lot into AWB and using them for typo hunting. At any rate, I'd be happy to review what's done and offer feedback. My main concern is the typo rules themselves staying intact in the result and that the list is no less readable than it is now. I didn't put my name on the list because I don't entirely agree with the premises as stated and my time is eaten up with too many other things. Stefen Towers among the rest! GabGruntwerk 06:35, 18 April 2024 (UTC)

Tie scores being treated as false positives

1-1, 2-2, etc. scores are being overlooked for ndash replacement and with all the sports articles I work on, this is getting a little too frustrating. Tom.Reding - I found out you had made a change to make this a false positive in March 2020. Is there any way this can be made to bypass fewer of these (i.e. look for additional text to match)? A *lot* of tie scores aren't getting a correction (unless I catch it and perform the correction manually). Stefen Towers among the rest! GabGruntwerk 23:02, 3 February 2024 (UTC)

@StefenTower: the relevant part of rule "0–0", (?<!\b\1[-—]\1\b), was moved the next day into rule "2–1", precisely so that "0–0" can find draws and ties. If "0–0" isn't finding ties now, it's not because of that lookbehind; it's because "0–0" needs to be expanded with more relevant keywords.   ~ Tom.Reding (talkdgaf)  13:33, 4 February 2024 (UTC)
Tom.Reding Thanks for the reply. (?<!\b\1[-—]\1\b) is still in "2–1", and stopping corrections of draws/ties that don't meet "0–0". "0–0" currently doesn't cover a lot of scenarios I'm seeing in my typo correction work, but I don't know why the draws/ties need to be avoided in "2–1" 's general case in the first place. Is there any harm from removing that code? That's what I'm driving at. Stefen Towers among the rest! GabGruntwerk 17:37, 4 February 2024 (UTC)
@StefenTower: the reason that lookbehind exists is because the rule was incorrectly catching journal volume numbers, e.g. "Some Journal 5-5", preceded by "5-4" and succeeded by "5-6" etc., which should not be changed/en-dashed.   ~ Tom.Reding (talkdgaf)  17:46, 4 February 2024 (UTC)
I can see why we don't want a journal volume number en-dashed, but I don't quite understand why volume numbers that look like draws/ties are more problematic than those where the numbers aren't equal (e.g. "5-4"). If the "5-4" is in the middle of the series instead of "5-5", wouldn't it be falsely corrected? Stefen Towers among the rest! GabGruntwerk 17:58, 4 February 2024 (UTC)
Tom.Reding Is it that a draw/tie is guaranteed to not be a number range if referring to a volume? If that's the case, I understand it better now after thinking more about it. At any rate, I am working on more cases to convert the ties to use ndash outside of this false positive check. But I wish this check wasn't needed - we really miss a lot of genuine draws/ties as it stands. Stefen Towers among the rest! GabGruntwerk 00:01, 5 February 2024 (UTC)
@StefenTower: do you have a list of pages where ties are known to have been missed?   ~ Tom.Reding (talkdgaf)  11:59, 5 February 2024 (UTC)
@Tom.Reding I don't keep a list of these. I just keep seeing these missed, and I have to correct them manually. It's any draw/tie that the "2–1" rule would ordinarily catch but doesn't due to the false positive catch. If you run AWB typo checks in sports articles especially, it can get rather frustrating. Stefen Towers among the rest! GabGruntwerk 16:26, 5 February 2024 (UTC)
Here is an example. While it shows "17-17" and "7-7" being corrected, that happened only because I saw in my AWB viewer they weren't corrected, and I manually placed the en dash there. Stefen Towers among the rest! GabGruntwerk 16:56, 5 February 2024 (UTC)

@StefenTower: rule amended to find "17-17" and "7-7" in that example, but with enough specificity, I think, to avoid most/all freeform journal citations, which are unlikely to end the line at a journal volume (if so, they can/should have the page # appended).   ~ Tom.Reding (talkdgaf)  21:32, 19 June 2024 (UTC)

Tom.Reding Thanks for addressing this. While I usually have no issue reading RegEx, it has been many months since I looked at this. So, I guess what you did is add code to catch more cases for the "0–0" rule, so any tie ranges within those cases won't be skipped in the "2–1" rule. Do I have it right? Stefen Towers among the rest! GabGruntwerk 05:25, 20 June 2024 (UTC)
@StefenTower: correct for the "0–0" rule, which I did isolate & test. I did not look at the "2–1" rule, since it wasn't relevant to, nor triggered for, the "17-17"/"7-7" example. Also, rules aren't necessarily run in the order they appear on the typo page, and so should be coded in a way that is independent of the firing sequence of other rules.   ~ Tom.Reding (talkdgaf)  18:39, 20 June 2024 (UTC)
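
The "0–0" and "2–1" rules themselves are not quoted in this thread, so the following Python sketch is only a simplified, hypothetical model of the trade-off discussed: en-dashing score-like "N-N" pairs while skipping pairs with equal numbers, which is the effect attributed above to the `(?<!\b\1[-—]\1\b)` lookbehind. Python's `re` does not accept that lookbehind form, so the sketch uses a replacement callback instead. The names `SCORE` and `endash_scores` are made up.

```python
import re

# Simplified stand-in for a score rule: two numbers joined by a hyphen.
SCORE = re.compile(r"\b(\d+)-(\d+)\b")

def endash_scores(text: str, skip_ties: bool = True) -> str:
    """Replace the hyphen in 'N-M' with an en dash, optionally leaving 'N-N' alone."""
    def fix(m: re.Match) -> str:
        left, right = m.group(1), m.group(2)
        if skip_ties and left == right:
            # Equal numbers may be a journal volume/issue such as "5-5", so skip them.
            return m.group(0)
        return f"{left}\u2013{right}"
    return SCORE.sub(fix, text)

line = "They won 2-1 after a 7-7 tie at the half."
print(endash_scores(line))                   # only "2-1" is en-dashed
print(endash_scores(line, skip_ties=False))  # "7-7" is en-dashed as well
```

This shows why genuine ties are missed by the general rule and have to be picked up by a separate, keyword-driven rule, which is the approach taken with "0–0".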

Eurodance

Shouldn't Eurodance be capitalized always? In the article itself, it always remains capitalized even when it's not at the start of the sentence. Wiiformii (talk) 03:27, 27 July 2024 (UTC)