This page is an archive. Do not edit the contents of this page. Please direct any additional comments to the current main page.
-- Asartea Talk | Contribs 19:33, 21 January 2022 (UTC)
__INDEX__
without surrounding <nowiki>
which warns, but since in mainspace you aren't discussing it and it doesn't do anything it's probably fine to just flat-out block. -- Asartea Talk | Contribs 16:59, 22 January 2022 (UTC)
https://en.wikipedia.org/?sort=last_edit_desc&search=insource%3Aindex+insource%3A%2F__INDEX__%2F+-insource%3A%22%3Cnowiki%3E+index+nowiki%22&title=Special%3ASearch&profile=advanced&fulltext=1&ns0=1
and selecting the namespace(s). Annoyingly enough I can't just post the URLs as external links because MediaWiki tries to be smart and breaks them (also note I subtracted one from the all other namespaces count, since I have an open AWB perm request which contains INDEX in a search URL, which I just realised I also need to fix).
__INDEX__
only does anything in User: and User talk:, so BEANS shouldn't be an issue. I'm currently writing a patch to make the template hard-fail with a warning in all other namespaces. -- Asartea Talk | Contribs 17:39, 22 January 2022 (UTC)
what sorts of folks think they should be adding it. Right now the filter is really broad; logging all uses of INDEX and NOINDEX in all namespaces, by all users. It might be worth excluding users with (say) more than 1000 edits, but I want to see what it catches for now. Suffusion of Yellow ( talk) 00:09, 23 January 2022 (UTC)
2409:4073:2094:FF07:3452:920C:D85A:CFEF ( talk) 14:03, 22 January 2022 (UTC)
In the past day I've seen (and reverted) edits on three talk pages: on Talk:Trivia, on Talk:Messenger, and on Talk:Google Ngram Viewer.
They're unrelated and come from unrelated IP addresses, but all create a new discussion with a single-word title and a single word of content. And if I saw three of them on my watchlist there are probably thousands of others. Any thoughts? Thanks, Dan Bloch ( talk) 19:05, 21 December 2021 (UTC)
Talk link to anons on mobile on 15 November 2021.
Talk visible to anons on mobile is impacting metrics like: talk page revert rate, talk page bounce rate and talk page page views. Vitaium ( talk) 06:39, 2 February 2022 (UTC)
-- lomrjyo ( ✉ • 📝) 21:29, 9 February 2022 (UTC)
2.55.13.156 ( talk) 04:34, 9 March 2022 (UTC)
DVdm ( talk) 22:20, 8 March 2022 (UTC)
Additional IP:
Adakiko ( talk) 01:26, 9 March 2022 (UTC)
Possibly helpful information: all the IPs geolocate to Canberra, Australia, and there is a Dr Peter J. Riggs based there. Tercer ( talk) 08:38, 9 March 2022 (UTC)
Additional new editor:
Adakiko ( talk) 22:12, 17 March 2022 (UTC)
Ivanvector ( Talk/ Edits) 15:27, 23 February 2022 (UTC)
Okay, this is a very specific but limited request for a bot to do some sleuthing. I just discovered an editor, using IPs, mainly from Ontario, Canada but previously from all over, who engages in a strange kind of vandalism. They create pages and post long stories about riding an elevator. Here is a diff of an example of the same content that just gets reposted. Anyway, they are always posted on Portal talk:Current events pages for different days of the year. I was just finding Portal talk pages from random days in 2007 but the example I just shared was a day in 2013. Going through some pages I found that they were doing this, posting this same content about riding an elevator, back in 2012! Since I doubt any editors have Portal talk pages on their Watchlist, there might be a lot of this nonsense that still exists or it could be that I found all of it today and there is none left.
Could a bot run a check on Portal talk pages for different days of the year and see if there are any pages that have this strange content? So far, I haven't found any reason for there to be a Portal talk page for each day of every year so maybe a mass deletion is in order if there are a lot of these pages that have been created. Since these pages are typically not seen by readers, this is obviously not a high priority task but since the vandal has recently been very active at doing this this month, this might serve to discourage them. Thank you. Liz Read! Talk! 01:18, 31 March 2022 (UTC)
{{ resolved}}
If this isn't feasible/reasonable, no problem. Doug Weller talk 12:01, 16 March 2022 (UTC)
{{ resolved}}
{{ resolved}}
Helen( 💬 📖) 16:21, 16 April 2022 (UTC)
This is infrequent enough that I'm not really sure if an edit filter is justified, but would it be worth it to have a filter that tags additions of strings like "keter", "overseer council", "secure, contain, protect" etc. to pages with SCP in the title other than SCP Foundation and SCP: Containment Breach? See, for example, [49], [50]. casualdejekyll 03:17, 11 March 2022 (UTC)
Was helping out another project, anyone got a good tip for this case: They are trying to stop subtle numeric vandalism. I was thinking possibly something along the lines of comparing only the numbers in the old to the numbers in the new to see if any changed. I would expect a ton of FPs here.
Pseudocode:
(user_age == 0) & (summary == '') & (action == 'edit') & ( (FROM: removed_lines - extract and concatenate just [0-9]) != (FROM: added_lines - extract and concatenate just [0-9]) )
Think this would be too "expensive" on any busy project as well, any thoughts? — xaosflux Talk 15:25, 12 April 2022 (UTC)
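For anyone who wants to experiment with the idea offline, here is a minimal Python sketch of the "extract and concatenate just [0-9]" comparison from the pseudocode above. It is not AbuseFilter syntax, and the function names digits_only and numbers_changed are made up for illustration.

```python
import re


def digits_only(text: str) -> str:
    """Concatenate every ASCII digit in text, dropping everything else."""
    return "".join(re.findall(r"[0-9]", text))


def numbers_changed(removed_lines: str, added_lines: str) -> bool:
    """True when the digit sequences of the old and new text differ."""
    return digits_only(removed_lines) != digits_only(added_lines)


# A changed figure is flagged; a rewording that keeps the figure is not.
print(numbers_changed("height = 183 cm", "height = 185 cm"))  # True
print(numbers_changed("height = 183 cm", "Height: 183 cm"))   # False
```

As predicted above, a check like this would also fire on any good-faith edit that adds or removes a number, so false positives are likely.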
extract and concatenate just [0-9]? I'd love a str_replace_regexp() function; then we could say str_replace_regexp(added_lines, "[^0-9]", ""). But I don't see how to do this with just str_replace(). Suffusion of Yellow ( talk) 19:29, 12 April 2022 (UTC)
string(get_matches("[0-9]+", text)) suffice? Certes ( talk) 19:58, 12 April 2022 (UTC)
text
Which might be better than nothing, I guess. Suffusion of Yellow ( talk) 20:09, 12 April 2022 (UTC)
Looks for matches of the regex needle ... in the haystack; it actually looks for only one match. Certes ( talk) 20:23, 12 April 2022 (UTC)
parts := "(?s)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(.*)";
old := get_matches(parts, removed_lines);
new := get_matches(parts, added_lines);
text_old := (old[1] + old[3] + old[5] + old[7] + old[9] + old[11] + old[13] + old[15] + old[17] + old[19] + old[21]);
text_new := (new[1] + new[3] + new[5] + new[7] + new[9] + new[11] + new[13] + new[15] + new[17] + new[19] + new[21]);
nums_old := (old[2] + old[4] + old[6] + old[8] + old[10] + old[12] + old[14] + old[16] + old[18] + old[20]);
nums_new := (new[2] + new[4] + new[6] + new[8] + new[10] + new[12] + new[14] + new[16] + new[18] + new[20]);
text_old == text_new & nums_old != nums_new
+
uses up conditions, so would could even go for more than ten, in theory. No idea about the time this will take.
Suffusion of Yellow (
talk)
20:38, 12 April 2022 (UTC)not_num := "(?s)^(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(.*)$\K";
get_matches(not_num, removed_lines) == get_matches(not_num, added_lines) &
added_lines != removed_lines
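The condition above fires when the non-digit parts of the old and new lines are identical but the lines themselves differ, i.e. only numbers were touched. A rough Python sketch of the same idea, for testing outside the filter: it strips digit runs with a substitution instead of capture groups, so unlike the pattern above it is not capped at a fixed number of digit runs. The names non_digit_skeleton and only_numbers_changed are hypothetical, and [0-9] is assumed to be the intended digit class.

```python
import re


def non_digit_skeleton(text: str) -> str:
    """Return text with every run of ASCII digits removed."""
    return re.sub(r"[0-9]+", "", text)


def only_numbers_changed(removed_lines: str, added_lines: str) -> bool:
    """True when the lines differ but their non-digit parts are identical."""
    return (
        removed_lines != added_lines
        and non_digit_skeleton(removed_lines) == non_digit_skeleton(added_lines)
    )


# Only the year changed: flagged.
print(only_numbers_changed("born 1987 in Oslo", "born 1978 in Oslo"))    # True
# The surrounding text changed as well: not flagged.
print(only_numbers_changed("born 1987 in Oslo", "born 1987 in Bergen"))  # False
```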
grep -P scans through a 100MB enwiki database dump in less than a second, while the pattern from filter 384 takes about 8 seconds. Suffusion of Yellow ( talk) 03:19, 13 April 2022 (UTC)
67.21.154.193 ( talk) 15:29, 20 April 2022 (UTC)
This isn't really a proper filter request, but more of a question about if a filter might be needed. The <big></big> markup can be nested indefinitely, and ends up ridiculously large very quickly. I was surprised to note that there isn't an abuse filter preventing non-autoconfirmed users from doing this outside of their own userspace or the sandbox. See diffs below for an example of the sort of abusive use that I am thinking of. With emoji, with text. Basically just replace the text with some array of choice words, and the emoji with something more insulting. These were both me on an alternate account, testing to see if any filters would trip, as whilst I couldn't see any public filters that would trip, I wasn't sure if a private filter might. However, nothing tripped, not even merely to log it. Might this be a problem? Mako001 (C) (T) 🇺🇦 14:19, 26 April 2022 (UTC)
page_namespace == 0 & ( added_lines contains "<big><big>" )
would likely do it (not including the non-confirmed check) - I'm not sure if this requires a PST check or not. EDIT: Actually, reversing that and doing the big-big check first would likely save conditions, unless it needs PST. casualdejekyll 16:35, 26 April 2022 (UTC)
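The substring check above only catches two opening tags placed back to back. If it turns out that nesting with text or whitespace between the tags also needs catching, the nesting depth could be measured instead. A hedged Python sketch, purely for offline experimentation; max_big_depth is a made-up helper, not anything a filter provides:

```python
import re


def max_big_depth(wikitext: str) -> int:
    """Return the deepest nesting level of <big> tags found in wikitext."""
    depth = 0
    deepest = 0
    for match in re.finditer(r"<(/?)big\s*>", wikitext, flags=re.IGNORECASE):
        if match.group(1):
            # Closing tag: step back out, never going below zero.
            depth = max(depth - 1, 0)
        else:
            # Opening tag: step in and remember the deepest level seen.
            depth += 1
            deepest = max(deepest, depth)
    return deepest


print(max_big_depth("<big><big><big>x</big></big></big>"))  # 3
print(max_big_depth("<big>a</big> <big>b</big>"))           # 1
```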
Would it be possible and useful to extend 249 (non-autoconfirmed user rapidly reverting edits) to IPs with this sort of editing pattern? I'd expect some newly registered accounts to do it ten times to win a prize ( allow four days for delivery), but even from an IP it disruptively clogs the page histories. Certes ( talk) 20:21, 21 April 2022 (UTC)
This is from the edit history of 95.70.214.241 ( talk · contribs), who appears to be a certain LTA. The height of eight athlete BLPs was changed by one or two cm. Only three of the edits had "(Tags: ... changing height and/or weight)". Would it be possible to have all such changes tagged? It could help with wp:RCP. Thank you! Adakiko ( talk) 07:31, 20 April 2022 (UTC)
! ( page_title in added_lines ) in filter 391 ( hist · log).
Maxim, I realize I'm asking you about something from 11 years ago, but any idea what that's doing? I can't figure it out. It effectively prevents the filter from tagging any time there's already a reference on the same line with the article's subject in the title. But people change referenced figures all the time. Suffusion of Yellow ( talk) 21:08, 10 May 2022 (UTC)
! ( page_title in added_lines ) but instead ! article_text in added_lines, which was changed by Zzuuzz in 2019 here. But article_text and page_title have the same meaning per the documentation, and the former is deprecated. I don't recall why that line was in there; it could be something as banal as thinking that article_text was something that it is not. I am fairly sure that I used a different filter as a starting point for this one; it's possible that this line was germane to the other filter, but I really don't remember what the other filter could have been, nor why the line is here now. I've just tested the filter with and without the offending line (line 3), and removing it seems to pick up the untagged edits (which were not caught because the lines in question had a reference which contains the title of the article). I think the offending line can just be deleted, but I'm interested to see whether you have any thoughts on this. As a final note, I find it very heartening that this filter, generally with the same rules as when it was written, is still useful after 11 years. :-) Maxim(talk) 00:03, 11 May 2022 (UTC)
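To make the effect of that line concrete, here is a small Python sketch of how a `! ( page_title in added_lines )` guard behaves when a changed line carries a reference whose title mentions the article subject. It assumes `in` is a plain case-sensitive substring test; the article title and wikitext lines are invented for the example.

```python
def guard_allows_tag(page_title: str, added_lines: str) -> bool:
    """Mimic `!(page_title in added_lines)`: False when the title appears."""
    return page_title not in added_lines


title = "Jane Example"  # hypothetical article title

# The height changed on a line whose reference title includes the subject.
with_ref = "| height = 185 cm<ref>{{cite web |title=Jane Example profile}}</ref>"
# The same change on a line without any reference.
without_ref = "| height = 185 cm"

print(guard_allows_tag(title, with_ref))     # False: the edit goes untagged
print(guard_allows_tag(title, without_ref))  # True: the edit can be tagged
```

This matches the observation above: edits to referenced figures slip through untagged whenever the citation happens to name the subject.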
this and this seem rather obvious instances... RandomCanadian ( talk / contribs) 22:17, 26 April 2022 (UTC)
I'm surprised ClueBot didn't catch this edit, but seems it didn't. Could we make an edit filter for additions of repeated strings like it? {{u| Sdkb}} talk 02:51, 28 April 2022 (UTC)
Can this string ("poo poo") be added to the ones prevented by the "poop" filter? I see virtually no legitimate use for this string in mainspace. 💩 Mako001 (C) (T) 🇺🇦 12:20, 12 May 2022 (UTC)
The filter here (for unreferenced articles) was deleted back in 2013 [52] because it apparently had no purpose. Now, a warn+tag filter exists for users submitting completely unsourced AfC submissions; see here. I think reintroducing filter 402 (with warn) would help against non-notable or spam creations, as well as make new users add more reliable sources.
Also, are the above "pronoun change" and "adding death" filters going to be implemented?
67.21.154.193 ( talk) 15:34, 2 May 2022 (UTC)
Pinging @ Tamzin: since she's an edit filter manager, and no EF manager has responded to any section below this one yet. 67.21.154.193 ( talk) 13:41, 30 May 2022 (UTC)
Sumanuil. 05:48, 27 May 2022 (UTC)
67.21.154.193 ( talk) 15:29, 20 April 2022 (UTC)
Here are the two most recent examples:
This has been happening for several months, perhaps back to last year. I've seen various combinations of the wording "white supremacist" and "racist " edited into political articles, both currently serving individuals and historical figures. These would be edits made after the article was already created. Not limited by geographical area, time period, living or deceased office holders. Can we create a bot that blocks these? And once created, can we update it if a new similar term begins happening? — Maile ( talk) 22:40, 30 September 2021 (UTC)
Senator McSenatorface resigned after admitting to sending hundreds of racist texts...<ref><ref><ref>. It looks like we're whitewashing. So maybe I'll just disallow very small edits without refs, e.g. Special:Diff/1053952115 and leave 189 to tag the rest. Suffusion of Yellow ( talk) 23:43, 7 November 2021 (UTC)
Should this section still be pinned? It has been months since the last comment here 67.21.154.193 ( talk) 12:13, 14 June 2022 (UTC)
Should characters in the 33xx range and circled letters and numbers be included in the filter?
67.21.154.193 ( talk) 15:25, 24 May 2022 (UTC)
Vaguely related: are "funny" Greek letters being normalised to, er, normal Greek letters? This looks like an FP: Special:AbuseLog/32463333 (06:00, 27 April 2022: Ωχγ triggered filter 1,168). Certes ( talk) 16:03, 24 May 2022 (UTC)
Also:
Change: ℬ and ℭ should have a pipe | between them. This is on the line that starts with “(accountname rlike "℀|℁|ℂ|℃|℄|℅|℆|ℇ|℈|℉…”
Add (test for false positives by unicode normalization first):
ªº
(ordinal indicators),
ₐₑₒₓₔₕₖₗₘₙ₊₋₌₍₎ₚₛₜⱼ
(subscript),
ꬲꬽꬾ
(blackletter),
ⁱⁿ⁺⁻⁼⁽⁾
(superscript),
ⱻꜰɢʛʜɪʟɴɶʀꝶꜱʏꭥꞮꟸᶦᶧᶫᶰʶᶸ
(small caps/modifier)
ᶛᶜᶝᶞᶟᶠᶡᶢᶣᶤᶥᶨˡᶩᶪᵚᶬᶭᶮᶯᶱᶲᶳᶴᶵᶶᶷᶹᶺᶻᶼᶽᶾᶿꟹꭟʰʱʲʳʴʵʷˣʸꭜꭝꭞ
(modifiers)
ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩⅪⅫⅬⅭⅮⅯⅰⅱⅲⅳⅴⅵⅶⅷⅸⅹⅺⅻⅼⅽⅾⅿↀↁↂↃↅↆↇↈ
(roman numerals)
(more small caps and modifiers --->)
ꟲ (unicode A7F2)
ꟳ (unicode A7F3)
ꟴ (unicode A7F4)
𝼂 (unicode 1DF02)
𝼄 (unicode 1DF04)
𝼐 (unicode 1DF10)
Latin Extended-F unicode block (10780-107BF; bunch of modifier letters)
67.21.154.193 ( talk) 12:14, 2 June 2022 (UTC)
67.21.154.193 ( talk) 15:40, 20 April 2022 (UTC)
Came across deleted filter 40, but it seems like it would require significant rework if we were to revive it. 67.21.154.193 ( talk) 12:44, 8 June 2022 (UTC)
{{dead link}}
..., try to find an archive yourself at one of these sites (add links to useful archive sites here) or, if you have an account, try using (IABot console link here)." Mako001 (C) (T) 🇺🇦 05:50, 28 May 2022 (UTC)
<source> tag detection in Filter 432
Currently, filter 432 does a check to ensure that <source lang= is not in the new wikitext. However, this check has two flaws that make the detection practically useless.
First of all, <source> is deprecated, and has been superseded by <syntaxhighlight>, which is used instead, so the detection should be at least swapped from <source lang= to <syntaxhighlight lang= (or both, though judging from the changes in the deprecation tracking category, I don't see source getting used ever).
Second of all, the filter immediately follows the check with a look for lang=, which disregards the possibility of the inline attribute which could come before it (e.g. <syntaxhighlight inline lang=text>).
Side note: I have no idea if this is the correct place to suggest an edit to a filter rather than a new filter, but I don't see any pages anywhere for filter requests other than this, so I'm putting it here. Aidan9382 ( talk) 08:16, 16 June 2022 (UTC)
!( "<source lang=" in new_wikitext ) &
with one of the following should resolve this request:
!( new_wikitext rlike "<(source|syntaxhighlight) (inline )?lang=" ) &
!( new_wikitext rlike "<syntaxhighlight (inline )?lang=" ) &
in might have better performance than using rlike. Adding these lines would add the described detection:
!( "<source inline lang=" in new_wikitext ) &
!( "<syntaxhighlight inline lang=" in new_wikitext ) &
!( "<syntaxhighlight lang=" in new_wikitext ) &
Sheep ( talk) 17:21, 18 June 2022 (UTC)
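For anyone wanting to sanity-check the suggested rlike pattern before it goes into the filter, this Python sketch runs the same regular expression against a few sample strings. The sample wikitext is invented, and Python's re module stands in for the filter's regex engine, which may differ in edge cases.

```python
import re

# Same pattern as the suggested rlike condition above.
HIGHLIGHT_TAG = re.compile(r"<(source|syntaxhighlight) (inline )?lang=")


def has_highlight_tag(new_wikitext: str) -> bool:
    """True when the wikitext opens a <source> or <syntaxhighlight> block."""
    return HIGHLIGHT_TAG.search(new_wikitext) is not None


print(has_highlight_tag('<source lang="python">'))              # True
print(has_highlight_tag('<syntaxhighlight lang="lua">'))        # True
print(has_highlight_tag("<syntaxhighlight inline lang=text>"))  # True
print(has_highlight_tag("plain text without code tags"))        # False

# The existing literal check misses the inline form:
print("<source lang=" in "<source inline lang=text>")           # False
```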
I propose the following code, but by all means, please check it first:
page_namespace == 0 & ( added_lines irlike "(~~~|~~~~|~~~~~)" )
Thanks, NotReallySoroka ( talk) 13:00, 7 July 2022 (UTC)
added_lines evaluates before signature expansion, see traps and pitfalls. added_lines_pst would need to be used for the current pattern in 1090 to catch a signature. PHANTOMTECH ( talk) 17:38, 7 July 2022 (UTC)
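The difference between the raw and the pre-save-transformed text can be illustrated with a toy Python model. This is only a simulation: toy_pre_save_transform is a made-up stand-in that replaces any run of three to five tildes with one fixed signature string, whereas the real transform treats three, four and five tildes differently. The user name and timestamp are invented.

```python
import re

# A run of three to five tildes, as typed by the editor.
TILDES = re.compile(r"~{3,5}")


def toy_pre_save_transform(wikitext: str, signature: str) -> str:
    """Toy stand-in for the pre-save transform: expand tilde runs."""
    return TILDES.sub(signature, wikitext)


raw = "Great article! ~~~~"
pst = toy_pre_save_transform(
    raw, "[[User:Example|Example]] 12:00, 1 January 2022 (UTC)"
)

# The raw text still holds the tildes, so a tilde pattern matches it...
print(bool(TILDES.search(raw)))  # True
# ...but a pattern looking for an expanded signature only matches after PST.
print("User:Example" in raw)     # False
print("User:Example" in pst)     # True
```

In filter terms, a tilde pattern belongs with added_lines, while a pattern that looks for the expanded signature needs added_lines_pst.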
"niggah" is matched twice in the regex of AbuseFilter/260. AbuseFilter/384 has it as well. They should be unified. 0x Deadbeef 15:29, 16 July 2022 (UTC)