This page [note 1] covers some common mistakes made by edit filter managers. For the full documentation, see Wikipedia:Edit filter/Documentation and mw:Extension:AbuseFilter.
When applying a throttle to an edit filter, it is important to throttle by both the ip and user variables wherever possible (as opposed to using only one of them).
Throttling by user alone throttles by user id, not by username. All logged out editors share one user id, which is 0. This may cause false positives if many unrelated anonymous users match the filter conditions when saving edits.
Throttling by ip alone throttles logged in editors by their underlying IP address. Do not use only the ip variable when applying a throttle, unless the filter specifically targets logged out or anonymous users only.
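The effect of the throttle key can be illustrated with a small simulation. This is a sketch in Python, not filter code: the `hits` list, the `LIMIT` value, and the idea that throttling by both variables counts hits per (ip, user id) pair are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical filter hits as (ip, user_id) pairs; user id 0 means logged out.
hits = [
    ("198.51.100.1", 0),   # anonymous editor A
    ("198.51.100.2", 0),   # anonymous editor B, unrelated to A
    ("198.51.100.3", 0),   # anonymous editor C, unrelated to A and B
    ("203.0.113.7", 42),   # logged-in editor
]
LIMIT = 2  # hypothetical limit: a key trips once it is seen more than LIMIT times


def tripped_keys(hits, key):
    """Return the set of throttle keys whose hit count exceeds LIMIT."""
    counts = Counter(key(hit) for hit in hits)
    return {k for k, n in counts.items() if n > LIMIT}


by_user = tripped_keys(hits, key=lambda hit: hit[1])  # user id only
by_both = tripped_keys(hits, key=lambda hit: hit)     # ip and user id together

print(by_user)  # {0}: three unrelated anonymous editors share one counter
print(by_both)  # set(): no individual editor exceeded the limit
```

With the user id as the only key, the three unrelated logged out editors are counted as one, which is the false positive described above.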
The user_rights variable only contains the user's current rights. If the user has logged in using a bot password, or is editing with an OAuth application, user_rights may be limited. For example, it might seem that we could exclude extended confirmed users, bots, and administrators [note 2] with
!("extendedconfirmed" in user_rights) /* WRONG! */
but this will not work as expected if the user did not grant editprotected when setting up a bot password. Instead, just specify the groups explicitly:
!contains_any(user_groups, "extendedconfirmed", "sysop", "bot")
Some variables at Special:AbuseFilter/test and Special:AbuseFilter/examine [note 3] will have different values from those they would have had if the filter had actually tripped at the time of the change. [note 4]
Suppose that Alice, as her first edit, adds the string "Hello, world! ~~~~" to a page that has only ever been edited by Bob. She then makes 20 more edits.
One week later, we look at her edit [note 5] with Special:AbuseFilter/examine. Some results may be surprising:
Variable | At save | At /examine or /test |
---|---|---|
added_lines | Hello, world! ~~~~ | Hello, world! [[User:Alice|Alice]] ([[User talk:Alice|talk]]) 21:07, 14 November 2019 (UTC) [note 6] |
user_editcount | 0 | 20 |
user_groups | ["*", "user"] | ["*", "user", "autoconfirmed"] |
page_recent_contributors | Bob | Alice Bob |
rlike and other keywords have a higher precedence than +. This does not check if added_lines contains "foo" or "bar":
added_lines rlike "foo" + "|bar" /* WRONG! */
Instead use:
added_lines rlike ("foo" + "|bar")
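The difference between the two forms can be mimicked in Python, using the `re` module as a stand-in for rlike. This is only an analogy under that assumption: it shows which pattern each form ends up matching against, not how the filter parser itself handles the leftover `+ "|bar"`.

```python
import re

added_lines = "this line mentions bar only"

# Roughly what the WRONG form evaluates first: a match against "foo" alone.
matches_foo_only = re.search("foo", added_lines) is not None

# The corrected form: build the whole pattern first, then match.
pattern = "foo" + "|bar"
matches_either = re.search(pattern, added_lines) is not None

print(matches_foo_only)  # False
print(matches_either)    # True
```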
The norm() function modifies the string passed to it in a specific order: it first replaces confusing characters, then removes repeated characters, then removes special characters, and finally removes whitespace.
This can lead to unexpected results if one is unaware of that order:
string_example := "A@ AB,BCC";
norm(string_example) == "ABC" /* FALSE */
norm(string_example) == "AABBC" /* TRUE */
You may be asking yourself, "what happened here?" Take a look below to see how the norm() function's execution order modifies string_example step-by-step:
string_example = "A@ AB,BCC" //This is the initial string that we originally assigned to string_example. Now we run the norm() function on it...
string_example = "AA AB,BCC" //The first task (replacement of confusing characters) would result in the '@' being replaced by the letter 'A'.
string_example = "A AB,BC" //The second task would remove the repeated 'A' and 'C' characters, leaving one of each.
string_example = "A ABBC" //The third task removes all special characters, meaning that the comma (',') in this string is removed.
string_example = "AABBC" //The last task would then remove the space.
string_example = "AABBC" //The resulting string will be "AABBC".
When in doubt, use the debugging tool to assist you.
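The step-by-step trace above can be reproduced with a small Python sketch. It is not the extension's implementation: `norm_sketch` and the one-entry `CONFUSABLES` table are hypothetical, and the regular expressions are simplified approximations of each step.

```python
import re

# Hypothetical, tiny confusables table; the real table is far larger.
CONFUSABLES = {"@": "A"}


def norm_sketch(text):
    """Apply the four steps in the order described above."""
    # 1. Replace confusing characters.
    text = "".join(CONFUSABLES.get(ch, ch) for ch in text)
    # 2. Collapse runs of the same character into one.
    text = re.sub(r"(.)\1+", r"\1", text)
    # 3. Remove special characters (keeps word characters and whitespace).
    text = re.sub(r"[^\w\s]", "", text)
    # 4. Remove whitespace.
    text = re.sub(r"\s+", "", text)
    return text


print(norm_sketch("A@ AB,BCC"))  # AABBC
```

Because repeated characters are collapsed before the comma and the space are removed, the two B characters and the A characters they separate are never adjacent at the time of step 2, so they survive.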
Tags are created automatically when a filter is saved. Do not use the interface at the top of Special:Tags, unless you also want to activate the tag for manual use. Mistakenly activated tags may be deactivated from Special:Tags.
The only operation that really works with arrays is length. Other operations will implicitly cast an array to a string first, which can give unintuitive results. For example, page_namespace in [12, 34] is in fact equivalent to string(page_namespace) in "12\n34\n". Therefore, the expression also evaluates to true when page_namespace is 1, 2, 3, or 4. In the above case, use equals_to_any(page_namespace, 12, 34) as a workaround instead.
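The false positive can be reproduced with a short Python sketch that mimics the cast described above. The helper names are hypothetical, and the stringified form (each element followed by a newline) is taken from the example in the text.

```python
def array_to_string(values):
    """Mimic the cast described above: each element followed by a newline."""
    return "".join(f"{value}\n" for value in values)


def in_after_cast(needle, array):
    """Substring test against the stringified array."""
    return str(needle) in array_to_string(array)


def equals_to_any_sketch(value, *candidates):
    """Exact comparison against each candidate."""
    return any(value == candidate for candidate in candidates)


print(repr(array_to_string([12, 34])))   # '12\n34\n'
print(in_after_cast(1, [12, 34]))        # True (false positive)
print(in_after_cast(12, [12, 34]))       # True
print(equals_to_any_sketch(1, 12, 34))   # False
print(equals_to_any_sketch(12, 12, 34))  # True
```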
On the other hand, if you want to compute the amount of text added (removed), you might be tempted to use strlen(added_lines), strlen(removed_links) or similar. However, strlen, length and count do not implicitly cast arrays to strings: they return the length of the array (i.e., the number of lines), not the character count. The cast needs to be explicit, i.e., strlen(string(added_lines)).
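As a rough Python analogy, the difference is the same as taking the length of a list versus the length of the text it holds. The sample lines are made up, and the stringified form (each element followed by a newline) is an assumption based on the array example earlier on this page.

```python
added_lines = ["first line", "second, longer line"]

# Without a cast: the number of elements, i.e. the number of lines.
line_count = len(added_lines)

# With an explicit cast to one string first: the number of characters.
as_string = "".join(f"{line}\n" for line in added_lines)
char_count = len(as_string)

print(line_count)
print(char_count)
```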
One might expect that page_namespace / 2 === 0 will check if page_namespace is either 0 or 1. However, the division operation does not discard the remainder: if the numerator is not divisible by the denominator, the result is a float. In the above case, use equals_to_any(page_namespace, 0, 1) instead.
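Python's `/` operator behaves similarly enough to illustrate the point, with the caveat that this is an analogy and not the filter language: Python always returns a float here, and `==` stands in for the comparison.

```python
# Namespaces for which "page_namespace / 2 equals 0" holds.
matches_division = [ns for ns in range(4) if ns / 2 == 0]

# Namespaces matched by an explicit comparison against 0 and 1.
matches_explicit = [ns for ns in range(4) if ns in (0, 1)]

print(1 / 2)             # 0.5, the remainder is kept
print(matches_division)  # [0]
print(matches_explicit)  # [0, 1]
```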
As in PHP, null is smaller than any number, i.e. null < -1234567 is true. This is especially problematic when using edit_delta: if the action being filtered is not an edit, edit_delta is null, so edit_delta < -5000 will evaluate to true. Remember to check that action === "edit" when using edit_delta like that.
Filter logs can disappear under these circumstances: 1) If an edit is saved and then rev-deleted or oversighted, the filter log disappears from view (including for sysops). 2) Oversighters can remove the logs of either saved or unsaved edits. Edit filter counters always increment, so a filter may have fewer visible logs than its number of hits.
For historical reasons, some variable names do not fit the general naming pattern:
Page content: old | Page content: new | Pre-save transform: sent variable | Pre-save transform: transformed variable |
---|---|---|---|
old_wikitext | new_wikitext | added_lines | added_lines_pst |
old_html (disabled) | new_html | edit_diff | edit_diff_pst |
old_links | all_links (not new_links) | new_wikitext | new_pst (not new_wikitext_pst) |
extendedconfirmed rights, according to Special:UserGroupRights
added_lines_pst