The extension defines a domain-specific language solely to write filter rules. Since the language is not Turing complete, it cannot replace bots for more complex tasks.
Significant content taken from mw:Extension:AbuseFilter/Rules format; see page history for attribution.
The edit filter captures the following data from edits. They are stored in the following variables. They can be manipulated and analyzed with various functions and operators. The data types are int (signed integer values), string (sequences of Unicode characters), bool (true and false), float (signed rational numbers), array (non-associative arrays) and null (usually an uninitialized variable).
Note that some numerical variables may be defined as a string; to act based on these variables, you may need to cast them to an int. For example, the variable timestamp is a string; to extract the date, hour etc. you must cast it to an int first (e.g. int(timestamp) % (60*60*24) === 22).
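The cast-then-modulo arithmetic above can be tried outside the filter language. The following is a minimal Python sketch, not filter syntax. It assumes that timestamp holds Unix epoch seconds in UTC. The helper names and the sample value are hypothetical.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 60 * 60 * 24
SECONDS_PER_HOUR = 60 * 60


def seconds_since_midnight(timestamp: str) -> int:
    """Cast the string to int, then keep the remainder of a whole day."""
    return int(timestamp) % SECONDS_PER_DAY


def hour_of_day(timestamp: str) -> int:
    """Derive the UTC hour (0-23) from the seconds elapsed since midnight."""
    return seconds_since_midnight(timestamp) // SECONDS_PER_HOUR


# Hypothetical sample value; assumed to be Unix epoch seconds in UTC.
sample = "1700000000"
print(seconds_since_midnight(sample), hour_of_day(sample))

# Cross-check against the standard library's own calendar arithmetic.
assert hour_of_day(sample) == datetime.fromtimestamp(int(sample), tz=timezone.utc).hour
```

The modulo alone yields seconds since midnight, so a further integer division by 3600 is needed before comparing against an hour.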
"Pre-save transformed" means after the wikitext is evaluated before saving; i.e. with template substitution. The wikitext is taken from the latest version before page save. For example, the added_lines_pst
of {{subst:Mbox}} is {{#invoke:Message box|mbox}}; the added_lines is exactly {{subst:Mbox}} verbatim.
For an up-to-date list of variables, see the documentation on mediawiki.org.
Operator | True when ... |
---|---|
< | the left operand is less than the right. |
> | the left operand is more than the right. |
<= | the left operand is less than or equal to the right. |
>= | the left operand is more than or equal to the right. |
= or == | the left operand is equal to the right. |
!= | the left operand is not equal to the right. |
=== | the left operand is equal to the right and they are of the same data type. |
!== | the left operand is not equal to the right or they are not of the same data type. |
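The strict comparison operators can be modelled in a few lines of Python. This is only a sketch of the idea, assuming that !== is the exact negation of ===. The function names are hypothetical, and Python's own types stand in for the filter language's data types.

```python
def strict_equal(left, right) -> bool:
    """Model of ===: same data type and same value."""
    return type(left) is type(right) and left == right


def strict_not_equal(left, right) -> bool:
    """Model of !==, assumed here to be the negation of ===."""
    return not strict_equal(left, right)


# A string and an int holding "the same" number differ under strict comparison.
print(strict_equal("1", 1))      # False: the types differ
print(strict_not_equal("1", 1))  # True: the types differ
print(strict_not_equal(1, 2))    # True: same type, different value
```

Under this model a type mismatch alone is enough for the strict inequality to hold, which is why casting both sides first matters when a variable is stored as a string.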
Operator | Operation |
---|---|
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
** | Exponentiation |
% | Modulo (remainder) |
+ concatenates a string with a string or other data type.
like returns true if the left string matches the right string; this is distinct from = as the right string is treated as a glob pattern.
in returns true if the right string contains the left string.
contains is its reverse, i.e. returns true if the left string contains the right string. Note: empty strings are not contained in, nor contain, any other string (not even the empty string itself).
rlike returns true if the left string matches the regular expression pattern in the right string.
irlike is rlike with case-insensitivity.
The regex engine is PCRE with support for Unicode characters. Beware that regex is potentially expensive.
The language also supports if .. then .. else .. end and the ternary conditional operator condition ? then_value : else_value.
You can declare new variables within a condition; they are lexically scoped to the condition they appear within.
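The behaviour of the string operators described above can be approximated in Python for experimentation. This is a rough sketch under stated assumptions: Python's fnmatch stands in for the glob matching of like, and Python's re module stands in for PCRE, so edge cases may differ from the extension. The function names are hypothetical; is_in is used because in is a reserved word in Python.

```python
import fnmatch
import re


def like(text: str, pattern: str) -> bool:
    """Glob match of the whole string (approximation using fnmatch)."""
    return fnmatch.fnmatchcase(text, pattern)


def contains(haystack: str, needle: str) -> bool:
    """Substring test; empty strings never contain or are contained."""
    if haystack == "" or needle == "":
        return False
    return needle in haystack


def is_in(needle: str, haystack: str) -> bool:
    """The reverse of contains: true if the right string contains the left."""
    return contains(haystack, needle)


def rlike(text: str, pattern: str) -> bool:
    """Regex match anywhere in the string (assumed to be unanchored)."""
    return re.search(pattern, text) is not None


def irlike(text: str, pattern: str) -> bool:
    """Case-insensitive variant of rlike."""
    return re.search(pattern, text, re.IGNORECASE) is not None


print(like("foobar", "foo*"), like("foobar", "foo"))
print(contains("foobar", "oba"), is_in("", ""))
print(rlike("hello", "HELLO"), irlike("hello", "HELLO"))
```

The empty-string rule is the part most easily forgotten: a plain substring test in most languages treats the empty string as contained in everything, whereas the text above says the opposite.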
For an up-to-date list of available functions see mw:Extension:AbuseFilter/Rules format#Functions.
name | description |
---|---|
lcase | Returns the argument converted to lower case. |
ucase | Returns the argument converted to upper case. |
length | Returns the length of the string given as the argument. |
string | Casts to string data type. |
int | Casts to integer data type. |
float | Casts to floating-point data type. |
bool | Casts to boolean data type. |
norm | Equivalent to rmwhitespace(rmspecials(rmdoubles(ccnorm(arg1)))). |
ccnorm | Normalises confusable/similar characters in the argument, and returns a canonical form. A list of characters and their replacements can be found on git, e.g. ccnorm( "Eeèéëēĕėęě3ƐƷ" ) == "EEEEEEEEEEEEE". [3][4] Note that the extension AntiSpoof is required for this function to have an effect. Without it the string will simply be left unchanged. |
specialratio | Returns the number of non-alphanumeric characters divided by the total number of characters in the argument. |
rmspecials | Removes any special characters in the argument, and returns the result. (Equivalent to s/[^\p{L}\p{N}]//g.) |
rmdoubles | Removes repeated characters in the argument, and returns the result. |
rmwhitespace | Removes whitespace (spaces, tabs, and newlines). |
count | Returns the number of times the needle (first string) appears in the haystack (second string). If only one argument is given, splits it by commas and returns the number of segments. |
rcount | Similar to count but the needle uses a regular expression instead. Can be made case-insensitive by letting the regular expression start with "(?i)". |
ip_in_range | Returns true if user's IP (first string) matches specified IP ranges (second string). Only works for anonymous users. Supports IPv4 and IPv6 addresses. |
contains_any | Returns true if the first string contains any strings from the following arguments (unlimited number of arguments). |
substr | Returns the portion of the first string, by offset from the second argument (starts at 0) and maximum length from the third argument (optional). |
strlen | Same as length. |
strpos | Returns the numeric position of the first occurrence of needle (second string) in the haystack (first string). This function may return 0 when the needle is found at the beginning of the haystack, so it might be misinterpreted as a false value by another comparative operator. The better way is to use === or !== for testing whether it is found. |
str_replace | Replaces all occurrences of the search string with the replacement string. The function takes 3 arguments in the following order: text to perform the search, text to find, replacement text. |
rescape | Returns the argument with some characters preceded with the escape character "\", so that the string can be used in a regular expression without those characters having a special meaning. |
set | Sets a variable (first string) with a given value (second argument) for further use in the filter. Another syntax: name := value. |
set_var | Same as set. |
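To get a feel for what the normalisation functions in the table do, the following Python sketch models a few of them. It is an approximation, not the extension's implementation. It assumes that rmdoubles collapses runs of consecutive identical characters and that specialratio of an empty string is 0. ccnorm is left out because it depends on the AntiSpoof character tables, so norm_without_ccnorm is a hypothetical helper covering only the remaining steps of norm.

```python
import itertools
import unicodedata


def _is_letter_or_number(ch: str) -> bool:
    """True for Unicode categories L* and N*, mirroring \\p{L} and \\p{N}."""
    return unicodedata.category(ch)[0] in ("L", "N")


def rmspecials(text: str) -> str:
    """Drop every character that is neither a letter nor a number."""
    return "".join(ch for ch in text if _is_letter_or_number(ch))


def rmdoubles(text: str) -> str:
    """Collapse runs of the same character (assumed: consecutive repeats only)."""
    return "".join(ch for ch, _run in itertools.groupby(text))


def rmwhitespace(text: str) -> str:
    """Remove all whitespace by splitting on it and re-joining the pieces."""
    return "".join(text.split())


def specialratio(text: str) -> float:
    """Share of characters that are not letters or numbers (0.0 if empty)."""
    if not text:
        return 0.0
    specials = sum(1 for ch in text if not _is_letter_or_number(ch))
    return specials / len(text)


def count(needle: str, haystack=None) -> int:
    """Two-argument form counts occurrences; one-argument form counts comma segments."""
    if haystack is None:
        return len(needle.split(","))
    return haystack.count(needle)


def norm_without_ccnorm(text: str) -> str:
    """The norm pipeline from the table, minus the ccnorm step."""
    return rmwhitespace(rmspecials(rmdoubles(text)))


print(rmdoubles("heeelllo woorld"))
print(norm_without_ccnorm("f  oo!!  baar"))
print(specialratio("ab!?"), count("a", "banana"), count("x,y,z"))
```

Note the order inside norm: doubles are collapsed before special characters are stripped, so characters that only become adjacent after stripping are not merged.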
If a user triggers a filter, the edit filter can apply any of the following sanctions based on the severity of the offense:
The following actions are currently not available on this wiki:
Note: Individual sanctions can be disabled selectively. Any edit filter manager can restore autoconfirmed status in case of an error.
The condition limit is a limit imposed by the software on the total number of conditions that can be evaluated by the filters. It is arbitrarily fixed at 2,000 [5]. Although the limit exists to avoid performance issues, it is not a good metric of how heavy a filter is: for instance, a filter using dozens of simple comparisons (and thus dozens of conditions) is much lighter than one using a single check on the all_links variable. See mw:Extension:AbuseFilter/Conditions and mw:Extension:AbuseFilter/Rules format#Performance for more details.
All edits triggering an action will produce a report at Special:AbuseLog. On this page, a brief log entry is entered. Users with the appropriate permissions may view the log summary. Users with certain higher permissions may view details on the log entry. This includes all information available to the filter when it ran, and may be useful for debugging purposes. Users with the highest level of log-viewing permissions may view private data about the action which caused the log event, such as the user's IP address. See the AbuseFilter documentation for more details on the permissions structure.
The details link brings up a screen like that on the right.
To protect the wiki against poorly configured filters, a technical limit is imposed on the maximum percentage of actions that will trigger a given filter. Other technical limits are in the process of being written.
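The idea of capping the share of actions that a single filter may match can be illustrated with a short Python sketch. Everything here is hypothetical: the function name, the default ratio and the minimum match count are illustrative values chosen for the example, not this wiki's settings or the extension's code.

```python
def exceeds_match_limit(
    matches: int,
    total_actions: int,
    max_ratio: float = 0.05,  # illustrative threshold, not a real setting
    min_matches: int = 2,     # illustrative floor to ignore tiny samples
) -> bool:
    """Return True when a filter matched too large a share of recent actions."""
    if total_actions <= 0:
        return False  # nothing observed yet, so nothing to judge
    ratio = matches / total_actions
    return matches > min_matches and ratio > max_ratio


# 10 matches out of 100 actions is 10%, above the illustrative 5% cap.
print(exceeds_match_limit(10, 100))
# 3 matches out of 1000 actions is 0.3%, well below the cap.
print(exceeds_match_limit(3, 1000))
```

A floor on the absolute number of matches is one plausible way to keep a brand-new filter from tripping the limit on its first hit; whether the software does this is not stated above.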
All notifications are based on the template {{edit filter warning}}.
Standard notifications shown to a user triggering a filter action:
Message name | Message text |
---|---|
abusefilter-disallowed | |
abusefilter-autopromote-blocked | |
The generic warning message is below. Admins are advised to use custom warnings.
Message name | Message text |
---|---|
abusefilter-warning | |
Some existing filters and their warnings:
Filter and message | Message text |
---|---|
30: large deletions | |
132: removal of all categories | |
If a filter is set to warn and disallow, then a user clicking "Save page" will alternately see that warning and the standard disallowed message.