This script is largely obsolete; it is recommended that you use AutoEd instead, which provides virtually all of the same functions and is easier to customize. I will no longer actively maintain this script, unless there is a specific request, because AutoEd serves as a replacement. All of the fixes except ASCII arrow to Unicode character conversion and HTML to wikitable conversion are included in the "basic" preset for AutoEd, although the table conversion is in the "wikichecker" preset and, most likely, the "advanced" preset once it is created. If you have any questions, please feel free to ask at WT:AutoEd or my talk page. – Drilnoth ( T • C • L) 21:37, 1 May 2009 (UTC)
Be advised that even when using these scripts, you take full responsibility for any action done using them. You must understand Wikipedia policies and use this tool within that policy, or risk being blocked for its misuse.
CodeFixer is a user script that allows you to quickly and easily fix common mistakes in HTML and WikiText. Please note that this script may have bugs and that not all of its planned functionality has been implemented. Additionally, while the script is being worked on, errors may be introduced which cause it (and possibly other user scripts which you are using) to stop functioning for a short time, although this should usually be fixed within a few minutes.
This script is based upon MECU's BR fixer, Plastikspork's script, and Formatter, and combines various elements from them with new code. Each script functions slightly differently, however, so you can choose whichever one suits your personal tastes.
When CodeFixer is installed (see #Installation), you should see two new tabs at the top of each page, both in edit mode and when viewing the page normally: "fix code" and "fix code (+)". Clicking either tab causes the script to automatically edit the page and fix common errors, or simply clean up code that was not strictly an error. The script itself contains a list of everything it can do (although the list is written in JavaScript, it should be fairly clear to anyone what problems are fixed). The range of fixes is always expanding.
The two buttons make slightly different fixes. The first button represents the "standard" version of CodeFixer. Although it has some false positives (see below), almost all of its fixes should be unproblematic. The "fix code (+)" button, however, starts CodeFixerPlus, which performs more advanced cleanup in addition to the "basic" fixes of the normal version. CodeFixerPlus's edits are best treated as an aid to cleanup: while working on advanced things like converting HTML tables to WikiText tables, the script will introduce errors that need to be cleaned up by a human before the edit is saved. CodeFixerPlus, then, serves as a quick way to "help" clean up such code, but it doesn't do the whole job; you need to go over the result, previewing and editing further, to make sure that it all looks good.
Note that you should always check the diff of any edit made using this script before saving, to make sure that there weren't any false positives (the "show changes" button is automatically clicked when you use this script, so that you don't need to do it manually). If there are, please fix them before saving and report them at #Bugs and to-do list, so that I am aware of the issue.
CodeFixer works with Formatter, although it contains many of the same fixes. Someday CodeFixer may incorporate all of Formatter's current fixes, except maybe for whitespace fixing.
{{Template:Infobox author}} becomes simply {{Infobox author}}.
Because of the simplicity of this script's RegEx, it will find some false positives, such as those below. These can hopefully be fixed at some point; if you find any other false positives, please add them here so that I can try to figure out how to fix them. CodeFixerPlus is known to have false positives which need further human cleanup before saving; this list contains only those caused by the standard CodeFixer.
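To illustrate how a simple regular expression produces both the intended fix and a false positive, here is a minimal sketch of the Template: prefix removal. The function name `stripTemplatePrefix` and the exact pattern are assumptions made for illustration; the pattern CodeFixer actually uses may differ.

```javascript
// Hypothetical sketch: drop a redundant "Template:" prefix inside {{ }}.
// The pattern is an assumption, not copied from codefixer.js.
function stripTemplatePrefix(text) {
  return text.replace(/\{\{\s*Template:\s*/gi, '{{');
}

// Intended fix.
console.log(stripTemplatePrefix('{{Template:Infobox author}}'));
// False positive: the pattern knows nothing about <nowiki>, so text that
// was meant to be displayed literally is changed as well.
console.log(stripTemplatePrefix('<nowiki>{{Template:Infobox author}}</nowiki>'));
```

A pattern this simple cannot see its surrounding context, which is why the diff should always be checked before saving.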
To install CodeFixer for your own use, simply add the following to your monobook.js page (if you're using a non-standard skin, you probably know what to do).
importScript('User:Drilnoth/codefixer.js'); //See [[User:Drilnoth/codefixer.js/doc]] for details
After adding this, just purge your cache and the script should begin working.
By default, CodeFixer marks edits you make using it as minor. If you'd prefer that they not be, you can add the following right under the "importScript" text on your monobook.js page:
codefixerMinor = false;
Please note that setting this to false will actually mark the edit as major even if it had been marked as minor before you used the script, due to a technical restriction. You can, however, still manually check and uncheck the "mark this edit as minor" box, depending on what other edits you make.
If you have found a bug: Please leave a descriptive entry here and a note on my talk page about the issue. Thanks!
If you would like to request that a new fix be added to the script, please edit this section and add your request here. If you request an addition here, I'll do that as soon as possible if it is a logical fix for this script. Once the fix is added, I will remove the request from this section and notify you on your talk page. If the requested fix cannot be implemented due to technical restrictions, or because it is an inappropriate task for this script, I will say such in the edit history and leave a note on your talk page.
(NOTE: Previous discussion on this topic removed because page was getting too large. See [3] for earlier parts of this discussion. – Drilnoth ( T • C • L) 16:22, 21 April 2009 (UTC))
You should probably note that the period in '<.BR>' matches all characters, hence all the particular matches you have after that are redundant. This is probably fine, unless you are worried about matching something like '<abr>' or '<ubr>', although I don't think there are any three-letter tags that end in br. If you really want to match a literal period, you have to backslash it. The three lines above could be reduced to the following two lines:
txt.value = txt.value.replace(/(?:<[\\\/\.]+BR[\\\/\. ]*>|<[\\\/\. ]*BR[ ]*[\\\/\.]+[ ]*>)/gi, '<br />');
txt.value = txt.value.replace(/<[ ]*BR[ ]*>/gi, '<br>');
The second line will perform one no-op, which is to replace '<br>' with '<br>', but it's not a big deal in my opinion since it reduces the complexity of the match. By the way, I maintain my own script, which does some cleanup, but it's mostly orthogonal to what you have here, although there is some minor overlap. Thanks for your contributions! Plastikspork ( talk) 00:32, 20 April 2009 (UTC)
txt.value = txt.value.replace(/<[\\\/\.]+BR[\\\/\. ]*>/gi, '<br />'); // Tag starts with a slash or period
txt.value = txt.value.replace(/<[\\\/\. ]*BR[ ]*[\\\/\.]+[ ]*>/gi, '<br />'); // Tag ends with a slash or period
txt.value = txt.value.replace(/<[ ]*BR[ ]*>/gi, '<br>'); // Tag contains no slashes
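To make the three replacements above easier to check, they can be wrapped in a small function and run on sample strings. This is only a sketch: the function name `fixBrTags` is hypothetical, and it works on a plain string rather than on the edit box's `txt.value`. The last two lines illustrate the point made earlier about the unescaped period.

```javascript
// Hypothetical wrapper around the three BR replacements discussed above.
function fixBrTags(text) {
  // Tag starts with a backslash, slash, or period, e.g. </br> or <\br>
  text = text.replace(/<[\\\/\.]+BR[\\\/\. ]*>/gi, '<br />');
  // Tag ends with a backslash, slash, or period, e.g. <br/> or <br />
  text = text.replace(/<[\\\/\. ]*BR[ ]*[\\\/\.]+[ ]*>/gi, '<br />');
  // Tag contains no slashes, e.g. <BR > (this also rewrites <br> to itself)
  text = text.replace(/<[ ]*BR[ ]*>/gi, '<br>');
  return text;
}

console.log(fixBrTags('one</br>two<BR >three<br/>four'));
// prints: one<br />two<br>three<br />four

// An unescaped period matches any character; an escaped one matches
// only a literal period.
console.log(/<.BR>/i.test('<abr>'));  // true
console.log(/<\.BR>/i.test('<abr>')); // false
```

Note that tags such as `<abbr>` are left alone, because every pattern requires `BR` to follow the opening bracket with nothing but slashes, periods, or spaces in between.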
I don't think your code is completely safe in that if the <hr> is preceded by a pipe, it will look like a |- for a table. For example, try it on the following (check the wikisource to see exactly what I am talking about):
For this reason, I suggest the following code instead, which will add a newline if the 'hr' is not at the start of the line:
txt.value = txt.value.replace(/([\r\n])[\t ]*<[\\\/\. ]*HR[\\\/\. ]*>/gi, '$1----');
txt.value = txt.value.replace(/(.)<[\\\/\. ]*HR[\\\/\. ]*>/gi, '$1\n----');
Let me know if there is anything else I can do to help. Plastikspork ( talk) 23:03, 21 April 2009 (UTC)
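The two suggested HR replacements can be tried out on sample strings with a small wrapper. This is a sketch only; the name `fixHrTags` is hypothetical and the function takes a plain string instead of `txt.value`.

```javascript
// Hypothetical wrapper around the two HR replacements suggested above.
function fixHrTags(text) {
  // HR already at the start of a line (after optional tabs or spaces):
  // keep the newline and emit the four dashes.
  text = text.replace(/([\r\n])[\t ]*<[\\\/\. ]*HR[\\\/\. ]*>/gi, '$1----');
  // HR preceded by any other character: insert a newline first, so a
  // preceding pipe can never be followed directly by dashes.
  text = text.replace(/(.)<[\\\/\. ]*HR[\\\/\. ]*>/gi, '$1\n----');
  return text;
}

console.log(JSON.stringify(fixHrTags('a\n  <hr>\nb'))); // "a\n----\nb"
console.log(JSON.stringify(fixHrTags('|<hr />')));      // "|\n----"
```

One thing this sketch makes visible: an `<hr>` that is the very first thing in the text has no preceding character or newline, so neither pattern matches it and it is left unchanged.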
I noticed an issue; sometimes this script offers to convert mundane elements like the b-element to wiki-markup, but gets confused by any attributes present.
This does still work, but is definitely not something we want happening — to anything, not just sigs. An expression for this would be rather messy as there's a lot that could be going on in the middle.
fyi, Jack Merridew 06:37, 20 April 2009 (UTC)
txt.value = txt.value.replace(/<(B|STRONG)[ ]*>([^<>]*)<\/\1[ ]*>/gi, "'''$2'''"); // Wikify <B> and <STRONG>
txt.value = txt.value.replace(/<(I|EM)[ ]*>([^<>]*)<\/\1[ ]*>/gi, "''$2''"); // Wikify <I> and <EM>
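The effect of the two lines above on tags with and without attributes can be checked with a short sketch. The function name `wikifyBoldItalic` is hypothetical; the patterns are the ones quoted above, applied to a plain string.

```javascript
// Hypothetical wrapper around the two wikify replacements above.
function wikifyBoldItalic(text) {
  // <B>/<STRONG> without attributes and without nested tags -> '''...'''
  text = text.replace(/<(B|STRONG)[ ]*>([^<>]*)<\/\1[ ]*>/gi, "'''$2'''");
  // <I>/<EM> without attributes and without nested tags -> ''...''
  text = text.replace(/<(I|EM)[ ]*>([^<>]*)<\/\1[ ]*>/gi, "''$2''");
  return text;
}

console.log(wikifyBoldItalic('<b>bold</b> and <EM>em</em>'));
// prints: '''bold''' and ''em''

// A tag carrying attributes is left untouched, because only spaces are
// allowed between the tag name and the closing bracket.
console.log(wikifyBoldItalic('<b style="color:red">sig</b>'));
// prints: <b style="color:red">sig</b>
```

Because `[^<>]*` cannot cross another tag, an outer element that contains nested markup is also skipped rather than half-converted.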
I noticed someone made an edit that was attributed to this script, and that edit added lots of incorrect spaces: [4]. There's no need to replace entities with unicode in the first place, but if the script does do the replacement it should not add spaces. — Carl ( CBM · talk) 17:11, 22 April 2009 (UTC)
I notice that the code replaces '<H1>' with '=' without checking whether it's preceded by a newline. Note that while HTML is pretty much newline insensitive, Wikipedia text is not, especially in this case. A simple solution, which appears to be completely safe, is to match the start and end tags together and make sure there are no problems with newlines. For example, this is<h1>a heading</h1>with problems would not render the same as this is=a heading=with problems. You could start by adding newlines where needed:
txt.value = txt.value.replace(/([^\r\n ])[\t ]*(<H[1-6][^<>]*>)/gim, '$1\n$2'); // Make sure <H1>, ..., <H6> is after a newline
txt.value = txt.value.replace(/(<\/H[1-6][^<>]*>)[\t ]*([^\r\n ])/gim, '$1\n$2'); // Make sure </H1>, ..., </H6> is before a newline
and then follow that by a match which replaces only those which can be safely changed
txt.value = txt.value.replace(/(^|[\r\n])[\t ]*<H1[^<>]*>([^\r\n]*?)<\/H1[\r\n\t ]*>[\t ]*([\r\n]|$)/gim, '$1=$2=$3');
txt.value = txt.value.replace(/(^|[\r\n])[\t ]*<H2[^<>]*>([^\r\n]*?)<\/H2[\r\n\t ]*>[\t ]*([\r\n]|$)/gim, '$1==$2==$3');
txt.value = txt.value.replace(/(^|[\r\n])[\t ]*<H3[^<>]*>([^\r\n]*?)<\/H3[\r\n\t ]*>[\t ]*([\r\n]|$)/gim, '$1===$2===$3');
txt.value = txt.value.replace(/(^|[\r\n])[\t ]*<H4[^<>]*>([^\r\n]*?)<\/H4[\r\n\t ]*>[\t ]*([\r\n]|$)/gim, '$1====$2====$3');
txt.value = txt.value.replace(/(^|[\r\n])[\t ]*<H5[^<>]*>([^\r\n]*?)<\/H5[\r\n\t ]*>[\t ]*([\r\n]|$)/gim, '$1=====$2=====$3');
txt.value = txt.value.replace(/(^|[\r\n])[\t ]*<H6[^<>]*>([^\r\n]*?)<\/H6[\r\n\t ]*>[\t ]*([\r\n]|$)/gim, '$1======$2======$3');
Note that if there is a newline inside of the headline tags, then the substitution is not safe, which is why my suggestion guards against newlines inside. A solution to this problem would be to remove these newlines before performing the substitution. If you want that code, search for 'remove newlines from inside' in my script. You can copy the for loop and change 'str' to 'txt.value' and it should work. Plastikspork ( talk) 02:15, 25 April 2009 (UTC)
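The two-step heading conversion suggested above can be sketched as a single function, with the six near-identical lines folded into a loop. The name `fixHeadings` is hypothetical, the function works on a plain string rather than `txt.value`, and the patterns are built with `new RegExp` so that the heading level can be substituted in; otherwise they are the same patterns as in the suggestion.

```javascript
// Hypothetical sketch of the two-step heading conversion described above.
function fixHeadings(text) {
  // Step 1: make sure opening tags follow a newline and closing tags
  // precede one.
  text = text.replace(/([^\r\n ])[\t ]*(<H[1-6][^<>]*>)/gim, '$1\n$2');
  text = text.replace(/(<\/H[1-6][^<>]*>)[\t ]*([^\r\n ])/gim, '$1\n$2');

  // Step 2: convert only headings that sit alone on a single line.
  for (let level = 1; level <= 6; level++) {
    const marker = '='.repeat(level);
    const pattern = new RegExp(
      '(^|[\\r\\n])[\\t ]*<H' + level + '[^<>]*>([^\\r\\n]*?)<\\/H' + level +
      '[\\r\\n\\t ]*>[\\t ]*([\\r\\n]|$)', 'gim');
    text = text.replace(pattern, '$1' + marker + '$2' + marker + '$3');
  }
  return text;
}

console.log(JSON.stringify(fixHeadings('this is<h1>a heading</h1>with problems')));
// "this is\n=a heading=\nwith problems"

// A heading that spans two lines is deliberately left alone.
console.log(JSON.stringify(fixHeadings('<h3>a\nb</h3>')));
// "<h3>a\nb</h3>"
```

As in the original suggestion, any attributes on the opening tag are discarded by the conversion.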
Hi. I use a Norwegianised version of your script along with the formatter-script, and appreciate it very much. However, the latter can add and remove whitespaces in ways that are not always desirable. I was wondering if it was possible for you to add a fix similar to the link simplifier (simplifies some links e.g. [[Dog|dog]] to [[dog]], [[Dog|dogs]] to [[dog]]s and [[Dog|canine]]s to [[Dog|canines]]) to your script, eliminating my need to use both scripts? Thanks, and keep up the good work. - Helt ( talk) 14:39, 27 April 2009
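The three link simplifications described in this request could be sketched roughly as follows. This is not Formatter's code: the function name `simplifyLinks` and both patterns are assumptions, and the sketch relies on the first letter of a page title being case-insensitive, as it is on the English Wikipedia. It makes no attempt to handle namespaces, section links, or non-ASCII letters.

```javascript
// Hypothetical sketch of a link simplifier; not taken from Formatter.
function simplifyLinks(text) {
  // [[Dog|canine]]s -> [[Dog|canines]]: pull trailing letters into the label.
  text = text.replace(/\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]([a-z]+)/g, '[[$1|$2$3]]');

  // True when a and b differ at most in the case of their first letter.
  const sameTitle = (a, b) =>
    a.length === b.length &&
    a.charAt(0).toLowerCase() === b.charAt(0).toLowerCase() &&
    a.slice(1) === b.slice(1);

  return text.replace(/\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]/g, (match, target, label) => {
    // [[Dog|dog]] -> [[dog]]
    if (sameTitle(target, label)) return '[[' + label + ']]';
    // [[Dog|dogs]] -> [[dog]]s
    const head = label.slice(0, target.length);
    const tail = label.slice(target.length);
    if (sameTitle(target, head) && /^[a-z]+$/.test(tail)) {
      return '[[' + head + ']]' + tail;
    }
    return match; // anything else is left as it was
  });
}

console.log(simplifyLinks('[[Dog|dog]] [[Dog|dogs]] [[Dog|canine]]s'));
// prints: [[dog]] [[dog]]s [[Dog|canines]]
```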
I really appreciate both of your efforts. I do pop in now and again to copy some updates, but I guess I need to copy the whole page from time to time and translate the lot again, as I might miss the odd bugfix when I sew it together bit by bit. I don't know any java, but a few things seem quite intuitive, so I'm able to weed out the bits that don't fit the Norwegian standards. Adding my own fixes, now that requires more than my little brain can handle, so I'll have to ask you. Like the new error from Check Wikipedia, the indented list. Would it be much of a problem changing the :* ::* to ** and *** and so on? - Helt ( talk) 18:52, 28 April 2009 (UTC)
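The indented-list change asked about here could be sketched with a single replacement. The function name `fixIndentedLists` and the pattern are assumptions, not code from CodeFixer. The sketch only looks at the start of each line and turns every leading colon that is followed by a bullet into another bullet.

```javascript
// Hypothetical sketch: turn :* into **, ::* into ***, and so on.
function fixIndentedLists(text) {
  // The m flag makes ^ match at the start of every line.
  return text.replace(/^(:+)(\*+)/gm, (match, colons, stars) =>
    '*'.repeat(colons.length) + stars);
}

console.log(JSON.stringify(fixIndentedLists(':* a\n::* b\n* c\n: d')));
// "** a\n*** b\n* c\n: d"
```

Lines that start with colons only, such as indented talk-page replies, are not touched, because the pattern requires at least one asterisk after the colons.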
Here's an interesting case you may want to look into: [[text#text|text]] (where all three "text"s are identical). This type of link may occur when an inexperienced user tries to bypass a redirect to a section link by copying from the URL (e.g. if Foo redirects to Bar#Foo, when you click on a link to Foo, the URL becomes http://en.wikipedia.org/wiki/Foo#Foo - the user then copies "Foo#Foo" from the URL and pastes it into the link in the original article, perhaps following the lead from other piped links on that article; thus [[Foo]] becomes [[Foo#Foo|Foo]]). Probably, the best option per WP:R2D is just to replace it with a non-section, non-piped link. For an example, see this edit which fixed one such occurrence - ADV Films redirects to A.D. Vision#ADV Films. 「 ダイノ ガイ 千?!」(Dinoguy1000) 19:59, 28 April 2009 (UTC)
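A pattern for this case can use backreferences to require that all three parts of the link are identical. The sketch below is an assumption about how such a fix might look, not code from CodeFixer; the name `fixSelfSectionLinks` is hypothetical.

```javascript
// Hypothetical sketch: [[Foo#Foo|Foo]] -> [[Foo]] when all three parts match.
function fixSelfSectionLinks(text) {
  // \1 must repeat the page title exactly, both as section and as label.
  return text.replace(/\[\[([^\[\]|#]+)#\1\|\1\]\]/g, '[[$1]]');
}

console.log(fixSelfSectionLinks('[[ADV Films#ADV Films|ADV Films]]'));
// prints: [[ADV Films]]

// Links whose section or label differs from the title are left alone.
console.log(fixSelfSectionLinks('[[Bar#Foo|Foo]]'));
// prints: [[Bar#Foo|Foo]]
```

Since the match is exact, a link whose parts differ only in capitalisation would not be changed by this sketch.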