This is my simple guide to editing at CCI (Contributor Copyright Investigations). It covers how to mark listings and what to do in special situations, based on my own experience.
If you are experienced with this area on Wikipedia, feel free to add other advice. For a list I have made of CCIs, see User:Moneytrees/CCI Sort.
CCIs vary greatly in subject matter and in how the subject copied content. Becoming familiar with the subject's way of editing (how they copied from sources, the citation style they used, and which sources they were fond of copying from) is useful when focusing on one CCI. The Wayback Machine, a website that archives snapshots of web pages, including ones that have since gone offline, is essential for work at CCI.
? Rewritten/removed since --~~~~
Make sure it wasn't moved to a different article. Some CCI subjects use sockpuppets to repeatedly edit the same article. Make sure what was removed wasn't rewritten by one of their socks.
{{n}} Checked --~~~~
{{y}} removed --~~~~
{{subst:copyvio|url=INSERTURL}}
and follow the instructions on the generated notice. Notifying CCI subjects that an article was blanked is not necessary.
Earwig's Copyvio Detector is the primary tool for finding copyright violations on Wikipedia. Earwig will compare the scanned article to live web pages and highlight similarities. There are two options: Copyvio search and URL comparison. In Copyvio search mode, there are three options that can be selected at the same time:
In URL comparison mode, a single URL will be compared to the article. This is the best option for examining individual diffs, edits that cite only one or two sources, or when an article primarily cites a single source.
Earwig will sometimes have trouble reading certain archives and websites, so be patient and reload a few times if it doesn't work initially. Earwig does not work on books or journals and cannot translate non-English sources into English. For those sources, you will have to compare the text manually or, if you want to use Earwig, paste the source content into a page like User:Moneytrees/dummy. You can then paste the URL of that page into the "URL comparison" field to get a comparison. After you get the comparison, remove the content you pasted and request revision deletion if applicable.
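If you run a lot of URL comparisons, it can help to build the link instead of filling in the form each time. The following is a minimal sketch, not an official interface: the base address and the parameter names (`lang`, `project`, `title`, `action`, `url`) are my assumptions about how the tool's links are formed, so check them against a link the tool generates for you.

```python
from urllib.parse import urlencode

# Assumed base address of the tool; verify against the live site.
EARWIG_BASE = "https://copyvios.toolforge.org/"


def earwig_compare_url(title, source_url, lang="en", project="wikipedia"):
    """Build a link that should open Earwig in URL comparison mode.

    The parameter names are assumptions, not a documented API.
    """
    params = {
        "lang": lang,
        "project": project,
        "title": title,          # page on Wikipedia to scan
        "action": "compare",     # assumed value for URL comparison mode
        "url": source_url,       # the single URL to compare against
    }
    return EARWIG_BASE + "?" + urlencode(params)


# Example: compare a scratch page against a (placeholder) source address.
print(earwig_compare_url("User:Moneytrees/dummy", "https://example.org/source"))
```

`urlencode` percent-encodes the title and the source address, so colons and slashes in either one do not break the link.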
The percentage doesn't mean much and is usually best ignored. Instead, go by the highlighted text.
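To get a feel for why the matching text tells you more than a single score, here is a rough sketch that lists the word runs two texts share. This is not how Earwig works internally; `shared_runs` and the run length of five words are hypothetical choices made only for illustration.

```python
import re


def shared_runs(article_text, source_text, n=5):
    """Return the n-word runs that appear in both texts, ignoring case."""
    def runs(text):
        words = re.findall(r"\w+", text.lower())
        # Every consecutive window of n words, joined back into a phrase.
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    return runs(article_text) & runs(source_text)


article = "The castle was built in 1204 by the local baron and later abandoned."
source = "Records show the castle was built in 1204 by the local baron."

for phrase in sorted(shared_runs(article, source)):
    print(phrase)
```

Reading the shared phrases shows at a glance whether the overlap is a copied sentence or just common wording such as names and titles, which a percentage alone cannot tell you.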
Keep in mind that many sites have copied from Wikipedia over the years, and using the search engine with Earwig will almost always find a handful, so be careful when removing content. If it seems like the website copied from Wikipedia, press Ctrl+F and type "Wikipedia", which will often highlight something along the lines of "Taken from Wikipedia" on the scanned web page. Always be wary of user-generated websites; for example, nearly every Wikipedia article seems to have been copied by at least one BlogSpot site. Be careful when assessing IMDb violations; they've repeatedly copied Wikipedia plot summaries, and we've repeatedly copied them.
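The Ctrl+F check can be sketched in a few lines if you already have the page text saved. The phrase list below is hypothetical and far from complete; a match is only a hint that the site copied from Wikipedia, and no match proves nothing.

```python
import re

# Hypothetical hint phrases; extend the list with wording you actually run into.
ATTRIBUTION_HINTS = [
    r"taken from wikipedia",
    r"from wikipedia,? the free encyclopedia",
    r"source:\s*wikipedia",
    r"wikipedia\.org",
]


def wikipedia_mentions(page_text):
    """Return the hint patterns found in the page text, ignoring case."""
    return [
        pattern
        for pattern in ATTRIBUTION_HINTS
        if re.search(pattern, page_text, re.IGNORECASE)
    ]


print(wikipedia_mentions("This text was Taken from Wikipedia."))
print(wikipedia_mentions("An original essay about castles."))
```

Even when a hint matches, compare dates (for example with an archived copy of the page) before deciding which side copied from which.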
Certain sites don't like Earwig and will time out when it tries scanning them; The Independent and some PDFs are examples. If this happens, go to a website that will find Google web caches, which are saved versions of pages that Earwig should always be able to read. https://cachedpage.co/ is an example; some Archive.org saves can be viable workarounds as well.
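For the Archive.org workaround, a Wayback Machine address can be put together by hand and then pasted into the URL comparison field. A minimal sketch, assuming the usual `web.archive.org/web/<timestamp>/<url>` address form; `wayback_url` is a hypothetical helper, and whether Earwig can read a given snapshot still has to be tried.

```python
def wayback_url(original_url, timestamp):
    """Build a Wayback Machine address for a page.

    timestamp is 4 to 14 digits (YYYY up to YYYYMMDDhhmmss); the archive
    normally redirects a partial timestamp to the nearest capture it holds.
    """
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. '2015' or '20150601'")
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


# Placeholder address; use the source the CCI subject actually cited.
print(wayback_url("https://example.com/article", "2015"))
```

Picking a timestamp from before the suspect edit also helps confirm that the source text existed before it appeared on Wikipedia.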
In some cases, the sources the CCI subject copied from are inaccessible, of questionable veracity, or would cost significant money to access. There are also cases where infringement is all but certain and obvious in most of the subject's significant edits. In these cases, it is best to remove the content they inserted. Note that this is a last resort; try to access the source before doing this. Presumptive removal may also be warranted where the subject copied a specific kind of content (e.g. plot summaries), where figuring out what the subject copied from would be too difficult, or where the CCI could be wrapped up more quickly by just removing everything. If the sources are inaccessible and stubbing or removing the problematic content would not be feasible, tag the article for presumptive deletion. For presumptive removals and deletions:
Presumptive removal over copyright concerns, please see: [[Wikipedia:Contributor copyright investigations/INSERTNAME]]
{{subst:copyvio|url=Presumptive deletion over copyright concerns, please see: [[Wikipedia:Contributor copyright investigations/INSERTNAME]]}}
If the amount of text you remove is major (+500 or important text), please leave a note on the article's talk page with {{subst:CCI|INSERTNAME}}.
Please mark the associated listing with something along the lines of {{x}} Presumptive removal ~~~~ or {{x}} Tagged for presumptive deletion ~~~~
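When working through a long CCI, filling in INSERTNAME by hand gets tedious. The sketch below just substitutes the investigation name into the strings given above; `presumptive_strings` and the name "ExampleUser" are hypothetical, and the output is meant to be copied into the edit summary, article, or listing by hand.

```python
def presumptive_strings(cci_name, deletion=False):
    """Fill the CCI name into the presumptive removal or deletion strings."""
    page = "[[Wikipedia:Contributor copyright investigations/" + cci_name + "]]"
    action = "deletion" if deletion else "removal"
    reason = f"Presumptive {action} over copyright concerns, please see: {page}"

    if deletion:
        return {
            # Tag placed on the article when it cannot be stubbed or cleaned.
            "blanking_tag": "{{subst:copyvio|url=" + reason + "}}",
            "listing_mark": "{{x}} Tagged for presumptive deletion ~~~~",
        }
    return {
        "edit_summary": reason,
        # Only needed on the talk page when the removal is major.
        "talk_note": "{{subst:CCI|" + cci_name + "}}",
        "listing_mark": "{{x}} Presumptive removal ~~~~",
    }


for label, text in presumptive_strings("ExampleUser").items():
    print(label + ": " + text)
```

The wikitext is built with plain string concatenation rather than f-strings so that the doubled braces of the templates come out unchanged.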
License compatibility with Wikipedia [1]
Licenses compatible with Wikipedia | Licenses not compatible with Wikipedia |
---|---|
Creative Commons licenses | |
Other licenses | |