This page is an archive. Do not edit the contents of this page. Please direct any additional comments to the current main page.
Kodak-worldREMOVETHIS.com was changed to www.officialkodakblack.com but URLs within the site do not necessarily map cleanly. The old URL now hosts malware (Signpost coverage, "Beware of malware", screenshot from Kaspersky).
Please change http://kodak-world.REMOVETHIScom/?page_id=24 (Biography of Kodak Black) to https://web.archive.org/web/20170103124913/http://kodak-world.com?page_id=24 and change the main URL where it appears by itself (such as in "Official web site" links) to www.officialkodakblack.com. Change any other uses to a non-recent/non-poison version on https://web.archive.org or a similar archive site or on www.officialkodakblak.com if it exists, and flag the rest for manual handling.
I found only a few instances of this in a manual sweep of Kodak Black articles in 14 languages, so this task may already be complete. ru:Kodak Black, uk:Kodak Black, and fr:Kodak Black are now clean. However, we do need to scan the entire project for other instances of the poisoned web site. The previous discussion which pointed me here is at Wikipedia:Village_pump_(technical)#Should we be checking for links to the Shlayer trojan horse? (permalink). davidwr/(talk)/(contribs) 15:13, 31 January 2020 (UTC)
The domain factfinder.census.gov will be taken offline on 31 March 2020.
As per https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml:
There are over 4,600 Wikipedia articles directly referencing this domain, as well as several templates that reference the domain. However, there are over 40,000 Wikipedia articles that use these templates. — Preceding unsigned comment added by Fabrickator ( talk • contribs)
{{cite web}} and treat them as dead links and add an archive URL, or 2) find the corresponding new URL at data.census.gov. The problem with technique #1 is that the FactFinder site uses web 2.0 type stuff that the Wayback Machine has trouble archiving, so it won't be much help. Archive.today does better but most of the links are not saved. Technique #2 is the ideal solution, but mapping URLs between the old and new sites looks very complicated. There are two documents (ominously, two 20-page "deep linking guides"), one for the old site and one for the new site; the trick is to learn how to map between them and write software that can do it. -- Green C 20:47, 8 February 2020 (UTC)
Discussion moved to WP:USCENSUS -- Green C 03:31, 12 February 2020 (UTC)
Technical and legal authorizations from the Mexican Federal Telecommunications Institute's Registro Público de Concesiones (RPC) are cited in hundreds of articles about Mexican broadcasting. There are 1,290 citations from the domain rpc.ift.org.mx which hosts the PDF documents.
On January 31, 2020, the RPC changed to begin serving HTTPS only. In addition, they added a "v" to the URL, so URLs that were formerly
http://rpc.ift.org.mx/rpc/pdfs/96255_181211120729_7489.pdf
changed to
https://rpc.ift.org.mx/vrpc/pdfs/96255_181211120729_7489.pdf
This will particularly be needed for Mexican radio and TV articles, as well as the lists that use them on eswiki (such as es:Anexo:Estaciones de radio en el estado de Michoacán). I am doing some high-link-count articles, like Imagen Televisión, manually. Raymie ( t • c) 02:13, 9 February 2020 (UTC)
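The URL change described above could be sketched roughly as follows. This is only an illustration, assuming every affected link lives on rpc.ift.org.mx and that only the `/rpc/pdfs/` path prefix changed; the function name `update_rpc_url` is hypothetical and is not taken from any existing bot.

```python
from urllib.parse import urlsplit, urlunsplit

# Path prefixes taken from the example URLs in the request above.
OLD_PREFIX = "/rpc/pdfs/"
NEW_PREFIX = "/vrpc/pdfs/"


def update_rpc_url(url: str) -> str:
    """Return the HTTPS "/vrpc/" form of an RPC link (hypothetical helper).

    URLs on other hosts are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() != "rpc.ift.org.mx":
        return url
    path = parts.path
    if path.startswith(OLD_PREFIX):
        # Swap only the leading prefix; the file name is kept as-is.
        path = NEW_PREFIX + path[len(OLD_PREFIX):]
    # The site is described as HTTPS-only, so the scheme is always upgraded.
    return urlunsplit(("https", parts.netloc, path, parts.query, parts.fragment))


print(update_rpc_url("http://rpc.ift.org.mx/rpc/pdfs/96255_181211120729_7489.pdf"))
```

Running the function on an already converted URL leaves it unchanged, so a second pass over the same article should be harmless.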
Thank you, @ Raymie:. Comments like that help to keep going. In case you want to pursue it further there are 57 articles on eswiki with the links (listed). My bot doesn't have permissions there. Or we could make a bot request at [1] but I don't speak Spanish (well). -- Green C 15:28, 12 February 2020 (UTC)
These previously-reputable domains were semi-recently replaced with spam and other nasty content. Blackwell-synergy.com has already been marked as dead in IABot, but I do not believe gaylesbiantimes.com has been. Both need to have |url_status=usurped set as they are not fit to be linked to. -- AntiCompositeNumber (talk) 04:59, 13 November 2019 (UTC)
{{dead link}} but the spam link then is still clickable. There was talk about creating a new template called {{usurped}} where these free-floating usurped links could be embedded so they don't display but nothing has happened. -- Green C 16:46, 13 November 2019 (UTC)
|doi= and convert bare links to {{DOI}}. When that can't be done (say, because they're used in a labeled link, or because that would take a lot of development effort), doi.org links are the best option for an automated fix. If there's no valid DOI and no valid archive, tagging dead and moving on is the best option at the moment. Where we go from there would depend on how many are unfixable. If it's fewer than ~100, humans can review the links and take appropriate action. -- AntiCompositeNumber (talk) 17:05, 13 November 2019 (UTC)
@AntiCompositeNumber: the bot ran for Blackwell and basically eliminated the domain from mainspace, replacing the URL with |doi= or doi.org (examples: [2] [3] [4]). It can't detect {{doi}} so there are a few duplicates ([5]), and in a few cases cite templates ended up with both a doi.org URL and |doi=. It edited about 550 pages. The spam filters won't allow addition of new archive URLs, and for one reason or another the bot couldn't do some things; these remaining pages have a Blackwell domain and need manual attention:
I'll take a look at GLT next. -- Green C 16:41, 23 November 2019 (UTC)
@AntiCompositeNumber: - GayLesbianTimes.com is only in 76 mainspace articles so I set them manually - either with |url-status=usurped or, for square and bare links that have a {{webarchive}}, moving the archive URL into the square-barelink (example). Those without an archive URL had to be deleted and replaced with a non-URL citation.
There are still links in non-mainspace; maybe they should just be blanked with a quick search-replace script unless someone wants to fix them manually. It's not possible to add new archive URLs because of a blacklist filter. -- Green C 01:38, 14 February 2020 (UTC)
Web site comicbookdb.com has announced that it is shutting down as of 16 December 2019.
English-language Wikipedia has about 4,500 articles which include links to comicbookdb.com (mostly using the "comicbookdb" template).
Fabrickator ( talk) 17:53, 20 November 2019 (UTC)
{{cite web}} with |archive-url= so the bots can search for custom fit archives on a per-link basis. -- Green C 04:43, 14 February 2020 (UTC)
Since a month or two ago, springerlink.com has stopped working. Now all 3,500 links from articles are a 404 like this, served by a supposed "UltraDNS client redirection service" with "Copyright © 2001-2008 NeuStar".
The good news is that a request to the Internet Archive can reveal the current location, for instance [6] redirects to [7] (and then [8] which can be ignored). Because the new URLs contain the DOI, they can then be translated into a more permanent doi.org URL. Nemo 08:17, 6 February 2020 (UTC)
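The last step, turning a redirect target that contains a DOI into a doi.org link, might look something like the sketch below. It assumes the DOI sits in the URL path and runs to the end of it; the example URL and the helper name `doi_url_from_redirect` are made up for illustration, and the actual redirect targets ([7] above) are not reproduced here.

```python
import re
from typing import Optional
from urllib.parse import unquote, urlsplit

# Loose DOI shape: "10.", a registrant code, a slash, then the suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def doi_url_from_redirect(target_url: str) -> Optional[str]:
    """Extract a DOI from a redirect target and build a doi.org URL.

    Returns None when no DOI-like string is found in the path.
    """
    # Decode the path first, since the slash may be encoded as %2F.
    path = unquote(urlsplit(target_url).path)
    match = DOI_PATTERN.search(path)
    if match is None:
        return None
    doi = match.group(0).rstrip("/")
    return "https://doi.org/" + doi


# Hypothetical redirect target, not a real article.
print(doi_url_from_redirect("https://link.springer.com/article/10.1007/s00000-000-0000-0"))
```

A path that carries extra segments or a file extension after the DOI would need additional trimming, so any result should be checked against doi.org before it replaces a link.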
|doi= has the same DOI, so in those cases the net effect will be deletion of the |url= field (or |chapter-url= or wherever). -- Green C 21:10, 12 February 2020 (UTC)
|doi= already exists. Another 1,000 archive URL additions when no DOI could be found. Archive URL removals when a doi.org could be found. Added dead link when no archive or DOI was discovered. Operations on CS1|2 templates, square and bare links; and in Mainspace, File:, Wikipedia: and Template:. -- Green C 21:25, 13 February 2020 (UTC)
(thread moved from WP:BOTREQ by GreenC)
The old LPSN website at http://www.bacterio.net is frequently linked to from Wikipedia. Many of these links target LPSN entries for species. Because all species belong to a genus and because LPSN uses one HTML page per genus name, links to LPSN species names are links to anchors within an LPSN page for the according genus name. For instance, on /info/en/?search=Acetobacter_aceti we find the link http://www.bacterio.net/acetobacter.html#aceti to the old LPSN page.
As part of an agreement between the old LPSN maintainer, Aidan C. Parte, and the Leibniz Institute DSMZ, LPSN has been taken over by DSMZ to ensure long-term maintenance (see also announcement here). In the course of this takeover, a new website was created. In contrast to the old LPSN website, the new LPSN website at https://lpsn.dsmz.de (currently https://lpsn-dev.dsmz.de) uses individual pages for species names. We will employ the following mapping:
(1) the domain http://www.bacterio.net is permanently redirected to https://lpsn.dsmz.de;
(2) the page address acetobacter.html is mapped to genus/acetobacter, which is the page for the genus Acetobacter on the new LPSN website.
This means, however, that http://www.bacterio.net/acetobacter.html#aceti is mapped to https://lpsn.dsmz.de/genus/acetobacter and not to https://lpsn.dsmz.de/species/acetobacter-aceti, which is the page for the species on the new LPSN website, as it should be. The reason for this limitation is that the anchor aceti is not even transferred by the browser and thus cannot be processed by the website. While links on https://lpsn.dsmz.de/genus/acetobacter are present that lead to https://lpsn.dsmz.de/species/acetobacter-aceti, it would be more convenient for the user if http://www.bacterio.net/acetobacter.html#aceti was transferred to a link that leads directly to https://lpsn.dsmz.de/species/acetobacter-aceti.
As LPSN URLs are stored in Wikidata (LPSN), this change should be a doable task with the help of a bot. Therefore we are kindly asking for help to modify all Wikipedia links to LPSN species pages as described above. Tobias1984: you did a great job in the past, helping us with BacDive. Is there a chance that you could help us again with this issue? -- L.C.Reimer
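The requested mapping could be sketched as below. This is a minimal illustration, assuming the old links have exactly the shape `http://www.bacterio.net/<genus>.html#<epithet>` given in the example; the function name `map_lpsn_url` is hypothetical, and anything that does not match that shape is left for manual handling.

```python
from typing import Optional
from urllib.parse import urlsplit

NEW_BASE = "https://lpsn.dsmz.de"


def map_lpsn_url(old_url: str) -> Optional[str]:
    """Map an old bacterio.net link to the new LPSN site (hypothetical helper).

    A link with an anchor maps to the species page; a link without one
    maps to the genus page. Unrecognised shapes return None.
    """
    parts = urlsplit(old_url)
    if parts.netloc.lower() != "www.bacterio.net":
        return None
    page = parts.path.lstrip("/")
    if "/" in page or not page.endswith(".html"):
        return None
    genus = page[: -len(".html")].lower()
    if not genus.isalpha():
        return None
    # The anchor never reaches the server, so it has to be read here.
    epithet = parts.fragment.lower()
    if not epithet:
        return f"{NEW_BASE}/genus/{genus}"
    if not epithet.isalpha():
        return None
    return f"{NEW_BASE}/species/{genus}-{epithet}"


print(map_lpsn_url("http://www.bacterio.net/acetobacter.html#aceti"))
```

Because the anchor is handled on the editing side rather than by the website, this kind of rewrite only helps for links that are changed in the wikitext itself.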
@ L.C.Reimer: I can help with this but wanted to get the request moved to the right place. -- Green C 03:27, 14 February 2020 (UTC)
@ GreenC: We would appreciate your help very much. We will launch the new site and activate the redirect at the beginning of next week. I will leave a note here when it is done. -- L.C.Reimer
{{taxonbar}} which pulls the URL from Wikidata. I am able to fix the first type, but not the second. For Wikidata requests you could try [13]. The other problem is that my processes only update English Wikipedia (and Commons), and since there are about 300 language wikis it is a challenge to make Wikipedia-wide changes: each language wiki is its own organization where permissions and tools customized for that language must be secured, e.g. ar.wikipedia.org requires tools customized for the Arabic language and permission from the Arabic community to make these changes with a bot. I would suggest, if you are able, that you create and maintain redirects. Nevertheless, if you would like to convert the in-wiki links on Enwiki I can do that. -- Green C 23:23, 18 February 2020 (UTC)
{{taxonbar}}. -- Green C 00:53, 19 February 2020 (UTC)
L.C.Reimer, a couple new issues. http://www.bacterio.net/a/acetoanaerobium.html has an extra "/a/" in the path (there is "/m/" and other letters). Some links have a leading "-" like http://www.bacterio.net/-number.html. I guess for now it will verify the new URL is working with a header check before making the change or otherwise leave as-is; these look like low-volume exceptions.
http://www.bacterio.net/a/acetoanaerobium.html --> http://www.bacterio.net/acetoanaerobium.html --> https://lpsn.dsmz.de/genus/acetoanaerobium. -- Green C 20:01, 19 February 2020 (UTC)
Extended content
HTTP/1.1 301 Moved Permanently
Date: Wed, 19 Feb 2020 18:32:22 GMT
Server: Apache
Location: https://lpsn.dsmz.de/bacillales.html
Content-Length: 244
Content-Type: text/html; charset=iso-8859-1
Via: 1.1 varnish (Varnish/6.3), 1.1 varnish (Varnish/6.3)
X-Cache-Hits: 0
X-Cache: MISS
Age: 0
Connection: keep-alive

HTTP/1.1 301 Moved Permanently
Date: Wed, 19 Feb 2020 18:32:23 GMT
Server: Apache/2.4.37 (Red Hat Enterprise Linux) OpenSSL/1.1.1c mod_fcgid/2.3.9
X-Powered-By: PHP/7.3.5
Location: /order/bacillales
Content-Length: 0
Content-Type: text/html; charset=UTF-8

HTTP/1.1 200 OK
Date: Wed, 19 Feb 2020 18:32:23 GMT
Server: Apache/2.4.37 (Red Hat Enterprise Linux) OpenSSL/1.1.1c mod_fcgid/2.3.9
X-Powered-By: PHP/7.3.5
Vary: Accept-Encoding
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
The second Location: line contains /order/bacillales which is added onto the domain name found in the first Location line. There are probably other paths besides /order/ we don't know about yet. -- Green C 19:44, 19 February 2020 (UTC)
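Two of the points above, the extra single-letter directory in some old paths and the relative second Location header, could be handled along these lines. This is only a sketch: both helper names are hypothetical, the starting URL in the example is assumed from the header dump rather than stated in it, and no network request is made (the Location values are passed in by hand).

```python
import re
from typing import List
from urllib.parse import urljoin

# Matches an old-site URL whose path starts with a one-letter directory
# such as "/a/" or "/m/".
LETTER_DIR = re.compile(r"^(https?://www\.bacterio\.net)/[a-z]/", re.IGNORECASE)


def strip_letter_directory(url: str) -> str:
    """Drop a leading single-letter directory from an old LPSN URL."""
    return LETTER_DIR.sub(r"\1/", url, count=1)


def follow_locations(start_url: str, locations: List[str]) -> str:
    """Resolve a chain of Location header values to the final URL.

    A relative value such as "/order/bacillales" is joined onto the
    scheme and host of the URL reached by the previous hop.
    """
    current = start_url
    for location in locations:
        current = urljoin(current, location)
    return current


print(strip_letter_directory("http://www.bacterio.net/a/acetoanaerobium.html"))
print(follow_locations(
    "http://www.bacterio.net/bacillales.html",  # assumed starting URL
    ["https://lpsn.dsmz.de/bacillales.html", "/order/bacillales"],
))
```

Resolving the chain this way avoids hard-coding path prefixes such as /order/, which matters if other prefixes turn up later.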
@ L.C.Reimer: The bot has completed. It converted 11,355 links in 5,718 articles (the previous link count of 6,487 is incorrect.) All links were tested as working (header status code 200). Some typical diffs:
It was unable to convert 1,240 links because the new URL doesn't work (header status 404). Can provide a list of those if you want, most of them appear to be related to Streptomyces. -- Green C 02:29, 20 February 2020 (UTC)
Found these: [18] -- Green C 14:47, 20 February 2020 (UTC)