The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Archive.is is an archiving service similar to sites like Webcite and the Wayback Machine, offering different levels of service up to and including snapshots that are retained regardless of later changes in a site's robots.txt file, which the Wayback Machine can abandon (potentially delaying rather than removing the potential for LinkRot), while Webcite has presented itself as having an uncertain long-term future tied to funding. No issues have been found with the quality of the snapshots provided at archive.is.
In August 2013, a bot called User:RotlinkBot, created by Rotlink, began linking Wikipedia articles to the new Archive.is service. This bot was not approved, and was therefore subsequently blocked. This block was procedural, and made based on the lack of approval, not the quality of RotlinkBot's edits.
Following this block, edits matching edits from the bot, including the edit summaries, were made from hundreds of IPs, residential and business, from three different Indian states, Italy, Hong Kong, Vietnam, Bulgaria, Qatar, Latvia, Hungary, Slovakia, Romania, Brazil, Argentina, Portugal, Spain, France, Mexico, Austria, and South Africa. Based on fears that the IPs were not being used legally, these IPs, and User:Rotlink, self-identified as the owner of archive.is (Note: struck 3 Oct 2014 because this is an unverifiable claim with no presented evidence supporting it, per discussion in 3.7), were subsequently blocked. Rotlink has not commented on any of the blocks.
The previous RFC regarding Archive.is concluded that the site should be added to the blacklist and that all existing links to archive.is should be removed. In a few cases, the removal of archive.is links has resulted in LINKROT.
Archive.is has never been added to the spam blacklist because the use of the blacklist would require the links to be removed before unrelated edits could be made to the article. Instead, an edit filter has been applied which prevents additions of the link, but does not prevent editing articles which simply contain the link.
The concerns about the potential for malware raised in the RFC have not materialized at this point, leading to arguments as to whether those fears were well-founded. An effort to get a bot approved to implement the RFC result stalled, indicating that the community may no longer believe the block to be warranted.
Archive.is does use advertising. (Note: struck 1 Oct 2014 because this is an unverifiable claim with no presented evidence supporting it, per discussion on talk page.) The previous discussion showed that some editors considered this to be a major issue, but there was no strong consensus either for or against the site based on this.
Based on the questions of consensus raised during Wikipedia:Bots/Requests for approval/Archivedotisbot, the community should discuss whether the previous consensus is still in force.
"I think we will, at some point, need a proper archiving solution." - When that happens, then by all means provide full support for it, I personally would welcome such a thing. But at the moment, we don't have anything like that. Why not make do with what we have in front of us? There's the saying that "beggars can't be choosers", and we're all beggars here; if you're not one, then I invite you to drop a few thousand dollars on a budget-performance RAID array server to host your own archival service, and provide the non-commercial "perfect" service that you've so keenly described for all of us. If no one's going to do it, why not take the initiative yourself? Don't want to? Then making do with archive.is is a plausible beggar's solution until someone comes along and wants to. -- benlisquare T• C• E 02:09, 26 July 2014 (UTC)
Winston Churchill: "I think you really should reconsider your position, Mr. Chamberlain."
Neville Chamberlain: "Do you have any proof that Hitler will invade Poland next year? I won't be convinced until you prove to me that this will happen in the future. What's the matter, you can't do that? I guess that settles it then. Peace in our time!"
Do you realise how silly you sound? -- benlisquare T• C• E 05:01, 27 July 2014 (UTC)
{{cite|…}} and friends changed how deadurls are handled. Currently |deadurl=no is required to have the live link first if an archive link is present. I kinda wish the option had been set up differently. Say, have a cite error if url and archive url are provided and deadurl is not set to "yes" or "no". If "deadurl" is set to no, I'd love for any archive link to be displayed with a CSS style that is hidden by default, which would allow custom style sheets to show them for users that are interested in them. PaleAqua ( talk) 07:53, 3 July 2014 (UTC)
Darkwarriorblake, your opening statement: "Archive.is has never been added to the spam blacklist because the use of the blacklist would require the links to be removed before unrelated edits could be made to the article. Instead, an edit filter has been applied which prevents additions of the link, but does not prevent editing articles which simply contain the link." is a misinterpretation of my remarks on the request for blacklisting and how blacklisting/edit-filtering works. The way the edit-filter is currently set up gives exactly the same functionality that the blacklist would give (when the blacklisting was put on hold, the filter was more restrictive than the blacklisting would be). My reason not to blacklist at that time was that there were so many links that accidental removals could hurt editing experiences - unfortunately, the edit-filter is now having the same effect (though edit-filter-managers are trying to mitigate that). IMHO, blacklisting can (should?) now be implemented and the edit-filter disabled (the latter is a bigger strain on the server) per the standing consensus of the previous RfC. If this RfC then overturns that decision then blacklisting could be removed. -- Dirk Beetstra T C 05:02, 3 July 2014 (UTC)
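PaleAqua's suggestion above about |deadurl= can be read as a small decision rule. The following is only a sketch of that proposal in Python, not how the citation templates are actually implemented; the function name `order_links` and the returned pair are hypothetical, and putting the archive copy first for deadurl=yes is an assumption inferred from the comment.

```python
from typing import Optional, Tuple


def order_links(url: Optional[str], archive_url: Optional[str],
                deadurl: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (primary_link, secondary_link) for a citation.

    Hypothetical helper: when both a live URL and an archive URL are
    supplied, deadurl must be set explicitly to "yes" or "no".
    """
    if not url or not archive_url:
        # Only one link (or none) was given, so there is nothing to order.
        return (url or archive_url, None)
    if deadurl not in ("yes", "no"):
        # The proposed cite error for an unset or invalid deadurl value.
        raise ValueError('cite error: set deadurl to "yes" or "no" '
                         "when both url and archive url are given")
    if deadurl == "no":
        # Live link first; the archive link is secondary (and could be
        # hidden by default through a CSS class).
        return (url, archive_url)
    # Assumption: a dead link leads with the archive copy.
    return (archive_url, url)
```

With both links present and no deadurl value, the helper raises instead of guessing, which is the behaviour the comment asks for.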
People, what are you talking about? Have you ever looked at the Alexa graphs during your investigation? Was Rotlink's job really useful to archive.is, as you assume unconditionally? Just check the graph and then tell me who is the enemy of archive.is and who are paid editors. -- Preceding unsigned comment added by 83.245.226.111 ( talk • contribs)
I continue to feel frustrated by this situation, and the apparent failure of the WMF to be more proactive about web archives. Clearly a significant portion of the community feels a need for proactively archiving linked references in advance, in order to deal with the potential for link rot. But there seem to be no widely acceptable solutions. Whether any third-party site can be relied on never to spam or place advertisements is questionable. So why not this solution: Use the file namespace for link archives. Why can't an editor simply upload an image of the page that they want to archive, and simply store it—either locally or on Commons? Wbm1058 ( talk) 16:52, 6 July 2014 (UTC)
I want to cache all 20 million external links so references can be done in a post-processing stage where details (title, author, date) can be cross-verified, and lots of other goodies. The Foundation employees have called 24 TB excessive (it's not) and are not working with me. -- Dispenser 20:40, 17 February 2014 (UTC)
I'm not sure if the administrators already know this, or if it's even the right place to let them know, but I've just noticed that archive.is changed its domain name to archive.today. -- Mayast ( talk) 20:09, 13 July 2014 (UTC)
As the discussion gets heated, I'd recommend everyone to comment on content and remain civil. Some comments, like "he knows nothing about content creation", "absurd, hysterical and paranoid", or "paranoid hysteria" are not good arguments, and not helpful to the discussion. Forbidden User ( talk) 16:45, 21 July 2014 (UTC)
nothing more to see here, just some incivility. -- Mdann 52 talk to me! 17:17, 12 August 2014 (UTC)
The following discussion has been closed. Please do not modify it.
This is an attempt to review the allegations and analyze them with Archive.is's own words and the evidence already gathered. The most common argument that Rotlink must be Archive.is is that "other editors said it so it must be true". The claim has repeatedly been rejected for lack of evidence, and also by Archive.is's operator, Denis. The following comes from the e-mails from the Wikipedia:Archive.is RFC discussion between Lexein and Denis of Archive.is. Of particular concern are:
Thank you for your reasoning here, so that I can understand your point better. So, you think editors are sheep that follow the one before them. No, they think those "leaders" have sufficient reasons and proof. There are too many unsupported statements that I don't want to waste time citing. For example, RotLink's act at least raised the notability of archive.is, and no matter how small, it helps the rankings, and that's it: it is gaining off our work, and proportion does not matter. "It is clear that the person behind Rotlink was already familiar with Wikipedia's policies and procedures - but got fed up and attempted to force it through. Resulting in the situation. Note that the person mass-removing Archive.is was using a "hacked" version of AWB to bypass the requirements for permission and socking bad enough that a special abuse filter, edit filter 620, had to be created and reactivated because of continued abuse. Neither case was a botnet, just a committed proxy/IP hopping single editor." I'm not joking, this is a very appealing argument, made on zero basis. You said argumentum ad populum is not proof, so take your own advice. The same goes for the writing pattern issue, unless you're an expert on it. "Do not panic! I do not plan to stop the service, to delete or alter the snapshots, to put ads or malware on it, etc." He lies in the first sentence? What a good liar. In their FAQ they don't have the courage to say so, shown by the stammering statement with total uncertainty and such. Anyway, do you think many people around still trust Denis? If he wants trust, he should obey robots.txt, make the funding system clear, and make clear policies and regulations on content. P.S. The things you raise are not as important as what archive.is is: a page with absolutely no copyright safeguard (unsafe for editors, particularly new ones, to use), no content regulation (so he can prepare for jail, and the site going down), and a proven (not to you, I know; to me it's proven, the WP:QUACKs are already enough) usage of Wiki for promotion. Forbidden User ( talk) 18:17, 27 July 2014 (UTC)
Paleaqua is correct, "That still doesn't connect the insertion of the links to archive.is itself. That is what I would like to see proof or even just good evidence for." - Rotlink is a problem and those additions should have been nuked as Archive.is noted or all "suspect" links from Archive.is moved as freely given. The whole Archive.is matter breaks down once you realize that Wikipedia is costing Archive.is and they don't care what we do, but are willing to help our decision if need be. As I tried to state before, existing policy and procedure guides us, but I'm not going to condemn Archive.is for something they did not do. Try this whole thing when you separate Rotlink and Archive.is and you get what should have been done from the beginning - purge the Rotlink additions and CU those unusual pages Rotlink altered with unusual view counts. Rotlink's activity was a clear SEO and page ranking run - and the unusual activity spilled over onto Archive.is's hits. I don't think it was out to crash Archive.is - but it could have if gone unchecked by Kww. ChrisGualtieri ( talk) 04:05, 29 July 2014 (UTC)
And so some links will actually be clicked. If the nofollow doesn't necessarily nullify the effect of links on search engine ranking (as proved by someone with linked evidence, correct me if I'm wrong), then archive.is is profiting off our free work (here "profit" is not confined to money). Forbidden User ( talk) 13:31, 31 July 2014 (UTC)
Perhaps we can all stop trying to persuade the admin to the closures we individually want? Anyway, no one knows the exact truth. Stop, please. Forbidden User ( talk) 16:36, 10 August 2014 (UTC)
Another remark is that you seem to love doing things you allege others to be doing... like how you made false claims of knowing the "truth" here, and how you antagonise editors who are against archive.is based on their own reasonings/concerns/judgement on the event (Quote: "Frankly, I don't blame them... because people are actively out to bring them down and twist every word ever said.") and portray archive.is as some "innocent victims", again with no evidence of the kind you promptly request from others. Please don't make divisive bad-faith allegations (WP:NOTBATTLE). There is no persuasive reasoning behind the claim, not to mention the "truth". The same applies to the claim of someone's conversation being the "truth". Forbidden User ( talk) 16:32, 12 August 2014 (UTC)
More blog info from before. All come from the Archive.today blog.
Then, on seeing the error, the user will archive the page indirectly, first feeding the url to an url shortener (bit.ly, …) or to an anonimizer (hidemyass.com, ..) or to another on-demand archive (peeep.us, …), and then archive the same content from another url, thus bypassing the robots.txt restrictions. So, this check will not work the same way as it works with IA Archiver (which is actually a machine which takes decisions)."
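The robots.txt check the quote refers to is decided per URL, which is why the same content reached through a different URL is not covered by the original site's rules. A minimal Python sketch using the standard library's `urllib.robotparser` follows; the host names, the rules and the `may_archive` helper are made up for illustration and do not describe how archive.today or the Wayback Machine actually implement their checks.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents keyed by host; a real crawler would fetch these.
ROBOTS_BY_HOST = {
    "news.example": ["User-agent: *", "Disallow: /"],
}


def may_archive(url: str, user_agent: str = "archiver") -> bool:
    """Return True if the robots.txt rules known for the URL's host allow fetching it."""
    host = urlsplit(url).hostname or ""
    rules = ROBOTS_BY_HOST.get(host)
    if rules is None:
        # No rules are known for this host, so nothing forbids the fetch.
        return True
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(user_agent, url)
```

In this sketch a page on `news.example` is refused, while a URL on any other host passes the check even if it ends up serving the same content, which is the limitation the blog post describes.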
As mentioned before, Denis did not operate Rotlink, but was aware of the bot's existence. This was confirmed through the blog of Archive.is.
He said "it is my friends' project" in Sep 2012 (1 year before the proxy/bot incident) [5]. 87.69.97.159 ( talk) 11:11, 29 August 2014 (UTC)
The domains archive.is and archive.today belong to the same private person in Prague. There does not seem to be any organization or anybody answerable to anyone behind the site. We do not know anything about archiving routines and security and consequently cannot determine if the copies of the pages are genuine exact copies.
Furthermore the site has a function showing which Wikipedia pages are linked to the "archived" page. This page claims that the following pages are linked to it:
I have tried to verify that the English pages link to the "archive" page, but failed to verify it (I could of course have missed something). What worries me more is that this could possibly, for some undetermined reason, be the purpose of the domains: to claim a relationship with Wikipedia. Furthermore I tried to archive the "archive" page, but WebCite was unable to do so because it received a Page Not Found error from the website concerned. If this was a trustworthy attempt at actually filing a reliable and legitimate copy, I can't really see why it is not possible to archive it.
My conclusion is that this is not a reliable archive, and that using archives that are not reliable could possibly damage Wikipedia. In my opinion links to these sites should be prohibited and removed. --ツツ Dyveldi ☯ prattle ✉ post 19:51, 29 August 2014 (UTC)
The backlinks that archive.today provides have nothing to do with Wikipedia. It lists all incoming URL links to that particular archive, which include blogs, forums, private wikis, and yes, Wikipedia. As an example, see https://archive.today/lM1uP . -- benlisquare T• C• E 07:56, 30 August 2014 (UTC)
Thanks for the answers. I wondered about the resistance to being archived. My position though is based on the fact that there is no organization. This is a private person with an address in Prague. There is nothing controlling how, where and what he archives or if this is a proper copy. Anything could be done to or happen to these links and nobody would be responsible for checking. Why serious sponsors should pay to maintain this I do not understand, and as far as I can see nobody knows or has any control of any money here. We do not even know which country the person pays tax to. An organization on the other hand does regularly have rules and regulations and a governing body of people. To call something an archive I do expect a minimum of control by and answerability to someone it is possible to identify. --ツツ Dyveldi ☯ prattle ✉ post 13:23, 30 August 2014 (UTC)
People should be encouraged to use archive.org and webcitation first. In the case where that is not possible, then they should be able to consider the option of using archive.today. It wouldn't be the first archive to go to, which means that Wikipedia projects wouldn't be swamped with archive.today URLs, slightly alleviating the concerns that there is a conspiracy to pair archive.today and Wikipedia as business partners. If an alternative archive of a page is available, it seems reasonable to replace existing archive.today URLs, but deletion of the URLs wholesale only intensifies the problem of having all the eggs in the same two baskets. We need to consider the problem of comparing lesser evils here. -- benlisquare T• C• E 03:24, 31 August 2014 (UTC)
Comment on ownership and reliability of archiving domains.
Googling in this case only made me more convinced that this site is under no control we can or should rely on.
What is possible though, if all these links are deleted, is to leave the original url. If this domain is banned it is possible to make a small manual to explain how to find an archived url at archive.is . Getting a look at the page can help editors to find the original page, which quite often is not gone but has moved to a different url. The worst problem with link rot is after all that quite often the name of the article is not given, or even when or where it was found, which makes it quite impossible to find out what the reference was. --ツツ Dyveldi ☯ prattle ✉ post 19:11, 1 September 2014 (UTC)
Not here to promote other archive sites. Policies of other sites are not our concern.
The following discussion has been closed. Please do not modify it.
Should the nature of the response to a privacy request be determined by the strength of our will to support the request, or by the strength of the request to prevent us from ignoring it? If robots.txt requests that archivers do not archive the site, then it is on us to assume that they know what they are doing. They, the websites with the robots.txt files, have elected not to make themselves available for this process. That equates to a request not to proceed with use of the content, not to use the content in this way, or even access it in this way. The decision has been made by the owners and providers of the content. This site does follow owners' instructions not to use material, without evaluation of any sort, except where that material may be a direct historical reference in and of itself, and for that kind of reference, archive sites are not required. Because to be verifiably historical to reference, by our Wikipedian standards, it must be published widely, and if that's not what you thought it was, just read to the end and click the link and forget about these companies who saw archive.org and bought up all of the archive web names. It is a shame if any content is lost adhering to freedom policies, but if something requires a worldly change to make it freely available, that change does not start here, and that doesn't sound very cool or anything, but it is part of the rock upon which the site is founded. You'll see. We didn't need it. Also for your interest, Archive.org is much more than snapshots of web pages. It is not a race to become the archive of everything that ever appeared on the internet. It's an officially licensed book and film library, based on open freely licensed material, and much in line with the goals of the Wikipedia project, so you should all go there and chip them one and read it. No consensus to change. Archive.is is an erroneous addition. Move to close. ~ R. T. G 13:53, 31 August 2014 (UTC)
I guess he means archive.org is more in line with our values, and he thinks that robots.txt is a privacy request which should be respected. Privacy/copyright has been put forward by several people, however some others think that archiving does not interfere with privacy/anything can be done as long as it's legal, so continuing would be like an ideological battle. Sometimes an editor doesn't mean to put fact forward. Only arrogance is demonstrated in calling others inferior. Forbidden User ( talk) 10:17, 1 September 2014 (UTC)
Cool down a bit. Written content is easily misunderstood. I think RTG's latest comment is somewhat improved. It might be more meaningful to look at. Forbidden User ( talk) 17:57, 2 September 2014 (UTC)
By the way I'm taking a Wikibreak, good day. Forbidden User ( talk) 17:57, 2 September 2014 (UTC)
@PaleAqua and Kww: Could you two or others knowing the process help me oversight those edits of my replacing the 122. IP's signature with mine? Forbidden User ( talk) 17:57, 2 September 2014 (UTC)
It is now a full three months. Consensus is not clear. Many editors want to see another alternative archive site, but there are serious challenges to this site's suitability for WP, and those challenges have been rebutted only with a willingness to overlook, and that unfortunately is not good enough. The site was added maliciously. The background of the site is a total mystery. These two items alone are enough to close discussion without satisfactory restitution. WP content must be reliably sourced. This applies to all content. That item must be satisfied or there can be no outcome from a discussion. Sorry.
Notes:
It is now a full three months. Consensus has turned toward supporting Archive.today. No serious objections to use of archive.today for WP have been made. Many editors want to have archive.today functionality. The only rebuttal requires a willingness to overlook the damage banning the site is doing to Wikipedia and that unfortunately is not good enough. If I don't know the meaning of a big word like restitution, I shouldn't use it. WP content must be verifiable. This applies to all content. There can be no compromise on that based on overblown concerns about who to blame for old behavior or ghosts of the past. (My point being that the section above this one is incredibly biased, and two can play that game.) -- Elvey ( t• c) 01:29, 2 October 2014 (UTC)
I just created an article on near-Earth asteroid 2014 SC324. Since the observation arc is only 2 days long today and will be longer tomorrow, I used archive.today to make a capture of the public domain JPL page that archive.org will not capture. Too bad reference #5 has to be coded as if I am referencing some obscure offline source. -- Kheider ( talk) 20:30, 2 October 2014 (UTC)
The fact that it is used is not related to the fact that it does not establish reliability. People know it is used, or they wouldn't ask for it not to be used. We could just copy and paste everything on the internet, but that's 4chan, that's Facebook; this site is Wikipedia, and it is about reliable encyclopaedic information. ~ R. T. G 10:35, 3 October 2014 (UTC)
Webcite not accessible 7 Oct 2014 but we still want to limit our options by blocking Archive.today? -- Kheider ( talk) 17:57, 7 October 2014 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Archive.is is an archiving service similar to sites like Webcite and the Wayback Machine, offering different levels of service up to and including snapshots that are retained regardless of modern changes in a sites robots.txt file, which the Wayback Machine can abandon (potentially delaying rather than removing the potential for LinkRot), while Webcite has presented itself as having an uncertain long term future tied to funding. No issues have been found with the quality of the snapshots provided at archive.is.
In August 2013, a bot called User:RotlinkBot, created by Rotlink began linking Wikipedia articles to the new Archive.is service. This bot was not approved, and was therefore subsequently blocked. This block was procedural, and made based on the lack of approval, not the quality of the RotlinkBot's edits.
Following this block, edits matching edits from the bot, including the edit summaries, were made from hundreds of IPs, residential and business, from three different Indian states, Italy, Hong Kong, Vietnam, Bulgaria, Qatar, Latvia, Hungary, Slovakia, Romania, Brazil, Argentina, Portugal, Spain, France, Mexico, Austria, and South Africa. Based on fears that the IPs were not being used legally, these IPs, and User:Rotlink, self-identified as the owner of archive.is (Note: struck 3 Oct 2014 because this is an unverifiable claim with no presented evidence supporting it, per discussion in
3.7), were subsequently blocked. Rotlink has not commented on any of the blocks.
The previous RFC regarding Archive.is concluded that the site should be added to the blacklist and that all existing links to archive.is should be removed. In few cases, the removal of archive.is links has resulted in LINKROT.
Archive.is has never been added to the spam blacklist because the use of the blacklist would require the links to be removed before unrelated edits could be made to the article. Instead, an edit filter has been applied which prevents additions of the link, but does not prevent editing articles which simply contain the link.
The concerns about the potential for malware raised in the RFC have not materialized at this point, leading to arguments as to whether those fears were well-founded. An effort to get a bot approved to implement the RFC result stalled, indicating that the community may no longer believe the block to be warranted.
Archive.is does use advertising. (Note: struck 1 Oct 2014 because this is an unverifiable claim with no presented evidence supporting it, per discussion on talk page.) The previous discussion showed that some editors considered this to be a major issue, but there was no strong consensus either for or against the site based on this.
Based on the questions of consensus raised during Wikipedia:Bots/Requests for approval/Archivedotisbot, the community should discuss whether the previous consensus is still in force
"I think we will, at some point, need a proper archiving solution." - When that happens, then by all means provide full support for it, I personally would welcome such a thing. But at the moment, we don't have anything like that. Why not make do with what we have in front of us? There's the saying that "beggars can't be choosers", and we're all beggars here; if you're not one, then I invite you to drop a few thousand dollars on a budget-performance RAID array server to host your own archival service, and provide the non-commercial "perfect" service that you've so keenly described for all of us. If no one's going to do it, why not take the initiative yourself? Don't want to? Then making do with archive.is is a plausible beggar's solution until someone comes along and wants to. -- benlisquare T• C• E 02:09, 26 July 2014 (UTC)
Winston Churchill: "I think you really should reconsider your position, Mr. Chamberlain."
Neville Chamberlain: "Do you have any proof that Hitler will invade Poland next year? I won't be convinced until you prove to me that this will happen in the future. What's the matter, you can't do that? I guess that settles then. Peace in our time!"
Do you realise how silly you sound? -- benlisquare T• C• E 05:01, 27 July 2014 (UTC)
{{
cite|…}}
and friends changed how deadurls are handled. Currently |deadurl=no
is required to have the live link first if a archive link is present. I kinda wish the option had been set up differently. Say have a cite error if url and archive url are provided and deadlink is not set to "yes" or "no". If "deadurl" is set to no, I'd love for any archive link to be displayed with a CSS style that is hidden by default which would allow for custom style sheets to be able to show them for users that are interested in them.
PaleAqua (
talk)
07:53, 3 July 2014 (UTC)Darkwarriorblake your opening statement: "Archive.is has never been added to the spam blacklist because the use of the blacklist would require the links to be removed before unrelated edits could be made to the article. Instead, an edit filter has been applied which prevents additions of the link, but does not prevent editing articles which simply contain the link." is a misinterpretation of my remarks on the request for blacklisting and how blacklisting/edit-filtering works. The way the edit-filter is currently set-up is exactly the same functionality that the blacklist would give (when the blacklisting was put on hold, the filter was more restrictive than the blacklisting would be). My reason not to blacklist at that time was that there were so many links that accidental removals could hurt editing experiences - unfortunately, the edit-filter is now having the same effect (though edit-filter-managers are trying to mitigate that). IMHO, blacklisting can (should?) now be implemented and the edit-filter disabled (the latter is a bigger strain on the server) per the standing consensus of the previous RfC. If this RfC then overturns that decision then blacklisting could be removed. --
Dirk Beetstra
T
C
05:02, 3 July 2014 (UTC)
People, what are you talking about? Have you ever looked up at Alexa graphs during your investigation? Was Rotlink's job really useful to archive.is as you assume uncontionally? Just check the graph and then tell me, who is enemy of archive.is and who are paid editors.— Preceding unsigned comment added by 83.245.226.111 ( talk • contribs) This template must be substituted.
I continue to feel frustrated by this situation, and the apparent failure of the WMF to be more proactive about web archives. Clearly a significant portion of the community feels a need for proactively archiving linked references in advance, in order to deal with the potential for link rot. But there seem to be no widely acceptable solutions. Whether any third-party site can be relied on never to spam or place advertisements is questionable. So why not this solution: Use the file namespace for link archives. Why can't an editor simply upload an image of the page that they want to archive, and simply store it—either locally or on Commons? Wbm1058 ( talk) 16:52, 6 July 2014 (UTC)
I want to cache all 20 million external link so references can be done in a post processing stage where details (title, author, date) can be cross verified and lots of other goodies. The foundation employee have calling 24 TB excessive (it's not) and are not working with me.— Dispenser 20:40, 17 February 2014 (UTC)
I'm not sure if the Administrators already know that, or if its even the right place to let them know, but I've just noticed that archive.is changed its domain name to archive.today. — Mayast ( talk) 20:09, 13 July 2014 (UTC)
As the discussion goes heated, I'd recommend everyone to comment on content and remain civil. Some comments, like "he knows nothing about content creation", "absurd, hysterical and paranoid", or "paranoid hysteria" are no good arguments, and nothing helpful to the discussion. Forbidden User ( talk) 16:45, 21 July 2014 (UTC)
nothing more to see here, just some incivility. -- Mdann 52 talk to me! 17:17, 12 August 2014 (UTC) |
---|
The following discussion has been closed. Please do not modify it. |
This is an attempt to review the allegations and analyze them with Archive.is's own words and the evidence already gathered. The most common complaint that Rotlink must be Archive.is is that "other editors said it so it must be true". The claim has been repeatedly been rejected on the lack of evidence, but also by Archive.is's operator Denis. The following comes from the e-mails from Wikipedia:Archive.is RFC discussion between Lexein and Denis of Archive.is. Of particular concerns are:
Thank you for your reasoning here, so that I can understand your point better. So, you think editors are being sheep that follow the one before them. No, they think those "leaders" have sufficient reasons and proof. There are too many unsupported statements for me to waste time citing them all. For example, RotLink's act at least raised the notability of archive.is, and no matter how small, it helps the rankings, and that's it, it is gaining from our work - proportion does not matter. "It is clear that the person behind Rotlink was already familiar with Wikipedia's policies and procedures - but got fed up and attempted to force it through. Resulting in the situation. Note that the person mass-removing Archive.is was using a "hacked" version of AWB to bypass the requirements for permission and socking bad enough that a special abuse filter, edit filter 620, had to be created and reactivated because of continued abuse. Neither case was a botnet, just a committed proxy/IP hopping single editor." I'm not joking, this is a very appealing argument, made with zero basis. You said argumentum ad populum is not proof, so take your own advice. The same goes for the writing pattern issue, unless you're an expert on it. "Do not panic! I do not plan to stop the service, to delete or alter the snapshots, to put ads or malware on it, etc." He lies in the first sentence? What a good liar. In their FAQ they don't have the courage to say so, shown by the stammery statement with total uncertainty and such. Anyway, do you think many people around still trust Denis? If he wants trust, he should obey robots.txt, make the funding system clear, and make clear policies and regulations on content. P.S. 
The things you raise are not as important as what archive.is is - a page with absolutely no copyright safeguard (unsafe for editors, particularly new ones, to use), no content regulation (so he can prepare for jail, and site down), and a proven (not to you, I know, to me it's proven, the WP:QUACKs are already enough) usage of Wiki for promotion. Forbidden User ( talk) 18:17, 27 July 2014 (UTC)
Paleaqua is correct, "That still doesn't connect the insertion of the links to archive.is itself. That is what I would like to see proof or even just good evidence for." - Rotlink is a problem and those additions should have been nuked as Archive.is noted or all "suspect" links from Archive.is moved as freely given. The whole Archive.is matter breaks down once you realize that Wikipedia is costing Archive.is and they don't care what we do, but are willing to help our decision if need be. As I tried to state before, existing policy and procedure guides us, but I'm not going to condemn Archive.is for something they did not do. Try this whole thing when you separate Rotlink and Archive.is and you get what should have been done from the beginning - purge the Rotlink additions and CU those unusual pages Rotlink altered with unusual view counts. Rotlink's activity was a clear SEO and page ranking run - and the unusual activity spilled over onto Archive.is's hits. I don't think it was out to crash Archive.is - but it could have if gone unchecked by Kww.
ChrisGualtieri ( talk) 04:05, 29 July 2014 (UTC)
And so some links will actually be clicked. If the nofollow doesn't necessarily nullify the effect of links on search engine ranking (as proved by someone with linked evidence, correct me if I'm wrong), then archive.is is profiting off our free work (here "profit" is not confined to money). Forbidden User ( talk) 13:31, 31 July 2014 (UTC)
Perhaps we can all stop trying to persuade the admin toward the closures we individually want? Anyway, no one knows the exact truth. Stop, please. Forbidden User ( talk) 16:36, 10 August 2014 (UTC) Another remark is that you seem to love doing things you allege others to be doing... like how you made false claims of knowing the "truth" here, and how you antagonise editors who are against archive.is based on their own reasonings/concerns/judgement on the event (Quote: "Frankly, I don't blame them... because people are actively out to bring them down and twist every word ever said.") and protagonise archive.is as some "innocent victims", again with no evidence that you promptly request from others. Please don't make divisive bad faith allegations (see WP:NOTBATTLE). There is no persuasive reasoning to the claim, not to say the "truth". It applies to the claim of someone's conversation being the "truth". Forbidden User ( talk) 16:32, 12 August 2014 (UTC) |
More blog info from before. All come from the Archive.today blog.
Then, on seeing the error, the user will archive the page indirectly, first feeding the url to an url shortener (bit.ly, …) or to an anonimizer (hidemyass.com, ..) or to another on-demand archive (peeep.us, …), and then archive the same content from another url, thus bypassing the robots.txt restrictions. So, this check will not work the same way as it works with IA Archiver (which is actually a machine which takes decisions)."
As mentioned before, Denis did not operate Rotlink, but was aware of the bot's existence. This was confirmed through the blog of Archive.is.
He said "it is my friends' project" in Sep 2012 (1 year before the proxy/bot incident) [5]. 87.69.97.159 ( talk) 11:11, 29 August 2014 (UTC)
The domains archive.is and archive.today belong to the same private person in Prague. There does not seem to be any organization or anybody answerable to anyone behind the site. We do not know anything about archiving routines and security and consequently cannot determine if the copies of the pages are genuine exact copies.
Furthermore the site has a function showing which Wikipedia pages are linked to the "archived" page. This page claims that the following pages are linked to it:
I have tried to verify that the English pages link to the "archive" page, but failed to verify this (I could of course have missed something). What worries me more is that the purpose of the domains could possibly, for some undetermined reason, be to claim a relationship with Wikipedia. Furthermore I tried to archive the "archive" page, but WebCite was unable to do so because it received a Page Not Found error from the website concerned. If this was a trustworthy attempt at actually filing a reliable and legitimate copy, I can't really see why it is not possible to archive it.
My conclusion is that this is not a reliable archive, and that using archives that are not reliable could possibly damage Wikipedia. In my opinion links to these sites should be prohibited and removed. --ツツ Dyveldi ☯ prattle ✉ post 19:51, 29 August 2014 (UTC)
The backlinks that archive.today provides have nothing to do with Wikipedia. It lists all incoming URL links to that particular archive, which include blogs, forums, private wikis, and yes, Wikipedia. As an example, see https://archive.today/lM1uP . -- benlisquare T• C• E 07:56, 30 August 2014 (UTC)
Thanks for the answers. I wondered about the resistance to being archived. My position though is based on the fact that there is no organization. This is a private person with an address in Prague. There is nothing controlling how, where and what he archives or if this is a proper copy. Anything could be done to or happen to these links and nobody would be responsible for checking. Why serious sponsors should pay to maintain this I do not understand, and as far as I can see nobody knows or has any control of any money here. We do not even know which country the person pays tax to. An organization on the other hand does regularly have rules and regulations and a governing body of people. To call something an archive I do expect a minimum of control by and answerability to someone it is possible to identify. --ツツ Dyveldi ☯ prattle ✉ post 13:23, 30 August 2014 (UTC)
People should be encouraged to use archive.org and webcitation first. In the case where that is not possible, then they should be able to consider the option of using archive.today. It wouldn't be the first archive to go to, which means that Wikipedia projects wouldn't be swamped with archive.today URLs, slightly alleviating the concerns that there is a conspiracy to pair archive.today and Wikipedia as business partners. If an alternative archive of a page is available, it seems reasonable to replace existing archive.today URLs, but deleting the URLs wholesale only intensifies the problem of having all the eggs in the same two baskets. We need to consider the problem of comparing lesser evils here. -- benlisquare T• C• E 03:24, 31 August 2014 (UTC)
Comment on ownership and reliability of archiving domains.
Googling in this case only made me more convinced that this website is under no control we can or should rely on.
What is possible though, if all these links are deleted, is to leave the original url. If this domain is banned, it is possible to make a small manual to explain how to find an archived url at archive.is . Getting a look at the page can help editors to find the original page, which quite often is not gone but has moved to a different url. The worst problem with link rot is after all that quite often the name of the article is not given, or even when or where it was found, which makes it quite impossible to find out what the reference was. --ツツ Dyveldi ☯ prattle ✉ post 19:11, 1 September 2014 (UTC)
Not here to promote other archive sites. Policies of other sites are not our concern. |
---|
The following discussion has been closed. Please do not modify it. |
Should the nature of the response to a privacy request be determined by the strength of our will to support the request, or by the strength of the request to prevent us from ignoring it? If robots.txt requests that archivers do not archive the site, then it is on us to assume that they know what they are doing. They, the websites with the robots.txt files, have elected not to make themselves available for this process. That equates to a request not to use the content in this way, or even access it in this way. The decision has been made by the owners and providers of the content. This site does follow owners' instructions not to use material, without evaluation of any sort, except where that material may be direct historical reference in and of itself, and for that kind of reference, archive sites are not required. Because to be verifiably historical to reference, by our Wikipedian standards, it must be published widely, and if that's not what you thought it was just read to the end and click the link and forget about these companies who saw archive.org and bought up all of the archive web names. It is a shame if any content is lost adhering to freedom policies, but if something requires a worldly change to make it freely available, that change does not start here, and that doesn't sound very cool or anything, but it is part of the rock upon which the site is founded. You'll see. We didn't need it. Also for your interest, Archive.org is much more than snapshots of web pages. It is not a race to become the archive of everything that ever appeared on the internet. It's an officially licensed book and film library, based on open freely licensed material, and much in line with the goals of the Wikipedia project, so you should all go there and chip them one and read it. No consensus to change. Archive.is is an erroneous addition. Move to close. ~ R. T. G 13:53, 31 August 2014 (UTC)
I guess he means archive.org is more in line with our values, and he thinks that robots.txt is a privacy request which should be respected. Privacy/copyright has been put forward by several people, however some others think that archiving does not interfere with privacy/anything can be done as long as it's legal, so continuing would be like an ideological battle. Sometimes an editor doesn't mean to put fact forward. Only arrogance is demonstrated in calling others inferior. Forbidden User ( talk) 10:17, 1 September 2014 (UTC)
Cool down a bit. Written content is easily misunderstood. I think RTG's latest comment is somewhat improved. It might be more meaningful to look at. Forbidden User ( talk) 17:57, 2 September 2014 (UTC) |
By the way I'm taking a Wikibreak, good day. Forbidden User ( talk) 17:57, 2 September 2014 (UTC)
@PaleAqua and Kww: Could you two, or others who know the process, help me get oversight of the edits in which I replaced the 122. IP's signature with mine? Forbidden User ( talk) 17:57, 2 September 2014 (UTC)
It is now a full three months. Consensus is not clear. Many editors want to see another alternative archive site, but there are serious challenges to this site's suitability for WP, and those challenges have been rebutted only with a willingness to overlook, and that unfortunately is not good enough. The site was added maliciously. The background of the site is a total mystery. These two items alone are enough to close discussion without satisfactory restitution. WP content must be reliably sourced. This applies to all content. That item must be satisfied or there can be no outcome from a discussion. Sorry.
Notes:
It is now a full three months. Consensus has turned toward supporting Archive.today. No serious objections to use of archive.today for WP have been made. Many editors want to have archive.today functionality. The only rebuttal requires a willingness to overlook the damage banning the site is doing to Wikipedia and that unfortunately is not good enough. If I don't know the meaning of a big word like restitution, I shouldn't use it. WP content must be verifiable. This applies to all content. There can be no compromise on that based on overblown concerns about who to blame for old behavior or ghosts of the past. (My point being that the section above this one is incredibly biased, and two can play that game.) -- Elvey ( t• c) 01:29, 2 October 2014 (UTC)
I just created an article on near-Earth asteroid 2014 SC324. Since the observation arc is only 2 days long today and will be longer tomorrow, I used archive.today to make a capture of the public domain JPL page that archive.org will not capture. Too bad reference #5 has to be coded as if I am referencing some obscure offline source. -- Kheider ( talk) 20:30, 2 October 2014 (UTC)
The fact that it is used is not related to the fact that it does not establish reliability. People know it is used, or they wouldn't ask for it not to be used. We could just copy and paste everything on the internet, but that's 4chan, that's Facebook; this site is Wikipedia, and it is about reliable encyclopaedic information. ~ R. T. G 10:35, 3 October 2014 (UTC)
Webcite not accessible 7 Oct 2014 but we still want to limit our options by blocking Archive.today? -- Kheider ( talk) 17:57, 7 October 2014 (UTC)