This is the
talk page for discussing improvements to the
Gossip protocol article. This is not a forum for general discussion of the article's subject. |
Article policies
|
Find sources: Google ( books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL |
This article has not yet been rated on Wikipedia's content assessment scale. |
This article links to one or more target anchors that no longer exist.
Please help fix the broken anchors. You can remove this template after fixing the problems. |
Reporting errors |
Most of this page is extracted from a longer article I wrote for a special edition of Operating Systems Review (an ACM publication that doesn't impose copyright restrictions). The longer article seemed a bit too detailed for a Wiki page, but it seemed to me that an encyclopedia treatment really should at least explain what a gossip protocol is, and isn't.
The references I included here are all cited in the longer article. Some are papers I co-authored but most are by other folks. I hope this isn't a violation of the POV/COI policies. I'm not trying to sell anything and if people want to edit the article to reduce any perceived bias, go for it!
Are illustrations needed? Ken Birman 12:59, 14 June 2007 (UTC)
This article has numerous problems. First of all, the editor who created the article, User:Ken Birman, has a conflict of interest (WP:COI) in that he is the developer of Virtual synchrony software, as he has stated. The article does not contain in-line citations. Its tone is inappropriate for an encyclopedia: a casual discussion in some areas, abruptly introduced technical language in others, sometimes both in one, so the article seems like a casual discussion among insiders or buyers in places, with unfocused run-on descriptions of the process and no concept of the reader of Wikipedia articles ("Developers of distributed computer systems often need a way to replicate data for sharing between programs running on multiple machines, connected by a network. Virtual synchrony is one of three major technologies for solving this problem. The key idea is to create a form of distributed state machine associated with the replicated data item"). The examples are awkwardly introduced (probably due to COI), and the prose needs to be thoroughly edited. The sections need to be cohesively structured, with internal order, for the reader of the article (WP:MOS). I have tagged these articles with requests for this clean-up in case there are interested Wikipedia editors who can improve these articles, in particular starting with structure and prose before moving on to technical accuracy. Ken Birman has made it clear he does not want me to edit the articles. KP Botany 18:23, 23 June 2007 (UTC)
On the OR: I included this tag because of the COI between the primary editor and the topic and the failure to include in-line citations, coupled with the disorganized structure of the article, that altogether make difficult the direct verification of information and original research. KP Botany 18:33, 23 June 2007 (UTC)
Quick question for Ken Birman: How do you reconcile your 14 June 2007 statement
above with
The first statement expresses at the least some doubts, while the second denies them completely, or at least denies a tangential concern or subset of the actually stated concern. -- Calton | Talk 00:00, 30 June 2007 (UTC)
This article reads like an advertisement for 'Gossip Protocol'. Many of the claims are unsupported. The "gossip protocol is one that satisfies the following conditions" is even much more specific than in real life. — Preceding unsigned comment added by 50.68.57.160 ( talk) 03:03, 26 November 2018 (UTC)
I thought this article did a good job -- I came here looking for info on epidemic/gossip protocols, and learned most of what I wanted to know. The technical content and exposition is fine, and the prose is serviceable, though it could benefit a little from copy-editing. I'd like to see more details, but then again, that's what the references at the end are for.
But I'm a computer scientist, not a botanist, so what do I know... JensAlfke 23:53, 3 July 2007 (UTC)
And, take your comments about me to my talk page, not to this article. KP Botany 01:11, 5 July 2007 (UTC)
I also liked this article. I don't see any false claims, nor have any been named here. Perhaps some details (exact numbers on size) could be added to the data center example. 141.24.33.171 ( talk) 12:51, 11 July 2008 (UTC)
For about six months now, this page has been tagged for Point of View, Original Research and Factual Accuracy. After a brief period of discussion and editing activity, nothing has happened for six months.
Here's my question: as a matter of policy, do such tags remain on the page forever, or is there some procedure for agreeing to remove them? It seems safe to assume that KPBotany, who placed them there, would not consider the matter resolved, since nothing substantive changed during this period. However, perhaps that particular reviewer will never find comfort with this material -- such things do sometimes happen. Ken Birman ( talk) 18:04, 21 December 2007 (UTC)
User_talk:KP_Botany in saying that this article needs work. I wouldn't know where to go with it. BrewJay ( talk) 00:34, 14 July 2008 (UTC)
It's a multi-cast protocol. It hosts gossip. It's reliable. It's based upon internet structure that has existed for decades. Some companies are routing sound and files over it in real-time. Why reinvent the wheel? BrewJay ( talk) 12:03, 4 July 2008 (UTC)
Why not add it to See also? 141.24.33.171 ( talk) 12:48, 11 July 2008 (UTC)
Good suggestion. I don't see any reason not to add it. But do keep in mind that when companies like Amazon use gossip, they mean they are using this "class" of solutions -- they have an abstracted behavior in mind. IRC is a very specific protocol, with an associated RFC standard, and this is definitely NOT the implementation of gossip Amazon favors. In fact there are hundreds of gossip protocols, and I would say that the modern perspective is that IRC is an early example of a particular application that uses a gossip mechanism. One would think of it side-by-side with a dozen others (Astrolabe, the Amazon shopping cart, etc), each of which also is an application of some kind running on a gossip protocol of some type. But this said, many people know about IRC and using it as an example would be super. Go for it... Ken Birman ( talk) 14:34, 9 August 2008 (UTC)
"Thus, while one could run a 2-phase commit protocol over a gossip substrate, doing so would be at odds with the spirit, if not the wording, of the definition."
This article might be useful in understanding the problems that routing protocols have already solved by *not* trying to see the whole path that a packet travels. They don't try to make trees out of network topology. People make maps out of it, sometimes. Perhaps you could explain how a gossip protocol would also be reliable *without* a lock and an acknowledgement to that lock (2-phases). For me, it's basic versing, and I don't see any easy ways around locking without comparisons and arbitration that pretty much, in the end, need to occur on one CPU that needs to serialize that arbitration before it saves. The Amazon shopping cart runs on cookies. It could run on URL-code in HTML that it sends you just as easily. When cookies were being derided and misunderstood, Hotmail was using URL-code to identify users more certainly than IP#.
The spirit of gossip is a lack of reliability. So, I hope you understand that I'm inclined to ask for references or terse descriptions of implementations when you indicate "logarithmic growth". BrewJay ( talk) 14:48, 16 October 2008 (UTC)
Someone asked for Gossip-based multicast protocol to be created. I redirected it to Internet Relay Chat, because it carries gossip. Later, after deletion of my redirection, I discovered an uninformative article on a proprietary protocol that doesn't seem to do anything that IP doesn't, when implemented. For example, I know that there are several ways to rank a routing table:
The more of that work you do, the more expensive your router becomes, but considering the ratio of speed between a CPU (mine is rated at about 27 gigabits per second) and network speed, such work is feasible.
Considering it further, I think reliability statistics would need to be mapped and averaged over a week. There might be reliable times when a network node is saturated with traffic or regular maintenance. BrewJay ( talk) 06:37, 9 July 2008 (UTC)
I didn't make a deletion proposal based on the reality of the subject, because I KNOW that it's real; I'm using a gossip protocol (as I'm led to understand the definition) called Transmission Control Protocol over IP.
gossip protocol isn't notable. See category:internet architecture. BrewJay ( talk) 21:55, 8 July 2008 (UTC)
"*Anti-entropy protocols for repairing replicated data, which operate by comparing replicas and reconciling differences."
The current link for "Anti-entropy protocols" goes to "Error correction", which is wrong. These are two distinct concepts; the names are not synonyms. Either the link text needs to be "Error correction" instead of "Anti-entropy protocols", or the link itself needs to point to "Anti-entropy protocols" instead of "Error correction". As the text describing the link indicates that the topic is anti-entropy protocols, I'd recommend changing the link to point to "Anti-entropy protocols". If the link is meant to point to error correction, then I'd recommend adapting the text describing the link accordingly. --Christian Kerth (christian.kerth@gmx.net), 08:55, 19 July 2018 (UTC) — Preceding unsigned comment added by 153.96.12.26 ( talk)
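To make the distinction concrete, here is a minimal sketch of what "comparing replicas and reconciling differences" could look like. The per-key version numbers and the "highest version wins" rule are assumptions made for illustration only; they are not taken from the article or from any particular system.

```python
# Hypothetical anti-entropy sketch: each replica maps key -> (version, value).
# Integer versions and "highest version wins" are illustrative assumptions.

def reconcile(replica_a, replica_b):
    """Compare two replicas and bring both up to date in place."""
    for key in set(replica_a) | set(replica_b):
        entry_a = replica_a.get(key)
        entry_b = replica_b.get(key)
        # Pick the entry with the higher version; a missing entry always loses.
        if entry_a is None or (entry_b is not None and entry_b[0] > entry_a[0]):
            newest = entry_b
        else:
            newest = entry_a
        replica_a[key] = newest
        replica_b[key] = newest


a = {"x": (1, "old"), "y": (3, "kept")}
b = {"x": (2, "new"), "z": (1, "only-on-b")}
reconcile(a, b)
```

After the call both replicas hold the same entries. Nothing here detects or corrects corrupted bits, which is why "Error correction" seems like the wrong link target.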
In the analogical lead of the article's beginning, that's what I get...that computers on the internet communicate directly along some wavelength. On some networks, like analog telephone (I'm not sure if that exists anymore), or packet radio, maybe they do, but there's no great element of randomness in it. A user picks a BBS, dials the number, BINGO, you have a peer... Or, in packet radio, some ham scans the radio waves, recognizes digital, then tries to decode it using internet documents and utilities. Relaying QWK-based mail might happen in a gossipy manner, but it was (is?) explicitly scheduled -- nothing like the random walk presented in this article. BrewJay ( talk) 07:20, 9 July 2008 (UTC)
b) Background data dissemination protocols continuously gossip about information associated with the participating nodes. Typically, propagation latency isn’t a concern, perhaps because the information in question changes slowly or there is no significant penalty for acting upon slightly stale data.
These are weasel words.
Here's an interesting story for you folks:
... so clearly, gossip protocols are a big deal in modern computing data centers, and play critical roles. Places like Amazon are using them, and when they go wrong, the data center can become unusable! So the moral of the story is: if you work on distributed computer systems in 2008, you had better know what gossip protocols do, how to build them, and how to get them right! Which isn't to diminish the importance of ALSO knowing how routing works. BTW, Amazon makes a lot of use of gossip. I know of some applications that work well for them -- they reported on one, in the shopping cart subsystem, at the SOSP conference in 2007. Another is the Astrolabe system, which they acquired some years ago and then rebuilt internally. But I don't know this specific gossip subsystem (the one in S3) and hence can't guess at what went wrong, beyond what Amazon is saying in their little announcement. Sounds like some form of corrupted data snuck into the gossip subsystem and got stuck there. This is easy to imagine -- without taking adequate care, gossip protocols can be very much at risk of that sort of accidental contamination... Ken Birman ( talk) 11:06, 23 July 2008 (UTC)
The article currently says that pairwise interactions are a defining characteristic of a gossip protocol. To extend the office rumor analogy, what about the case where three or four people stand around the water cooler and listen to the same rumor when it is only spoken once? In other words, would something still be considered a "gossip protocol" if the underlying transmission can involve subscribing to some sort of broadcast of the message? If so, I think the "pairwise" language could be removed. The reason I am confused is because it also states that one of the uses of gossip is to implement multicast, but gossip also seems useful at a layer where multicast is already provided by some deeper layer... Maghnus ( talk) 00:37, 11 August 2008 (UTC)
"Search strings known to A will now also be known to B, and vice versa. In the next "round" of gossip A and B will pick additional random peers, maybe C and D."
Period  Nodes
1       2
2       4
3       8
...in thirty-two communication periods, you can have over four billion nodes with information intended for broadcast. I wrote that method for QWK-packets, which were an early BBS standard for newsgroups. That standard didn't have message-IDs, so my method had no potential (unless the advantage were worth seeking unique messages, for which I didn't have an efficient method; now I would use a perfect hash). It still doesn't have much potential, because hook-ups on the internet are largely manual and static. Notice that my method DEPENDS upon static addresses AND scheduled, regular communication periods. In practice, I don't see any hope for it, except perhaps within one box, where it has been blown away in practical applications, probably long before I even wrote it.
I don't doubt that there are a lot of references for this in papers that have analyzed practicing it for use within a box, where it's easy to hard-wire timing, and when you stick randomness and unreliability into it, then you need some very special error correction protocol, indeed. I read that Intel spends a lot of money on noise control (crosstalk) within a CPU. BrewJay ( talk) 16:42, 16 October 2008 (UTC)
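The doubling arithmetic in the comment above can be checked with a short sketch. It assumes the idealized case described there: every node that already holds the message reaches exactly one new node per period, with no failures and no wasted contacts. The function names are made up for this example.

```python
def informed_after(periods):
    """Nodes holding the message after `periods` rounds of perfect doubling."""
    return 2 ** periods


def periods_needed(nodes):
    """Smallest number of periods whose doubling reaches at least `nodes` nodes."""
    return (nodes - 1).bit_length()


for p in (1, 2, 3, 32):
    print(p, informed_after(p))
```

Under that assumption 32 periods reach 2^32 = 4,294,967,296 nodes, which matches the "over four billion" figure. Real gossip with random peer selection wastes some contacts, so it needs more rounds than this lower bound.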
"The term convergently consistent is sometimes used to describe protocols that achieve exponentially rapid spread of information. For this purpose, a protocol must propagate any new information to all nodes that will be affected by the information within time logarithmic in the size of the system (the "mixing time" must be logarithmic in system size)."
In the leadup, there is a mention of "sometimes". I will move to strike it out, because the word is hard to use in Physics. It is a troublesome thing to have bits sometimes lost or to form an association for collaboration that is irregular. I understand that you want to demonstrate flexibility, reliability and performance. When a router goes down, then the metric for that router in neighbouring routers quickly registers unreachable. That propagates like gossip. BrewJay ( talk) 01:57, 21 October 2008 (UTC)
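For readers trying to picture the "time logarithmic in the size of the system" claim quoted above, a rough simulation of push-style gossip may help. This is only a sketch under simplifying assumptions that are not in the article: synchronous rounds, every informed node pushes to one peer chosen uniformly at random (possibly itself), and no messages are lost. The function name and parameters are invented here.

```python
import math
import random


def push_gossip_rounds(n, seed=0, max_rounds=10_000):
    """Count rounds until all n nodes are informed under random push gossip."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts with the rumor
    rounds = 0
    while len(informed) < n and rounds < max_rounds:
        # Each informed node picks one random peer and pushes the rumor to it.
        targets = {rng.randrange(n) for _ in informed}
        informed |= targets
        rounds += 1
    return rounds


for size in (64, 1024, 16384):
    print(size, push_gossip_rounds(size), math.ceil(math.log2(size)))
```

The informed set can at most double per round, so log2(n) is a hard lower bound. The random choices add some extra rounds on top of that, but the count grows slowly with n rather than in proportion to it.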
My theory goes that if you try to design something that is everything to everyone, then you will end up with a system that is nothing to anyone. Professor Birman wants to express that gossip protocols are reliable, perform superbly, and are ultimately flexible. I am still not getting the impression of any design choices made, rather I am led to believe that it's some general description of all protocols that may call themselves gossip; thousands. It is anything but clear, and yet I cannot pinpoint any particular words that are making this document foggy.
If I continue to work on this document, then I think I will try to impose limits and link it to other topics in network architecture; rather than say that reliability is not assumed, I will ask for how compensations are made. If the professor still wants to describe what works on paper, then I will want the scope narrowed to what can be adequately described, here, in examples. BrewJay ( talk) 02:42, 21 October 2008 (UTC)
(Growth in time consumed for a broadcast mirroring transaction that is proportional to system size.)
Initial condition:
Machine A contains messages 1,1,1,1
Machine B contains messages 2,2,2,2
Machine C contains messages 3,3,3,3
Machine D contains messages 4,4,4,4

First period: Machine A exchanges with B; Machine C exchanges with D.
State after first period:
A = 1,2,1,2
B = 2,1,2,1
C = 3,4,3,4
D = 4,3,4,3

Second period: A exchanges with C; B exchanges with D.
State after second period:
A = 1,2,3,4
B = 2,1,4,3
C = 3,4,1,2
D = 4,3,2,1
After two periods, four machines contain information intended for broadcast.
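The fixed schedule above (A with B and C with D, then A with C and B with D) can be sketched as follows. The sketch assumes a power-of-two number of machines and models each machine's contents as a set of message IDs rather than the four ordered slots shown above. The pairing rule "machine i exchanges with machine i XOR step" is one way to generate that schedule, not something stated in the comment.

```python
def pairwise_exchange(messages):
    """Run the fixed pairwise schedule; messages is a list of sets, one per machine."""
    n = len(messages)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two machine count"
    state = [set(m) for m in messages]
    periods = 0
    step = 1
    while step < n:
        # Machine i pairs with machine i XOR step; both end up with the union.
        state = [state[i] | state[i ^ step] for i in range(n)]
        step *= 2
        periods += 1
    return state, periods


state, periods = pairwise_exchange([{1}, {2}, {3}, {4}])
```

With four machines this finishes in two periods with every machine holding all four messages. Note that it depends on static addresses and a fixed schedule, unlike the random peer selection the article describes.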
Period  Nodes
1       2
2       4
3       8
The number of channels required for this process increases at half the number of machines, which means that it is proportional (counting in half duplex (one direction at a time), it's identical). The amount of information that an exchange period can require can double at every step, so it is necessary to either double the length of each consecutive period, or arbitrarily limit the amount of information that can be transmitted in a cycle of periods. This means that there will be more latency in the system during peak hours, unless each consecutive period doubles in length. Taking the choice to double consecutive period length adds up so that the time required for a broadcast is proportional to the number of nodes, not logarithmic.
Period  Length  Nodes
1       1       2
2       2       4
3       4       8
Total   7
For some reason, I'm not quite getting eight for a total in column two, so I think I'm making a one-off error. The result I'm expecting is identical to that for one-to-many by rotation (Is that approximately what token ring means?): proportional. As I said in earlier comments, I like how simple the pairwise choice makes error correction; I could choose V.42bis, for example. I would choose to *always* double the length of consecutive periods, and periodically arbitrate the baseline of real time that represents one in my table above, based upon the most congested server. Proportional is an optimal result in sorting, too. Since the level of congestion on each server would be a few quantities of fixed length, I would broadcast that between periods. If this were to go as far as shared files, then it might be advisable to rotate periods that lock files, accept file locks, distribute files, and gauge congestion.
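A quick check of the totals, assuming exactly the scheme in the table above (period k has length 2^(k-1) and the number of informed nodes doubles each period), suggests the 7 is not an arithmetic slip. The function name is made up for this sketch.

```python
def broadcast_time(periods):
    """Total elapsed time when the k-th period (1-based) lasts 2**(k-1) units."""
    return sum(2 ** k for k in range(periods))


for p in (1, 2, 3, 10):
    nodes = 2 ** p
    print(p, nodes, broadcast_time(p))
```

The lengths form a geometric series, so the total is always 2^p - 1, one less than the node count. The total is therefore still proportional to the number of nodes, as argued above; it just never equals it exactly.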
I've told people that once you've stated your assumptions and shown how your math works, that it isn't always necessary to follow WP:NOR. (That's less true in Physics, less true yet in Chemistry, and it's downright bonkers in Biochemistry). I think it needs peer review, though, which is why I'm honoured to be conversing with a professor, however much trouble he might be having with encyclopedic standards that really do encourage you to link your stuff to pre-requisites. I'm thinking I should touch up this article with the graphics, and give Ken some time to adjust the words, assuming he cannot find a way to beat proportional. BrewJay ( talk) 14:00, 23 October 2008 (UTC)
I came here searching for "worst protocol ever" (half-expected a redir to SOAP). The search function seems to have understood "worst article ever". Maybe I mistyped. — Preceding unsigned comment added by 194.228.11.192 ( talk) 06:03, 2 September 2016 (UTC)
The lead right now is a metaphor sandwich, with no technical depth. Where's the beef? — MaxEnt 18:41, 7 January 2018 (UTC)
As a networking professional, I can say categorically that the claim in the first paragraph is quite dubious.
To say, "A gossip protocol is a procedure or process of computer–computer communication that is based on the way social networks disseminate information" is false. Social networks spread information based on selective and discriminative sharing. By this, I mean that parties that are interested in certain communications are not always privy to them and those that are not interested are often included in the information that they have already seen or just don't care about. Social media sharing and peer-to-peer networks may have similarities but not much else. The claim that "Modern distributed systems often use gossip protocols to solve problems that might be difficult to solve in other ways, either because the underlying network has an inconvenient structure, is extremely large, or because gossip solutions are the most efficient ones available" is also false. In 99% of the cases, a gossip protocol is the most wasteful and problematic way of achieving necessary communications. In a nutshell, the claims made in this article are rather biased in favor of 'gossip' protocols and fail to reflect the real life problems encountered with them.
Also worth mentioning, this is really a variant of peer-to-peer networking. Rarely will anyone in software development talk about their "gossip deployment". Rather, the mode by which data dissemination is obtained is discussed as peer-to-peer or client-server networking design. Most notable peer-to-peer networks (BitTorrent, Skype, Bitcoin, eDonkey) use trackers (or variants) and peer coordinators to ensure that data is properly available and so that users are directed to the most reliable and performant sources for obtaining data. When Quality-Of-Service is important, peer-to-peer networks are rarely ever chosen.
It would be great if someone could incorporate these issues/concerns/criticisms of this into the actual article.
Almost an unlimited number of "Gossip" variants could be mentioned. They are not WP material unless supported by current use and/or citations.
In that vein, I have also removed "Spatial Gossip". Although explained, there were no examples given of current use and also no valid references. See 'Unsupported Claims' section for a similar issue. — Preceding unsigned comment added by 50.68.57.160 ( talk) 16:50, 26 November 2018 (UTC)
Although I haven't changed it, the heading Gossip Protocol Types also states things unsupported by the citations provided. For example, the opening sentence citation actually supports "information spreading" and "information aggregation" as types. Yet, strangely, the section includes "Anti-entropy protocols" with a link to computer error correction seemingly unrelated. — Preceding unsigned comment added by 50.68.57.160 ( talk) 16:57, 26 November 2018 (UTC)
... Which I have also now trimmed and probably needs further refinement. — Preceding unsigned comment added by 50.68.57.160 ( talk) 17:00, 26 November 2018 (UTC)
This is the
talk page for discussing improvements to the
Gossip protocol article. This is not a forum for general discussion of the article's subject. |
Article policies
|
Find sources: Google ( books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL |
This article has not yet been rated on Wikipedia's content assessment scale. |
This article links to one or more target anchors that no longer exist.
Please help fix the broken anchors. You can remove this template after fixing the problems. |
Reporting errors |
Most of this page is extracted from a longer article I wrote for a special edition of Operating Systems Review (an ACM publication that doesn't impose copyright restrictions). The longer article seemed a bit too detailed for a Wiki page, but it seemed to me that an encyclopedia treatment really should at least explain what a gossip protocol is, and isn't.
The references I included here are all cited in the longer article. Some are papers I co-authored but most are by other folks. I hope this isn't a violation of the POV/COI policies. I'm not trying to sell anything and if people want to edit the article to reduce any perceived bias, go for it!
Are illustrations needed? Ken Birman 12:59, 14 June 2007 (UTC)
This article has numerous problems. First of all the editor who created the artice, User:Ken Birman has a conflict of interest ( WP:COI) in that he is the developer of Virtual synchrony software, as he has stated. The article does not contain in-line citations. Its tone is inappropriate for an encyclopedia, a casual discussion in some areas, abruptly introduced technical language in others, sometimes both in one, so the article seems like a casual discussion among insiders or buyers in places, and unfocused run on descriptions of the process without any concept of the reader of Wikipedia articles ("Developers of distributed computer systems often need a way to replicate data for sharing between programs running on multiple machines, connected by a network. Virtual synchrony is one of three major technologies for solving this problem. The key idea is to create a form of distributed state machine associated with the replicated data item"). The examples are awkwardly introduced (probably due to COI), the prose needs thoroughly edited. The sections need cohesively structured, with internal order, for the reader of the article-- WP:MOS. I have tagged these articles with requests for this clean-up in case there are interested Wikipedia editors who can improve these articles, in particular starting with structure and pose before moving on to technical accuracy. Ken Birman has made it clear he does not want me to edit the articles. KP Botany 18:23, 23 June 2007 (UTC)
On the OR: I included this tag because of the COI between the primary editor and the topic and the failure to include in-line citations, coupled with the disorganized structure of the article, that altogether make difficult the direct verification of information and original research. KP Botany 18:33, 23 June 2007 (UTC)
Quick question for Ken Birman: How do you recouncil your 14 June 2007 statement
above with
The first statement expresses at the least some doubts, while the second denies them completely, or at least denies a tangential concern or subset of the actually stated concern. -- Calton | Talk 00:00, 30 June 2007 (UTC)
This article reads like an advertisement for 'Gossip Protocol'. Many of the claims are unsupported. The "gossip protocol is one that satisfies the following conditions" is even much more specific than in real life. — Preceding unsigned comment added by 50.68.57.160 ( talk) 03:03, 26 November 2018 (UTC)
I thought this article did a good job -- I came here looking for info on epidemic/gossip protocols, and learned most of what I wanted to know. The technical content and exposition is fine, and the prose is serviceable, though it could benefit a little from copy-editing. I'd like to see more details, but then again, that's what the references at the end are for.
But I'm a computer scientist, not a botanist, so what do I know... JensAlfke 23:53, 3 July 2007 (UTC)
And, take your comments about me to my talk page, not to this article. KP Botany 01:11, 5 July 2007 (UTC)
I also liked this article. I don't see nor have they been named here any false claims. Perhaps some details (exact numbers on size) could be added to the data center example. 141.24.33.171 ( talk) 12:51, 11 July 2008 (UTC)
For about six months now, this page has been tagged for Point of View, Original Research and Factual Accuracy. After a brief period of discussion and editing activity, nothing has happened for six months.
Here's my question: as a matter of policy, do such tags remain on the page forever, or is there some procedure for agreeing to remove them? It seems safe to assume that KPBotany, who placed them there, would not consider the matter resolved, since nothing substantive changed during this period. However, perhaps that particular reviewer will never find comfort with this material -- such things do sometimes happen. Ken Birman ( talk) 18:04, 21 December 2007 (UTC)
User_talk:KP_Botany in saying that this article needs work. I wouldn't know where to go with it. BrewJay ( talk) 00:34, 14 July 2008 (UTC)
It's a multi-cast protocol. It hosts gossip. It's reliable. It's based upon internet structure that has existed for decades. Some companies are routing sound and files over it in real-time. Why reinvent the wheel? BrewJay ( talk) 12:03, 4 July 2008 (UTC)
Why don't add it to see-also? 141.24.33.171 ( talk) 12:48, 11 July 2008 (UTC)
Good suggestion. I don't see any reason not to add it. But do keep in mind that when companies like Amazon use gossip, they mean they are using this "class" of solutions -- they have an abstracted behavior in mind. IRC is a very specific protocol, with an associated RFC standard, and this is definitely NOT the implementation of gossip Amazon favors. In fact there are hundreds of gossip protocols, and I would say that the modern perspective is that IRC is an early example of a particular application that uses a gossip mechanism. One would think of it side-by-side with a dozen others (Astrolabe, the Amazon shopping cart, etc), each of which also is an application of some kind running on a gossip protocol of some type. But this said, many people know about IRC and using it as an example would be super. Go for it... Ken Birman ( talk) 14:34, 9 August 2008 (UTC)
"Thus, while one could run a 2-phase commit protocol over a gossip substrate, doing so would be at odds with the spirit, if not the wording, of the definition."
This article might be useful in understanding the problems that routing protocols hav already solved by *not* trying to see the whole path that a packet travels. They don't try to make trees out of network topology. People make maps out of it, sometimes. Perhaps you could explain how a gossip protocol would also be reliable *without* a lock and an acknowledgement to that lock (2-phases). For me, it's basic versing, and I don't see any easy ways around locking without comparisons and arbitration that pretty much, in the end, need to occur on one CPU that needs to serialize that arbitration before it saves. The Amazon shopping cart runs on cookies. It could run on URL-code in HTML that it sends you just as easily. When cookies were being derided and misunderstood, hotmail was using URL-code to identify users more certainly than IP#.
The spirit of gossip is a lack of reliability. So, I hope you understand that I'm inclined to ask for references or terse descriptions of implementations when you indicate "logarithmic growth". BrewJay ( talk) 14:48, 16 October 2008 (UTC)
Someone asked for Gossip-based multicast protocol to be created. I redirected it to Internet Relay Chat, because it carries gossip. Later, after deletion of my redirection, I discovered an uninformative article on a proprietary protocol that doesn't seem to do anything that IP doesn't, when implemented. For example, I know that there are several ways to rank a routing table:
The more of that work you do, the more expensive your router becomes, but considering the ratio of speed between a CPU (mine is rated at about 27 gigabits per second) and network speed, such work is feasible.
Considering it further, I think reliability statistics would need to be mapped and averaged over a week. There might be reliable times when a network node is saturated with traffic or regular maintenance. BrewJay ( talk) 06:37, 9 July 2008 (UTC)
I didn't make a deletion proposal based on the reality of the subject, because I KNOW that it's real; I'm using a gossip protocol (as I'm led to understand the definition) called Transmission Control Protocol over IP.
gossip protocol isn't notable. See category:internet architecture. BrewJay ( talk) 21:55, 8 July 2008 (UTC)
"*Anti-entropy protocols for repairing replicated data, which operate by comparing replicas and reconciling differences."
The current link for "Anti-entropy protocols" goes to "Error correction", which is wrong. These are two distinct concepts; the names are not synonyms. Either the link text needs to be "Error correction" instead of "Anti-entropy protocols", or the link itself needs to point to "Anti-entry protocols" instead of "Error correction". As the text describing the link indicates the topic being Anti-Entropy protocols, I'd recommend to change the link towards "Anti-Entropy protocols". if the link is meant to point to error correction, then I'd recommend to adapt the text describing the link is adapted accordingly. --Christian Kerth (christian.kerth@gmx.net), 08:55, 19 July 2018 (UTC) — Preceding unsigned comment added by 153.96.12.26 ( talk)
In the analogical lead of the article's beginning, that's what I get...that computers on the internet communicate directly along some wavelength. On some networks, like analog telephone (I'm not sure if that exists anymore), or packet radio, maybe they do, but there's no great element of randomness in it. A user picks a BBS, dials the number, BINGO, you have a peer... Or, in packet radio, some ham scans the radio waves, recognizes digital, then tries to decode it using internet documents and utilities. Relaying QWK-based mail might happen in a gossipy manner, but it was (is?) explicitly scheduled -- nothing like the random walk presented in this article. BrewJay ( talk) 07:20, 9 July 2008 (UTC)
b) Background data dissemination protocols continuously gossip about information associated with the participating nodes. Typically, propagation latency isn’t a concern, perhaps because the information in question changes slowly or there is no significant penalty for acting upon slightly stale data.
These are weasel words.
Here's an interesting story for you folks:
... so clearly, gossip protocols are a big deal in modern computing data centers, and play critical roles. Places like Amazon are using them, and when they go wrong, the data center can become unusable! So the moral of the story is: if you work on distributed computer systems in 2008, you had better know what gossip protocols do, how to build them, and how to get them right! Which isn't to diminish the importance of ALSO knowing how routing works. BTW, Amazon makes a lot of use of gossip. I know of some applications that work well for them -- they reported on one, in the shopping cart subsystem, at the SOSP conference in 2007. Another is the Astrolabe system, which they acquired some years ago and then rebuilt internally. But I don't know this specific gossip subsystem (the one in S3) and hence can't guess at what went wrong, beyond what Amazon is saying in their little announcement. Sounds like some form of corrupted data snuck into the gossip subsystem and got stuck there. This is easy to imagine -- without taking adequate care, gossip protocols can be very much at risk of that sort of accidental contamination... Ken Birman ( talk) 11:06, 23 July 2008 (UTC)
The article currently says that pairwise interactions are a defining characteristic of a gossip protocol. To extend the office rumor analogy, what about the case where three or four people stand around the water cooler and listen to the same rumor when it is only spoken once? In other words, would something still be considered a "gossip protocol" if the underlying transmission can involve subscribing to some sort of broadcast of the message? If so, I think the "pairwise" language could be removed. The reason I am confused is because it also states that one of the uses of gossip is to implement multicast, but gossip also seems useful at a layer where multicast is already provided by some deeper layer... Maghnus ( talk) 00:37, 11 August 2008 (UTC)
"Search strings known to A will now also be known to B, and vice versa. In the next "round" of gossip A and B will pick additional random peers, maybe C and D."
Period  Nodes
1       2
2       4
3       8
...in thirty-two communication periods, you can have over four billion nodes with information intended for broadcast. I wrote that method for QWK-packets, which were an early BBS standard for newsgroups. That standard didn't have message-IDs, so my method had no potential (unless the advantage were worth seeking unique messages, for which I didn't have an efficient method; now I would use a perfect hash). It still doesn't have much potential, because hook-ups on the internet are largely manual and static. Notice that my method DEPENDS upon static addresses AND scheduled, regular communication periods. In practice, I don't see any hope for it, except perhaps within one box, where it has been blown away in practical applications, probably long before I even wrote it.
I don't doubt that there are a lot of references for this in papers that have analyzed practicing it for use within a box, where it's easy to hard-wire timing, and when you stick randomness and unreliability into it, then you need some very special error correction protocol, indeed. I read that Intel spends a lot of money on noise control (crosstalk) within a CPU. BrewJay ( talk) 16:42, 16 October 2008 (UTC)
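The doubling described in the comment above can be checked with a minimal sketch. It assumes an idealized schedule in which every informed node reaches exactly one new node per period; the function names are hypothetical and not taken from the article.

```python
def nodes_reached(periods):
    """Nodes holding the broadcast after `periods` doublings, starting from one node."""
    return 2 ** periods


def periods_needed(nodes):
    """Smallest number of periods p such that 2**p >= nodes."""
    return max(0, (nodes - 1).bit_length())


if __name__ == "__main__":
    for p in (1, 2, 3, 32):
        print(p, nodes_reached(p))
```

Under that assumption, thirty-two periods give 2**32 = 4,294,967,296 nodes, which is the "over four billion" figure quoted above.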
"The term convergently consistent is sometimes used to describe protocols that achieve exponentially rapid spread of information. For this purpose, a protocol must propagate any new information to all nodes that will be affected by the information within time logarithmic in the size of the system (the "mixing time" must be logarithmic in system size)."
In the leadup, there is a mention of "sometimes". I will move to strike it out, because the word is hard to use in Physics. It is a troublesome thing to have bits sometimes lost or to form an association for collaboration that is irregular. I understand that you want to demonstrate flexibility, reliability and performance. When a router goes down, then the metric for that router in neighbouring routers quickly registers unreachable. That propagates like gossip. BrewJay ( talk) 01:57, 21 October 2008 (UTC)
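For readers weighing the quoted claim about logarithmic spread, here is a rough simulation sketch of randomized push gossip. It is an illustration under stated assumptions, not the article's protocol: every informed node pushes to one uniformly random node per round (possibly itself or an already informed node), there are no failures, and the function name and seed parameter are hypothetical.

```python
import random


def push_gossip_rounds(n_nodes, seed=0):
    """Count rounds until all nodes are informed, starting from node 0."""
    rng = random.Random(seed)  # seeded so a run is repeatable
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        # Each informed node picks one target uniformly at random.
        targets = {rng.randrange(n_nodes) for _ in informed}
        informed |= targets
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (16, 256, 4096):
        print(n, push_gossip_rounds(n, seed=1))
```

Because the informed set can at most double per round, the count can never fall below log2 of the system size; in this simple model it usually lands at a small multiple of that, but the exact figure depends on the seed.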
My theory goes that if you try to design something that is everything to everyone, then you will end up with a system that is nothing to anyone. Professor Birman wants to express that gossip protocols are reliable, perform superbly, and are ultimately flexible. I am still not getting the impression of any design choices made, rather I am led to believe that it's some general description of all protocols that may call themselves gossip; thousands. It is anything but clear, and yet I cannot pinpoint any particular words that are making this document foggy.
If I continue to work on this document, then I think I will try to impose limits and link it to other topics in network architecture; rather than say that reliability is not assumed, I will ask for how compensations are made. If the professor still wants to describe what works on paper, then I will want the scope narrowed to what can be adequately described, here, in examples. BrewJay ( talk) 02:42, 21 October 2008 (UTC)
(Growth in time consumed for a broadcast mirroring transaction that is proportional to system size.)
Initial condition:
  Machine A contains messages 1,1,1,1
  Machine B contains messages 2,2,2,2
  Machine C contains messages 3,3,3,3
  Machine D contains messages 4,4,4,4

Machine A exchanges with B; machine C exchanges with D.

State after first period:
  A = 1,2,1,2
  B = 2,1,2,1
  C = 3,4,3,4
  D = 4,3,4,3

A exchanges with C; B exchanges with D.

State after second period:
  A = 1,2,3,4
  B = 2,1,4,3
  C = 3,4,1,2
  D = 4,3,2,1
After two periods, four machines contain information intended for broadcast.
Period  Nodes
1       2
2       4
3       8
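The scheduled exchange walked through above (A with B and C with D, then A with C and B with D) can be sketched as follows. This is a minimal illustration under assumptions of my own: machines are numbered from 0, the partner in each period is found by flipping one bit of the machine index, and each machine's state is kept as a set of message numbers rather than the four-slot layout shown above. The function name is hypothetical.

```python
def exchange_rounds(n_machines):
    """Run the fixed pairwise schedule; return each machine's known messages after every period."""
    if n_machines < 1 or n_machines & (n_machines - 1):
        raise ValueError("n_machines must be a power of two")
    known = [{i + 1} for i in range(n_machines)]  # machine i starts with message i + 1
    history = []
    stride = 1
    while stride < n_machines:
        # Machine i swaps everything it knows with machine i XOR stride.
        known = [known[i] | known[i ^ stride] for i in range(n_machines)]
        history.append([sorted(k) for k in known])
        stride *= 2
    return history


if __name__ == "__main__":
    for period, state in enumerate(exchange_rounds(4), start=1):
        print(period, state)
```

With four machines the schedule takes two periods, and with eight it takes three, matching the table above. The schedule relies on fixed addresses and synchronized periods, as the comment notes.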
The number of channels required for this process increases at half the number of machines, which means that it is proportional (counting in half duplex (one direction at a time), it's identical). The amount of information that an exchange period can require can double at every step, so it is necessary to either double the length of each consecutive period, or arbitrarily limit the amount of information that can be transmitted in a cycle of periods. This means that there will be more latency in the system during peak hours, unless each consecutive period doubles in length. Taking the choice to double consecutive period length adds up so that the time required for a broadcast is proportional to the number of nodes, not logarithmic.
Period  Length  Nodes
1       1       2
2       2       4
3       4       8
Total   7
For some reason, I'm not quite getting eight for a total in column two, so I think I'm making an off-by-one error. The result I'm expecting is identical to that for one-to-many by rotation (Is that approximately what Token Ring means?): proportional. As I said in earlier comments, I like how simple the pairwise choice makes error correction; I could choose V.42bis, for example. I would choose to *always* double the length of consecutive periods, and periodically arbitrate the baseline of real time that represents one in my table above, based upon the most congested server. Proportional is an optimal result in sorting, too. Since the level of congestion on each server would be a few quantities of fixed length, I would broadcast that between periods. If this were to go as far as shared files, then it might be advisable to rotate periods that lock files, accept file locks, distribute files, and gauge congestion.
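The total in the table above can be checked with a short sketch. It assumes, as in the comment, that period p has length 2**(p - 1) in units of the first period and that the node count doubles every period; the function name is hypothetical.

```python
def broadcast_time(periods):
    """Return (period lengths, total time, nodes reached) when each period doubles in length."""
    lengths = [2 ** p for p in range(periods)]  # 1, 2, 4, ...
    total = sum(lengths)                        # geometric series: 2**periods - 1
    nodes = 2 ** periods
    return lengths, total, nodes


if __name__ == "__main__":
    lengths, total, nodes = broadcast_time(3)
    print(lengths, total, nodes)
```

Under these assumptions the total is always one less than the node count (1 + 2 + 4 = 7 for 8 nodes), so a total of seven rather than eight is what the series gives, and the time still grows in proportion to the number of nodes.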
I've told people that once you've stated your assumptions and shown how your math works, that it isn't always necessary to follow WP:NOR. (That's less true in Physics, less true yet in Chemistry, and it's downright bonkers in Biochemistry). I think it needs peer review, though, which is why I'm honoured to be conversing with a professor, however much trouble he might be having with encyclopedic standards that really do encourage you to link your stuff to pre-requisites. I'm thinking I should touch up this article with the graphics, and give Ken some time to adjust the words, assuming he cannot find a way to beat proportional. BrewJay ( talk) 14:00, 23 October 2008 (UTC)
I came here searching for "worst protocol ever" (half-expected a redir to SOAP). The search function seems to have understood "worst article ever". Maybe I mistyped. — Preceding unsigned comment added by 194.228.11.192 ( talk) 06:03, 2 September 2016 (UTC)
The lead right now is a metaphor sandwich, with no technical depth. Where's the beef? — MaxEnt 18:41, 7 January 2018 (UTC)
As a networking professional, I can say categorically that the claim in the first paragraph is quite dubious.
To say, "A gossip protocol is a procedure or process of computer–computer communication that is based on the way social networks disseminate information" is false. Social networks spread information based on selective and discriminative sharing. By this, I mean that parties that are interested in certain communications are not always privy to them and those that are not interested are often included in the information that they have already seen or just don't care about. Social media sharing and peer-to-peer networks may have similarities but not much else. The claim that "Modern distributed systems often use gossip protocols to solve problems that might be difficult to solve in other ways, either because the underlying network has an inconvenient structure, is extremely large, or because gossip solutions are the most efficient ones available" is also false. In 99% of the cases, a gossip protocol is the most wasteful and problematic way of achieving necessary communications. In a nutshell, the claims made in this article are rather biased in favor of 'gossip' protocols and fail to reflect the real life problems encountered with them.
Also worth mentioning, this is really a variant of peer-to-peer networking. Rarely will anyone in software development talk about their "gossip deployment". Rather, the mode by which data dissemination is obtained is discussed as peer-to-peer or client-server networking design. Most notable peer-to-peer networks (BitTorrent, Skype, Bitcoin, eDonkey) use trackers (or variants) and peer coordinators to ensure that data is properly available and so that users are directed to the most reliable and performant sources for obtaining data. When Quality-Of-Service is important, peer-to-peer networks are rarely ever chosen.
It would be great if someone could incorporate these issues/concerns/criticisms of this into the actual article.
Almost an unlimited number of "Gossip" variants could be mentioned. They are not WP material unless supported by current use and/or citations.
In that vein, I have also removed "Spacial Gossip". Although explained, there were no examples given of current use and also no valid references. See 'Unsupported Claims' section for a similar issue. — Preceding unsigned comment added by 50.68.57.160 ( talk) 16:50, 26 November 2018 (UTC)
Although I haven't changed it, the heading Gossip Protocol Types also states things unsupported by the citations provided. For example, the opening sentence citation actually supports "information spreading" and "information aggregation" as types. Yet, strangely, the section includes "Anti-entropy protocols" with a link to computer error correction seemingly unrelated. — Preceding unsigned comment added by 50.68.57.160 ( talk) 16:57, 26 November 2018 (UTC)
... Which I have also now trimmed and probably needs further refinement. — Preceding unsigned comment added by 50.68.57.160 ( talk) 17:00, 26 November 2018 (UTC)