This
level-5 vital article is rated Start-class on Wikipedia's
content assessment scale. It is of interest to the following WikiProjects: | ||||||||||||||||||||||||||||||||||||||
|
Breadth first recursion now redirects here.
I have been looking at this and also depth-first search and the preliminary explanation has the same meaning. It says that it starts at the root node "or a node", and explores each neighboring node of the "root" fully without making a cycle until it finds the goal. Something needs to be done about this. I'm just a student trying to learn this so I don't think I can really help. Maybe I'll ask my professor to fix this.
I think that since BFS uses a Queue, "push" and "pop" are wrong (stacks,
LIFO): a queue uses "enqueue" and "dequeue". Edited. --
Folletto 16:00, 21 Jun 2005 (UTC)
"Since all nodes discovered so far have to be saved, the space complexity of breadth-first search is O(|V| + |E|)." I don't see it. Where does the |E| come from?
I agree. The runspace is O(|V|) in the worst case, since that will be the max size of the control queue. —Preceding unsigned comment added by 216.186.140.62 ( talk) 23:37, 5 March 2009 (UTC)
It appears to be assumed that the graphs being dealt with have a root node, but this is not stated explicitly. Presumably any node will do if the graph is undirected, but what happens for a directed graph? If the article deals only with undirected graphs then it might be helpful if this was made clear at the beginning. Autoplunger ( talk) 20:05, 30 January 2011 (UTC)
The pseudocode looks funny, and a little too close to an actual implementation. I have taken liberty to modify it a little.
You did a poor job in modifying it, language is unclear, look at psuedocode on page for topological sort for good reference. —Preceding unsigned comment added by 97.77.52.94 ( talk) 17:21, 22 November 2010 (UTC)
The pseudocode doesn't just look funny, it's wrong. "parent" is never used and nothing is searched for. It's a breadth-first traversal with an unnecessary extra variable.
Also, there's no mention whatsoever of cycles. — Preceding unsigned comment added by 187.163.213.135 ( talk) 21:22, 3 July 2016 (UTC)
To clarify: the marking of 'distance' (another unneeded variable) as INFINITE is apparently a check to make sure there are no cycles. But it relies upon there being a collection of unique graph nodes. This is just terrible pseudocode. — Preceding unsigned comment added by 187.163.213.135 ( talk) 21:35, 3 July 2016 (UTC)
@ Headlessplatter: Can you give a little more detail on your reversion of my pseudocode simplification? Just trying to understand your reasoning. As far as I can understand, adding to a set and setting a value on a node should be the same complexity, both in space and time, if anything it seems like the set should be slower (depending on implementation, of course). Crazy2be ( talk) 21:43, 6 August 2017 (UTC)
The code looks wrong 9 for all edges from v to w in G.adjacentEdges(v) do 10 if w is not labeled as discovered then 11 label w as discovered 12 w.parent := v This is going from v to w, but only modifying w, the last edge, repeatedly — Preceding unsigned comment added by 105.186.83.153 ( talk) 08:54, 5 February 2020 (UTC)
When BFS is used with a priority queue, it is a total algorithm to find the shortest path between any two nodes in a weighted graph. Clearly, therefore, finding the shortest path between two nodes in a weighted graph is an application of BFS.
Somebody deleted this application because they felt that this should be an application of Dijkstra's Algorithm instead. What is the logic here? Clearly it is an application of both DA and BFS. DA is an optimization of the backtracing step of BFS, so you don't have to use back-pointers to remember how you got to the destination. The distinction between BFS and DA therefore has little to do with whether or not the graph is weighted. Finally, BFS is much easier to remember than DA, so BFS is undoubtedly the more popular algorithm to find shortest paths in any graph (weighted or unweighted). Therefore this application should be listed.
The original formulation of BFS did not restrict the queue to be a FIFO queue. If you use BFS with a FIFO queue on a weighted graph then you're effectively discarding the edge weights. That's why you get a different result from the algorithm with a priority queue versus FIFO queue: you're changing the input. DFS is completely different because you get a result that you won't get merely by changing the inputs. BFS has been used with a priority queue in scheduling simulators long before Dijkstra's algorithm got its name. It is not limited to unweighted graphs!
Your question, "Where is the difference to Dijkstra's Algorithm?" gets to the heart of the matter. BFS with PQ is easy to remember, is immediately clear to anybody who has learned BFS, and is the oldest and most popular algorithm for finding all shortest paths in a weighted graph. By contrast, Dijkstra's Algorithm is obscure, known only in elite circles, difficult to memorize, and named after a professor. The differences between the algorithms have nothing to do with the problem they are solving, and everything to do with name-dropping elitism and obfuscation.
What you seem to be suggesting is that "Dijkstra's Algorithm solves the problem, therefore BFS with PQ does not". BFS and PQ long predate Dijkstra's Algorithm: they were used in the 1940s back in Alan Turing's days when lattice theory, AI, and scheduling systems were first developed for computers. So your statement should be turned around. What problem is Dijkstra's Algorithm solving that BFS and PQ did not already solve? If anything needs to be deleted, it's the Dijkstra's Algorithm entry, not the BFS entry.
BFS is what we call a blind search algorithm, cause it uses no more information than links between nodes. On the other hand are informed search algorithms, such as Best-first search. This last algorithm effectively uses a priority queue to handle the visiting nodes. So modifying BFS with such queue results in a different class of algorithm: BestFS. By the way this class of algorithm is still not optimal. You need to move forward to A* search algorithm (and in the process finding an admissible heuristic) to solve the "shortest" path in a weighted graph. So I'm with Regnaron in this matter. —Preceding unsigned comment added by 200.55.140.181 ( talk) 01:46, 16 June 2008 (UTC)
It is worth mentioning that when BFS is used in that manner, the neighbour list should be created such that north, east, south and west get priority over north-east, south-east, south-west and north-west. The reason for this is that BFS tends to start searching in a diagonal manner rather than adjacent, and the path found will not be the correct one. BFS should first search adjacent nodes, then diagonal nodes.
I think a better explanation or a reference is required here. As I understand it the resulting path will be zig-zagged if diagonal moves are prioritized over moves to adjacent grid point. The resulting path should still be correct.
In the sample implementation, why do we use an std::vector<int> to keep track of key-value pairs? The method we currently use wastes more space, runs slower (when attempting to recover path data later on), makes reading the code less clear, and is less versatile (e.g., in expanding the function for a situation where graph size is unknown beforehand) than the much more elegant solution for such an associative array—an std::map<int,int>. I will change the code accordingly if no reasonable objections appear within a week or so. -- 76.189.119.174 00:41, 10 August 2007 (UTC)
The article is vague about stating the problem that BFS is supposed to solve. Is the goal to search for a path to a given vertex, or to traverse all vertices? The distinction is important for instance if the graph is not connected. (The link to "graph search algorithms" actually leads to a page about traversal!) -- Mbetter 08:10, 17 August 2007 (UTC)
The sections on complexity are slightly inconsistent and worded incorrectly. I am changing them to correct the reason for the time and space complexity and to reflect the common notation. Jludwig 18:26, 2 October 2007 (UTC)
The section on complexity (and the given complexity) is deceiving to the reader and doesn't obey usual conventions outside of this particular algorithm. If the graph has multiple connected components (which could cause O(|E| + |V|) to not be equivalent to O(|E|)), the given code will fail to explore all the nodes and thereby run in time O(|E|) for the subgraph it does explore. If the graph has a single connected component, the code will run in O(|E|) which is equivalent in this case to O(|E| + |V|). This is true regardless of what the convention in CLRS is. To illustrate my point, name any other example of an algorithm running in O(n^2 + n) time being the standard convention over O(n^2). 12.32.116.227 ( talk) 20:56, 3 August 2018 (UTC)
I think the Python implementation should store visited nodes in a set rather than a list, because testing membership in a set is O(1) rather than O(n) with the list. —Preceding unsigned comment added by 65.28.233.233 ( talk) 08:30, 16 June 2008 (UTC)
I believe that the Python implementation is still not correct because pop(0) on a list is linear time. In the worst case, the length of the queue is O(|V|), which makes the complexity of this implementation O(|V|^2). Also, checking to see whether the child is already in the queue is currently O(n) and needs to be O(1).
I will wait a few days to see if anyone contradicts me; otherwise I will go ahead and fix it (or at least make a note of the problem).-- AllenDowney
It seems to be wrong. "Otherwise enqueue any successors (the direct child nodes) that have not yet been examined." But successors might have been already enqueued earlier (and not examined yet). So we should color them grey when enqueue, and enqueue only white nodes. Poemich ( talk) 11:41, 3 May 2009 (UTC) kij —Preceding unsigned comment added by 164.78.252.57 ( talk) 08:58, 4 August 2009 (UTC)
Merging the Implementation in to the main document would make more sense. And avoid deletion.
See Wikipedia:Articles for deletion/Breadth-first search implementation. As usual for AfD discussions, please leave a detailed comment explaining your opinion rather than just a keep or delete without an explanation. — David Eppstein ( talk) 21:59, 12 May 2009 (UTC)
<harry_ftw> afaik, there's no such thing as a "breath first search"
What in the world is "completeness" (and "proof of completeness")? Is "completeness" a well-established term in some field? -- Robin ( talk) 04:14, 17 November 2009 (UTC)
Would be great with an example showing what happens with Connected_component_(graph_theory) -- Andreas ( talk) 11:40, 18 March 2010 (CET)
When checking whether to enqueue a new node, the posted version appears to be check if the node is already in the queue by comparing it with every other node in the queue (an O(n) operation). This implementation doesn't have the same characteristics or behavior as the Java implementation, or the description in Cormen, Rivest. Could someone who knows C# fix this? 128.101.35.83 ( talk) 20:14, 26 April 2010 (UTC)
We only need one implementation. As it is, both languages conflate language features with explanation of the algorithm. We should either have only a pseudocode implementation, or maybe Python or some other reads-mostly-like-pseudocode language if it can be made sufficiently close to pseudocode. The fact that java.util.LinkedList
is in the example suggests to me that the example says more about Java than about BFS.
—Ben FrantzDale (
talk)
12:53, 11 May 2010 (UTC)
Doesn't this algorithm for testing bipartiteness assume that every node is reachable from the start node? I don't think it will work correctly for graphs that aren't connected. — Preceding unsigned comment added by 164.107.120.19 ( talk) 20:00, 5 April 2013 (UTC)
I concur. The pseudo-code that's given is specific to BFS for a tree graph. It doesn't work correctly for graphs with multiple components, and doesn't agree with the text. Frankly, this article is a hot mess. I'll edit this when I get home and can refer to Cormen et al.
163.173.9.3 (
talk)
09:37, 29 May 2013 (UTC)
The use of a colour attribute which is WHITE, GREY or BLACK, which I assume originated with the Corman et al. algorithms book, is useful for proving things about the algorithm. But it isn't used for anything in this article and it isn't needed for correctness of the algorithm. The pseudocode is still valid if "if adj.color == WHITE" is replaced by "if adj.distance == INFINITY" and then all statements involving colour are removed. So at the moment it only serves to complicate something simple. Who will object if I make this simplification? McKay ( talk) 04:35, 28 July 2015 (UTC)
Felipe Sobreira Abrahão ( talk · contribs) Could you please explain your edit [1] ? I don't understand what your infinity means. And especially, why you deleted the condition about minimality ? Arthur MILCHIOR ( talk) 18:32, 23 March 2017 (UTC)
"Let be an enumeration of the vertices of . The enumeration is said to be a BFS ordering (with source ) if, for all , is the vertex such that Equivalently, is a BFS ordering if, for all with , there exists a neighbor of such that ."
doesn't hold for . As a counter-example, one may take , in the example presented in The breadth-first search algorithm." Equivalently, is a BFS ordering if, for all with , there exists a neighbor of such that . "
Some IP editor is repeatedly trying to replace the pseudocode with (modulo typos and syntax) the following simplified algorithm:
def BFS(G,start):
V = set()
Q = queue()
Q.enqueue(start)
while Q:
v = Q.dequeue()
V.add(v)
for w in Gv]:
if w not in V:
Q.enqueue(w)
This is a bad algorithm. Because it does not check whether a vertex w has already been enqueued before enqueueing it again, it ends up adding each vertex multiple times, where the number of times is equal to the number of shortest paths from the start to the vertex. This could end up being exponential in the size of the input graph. The current pseudocode could probably be significantly simplified, but this is not the way to do it. — David Eppstein ( talk) 19:22, 20 December 2017 (UTC)
Great catch, David! What are your thoughts of adding an additional set to test queue membership to this style of pseudocode; I think it is much easier to follow than the current code.
def BFS(G, start):
visited = set()
prev_queued = set()
Q = queue()
Q.enqueue(start)
prev_queued.add(start)
while not Q.empty():
v = Q.dequeue()
visited.add(v)
for w in Gv]:
if w not in visited and w not in prev_queued:
Q.enqueue(w)
prev_queued.add(w)
Themichaelyang ( talk) 00:40, 4 October 2018 (UTC)
It's an improvement over the current mess, definitely, but are you really using visited? It seems that the test for w not in visited and w not in prev_queued always gives exactly the same results as if you merely tested prev_queued. And in case you need them after the end of the algorithm, the two sets are exactly the same at that point. So wouldn't it be simpler as:
def BFS(G, start):
queued = {start}
Q = queue(queued)
while not Q.empty():
v = Q.dequeue()
for w in Gv]:
if w not in queued:
Q.enqueue(w)
queued.add(w)
— David Eppstein ( talk) 00:54, 4 October 2018 (UTC)
Whoops, another great catch! I hadn't even noticed that, but it seems obvious now. I'll update the pseudocode section with this snippet and a remarks about the incorrect implementation next time I get a chance. Michael Yang ( talk) 19:58, 4 October 2018 (UTC)
The code has closed_set to remember which nodes have been dequeued, but if vertices are put into closed_set when they are enqueued instead of when they come off the queue, the two tests "if child in closed_set: continue" and "if child not in open_set" can be replaced by "if child not in closed_set". That would also fix an existing bug when the graph has a loop (while a node is having its neighbours processed it is in neither open_set nor closed_set so it will be enqueued again). McKay ( talk) 06:23, 16 April 2018 (UTC)
Revisiting this page after a while, and now the pseudocode section contains a very confusing snippet of "Python" code (in actuality, it is not valid Python code). The new code also includes a very strange `problem` object parameter, which makes very little sense. Why not separate the inputs into actual parameters? I think for clarity purposes, the original pseudocode was much easier to follow, and made less assertions about the syntax and style of a particular implementation. Themichaelyang ( talk) 00:01, 4 October 2018 (UTC)
I agree, I think things were much better a couple years ago ( https://en.wikipedia.org/?title=Breadth-first_search&diff=prev&oldid=789811794 was the last time I tried to edit the psuedocode). The code now is just inscrutable, and is badly written for an actual implementation, let alone psuedocode. This algorithm isn't hard, the psuedocode should be 10 lines, not 50! Seeing how other people agree with this, I'm thinking of just reverting the whole section to what it was then. Thoughts? Crazy2be ( talk) 15:35, 22 March 2019 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
The Q.enqueue line at the bottom of the non-recursive algorithm given should be an S.enqueue since the queue variable was named S at the top of the algorithm. 2601:603:1100:D4EF:E868:E527:B6D4:2CBB ( talk) 18:51, 26 March 2019 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
The current pseudocode doesn't mark the starting vertex as discovered. I think it needs to be added.
1 procedure BFS(G,start_v): 2 let S be a queue --> label start_v as discovered 3 S.enqueue(start_v) 4 while S is not empty 5 v = S.dequeue() 6 if v is the goal: 7 return v 8 for all edges from v to w in G.adjacentEdges(v) do 9 if w is not labeled as discovered: 10 label w as discovered 11 w.parent = v 12 S.enqueue(w)
Pcmx ( talk) 09:35, 7 May 2019 (UTC)
If you use the implementation from the DFS wikipedia article, you can simply use a Stack for DFS, queue for BFS, and it will work depending on which implementation you use.
1 procedure DFS-iterative(G,v): 2 let S be a stack / queue for BFS 3 S.push(v) 4 while S is not empty 5 v = S.pop() 6 if v is not labeled as discovered: 7 label v as discovered 8 for all edges from v to w in G.adjacentEdges(v) do 9 S.push(w)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
CHANGE: The parent attribute of each vertex is useful for accessing the nodes in a shortest path TO: The parent attribute of each node is useful for accessing the nodes in a shortest path
WHERE: Psuedocode -> More details -> Line 7
REASON: This change would improve the clarity since even though `node` and `vertex` are interchangeable it is preferable to use only one during a sentence.
Comment on not done: Despite the words being interchangeable it is still a mid-sentence inconsistency which is not preferable. Regardless, it is a rather small change which would increase the quality and clarity of that section.
LachieRussell ( talk) 03:13, 1 September 2019 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
It uses the opposite strategy as depth-first search, which instead explores the highest-depth nodes first before being forced to backtrack and expand shallower nodes.
It uses the opposite strategy as depth-first search, which instead explores the node branch as far as possible before being forced to backtrack and expand other nodes.
Done ~ Kvng ( talk) 15:37, 22 October 2019 (UTC)
It seems that through several modifications of this article, a problem has slipped in that can confuse readers: The pseudocode for breadth-first search does not set the parent nodes of discovered nodes whereas the surrounding text talks about the usefulness of the parent nodes and how they can be used to trace a shortest path back to the root.
I propose to either add the corresponding lines to the pseudocode (one additional line would set the parent of the root to null/undefined in the initialization stage before or after line 3, and another line would set the parent of a discovered node before or after line 11) or to remove any mentions of the parent nodes. I vote for adding the lines to the pseudocode but I can understand if they are left out for the sake of simplicity.
This
level-5 vital article is rated Start-class on Wikipedia's
content assessment scale. It is of interest to the following WikiProjects: | ||||||||||||||||||||||||||||||||||||||
|
Breadth first recursion now redirects here.
I have been looking at this and also depth-first search and the preliminary explanation has the same meaning. It says that it starts at the root node "or a node", and explores each neighboring node of the "root" fully without making a cycle until it finds the goal. Something needs to be done about this. I'm just a student trying to learn this so I don't think I can really help. Maybe I'll ask my professor to fix this.
I think that since BFS uses a Queue, "push" and "pop" are wrong (stacks,
LIFO): a queue uses "enqueue" and "dequeue". Edited. --
Folletto 16:00, 21 Jun 2005 (UTC)
"Since all nodes discovered so far have to be saved, the space complexity of breadth-first search is O(|V| + |E|)." I don't see it. Where does the |E| come from?
I agree. The runspace is O(|V|) in the worst case, since that will be the max size of the control queue. —Preceding unsigned comment added by 216.186.140.62 ( talk) 23:37, 5 March 2009 (UTC)
It appears to be assumed that the graphs being dealt with have a root node, but this is not stated explicitly. Presumably any node will do if the graph is undirected, but what happens for a directed graph? If the article deals only with undirected graphs then it might be helpful if this was made clear at the beginning. Autoplunger ( talk) 20:05, 30 January 2011 (UTC)
The pseudocode looks funny, and a little too close to an actual implementation. I have taken liberty to modify it a little.
You did a poor job in modifying it, language is unclear, look at psuedocode on page for topological sort for good reference. —Preceding unsigned comment added by 97.77.52.94 ( talk) 17:21, 22 November 2010 (UTC)
The pseudocode doesn't just look funny, it's wrong. "parent" is never used and nothing is searched for. It's a breadth-first traversal with an unnecessary extra variable.
Also, there's no mention whatsoever of cycles. — Preceding unsigned comment added by 187.163.213.135 ( talk) 21:22, 3 July 2016 (UTC)
To clarify: the marking of 'distance' (another unneeded variable) as INFINITE is apparently a check to make sure there are no cycles. But it relies upon there being a collection of unique graph nodes. This is just terrible pseudocode. — Preceding unsigned comment added by 187.163.213.135 ( talk) 21:35, 3 July 2016 (UTC)
@ Headlessplatter: Can you give a little more detail on your reversion of my pseudocode simplification? Just trying to understand your reasoning. As far as I can understand, adding to a set and setting a value on a node should be the same complexity, both in space and time, if anything it seems like the set should be slower (depending on implementation, of course). Crazy2be ( talk) 21:43, 6 August 2017 (UTC)
The code looks wrong 9 for all edges from v to w in G.adjacentEdges(v) do 10 if w is not labeled as discovered then 11 label w as discovered 12 w.parent := v This is going from v to w, but only modifying w, the last edge, repeatedly — Preceding unsigned comment added by 105.186.83.153 ( talk) 08:54, 5 February 2020 (UTC)
When BFS is used with a priority queue, it is a total algorithm to find the shortest path between any two nodes in a weighted graph. Clearly, therefore, finding the shortest path between two nodes in a weighted graph is an application of BFS.
Somebody deleted this application because they felt that this should be an application of Dijkstra's Algorithm instead. What is the logic here? Clearly it is an application of both DA and BFS. DA is an optimization of the backtracing step of BFS, so you don't have to use back-pointers to remember how you got to the destination. The distinction between BFS and DA therefore has little to do with whether or not the graph is weighted. Finally, BFS is much easier to remember than DA, so BFS is undoubtedly the more popular algorithm to find shortest paths in any graph (weighted or unweighted). Therefore this application should be listed.
The original formulation of BFS did not restrict the queue to be a FIFO queue. If you use BFS with a FIFO queue on a weighted graph then you're effectively discarding the edge weights. That's why you get a different result from the algorithm with a priority queue versus FIFO queue: you're changing the input. DFS is completely different because you get a result that you won't get merely by changing the inputs. BFS has been used with a priority queue in scheduling simulators long before Dijkstra's algorithm got its name. It is not limited to unweighted graphs!
Your question, "Where is the difference to Dijkstra's Algorithm?" gets to the heart of the matter. BFS with PQ is easy to remember, is immediately clear to anybody who has learned BFS, and is the oldest and most popular algorithm for finding all shortest paths in a weighted graph. By contrast, Dijkstra's Algorithm is obscure, known only in elite circles, difficult to memorize, and named after a professor. The differences between the algorithms have nothing to do with the problem they are solving, and everything to do with name-dropping elitism and obfuscation.
What you seem to be suggesting is that "Dijkstra's Algorithm solves the problem, therefore BFS with PQ does not". BFS and PQ long predate Dijkstra's Algorithm: they were used in the 1940s back in Alan Turing's days when lattice theory, AI, and scheduling systems were first developed for computers. So your statement should be turned around. What problem is Dijkstra's Algorithm solving that BFS and PQ did not already solve? If anything needs to be deleted, it's the Dijkstra's Algorithm entry, not the BFS entry.
BFS is what we call a blind search algorithm, cause it uses no more information than links between nodes. On the other hand are informed search algorithms, such as Best-first search. This last algorithm effectively uses a priority queue to handle the visiting nodes. So modifying BFS with such queue results in a different class of algorithm: BestFS. By the way this class of algorithm is still not optimal. You need to move forward to A* search algorithm (and in the process finding an admissible heuristic) to solve the "shortest" path in a weighted graph. So I'm with Regnaron in this matter. —Preceding unsigned comment added by 200.55.140.181 ( talk) 01:46, 16 June 2008 (UTC)
It is worth mentioning that when BFS is used in that manner, the neighbour list should be created such that north, east, south and west get priority over north-east, south-east, south-west and north-west. The reason for this is that BFS tends to start searching in a diagonal manner rather than adjacent, and the path found will not be the correct one. BFS should first search adjacent nodes, then diagonal nodes.
I think a better explanation or a reference is required here. As I understand it the resulting path will be zig-zagged if diagonal moves are prioritized over moves to adjacent grid point. The resulting path should still be correct.
In the sample implementation, why do we use an std::vector<int> to keep track of key-value pairs? The method we currently use wastes more space, runs slower (when attempting to recover path data later on), makes reading the code less clear, and is less versatile (e.g., in expanding the function for a situation where graph size is unknown beforehand) than the much more elegant solution for such an associative array—an std::map<int,int>. I will change the code accordingly if no reasonable objections appear within a week or so. -- 76.189.119.174 00:41, 10 August 2007 (UTC)
The article is vague about stating the problem that BFS is supposed to solve. Is the goal to search for a path to a given vertex, or to traverse all vertices? The distinction is important for instance if the graph is not connected. (The link to "graph search algorithms" actually leads to a page about traversal!) -- Mbetter 08:10, 17 August 2007 (UTC)
The sections on complexity are slightly inconsistent and worded incorrectly. I am changing them to correct the reason for the time and space complexity and to reflect the common notation. Jludwig 18:26, 2 October 2007 (UTC)
The section on complexity (and the given complexity) is deceiving to the reader and doesn't obey usual conventions outside of this particular algorithm. If the graph has multiple connected components (which could cause O(|E| + |V|) to not be equivalent to O(|E|)), the given code will fail to explore all the nodes and thereby run in time O(|E|) for the subgraph it does explore. If the graph has a single connected component, the code will run in O(|E|) which is equivalent in this case to O(|E| + |V|). This is true regardless of what the convention in CLRS is. To illustrate my point, name any other example of an algorithm running in O(n^2 + n) time being the standard convention over O(n^2). 12.32.116.227 ( talk) 20:56, 3 August 2018 (UTC)
I think the Python implementation should store visited nodes in a set rather than a list, because testing membership in a set is O(1) rather than O(n) with the list. —Preceding unsigned comment added by 65.28.233.233 ( talk) 08:30, 16 June 2008 (UTC)
I believe that the Python implementation is still not correct because pop(0) on a list is linear time. In the worst case, the length of the queue is O(|V|), which makes the complexity of this implementation O(|V|^2). Also, checking to see whether the child is already in the queue is currently O(n) and needs to be O(1).
I will wait a few days to see if anyone contradicts me; otherwise I will go ahead and fix it (or at least make a note of the problem).-- AllenDowney
It seems to be wrong. "Otherwise enqueue any successors (the direct child nodes) that have not yet been examined." But successors might have been already enqueued earlier (and not examined yet). So we should color them grey when enqueue, and enqueue only white nodes. Poemich ( talk) 11:41, 3 May 2009 (UTC) kij —Preceding unsigned comment added by 164.78.252.57 ( talk) 08:58, 4 August 2009 (UTC)
Merging the Implementation in to the main document would make more sense. And avoid deletion.
See Wikipedia:Articles for deletion/Breadth-first search implementation. As usual for AfD discussions, please leave a detailed comment explaining your opinion rather than just a keep or delete without an explanation. — David Eppstein ( talk) 21:59, 12 May 2009 (UTC)
<harry_ftw> afaik, there's no such thing as a "breath first search"
What in the world is "completeness" (and "proof of completeness")? Is "completeness" a well-established term in some field? -- Robin ( talk) 04:14, 17 November 2009 (UTC)
Would be great with an example showing what happens with Connected_component_(graph_theory) -- Andreas ( talk) 11:40, 18 March 2010 (CET)
When checking whether to enqueue a new node, the posted version appears to be check if the node is already in the queue by comparing it with every other node in the queue (an O(n) operation). This implementation doesn't have the same characteristics or behavior as the Java implementation, or the description in Cormen, Rivest. Could someone who knows C# fix this? 128.101.35.83 ( talk) 20:14, 26 April 2010 (UTC)
We only need one implementation. As it is, both languages conflate language features with explanation of the algorithm. We should either have only a pseudocode implementation, or maybe Python or some other reads-mostly-like-pseudocode language if it can be made sufficiently close to pseudocode. The fact that java.util.LinkedList
is in the example suggests to me that the example says more about Java than about BFS.
—Ben FrantzDale (
talk)
12:53, 11 May 2010 (UTC)
Doesn't this algorithm for testing bipartiteness assume that every node is reachable from the start node? I don't think it will work correctly for graphs that aren't connected. — Preceding unsigned comment added by 164.107.120.19 ( talk) 20:00, 5 April 2013 (UTC)
I concur. The pseudo-code that's given is specific to BFS for a tree graph. It doesn't work correctly for graphs with multiple components, and doesn't agree with the text. Frankly, this article is a hot mess. I'll edit this when I get home and can refer to Cormen et al.
163.173.9.3 (
talk)
09:37, 29 May 2013 (UTC)
The use of a colour attribute which is WHITE, GREY or BLACK, which I assume originated with the Corman et al. algorithms book, is useful for proving things about the algorithm. But it isn't used for anything in this article and it isn't needed for correctness of the algorithm. The pseudocode is still valid if "if adj.color == WHITE" is replaced by "if adj.distance == INFINITY" and then all statements involving colour are removed. So at the moment it only serves to complicate something simple. Who will object if I make this simplification? McKay ( talk) 04:35, 28 July 2015 (UTC)
Felipe Sobreira Abrahão ( talk · contribs) Could you please explain your edit [1] ? I don't understand what your infinity means. And especially, why you deleted the condition about minimality ? Arthur MILCHIOR ( talk) 18:32, 23 March 2017 (UTC)
"Let be an enumeration of the vertices of . The enumeration is said to be a BFS ordering (with source ) if, for all , is the vertex such that Equivalently, is a BFS ordering if, for all with , there exists a neighbor of such that ."
doesn't hold for . As a counter-example, one may take , in the example presented in The breadth-first search algorithm." Equivalently, is a BFS ordering if, for all with , there exists a neighbor of such that . "
Some IP editor is repeatedly trying to replace the pseudocode with (modulo typos and syntax) the following simplified algorithm:
def BFS(G,start):
V = set()
Q = queue()
Q.enqueue(start)
while Q:
v = Q.dequeue()
V.add(v)
for w in Gv]:
if w not in V:
Q.enqueue(w)
This is a bad algorithm. Because it does not check whether a vertex w has already been enqueued before enqueueing it again, it ends up adding each vertex multiple times, where the number of times is equal to the number of shortest paths from the start to the vertex. This could end up being exponential in the size of the input graph. The current pseudocode could probably be significantly simplified, but this is not the way to do it. — David Eppstein ( talk) 19:22, 20 December 2017 (UTC)
Great catch, David! What are your thoughts of adding an additional set to test queue membership to this style of pseudocode; I think it is much easier to follow than the current code.
def BFS(G, start):
visited = set()
prev_queued = set()
Q = queue()
Q.enqueue(start)
prev_queued.add(start)
while not Q.empty():
v = Q.dequeue()
visited.add(v)
for w in Gv]:
if w not in visited and w not in prev_queued:
Q.enqueue(w)
prev_queued.add(w)
Themichaelyang ( talk) 00:40, 4 October 2018 (UTC)
It's an improvement over the current mess, definitely, but are you really using visited? It seems that the test for w not in visited and w not in prev_queued always gives exactly the same results as if you merely tested prev_queued. And in case you need them after the end of the algorithm, the two sets are exactly the same at that point. So wouldn't it be simpler as:
def BFS(G, start):
queued = {start}
Q = queue(queued)
while not Q.empty():
v = Q.dequeue()
for w in Gv]:
if w not in queued:
Q.enqueue(w)
queued.add(w)
— David Eppstein ( talk) 00:54, 4 October 2018 (UTC)
Whoops, another great catch! I hadn't even noticed that, but it seems obvious now. I'll update the pseudocode section with this snippet and a remarks about the incorrect implementation next time I get a chance. Michael Yang ( talk) 19:58, 4 October 2018 (UTC)
The code has closed_set to remember which nodes have been dequeued, but if vertices are put into closed_set when they are enqueued instead of when they come off the queue, the two tests "if child in closed_set: continue" and "if child not in open_set" can be replaced by "if child not in closed_set". That would also fix an existing bug when the graph has a loop (while a node is having its neighbours processed it is in neither open_set nor closed_set so it will be enqueued again). McKay ( talk) 06:23, 16 April 2018 (UTC)
Revisiting this page after a while, and now the pseudocode section contains a very confusing snippet of "Python" code (in actuality, it is not valid Python code). The new code also includes a very strange `problem` object parameter, which makes very little sense. Why not separate the inputs into actual parameters? I think for clarity purposes, the original pseudocode was much easier to follow, and made less assertions about the syntax and style of a particular implementation. Themichaelyang ( talk) 00:01, 4 October 2018 (UTC)
I agree, I think things were much better a couple years ago ( https://en.wikipedia.org/?title=Breadth-first_search&diff=prev&oldid=789811794 was the last time I tried to edit the psuedocode). The code now is just inscrutable, and is badly written for an actual implementation, let alone psuedocode. This algorithm isn't hard, the psuedocode should be 10 lines, not 50! Seeing how other people agree with this, I'm thinking of just reverting the whole section to what it was then. Thoughts? Crazy2be ( talk) 15:35, 22 March 2019 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
The Q.enqueue line at the bottom of the non-recursive algorithm given should be an S.enqueue since the queue variable was named S at the top of the algorithm. 2601:603:1100:D4EF:E868:E527:B6D4:2CBB ( talk) 18:51, 26 March 2019 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
The current pseudocode doesn't mark the starting vertex as discovered. I think it needs to be added.
1 procedure BFS(G,start_v): 2 let S be a queue --> label start_v as discovered 3 S.enqueue(start_v) 4 while S is not empty 5 v = S.dequeue() 6 if v is the goal: 7 return v 8 for all edges from v to w in G.adjacentEdges(v) do 9 if w is not labeled as discovered: 10 label w as discovered 11 w.parent = v 12 S.enqueue(w)
Pcmx ( talk) 09:35, 7 May 2019 (UTC)
If you use the implementation from the DFS wikipedia article, you can simply use a Stack for DFS, queue for BFS, and it will work depending on which implementation you use.
1 procedure DFS-iterative(G,v): 2 let S be a stack / queue for BFS 3 S.push(v) 4 while S is not empty 5 v = S.pop() 6 if v is not labeled as discovered: 7 label v as discovered 8 for all edges from v to w in G.adjacentEdges(v) do 9 S.push(w)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
CHANGE: The parent attribute of each vertex is useful for accessing the nodes in a shortest path TO: The parent attribute of each node is useful for accessing the nodes in a shortest path
WHERE: Psuedocode -> More details -> Line 7
REASON: This change would improve the clarity since even though `node` and `vertex` are interchangeable it is preferable to use only one during a sentence.
Comment on not done: Despite the words being interchangeable it is still a mid-sentence inconsistency which is not preferable. Regardless, it is a rather small change which would increase the quality and clarity of that section.
LachieRussell ( talk) 03:13, 1 September 2019 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
It uses the opposite strategy as depth-first search, which instead explores the highest-depth nodes first before being forced to backtrack and expand shallower nodes.
It uses the opposite strategy as depth-first search, which instead explores the node branch as far as possible before being forced to backtrack and expand other nodes.
Done ~ Kvng ( talk) 15:37, 22 October 2019 (UTC)
It seems that through several modifications of this article, a problem has slipped in that can confuse readers: The pseudocode for breadth-first search does not set the parent nodes of discovered nodes whereas the surrounding text talks about the usefulness of the parent nodes and how they can be used to trace a shortest path back to the root.
I propose to either add the corresponding lines to the pseudocode (one additional line would set the parent of the root to null/undefined in the initialization stage before or after line 3, and another line would set the parent of a discovered node before or after line 11) or to remove any mentions of the parent nodes. I vote for adding the lines to the pseudocode but I can understand if they are left out for the sake of simplicity.