Pure functional breadth-first search

I’ve just read the initial paragraph of a long nice article about tree transversal in F#

The starting code is still OO reminiscent.

```let bfs idof fanout node =
let queue = Queue([node])
let visited = HashSet()
// DSL
let enqueue = queue.Enqueue
let dequeue = queue.Dequeue
let empty () = queue.Count = 0
let mark = idof >> visited.Add >> ignore
let test = idof >> visited.Contains >> not
// algorithm
seq {
while not (empty ()) do
let current = dequeue ()
mark current
yield current
current |> fanout |> Seq.filter test |> Seq.iter enqueue
}
```

First wonderful thing in F#: passing from a closure (`tree` implicit inside `fanout node`) to an inversion of control simply by juxtaposing (`tree` after `fanout` everywhere): no distinction between data or function, both first class citizens!
Now, for the sake of pure functional programming, we can use a recursive type (or fix point for haskellers).

```let rec bfs2 (fanout: Map<'node, 'node seq> -> 'node -> 'node seq) (tree: Map<'node, 'node seq>) (node: 'node) : 'node seq =
let single = seq [node]
match fanout tree node with
| e when e = Seq.empty -> single
| s -> Seq.fold (fun acc item ->
bfs2 fanout tree item
|> Seq.append acc) single s
```

Finally we just need to add stack overflow safety.

```let rec bfs2robust (fanout: Map<'node, 'node seq> -> 'node -> 'node seq) (tree: Map<'node, 'node seq>) (node: 'node) : 'node seq =
let single = seq [node]
match fanout tree node with
| e when e = Seq.empty -> single
| s -> Seq.fold (fun acc item ->
bfs2robust fanout (tree |> Map.remove node) item
|> Seq.append acc) single s |> Seq.distinct
```

4 thoughts on “Pure functional breadth-first search”

1. @GiulioHome, this is both nice and true, and I don’t have objections on this post per se, but somehow, in more many ways, it allows me to bring a concern of mine that, as a solution architect, is a concern that has to be considered on just about any distributed solution: caching could affect our design decisions that pertain to the object model and the algorithms.
The point is that in just about any given distributed solution, caching should be analyzed and evaluated with proper care. But then again, is it really that we need object models and algorithms that are dependent on the path?, or, on the contrary, it is OK to stick to object models and algorithms that work just like any State Function, considering the precise and rather narrow definition of Stat Function in Physics?.
The real and complete answer depends on both functional reqs and quality attributes, but no matter what the answer is, it should be properly evaluated.
Path-dependent or State-dependent affect and condition in rather different ways they way we use and configure caching.
Kind regards, GEN

Liked by 1 person

• Well, let’s start from the code and then move to physics. There is a “state” in a tree transverse as a matter of fact: very simple example could be looking for pdf files under a root folder and all subdirs, etc… The fact that the filesystem and the path is “distributed” is not clearly stated, we can vaguely say that this will be I/O at the boundary of our architecture, so… yes and no.
Finally, from a physics’ point of view, you can discuss path in Riemann topology or in Lorentz group representation theory or in fancy de Sitter space with Euler characteristic obstruction… But – put it in simple humble words – it is always the math search of the “optimum transport” 🙂 just to mention the typical approach of a new, young, Italian fields medallist, often travelling around the globe, Alessio Figalli.

Like

• Another example regarding the universal notion of “path”, i.e. a connection. You can define the concept of “connection” in GR and you can apply a type of Galois connection in static program analysis to check array out of bounds… and so on and so forth. The architect of a solution, just like a physics researcher, must be willing to learn new patterns and to join distant dots!

Btw, your comments about caching appear quite unrelated here. Notice that I have another older article focused on such topic and about async performance in C#.

Like