# Graph Theory Problem

I came across a graph theory problem at work this week, and decided to blog about it to help me sort it out. At first I didn’t think of it as a graph problem, but that’s how these problems often go, shrouded in disguise and all. But I digress.

The problem is this. Considering the following directed graph, which nodes are the most senior ancestors (speaking in tree terms, of course)?

The answer is that nodes **A** and **B** are the most senior, meaning they have the most number of descendants in their sub-trees. This becomes obvious if we draw the same graph this way:

I’m trying to discover an algorithm to identify the senior most nodes given any directed graph. The graph is given as an edge list, like this:

- A -> B
- A -> F
- A -> E
- E -> H
- E -> G
- B -> A (notice the cycle here back to A)
- B -> D
- B -> C
- H -> I

So there’s the problem. I’m looking for an algorithm that will find the top-most nodes given **any** directed graph. In the case where there is only a single top-most node, it’s easy, but that’s just one case, and I want a general solution. Also, the graph need not be fully reachable from any node. It could be two or more non-connected graphs.

If you have an idea for an elegant solution, please post it in the comments below. I’ll post my own solution next week.

Happy graphing!

Edit:

After thinking about the problem a bit more, I thought I’d add another example with some corner cases that my solution will need to consider. If you look below, you’ll see a graph with a more complicated cycle and a non-contiguous section. The nodes I expect the algorithm to discover are shown in green:

Here are the edge pairs:

- A -> B
- A -> E
- A -> F
- E -> H
- E -> G
- B -> A
- B -> C
- C -> B
- C -> D
- I -> J
- I -> K
- H -> L