Thoughtlets, Music, and Code from Noah Thorp

Directed Graphs in Rails (Self Refering Many-To-Many Joins)

October 22nd 2007

Here is a simple example of a complex subject. An uncluttered implementation of a directed graph implemented in Rails (adaptable to other Ruby ActiveRecord projects).

Let the directed graph consist of nodes and links. Links join nodes together and are directed from a node to another node. Links also have a value associated with them.

I should mention that in standard digraph terminology nodes are called vertices, and edges are called links. I chose the terms nodes and links as I think it makes for easier reading.

Build The Model

Create the tables like this:

$ script/generate model node name:string
$ script/generate model link from_id:integer to_id:integer value:integer
$ rake db:migrate

The migration table definitions should look like this (surrounded by the usual migration code):

create_table :nodes do |t|
    t.column :name, :string
end

create_table :links do |t|
    t.column :from_id, :integer
    t.column :to_id, :integer
    t.column :value, :integer
end

Create the Node and Link models like this:

class Link < ActiveRecord::Base
    belongs_to :from_node, :foreign_key => "from_id", :class_name => "Node" 
    belongs_to :to_node, :foreign_key => "to_id",   :class_name => "Node" 
end

class Node < ActiveRecord::Base
    has_many :to_links, :foreign_key => 'from_id', :class_name => 'Link' #tricky!
    has_many :to_nodes, :through => :to_links  

    has_many :from_links, :foreign_key => 'to_id', :class_name => 'Link' #tricky!
    has_many :from_nodes, :through => :from_links
end

What’s Going On In The Model?

Did you see the lines labeled “tricky!” in the Node model. Why does has_many :to_links reference :foreign_key => “from_id”? Well, lets check it out.

Here is a diagram I created to explain directed relationships from one node to another node and how they are defined in the model. (Whispered aside: if you are feeling frisky and you want to compute the nodes that go to a given node rather than from it, you will have to traverse the diagram in reverse.)

Essentially we traverse in this order:

  1. From the :from_node via the from_id to the :to_link
  2. From the :to_link via the to_id to the :to_node

Here’s the corresponding code from the Node model that makes this hapen.

1) Traverse from the :from_node via the from_id to the :to_link

has_many :to_links, :foreign_key => 'from_id', :class_name => 'Link' 

2) Traverse :to_link via the to_id to the :to_node

has_many :to_nodes, :through => :to_links

“Where are the :from_node and :to_node terms defined?” you may ask. They are defined in the Link model along with the belongs_to statements.

belongs_to :from_node, :foreign_key => "from_id", :class_name => "Node" 
belongs_to :to_node, :foreign_key => "to_id",   :class_name => "Node"

SQL Nitty Gritty

Lets disect what all of this means in terms of SQL. If you are ready to get right to it, you can skip this section and go straight to playing with the model.

In SQL selecting nodes linked from node “a” to node “b” would look something like the following. Notice that I alias the node table as both from_node and to_node (the node table plays two roles).

SELECT to_nodes.name 
FROM nodes from_nodes, nodes to_nodes, links
WHERE from_nodes.id = links.from_id
AND to_nodes.id = links.to_id
AND from_nodes.name = 'a'
AND to_nodes.name = 'b'

Looking at our diagram we see this bit of code from the Node model which defines the relationship from a Node to a Link:

has_many :to_links, :foreign_key => 'from_id', :class_name => 'Link' 

This corresponds to this sql statment:

WHERE from_nodes.id = links.from_id

Looking at our diagram once again we see this bit of code from the Node model which defines the relationship between from a Link to a Node:

has_many :to_nodes, :through => :to_links

This corresponds to this sql statement:

AND to_nodes.id = links.to_id

One last note about the naming of the model. Although the symbols do represent an non-arbitrary relationship, the names for :from_nodes, :to_nodes, :from_links, and :to_links are arbitrary. I have selected a naming scheme that I believe reveals the structure of the relationships, but if you find something more suitable to your tastes ActiveRecord won’t complain.

Playing With The Model

Now that you have grokked that, lets try out the model. Here’s the diagram of the digraph that we are about to create:

Open up your rails app console:

$ script/console

Create some nodes and connect them with links:

a = Node.create(:name => "a")
b = Node.create(:name => "b")
c = Node.create(:name => "c")
d = Node.create(:name => "d")
Link.create(:from_node => a, :to_node => b, :value=>7)
Link.create(:from_node => b, :to_node => c, :value=>14)
Link.create(:from_node => c, :to_node => d, :value=>21)
Link.create(:from_node => b, :to_node => d, :value=>28)
Link.create(:from_node => d, :to_node => a, :value=>35)

Now run some queries.

a links to b:

a.to_nodes[0].name # returns: b
a.to_nodes.first.name #an alternate, also returns b

b links to c and d:

b.to_nodes[0].name #returns c
b.to_nodes[1].name #returns d
b.to_nodes.each {|node| puts node.name} # returns: c, d

There is a link to a from d:

a.from_nodes[0].name # returns: d

One way to retrieve a node two links away:

a.to_nodes[0].to_nodes[0].name #returns c

Get the value associated with a link between two nodes:

a.to_links[0].value #returns 7

There you go. Now you know how to make directed graphs in Ruby on Rails. Go forth and digraphify.

One final note, this posting was inspired by Josh Susser’s post on the same topic. He has some other great topics on “has_many :through” relationships – in fact that’s even what his blog is called. I recommend reading his posts if this type of model manipulation strikes your fancy.