# Dusa is a graph exploration language

Graph structures are pretty common in computer science, and Dusa is a pretty good language for working with graphs. A lot of problems that would be solved in an imperative or functional language by yet another implementation of depth-first search can be described in just a line or two of Dusa.

## Describing graphs

Consider the following graph, lovingly rendered with ASCII graphics:

We can describe the edges in this graph as a bunch of facts in Dusa:

We call `edge`

a predicate, and say that it takes two arguments or, equivalently, that
it defines a two-place relation. A fact `edge a b`

is also sometimes called a
**proposition**. Dusa deals in propositions.

We can use variables, which start with upper case letters, to talk about edges more
generically. Our illustration indicated that the graph was undirected, so we probably
want there to be an `edge b a`

as well as `edge a b`

. Instead of writing out eleven
more facts, we can say that for every edge, there’s an edge in the opposite direction.

As an if-then statement, we could write “for all `X`

and `Y`

, if `edge X Y`

then
`edge Y X`

.

Note that this if-then statement is written backwards from how it’s written in English:
the “then” part, the **conclusion** is written first, followed by the `:-`

symbol.
After the `:-`

symbol is **premises**. (There can be more than one premise. If there
aren’t any premises we’re just defining a fact, so we don’t write the `:-`

symbol at
all.)

## Graph reachability

Lots of problems boil down to needing to know all the points in a graph that are connected to a given node by one or more edges. For example, liveness analysis in a compiler can be treated as a graph reachability problem where nodes are lines of code.

A node in a graph is reachable from a starting node if it is that starting node, or if it’s connected to another reachable node by an edge. That logic can be captured in Dusa with two rules and a fact:

## Graph coloring

A graph is three-colorable if every node can be assigned to one of three different colors without giving two edge-connected node the same color.

### Functional propositions

Saying “every node can have one color: red, yellow, or blue” is Dusa’s specialty! That rule looks like this:

This rule means that if `node a`

is present in a solution, that solution must also
include one of the following: `color a is red`

, `color a is yellow`

, or
`color a is blue`

. Whereas `node _`

, `edge _ _`

, and `node _`

are **relational**
propositions, `color _ is _`

is a **functional** proposition: the attribute `color a`

can be assigned at most one value by a database, and any solution that derived both
`color a is red`

and `color a is blue`

would be rejected.

The premise `node X`

is important, because you can’t have free variables in the
conclusion of a rule that aren’t present in a premise. We can define the node relation
from the edge relation like this:

(Because we defined the edge relation to be symmetric, either one of these rules will do on its own, but nothing is wrong with both versions.)

### Forbid constraints

There are 12 nodes in our running example, and so there are 3^{12} = 531441
different ways to assign three colors to those 12 nodes. But to show that the graph
is three-colorable, we need to reject any solutions where two edge-connected nodes
have the same color. We can do this with a `#forbid`

constraint:

Say that prospective solution tries to give the color `red`

to both nodes `a`

and `b`

,
then Dusa can assign `X`

to `a`

, assign `Y`

to `b`

, and `Color`

to `red`

in the rule above. Because `edge a b`

exists, all the parts of the `#forbid`

constraint are satisfied, so the solution will be rejected.

### Making the graph un-colorable

Our example graph is three-colorable, but if we add two more edges:

then there are no solutions.

### Demand constraints

Constraints can work in the other direction as well. Because our graph has three
different connected components, and the nodes `a`

, `f`

, and `h`

are in different
connected components, the color `a`

doesn’t affect the color of `f`

and so on.
It might be desirable to say that these three nodes must all have the same color
in order to reduce the number of redundant solutions. We could enforce that
with a `#demand`

constraint:

This constraint says that concrete node values `a`

, `f`

, and `h`

all have to
have the same color, but because `Color`

is a variable, the constraint doesn’t
say anything about *what* that color has to be.

## Connected components

We can also use Dusa to calculate the connected components of a graph. A common way to describe connected components and equivalence classes is to assign a canonical representative for every node. Every element might be the representative of its connected component: an arbitrary element is chosen as a leader and then all connected nodes are assigned the same representative.

The fact that every element might be (but isn’t necessarily) the representative of its
connected component can be captured by an **open** rule.

This rule indicates that every node **may** be its own representative, but doesn’t force
that decision if another rule provides a contradictory assignment.

If only that rule is given, every node will be assigned as its own representative. If we add the reachability logic from our first example, it will force every node in a connected component to have the same representative:

There are many different solutions, but they only differ in which node from a connected component is arbitrarily chosen as the leader.