Thursday, September 25, 2008

Long overdue update

I have been meaning to put together a blog entry for a long time, but finding time has been difficult. We went to the Eastern Sierras, where I was chased (briefly) by a bear. Lehman has tanked, and AIG has been on the brink. Freddie and Fannie are gone. Our house still needs work. I finally got my green card, right in time to see the economy go south. Lightroom 2 came out. And there were more events besides that I would rather not mention here. Where do I begin?

About the only big thing that I haven't covered here is programming and work. So I'll focus on that. My lisp hacking of late has been pretty routine. In fact, all my hacking has been pretty routine. I've been developing the server portion of our software, and it has been going as well as can be expected. Try as I might, I can't think of anything truly interesting I've done over the past few weeks. Time has been spent on bug fixing and other such tasks. Instead, I'll turn to one of the key aspects of our software: the manipulation of knowledge as a graph. This also ties in with one of my previous posts on hypergraphs.

We have viewed graphs as labeled edges connecting nodes. This has the obvious deficiency that this type of data structure falls outside the scope of many standard graph algorithms. My initial approach to addressing this problem was to construct different types of graphs to tackle the various algorithmic problems we faced.

One of the graph types was an edge graph. This graph has two types of nodes: one representing nodes, and the other representing edges. Suppose we have a triple (n1 s n2) in our KB. This translates to a graph with three nodes: n1, n1sn2, and n2. The edges in this graph are n1->n1sn2, n1sn2->n2. Slots don't have an explicit representation in this scheme. They are instead replaced with a representation of the complete edge.

Another type of graph is the node graph. Given a triple (n1 s n2) in our KB, we reduce it to an edge n1->n2 in our graph. Similarly, we can also define an "edge graph". Suppose we have two triples (n1 s1 n2) and (n2 s2 n3) in our KB. We can define an edge in our graph n1s1n2->n2s2n3 in our graph.

It is critical that these are not general purpose KB representation schemes. These are merely mechanisms for reducing the KB to a representation useful for solving certain problems. Hypergraphs by contrast provide a faithful representation of the triples in a knowledge base. A hypergraph could thus serve both purposes: represent the KB and provide a basis for implementing various knowledge manipulation algorithms.

So, how would DFS look in a hypergraph? I've picked on DFS as it is a key algorithm for finding certain structures in our KB. The answer is heavily dependent on the graph representation we choose. Suppose we choose to represent edges as a directed hypergraph. This can be done in one of two ways for a triple (n1 s n2): n1->{s,n2}, and . The former differentiates between the subject, predicate and object by assuming that the predicate is always clearly distinguishable from the object. This assumption is probably not going to hold for arbitrary KBs, so we must abandon this representation. The alternative has us store each hyperedge as an ordered tuple, which tends to go against standard hypergraph formalisms. However, it can reliably represent arbitrary triples correctly.

I must end this blog entry here, as I don't have a formal representation of the hypergraph. Implementing DFS over a representation is straightforward. The basic principle for DFS still applies: start with a root. Mark the current object as visited, then visit the next object. Don't visit any object multiple times. As such, DFS and BFS are straightforward general purpose graph algorithms. So most of the algorithms that build upon these can be straightforwardly translated to hypergraphs. But! It isn't the case that the results of the algorithms would be interpreted the same way as in ordinary graphs.