Tuesday, December 19, 2006

Semantic wikis and reason

Here we are, back at the eternal question about semantic wikis: what good are they? So far, the only real use of semantic wikis is in Semantic MediaWiki, where semantics are used for simple search and retrieval. That is organization, not inference. And I want to do inference.

Let's start with a somewhat simple description of inference. You apply resolution or some other theorem-proving technique to a knowledge base (KB), and thus determine whether a given fact follows from the knowledge in the KB. In the simplest case, the KB must be consistent for this to work. Then there is the question of whether we can accomplish this tractably: the right knowledge in the KB must be brought to bear while answering the question. So, given a high-quality knowledge base, we can produce precise answers to questions using a largely mechanical mathematical procedure.
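
To make that description a little more concrete, here is a minimal sketch of propositional resolution by refutation in Python. Everything in it is an assumption made up for illustration: the clause encoding (sets of strings, with a leading ~ for negation), the function names, and the toy facts. It is nowhere near a real theorem prover, and it only handles queries that are a single literal.

```python
from itertools import combinations

def negate(literal):
    """Return the complement of a literal; a leading '~' marks negation."""
    return literal[1:] if literal.startswith("~") else "~" + literal

def resolve(c1, c2):
    """Return every resolvent of two clauses (frozensets of literals)."""
    resolvents = set()
    for lit in c1:
        if negate(lit) in c2:
            resolvents.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return resolvents

def entails(kb, query):
    """KB entails query iff KB plus the negated query is unsatisfiable."""
    clauses = set(kb) | {frozenset({negate(query)})}
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for resolvent in resolve(c1, c2):
                if not resolvent:      # empty clause: contradiction derived
                    return True
                new.add(resolvent)
        if new <= clauses:             # nothing new can be derived
            return False
        clauses |= new

# Hypothetical toy KB: "wiki implies editable" and "wiki".
kb = {frozenset({"~wiki", "editable"}), frozenset({"wiki"})}
print(entails(kb, "editable"))   # follows from the KB
print(entails(kb, "semantic"))   # does not follow from the KB

# An inconsistent KB "proves" any query at all.
broken_kb = {frozenset({"p"}), frozenset({"~p"})}
print(entails(broken_kb, "anything"))
```

The last call shows why consistency matters in the simplest case: once the KB contains a contradiction, the refutation succeeds for every query, so the answers stop carrying any information.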

I don't think reality is so kind to us. Building a knowledge base is a difficult endeavor. Wikis are a convenient way of collaboratively authoring texts, even high-quality texts, as some of the content on Wikipedia demonstrates. However, authoring knowledge using wiki principles seems to be much harder. I cannot see how we might produce a knowledge base of the quality that reasoning and inference procedures require.

There are basically two kinds of KB quality issues I worry about: inconsistency and incompleteness. Inconsistency is when you can prove a contradiction from your KB. Incompleteness, as I am using the term here, is when you cannot make the unique names assumption, that is, when multiple names can exist in a KB for the same underlying concept. In the case of a wiki, this would mean having multiple URIs, and so multiple documents, for the same concept.
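
A small Python sketch of the multiple-names problem, under stated assumptions: the page names, the (subject, predicate, object) fact format, and the list of "same as" pairs are all hypothetical, and in practice nobody hands you that list of aliases. The point is only that a query by name silently misses facts filed under another name for the same concept unless the aliases are merged first.

```python
def find(parent, name):
    """Follow alias links to the canonical representative of a name."""
    while parent.get(name, name) != name:
        name = parent[name]
    return name

def merge_aliases(same_as_pairs):
    """Build alias links from pairs of names known to denote one concept."""
    parent = {}
    for a, b in same_as_pairs:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_b] = root_a   # link one representative to the other
    return parent

def facts_about(facts, subject, parent=None):
    """Collect (predicate, object) pairs for a subject, honoring aliases."""
    parent = parent or {}
    target = find(parent, subject)
    return {(p, o) for s, p, o in facts if find(parent, s) == target}

# Hypothetical wiki pages: two documents describing the same concept.
facts = [
    ("wiki:Inference", "uses", "wiki:Resolution"),
    ("wiki:Logical_inference", "requires", "wiki:Knowledge_base"),
]
aliases = merge_aliases([("wiki:Inference", "wiki:Logical_inference")])

print(facts_about(facts, "wiki:Inference"))           # one fact found
print(facts_about(facts, "wiki:Inference", aliases))  # both facts found
```

Without the alias table, the second fact is invisible to anything reasoning about wiki:Inference. The hard part, which this sketch skips entirely, is discovering which names refer to the same concept in the first place.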

I'm still reading up on this. I need to do a mini-project to see just how much of a problem inconsistency and incompleteness prove to be. Inconsistency has been tackled in the past; it is a problem that has always been present in AI systems. Incompleteness has always been present too, but I think the semantic web makes it much more prevalent. I obviously can't get any clarity on this issue without running some experiments.