Tuesday, December 16, 2008

CL-Wiki reborn

I have renamed my fork of CL-Wiki as XL-Wiki. Still need to make a few changes to wrap up the renaming.

Sunday, December 14, 2008

Now with project pages!

I've finally gotten around to creating project pages for CL-Wiki (my fork) and CL-Wise, my experiment with semantic wikis. I've also added a little test suite based on FReT. This is the first time I've actually used FReT for anything useful since I wrote it.

Saturday, November 29, 2008

Photography updates

My nature photography class is coming to an end. So I thought I'd do a little wrap-up of everything I've attempted through the course of the class. Other than the field trips, we had a trip up to Mendocino, and from there up to the Redwood National Park.

Here are the links to the relevant flickr sets; picking out individual photographs is far more work:

Point Lobos State Reserve
California North Coast
Moss Landing
Mountain View Shoreline

Friday, November 7, 2008

(Not yet) A semantic wiki for common lisp

I have put in a stub of a semantic wiki for common lisp, in an svn repository, here. It requires my fork of cl-wiki, also in svn, here. I have to have a better name for my fork of cl-wiki than "my fork of cl-wiki". Definitely uncool.

Also have to throw together some web pages for these projects.

Monday, October 27, 2008

Perspective on the financial crisis

So, there's at least one person out there who seems to share my opinion on the nature of the financial crisis... This was in Business Week. I hope the link lasts, I didn't see any indication that it would expire but the URL itself doesn't give me confidence.

Wednesday, October 22, 2008

Photographing birds

Last weekend I went out to photograph birds. It was a field trip organized through the class I'm taking. For the trip I rented myself a 300mm f4 and a 1.4 extender.

First the sites. We visited two spots: the Mountain View shoreline, and Moss Landing. We got to the Mountain View shoreline about 8am. I was there over two hours. The morning was very foggy, but windless. Not the worst conditions, but certainly not the best. I couldn't crank up the shutter speed without compromising on the ISO, and was shooting at f4 most of the time. There were an immense number of birds at this site. Really fantastic. Shorebirds, water fowl, sparrows... Even spotted a pheasant.

Moss Landing was a bit of a mess. It isn't the most pleasant site, and the person that had arranged the location wasn't able to make it there himself. There were lots of sea lions and a few otters. Lots of birds, slightly different from the ones at Mountain View. I was trying to photograph cormorants flying just above the water. These are hard birds to track! Though I think I got a few reasonable shots. Once again, light was a problem. Very foggy. But there was a little break on the horizon that let through a nice sunset.

I'm still working through the photos. I had several hundred by the end of the day. Working with a 300mm lens was immensely challenging. It's a wonderful piece of equipment. I'm leaning towards skipping the 70-200mm and moving straight up to the 300mm. If I need reach, I should get a lens with some real reach. At least that's my reasoning.

On the other hand, the 70-200mm will give me a good focal length for doing portraits. And is likely much more portable. The 300mm f4 is a heavy beast.

I don't think I'll be making a decision this year though. Budget constraints and all that.

Friday, October 10, 2008

OWL in Lisp!!

I got a lead yesterday on some OWL parsing and generation tools implemented in Lisp. Apparently in the process of being open sourced. The back-end store on their project is Sesame. I don't know if they will be releasing their triple store bridge along with the rest of the tools.

Needless to say, this is seriously exciting to me! I will switch from working on my triple store API to CL-Wiki for now. I need to write up a design document as a first step.

Tuesday, October 7, 2008

Wilbur-rdf is not usable

I've just spent a bunch of time on wilbur-rdf that I could have used for doing something more useful, and have nothing to show for it.

The key problem is NOKOS, or the Nokia Open Source License. This license is based on the Mozilla Public License 1.1. A full reading of the license shows that if the software is covered by a patent, and the software is modified, then the patent must be re-licensed. (See here for the discussion on the Debian Legal mailing list.) Unfortunately there is a patent covering wilbur-rdf. So this software is unusable in the manner I was hoping: componentize it, and apply it towards building a semantic wiki.

I'll have to start over from scratch: put together a triple store API, then build a triple store capable of dealing with OWL. Parsing OWL, RDF etc. will have to take a back seat for now.

Hacking Wilbur

I've now got wilbur in my svn repository, and have been fixing it. Currently the software isn't functional.

The tendency to not reuse is evident in force in wilbur. It has its own XML parser and HTTP client. Granted, these didn't exist as components in the lisp world when the project started, but that's hardly the case now.

My first task is still to pull out a triple store and the triple store API from wilbur. I really need to have them be independent components before I can turn cl-wiki into a semantic wiki. I'll get to the rdf parser in time, when it comes to importing ontologies into the semantic wiki.

(Or should importing existing ontologies be the first step? Hmmm....)

Monday, October 6, 2008

cl-wiki, wilbur now in my svn repository

I have been reorganizing my svn repository. All the public projects have been moved to http://svn.sfmishras.com/public/. This includes cl-wiki and wilbur. I'm forking cl-wiki, as the current maintainer isn't interested in further developing the project. And wilbur as it exists on sourceforge is dead. In emails exchanged with Ora Lassila, I found out they are putting out a successor project named piglet based on Python and C++. I'm not likely to be interested in that.

The key reason behind these steps is that I wish to develop a semantic wiki in Common Lisp. There are many pieces I see in this effort. The following are just the starting point:
  • A wiki. Therefore cl-wiki.
  • The ability to import and export OWL. Therefore wilbur.
  • A triple store API. To be factored out of wilbur.
  • A triple store. Again, to be factored out of wilbur.
I don't know how to put together a triple store, or how to organize the triple store persistence. So this is going to be quite a learning experience. The reason for refactoring wilbur is twofold:
  • Multiple software components will access the triple store. Just in the list above we have two: the import/export component and the wiki.
  • Having a standard triple store API will help interoperability as new CL components are (might be? could be? am I just dreaming?) developed.
Given the key role of the triple store API, refactoring wilbur will be the next step.
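To make the idea of a shared triple store API a little more concrete, here is a minimal sketch in Python rather than Lisp. Everything in it is an assumption made for illustration: the class name TripleStore, the add/remove/match operations, and the sample triples are hypothetical, and none of it reflects wilbur's actual interface. The point is only that both the wiki and the import/export component could talk to the same small set of operations.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Hypothetical in-memory triple store; not wilbur's actual API."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        """Assert a triple; adding it twice has no further effect."""
        self._triples.add((s, p, o))

    def remove(self, s: str, p: str, o: str) -> None:
        """Retract a triple if it is present."""
        self._triples.discard((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        """Yield matching triples in sorted order; None acts as a wildcard."""
        for t in sorted(self._triples):
            if ((s is None or t[0] == s)
                    and (p is None or t[1] == p)
                    and (o is None or t[2] == o)):
                yield t


# Made-up sample data, shared by any component that holds the store.
store = TripleStore()
store.add("cl-wiki", "rdf:type", "Wiki")
store.add("wilbur", "rdf:type", "RDFToolkit")
store.add("cl-wiki", "writtenIn", "CommonLisp")
wiki_facts = list(store.match(s="cl-wiki"))
```

Persistence is deliberately left out; a real store would need to decide how triples are written to disk, which is exactly the part I said I don't yet know how to organize.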

The Financial Crisis Isn't Going Away

I'm obviously no expert in finance. I'm just an observer. But I strongly believe that being on the outside is necessary to spot when things are going wrong. Those on the inside get a view of the situation with too many details and possibilities, enough so that they aren't able to see the forest for the trees. This I think is what happened during the tech bubble. And I see it happening again.

I believe this financial crisis is multiple decades in the making, and will take a decade to sort itself out. I don't see anyone talking about what I believe is the key problem in the system: the wage stagnation of the vast majority of Americans. There's been quite a bit made of the growing income gap between the top 1% of Americans, and the remaining 99%, over the previous decade. This is, I believe, extremely unhealthy. I don't know what the situation was before then. My bet would be that the years preceding the actual increase of the income gap would have gone into prep work. In other words, the groundwork for the accelerating income gap would have been laid down. The problem has in other words been created by both Democrats and Republicans.

Would the consumer be at fault? I think the consumer acted as would be sensible given the circumstances. Cost reduction and easy credit had made up for the loss of income for the vast majority of Americans. Moving jobs and factories overseas made it easier to keep a healthy supply of cheap goods coming in. So it was possible to squeeze through on falling wages. At the same time in the name of making housing more accessible to home buyers, easy financing was created. After all, in the absence of growing wages, it would be impossible for businesses to make money off consumers. I believe this would have been especially true of the mortgage industry. Easy credit was an incentive to make bad investment decisions, which everyone did. These investments led to asset inflation. Which led to further credit. And at some point the cycle had to break.

So, what's in store for us? Nothing good, I fear. Basically, housing costs have far outstripped income for many. The only real fix for this problem is to bring housing costs back in line with income. There are two ways this could happen, and both are going to be incredibly painful. You can either let house values drop. Or you can let incomes catch up. The result of dropping housing prices is already evident. And this doesn't seem to be working at all. Bringing up consumer incomes is also a long term solution. What, after all, are you going to do to bring up their income? Increasing income in this manner is going to lead to increasing inflation. So, everything will become more expensive to match the prices of homes. Not a pretty scenario.

What of the bailout plan that's been proposed? Say the government buys $700 billion of bad debt. What we're doing basically is setting up an entity to absorb losses in the housing market, an entity for which taxpayers are taking responsibility. So we're insuring ourselves. But how is that debt going to be repaid? Consumers don't have the income to support the houses they're in. There's nothing here that tells me how we're going to fix the housing situation. Say a house forecloses. We take a loss on the $700B. That foreclosure affects the values of other surrounding houses. And we have affected debt that formerly wasn't bad, debt that's presumably still in the hands of financial companies. Have we solved anything if this scenario, which I think is quite likely, comes to pass?

Friday, October 3, 2008

Updating my web site

Modest improvements to my web site at http://www.sfmishras.com/. Most importantly, I have killed the silly animated front page and put in a link to some real content. Not that the content is currently all that interesting...

My next step is to update my svn repository with various projects that I'm planning on hacking. Starting with wilbur-rdf.

Tuesday, September 30, 2008

Expressiveness of Semantic MediaWiki

Semantic MediaWiki (SMW) represents a concept as a page. Each page is exactly one concept. Each link to another page is potentially a triple, with an annotation on the link identifying the relation. This in a nutshell is the syntactic representation of triples in SMW. Categories potentially reflect classes, while normal pages reflect individuals. This gives a class-instance distinction.

This simple approach allows us to represent a variety of knowledge. But there are many other key constructs required for authoring a full ontology. I'll just give one example here: template vs own slots. These terms require some explanation. Template slots are defined on classes. Suppose (C1 r V2) is a triple representing a specification of a template slot. C1 is a class, and V2 is some valid value for r. By stating this triple, we're asserting that the filler of r in an instance of C1 must be V2. If r were an own slot, by contrast, C1 itself would have a value V2 in the filler of r. The instances of C1 would not be directly aware of this triple.
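The template versus own slot distinction can be sketched with a toy model. This is only an illustration under my own assumptions: the classes KBClass and Instance, the slot names, and the example values are all hypothetical, and this is not how SMW or any particular frame system stores things.

```python
class KBClass:
    """A class in the KB, with two separate kinds of slots."""

    def __init__(self, name):
        self.name = name
        self.own_slots = {}       # values that describe the class itself
        self.template_slots = {}  # values imposed on every instance


class Instance:
    """An individual belonging to exactly one KBClass."""

    def __init__(self, name, cls):
        self.name = name
        self.cls = cls
        self.slots = {}

    def get(self, slot):
        # An instance sees its own values and the class's template slots.
        # The class's own slots are never consulted here.
        if slot in self.slots:
            return self.slots[slot]
        return self.cls.template_slots.get(slot)


# (C1 r V2) read as a template slot: every instance of Bird has legs = 2.
bird = KBClass("Bird")
bird.template_slots["legs"] = 2
# (C1 r V2) read as an own slot: a fact about the class Bird itself.
bird.own_slots["described_on"] = "BirdPage"

tweety = Instance("Tweety", bird)
legs = tweety.get("legs")                  # supplied by the template slot
described_on = tweety.get("described_on")  # None: own slots don't propagate
```

The same surface triple ends up in a different dictionary depending on the intended reading, which is the distinction the SMW syntax has no way of expressing.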

Further, the assumption that a single page describes only one concept is not tenable in any realistic situation. All but the simplest articles deal with multiple concepts. Splitting an article into components is often not feasible, as the content becomes too scattered for easy human consumption. I deal with both issues in the remainder of this post.

Given that we want to represent knowledge as succinctly as possible within the general wiki syntax, creating every semantic distinction within the wiki is a significant challenge. The SMW syntax is very simple, but the knowledge one can describe through it is too simplistic. A few straightforward changes can make SMW much more expressive.

Let's start with classes and instances. I don't think it is a good idea to conflate a class with a category, and an instance with an article. Each of these objects has a distinct role that is then lost. For example, each article is an instance of a class Document. The article discusses subjects that can be mapped to concepts. But the article is not itself a concept.

Here's a simple mechanism for producing such mappings. Each time an article discusses something that can be related to a concept, declare a start tag. At the end of the discussion, close with an end tag.

But, if an article no longer represents something in the KB, we can no longer rely on links to describe triples. This is easily addressed. Insofar as an article talks about anything, it is generally possible to define a primary topic for the article. Then a link is substituted with that primary topic when we're defining triples.

The nesting of topics also represents a relationship between those topics. This is similar to annotated links describing relations between concepts in the KB.

Allow a document to represent a class, an individual, or neither. This can be easily declared, just as categories are declared. This will enable documents to describe either classes or instances, or a combination of the two. The current SMW implementation supports linking categories to anything one wishes. However, categories do not show factboxes, except in previews. Asking for the OWL export of a category does not include the properties that had been included there. These may just be bugs, but as far as I know there exists no specification of the meaning of linking a category to another via a relation. One needs to be able to say whether a property linking two classes is intended as an own slot, a template slot, or a restriction on the slot filler of the source class. I would contend that in most cases one can deduce the intended meaning by considering the domain and range restrictions for a slot.

Effectively, my experience is that SMW can't describe anything but the most rudimentary content about classes. One cannot really develop an ontology in SMW as it exists today. At best one can describe basic taxonomy.

Thursday, September 25, 2008

Photography class

I have registered for a photography class. Digital nature photography, offered through the Palo Alto Camera Club. First time I'm taking a photography class, rather excited! I've learned a bit on my own so far about photography, but I really feel like I'm missing something... That I won't really get any better just working at it on my own.

Long overdue update

I have been meaning to put together a blog entry for a long time, but finding time has been difficult. We went to the Eastern Sierras, where I was chased (briefly) by a bear. Lehman has tanked, and AIG has been on the brink. Freddie and Fannie are gone. Our house still needs work. I finally got my green card, right in time to see the economy go south. Lightroom 2 came out. And there were more events besides that I would rather not mention here. Where do I begin?

About the only big thing that I haven't covered here is programming and work. So I'll focus on that. My lisp hacking of late has been pretty routine. In fact, all my hacking has been pretty routine. I've been developing the server portion of our software, and it has been going as well as can be expected. Try as I might, I can't think of anything truly interesting I've done over the past few weeks. Time has been spent on bug fixing and other such tasks. Instead, I'll turn to one of the key aspects of our software: the manipulation of knowledge as a graph. This also ties in with one of my previous posts on hypergraphs.

We have viewed graphs as labeled edges connecting nodes. This has the obvious deficiency that this type of data structure falls outside the scope of many standard graph algorithms. My initial approach to addressing this problem was to construct different types of graphs to tackle the various algorithmic problems we faced.

One of the graph types was an edge graph. This graph has two types of nodes: one representing nodes, and the other representing edges. Suppose we have a triple (n1 s n2) in our KB. This translates to a graph with three nodes: n1, n1sn2, and n2. The edges in this graph are n1->n1sn2, n1sn2->n2. Slots don't have an explicit representation in this scheme. They are instead replaced with a representation of the complete edge.

Another type of graph is the node graph. Given a triple (n1 s n2) in our KB, we reduce it to an edge n1->n2 in our graph. Similarly, we can define a second kind of edge graph, one whose nodes are the edges themselves. Suppose we have two triples (n1 s1 n2) and (n2 s2 n3) in our KB. We can then define an edge n1s1n2->n2s2n3 in our graph.
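The three reductions described above can be written down compactly. What follows is a sketch in Python, not our actual code: the function names are mine, triples are assumed to be plain (subject, slot, object) tuples of strings, and an edge is named by simply concatenating its three parts, as in the prose.

```python
def node_graph(triples):
    """Reduce each triple (n1 s n2) to a plain edge n1->n2, dropping the slot."""
    return {(s, o) for s, _slot, o in triples}


def edge_node_graph(triples):
    """First kind of edge graph: each triple becomes a node n1sn2
    sitting between its subject and its object."""
    edges = set()
    for s, slot, o in triples:
        middle = f"{s}{slot}{o}"
        edges.add((s, middle))
        edges.add((middle, o))
    return edges


def edge_adjacency_graph(triples):
    """Second kind: triples are the only nodes, and two triples are linked
    when the object of the first is the subject of the second."""
    return {
        (f"{a[0]}{a[1]}{a[2]}", f"{b[0]}{b[1]}{b[2]}")
        for a in triples
        for b in triples
        if a[2] == b[0]
    }


kb = [("n1", "s1", "n2"), ("n2", "s2", "n3")]
plain = node_graph(kb)
with_edge_nodes = edge_node_graph(kb)
edge_to_edge = edge_adjacency_graph(kb)
```

Each reduction discards something (the slot, or the identity of the shared node), which is why none of them can stand in for the KB itself.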

It is critical to note that these are not general purpose KB representation schemes. These are merely mechanisms for reducing the KB to a representation useful for solving certain problems. Hypergraphs by contrast provide a faithful representation of the triples in a knowledge base. A hypergraph could thus serve both purposes: represent the KB and provide a basis for implementing various knowledge manipulation algorithms.

So, how would DFS look in a hypergraph? I've picked on DFS as it is a key algorithm for finding certain structures in our KB. The answer is heavily dependent on the graph representation we choose. Suppose we choose to represent edges as a directed hypergraph. This can be done in one of two ways for a triple (n1 s n2): n1->{s,n2}, and the ordered tuple <n1, s, n2>. The former differentiates between the subject, predicate and object by assuming that the predicate is always clearly distinguishable from the object. This assumption is probably not going to hold for arbitrary KBs, so we must abandon this representation. The alternative has us store each hyperedge as an ordered tuple, which tends to go against standard hypergraph formalisms. However, it can reliably represent arbitrary triples correctly.

I must end this blog entry here, as I don't have a formal representation of the hypergraph. Implementing DFS over a representation is straightforward. The basic principle for DFS still applies: start with a root. Mark the current object as visited, then visit the next object. Don't visit any object multiple times. As such, DFS and BFS are straightforward general purpose graph algorithms. So most of the algorithms that build upon these can be straightforwardly translated to hypergraphs. But! It isn't the case that the results of the algorithms would be interpreted the same way as in ordinary graphs.
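For what it's worth, here is a rough sketch of DFS under the ordered-tuple representation. It rests on assumptions I haven't justified formally: each hyperedge is a (subject, predicate, object) tuple, traversal only moves from subject to object, and the predicate is treated as a label rather than as something to visit. The function name and sample data are made up.

```python
from collections import defaultdict


def dfs(triples, root):
    """Iterative depth-first traversal over ordered-tuple hyperedges.

    Moves from each subject to its objects in the order the triples
    were given, and never visits an object twice.
    """
    outgoing = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((p, o))

    visited, order, stack = set(), [], [root]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push in reverse so the first-listed object is explored first.
        for _p, o in reversed(outgoing[node]):
            if o not in visited:
                stack.append(o)
    return order


kb = [("a", "r", "b"), ("a", "r", "c"), ("b", "r", "d"), ("d", "r", "a")]
visit_order = dfs(kb, "a")
```

Whether predicates should also count as visitable objects is precisely the kind of interpretation question that changes what the result means on a hypergraph.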

Sunday, August 10, 2008

Discovering RESTful APIs

Read up today on REST APIs. See this introduction for specific details. Here I'm just going to give my opinion.

The idea behind the REST style is appealing in its simplicity. I can see many ways in which rigorously following the style would produce an easy to understand API, provided the user (client) is also familiar with the style. We've all become so used to treating GET and POST in HTTP as equivalents that we don't even think about how we might be abusing the intent. I never did understand the intent behind the separation of GET, POST, PUT and DELETE until now, and even now that understanding is arguably superficial.

REST provides a set of constraints over an API. Does it make sense to have a framework for supporting REST style APIs? Would the framework do anything beyond providing constraints on the behavior of the various HTTP verbs? How would such constraints be encoded? A lot of food for thought here.
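One way such constraints might be encoded is as a small table of properties per verb, checked against what a handler declares about itself. This is a sketch of the idea only, not an existing framework: the VerbConstraint class, the violations function, and its parameters are hypothetical. The safe and idempotent properties themselves come from the HTTP specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbConstraint:
    safe: bool        # must not change server state
    idempotent: bool  # repeating the request must have the same effect as once


# Properties of the four verbs as defined by the HTTP specification.
HTTP_VERBS = {
    "GET": VerbConstraint(safe=True, idempotent=True),
    "PUT": VerbConstraint(safe=False, idempotent=True),
    "DELETE": VerbConstraint(safe=False, idempotent=True),
    "POST": VerbConstraint(safe=False, idempotent=False),
}


def violations(verb, modifies_state, repeat_safe):
    """Compare what a handler declares about itself with what its verb allows.

    modifies_state: the handler changes something on the server.
    repeat_safe: running the handler twice leaves the same state as once.
    """
    constraint = HTTP_VERBS[verb]
    problems = []
    if constraint.safe and modifies_state:
        problems.append(f"{verb} must not modify server state")
    if constraint.idempotent and not repeat_safe:
        problems.append(f"{verb} must be idempotent")
    return problems


# A GET handler that secretly updates a counter breaks the constraint.
bad_get = violations("GET", modifies_state=True, repeat_safe=True)
ok_post = violations("POST", modifies_state=True, repeat_safe=False)
```

A framework could hardly verify these declarations automatically, so the real question remains whether it would offer anything beyond documentation of intent.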

Tuesday, August 5, 2008

Lassen Volcanic National Park trip report

We returned from Lassen Volcanic National Park late Sunday. Well, technically early Monday. Spent the day yesterday recovering. I'm not in decent hiking shape, and even the little hiking we did took its toll. That aside... Lassen is amazing! I got us a reservation at Butte Lake (we showed up on the wrong date, but that's another story...) more or less by accident. It was the first camping reservation I'd made, and the reservation web site is, well, lacking in user friendliness. Butte Lake is in a relatively remote north east corner of Lassen, inaccessible from within the park, off a gravel road. But there we find one of the most spectacular sites Lassen has to offer: the Cinder Cone. Bumpass Hell is great, but the scale and quality of Cinder Cone made it superior in my mind. Plus, you can hike to the top. We got to our campsite too late to attempt the hike, but we got close enough to really appreciate the spare beauty of the site. The Cinder Cone rises up from some impressive lava beds. I did of course take lots of photos, I'll post them when they're processed. Largely speaking, I feel we barely scratched the surface of the sites accessible from that corner of Lassen, and we'll have to return for a longer stay there some time.

Morning at Butte Lake was really amazing. It was cold when we woke, but nowhere near freezing. A couple of layers had me comfortable enough to take a trip out to the lakeside for some early morning photography, some of the best of the trip. We packed up and headed out to the main park area. We entered through the Manzanita Lake entrance, another wonderful sight. There we spotted a bald eagle! But I was too slow with my camera to make good of the opportunity. We stopped at the visitors center. I've decided the NPS has the best officials I've met, meeting them always leaves me happy. Got a few pointers for trails to take.

We first went up to Paradise Meadow. The trail was uninteresting to start with, but then as we climbed we found more and more wildflowers. The trail ran along a stream, and the end of the trail was in a large meadow surrounded by mountains. Our next trail was the Kings Falls. It is another short trail that goes down along a stream, which cascades down some rocks. We initially thought those were the falls, but as awesome as they were, the falls were further along the way. The distance markers were a little confusing; we theorize that they stamped duplicates of certain signs and just decided to use all of them, even though they were a bit inaccurate. The falls go down many feet, but we can only see them from the top. So it is a bit difficult to fully appreciate them. On the return we went up a horse trail rather than along the cascades. The upper portion of the horse trail was quite wide open, very scenic.

After the Kings Falls trail we proceeded along the highway, climbing, climbing, climbing up to about 8000 feet. Right next to Lassen Peak. The peak looks quite doable, maybe next time we'll give it a shot. There's a rest stop with some food and water. Just past Lassen Peak is Helen Lake, one of two alpine lakes that are right by the roadside in Lassen. From there we went on to Bumpass Hell, just a few hundred feet further along. The trail is quite easy, runs along the sides of hills which challenged my fear of heights. Especially since my feet were not holding up so well. Bumpass Hell was other worldly. Anyone going to Lassen has no excuse for skipping out on this site, it is just that accessible.

From that high up we could also see the forest fires burning, and the smoke streaming out. The sight made quite an impression. I've captured some of it in photos.

Lassen is the highest I've been so far. I know, not that high. But we'll soon be going to the Eastern Sierras, to White Mountain, to see the bristlecone pines. We'll easily cross 10000 feet on that trip! I'm really looking forward to that trip.

As for photography, my new 24-105mm lens held up really nicely. The only problem with the lens is that the zoom drifts. I tend to wear the camera strap with the camera hanging down, and invariably the lens drifts out to 105mm. This is a bit disappointing. The expensive B+W polarizer turned out to be fantastic. Seems to lose less light than other polarizers I've used, even from B+W. Highly recommended. Using a tripod really brings out the best of the lens, otherwise even as good as the IS is on the lens, it still doesn't compare with the stability of a tripod. My best photos, morning at Butte Lake, were taken with a tripod. That is the only time I used a tripod. The difficulty with a tripod of course is that you have to carry it and set it up. This is difficult when you're trying to get through three trails, even short ones, through the course of the day. A good tripod, in other words, is virtually a necessity. Also, the 24-105mm is a good enough general purpose lens that I will try a different tack the next time: instead of carrying so much camera gear, carry a tripod instead. I didn't touch any of the other lenses I was hauling through the whole trip!

Thursday, July 31, 2008

To the northeast corner!

We're going to Lassen on Saturday. But I made an error in reservation, and sent us off to a night at Butte Lake. Accessible only by driving eight hours down a gravel road. At least it will be (relatively) secluded.

I am forgotten

I feel as though I've been forgotten, working here in front of my computer. But is it really that I'm forgotten, or that I've forgotten how to be?

Tuesday, July 29, 2008

Space profiler needs profiling

I struggled and failed against the Allegro CL 8 space profiler yesterday. I tried capturing a space profile for some of our software running on Windows. With only five seconds worth of data Allegro CL would chew up 300MB of heap in producing the profile. Needless to say there was nothing meaningful to be had there.

What I really want is to get visibility into which allocated heap data makes it to oldspace. I want to see what's making the image size grow. I'm not getting useful data from the space profiler, and I don't think I possibly could. Time profilers are much more useful in optimization; space profilers can't separate temporary allocations from those that get tenured.

Monday, July 28, 2008

Photos from India and Taiwan posted!

It took me a while, but I'm finally through processing all the photos I had taken in India and Taiwan. I posted them on picasaweb and flickr. I clearly have a long way to go in developing my photography... I also upgraded my 28-135mm lens to a 24-105mm L lens. My first L lens. And it is fantastic! Even zoomed in at 100% the pictures are sharp as can be.

I took the lens out for a test run around Stinson Beach. There was fog covering the whole area, except a hole around the beach that let in some nice warm sunlight. The beach though is far from photogenic. Not really all that much worth photographing. So I got some spectacularly sharp photos of fairly generic stuff, like sand, and seagulls, and sunbathers. All the exciting stuff happening off shore, like pelicans diving for fish, was too far for me to capture with this lens.

The lens is considerably heavier than the 28-135mm, but definitely seems to be a great investment. We'll be going to Lassen soon, which will give me a chance to really give the lens a workout.

Monday, July 21, 2008

The Dark Knight

I had a peek at The Dark Knight on IMDB. The movie currently is the best ever on IMDB, with a score of 9.6/10. It will probably come down, but this is an incredible score. Going through the user comments, found this: "Oh my freaking baby Jesus." Which sums up nicely how I felt when I saw the two minute bank robbery clip six months ago in the previews of I Am Legend. I've been waiting for this movie ever since.

Sunday, July 20, 2008

Whatever Happened to AI

I just finished reading Lenat's "The Voice of the Turtle: Whatever Happened to AI". It's in the Summer 2008 issue of AI Magazine, more precisely Volume 29, No. 2. In it he asks what happened to the premise of producing human level AI. There are several knee jerk responses that immediately come to mind, given that I've spent a bit of time working on AI:
  1. The meaning of human level AI is fuzzy. In the absence of a clear definition, the question is not really meaningful.
  2. Human level AI is not a unitary construct. An individual's conception of the world is greatly affected by the types of tasks she knows to perform.
  3. Humans are not logical in the manner that logic can be programmed into software. Although the paper isn't about an AI that mimics humans, we're going against the grain of the one example of intelligence out there.
Lenat is extremely intelligent, but quite opinionated on the right way to achieve intelligence. I disagree with his approach, primarily because I don't think of intelligence as something that has arisen out of nothing. Intelligence is a facility that has developed in response to particular evolutionary pressures. In our case, I believe it is to deal with increasing complexity of social structures. The application of social intelligence to other more general problems is a happy accident.

I do believe that an intelligent agent has to have some embodiment external to its representation of its world. In other words, an agent should be able to take in uninterpreted sensory information (even if it is symbolic), translate it into a format that is more amenable to manipulation, and take actions on that format. Communication with other agents must be one of the many actions the agent is required to perform. Without this overall architecture I don't think there is any way we can build a truly intelligent communicative agent.

Lenat's overall flaw is in believing an intelligent entity can be created by manually representing facts. I believe intelligence is seen where an entity is able to operate and adapt to an environment, and communicate with other agents. These facts are an afterthought, a story we have put together to explain how we do things. It isn't how we actually do things.

So, what happened to AI? I think it is only necessary in entities that have to be truly autonomous or embodied. Otherwise we'd generally be better off with the story of facts we've so far been creating.

Friday, July 18, 2008

Portrait Photography

I've run through my first experiments with portrait photography. And I have a lot to learn. I was working with the Canon 50mm F1.4, and the Canon Speedlite 580 EX II. I didn't have much control over the environment in which I took photographs. Generally I took them wherever the subject happened to be sitting. Here's what I've learned:

  • Small rooms with light walls work best for bouncing the flash.
  • If you have a glossy wall in the back, watch out for reflections massively affecting the flash metering.
  • I need a diffuser.
  • Don't point the flash at the subject. Bouncing it off walls works surprisingly well for getting an even exposure. Consider pointing the flash away from the subject.
  • Ideally, avoid having a wall behind.
  • Even wide open, the background can be distracting.
  • Fill the frame. Not entirely, but do fill it.
  • 50mm, even with a crop sensor, is not enough as a good portrait lens. I think I need at least 100mm.
  • BUT, if there's a possibility that you might need to accommodate multiple subjects, a shorter lens is quite helpful.
  • Overexpose if you must, but don't underexpose. A lot of my photographs were underexposed.
  • By focusing on the eyes you really do get an image that, even though the face is fuzzy at many other spots, turns out to be quite pleasing.

I didn't really have much time, or many opportunities, to get the shot right for many of the portraits. I'm also unclear on how to set up an on-camera flash and have it work with the camera in portrait mode. Every portrait was taken in landscape mode. Though not bad, it does potentially waste a lot of screen real estate. Well, I need to practice, practice, practice...

Thursday, July 17, 2008

RDF and Hypergraphs

Today I discovered directed hypergraphs in all their glory. Well, OK, not all their glory, since information about them is pretty sketchy. There seems to be only one entry in Wikipedia about hypergraphs, with some of the math involved in them. Digging deeper into the literature brought up directed hypergraphs, which according to Wikipedia should not exist. So... what is a hypergraph?

While a graph has edges that connect pairs of nodes, hypergraphs have edges that connect an arbitrary subset of the graph's nodes. A directed hypergraph places a direction between two sets of nodes, where the sets are disjoint. There are mathematical formalizations for these graphs, naturally, but I just want to stick to the intuitions.

Naturally, the resulting data structure can't be visualized in any reasonable way using nodes and edges. So... what are they good for?

I started looking into this knowing that they have been applied to semantic web problems. Google brought me pretty quickly to the realization that hypergraphs have been widely considered for representing RDF. Each node in such a hypergraph could be either the subject, predicate or object of an RDF triple. The triple itself is then an edge in the hypergraph. The hypergraph must be directed, otherwise it isn't possible to distinguish subject from object.

Unfortunately there doesn't seem to be enough discussion about precisely how such an edge should be interpreted. Would the source and destination sets be {s, p} -> {o}, or {s} -> {p, o}? Because, really, we need to distinguish between subject, predicate and object. And in the representation above you can't really tell (other than by the letter prefix I've used) which one of the three is the predicate.
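
To make the ambiguity concrete, here is a minimal sketch of one possible encoding. It is not taken from any of the papers: the names makeHyperedge and tripleToHyperedge, and the idea of tagging each node with its role, are assumptions made for illustration. Each directed hyperedge has a tail set and a head set, and the role tag is what keeps the predicate identifiable whichever split is chosen.

```javascript
// A directed hyperedge connects a tail set of nodes to a disjoint head set.
function makeHyperedge(tail, head) {
  const overlap = tail.filter((n) => head.includes(n));
  if (overlap.length > 0) {
    throw new Error('tail and head must be disjoint: ' + overlap.join(', '));
  }
  return { tail: tail.slice(), head: head.slice() };
}

// Encode an RDF triple as a hyperedge. `split` picks between the two readings:
// 'sp-o' gives {s, p} -> {o}; 's-po' gives {s} -> {p, o}.
// Nodes are tagged with their role (an assumption of this sketch) so that
// the predicate can still be told apart from the subject and the object.
function tripleToHyperedge(s, p, o, split) {
  const subject = 's:' + s;
  const predicate = 'p:' + p;
  const object = 'o:' + o;
  if (split === 'sp-o') return makeHyperedge([subject, predicate], [object]);
  if (split === 's-po') return makeHyperedge([subject], [predicate, object]);
  throw new Error('unknown split: ' + split);
}

const a = tripleToHyperedge('ex:Alice', 'ex:knows', 'ex:Bob', 'sp-o');
const b = tripleToHyperedge('ex:Alice', 'ex:knows', 'ex:Bob', 's-po');
console.log(a);
console.log(b);
```

The cost of the role tags is that the same resource becomes a different node depending on where it appears in a triple, which is probably not what a real RDF store would want.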

There are other ways I've seen of applying hypergraphs to representing RDF. There was a paper that described the application of hypergraphs to merging ontologies. The description was quite precise and relatively clear.

Monday, July 14, 2008

More camping trips planned!

Took a trip up Mission Peak again on Saturday morning. It was good, but we're both out of shape. This is a hike we should be able to do in less than two-and-a-half hours. With loads. So we need a lot of practice. We're taking an optimistic outlook, and have made reservations at June Lake, Lassen, and are trying to get a decent reservation at Angel Island. June Lake is near Mono Lake, and is going to be a launching pad to the White Mountains. Home of many bristlecone pines, the longest lived individual organisms in the world. I've never been to the Eastern Sierra, so I'm really excited about this trip. It'll be freezing up there at night, but that's OK. It'll be worth it. Hiking at over 7000 ft for a couple of days!

I'll take lots of photos, naturally, but I wonder how I'm going to put together the time to sort and process all of them. We're talking at least another thousand photographs over the next couple of months. And I'm having a hard time getting through the India photos. I'm having a great time creating bluer skies in my photos than I had caught at the actual time of taking the photo. It really creates so much more drama in the photos. I also tried creating a panorama using hugin from some photos I'd taken at Rajaji National Park. Took forever, and ultimately I had to discard half the photos I had taken for the panorama. It didn't matter though, I still ended up with an image multiple megabytes in size. I'm not sure whether it was really worth putting together that panorama... It isn't *that* good.

I wish we'd been at Rajaji when the elephants were out. Oh well.

Thursday, July 10, 2008


Once again I'm going to attempt to restart blogging.

I am still going through the photos from India. Taiwan's done. They have been going on to flickr and picasaweb. Though I'm putting our personal photos only onto picasaweb. They have better privacy control facilities.

I've also received positive feedback from another photographer I'd met at the SF photography meetup. It's given me impetus to find a used 24-105mm to replace my current 28-135mm, which I'd describe as a decent consumer lens. There are far too many instances where the 28-135mm has fallen short, and this is my most frequently used focal range. An improvement here will be an improvement across most of my photographs.

On the work front, I've started looking again into financial planning. I'm extremely interested in seeing what can be done with socially constructed ontologies. However, there aren't all that many examples of such ontologies. And there are even fewer examples where these ontologies have been subjected to reasoning. (Rather, I haven't found any examples where reasoning has been applied to socially constructed ontologies.) I should look at the biomedical wiki and other biological ontologies a bit more closely. I believe these involve trained individuals guiding the construction of the ontology, so it may not be fully in the socially constructed sphere.

On the programming front, still spending all my time with Common Lisp. I took a stab at Python a little while ago, and will go back to it soon. I'm underwhelmed, to put it bluntly. There is syntax to make some operations that are more verbose in Lisp a bit more compact. But the language isn't really interactive. Anything you had created in memory goes out of date once you update the definitions. And speaking of updating definitions, there isn't a straightforward way of reloading all updated source files. The language is interpreted, not really interactive. Contrast that with Common Lisp, where you have a well defined protocol for updating an instance to an updated class definition... This is not even considering all the differences between the languages. I have yet to experience anything in Python that has shown convincing superiority over Common Lisp.

Finally, finances. I've been losing money. The market is becoming cheaper from a long term perspective. But I don't have, at present, funds to invest. A lot is tied into the remodeling of the house. There are nagging doubts about whether we've taken the right approach to managing our money. But it is too late for that now. I still don't know how easy or hard it is going to be for us to refinance. I hope we can get to it in the next couple of months. We really need to wrap up construction by then.

That's it for now. More tomorrow.

Tuesday, March 11, 2008

Sierra Winters

We've been visiting parks in the winter. We just returned from King's Canyon and Sequoia, and two weeks before that we'd been in Yosemite. The Yosemite photos are up on picasaweb. I'm still going through the King's Canyon photos. The parks looked fantastic, Yosemite valley in particular was just amazing. Sequoia and King's Canyon are not as scenic as Yosemite, but were nevertheless worth it. All the peaks around King's Canyon are naturally inaccessible by car this time of year, we could only appreciate them from a distance. I want to visit again in the summer and catch the sites I missed this time around. Mt Whitney is the highest peak in the lower 48 states of the US, and it sits on the eastern edge of Sequoia.

We did see Marble Falls, in the Sequoia foothills. It isn't that bad a hike, and seeing a waterfall running down large marble boulders is, well, stunning.

I also got a chance to try out HDR for the first time on the Yosemite trip. The results you can pull out are fantastic. I've tried taking a few more HDR sets in King's Canyon, this time some of them are hand-held. Photos will be posted as soon as we're done with them.

Next up: Pinnacles National Monument. There's supposed to be a good wildflower spot there. And wildflowers this year (on the way to Sequoia) were simply stunning.

Monday, February 4, 2008

Goodbye netflix, hello greencine

I got sufficiently frustrated with the netflix new releases page that I have now left netflix. There have been many complaints about this new feature, and they seem to have largely fallen on deaf ears. My last act as a netflix member was to gather my ratings and queue entries. Netflix does not make that easy. The queue is the easier of the two. I don't know if the queue is filled dynamically, but it doesn't matter. Select some text in the queue page that includes some of the films in your queue. Then right click in firefox, and select View Selection Source. Select everything in the source window that pops up, paste into emacs, and a little regexp search-and-replace later (or a keyboard macro or two later) you have everything from your queue in a nice list. I haven't done this yet, but I'm quite certain this is going to be easy.

Here's what you need for the more complicated task of grabbing your netflix ratings:
  1. Firefox.
  2. Greasemonkey. I thought this was a gimmick until I actually tried it.
  3. A bit of hacking.
I started off with the Getflix Grabber by Anthony Lieuallen. This script is a lot more complex than what I wanted: it sets up a server side php backend with a mysql database to store the ratings and movie details. All I wanted was the movie name and rating, so that I could recreate the data in my new movie rental site, GreenCine. So all I had to do was to produce a version of Getflix Grabber that didn't put everything into a database, and didn't grab any additional movie details.

So instead I had the script open a second Firefox window, and write all the titles and ratings into that. For some reason, opening another tab and writing it into there didn't work. I had to go back into the tabbed browsing preferences, change to opening a new window on a link, then execute the script. Maybe that's just a quirk of the mac version of firefox, but that doesn't make the problem any less annoying.

The script puts a pair of start/stop buttons at the bottom of the page. Hit the start, and it will gather all the ratings. Keep an eye on it though, after getting all the ratings it seems to start getting recommendations. You might want to kill the script when it hits that point.

You can't just save the page when you're done gathering ratings either. The source of the page is empty, since all the content is dynamically generated. Instead, once again, select something on the page, view selection source, and save the contents of the page by copy/pasting to your favorite editor.

So the next task now is to put all the ratings and my queue into GreenCine. I suspect I may have to do that from scratch, since GreenCine is a much less likely target for scripts.

Here's the source code of my version of Getflix Grabber:

// ==UserScript==
// @name GetFlix Grabber
// @namespace http://sfmishras.com/getflix
// @description Grab all the data about your NetFlix ratings, pass it to the GetFlix analyzer.
// @include http://www.netflix.com/*
// ==/UserScript==


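// Set up the UI

GM_registerMenuCommand('Start GetFlix', startGetFlix);
GM_registerMenuCommand('Stop GetFlix', stopGetFlix);

var button1=document.createElement('button');
button1.setAttribute('style', 'margin: 0.5em 1em; vertical-align: middle;');
button1.appendChild(document.createTextNode('Start GetFlix')); // label text assumed
button1.addEventListener('click', startGetFlix, true);

var button2=document.createElement('button');
button2.setAttribute('style', 'margin: 0.5em 1em; vertical-align: middle;');
button2.appendChild(document.createTextNode('Stop GetFlix')); // label text assumed
button2.addEventListener('click', stopGetFlix, true);

var menu=document.createElement('div');
menu.setAttribute('style', 'text-align: center; border: 10px solid #B9090B;');
// assumed: attach the buttons and put the menu at the bottom of the page
menu.appendChild(button1);
menu.appendChild(button2);
document.body.appendChild(menu);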


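// Output window

var output;
var ratingsTable;

function setupOutput() {
  // open a second window and build a two-column table in it
  output = window.open();
  ratingsTable = output.document.createElement('table');

  var ratingsHeader = output.document.createElement('tr');

  var ratingsColumn = output.document.createElement('th');
  ratingsColumn.appendChild(output.document.createTextNode('Title')); // header text assumed
  ratingsHeader.appendChild(ratingsColumn);

  ratingsColumn = output.document.createElement('th');
  ratingsColumn.appendChild(output.document.createTextNode('Rating')); // header text assumed
  ratingsHeader.appendChild(ratingsColumn);

  ratingsTable.appendChild(ratingsHeader);
  output.document.body.appendChild(ratingsTable);
}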

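// Start function

function startGetFlix() {
  // open the output window (assumed to happen here)
  setupOutput();
  // init a single-task queue
  actionQueue=[ ['getRatingsPage', 1] ];
  // and start the queue running!
  runQueue();
}

// Stop function

function stopGetFlix() {
  // stop the queue runner
  clearTimeout(actionTimer);
  actionTimer=null;
  // and empty out the queue
  actionQueue=[];
}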


// To control execution speed

var niceness=150;
var nicefact=0.33;
function getNice() {
  // random delay in milliseconds, within nicefact of niceness either way
  var min=niceness-(niceness*nicefact);
  var max=niceness+(niceness*nicefact);

  return ( (Math.random()*(max-min)) + min );
}


// Run the queue

var actionTimer=null;
var actionQueue=[];
function runQueue() {
  // schedule the next run first, then handle at most one action
  actionTimer=setTimeout(runQueue, getNice());

  var action=actionQueue.shift();
  if (!action) return;

  console.log('Queue length: '+actionQueue.length+'. Running action '+action[0]+'.');

  switch (action[0]) {
  case 'getRatingsPage':
    getRatingsPage(action[1]);
    break;
  case 'parseRatings':
    parseRatingsPage(action[1], action[2]);
    break;
  case 'saveRating':
    saveRating(action[1]);
    break;
  }
}

function getRatingsPage(pagenum) {
  var url='http://www.netflix.com/MoviesYouveSeen?pageNum='+parseInt(pagenum, 10);
  console.log('Fetch:', url);
  // fetch the page and queue its text for parsing
  GM_xmlhttpRequest({
    'method':'GET',
    'url':url,
    'onload':function(xhr) {
      actionQueue.push(['parseRatings', pagenum, xhr.responseText]);
    }
  });
}


function parseRatingsPage(num, text) {
  var ratings=text.split('addlk');
  ratings.shift(); // get rid of the HTML before the first one

  for (var i=0, rating=null; rating=ratings[i]; i++) {
    try {
      // the ends of the title and mpaa patterns are assumed to match the genre pattern
      var detail={
        'title':rating.match(/"list-title">(.*?)</)[1],
        'year':rating.match(/"list-titleyear"> \(([0-9]+)\)/)[1],
        'mpaa':rating.match(/"list-mpaa">(.+?)</)[1],
        'genre':rating.match(/"list-genre">(.+?)</)[1],
        'rating':rating.match(/([.0-9]+) Stars/)[1]
      };

      actionQueue.push(['saveRating', detail]);
    } catch (e) {
      console.debug('Couldn\'t parse item '+i+' because:');
      console.debug(e);
    }
  }

  // keep going while the page has a Next link
  if (text.match(/alt="Next"/)) {
    actionQueue.push(['getRatingsPage', num+1]);
  }
}


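function saveRating(detail) {
  // for (key in detail) {
  //   alert(key + "=" + detail[key]);
  // }
  var title = detail['title'];
  var rating = detail['rating'];
  // add one row with the title and the rating to the output table
  var outputRow = output.document.createElement('tr');
  var outputCell = output.document.createElement('td');
  outputCell.appendChild(output.document.createTextNode(title));
  outputRow.appendChild(outputCell);
  outputCell = output.document.createElement('td');
  outputCell.appendChild(output.document.createTextNode(rating));
  outputRow.appendChild(outputCell);
  ratingsTable.appendChild(outputRow);
}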
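
The pacing logic is the one part of the script that can be tried outside the browser. A minimal sketch, assuming a hypothetical helper jitteredDelay that takes the base delay and the spread as parameters instead of reading the globals niceness and nicefact. With 150 and 0.33, every delay should land between roughly 100.5 and 199.5 milliseconds.

```javascript
// Return a random delay, in milliseconds, between base*(1-spread) and
// base*(1+spread). Same arithmetic as getNice() in the script, with the
// two globals turned into parameters (an assumption of this sketch).
function jitteredDelay(base, spread) {
  const min = base - base * spread;
  const max = base + base * spread;
  return Math.random() * (max - min) + min;
}

// Draw a batch of delays and report the smallest and largest seen.
const samples = [];
for (let i = 0; i < 1000; i++) {
  samples.push(jitteredDelay(150, 0.33));
}
console.log(Math.min(...samples), Math.max(...samples));
```

Randomizing the delay keeps the requests from arriving at a perfectly regular interval, which seems to be the point of the niceness setting.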


Wednesday, January 30, 2008

SLR news from dpreview

Came across a few interesting items on dpreview in advance of PMA:
  • Sony's new DSLRs have an interesting implementation of LiveView.
  • Sigma has a "military hardware" style 200-500mm F2.8 weighing 16kg!

Tuesday, January 29, 2008

Semantic wikis revisited

I've started working on semantic wikis again. We're planning a study here at work to see whether semantic wikis can be used to create content suitable for reasoning. We don't know either way yet, I suspect the answer is going to be mixed.

Our platform is mediawiki. We have installed the semantic mediawiki and semantic forms extensions. The software is a bit cumbersome to use, but we've been able to define some forms, templates and content for our study. We're still waiting on a version of the Halo extension that works with semantic mediawiki 1.0.

I've found the idea of a semantic wiki intriguing and promising ever since I first heard of them. Now I'm getting the chance to study how well the idea works in a near ideal environment, and where it falls short. This is also the first project I've put together. I'm really hoping it goes smoothly, I'd like it to lead to some interesting follow-ons.

Saturday, January 26, 2008

Updated photography wish list

Having gained significantly more experience, I thought I'd update this list.

Canon EF 70-200mm f/4L IS USM Telephoto Lens
1.4x teleconverter
Canon EF 24-70mm f/2.8L USM Lens
Tamron 180mm macro lens

In that order. But the next item I'll actually get will most likely be Photoshop.

Flashes and tubes

I've recently acquired two new pieces of equipment: a flash, and extension tubes for macro photography "on the cheap". I bought a Canon 580 EX II flash a few weeks ago, but I haven't really used it all that much. I haven't really had the opportunity to use it, and when I do have the opportunity I'm afraid to use it because I haven't used it enough. A chicken and egg situation that I must resolve. Otherwise that wonderful piece of equipment is going to be a complete waste. I've tried taking a few photos of Richa with it, and have been learning how to use wall bounce to create a nice, pleasing, balanced light. Overall, it has turned out to be impressively easy and straightforward. The next time I have a chance to use the flash, I am not going to hold back.

Today I also picked up a set of Kenko extension tubes. Extension tubes are an interesting piece of equipment. They fit between the lens and the camera, basically adding some air. This reduces the minimum focusing distance of the lens, and gives the lens much greater magnification. The lens at this point can no longer focus to infinity, but only within a very short range from the lens. With a 50mm lens, the focusing range is only a few inches, and drops to only a couple of inches if you use all the extension tubes. The photo quality is surprisingly good, but I now see why it is essential to have a decent flash for macro photography. The amount of light available is so small, you effectively have to choose between having reasonable depth of field, and having a reasonable exposure time. With a tripod, this setup will work pretty well for still subjects, such as flowers. But I can't imagine using this setup for insects. At some point I will have to see if it's possible to set up the external flash off camera and use it for macro photography. Canon's twin light and ring light are fairly expensive pieces of gear, with very specialized use. This is not how I want to spend my photography budget.
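
The magnification gain can be estimated with the usual thin-lens rule of thumb: with the lens focused at infinity, the added magnification is roughly the extension length divided by the focal length. A rough sketch, assuming tube lengths of 12mm, 20mm and 36mm (an assumption, check the set you actually have) and ignoring whatever magnification the lens reaches on its own at close focus:

```javascript
// Thin-lens approximation: extra magnification = extension / focal length.
// Real lenses deviate from this, so treat the result as an estimate.
function addedMagnification(extensionMm, focalLengthMm) {
  if (focalLengthMm <= 0) throw new Error('focal length must be positive');
  return extensionMm / focalLengthMm;
}

const tubesMm = [12, 20, 36]; // assumed tube lengths
const focalLengthMm = 50;
const allTubesMm = tubesMm.reduce((sum, t) => sum + t, 0);

for (const t of tubesMm) {
  console.log(t + 'mm tube: about ' + addedMagnification(t, focalLengthMm).toFixed(2) + 'x');
}
console.log('all tubes (' + allTubesMm + 'mm): about ' +
  addedMagnification(allTubesMm, focalLengthMm).toFixed(2) + 'x');
```

By this estimate the full stack on a 50mm lens goes past life size, which fits with the working distance shrinking to a couple of inches.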

Photographing through glass

Remember those early morning photos of the moon? They will need quite a bit of cleaning up. The reason? There are two possibilities:
  1. Double paned windows. Don't photograph through double paned windows at night. A light on the outside can cause havoc. The inner pane reflects light to the outer pane, which then reflects it back into the camera.
  2. The camera itself. I had a non-multicoated UV protector on my lens. That might have reflected light from the lens itself back to the window, which reflected it back.
I'll ping my Adobe connection again and see how close I might be to getting photoshop. Lightroom doesn't have as many tools for doing cleanup etc, which is a shame.

Thursday, January 24, 2008

Merge done

I'm done writing my little blog merging program in Java, and as you might see from the current state of this blog, the merging worked. I don't know if I would have been able to do it any faster by hand. But I got a first hand look at the Google API. The API is generic, which is both good and bad. Good because they can apply it to any service that can be viewed as a feed. Bad because it isn't necessarily clear what a particular call really does. You have to familiarize yourself with the underlying data model which is generally based on the Atom standard.
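
Once the posts are viewed as feed entries, the core of a merge is simple. A minimal sketch, not the Java program itself and not using the Google client library: it assumes each entry has been reduced to a plain object with hypothetical title and published fields, where published is an ISO 8601 timestamp of the kind Atom uses, and it combines two blogs into one list ordered newest first.

```javascript
// Merge the entries of two blogs into one list, newest first.
// Entries are plain objects; `title` and `published` are illustrative
// field names, not the names used by any real client library.
function mergeEntries(blogA, blogB) {
  return blogA.concat(blogB).sort(
    (x, y) => Date.parse(y.published) - Date.parse(x.published)
  );
}

// Made-up sample entries, only to show the ordering.
const photoBlog = [
  { title: 'photo post 2', published: '2008-01-22T07:30:00-08:00' },
  { title: 'photo post 1', published: '2008-01-10T20:00:00-08:00' },
];
const codeBlog = [
  { title: 'code post 1', published: '2008-01-21T09:00:00-08:00' },
];

const merged = mergeEntries(photoBlog, codeBlog);
console.log(merged.map((e) => e.title));
```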

Where to from here? I've spent a few hours writing this program. Should I turn it into a full fledged blog management tool? It will be a really good exercise in learning some UI programming. I've never really invested serious time into that.

Oh, I also discovered how little functionality the Blogger post management console really provides. Ugh.

Wednesday, January 23, 2008

I am authentic!

Well, at least in the Google sense. I wrote a little Java program (that stretches out beyond the blog's viewing area, but you can still copy and paste in its entirety) that's able to authenticate me against Google:

public static void main(String[] args) {
    // BLOGGER, APP_NAME, username and password are defined elsewhere in the class
    GoogleService blogger = new GoogleService(BLOGGER, APP_NAME);
    String authToken = null;
    try {
        authToken = blogger.getAuthToken(username, password, null, null, BLOGGER, APP_NAME);
    } catch (AuthenticationException e) {
        System.err.print("Could not log into account for " + username + ". Aborting.");
        return;
    }
    System.out.print("Authentication succeeded, received token " + authToken);
}

I know, it isn't much. And had I actually read the full developer document, I would have been able to code this up in about five minutes. Instead, I managed to spend a half hour doing it. I'm special.

Oh, there was no snow on Highway 35. Only lots of fog. And the vista point they have at the corner of 92 and 35 is pretty unspectacular.

What's the cause here?

Found this interesting graph... Question is, are states democratic because of their economic standing, or do states have their economic standing because they are democratic? I tend to think it's the former, but the latter point of view gives me pause. How would you refute it anyway?

Gravity rediscovered

Apple's stock seems to be in a free fall this morning. I'm not really that concerned about the stock market in the long run. It's going to do what it's going to do. I'm worried about the short term: we have taken on a construction project that's going to require more money input than our cash can cover, and I really hope I'm not forced to sell stock with the market doing this badly. There are some tax returns coming, I hope they cover our needs.

Couldn't make it to 35 this morning to photograph snow. Work intervened. I'll head out a little early today though and see if I can get a few shots on the way back home. I stopped at the Crystal Springs reservoir though. Without the sun lighting the clouds, it was possible to capture the shape of the mist over the hills. Crystal Springs could be a really interesting site to take photos, but the SF Water District is very protective of that land (with good reason) and has the water fenced off. The best shots there are quite out of reach, which makes for a very frustrating experience.

Tuesday, January 22, 2008

The interminable wall st slide

I've been pretty unhappy with the state of the economy ever since I started keeping track of it in my own little way. (It hasn't been that long.) The US is spending money in Iraq, it seems basically through borrowing. The dollar's slid so much, traveling abroad (even if I could, but that's a different story) seems like a pipe dream. The Fed's cutting rates to prop up a poorly functioning stock market, at the risk of raising inflation even further. This very myopic world view in which the US operates is extremely troubling at so many levels.

Early morning photo op

Woke up this morning to find a big beautiful full moon over city night lights. I think I got some good shots before clouds and daylight took over the scene. I got a glimpse of stories of snow on Mt Diablo. I wonder if there's snow on 35 again this year. I'll likely check it out tomorrow on the way to work. Can't leave the car home in any case.

Too bad tonight's photos came after the close of the photo assignment at MBP. The moon would have made a wonderful long exposure shot.

Monday, January 21, 2008

Merging blogs

I've decided I need to blog more frequently. And having a single blog is going to help me in that endeavor. Being a programmer, I naturally can't just copy and paste the blogs. No, I am going to try and do this through software. Open source lisp software. On windows.

I must be mad, that's not going to work. There isn't a decent SOAP library for CL, which makes this project a non-starter. An alternative might be to use FOIL to construct a Lisp to Java bridge, and use the Java APIs to fetch data from blogger.

How about I just do this in Java instead? Lisp has a very small role to play in this process, so I'd spend all my time figuring out how to move data from Java to Lisp and back, without really doing anything interesting or sophisticated in Lisp.

Pragmatism wins.