Darwinian Web
Adam Green's thoughts on the evolution of the Internet

Posts tagged as: microcontent

Exploring the tacit knowledge between RSS and the Semantic Web

Posted on Friday, April 14, 2006 at 8:48 AM (permalink)

I started reading about the Semantic Web again last week, and my immediate reaction was the same as when I first tried a few months ago. This is such a perfectly specified, intellectually rigorous collection of standards and practices that it seems almost impossible to find an entry point. If it is so hard to get started, then how does anyone work with it? The answer is that the people who understand it now are the same people who helped to build it. Each of the many sub-standards and protocols was introduced in reaction to a specific problem discovered during the creation of some other portion of this edifice. An analogy is trying to understand how an immense cathedral could possibly have been built by walking around the finished building. Once the scaffolding and the masses of workers are long gone, it seems like every part fits seamlessly into every other, and the thousands of decisions that were made during its construction are erased.

I decided to pull back a step and look at namespaces in RSS and the many competing standards for structured microcontent on the Web. This area is much messier and clearly a work in progress, but once again, as with the Semantic Web, the same individuals keep popping up in these many projects. The problem with the social nature of the construction of microformats, structured blogging, RSS, Atom, etc., is that much of the decision process is unspoken, or at least underdocumented. Why are there two competing sets of blog microcontent formats? Why are there apparently dozens of overlapping collections of RSS namespaces? The answers are lost in the maze of blog posts and standards announcements made over the last few years. Why isn't everyone involved with this area terminally confused? Because they lived through the process and understand the political, social, and commercial aspects of each of these multiple-body collisions.

What we now have is a continuum from the ultra-simplistic, underspecified formats of RSS and OPML to the ultra-rigid, crystalline perfection of the Semantic Web. In between is a rabbit warren of partially completed, interconnected attempts to add more structure and functionality to RSS and HTML.

So what is the solution? I'm not conceited enough to believe that I can unravel the current mess lying between RSS and the Semantic Web, and I'm also not smart enough to try to storm the castle of the Semantic Web by brute intellectual force. What my past history has shown me is that I am capable of helping people build tools and writing documentation that can help bridge this gap. The process I'm going to follow is to start studying and coding with the RSS namespaces and microcontent formats until they gradually make sense, and then try to get tools built by others that will provide a more accessible conceptual model. In other words, I'm going to live there until I grok the neighborhood.

I went through the same process when I moved to Boston. The classic line when trying to explain how to navigate within Boston is "I can't tell you how to get there, but I can take you once and show you." This is a perfect example of tacit knowledge. It is something you and your community know, but which can't be explained in words. It may be an urban legend, but there are many stories of truck drivers paying taxis to lead them through Boston's streets to a specific location. The only way to deal with Boston's streets is to carry a map for the first few weeks until your brain somehow builds the tacit knowledge you need to feel comfortable.

Structured Blogging is a key step toward a de facto SAPI

Posted on Wednesday, December 14, 2005 at 10:08 AM (permalink)

At Syndicate today Marc Canter announced a set of XML data standards for encoding various types of microcontent, such as movie reviews, that he is calling Structured Blogging. This is clearly needed for the growth of features around RSS and the standardization of the XML returned by a SAPI.

I can see my tail getting longer

Posted on Friday, December 9, 2005 at 8:26 AM (permalink)

Just as when I added tags to the site, the switch to individual pages for each post is again adding on to the tail at the far end of my readership stats. Spreading out all of my posts doesn't just change the readership habits by allowing people to link more precisely from the outside, it may well influence my writing as well, as I get a better view of what people are interested in.

The importance of being unique

Posted on Thursday, December 8, 2005 at 5:22 AM (permalink)

The essential ingredient in a microcontent data model is a unique id for each post. I had been assigning unique ids to each post from the beginning (hey, I'm a database guy), but I didn't use them in generating the blog pages. Instead I used the date and time of the post to identify it in URLs, so a post written in the early morning on December 5th would have an address of:
http://darwinianweb.com/archive/2005/1205.html#6:47AM

Now it has a URL of:
http://darwinianweb.com/archive/2005/116.html

David Weinberger has just written a useful essay on the benefits of unique ids for each blog post.
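
As a rough illustration of the two address schemes, here is a minimal Python sketch that rebuilds both URL forms shown above. The function names and the `BASE` constant are invented for the example, and this is not the code that generates this blog. It only shows why the id-based form is easier to work with: the date-based form puts the post identity in a URL fragment, which browsers do not send to the server, and two posts made in the same minute would share an address.

```python
from datetime import datetime

# Assumed base path, taken from the two example URLs in the post.
BASE = "http://darwinianweb.com/archive"


def date_url(posted: datetime) -> str:
    """Old scheme: one page per day, with the post time as a fragment."""
    hour = posted.hour % 12 or 12          # 12-hour clock, no leading zero
    suffix = "AM" if posted.hour < 12 else "PM"
    page = f"{posted.month:02d}{posted.day:02d}.html"
    return f"{BASE}/{posted.year}/{page}#{hour}:{posted.minute:02d}{suffix}"


def id_url(post_id: int, year: int) -> str:
    """New scheme: one page per post, named by the post's unique id."""
    return f"{BASE}/{year}/{post_id}.html"


if __name__ == "__main__":
    print(date_url(datetime(2005, 12, 5, 6, 47)))
    print(id_url(116, 2005))
```

Run as a script, the two printed lines should match the two addresses quoted in the post.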

Breaking down the blog into microcontent

Posted on Thursday, December 8, 2005 at 5:03 AM (permalink)

Today I started producing blog pages using a model based more completely on microcontent. What this means in practice is that each post now has its own page. I originally wrote the code to produce this blog with a daily organization of posts: the home page held seven days of posts grouped by date, and there was an archive page for each day with all the posts for that day. I realized after a month or so that I don't really write in patterns that fit a day, perhaps because I'm minimally autobiographical. I tend to write 2-3 multi-paragraph posts on different subjects each day. So it makes more sense to treat each post as a separate bead on the string, and move from one post to the next, instead of one day to the next. The internal advantage is that I can now track the readers' collective attention more easily. Incoming links will be directly to the post, rather than the date, so my readership stats will reflect what is read on a post-by-post basis.
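
The "bead on the string" idea can be sketched in a few lines of Python. Everything here is hypothetical (the `Post` class, the helper names, the sample paths) and is only meant to show the two consequences described above: navigation moves from post to post, and hits can be counted per post because each post has its own path.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Post:
    post_id: int   # the unique id used in the page name
    title: str


def neighbors(posts: List[Post], post_id: int) -> Tuple[Optional[Post], Optional[Post]]:
    """Return the (previous, next) posts around post_id, or None at an end.

    Assumes `posts` is already in publication order.
    """
    ids = [p.post_id for p in posts]
    i = ids.index(post_id)
    prev_post = posts[i - 1] if i > 0 else None
    next_post = posts[i + 1] if i < len(posts) - 1 else None
    return prev_post, next_post


def hits_per_post(requested_paths: List[str]) -> Counter:
    """Count requests per page path; with one page per post, this is per post."""
    return Counter(requested_paths)
```

With a daily archive page, the same count would only say which day was read, not which of that day's posts drew the reader in.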

Switching from a relational to a microcontent model

Posted on Wednesday, November 30, 2005 at 7:15 PM (permalink)

For the past week I've been writing about data structures and microcontent here, and planning the database architecture for my new Ruby-based blog code on my Ruby blog. It's interesting how two different streams of thought can influence each other. I've been building relational database applications since 1981, and I'm now thinking about the ways that a microcontent database model would affect a web-based application.

The basic difference between the relational and the microcontent model is flexibility and variability. A good relational database is designed with three things in mind: speed, speed, and speed. That means streamlining everything in terms of consistency. While relational databases can be modified relatively easily in terms of physical arrangement of data between files, trying to add a completely new data type can take some serious programming, and trying to adapt to a new data structure during run-time sounds suicidal.

If you want to buy into the microcontent model of individual packets of information, each of which can have different types of data, then you need to move to an object-based database, in which each item can have a unique internal structure. It is most likely that developers will use relational databases for quite some time to handle microcontent, but this will change as they start thinking about flows of individual data items instead of a structured set of normalized data tables.
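
To make the contrast concrete, here is a small Python sketch under simplifying assumptions. `POST_COLUMNS`, `to_row`, and `ItemStore` are names made up for the illustration, and the in-memory dictionary stands in for whatever object or XML database would really be used. The fixed-column function plays the role of a relational table, and the store keeps each item as a self-describing packet with its own fields.

```python
import json
from typing import Dict, List

# Relational-style layout: every row has exactly these columns.
POST_COLUMNS = ("id", "title", "body")


def to_row(item: dict) -> tuple:
    """Flatten an item into the fixed columns; fields outside them are dropped."""
    return tuple(item.get(col) for col in POST_COLUMNS)


class ItemStore:
    """Microcontent-style store: each item is serialized whole, keyed by id."""

    def __init__(self) -> None:
        self._items: Dict[int, str] = {}

    def put(self, item: dict) -> None:
        # Only "id" is required; every other field is up to the item.
        self._items[item["id"]] = json.dumps(item)

    def get(self, item_id: int) -> dict:
        return json.loads(self._items[item_id])

    def of_type(self, kind: str) -> List[dict]:
        """Return all items whose optional "type" field equals kind."""
        items = (json.loads(raw) for raw in self._items.values())
        return [item for item in items if item.get("type") == kind]


if __name__ == "__main__":
    post = {"id": 1, "type": "post", "title": "Hello", "body": "First post."}
    review = {"id": 2, "type": "review", "title": "A film", "rating": 4}
    store = ItemStore()
    store.put(post)
    store.put(review)
    print(to_row(review))        # the rating has no column to land in
    print(store.get(2))          # the stored item keeps its own structure
```

The trade-off is the one described above: the fixed layout is easy to index and fast to scan, while the item store accepts a new kind of item at run time without a schema change but has to work harder to query across items.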

Of course, someone will have to do the hard work of making a fast microcontent database. I'll be evaluating new object and XML-based databases in the future. Oh boy, sounds like fun!

What do you mean it's free?

Posted on Wednesday, November 30, 2005 at 8:00 AM (permalink)

I had dinner last night with an old friend from the software business and once again had one of those conversations where we try to come to grips with a new Internet economic model. In 1995 I was telling my software friends to drop everything and start publishing web sites. But what is the business model, they asked. Why should they give content away on public web pages when they could publish with AOL or CompuServe? In 1999 I was telling them to read The Cathedral and the Bazaar and try to wrap their heads around free software. But how can we give away software and still make money, they cried. Last night I explained what I knew about microcontent, and said that in their rush for customers the major content holders and search engines would provide unlimited APIs and RSS feeds for all of their content.

The next wave of freely available intellectual property will once again distort the Internet economy, but that won't prevent it from happening. I don't think that there is some inevitable progression to all IP being free. Each set of changes took place for different reasons and at different times, but it is clear that massive change can occur before there is an economic justification. The Internet doesn't care if anyone makes money or loses money; the Internet serves the crowd.

The "Berlin Wall" of this next burst of data freedom will fall when Google unlocks the limits on its search engine API. I say to you, Brin and Page, tear down that wall!

I just grokked microcontent

Posted on Tuesday, November 29, 2005 at 12:54 PM (permalink)

Sometimes you realize that two or three different terms actually refer to the same idea, and suddenly the world seems a little clearer. That just happened for me with "microcontent." I had filed that away as one of those buzzwords I would have to decipher eventually. I was much more interested in the idea of individual chunks of data, such as blog posts, floating freely through the datasphere. I was reading a post on Joshua Porter's Bokardo, which led me to a great essay by Terry Heaton. I saw that Terry's idea of "unbundled media," Google Base entries, and RSS items are all examples of microcontent. Now I feel better. A lot of walls have collapsed into a large common area.