Wednesday, January 11, 2006

Improving problem description

This week is set aside for improving the chapter on problem description. Every thesis has a "problemstilling" (Norwegian for problem statement), a problem which the thesis should solve: a goal, or a challenge. Currently the chapter looks a little something like this:

Challenges

What are the challenges that have pushed content management forward? What problems do IT departments suffer from today in relation to web content? Issues in web management.

The issues of web content management

Content is not navigable. There is too much of it: too many web pages with too many attached documents. Often a corporation will put considerable resources into maintaining a site map and a navigation tree, but if these are made manually, they take a lot of work and come with no guarantee of being correct. Searching is a great shortcut for making all content available, but searching the right way is easier said than done. Does the search engine check whether the search word was misspelled? Are there any synonyms of the search word which should be checked as well?
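The two search questions above can be made concrete with a rough Python sketch. Everything in it is an assumption made up for illustration: the tiny `INDEX`, the `SYNONYMS` table, the page names and the 0.8 similarity cutoff. A real search engine would use a proper index and thesaurus; this only shows the idea of correcting a misspelled word and then widening the query with synonyms.

```python
from difflib import get_close_matches

# Hypothetical inverted index: search word -> pages containing it.
INDEX = {
    "content": {"page1", "page2"},
    "management": {"page1"},
    "document": {"page3"},
    "file": {"page4"},
}

# Hypothetical synonym table.
SYNONYMS = {"document": ["file"]}


def expand_query(word, index=INDEX, synonyms=SYNONYMS):
    """Return the (possibly spell-corrected) word plus its synonyms."""
    word = word.lower()
    if word not in index:
        # Fall back to the most similar indexed word, if one is close enough.
        close = get_close_matches(word, list(index), n=1, cutoff=0.8)
        if close:
            word = close[0]
    return [word] + synonyms.get(word, [])


def search(word, index=INDEX, synonyms=SYNONYMS):
    """Collect every page that matches the word or one of its synonyms."""
    pages = set()
    for term in expand_query(word, index, synonyms):
        pages |= index.get(term, set())
    return pages
```

With this toy data, searching for the misspelled "documnet" is corrected to "document" and also picks up the page indexed under "file".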


Content is useless. The website is full of dead links. There exist many pages and documents which are not linked to at all, and which therefore will never be accessed. It is safe to say that content which is not accessed and used has no value.
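Both symptoms, dead links and pages nobody links to, can be detected mechanically once the link structure of a site is known. The sketch below assumes the site has already been crawled into a dictionary mapping each existing page to the pages it links to; the function name `audit_links`, the `home` parameter and the file names are hypothetical.

```python
def audit_links(site, home="index.html"):
    """Find dead links and unlinked pages.

    `site` maps every existing page to the list of pages it links to.
    Returns (dead, orphans): dead is a set of (source, target) pairs whose
    target does not exist; orphans is the set of existing pages that no
    other page links to (the home page is excluded, since it is the entry).
    """
    existing = set(site)
    dead = {
        (src, dst)
        for src, links in site.items()
        for dst in links
        if dst not in existing
    }
    # Links from a page to itself do not make that page reachable.
    linked = {
        dst
        for src, links in site.items()
        for dst in links
        if dst != src
    }
    orphans = existing - linked - {home}
    return dead, orphans


example_site = {
    "index.html": ["about.html", "gone.html"],
    "about.html": ["index.html"],
    "old-report.html": [],
}
dead_links, orphan_pages = audit_links(example_site)
```

Here `gone.html` shows up as a dead link from `index.html`, and `old-report.html` as a page with no incoming links. The check only looks at direct links; it does not verify that a page is reachable from the home page.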


Content is not automatically accessible. No XML export. Recently many news sites have started offering subscriptions via RSS feeds. By subscribing to these feeds in RSS readers or news aggregators, the process of collecting news from these sites is turned, from the reader's point of view, from a pull model (actively surfing around on news sites) into a push model (content is pushed to the reader, like mail to a recipient), even though the feed reader itself still polls the feed in the background.
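As an illustration of what a basic XML export could look like, here is a minimal sketch that builds an RSS 2.0 feed with Python's standard library. The function name `build_rss`, the site title and the example.com URLs are placeholders invented for this example; a real feed would normally also carry publication dates and descriptions per item.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Serialize a list of {'title', 'link'} dicts as an RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three elements on the channel.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_rss(
    "Example news",
    "http://example.com/",
    "Latest articles from a hypothetical site",
    [
        {"title": "First article", "link": "http://example.com/1"},
        {"title": "Second article", "link": "http://example.com/2"},
    ],
)
```

A feed reader that fetches this document periodically can show new items without the user visiting the site.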


Content has no meta information. There has been a noteworthy increase in the ability to tag or label various data objects with metadata, as in the header of an HTML page or in the properties of a Word document. It is difficult to force users into actually using these features manually. If the title of this document is "Content Management", why should I write in its metadata that it is about the same topic? A possible solution to the metadata problem lies in automatically tagging content [HP, 2004].
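To give an idea of what automatic tagging might mean in its simplest form, the sketch below suggests tags by counting the most frequent non-trivial words in a text. This is not the method described in [HP, 2004]; it is a plain word-frequency heuristic chosen for illustration, and the stop-word list and the function name `suggest_tags` are assumptions.

```python
import re
from collections import Counter

# A deliberately small, hypothetical stop-word list.
STOPWORDS = {
    "the", "a", "an", "of", "and", "is", "to", "in", "it",
    "for", "on", "are", "that", "this", "with",
}


def suggest_tags(text, n=3):
    """Suggest up to n tags: the most frequent words that are not stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]


tags = suggest_tags(
    "Content management is the management of content. Content matters.", n=2
)
```

For the sample sentence this proposes "content" and "management" as tags, which the system could then write into the document's metadata without asking the author.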


Content is technically inaccessible. Dependency on specific software or a specific platform restricts the number of users.


-------

So I need to come up with something more to complete this chapter. A good CMS doesn't produce the problems mentioned above. CMSes like this already exist, I'm sure. And the goal of the thesis could indeed be to present a CMS that solves them through the use of open standards. To get the open source bit in, I should add something about functionality and customization (functionality is content too, as Boiko said). An old, rusty CMS, or even a modern one (but not a tidy one), can be quite hard to extend when its components are not reusable. Content is not reusable.

Interestingly, I'm not the only one who's been asking questions about metadata. Seth Cambridge is another blogger I just added to my Bloglines. Still, it remains a problem that so much of the CMS theory landscape consists of opinions in blogs and online articles, media not really appreciated by the people who will judge my thesis. I might have to go back to basics and read up on some ancient IT theory I can reuse in this context (but I haven't really got time to do that).