Tuesday, February 21, 2006

New background-introduction

I went through my intro and have started to rewrite the whole thing. Posting it here to show A. The difference is that this intro takes a more top-down, basic approach.

Background

The last ten years have seen revolution after revolution within information technology and telecommunications. The rise of the Internet, the success of the World Wide Web, the growing availability of personal computers and server performance and, more recently, the spread of mobile devices and broadband Internet are all trends in the new technological infrastructure. This infrastructure supports the modern world's key asset: electronic, or digital, data and information.


To illustrate the growth in digital storage capacity, one might consider the claim that the information estimated lost in the burning of the Great Library of Alexandria would fit on a single DVD. As storage space has grown and network bandwidth has widened, the mass of digital information has exploded, both internally on intranets and on the Internet. Users of the Internet have been most significantly affected by the increase in e-mail traffic and in the number of documents and pages available on the World Wide Web.


The value of information is only equal to that of its use. To use information, it must be found, recovered, formatted and presented. Information which is stored but never used is worthless. Digital information is put to use through Information Systems. Before one can define the particular kind of Information System referred to as the Content Management System, one needs to define content itself, and separate it from data and information.


Definitions

Data, information, content and knowledge are four ambiguous concepts which are regularly applied in Information Systems. Since this paper limits the definitions to digital representation, we leave out the definition of knowledge for now and focus on the other three. These terms have various meanings, and are potential candidates for extensive ontological discussion. To avoid confusion, the meanings of these terms as used in this paper are defined as follows:


Data

The basic unit of digital representation, which can be used to construct information and content with more value for the consumer. Data is raw and granular. It does not inherently have any meaning; its meta-data is not self-contained.


Data is a set of symbols, ranging from a numerical value to a string of words, or even a large series of encoded symbols that compose a binary value representing sound or picture. One often speaks of data processing: feeding data as input to a program or algorithm, with the output being new data, information or content. Imagine reducing a hundred numerical values to one number by calculating their mean. Data has been processed, but no meaning has been added. Had the value been wrapped with the context that this is the average temperature for the last three months, it could have been considered information.
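
The temperature example can be sketched in a few lines of Python. This is only an illustration of the distinction, not part of any cited definition: the `Information` class, its field names, the sample readings and the Celsius unit are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Information:
    """A data value wrapped with the context that gives it meaning (hypothetical)."""
    value: float
    quantity: str
    unit: str
    period: str

    def describe(self) -> str:
        return f"{self.quantity} for {self.period}: {self.value:.1f} {self.unit}"


# Raw data: one hundred numerical values with no stated meaning (made-up numbers).
readings = [float(n % 10) for n in range(100)]

# Processing the data yields new data; the result is still just a number.
average = mean(readings)

# Wrapping the number with context is what makes it information.
# The unit and period here are assumed purely for illustration.
info = Information(
    value=average,
    quantity="average temperature",
    unit="degrees Celsius",
    period="the last three months",
)
```

The same `average` could just as well be wrapped as an average price or an average age; the number does not change, only the meaning attached to it.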


Information

One definition of information is data with meaning (Davenport and Prusak, 1998 [fix]). The same information can be conveyed with different data. Information consists of pieces of data combined with meta-data to form a package of meaning that can be conveyed. Bob Boiko includes all the common forms of recorded communication. Liz Orna ([Boiko 2002]: Orna, E. (2004) Information Strategy in Practice, Aldershot: Gower, p. 7) describes it as knowledge transformed into a transportable format, visible or audible.


Content

This is perhaps the vaguest term which we must define. Candidate definitions include:


  • Information put to use [Boiko 2002]

  • Information with human meaning and context [wikipedia]

  • Information with an intended consumer, artificial or real [personal note]

  • Information with a purpose (the now disbanded ContentWatch organization's definition [Boiko 2002, p. 8]).


The definition used in this paper is streamlined for how content can be handled by an Information System: content is a collection or subset of information intended for a given audience or non-human consumer, with a context of location, period and situation.
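
To show how this definition could be handled by an Information System, here is a minimal sketch of content as a data structure. All names (`Context`, `Content`, `content_for`) and the sample values are hypothetical; they simply mirror the wording of the definition.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Context:
    """The context part of the definition: location, period and situation."""
    location: str
    period: str
    situation: str


@dataclass(frozen=True)
class Content:
    """A collection of information intended for a given audience, in a context."""
    information: Tuple[str, ...]
    audience: str  # a human audience or a non-human consumer
    context: Context


def content_for(items: List[Content], audience: str) -> List[Content]:
    """Select the content intended for one audience (human or not)."""
    return [item for item in items if item.audience == audience]


# Made-up sample content for two different consumers.
term_context = Context("university web site", "spring term", "course enrolment")
items = [
    Content(("Enrolment opens on Monday.",), "students", term_context),
    Content(("enrolment_open=true",), "search indexer", term_context),
]
student_content = content_for(items, "students")
```

The point of the sketch is that the audience and context are stored with the information itself, which is what separates content from plain information in this paper.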


Content management

Now that the definition is in place, the segment of Information Systems known as Content Management Systems can be defined. Note that within the content management industry, the term is used loosely. Some CMS vendors claim their services feature knowledge management or enterprise content management. At the other end of the scale, many lightweight web applications claim to do content management when they actually provide what most would perceive as web content management, or perhaps merely weblog or wiki functionality.


Content management means different things to different actors. The basic lifecycle of content is production and consumption. For the producer, the processes of content management include creation, formatting, structuring and integration of content. For the consumer, they include search, export, and display. The sum of these processes makes up content management. A content management system (CMS) is a suite of tools designed to assist and support these processes.


Web content management

As pointed out earlier, the explosion of digital information has been most significant on the World Wide Web. To manage this mass of online content and its use, a new breed of information systems has evolved: the Web Content Management System (WCMS). The responsibility of such a system is similar to that of the CMS, only it is limited to content that is consumed by way of the World Wide Web. [See “Why only a web content management system” to see how WCMS has become detached from the CMS].

Tuesday, February 14, 2006

Hero of the Week Award

Hero of the Week Award: " It is pathetic and sad that the scientists are perpetrating an act, not of mere hubris, and larceny, but seriously, murder."

Just noting that many of the articles I review are from journals available only to subscribers (that includes my university, so as long as I'm on a UiO-IP, I'm free to browse for example the ACM portal online, same goes for the IEEE periodical).

Why is this? My education is paid for by the government (the people). My professors are paid by the government, and so is their research. Why should the knowledge they produce only be available to subscribers? Annual membership to the ACM Portal is $100. IEEE charges $13 for one article (subscription prices are insane).

Article review: A Fragment-Based Approach for Efficiently Creating Dynamic Web Content

The system described in this article takes it several steps further, using Java to dynamically generate web pages based on fragments of content. There is a focus on performance, utilizing caching and graph algorithms (ODG) for conditionally rendering/re-building/publishing pages when fragments are changed. The templates are made with the ESI language (looks a bit like SSI). The major part of the article runs through a series of benchmarks and statistics to evaluate performance, but it also shows functionality for searching, evaluating incoming and outgoing hyperlinks, as well as other handy ODG analysis stuff.
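
To make the idea of rebuilding pages when a fragment changes a bit more concrete for myself, here is a minimal Python sketch of a dependency graph. It is not the article's Java implementation or its actual algorithm; the class, method names and sample pages are hypothetical, and the traversal is a plain breadth-first search.

```python
from collections import defaultdict, deque


class DependencyGraph:
    """Tracks which pages and fragments embed which fragments (hypothetical sketch)."""

    def __init__(self):
        self._dependents = defaultdict(set)  # fragment -> objects that embed it
        self._pages = set()

    def add_page(self, page):
        self._pages.add(page)

    def embed(self, outer, inner):
        """Record that `outer` (a page or fragment) includes fragment `inner`."""
        self._dependents[inner].add(outer)

    def pages_to_rebuild(self, changed):
        """Return every page that directly or indirectly embeds `changed`."""
        seen = {changed}
        queue = deque([changed])
        while queue:
            node = queue.popleft()
            for dependent in self._dependents.get(node, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
        return sorted(seen & self._pages)


# Made-up site: a headline fragment sits inside a sidebar fragment,
# which is embedded in two pages; a footer is embedded in a third.
graph = DependencyGraph()
for page in ("index.html", "news.html", "about.html"):
    graph.add_page(page)
graph.embed("sidebar", "headline")
graph.embed("index.html", "sidebar")
graph.embed("news.html", "sidebar")
graph.embed("about.html", "footer")

stale = graph.pages_to_rebuild("headline")
```

Only the pages reachable from the changed fragment are rebuilt, so a change to the headline leaves about.html untouched.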

This article is a bit too technical for me. There is of course a crucial need to research performance under GRUPA, and this article does a fine job of pressing the issue, but still it's not particularly within my scope. Will still reference it under the levels.


Article review: A Simple Web Content Management Tool...

[title continued:] .. as the Solution to a Web Site Redesign

How long can these titles get? Anyhow, the article is from the ACM journal, only a couple of pages long. It describes how they (programmers at University at Buffalo) implemented a WCM tool with Perl/CGI over a two-month period to perform what I would call the templating of their webpages. They also created some style guides (part of CM strategy).
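
By templating I mean roughly the following, sketched here in Python rather than the Perl/CGI the article used. The template, the `render_page` function and the sample page are my own made-up illustration, not code from the article.

```python
from string import Template

# One shared layout; every page only supplies its own title and body.
PAGE_TEMPLATE = Template("""<html>
<head><title>$title</title></head>
<body>
<h1>$title</h1>
$body
</body>
</html>""")


def render_page(title, body):
    """Fill the shared layout with one page's content.

    The body is inserted as-is, so this sketch assumes trusted input.
    """
    return PAGE_TEMPLATE.substitute(title=title, body=body)


html = render_page("Library hours", "<p>Open 9-17 on weekdays.</p>")
```

Changing the layout in one place then changes every page, which is the whole point of the redesign described in the article.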

Will put a reference to this one in somewhere under the levels of CM.


Thursday, February 09, 2006

The difference between a portal and a WCMS

What is the difference between a portal and a WCMS? I've been asking myself that question since the beginning of this thesis. Others have asked as well. Indeed, the WCMS I've been working on for Primetime used to be called Primetime Portal.

Now it seems the question is bubbling around in the CMS blogosphere these days. John Quirk suggests "..if they are dealing with content a CMS solution is where they should be focused. If they are planning on allowing access to back end applications or information stored in those applications, a portal solution is a better fit.". Toby Ward claims the gap will shrink (or blur, perhaps making it easier to fall down the gap?) as both product families grow and mature. Bob Boiko suggests Knowledge Portals are the last trick from the KM camp. James Robertson has written a white-paper on business portals which explains the concept of portals rather well.

When I present my thesis to more-or-less technology aware people, they sometimes ask "Oh, you're writing about portals then?". The word portal sometimes seems to have become synonymous with company homepage. The definitions are a bit loose, so the answer could actually be yes. The views below explain why:

Portal people: Content management is part of the portal. We have a CMS portlet in our portal. A CMS alone doesn't have personalization or integration of different content sources, nor pluggable functionality in the form of portlets.

CMS people: A portal is just one way to display the CMS to the end user, just like a blog is another (smaller) way to perform CM. Portal is just a hyped product name. CM is the theory behind it. We have pluggable functionality in the form of templates. Personalization is pointless (who cares if you want a pink webpage?).

A CMS is content oriented. Is a portal knowledge oriented? Portals try to be the silver bullet of corporate information management, but the above bloggers suggest that they have become too complex and still do not possess sufficient CMS functionality. Don't get me wrong, portals are very useful for distributed corporations, serving as a central source of information. I haven't played around with too many enterprise class portals, but I can guess they still lack CMS functionality like content version control, WYSIWYG editing and workflow, as Ward mentioned.

So how to relate to portals in the thesis? Should a portal be part of a CM strategy, or should the portal be an integration tool outside the CMS? I think I'll go for the latter alternative and rather focus on WCMS, might mention portal in a side note.

Disclaimer: Suffering with a terrible cold here, so mind that some of these ramblings might be fever inspired ;)

Wednesday, February 08, 2006

Finding the red thread

First post in a long time now. Managed to catch quite a nasty cold the other day, and the week before was pretty crammed with turning one open source CMS into a webshop (will post about it later).

One of the main issues with my thesis as it stands is that it lacks a red thread through it. The thesis is an answer to a question; a research question. Am I ready to specify a context for the question? See, writing about open source and open standards in WCMS is still too general. I need some sort of approach or specialty, an aspect I can attack.

In my experience, you have to be careful when reading through your own papers. Do it too often, and you grow tired of your own content. My technique will now be to first look quickly through the thesis (7500 words) and the blog (11000 words, bloglines is a great tool for viewing your whole blog on one page), and see if I can select an aspect based on the content I've already collected.

Afterwards, I will go through it again, more slowly this time, adding more content, references, generally putting in the fixes suggested by A.

The aspect

Been a while since I read my first blog-posts. Interesting to note that I've all along proclaimed customizability as the most important feature of a WCMS, especially in an off-the-shelf WCMS, or an open source WCMS. Reading through the blog gave me the impression that I've been bouncing back and forth between different aspects, roaming from knowledge management to the architecture of a WCMS, the social aspects of open source, and the concept of a web portal. I doubt all these concepts can be treated properly in the thesis.

Reading quickly through the thesis. Some chapters follow the thread, others are immensely subjective and will have to be heavily edited. The abstract and introduction suggest the following red thread:

  • requirements of WCMS
  • how requirements are met by open source systems (OSS) and open standards
  • how OSS compares to proprietary WCMS
  • how to compare different OSWCMS

So the choice remains. What are the options?

  1. The aspects of open source and open standards
  2. WCMS as a knowledge management tool
  3. A study of an open source content management system project (social and technical aspects)
  4. The business case of using an open source WCMS
  5. The technical implications of JSR-170

As previously mentioned, my current title is The Use of Open Standards and Open Source in Web Content Management Systems. Why can't I stick to this aspect? Is it too general? Too subjective?

Nonetheless, I've got material for all these aspects, especially number 3.

Depending on feedback from A, I want to try to make my current title the red thread. After this has been settled, I'll give the thesis a good write-thru and see how it goes.