Skip to main content

Summer thoughts

Today I'm taking the train back down south for my summer vacation. Since it's been a while since my last post, I'll do some more thoughts here before I'm off. I am planning to spend a good bit of the summer reading through Bob Boiko's CMS bible, as well as putting together a scheduler-application with JSF, and perhaps glue it together with Magnolia. Will also be making some much needed custom Magnolia templates.

I was reading the CMS bible the other night, and I read something about sorting content in to binary and "normal" files. Binary files being "all 1's and 0's", like pictures and media files, but also files of a closed format. A word file or an Excel workbook is a binary file. During the course of knowledge management this spring, I learned that heaps of information which contents are allready at an atomic level (there is no easy way to aggregate elements out of a word document), we call unstructured data. An email, which content contains no tags outside the header of the mail (subject, recepient, sender, etc), is also unstructured. Unstructered data is bad for content management because you can't tell from the outside what the content is about.

But back to the proprietary formats like Word, it occured to me the risk of a file or document being unstructered, is connected to the fact whether it is based on an open standard or not. XML (depending if you use it right) is structured. PDF is structured.

I'm sure Word has some sort of structure as well, but it's hard for the outside world to enable and use this structure since it is a closed format (there are some projects evolving how to read MS' formats, I have personally been using Apache POI for reading Excel workbooks). The POI team doesn't seem to be too pleased with reverse engineering and reading closed formats, using package names like
  • HSSF - Horrible Spreadsheet Format
  • HPSF - Horrible Property Set Format
  • DDF - Dreadful Drawing Format
Amusing, but it also paints a serious picture of what it is like to deal with closed formats. A MS technical engineer might have the correct tools and frameworks available for accessing a Word file as structured data. The rest of the world, however, does not.

Closed formats automatically fall into the bag of unstructered/binary data in a CMS.

Enjoy your summer!

Popular posts from this blog

Encrypting and Decrypting with Spring

I was recently working with protecting some sensitive data in a typical Java application with a database underneath. We convert the data on its way out of the application using Spring Security Crypto Utilities. It "was decided" that we'd be doing AES with a key-length of 256, and this just happens to be the kind of encryption Spring crypto does out of the box. Sweet!

The big aber is that whatever JRE is running the application has to be patched with Oracle's JCE in order to do 256 bits. It's a fascinating story, the short version being that U.S. companies are restricted from exporting various encryption algorithms to certain countries, and some countries are restricted from importing them.

Once I had patched my JRE with the JCE, I found it fascinating how straight forward it was to encrypt and decrypt using the Spring Encryptors. So just for fun at the weekend, I threw together a little desktop app that will encrypt and decrypt stuff for the given password and sa…

Managing dot-files with vcsh and myrepos

Say I want to get my dot-files out on a new computer. Here's what I do:

# install vcsh & myrepos via apt/brew/etc
vcsh clone https://github.com/tfnico/config-mr.git mr
mr update

Done! All dot-files are ready to use and in place. No deploy command, no linking up symlinks to the files. No checking/out in my entire home directory as a Git repository. Yet, all my dot-files are neatly kept in fine-grained repositories, and any changes I make are immediately ready to be committed:

config-atom.git
    -> ~/.atom/*

config-mr.git
    -> ~/.mrconfig
    -> ~/.config/mr/*

config-tmuxinator.git  
    -> ~/.tmuxinator/*

config-vim.git
    -> ~/.vimrc
    -> ~/.vim/*

config-bin.git   
    -> ~/bin/*

config-git.git          
    -> ~/.gitconfig

config-tmux.git  
    -> ~/.tmux.conf    

config-zsh.git
    -> ~/.zshrc

How can this be? The key here is to use vcsh to keep track of your dot-files, and its partner myrepos/mr for operating on many repositories at the same time.

I discovere…

Always use git-svn with --prefix

TLDR: I've recently been forced back into using git-svn, and while I was at it, I noticed that git-svn generally behaves a lot better when it is initialized using the --prefix option.

Frankly, I can't see any reason why you would ever want to use git-svn without --prefix. It even added some major simplifications to my old git-svn mirror setup.

Update: Some of the advantages of this solution will disappear in newer versions of Git.

For example, make a standard-layout svn clone:

$ git svn clone -s https://svn.company.com/repos/project-foo/

You'll get this .git/config:

[svn-remote "svn"]
        url = https://svn.company.com/repos/
        fetch = project-foo/trunk:refs/remotes/trunk
        branches = project-foo/branches/*:refs/remotes/*
        tags = project-foo/tags/*:refs/remotes/tags/*

And the remote branches looks like this (git branch -a):
    remotes/trunk
    remotes/feat-bar

(Compared to regular remote branches, they look very odd because there is no remote name i…

Joining eyeo: A Year in Review

It's been well over a year since I joined eyeo. And 'tis the season for yearly reviews, so...

It's been pretty wild. So many times I thought "this stuff really deserves a bloggin", but then it was too inviting to grab onto the next thing and get that rolling.

Instead of taking a deep dive into some topic already, I want to scan through that year in review and think for myself, what were the big things, the important things, the things I achieved, and the things I learned. And then later on, if I ever get around to it, grab one of these topics and elaborate in a dedicated blog-post. Like a bucket-list of the blog posts that I should have written. Here goes:
How given no other structures, silos will grow by themselves This was my initial shock after joining the company. Only a few years after taking off as a startup, the hedges began growing, seemingly almost by themselves, and against the will of the founders. I've worked in silos, and in companies without the…

Automating Computer Setup with Boxen

I just finished setting up a new laptop at work, and in doing so I revamped my personal computer automation quite a bit. I set up Boxen for installing software, and I improved my handling of dot-files using vcsh, which I'll cover in the next blog-post after this one.

Since it's a Mac, it doesn't come with any reasonable package manager built in. A lot of people get along with a combination of homebrew or MacPorts plus manual installs, but this time I took it a step further and decided to install all the "desktop" tools like VLC and Spotify using GitHub's Boxen:

  include vlc
  include cyberduck
  include pgadmin3
  include spotify
  include jumpcut
  include googledrive
  include virtualbox

If the above excerpt looks like Puppet to you, it's because it is. The nice thing about this is that I can apply the same puppet scripts on my Ubuntu machines as well. Boxen is Mac-specific, Puppet is not.

It was a little weird to get started with Boxen, as you're offered…