Google Guava's Predicates

So the other day I was working away on some rather tricky rules for validating an incoming identifier code. Like

999392304II+TJAJEIAJTI901119

or

EPFOOMNNVLLALOWEIITOE

Now, the system where this code is going is rather picky about what these codes should look like. (For the sake of this blog post, I'll invent some business rules here on the fly. In reality they are a bit different, but their nature is the same):

A code cannot be null, or an empty string (of course)
If there are only letters (A-Z) in the code, it must be longer than 10 characters, but fewer than 20
On the other hand, if there are letters mixed with numbers, it must be exactly 10 or 15 characters
Numbers only are not allowed
No other symbols are allowed, except + signs, and those only if the code starts with 999. In this case, the max length is 10.
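Read literally, the rules above could be sketched in plain Java roughly like this. This is not the code from the Github project, just a minimal illustration; the class name CodeRulesSketch is made up, and I'm assuming that letters means uppercase A-Z and that a 999 code may contain letters, digits and + signs only.

```java
// A hypothetical, minimal sketch of the invented rules; not the project's actual code.
final class CodeRulesSketch {

    static boolean isCodeAllowed(String code) {
        // A code cannot be null or an empty string.
        if (code == null || code.isEmpty()) {
            return false;
        }
        // Letters only: longer than 10 characters, but fewer than 20.
        if (code.matches("[A-Z]+")) {
            return code.length() > 10 && code.length() < 20;
        }
        // Numbers only are not allowed.
        if (code.matches("[0-9]+")) {
            return false;
        }
        // Letters mixed with numbers: exactly 10 or 15 characters.
        if (code.matches("[A-Z0-9]+")) {
            return code.length() == 10 || code.length() == 15;
        }
        // Anything reaching this point contains some other symbol.
        // Assumption: only '+' is tolerated, and only when the code starts with 999.
        if (code.startsWith("999") && code.matches("[A-Z0-9+]+")) {
            return code.length() <= 10;
        }
        return false;
    }
}
```

Under these assumptions, isCodeAllowed("ABCDE12345") is accepted (mixed, exactly 10 characters), while isCodeAllowed("1234567890") is rejected (numbers only).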

Instead of just if-else'ing my way through these rules, I implemented them using Predicates from Google Guava. I was pretty happy with the result, and wanted to put it up against what the classic solution would look like.

I've created a little project on Github that demonstrates the above rules. There are two branches:

ifelse - The old imperative way, using if's and else's

predicate - The new functional way, using predicates

Both branches pass the same test suite. I won't swear on the correctness of the implementation, as it's currently 01:59 in the night, and the tests aren't really extensive.

Performance is not a big thing in this validation function. We're dealing with codes that come in every few seconds, so we've got plenty of milliseconds to spend on making the code look as nice as possible. Also, my regexp sucks, I know.

Here's a quick look at what I ended up with using if's and else's:

(Source on Github)

And here's using predicates:
(Source on Github)

In order to really appreciate the differences, it's best to have a look at the whole classes on Github. It's a Maven project, so you can import it into IDEA or Eclipse, and use Git to jump between the two branches for comparison.

I'm not saying that the predicates way is easier to read or write, but there is something functional and declarative about it that feels kind of... powerful. A nice thing about those predicates is that they can easily be combined with each other, and re-used in other places, like in filtering collections using other Guava utils.

Note that the full predicate-style solution is much larger, but most of the blame goes to the base predicates I had to define myself: matching, shorterThan, longerThan and exactLength.

If we were dealing with a language that properly supported functions (a predicate is a function that evaluates to true/false), it would probably look a lot nicer. Perhaps with Java 8, I can slim down the code quite a bit...
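To give an idea of what that might look like, here is a rough sketch using java.util.function.Predicate from Java 8 instead of Guava's Predicate. It is not the code from the predicate branch. The base predicates borrow the names matching, shorterThan, longerThan and exactLength from above, but their bodies, the class name Java8PredicatesSketch, the constant names and the allowedOnly helper are all assumptions of mine, as is reading "letters" as uppercase A-Z.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// A hypothetical sketch of the invented rules with Java 8 predicates.
final class Java8PredicatesSketch {

    // Base predicates: each one is a one-line lambda.
    static Predicate<String> matching(String regex) { return s -> s.matches(regex); }
    static Predicate<String> shorterThan(int n) { return s -> s.length() < n; }
    static Predicate<String> longerThan(int n) { return s -> s.length() > n; }
    static Predicate<String> exactLength(int n) { return s -> s.length() == n; }

    static final Predicate<String> NULL_OR_EMPTY = s -> s == null || s.isEmpty();
    static final Predicate<String> ONLY_LETTERS = matching("[A-Z]+");
    static final Predicate<String> ONLY_DIGITS = matching("[0-9]+");
    // Letters and digits, with at least one of each.
    static final Predicate<String> LETTERS_AND_DIGITS =
            matching("[A-Z0-9]+").and(ONLY_LETTERS.negate()).and(ONLY_DIGITS.negate());
    // Assumption: a "plus code" starts with 999, contains at least one '+',
    // and otherwise consists of letters, digits and '+' only.
    static final Predicate<String> PLUS_CODE = matching("999[A-Z0-9+]*\\+[A-Z0-9+]*");

    // The rules, combined with the default methods and/or/negate.
    // and() short-circuits, so a null code never reaches the regex predicates.
    static final Predicate<String> CODE_ALLOWED =
            NULL_OR_EMPTY.negate().and(
                    ONLY_LETTERS.and(longerThan(10)).and(shorterThan(20))
                            .or(LETTERS_AND_DIGITS.and(exactLength(10).or(exactLength(15))))
                            .or(PLUS_CODE.and(shorterThan(11))));

    // The same predicate re-used for filtering a collection.
    static List<String> allowedOnly(List<String> codes) {
        return codes.stream().filter(CODE_ALLOWED).collect(Collectors.toList());
    }
}
```

A single code is checked with CODE_ALLOWED.test(code), and allowedOnly shows the re-use for filtering. Numbers-only codes need no explicit rule in this sketch, since they simply don't match any of the three accepted shapes.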

I just read that the Guava developers are going to include new fluent interfaces for combining predicates in a more elegant way than the prefix style I've used above. Maybe something like

new FluentPredicate().not(NULL_OR_EMPTY).and(ONLY_LETTERS).build().

Read more here about FluentPredicates, etc. They can't be too far away, release-wise.

For other Guava resources, check out my collection here.

Comments

Louis Wasserman, 23/12/11 11:43
You might get some mileage out of CharMatcher for this particular task. I wonder if there's a Predicate wrapper that tests if a CharMatcher matches all characters in a string.
Johannes Brodwall, 23/12/11 23:26
Personally, I found the predicates variant more intimidating and it required more concentration to cope with.

In any case, there is an excellent opportunity for you to empirically evaluate which version is more maintainable: I found a bug that was present in both implementations.

As I understand your description, the following test case should pass:

assertFalse(isCodeAllowed("###########"));

(I suspected that this was present based on the implementation of both isCodeAllowed methods)

You could use this as an opportunity for comparing how long it takes to fix a bug in either version, which is a good proxy for "goodness" in this case.