-
Reversing Groovy switch statement
Recently I’ve been working on a Groovy code that had many methods with long multibranch conditionals like this:
def parse(message, options) {if (options.contains('A')) {parseARule message} else if (options.contains(2)) {parseSmallDigitRule message...} else if (options.contains(something)) {parseSomeRule message} else {parseSomeOtherRule message}}Although this code is working, it is hard to see which branch is called under which condition. It would be much better if we could replace this code with something like Lisp
cond
macro. The best candidate for such a task in Groovy would be aswitch
statement. If we could only refactor the code above to something like following, it would significantly improve readability:def parse(message, options) {switch (options) {case 'A' : return parseARule(message)case 2 : return parseSmallDigitRule(message)...case ... : return parseSomeRule(message)default : return parseSomeOtherRule(message)}}Unfortunately, this code doesn’t work out of the box in Groovy, but it works if we do some metaprogramming.
The way
switch
statement works in Groovy is a bit different than in Java. Instead of equals() it uses isCase() method to match case-value and switch-value. The default implementation of isCase() method falls back to equals() method, but some classes, including Collection, override this behaviour. That’s why in Groovy you can do things like this:switch (value) {case ['A','E','I','O','U'] : return 'vowel'case 0..9 : return 'digit'case Date : return 'date'default : return 'something else'}For our purposes we need some sort of reverse
switch
, where collection is used as a switch-value, and String and Integer are used as a case-value. To do this we need to override default implementation of isCase() method on String and Integer classes. It’s not possible in Java, but is very easy in Groovy. You can change method implementation globally by replacing it in corresponding meta class, or locally with the help of categories. Let’s create a category that swaps object and subject of isCase() method:class CaseCategory {static boolean isCase(String string, Collection col) {reverseCase(string, col)}static boolean isCase(Integer integer, Collection col) {reverseCase(integer, col)}// Add more overloaded methods here if neededprivate static boolean reverseCase(left, right) {right.isCase(left)}}Now we can use this category to achieve the goal we stated at the beginning of this post:
def parse(message, options) {use (CaseCategory) {switch (options) {case 'A' : return parseARule(message)case 2 : return parseSmallDigitRule(message)...case ... : return parseSomeRule(message)default : return parseSomeOtherRule(message)}}}If you are comfortable with global method replacement, you can amend String and Integer meta classes. In this case you don’t need to wrap
switch
statement withuse
keyword. -
Lazy lists in Groovy
I like lazy evaluation, and it’s one of the reasons I like Haskell and Clojure. Although from engineering perspective lazy evaluation is probably not the most needed feature, it’s definitely very useful for solving some mathematical problems.
Most languages don’t have lazy evaluation out of the box, but you can implement it using some other language features. This is an interesting task, and I use it as a code kata which I practice every time I learn a new strict language.
So, how to implement lazy lists in a strict language? Very simple, if the language is functional. Namely, you build lazy list recursively by wrapping strict list within a function. Here is, for example, the strict empty list in Groovy:
[]If we wrap it with a closure, it becomes lazy empty list:
{-> [] }If we need a list with one element, we prepend (or speaking Lisp terminology cons) an element to lazy empty list, and make the result lazy again:
{-> [ element, {-> [] } ] }To add more elements we continue the same process until all elements are lazily consed. Here is, for example, a lazy list with three elements a, b and c:
{-> [a, {-> [b, {-> [ c, {-> [] } ] } ] } ] }Now, when you have an idea how to build lazy lists, let’s build them Groovy way. We start by creating a class:
LazyList.groovy class LazyList {private Closure listprivate LazyList(list) {this.list = list}}The variable
list
encapsulates the closure wrapper of the list. We need to expose some methods that allow constructing lists using procedure described above:LazyList.groovy (cont'd) static LazyList nil() {new LazyList( {-> []} )}LazyList cons(head) {new LazyList( {-> [head, list]} )}Now we can construct lists by consing elements to empty list:
def lazylist = LazyList.nil().cons(4).cons(3).cons(2).cons(1)To access elements of the list we implement two standard functions,
car
andcdr
, that return head and tail of the list respectively.LazyList.groovy (cont'd) def car() {def lst = list.call()lst ? lst[0] : null}def cdr() {def lst = list.call()lst ? new LazyList(lst[1]) : nil()}Here is how you use these functions to get first and second elements of the list constructed above
assert lazylist.car() == 1assert lazylist.cdr().car() == 2In Lisp there are built-in functions for various
car
andcdr
compositions. For example, the previous assertion would be equivalent to functioncadr
. Instead of implementing all possible permutations, let’s use Groovy metaprogramming to achieve the same goal.LazyList.groovy (cont'd) def methodMissing(String name, args) {def matcher = name =~ /^c([ad]+)r$/if (matcher) {matcher[0][1].reverse().toList().inject(this) {del, cr -> del."c${cr}r"()}} else {throw new MissingMethodException(name, this.class, args)}}It might look complicated, but in reality it’s pretty simple if you are familiar with Groovy regex and functional programming. It’s easier to explain by example. If we pass “caddr” as a value of
name
parameter, the method will create a chain on method calls.cdr().cdr().car()
which will be applied to delegate of the operation which is our LazyList object.With this method in place we can call car/cdr functions with arbitrary depth.
assert lazylist.caddr() == 3If you create nested lazy lists, you can access any element of any nested list with this dynamic method.
def lmn = LazyList.nil().cons('N').cons('M').cons('L')def almnz = LazyList.nil().cons('Z').cons(lmn).cons('A')assert almnz.cadadr() == 'M'With so many cons methods it’s hard to see the structure of the list. Let’s implement
lazy
method on ArrayList class that converts strict list to lazy. Again, we will use metaprogramming and functional techniques.ArrayList.metaClass.lazy = {-> delegate.reverse().inject(LazyList.nil()) {list, item -> list.cons(item)}}Now we can rewrite the previous example as follows
def lazyfied = ['A', ['L','M','N'].lazy(), 'Z'].lazy()assert lazyfied.cadadr() == 'M'What have we accomplished so far? We learned how to build lazy lists from scratch and from strict lists. We know how to add elements to lazy lists, and how to access them. The next step is to implement
fold
function.fold
is the fundamental operation in functional languages, so our lazy lists must provide it.LazyList.groovy (cont'd) boolean isEmpty() {list.call() == []}def fold(n, acc, f) {n == 0 || isEmpty() ? acc : cdr().fold(n-1, f.call(acc, car()), f)}def foldAll(acc, f) {isEmpty() ? acc : cdr().foldAll(f.call(acc, car()), f)}The only difference between this
fold
function and the standard one is the additional parameter n. We will need it later when we implement infinite lists.foldAll
function to lazy lists is the same as standardfold
to strict lists.assert [1,2,3,4,5].lazy().foldAll(0){ acc, i -> acc + i } == 15assert [1,2,3,4,5].lazy().fold(3, 1){ acc, i -> acc * i } == 6First example calculates the sum of all elements of the list, second calculates the product of first three elements.
If you have
fold
functions you can easily implementtake
functionsLazyList.groovy (cont'd) def take(n) {fold(n, []) {acc, item -> acc << item}}def takeAll() {foldAll([]) {acc, item -> acc << item}}def toList() {takeAll()}take
is an inverse operation tolazy
assert [1,2,3,4,5].lazy().takeAll() == [1,2,3,4,5]assert [1,2,3,4,5].lazy().take(3) == [1,2,3]Our next goal is
map
function on lazy lists. Ideally I want the implementation look like thisdef map(f) {isEmpty() ? nil() : cdr().map(f).cons(f.call(car()))}For some reason it doesn’t work lazy way in Groovy — it’s still strictly evaluated. Therefore I have to implement it directly with closure syntax
LazyList.groovy (cont'd) def map(f) {isEmpty() ? nil() : new LazyList( {-> [f.call(car()), cdr().map(f).list]} )}Unlike
fold
, lazymap
is identical to strictmap
assert [1,2,3,4,5].lazy().map{ 2 * it }.take(3) == [2,4,6]The following example shows one of the benefits of laziness
assert [1,2,3,0,6].lazy().map{ 6 / it }.take(3) == [6,3,2]map
didn’t evaluate the entire list, hence there was no exception. If you evaluate the expression for all the elements, the exception will be throwntry {[1,2,3,0,6].lazy().map{ 6 / it }.takeAll()}catch (Exception e) {assert e instanceof ArithmeticException}For strict lists this is a default behaviour of
map
function.The last function I want to implement is
filter
LazyList.groovy (cont'd) def filter(p) {isEmpty() ? nil() :p.call(car()) ? new LazyList( {-> [car(), cdr().filter(p).list]} ) :cdr().filter(p)}In the following example we find first two elements greater than 2
assert [1,2,3,4,5].lazy().filter{ 2 < it }.take(2) == [3,4]With the help of
car
/cdr
,fold
,map
andfilter
you can implement any other function on lazy lists yourself. Here is, for example, the implementation ofzipWith
functionLazyList.groovy (cont'd) static def zipWith(alist, blist, f) {alist.isEmpty() || blist.isEmpty() ? nil() :new LazyList( {-> [f.call(alist.car(), blist.car()),zipWith(alist.cdr(), blist.cdr(), f).list]} )}Now, after we implemented all lazy functions we need, let’s define infinite lists
LazyList.groovy (cont'd) private static sequence(int n) {{-> [n, sequence(n+1)]}}static LazyList integers(int n) {new LazyList(sequence(n))}static LazyList naturals() {integers(1)}Infinite lists, from my point of view, is the most useful application of lazy lists
def naturals = LazyList.naturals()assert naturals.take(3) == [1,2,3]def evens = naturals.map { 2 * it }assert evens.take(3) == [2,4,6]def odds = naturals.filter { it % 2 == 1 }assert odds.take(3) == [1,3,5]assert naturals.cadddddddddr() == 10def nonnegatives = naturals.cons(0)assert nonnegatives.cadr() == 1assert LazyList.zipWith(evens, odds){ x, y -> x * y }.take(4) == [2,12,30,56]At this point you have all basic functionality implemented, and you should be able to extend this model to whatever you need in regards to lazy (infinite) lists. Happy lazy programming!
Resources and links
- Source code for this blog
- Lazy list implementation in Erlang
- Lazy list implementation in Lisp
-
Counting modifications in Git repository
Michael Feathers wrote a blog about Open-Closed Principle, where he described simple technique that measures the closure of the code. I created a Groovy script which implements this algorithm for Git repositories. If you run it from the root of your Git project, it produces a CSV file with the statistics of how many times the files have been modified.
As an example, here is the top 10 files from rabbitmq-server repository
845 src/rabbit_amqqueue_process.erl 711 src/rabbit_channel.erl 650 src/rabbit_tests.erl 588 src/rabbit_variable_queue.erl 457 src/rabbit_amqqueue.erl 448 src/rabbit_mnesia.erl 405 src/rabbit.erl 395 src/rabbit_reader.erl 360 src/rabbit_msg_store.erl 356 src/rabbit_exchange.erl
-
Maven and Git
More and more Maven projects are switching from Subversion to Git, and the majority of those projects make the same mistake: They configure scm section of POM to point to remote repository, the same way they did it in Subversion
<scm><url>http://github.com/SpringSource/spring-batch</url><connection>scm:git:git://github.com/SpringSource/spring-batch.git</connection><developerConnection>scm:git:ssh://git@github.com/SpringSource/spring-batch.git</developerConnection></scm>By doing this they lose the main benefit of Git. They become dependent on the remote machine. And when their release depends on the remote machine, that’s what happens: They need to release the project but the remote machine is down
The right way of configuring Git in Maven is following
<scm><url>scm:git:file://.</url><connection>scm:git:file://.</connection><developerConnection>scm:git:file://.</developerConnection></scm>This configuration is universal, and it separates two orthogonal concerns: releases and remote copies.
Resources
- Screencast that shows how to work with Git in Maven projects.
-
Joe Armstrong on optimization
A quote from Erlang and OTP in Action book
Make it work, then make it beautiful, then if you really, really have to, make it fast. 90 percent of the time, if you make it beautiful, it will already be fast. So really, just make it beautiful!
A quote from InfoQ interview
I think that people first of all write a problem, they solve the problem and then they sort of optimize this code, work on it and the code becomes very efficient but unreadable. What I think they should doing is specifying it with a domain specific language or some higher thing and then writing a compiler and then changing the compiler because it’s not efficient. Because then they would have the benefits of a clear specification and a fast implementation. What they do is they don’t keep these things separated and the language doesn’t support them separating it like that.