Balaji Vajjala's Blog

A DevOps Blog from the Trenches

Maven Deployment Linker - plug-in

This plug-in does something simple yet very useful: instead of archiving artifacts, it lists the deployments performed at build time to the Maven proxy you are running, regardless of the proxy vendor (Archiva, Artifactory or Nexus). All you need to do in your Maven build is select one check-box:

You can also filter artifacts with regex.

The result is:

And the status bar shows:

SVN bash_completion - Subversion productivity boost

Remembering all the svn switches and CLI options is not necessary. Unless you are using some nifty GUI tool, you will definitely want this one.

Just google “Subversion bash completion” and you will find yourself here,

so grab it, source it and use it:

apt-get install bash-completion
wget http://worksintheory.org/files/misc/bash_completion_svn -O /etc/bash_completion.d/subversion

You are good to go.

Please note: installing the bash-completion deb package adds a bunch of bash completion helpers; see the full list by running:

dpkg -L bash-completion

All the magic, however, is in one short file, /etc/profile.d/bash_completion.sh, and anything you place under /etc/bash_completion.d/ will automatically be sourced upon login.
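The sourcing mechanism itself is easy to sketch. The snippet below mimics the loop (using a temporary directory as a stand-in for /etc/bash_completion.d so it stays self-contained; the real script also guards for interactive shells and bash versions):

```shell
# Mimic what /etc/profile.d/bash_completion.sh does on login:
# every readable snippet in the completion directory is sourced.
completion_dir=$(mktemp -d)    # stand-in for /etc/bash_completion.d
printf 'SVN_COMPLETION_LOADED=yes\n' > "$completion_dir/subversion"

for snippet in "$completion_dir"/*; do
    [ -r "$snippet" ] && . "$snippet"
done

echo "$SVN_COMPLETION_LOADED"   # prints: yes
```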

Lazy lists in Groovy

I like lazy evaluation; it’s one of the reasons I like Haskell and Clojure. Although from an engineering perspective lazy evaluation is probably not the most needed feature, it’s definitely very useful for solving some mathematical problems.

Most languages don’t have lazy evaluation out of the box, but you can implement it using some other language features. This is an interesting task, and I use it as a code kata which I practice every time I learn a new strict language.

So, how do you implement lazy lists in a strict language? Very simply, if the language is functional: you build the lazy list recursively by wrapping a strict list within a function. Here is, for example, the strict empty list in Groovy:

[]

If we wrap it with a closure, it becomes a lazy empty list:

{-> [] }

If we need a list with one element, we prepend (or, in Lisp terminology, cons) an element to the lazy empty list, and make the result lazy again:

{-> [ element, {-> [] } ] }

To add more elements we continue the same process until all elements are lazily consed. Here is, for example, a lazy list with three elements a, b and c:

{-> [a, {-> [b, {-> [ c, {-> [] } ] } ] } ] }

Now that you have an idea how to build lazy lists, let’s build them the Groovy way. We start by creating a class:

LazyList.groovy
class LazyList {
    private Closure list

    private LazyList(list) {
        this.list = list
    }
}

The variable list encapsulates the closure wrapper of the list. We need to expose some methods that allow constructing lists using the procedure described above:

LazyList.groovy (cont’d)
    static LazyList nil() {
        new LazyList( {-> []} )
    }

    LazyList cons(head) {
        new LazyList( {-> [head, list]} )
    }

Now we can construct lists by consing elements onto the empty list:

def lazylist = LazyList.nil().cons(4).cons(3).cons(2).cons(1)

To access elements of the list we implement two standard functions, car and cdr, which return the head and tail of the list respectively.

LazyList.groovy (cont’d)
    def car() {
        def lst = list.call()
        lst ? lst[0] : null
    }

    def cdr() {
        def lst = list.call()
        lst ? new LazyList(lst[1]) : nil()
    }

Here is how you use these functions to get the first and second elements of the list constructed above:

assert lazylist.car() == 1
assert lazylist.cdr().car() == 2

In Lisp there are built-in functions for various car and cdr compositions. For example, the previous assertion would be equivalent to the function cadr. Instead of implementing all possible combinations, let’s use Groovy metaprogramming to achieve the same goal.

LazyList.groovy (cont’d)
    def methodMissing(String name, args) {
        def matcher = name =~ /^c([ad]+)r$/
        if (matcher) {
            matcher[0][1].reverse().toList().inject(this) {
                del, cr -> del."c${cr}r"()
            }
        } else {
            throw new MissingMethodException(name, this.class, args)
        }
    }

It might look complicated, but in reality it’s pretty simple if you are familiar with Groovy regexes and functional programming. It’s easiest to explain by example: if we pass “caddr” as the value of the name parameter, the method will create a chain of method calls .cdr().cdr().car(), which is applied to the delegate of the operation, which is our LazyList object.

With this method in place we can call car/cdr functions with arbitrary depth.

assert lazylist.caddr() == 3

If you create nested lazy lists, you can access any element of any nested list with this dynamic method.

def lmn = LazyList.nil().cons('N').cons('M').cons('L')
def almnz = LazyList.nil().cons('Z').cons(lmn).cons('A')
assert almnz.cadadr() == 'M'

With so many cons calls it’s hard to see the structure of the list. Let’s implement a lazy method on the ArrayList class that converts a strict list to a lazy one. Again, we will use metaprogramming and functional techniques.

ArrayList.metaClass.lazy = {
    -> delegate.reverse().inject(LazyList.nil()) {list, item -> list.cons(item)}
}

Now we can rewrite the previous example as follows:

def lazyfied = ['A', ['L','M','N'].lazy(), 'Z'].lazy()
assert lazyfied.cadadr() == 'M'

What have we accomplished so far? We learned how to build lazy lists from scratch and from strict lists. We know how to add elements to lazy lists, and how to access them. The next step is to implement the fold function. fold is a fundamental operation in functional languages, so our lazy lists must provide it.

LazyList.groovy (cont’d)
    boolean isEmpty() {
        list.call() == []
    }

    def fold(n, acc, f) {
        n == 0 || isEmpty() ? acc : cdr().fold(n-1, f.call(acc, car()), f)
    }

    def foldAll(acc, f) {
        isEmpty() ? acc : cdr().foldAll(f.call(acc, car()), f)
    }

The only difference between this fold function and the standard one is the additional parameter n; we will need it later when we implement infinite lists. The foldAll function is to lazy lists what the standard fold is to strict lists.

assert [1,2,3,4,5].lazy().foldAll(0){ acc, i -> acc + i } == 15
assert [1,2,3,4,5].lazy().fold(3, 1){ acc, i -> acc * i } == 6

The first example calculates the sum of all elements of the list; the second calculates the product of the first three elements.

Once you have the fold functions, you can easily implement the take functions:

LazyList.groovy (cont’d)
    def take(n) {
        fold(n, []) {acc, item -> acc << item}
    }

    def takeAll() {
        foldAll([]) {acc, item -> acc << item}
    }

    def toList() {
        takeAll()
    }

take is the inverse operation of lazy:

assert [1,2,3,4,5].lazy().takeAll() == [1,2,3,4,5]
assert [1,2,3,4,5].lazy().take(3) == [1,2,3]

Our next goal is a map function for lazy lists. Ideally I would want the implementation to look like this:

    def map(f) {
        isEmpty() ? nil() : cdr().map(f).cons(f.call(car()))
    }

For some reason this doesn’t work lazily in Groovy; it’s still strictly evaluated. Therefore I have to implement it directly with closure syntax:

LazyList.groovy (cont’d)
    def map(f) {
        isEmpty() ? nil() : new LazyList( {-> [f.call(car()), cdr().map(f).list]} )
    }

Unlike fold, lazy map is identical to strict map

assert [1,2,3,4,5].lazy().map{ 2 * it }.take(3) == [2,4,6]

The following example shows one of the benefits of laziness

assert [1,2,3,0,6].lazy().map{ 6 / it }.take(3) == [6,3,2]

map didn’t evaluate the entire list, hence there was no exception. If you evaluate the expression for all the elements, the exception will be thrown

try {
    [1,2,3,0,6].lazy().map{ 6 / it }.takeAll()
}
catch (Exception e) {
    assert e instanceof ArithmeticException
}

For strict lists this is the default behaviour of the map function.

The last function I want to implement is filter

LazyList.groovy (cont’d)
    def filter(p) {
        isEmpty() ? nil() :
            p.call(car()) ? new LazyList( {-> [car(), cdr().filter(p).list]} ) :
                cdr().filter(p)
    }

In the following example we find the first two elements greater than 2:

assert [1,2,3,4,5].lazy().filter{ 2 < it }.take(2) == [3,4]

With the help of car/cdr, fold, map and filter you can implement any other function on lazy lists yourself. Here is, for example, an implementation of the zipWith function:

LazyList.groovy (cont’d)
    static def zipWith(alist, blist, f) {
        alist.isEmpty() || blist.isEmpty() ? nil() :
            new LazyList( {-> [
                f.call(alist.car(), blist.car()),
                zipWith(alist.cdr(), blist.cdr(), f).list
            ]} )
    }

Now that we have implemented all the lazy functions we need, let’s define infinite lists:

LazyList.groovy (cont’d)
    private static sequence(int n) {
        {-> [n, sequence(n+1)]}
    }

    static LazyList integers(int n) {
        new LazyList(sequence(n))
    }

    static LazyList naturals() {
        integers(1)
    }

Infinite lists are, from my point of view, the most useful application of lazy lists:

def naturals = LazyList.naturals()
assert naturals.take(3) == [1,2,3]

def evens = naturals.map { 2 * it }
assert evens.take(3) == [2,4,6]

def odds = naturals.filter { it % 2 == 1 }
assert odds.take(3) == [1,3,5]

assert naturals.cadddddddddr() == 10

def nonnegatives = naturals.cons(0)
assert nonnegatives.cadr() == 1

assert LazyList.zipWith(evens, odds){ x, y -> x * y }.take(4) == [2,12,30,56]

At this point you have all basic functionality implemented, and you should be able to extend this model to whatever you need in regards to lazy (infinite) lists. Happy lazy programming!

Resources and links

Counting modifications in Git repository

Michael Feathers wrote a blog post about the Open-Closed Principle, where he described a simple technique that measures the closure of the code. I created a Groovy script which implements this algorithm for Git repositories. If you run it from the root of your Git project, it produces a CSV file with statistics on how many times each file has been modified.
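The script itself isn’t shown here, but the core statistic can be reproduced with plain Git commands. The sketch below builds a throwaway repository and tallies modifications per file (the pipeline is my approximation, not the author’s Groovy script):

```shell
# Build a throwaway repository with one frequently modified file.
repo=$(mktemp -d) && cd "$repo"
git init -q .
for i in 1 2 3; do
    echo "$i" > hot.txt                 # modified in every commit
    echo "$i" > "cold_$i.txt"           # modified once
    git add .
    git -c user.email=demo@example.com -c user.name=demo \
        commit -q -m "change $i"
done

# Tally: list file names per commit, drop blank lines, count and sort.
git log --name-only --pretty=format: | grep -v '^$' \
    | sort | uniq -c | sort -rn
# hot.txt appears with count 3, each cold_*.txt with count 1
```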

As an example, here are the top 10 files from the rabbitmq-server repository:

845  src/rabbit_amqqueue_process.erl
711  src/rabbit_channel.erl
650  src/rabbit_tests.erl
588  src/rabbit_variable_queue.erl
457  src/rabbit_amqqueue.erl
448  src/rabbit_mnesia.erl
405  src/rabbit.erl
395  src/rabbit_reader.erl
360  src/rabbit_msg_store.erl
356  src/rabbit_exchange.erl

Maven and Git

More and more Maven projects are switching from Subversion to Git, and the majority of those projects make the same mistake: they configure the scm section of the POM to point to the remote repository, the same way they did it in Subversion:

<scm>
    <url>http://github.com/SpringSource/spring-batch</url>
    <connection>scm:git:git://github.com/SpringSource/spring-batch.git</connection>
    <developerConnection>scm:git:ssh://git@github.com/SpringSource/spring-batch.git</developerConnection>
</scm>

By doing this they lose the main benefit of Git: they become dependent on the remote machine. And when a release depends on the remote machine, this is what happens: you need to release the project but the remote machine is down.

The right way of configuring Git in Maven is the following:

<scm>
    <url>scm:git:file://.</url>
    <connection>scm:git:file://.</connection>
    <developerConnection>scm:git:file://.</developerConnection>
</scm>

This configuration is universal, and it separates two orthogonal concerns: releases and remote copies.
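With this configuration a release runs entirely against the local clone; publishing is a separate step you perform whenever the remote copy is reachable. A command sketch (assuming the standard maven-release-plugin; the branch name is an example):

```shell
# Prepare and perform the release against the local repository only;
# no remote outage can block these steps.
mvn release:prepare release:perform

# Pushing the release commits and the tag is an independent concern,
# done when the remote copy is available.
git push origin master
git push origin --tags
```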

Resources

  • Screencast that shows how to work with Git in Maven projects.

JBoss, Selenium, Maven, Hudson, M2 Extra Steps & Files Found Trigger plugins playing well together

JBoss, Selenium, Maven, Hudson, and the M2 Extra Steps and Files Found Trigger plugins: how do all these work together in a continuous-build and integration-test life-cycle?

The Story – The Use Case:

We have two projects producing two war artifacts which need to be deployed to a JBoss Application Server. Both webapps share a common base configuration, although the release life-cycle of each war has no correlation to the other.

In production both application servers are running and serving one another; thus, the integration tests should cover both JBoss instances and test their web services.

We are using Selenium tests for both webapps, and they need to run straight after a continuous build of each of the servers mentioned above. That said, a change in project A or in project B should trigger the integration-tests job, while if either project A or project B is building, the integration-test job shouldn’t run (at least until both projects, or one of them, complete).

The “work around” – forcing the “native Hudson” configuration (which we didn’t go with naturally):

Hudson - pinned/pinning plugins

If you wish to “hang on” to a certain plugin which ships with hudson.war, just unpin it in the Manage Hudson >> Manage Plugins page – this option is available since the 1.374 release (and you can always grab the latest).

See full explanation below quoted from hudson wiki:

The notion of pinned plugins applies to plugins that are bundled with Hudson, such as the Subversion plugin.

Normally, when you upgrade/downgrade Hudson, its built-in plugins overwrite whatever versions of the plugins you currently have in $HUDSON_HOME. This ensures that you use a consistent version of those plugins. However, this behavior also means that those plugins can never be manually updated, as every time you start Hudson they’ll be replaced by the bundled versions.

So when you manually update those bundled plugins, Hudson will mark those plugins as pinned to the particular version. Pinned plugins will never be overwritten by the bundled plugins during Hudson boot up. However, by definition, with pinned plugins you lose the benefit of automatic upgrade when you upgrade Hudson.

So the plugin manager in Hudson allows you to explicitly unpin plugins. On file system, Hudson creates an empty file called $HUDSON_HOME/plugins/plugin.hpi.pinned to indicate the pinning. This file can be manually created/deleted to control the pinning behavior.
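Since the marker is just an empty file, pinning can also be scripted. A sketch (the plugin file name and the HUDSON_HOME fallback are examples):

```shell
# Use the real $HUDSON_HOME if set; fall back to a demo directory.
HUDSON_HOME=${HUDSON_HOME:-$(mktemp -d)}
mkdir -p "$HUDSON_HOME/plugins"

# Pin the bundled Subversion plugin: Hudson will not overwrite it on boot.
touch "$HUDSON_HOME/plugins/subversion.hpi.pinned"

# To unpin, simply delete the marker:
# rm -f "$HUDSON_HOME/plugins/subversion.hpi.pinned"
```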

Remove the All view in Hudson + view enhancement plugins

I had two motivations for getting rid of the All view:

  1. The All view is quite annoying, don’t you think? After using Hudson for a while you have tens or hundreds of jobs lined up in a huge list – who needs that, right?
  2. I wanted a “hidden jobs” section – jobs no one but myself (and whoever needs access to them) can see.

In order to get rid of it (the All view), simply:

The hudson plug-ins you can’t live without

This post was originally posted & active @: http://www.tikalk.com. As a big fan of hudson-ci, I would like to take note of the hudson plug-ins I use most, needed in order to maintain a good build environment. This list was collected as part of my experience over the last couple of years. I am sure yours may differ from mine :).

Setenv plug-in

The Setenv plug-in lets you set environment variables for a job upon build execution. During migration from CruiseControl I found this plug-in extremely useful, for I could provide the imported script the exact environment it had on the CC machine without the need to change a thing in the build’s logic or parameters. This also applied to the following recommended plug-in:

Parameterized Trigger plug-in

The Parameterized Trigger plug-in lets you add parameters to your build jobs that users enter when they trigger a build. This is a very useful plug-in for release or deployment automation – for example, where you want to enter the version number (or label) you want to release or deploy. The biggest feature of this plug-in is the default value, so even automatic / SCM triggers get a default value and execute silently.

The Cygpath plug-in

For a *nix-oriented guy such as myself, this was a great help: all our “special” shell scripts do not have to be re-written when we are running builds on Windows nodes – and yes, we have those too … :)

The Cygpath plug-in gave me the opportunity to share tools between Linux and Windows machines; this gave us the ability to maintain one tool repository for all our slaves regardless of their architecture.

And did I forget to say that all you need is to enable this, and every batch executed on a Windows slave will automatically use Cygwin? From the Cygpath wiki:

  • You install Cygwin on all the Windows slaves

  • Jobs on Hudson that assume Unix environment can now run on all the slaves (including Windows ones)

  • In the system configuration, you use Unix paths for all your tools.

Promoted Builds plug-in

Definitely the #1 plug-in on this list – it enables you to do almost anything you can do in a certain job, but run it as a promotion task. If you wish to promote your build to your QA team for testing, or if you want to tag it in SVN or deploy your artifacts to a Maven repository, this is the plug-in you “cannot live without”. Without it you would need to configure a separate job or Batch Task (see the batch tasks plug-in for more details) for every task you want to perform on your build – which makes managing Hudson jobs a nightmare …

Clover plug-in

Clover is a non-free code coverage tool, the commercial alternative to Cobertura, Emma, etc. The Hudson Clover plug-in is an amazing add-on which integrates Clover reports and historical reports into the build flow, which I found extremely helpful. Try configuring Clover to generate historical reports and then publish them to some third-party web server for viewing – this has made Clover integration a breeze. The challenge is even bigger in a distributed build environment, which Hudson and the Clover plug-in have overcome.

If you don’t have Clover, as mentioned above, the Cobertura and Emma plug-ins are great too, and they will also integrate with:

Sonar plug-in

Although I am only “P.O.C-ing” Sonar + Hudson + Clover, the Sonar plug-in made it trivial to integrate Hudson projects with Sonar. Sonar is a powerful open source code quality metrics reporting tool, which displays code quality metrics for multiple projects in a variety of ways in a centralized web location.

For Maven-based builds you do not even need to change a line of configuration to get Sonar to work, which made this plug-in #2 on my “can’t live without” list.

Sectioned View plug-in

Sectioned View gives you the ability to create a “dashboard view” for your job(s) / project(s). It is quite feature-rich if you take a look at its configuration, and it is very simple to comprehend. A great example is taken from the plug-in’s wiki page:

Sectioned view screenshot

Nested Views plug-in

Nested Views is another view type which allows grouping job views into multiple levels instead of one big list of tabs. This is quite useful; the only disadvantage is that you cannot have both a view and jobs in the same page – it’s either a nested view or a list of views – but I presume this will be added eventually.

Shelve Project plug-in

If you ever wanted to hide a job you are working on, and you would also like to prevent it from being triggered by mistake, this is the plug-in for you. I often find myself setting up a job that becomes a work in progress, so hiding it until a later time is a great help – this plug-in does just that.

Bugzilla & Jira plug-ins (and there are others, I presume)

Well, the fact that I need both in the same Hudson cluster and can still have them work side by side was really important. In order for these plug-ins to serve you well, your CM team has to do some extra work on your SCM side; that done, you’ve got yourself a link directly into your bug-tracking system – the latest versions query Bugzilla and Jira and can display the bug details.

Job configuration change in Hudson 1.372

Yesterday I upgraded Hudson to the greatest and latest, which was a seamless upgrade.

A very obvious change was made to the Job configuration form; instead of “Tie Build to a node”:

There is now “Restrict where this project can be run”:

The disadvantage of this feature is that if you want to tie a build to a node, you need to know its name or node-group name prior to the actual configuration. The advantage is that you can be more specific about where you want your build to run, and with a large number of slaves this is quite important. Please note: “old” jobs aren’t affected by this change.