Balaji Vajjala's Blog

A DevOps Blog from Trenches

Starting Tomcat as part of the build process with HUDSON-CI

In one of my previous projects I needed to deploy two Tomcat servers as part of the build process. This could have been done by adding a profile with a deploy life cycle to the parent project, but we wanted to avoid errors due to misuse and therefore left this task to a continuous integration server. You guessed right: Hudson.

My test consisted of completing two build life cycles of mvn clean install, then collecting three WARs, deploying them into a remote Tomcat webapps dir, setting some spring.properties and starting both Tomcats.

This task seemed simple enough for Hudson to execute. My recipe was:

  1. A parametrized build, which sets the remote Tomcat home and remote DB variables (passed in spring.properties within the WARs) [see example screenshot below]
  2. A Hudson slave instance (so that I build directly on the remote host)
  3. The M2 Extra Steps plugin

After running the build about half a dozen times I learned that Hudson was killing those Tomcats as soon as the build succeeded, which seemed odd.

The build consisted of deploying three WAR files into the webapps dir, taking Tomcat down, overriding the spring.properties file in each WAR and then starting two Tomcat instances so that the new settings take effect.

On the remote machine, netstat showed during the build that the services were up and running, and Hudson’s build log showed that all WARs had been copied and all spring.properties files were in place. So what could be the issue here?

At some point it occurred to me, and it seemed almost natural, that these services should be killed: they are forked sub-processes of Hudson, and therefore will not continue “living” once their parent process is dead.

So how do you solve this?

The solution was to pass Hudson a build parameter called BUILD_ID with a default value of dontkillme.

You can read about this in HUDSON-2729; it will probably save you some time if you’re in the same situation.

According to that issue, though:

"This is fixed in 1.283.

The side effect of this is that if someone has been intentionally spawning
processes, those will be killed, too. This is based on environment variables, so
to work around this, you can spawn processes with one of the environment
variables different from what Hudson sets. For example,
BUILD_ID=dontKillMe catalina.sh start
will start Tomcat in such a way that it'll escape the killing."

I am running 1.338 and this is still an issue. However, I was performing the build via a slave, which might complicate things: Tomcat is not only a sub-process of Hudson, but a “sub-process of a sub-process”. As you may recall, I am running a parametrized build, so I set BUILD_ID=dontkillme as a default parameter of this build, and that solved the issue.
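If the process is launched from code rather than from a shell step, the same idea can be sketched in Groovy: override BUILD_ID only in the child's environment. This is a minimal sketch, not what the build above actually ran. The `sh -c echo` command is a stand-in (an assumption) for something like catalina.sh start, and it assumes a Unix-like host with sh on the PATH.

```groovy
// Hypothetical launcher: start a child process whose BUILD_ID differs from the one Hudson sets.
// The echo command is a placeholder for the real start script (e.g. catalina.sh start).
def builder = new ProcessBuilder('sh', '-c', 'echo "BUILD_ID=$BUILD_ID"')
builder.environment().put('BUILD_ID', 'dontKillMe')   // changes the child's environment only
builder.redirectErrorStream(true)

def process = builder.start()
def output = process.inputStream.text.trim()          // what the child saw
process.waitFor()
println output
```

The parent's environment is left untouched; only the spawned process sees the different value.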

Hope you find this useful.

Parametrized build example:

[Screenshot: Hudson CI parameters]

Parsing files using Groovy regex

In my previous post I mentioned several ways of defining regular expressions in Groovy. Here I want to show how we can use Groovy regex to find data in files.

Parsing properties file (simplified) [1]

Data: each line in the file has the same structure; the entire line can be matched by a single regex.

Task: transform each line into an object.

Solution: construct a regex with capturing parentheses, apply it to each line, extract the captured data.

Demonstrates: File.eachLine method, matrix syntax of the Matcher object.

def properties = [:]
new File('path/to/some.properties').eachLine { line ->
    def matcher = line =~ /^([^#=].*?)=(.+)$/   // group 1: key, group 2: value
    if (matcher) {
        properties[matcher[0][1]] = matcher[0][2]
    }
}
println properties
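To try the regex without a file on disk, the same loop can be run over an in-memory list. This is only a sketch: the sample lines and the names `lines` and `parsed` are made up for illustration.

```groovy
// Hypothetical sample data standing in for the contents of a properties file
def lines = [
    '# database settings',
    'db.url=jdbc:mysql://localhost/test',
    'db.user=admin',
    '',
    'not a property line'
]

def parsed = [:]
lines.each { line ->
    def m = line =~ /^([^#=].*?)=(.+)$/
    if (m) {
        // m[0] is the first match as a list: [whole match, group 1, group 2]
        parsed[m[0][1]] = m[0][2]
    }
}
println parsed
```

Comment lines, empty lines and lines without an equals sign do not match and are skipped.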

Parsing CSV files (simplified) [2]

Data: each line in the file has the same structure; the line consists of blocks separated by some character sequence.

Task: transform each line into a list of objects.

Solution: construct a regex with capturing parentheses, parse each line with the regex in a loop, extracting the captured data.

Demonstrates: ~/.../ Pattern definition, Matcher.group method, \G regex meta-sequence.

def regex = ~/\G(?:^|,)(?:"([^"]*+)"|([^",]*+))/
new File('path/to/file.csv').eachLine { line ->
    def fields = []
    def matcher = regex.matcher(line)
    while (matcher.find()) {
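        // explicit null check: an empty quoted field ("") is falsy, so ?: would yield null
        fields << (matcher.group(1) != null ? matcher.group(1) : matcher.group(2))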
    }
    println fields
}
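The behaviour of the \G loop is easier to see on a couple of literal lines. The sketch below wraps the loop in a closure; `csvField` and `parseLine` are illustrative names, and the group is selected with an explicit null check so that an empty quoted field stays an empty string.

```groovy
def csvField = ~/\G(?:^|,)(?:"([^"]*+)"|([^",]*+))/

// Hypothetical helper: split one CSV line into its fields
def parseLine = { String line ->
    def fields = []
    def m = csvField.matcher(line)
    while (m.find()) {
        // group 1 is set for quoted fields, group 2 for unquoted ones
        fields << (m.group(1) != null ? m.group(1) : m.group(2))
    }
    fields
}

println parseLine('a,"b,c",d')
println parseLine('1,,"",4')
```

Because \G anchors each match at the end of the previous one, the loop stops at the first position where no field can start.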

Finding snapshot dependencies in the POM (simplified) [3]

Data: the file contains blocks with known boundaries (possibly spanning multiple lines).

Task: extract the blocks satisfying some criteria.

Solution: read the entire file into a string, construct a regex with capturing parentheses, apply the regex to the string in a loop.

Demonstrates: File.text property, list syntax of the Matcher object, naming captured groups through closure parameters, global (?x) regex modifier, local (?s:...) regex modifier.

def pom = new File('path/to/pom.xml').text
def matcher = pom =~ $/(?x)
    <dependency>                          \s*
      <groupId>([^<]+)</groupId>          \s*
      <artifactId>([^<]+)</artifactId>    \s*
      <version>(.+?-SNAPSHOT)</version>   (?s:.*?)
    </dependency>
/$
matcher.each { matched, groupId, artifactId, version ->
    println "$groupId:$artifactId:$version"
}
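For a quick check without a real pom.xml, the same regex can be applied to an inline string. This is a sketch with made-up coordinates (`com.example:core` is hypothetical); only the first dependency has a snapshot version.

```groovy
// Hypothetical POM fragment used instead of reading a file
def pom = '''\
<project>
  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>core</artifactId>
      <version>1.0-SNAPSHOT</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.7</version>
    </dependency>
  </dependencies>
</project>
'''

def matcher = pom =~ $/(?x)
    <dependency>                          \s*
      <groupId>([^<]+)</groupId>          \s*
      <artifactId>([^<]+)</artifactId>    \s*
      <version>(.+?-SNAPSHOT)</version>   (?s:.*?)
    </dependency>
/$

def found = []
matcher.each { matched, groupId, artifactId, version ->
    found << "$groupId:$artifactId:$version".toString()
}
println found
```

The local (?s:.*?) lets the match run across the extra scope line up to the closing tag, while the version group itself cannot cross a line break.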

Finding stacktraces in the log

Data: the file contains entries, each of which starts with the same pattern and can span multiple lines. A typical example is a log4j log file:

2009-10-16 15:32:12,157 DEBUG [com.ndpar.web.RequestProcessor] Loading user
2009-10-16 15:32:13,258 ERROR [com.ndpar.web.UserController] id to load is required for loading
java.lang.IllegalArgumentException: id to load is required for loading
     at org.hibernate.event.LoadEvent.<init>(LoadEvent.java:74)
     at org.hibernate.event.LoadEvent.<init>(LoadEvent.java:56)
     at org.hibernate.impl.SessionImpl.get(SessionImpl.java:839)
     at org.hibernate.impl.SessionImpl.get(SessionImpl.java:835)
     at org.springframework.orm.hibernate3.HibernateTemplate$1.doInHibernate(HibernateTemplate.java:531)
     at org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:419)
     at org.springframework.orm.hibernate3.HibernateTemplate.executeWithNativeSession(HibernateTemplate.java:374)
     at org.springframework.orm.hibernate3.HibernateTemplate.get(HibernateTemplate.java:525)
     at org.springframework.orm.hibernate3.HibernateTemplate.get(HibernateTemplate.java:519)
     at com.ndpar.dao.UserManager.getUser(UserManager.java:90)
     ... 62 more
2009-10-16 15:32:14,659 DEBUG [com.ndpar.jms.MessageListener] Received message:
     ... multi-line message ...
2009-10-16 15:32:15,169 INFO  [com.ndpar.dao.UserManager] User: ...

Task: find entries satisfying some criteria.

Solution: read the entire file into a string [4], construct a regex with capturing parentheses and a lookahead, split the string into entries, then loop through the result and apply the criteria to each entry.

Demonstrates: regex interpolation, the global regex modifiers (?s) and (?m), combined with (?x) as (?xms).

def log = new File('path/to/your.log').text
def logLineStart = /^\d{4}-\d{2}-\d{2}/
def splitter = log =~ $/(?xms)
    (    ${logLineStart}   .*?)
    (?=  ${logLineStart} | \Z)
/$
splitter.each { matched, entry ->
    if (entry =~ /(?m)^(?:\t| {8})at/) println entry
}
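The splitting step can be tried on a short in-memory log. This is a sketch with invented entries (the `app.*` class names are hypothetical), and it assumes stack frames are indented with a tab, as the filter expects.

```groovy
// Hypothetical log text; \t is a real tab inside the triple-quoted string
def log = '''\
2009-10-16 15:32:12,157 DEBUG [app.Web] Loading user
2009-10-16 15:32:13,258 ERROR [app.Web] id is required
java.lang.IllegalArgumentException: id is required
\tat app.Dao.load(Dao.java:74)
\tat app.Web.handle(Web.java:12)
2009-10-16 15:32:15,169 INFO  [app.Dao] Done
'''

def logLineStart = /^\d{4}-\d{2}-\d{2}/
def splitter = log =~ $/(?xms)
    (    ${logLineStart}   .*?)
    (?=  ${logLineStart} | \Z)
/$

def withTraces = []
splitter.each { matched, entry ->
    // keep only entries that contain at least one stack frame line
    if (entry =~ /(?m)^(?:\t| {8})at/) withTraces << entry
}
println withTraces.size()
```

Each entry runs from one timestamp to just before the next, so the exception and its frames stay attached to the ERROR line that produced them.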

Footnotes

  1. This example is for demonstration purposes only. In a real program you would just use the Properties.load method.
  2. The regex is simplified. If you want the real one, take a look at Jeffrey Friedl’s example.
  3. Again, in reality you would find snapshots using the mvn dependency:resolve | grep SNAPSHOT command.
  4. This approach won’t work for big files. Take a look at this script for a practical solution.

Groovy regular expressions

Because of the compact syntax, regular expressions in Groovy are more readable than in Java. Here is how Jeffrey Friedl’s example looks in Groovy:

def subDomain  = '(?i:[a-z0-9]|[a-z0-9][-a-z0-9]*[a-z0-9])' // simple regex in single quotes
def topDomains = $/
    (?x-i : com         \b     # you can put whitespaces and comments
          | edu         \b     # inside regex in eXtended mode
          | biz         \b
          | in(?:t|fo)  \b     # backslash is not escaped
          | mil         \b     # in dollar-slash strings
          | net         \b
          | org         \b
          | [a-z][a-z]  \b
    )/$

def hostname = /(?:${subDomain}\.)+${topDomains}/  // variable substitution in slashy string

def NOT_IN   = /;\"'<>()\[\]{}\s\x7F-\xFF/     // backslash is not escaped in slashy strings
def NOT_END  = /!.,?/
def ANYWHERE = /[^${NOT_IN}${NOT_END}]/
def EMBEDDED = /[$NOT_END]/                        // you can omit {} around var name

def urlPath  = "/$ANYWHERE*($EMBEDDED+$ANYWHERE+)*"

def url =
    """(?x:
             # you have to escape backslash in multi-line double quotes
             \\b

             # match the hostname part
             (
               (?: ftp | http s? ): // [-\\w]+(\\.\\w[-\\w]*)+
             |
               $hostname
             )

             # allow optional port
             (?: :\\d+ )?

             # rest of url is optional, and begins with /
             (?: $urlPath )?
       )"""

assert 'http://www.google.com/search?rls=en&q=regex&ie=UTF-8&oe=UTF-8' ==~ url
assert 'pages.github.io' ==~ url

As you can see, there are several notations, and for every subexpression you can choose the one that is most expressive.
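A smaller side-by-side comparison may help when choosing between the notations. The sketch below writes one pattern four ways and checks that they all produce the same string; the variable names are illustrative only.

```groovy
def single = '\\d{4}-\\d{2}'      // single quotes: every backslash is doubled
def dquote = "\\d{4}-\\d{2}"      // double quotes: same escaping rules
def slashy = /\d{4}-\d{2}/        // slashy string: backslash is not escaped
def dollar = $/\d{4}-\d{2}/$      // dollar-slashy string: same, and / needs no escape

// All four notations yield the same regex text
assert [dquote, slashy, dollar].every { it == single }

// Interpolation works in slashy strings as well
def year  = /\d{4}/
def month = /\d{2}/
def date  = /${year}-${month}/

assert '2009-10' ==~ date
assert !('2009/10' ==~ date)
```

The notations differ only in how the string literal is written; the regex engine receives the same pattern in every case.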
