Balaji Vajjala's Blog

A DevOps Blog from Trenches

Best Practices of Continuous Integration

What is Continuous Integration

Martin Fowler has the best description of CI:

Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily – leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.

Best Practices of Continuous Integration

1. Maintain a Single Source Repository

  • Use SCM tools like Git, Subversion, or Perforce for all software projects that need to be orchestrated together to build a product.
  • Put everything required for a build in the SCM system, including test scripts, properties files, database schema, install scripts, and third-party libraries.
  • Keep your use of branches to a minimum. Have a mainline: a single branch of the project under development.
  • Don’t add build artifacts/binaries to the SCM system. Doing so only indicates an inability to reliably recreate builds and the absence of a dependency management solution.

    Tools Used : Git, SVN, Perforce, hg

2. Automate the Build

  • Automate all phases of the build, including compilation, moving files around, and loading schema into the databases.
  • Only build what has changed. Compare the dates of the source and object files, and compile only if the source date is later.

    Tools Used : Ant, Maven
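
The "only build what has changed" rule can be sketched in a few lines of Python. This is a minimal illustration of the timestamp comparison, not how Ant or Maven implement it; the function names `needs_rebuild` and `stale_targets` are hypothetical.

```python
import os


def needs_rebuild(source_path, target_path):
    """Return True when the target is missing or older than its source."""
    if not os.path.exists(target_path):
        return True
    # Rebuild only if the source was modified after the target was produced.
    return os.path.getmtime(source_path) > os.path.getmtime(target_path)


def stale_targets(pairs):
    """Given (source, target) path pairs, return the targets that must be rebuilt."""
    return [target for source, target in pairs if needs_rebuild(source, target)]
```

A build script would call `stale_targets` with its source/object pairs and compile only the entries that come back.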

3. Make your Build Self-Testing

  • Use Test-Driven Development (TDD) approaches to catch bugs in the code base. The build itself needs to be self-testing: the failure of any test should cause the build to fail.

    Tools Used : XUnit family, FIT, Selenium, Sahi, Watir, FitNesse

4. Everyone Commits to the mainline Every Day

  • Integration is primarily about communication. Doing integration regularly lets developers tell others about the changes they have made and lets others react quickly to those changes. The only prerequisite for a developer committing to the mainline is that they can correctly build their code, which includes passing the build tests.
  • Every developer should commit to the repository every day. The more frequently they commit, the fewer places they have to look for conflict errors, and the more rapidly they can fix the conflicts.

5. Every Commit Should Build the Mainline on an Integration Machine

  • With daily commits, a team gets frequent tested builds, which means the mainline is always in a ready-to-release state.
  • Use a CI server. A CI server acts as a monitor of the repository. Every time a commit against the repository finishes, the CI server automatically checks out the sources onto the integration machine, initiates a build, and notifies the committer of the result. The committer isn’t done until she gets the notification, usually an email.

    Tools Used : Jenkins/Hudson, Bamboo, TeamCity
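
The monitor, check out, build, notify cycle described above can be sketched as a single function. This is a simplified model, not how Jenkins, Bamboo, or TeamCity work internally; `ci_cycle` and its `checkout`, `build`, and `notify` callables are hypothetical stand-ins for the real SCM, build tool, and mail integration.

```python
def ci_cycle(last_built, latest_commit, checkout, build, notify):
    """Run one monitoring pass and return the commit id that is now built.

    checkout(commit) -> workspace, build(workspace) -> bool, and
    notify(commit, status) are supplied by the caller.
    """
    if latest_commit == last_built:
        return last_built  # nothing new in the repository
    workspace = checkout(latest_commit)
    succeeded = build(workspace)
    # The committer is told the outcome either way.
    notify(latest_commit, "SUCCESS" if succeeded else "FAILURE")
    return latest_commit
```

A real server would call something like this in a loop or from a post-commit hook, once per new commit on the mainline.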

6. Keep the Build Fast

  • Rather than having a single monolithic build that covers every phase (commit build, unit tests, code coverage and analysis), split the build pipeline into two stages. The first stage is the commit build and is used as the main CI cycle. The second-stage build runs when it can, picking up the executable from the latest good commit build for further testing. If this secondary build fails, it may not have the same ‘stop everything’ quality, but the team still aims to fix such bugs as rapidly as possible while keeping the commit build running.
  • Keep the commit build time under 10 minutes. The commit build is the build that runs when someone commits to the mainline.
  • Parallelize the test stage by running the tests on multiple machines, each running a share of the tests (for example, two machines that run half the tests each).
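
One simple way to divide tests between machines is a round-robin split over a sorted list, sketched below. The function name `shard_tests` is hypothetical, and real CI tools often balance shards by recorded test duration instead; this only illustrates the idea.

```python
def shard_tests(test_names, machine_count):
    """Split test names into machine_count groups, round-robin.

    Sorting first makes the split deterministic, so every machine can
    compute the same assignment independently.
    """
    if machine_count < 1:
        raise ValueError("machine_count must be at least 1")
    ordered = sorted(test_names)
    return [ordered[i::machine_count] for i in range(machine_count)]
```

With two machines, each receives half of the tests; machine `i` runs the group at index `i` of the returned list.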

7. Test in a Clone of the Production Environment

  • Use virtualization to put together test environments that mimic the production environment. Virtualized machines can be saved with all the necessary elements baked into the image. It’s then relatively straightforward to install the latest build and run tests. Furthermore, this can allow you to run multiple tests on one machine, or simulate multiple machines in a network on a single machine.

8. Make it easy for Anyone to get the latest Executables

  • Store all build executables/installers in a centralized location that is easily accessible to anyone involved with the software project. All stakeholders should be able to get the latest executable easily and run it.

9. Everyone can see what’s happening

  • CI is all about communication. Create a dashboard or wiki page where everyone can easily see the state of the system and the changes that have been made to it.
  • The most important thing to communicate is the state of the mainline build. The dashboard/web site should show whether a build is in progress and what the last state of the mainline build was.

10. Automate Deployment

  • To do Continuous Integration effectively, one needs multiple environments: one to run commit tests, and one or more to run secondary tests. Since executables move between these environments multiple times a day, this step needs to be automated. It’s important to have scripts that let you deploy the application into any environment easily.
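
A deployment script of this kind can be very small. The sketch below assumes that "deploying" means copying one build artifact into a per-environment directory; the `deploy` function and the environment-to-directory mapping are hypothetical, and a real deployment would usually also restart services or run migrations.

```python
import shutil
from pathlib import Path


def deploy(artifact, environment, environments):
    """Copy a build artifact into the directory configured for an environment.

    environments maps an environment name (for example "commit-test")
    to its target directory. Returns the path of the deployed file.
    """
    if environment not in environments:
        raise ValueError(f"unknown environment: {environment}")
    target_dir = Path(environments[environment])
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / Path(artifact).name
    shutil.copy2(artifact, destination)
    return destination
```

Because the environment is just a parameter, the same script serves the commit-test environment and every secondary test environment.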

Summary

  • Look for frequent builds that are triggered by code commits and take no more than 15 minutes to run.
  • Keep builds simple and chain them together if more complexity is required. Simpler and quicker builds encourage more frequent use and are easier to debug when they break.
  • Prioritise fixing broken builds over starting new development work.
  • Identify failures sooner
  • Identify culprit change precisely
  • Avoids divide-and-conquer and tribal knowledge
  • Lowers compute costs using fine grained dependencies
  • Keeps the build green by reducing time to fix breaks
  • Accepted enthusiastically by product teams
  • Enables teams to ship with fast iteration times
  • Chain simple Jenkins jobs rather than trying to do everything in one big job.
  • Configure Jenkins as master + slaves.