If you are building multiple related software projects with a continuous integration server, one important aspect is being notified when changes in an upstream job break the build or the tests of a downstream job. This requires knowing exactly which build numbers of the upstream and the downstream job are involved.
The Jenkins continuous integration server uses the notion of file fingerprints for this purpose. The upstream job is built by Jenkins and produces one or several so-called artifacts, the results of the build process. The artifacts are archived by Jenkins, and fingerprints (hash sums) for each artifact are created and stored along with the build number of the job. When the downstream job starts to build, it downloads the (most recent) artifacts from the upstream job and uses them for its purposes, i.e. building and running its own source code. By comparing the fingerprints of the downloaded artifacts with the stored fingerprints, Jenkins knows which build of each upstream job was involved in a downstream build and can track which upstream build number broke the downstream job. Jenkins will only issue notifications if this fingerprinting mechanism is properly configured. Merely triggering one build after another is not sufficient to receive these notifications. Moreover, the Blame Upstream Committers plugin needs to be used and enabled for each downstream job, or the global property hudson.upstreamCulprits (will this ever be renamed?) needs to be set.
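To make the comparison more tangible, here is a minimal sketch of the idea in plain shell. It is not how Jenkins is implemented internally; it only mimics the principle with md5sum (as far as I know, Jenkins fingerprints are MD5 checksums). The file names and the build number 42 are made up for the illustration.

```shell
#!/bin/sh
# Illustration only: mimic the fingerprint comparison with md5sum.
set -eu

demo_dir=$(mktemp -d)                                 # stand-in for the Jenkins archive
printf 'build output of A\n' > "$demo_dir/A.tar.gz"   # hypothetical artifact (not a real archive)

# "Upstream" side: record the hash together with the build number.
upstream_build=42                                     # hypothetical build number
stored=$(md5sum "$demo_dir/A.tar.gz" | cut -d' ' -f1)

# "Downstream" side: hash the downloaded copy and compare with the stored hash.
cp "$demo_dir/A.tar.gz" "$demo_dir/downloaded.tar.gz"
seen=$(md5sum "$demo_dir/downloaded.tar.gz" | cut -d' ' -f1)

if [ "$seen" = "$stored" ]; then
    result="downstream build used A #$upstream_build"
else
    result="unknown upstream build"
fi
echo "$result"
```

The important point is that the match is made on file content, so it only works if the downstream job sees exactly the same file that the upstream job archived.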
The rationale behind this rather complex mechanism is that it enables a high degree of parallelism for building jobs. While the downstream job builds, the upstream job can already run again without affecting the downstream job. This would not be the case if e.g. a central installation location were shared between both jobs: if the upstream job installs new files while the downstream job is still building, this will almost certainly result in hard-to-debug errors. Moreover, the mechanism also allows running the downstream job on a different build slave (assuming similar systems), which would not be possible with a central installation location in a file system either.
For Java projects (where Jenkins comes from) the explained mechanism usually works well. The upstream job produces one or several jar files containing all resources of the project, such as images, and fingerprints them. No preprocessor configures the Java code according to the installation setup, and no source code is generated based on this setup. For C++ projects this is usually different, because the language already includes a preprocessor and it is common practice to set certain code lines according to the installation location, e.g. to find additional files like images, which cannot be packaged the way they can be in a jar file. Also, C++ projects usually consist of many more files than Java projects once all headers are counted. This provides more chances to mix something up.
So assuming an upstream C++ job A (using CMake; other build solutions are not covered in this post, but the techniques can be applied there, too) which is built in Jenkins, it will usually be configured with an installation location, e.g. inside the job’s workspace like /jenkins/workspace/A/install.
Often, CMake will use this location, e.g. to generate a config.h which tells that images are found at /jenkins/workspace/A/install/share/A/images etc. To use the CMake dependency mechanism, it will generate an AConfig.cmake file and install it also to the share folder (cf. the CMake documentation for find_package). The file might look like:
SET(A_LIBRARIES "/jenkins/workspace/A/install/lib/libA.so")
SET(A_INCLUDE_DIRS "/jenkins/workspace/A/install/include/A")
After building the project, the job will e.g. use a compression tool to pack all contents of /jenkins/workspace/A/install/ into a single archive, archive this artifact in Jenkins and generate the fingerprints for it.
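As a rough sketch, the packaging step could be a shell build step like the following. The archive name A.tar.gz is my assumption, and the mkdir line only creates a stand-in for the installed files so that the snippet also runs outside Jenkins.

```shell
#!/bin/sh
set -eu

# WORKSPACE is set by Jenkins; fall back to a temp dir for trying this out.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"

# Stand-in for the result of the real build and install step.
mkdir -p "$WORKSPACE/install/lib" "$WORKSPACE/install/share/A"

# Pack the *contents* of install/ so that the archive holds relative paths
# (./lib, ./share/A, ...) and can be unpacked anywhere.
tar -czf "$WORKSPACE/A.tar.gz" -C "$WORKSPACE/install" .

# Show what ended up in the archive.
tar -tzf "$WORKSPACE/A.tar.gz"
```

Using -C keeps the workspace path out of the archive member names, which matters because the downstream job unpacks the archive at a different location.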
Both issues mentioned above will prevent the dependency tracking of Jenkins from functioning properly, because the downstream job will download the artifacts to its own workspace, e.g. to /jenkins/workspace/B/upstream/A, and unpack them. Cf. the issues:
- The downloaded copy of upstream project A will not find its external files at all or will use wrong versions, because the compiled-in paths still point to A’s workspace, where in the meantime a new build of job A might have started and hence the contents are currently changing.
- The downstream job B will not build at all or might use a wrong version of A, because the contents of AConfig.cmake point to A’s workspace and not to the downloaded artifacts.
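A simple way to check whether an unpacked artifact suffers from these issues is to search it for the upstream workspace path. The following is only a sketch: the printf line fabricates a broken AConfig.cmake as a stand-in for a real unpacked artifact, and the paths are the example paths from this post.

```shell
#!/bin/sh
set -eu

WORKSPACE="${WORKSPACE:-$(mktemp -d)}"   # set by Jenkins; temp dir as fallback
UPSTREAM_WS="/jenkins/workspace/A"       # workspace of the upstream job (example path)

# Stand-in for the unpacked artifact, containing an absolute path.
mkdir -p "$WORKSPACE/upstream/A/share/A"
printf 'SET(A_LIBRARIES "%s/install/lib/libA.so")\n' "$UPSTREAM_WS" \
    > "$WORKSPACE/upstream/A/share/A/AConfig.cmake"

# List all files (text or binary) that still mention the upstream workspace.
leaks=$(grep -rlF "$UPSTREAM_WS" "$WORKSPACE/upstream/A" || true)

if [ -n "$leaks" ]; then
    echo "Files that still refer to $UPSTREAM_WS:"
    echo "$leaks"
fi
```

Since grep -l also reports matching binary files, this catches paths compiled into libraries as well as paths in CMake or header files.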
To enable reliable dependency tracking in Jenkins, the solutions are:
For locations compiled into the code (the config.h case): do not use this technique at all. The software is generally more flexible if no hard-coded locations are assumed, and more situations are covered without recompiling.
For the CMake config file: the idea here is to make all paths given in the config file (AConfig.cmake) relative to its current location on the disk. This will look like this:
GET_FILENAME_COMPONENT(CONFIG_DIR "${CMAKE_CURRENT_LIST_FILE}" PATH)
SET(A_LIBRARIES "${CONFIG_DIR}/../../lib/libA.so")
SET(A_INCLUDE_DIRS "${CONFIG_DIR}/../../include/A")
Now the CMake script of B will use the correct downloaded headers, libraries etc. for A from its own workspace.
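To see why two levels up is the right amount, the same path arithmetic can be replayed in shell. This is just an illustration, assuming the artifact was unpacked to upstream/A in the workspace and that the config file lives in share/A as described above; the directories and the empty AConfig.cmake are stand-ins.

```shell
#!/bin/sh
set -eu

WORKSPACE="${WORKSPACE:-$(mktemp -d)}"   # set by Jenkins; temp dir as fallback

# Stand-in for the unpacked artifact of A.
mkdir -p "$WORKSPACE/upstream/A/share/A" \
         "$WORKSPACE/upstream/A/lib" \
         "$WORKSPACE/upstream/A/include/A"
config_file="$WORKSPACE/upstream/A/share/A/AConfig.cmake"
: > "$config_file"                       # empty placeholder file

# Same idea as GET_FILENAME_COMPONENT(... PATH): directory of the config file.
config_dir=$(dirname "$config_file")

# share/A is two levels below the installation prefix.
prefix=$(cd "$config_dir/../.." && pwd)

echo "A_LIBRARIES    -> $prefix/lib/libA.so"
echo "A_INCLUDE_DIRS -> $prefix/include/A"
```

Wherever the archive is unpacked, the prefix follows the config file, so nothing points back to A's workspace anymore.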
The two aspects make it possible to use fingerprinting in Jenkins for dependency tracking with notifications for upstream committers. Especially the first aspect requires care while designing the project, but there is no other solution I can think of.
Please note that for executing any tests in downstream job B you have to set LD_LIBRARY_PATH so that the right upstream libraries are found as well.
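In a shell build step this could look roughly as follows, assuming the artifact of A was unpacked to upstream/A in the workspace and contains a lib folder:

```shell
#!/bin/sh
set -eu

WORKSPACE="${WORKSPACE:-$PWD}"           # set by Jenkins; current dir as fallback
upstream_lib="$WORKSPACE/upstream/A/lib" # assumed location of the unpacked libraries

# Put the downloaded libraries first and keep an existing value, if any.
export LD_LIBRARY_PATH="$upstream_lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
# ... run the tests of B here ...
```

The ${VAR:+...} expansion avoids a trailing colon when LD_LIBRARY_PATH was empty before, which the dynamic linker would otherwise interpret as the current directory.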
Random Comments
Some more care needs to be taken to not mix up the dependency tracking again:
The downstream job needs to make sure that the latest downloaded artifact is really used to build its own source code. So it is a good idea to simply remove the upstream directory as the first step of the build.
The downloaded artifacts (as explained above, these are the generated archive files) need to be kept after extracting them, because the downstream job also has to generate fingerprints for them (and not for the extracted files) to create a match with the fingerprints stored for the upstream job.
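Put together, the first steps of the downstream job might look like this sketch. The archive name and its location are my assumptions, and the block in the middle only fakes the download so that the snippet runs outside Jenkins; in a real job the artifact comes from Jenkins between the clean-up and the extraction.

```shell
#!/bin/sh
set -eu

WORKSPACE="${WORKSPACE:-$(mktemp -d)}"     # set by Jenkins; temp dir as fallback
archive="$WORKSPACE/upstream/A.tar.gz"     # hypothetical name of the downloaded artifact

# 1. Start from scratch so that no stale upstream files survive.
rm -rf "$WORKSPACE/upstream"
mkdir -p "$WORKSPACE/upstream/A"

# 2. Stand-in for the download of the artifact from the upstream job.
stage=$(mktemp -d)
mkdir -p "$stage/share/A"
tar -czf "$archive" -C "$stage" .

# 3. Extract, but do NOT delete the archive: it is the file that has to be
#    fingerprinted in the downstream job.
tar -xzf "$archive" -C "$WORKSPACE/upstream/A"

ls "$WORKSPACE/upstream"
```

The fingerprint configuration of the downstream job then has to point at the archive file itself, not at the extracted folder.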
In order to enable the downstream CMake project to find the upstream project, use the <PackageName>_DIR variable for the CMake call as defined in the CMake documentation, e.g. -DA_DIR="${WORKSPACE}/upstream/A/share/A"
If your upstream project contains a version or revision number in the extracted folder (e.g. ${WORKSPACE}/upstream/A-0.35/) and you want your downstream job to be resilient against version changes in the upstream project, you can use some find-magic on UNIX for automatically finding the folder:
A=`find "${WORKSPACE}/upstream" -maxdepth 1 -type d -name "A-*"`
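A slightly more defensive variant checks that exactly one matching folder exists and derives the value for A_DIR from it. This is a sketch under the assumptions of this post (config file in share/A); the mkdir line only creates a stand-in folder, and the final line merely prints the CMake call instead of running it.

```shell
#!/bin/sh
set -eu

WORKSPACE="${WORKSPACE:-$(mktemp -d)}"          # set by Jenkins; temp dir as fallback
mkdir -p "$WORKSPACE/upstream/A-0.35/share/A"   # stand-in for the extracted artifact

# Find the versioned folder of A directly below upstream/.
A=$(find "$WORKSPACE/upstream" -maxdepth 1 -type d -name "A-*")

# Fail early if there is no match or more than one (e.g. stale old versions).
count=$(printf '%s\n' "$A" | grep -c . || true)
if [ "$count" -ne 1 ]; then
    echo "expected exactly one A-* folder, found $count" >&2
    exit 1
fi

A_DIR="$A/share/A"
echo "cmake -DA_DIR=\"$A_DIR\" <source dir of B>"
```

If the upstream directory is removed at the start of every build, the check should never fire, but it turns a confusing CMake error into a clear message when it does.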
If you are using pkg-config instead of or in addition to the CMake config file mechanism, you can use the --define-variable command line argument to achieve similar flexibility, assuming that all your absolute paths depend on a single prefix variable in the pc file.
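For illustration, here is a small sketch with a made-up A.pc. The contents of the pc file, its location in lib/pkgconfig and the version number are assumptions; the only thing that matters is that every path is derived from prefix.

```shell
#!/bin/sh
set -eu

WORKSPACE="${WORKSPACE:-$(mktemp -d)}"   # set by Jenkins; temp dir as fallback
pc_dir="$WORKSPACE/upstream/A/lib/pkgconfig"
mkdir -p "$pc_dir"

# Hypothetical A.pc as the upstream job might have installed it:
# prefix still points to A's workspace, everything else derives from it.
cat > "$pc_dir/A.pc" <<'EOF'
prefix=/jenkins/workspace/A/install
libdir=${prefix}/lib
includedir=${prefix}/include

Name: A
Description: Hypothetical upstream library
Version: 0.35
Libs: -L${libdir} -lA
Cflags: -I${includedir}/A
EOF

if command -v pkg-config >/dev/null 2>&1; then
    # Override prefix so that all paths point into B's workspace.
    flags=$(PKG_CONFIG_PATH="$pc_dir" pkg-config \
        --define-variable=prefix="$WORKSPACE/upstream/A" --cflags --libs A)
    echo "$flags"
else
    echo "pkg-config is not installed, skipping the query" >&2
fi
```

The override only helps for paths that are really written in terms of ${prefix}; any absolute path written out literally in the pc file stays as it is.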