
Peter Pilgrim :: Java Champion :: Digital Developer Architect

I design Java EE and Scala software solutions for blue-chip clients in the private sector.

Hey all! Thanks for visiting. I provide fringe benefits to interested readers: check out consultancy, training or mentorship. Please make enquiries by email or call +44 (0)7397 067 658.

Due to the UK government's Off-Payroll Working rules, I am unfortunately no longer accepting standard GOV.UK contract engagements in the public sector. Please enquire for further information.

Hot Shot 001 – Why Did My Docker Container Jenkins Job Die?

14 March 2018

These are my verbatim notes for the PEAT UK podcast:

Hello World

Welcome to the first of the Pilgrim Engineering Architecture Technology (PEAT UK) Hot Shots, which are uniquely positioned audio segments of development information. I am your host, Peter Pilgrim, a platform engineer specialising in Amazon Web Services and Pivotal Cloud Foundry, and an Enterprise Java specialist. I currently contract at Santander United Kingdom.

My first recommendation is to always look at the API of open source projects, because you never know what you will find inside them that is useful. Recently, we had an issue with dynamic Docker containers and memory utilisation. Suddenly our Spring Boot integration tests were dying with an exit code of 137.

We began the investigation, and our first port of call was the Docker container itself. We increased its size from 4GB to 6GB, but the memory problem persisted. We could not explain why certain Jenkins jobs failed whilst others ran successfully. We noticed that the failing jobs relied on a certain in-memory data cache (which shall remain nameless). We looked in there, configured a few things and re-ran the integration tests. We still got the issue.

I then went looking deeper inside the dynamic Docker container; we decided to SSH into it. How do you SSH into a dynamically created Docker instance, one that terminates when the Jenkins job finishes? We forced the job to sleep for 1000 seconds, which meant we could log into the container with SSH before it died and check the memory. By executing the Linux command free, I noticed that the memory was not being utilised by Java. However, if we manually executed the Gradle build from the command line inside the container, it ran out of memory. At the command line, we executed: gradle build test --info --stacktrace. So we narrowed the issue down to Java, and there is only one thing that causes Java issues like this: the default configuration of the heap space. It turned out there were two issues.
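The container-inspection steps above can be sketched roughly as follows (the image and container names here are assumptions for illustration, not the actual Jenkins setup):

```shell
# Sketch of the investigation (image/container names are hypothetical):
#   docker run -d --name probe --memory=4g build-image sleep 1000   # keep the container alive
#   docker exec -it probe free -m                                   # check memory from inside
#   docker exec -it probe gradle build test --info --stacktrace     # re-run the build by hand
#
# Exit code 137 decodes as 128 + 9, i.e. the process was killed with
# SIGKILL, typically by the kernel OOM killer when memory runs out:
echo $((128 + 9))   # prints 137
```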

The in-memory data cache product would attempt to consume half of the virtual memory allocated to the container. We found this in the product's documentation. In other words, if our Docker container was configured with 4GB of physical RAM, the product would grab 2GB. With 6GB it would immediately grab 3GB, and with 8GB it would grab 4GB. Worse, the grab (reservation of RAM) would keep growing. [ad lib]

The second issue was that Gradle and the integration tests ran with default JVM settings. It was this default heap space allocation on Java 8 that turned out to be the root cause.

Did you know that Java 8 takes the larger of 1/64th of physical memory for the minimum heap size (-Xms) and the smaller of 1/4th of physical memory for the maximum heap size (-Xmx)? No wonder our Jenkins Docker job ran out of space and died.
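You can see for yourself what defaults the JVM actually picked inside a container with a few lines of Java (a quick diagnostic sketch, not part of the original build):

```java
public class HeapDefaults {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx: the most heap the JVM will ever allocate.
        System.out.printf("Max heap (-Xmx):   %d MB%n", rt.maxMemory() / (1024 * 1024));
        // totalMemory() is the currently committed heap; it starts near -Xms
        // and grows towards -Xmx under load.
        System.out.printf("Current heap size: %d MB%n", rt.totalMemory() / (1024 * 1024));
    }
}
```

Run it once with no flags and once with, say, -Xmx768m, and compare the figures against the container's memory limit.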

The first part of the solution was to allocate a fixed size of only 256MB to our in-memory distributed cache product, which dramatically reduced the data requirements [the pressure on the container]. We asked ourselves: what integration tests are we running that require so much data? If we ever write them, we are building terrible integration tests.

The second part of the solution was to modify the master [parent] build.gradle and add configuration to the Gradle Test task. We wrote test configuration in the Gradle DSL that set the JVM minimum heap space to 384MB and the maximum heap space to 768MB. We found the official documentation on the Gradle Test task very helpful, because we also configured the following there:

testLogging.showStandardStreams = true
testLogging.exceptionFormat = 'full'
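Putting both pieces together, the whole Test task configuration might look something like this in the parent build.gradle (a sketch in the Groovy DSL; the heap values are the ones mentioned above):

```groovy
test {
    // Pin the test JVM heap explicitly instead of relying on the
    // Java 8 defaults, which are derived from the container's
    // physical memory.
    minHeapSize = '384m'
    maxHeapSize = '768m'

    // Surface the full test output in the Jenkins console log.
    testLogging.showStandardStreams = true
    testLogging.exceptionFormat = 'full'
}
```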

Afterwards, our remaining integration tests turned green.


+PP+
March 2018

By the way, all your Shares, Likes and Comments are always more than welcome!


Contents of this blog entry are under copyright © 2017 by Peter Pilgrim and associates. For enquiries about republishing, please contact us for permission. All requests for syndicated content will be ignored (sent to /dev/null); consider yourself warned!

I help to design, create and build JVM components and services that are behind popular e-commerce websites.
