These are my verbatim notes for the PEAT UK podcast:
Welcome to the first of many Pilgrim Engineering Architecture Technology (PEAT UK) Hotshots, which are uniquely positioned audio segments of development information. I am your host, Peter Pilgrim, a platform engineer specialising in Amazon Web Services and Pivotal Cloud Foundry, and an Enterprise Java specialist. I currently contract at Santander United Kingdom.
My first recommendation is to always read the API documentation for open source projects, because you never know what you will find inside them that is useful. Recently, we had an issue with dynamic Docker containers and memory utilisation. Suddenly our Spring Boot integration tests were dying with an exit code of 137.

We began the investigation, and our first port of call was the Docker container itself. We increased its size from 4GB to 6GB, but the memory problem persisted. We could not explain why certain Jenkins jobs failed whilst others ran successfully. We noticed that the failing Jenkins jobs relied on a certain in-memory data cache (which will remain nameless). We looked in there, configured a few things and re-ran the integration tests. We still had the issue.

I then went looking deeper inside the dynamic Docker container; we decided to SSH into it. How do you SSH into a dynamically created Docker instance, one that terminates when the Jenkins job finishes? We forced the job to sleep for 1000 seconds, which meant that we could log into the container with SSH before it died and then check the memory. I noticed that the memory was not being utilised by Java, because I executed the Linux command free. However, if we manually executed the Gradle build from the command line inside the container, then it ran out of memory. At the command line, we executed:
gradle build test --info --stacktrace. So we narrowed the issue down to Java, and there is only one thing that causes Java issues like this: the default configuration of the heap space. It turned out there were two issues.
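Before getting to those two issues, the container keep-alive trick from the investigation is worth sketching, since exit code 137 is 128 + 9, i.e. the process was killed with SIGKILL, typically by the kernel's out-of-memory killer, which is exactly why the container had to be caught alive. This is a rough sketch, not the exact commands from the episode; the container name is hypothetical, and it uses docker exec where the episode used SSH:

```shell
# In the Jenkins job, hold the container open after the failing step
# so there is time to get inside it (duration from the episode):
sleep 1000

# Meanwhile, from the Docker host, open a shell in the live container
# (container name is hypothetical; we used SSH instead):
docker exec -it jenkins-agent /bin/bash

# Inside the container, check what is actually using the memory:
free -m
```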
The first issue: the in-memory data cache product would attempt to consume half of the virtual memory allocated to the container. We found this in the product's documentation. In other words, if our Docker container was configured with 4GB of physical RAM, the product would grab 2GB. With 6GB of physical RAM it would immediately grab 3GB, and with 8GB it would grab 4GB. Worse, the grab (reservation of RAM) would keep growing. [ad lib]
The second issue was that Gradle and the integration tests ran with default JVM settings. It was this default heap space allocation on Java 8 that turned out to be the root cause.
Did you know that, by default, Java 8 takes roughly 1/64th of physical memory for the minimum heap size (-Xms) and roughly 1/4 of physical memory for the maximum heap size (-Xmx)? No wonder our Jenkins Docker job ran out of space and died.
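The arithmetic for our container sizes can be sketched quickly (assuming bash and the 1/64th and 1/4 default fractions above):

```shell
# Java 8 default heap ergonomics applied to our container sizes:
# -Xms defaults to roughly 1/64th of physical memory,
# -Xmx defaults to roughly 1/4 of physical memory.
for phys_mb in 4096 6144 8192; do
  xms_mb=$((phys_mb / 64))
  xmx_mb=$((phys_mb / 4))
  echo "${phys_mb} MB RAM -> default -Xms ${xms_mb} MB, -Xmx ${xmx_mb} MB"
done
```

So on a 4GB container the test JVM would happily grow towards a 1GB heap, while the nameless cache had already reserved 2GB of the same container.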
The first part of the solution was to allocate a fixed size of only 256MB to our in-memory distributed cache product, which dramatically reduced the memory pressure on the container. We asked ourselves: what integration tests are we running that require so much data? If we ever write them, we are building terrible integration tests.
The second part of the solution was to modify the master [parent] build.gradle and add configuration to the Gradle Test task. So we wrote test configuration in the Gradle DSL that set the JVM minimum heap space to 384MB and the maximum heap space to 768MB. We found the official documentation on the Gradle Test task very helpful, because we also configured the following here:
testLogging.showStandardStreams = true
testLogging.exceptionFormat = 'full'
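Putting both parts of the Test task configuration together, the parent build.gradle might look something like this (a sketch in the Groovy DSL; minHeapSize and maxHeapSize map to -Xms and -Xmx for the forked test JVM):

```groovy
test {
    // Pin the test JVM heap instead of relying on Java 8 defaults
    minHeapSize = '384m'   // -Xms
    maxHeapSize = '768m'   // -Xmx

    // Surface test output and full stack traces in the Jenkins log
    testLogging.showStandardStreams = true
    testLogging.exceptionFormat = 'full'
}
```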
Afterwards, our remaining integration tests turned green.
By the way, your Shares, Likes and Comments are always more than welcome!