author | Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org> | 2019-09-10 11:41:47 +0000 |
---|---|---|
committer | Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org> | 2019-09-10 13:06:07 +0000 |
commit | 98b2b577ac8df6a26cffbc4d17c95afd91aae849 (patch) | |
tree | 43a8f9798f331a093b92ac5555391b6ad31d2c83 /start-container-docker.sh | |
parent | 8fa33bd7560ef2614cdd85e6251e7a27fc291c8e (diff) |
start-container-docker.sh: Kill container of the previous [aborted] build
In tcwg_bmk jobs we see a strange failure: processes of an aborted
(timed-out) build clobber files of the next build.
Specifically, a tcwg_bmk build can time out waiting for the tcwg-benchmark
job to finish. In this case Jenkins kills the top-level processes, but
it takes console I/O (I believe) for SIGHUP to propagate down
the process tree. The tcwg-benchmark step of the aborted build manages
to write to the artifacts/results_id file, which can then be used by
the next build. Wrong data can even be stored in base-artifacts.git,
thus clobbering subsequent builds.
The solution in this patch puts the docker container name in
$WORKSPACE/.lock, and new builds remove any container recorded
there before starting.
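
The scheme described above can be sketched locally as follows. This is a minimal illustration rather than the patch itself: the function name `record_container` and its `remove_cmd` parameter are hypothetical, the SSH hop to the session host is omitted, and it assumes the util-linux `flock` utility is installed.

```shell
#!/bin/sh
# Sketch only: record_container and remove_cmd are hypothetical names,
# not part of start-container-docker.sh.

# Remove the container recorded by a previous build (if any), then
# record the current one.
#   $1: workspace directory
#   $2: name of this build's container
#   $3: command used to remove a container (e.g. "docker rm -vf")
record_container() {
    workspace=$1
    name=$2
    remove_cmd=$3
    lock="$workspace/.lock"

    prev=""
    if [ -f "$lock" ]; then
        # Read under the lock so a concurrent writer is never seen half-done.
        prev=$(flock "$lock" cat "$lock" || true)
    fi

    if [ -n "$prev" ]; then
        echo "NOTE: Removing previous container for $workspace"
        $remove_cmd "$prev" || echo "WARNING: Could not remove $prev"
    fi

    # The redirection has to run inside the locked command, hence the
    # inner "sh -c" instead of redirecting the output of flock itself.
    flock "$lock" sh -c 'printf "%s\n" "$1" > "$2"' sh "$name" "$lock"
}
```

Calling `record_container "$WORKSPACE" "$session_name" "docker rm -vf"` at container start-up would mirror the behaviour described in the message; passing the removal command as an argument is only there so the sketch can be exercised without Docker.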
We didn't observe this problem in other builds because we tended
to wipe the workspace clean before every build. We decided to keep
workspace contents in round-robin jobs to reduce git clone overhead.
Change-Id: Ic529c20a16c44ad357c11a30ebda2d30289887cc
Diffstat (limited to 'start-container-docker.sh')
-rwxr-xr-x | start-container-docker.sh | 17 |
1 files changed, 16 insertions, 1 deletions
diff --git a/start-container-docker.sh b/start-container-docker.sh
index c2a084cc..5b716aff 100755
--- a/start-container-docker.sh
+++ b/start-container-docker.sh
@@ -325,6 +325,21 @@ if [ $ret -eq 1 ]; then
     exit 1
 fi
 
+# For CI builds make sure to kill previous build, which might have been
+# aborted by jenkins, but processes could have survived. Otherwise old
+# build can start writing to files of the current build.
+cleanup_lock=false
+if [ x"$WORKSPACE" != x"" ]; then
+    prev_container=$($SSH $session_host flock "$WORKSPACE/.lock" cat "$WORKSPACE/.lock" || true)
+    if [ x"$prev_container" != x"" ]; then
+        echo "NOTE: Removing previous container for $WORKSPACE"
+        $DOCKER rm -vf "$prev_container" || echo "WARNING: Could not remove $prev_container"
+    fi
+
+    $SSH $session_host bash -c "flock $WORKSPACE/.lock echo $session_name > $WORKSPACE/.lock"
+    cleanup_lock=true
+fi
+
 # Do not remove the container upon exit: it is now ready
 trap EXIT
 
@@ -339,7 +354,7 @@ exec 1>&3 2>&4
 cat <<EOF
 # v1 interface
 CONTAINER="${dryruncmd} $SSH -p ${session_port} ${user}${session_host}"
-CONTAINER_CLEANUP="${CONTAINER_CLEANUP}"
+CONTAINER_CLEANUP="${CONTAINER_CLEANUP}; if $cleanup_lock; then $SSH $session_host flock $WORKSPACE/.lock rm $WORKSPACE/.lock; fi"
 session_host=${session_host}
 session_port=${session_port}