Cloudron Forum

Gitlab runners fail every 30th build

rlp10

wrote

#1

We have GitLab installed as a Cloudron app on our VPS. Until recently, our gitlab-runner.service was working fine. On Tuesday 3rd June we started getting unexpected failed jobs from the gitlab-runner. On further testing, it appears that the first 29 jobs pass as expected, and then the 30th job and every subsequent job fail until the service is stopped and restarted. The error seems to happen when the pipeline job is cloning a repository. This is an example of a log from one of the failed jobs:

Running with gitlab-runner 17.11.0 (v17.11.0)
on default_mydevice_somenumber A-number, system ID: id_number
Preparing the "shell" executor 00:00
Using Shell (bash) executor...
Preparing environment
Running on mydevice...
Getting source from Git repository
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /var/lib/private/gitlab-runner/builds/A-number/0/OrgName/mainRepo/.git/
Checking out 130d8ca7 as detached HEAD (ref is refs/merge-requests/294/head)...
Skipping Git submodules setup
Executing "step_script" stage of the job script 00:03
$ export repo2=$(mktemp -d)
$ GIT_TRACE=1 GIT_FLUSH=1 git -c core.progress=false clone https://gitlab-ci-token:$CI_JOB_TOKEN@my-self-hosted-git.com/OrgName/repo2 $repo2
09:27:40.005972 git.c:476 trace: built-in: git clone https://gitlab-ci-token:[MASKED]@my-self-hosted-git.com/OrgName/repo2 /tmp/tmp.NDTArMtQAp
Cloning into '/tmp/tmp.NDTArMtQAp'...
09:27:40.009784 run-command.c:667 trace: run_command: git remote-https origin https://gitlab-ci-token:[MASKED]@my-self-hosted-git.com/OrgName/repo2
09:27:40.009805 run-command.c:759 trace: start_command: /nix/store/805a5wv1cyah5awij184yfad1ksmbh9f-git-2.49.0/libexec/git-core/git remote-https origin https://gitlab-ci-token:[MASKED]@my-self-hosted-git.com/OrgName/repo2
09:27:40.011889 git.c:772 trace: exec: git-remote-https origin https://gitlab-ci-token:[MASKED]@my-self-hosted-git.com/OrgName/repo2
09:27:40.011944 run-command.c:667 trace: run_command: git-remote-https origin https://gitlab-ci-token:[MASKED]@my-self-hosted-git.com/OrgName/repo2
09:27:40.011965 run-command.c:759 trace: start_command: /nix/store/805a5wv1cyah5awij184yfad1ksmbh9f-git-2.49.0/libexec/git-core/git-remote-https origin https://gitlab-ci-token:[MASKED]@my-self-hosted-git.com/OrgName/repo2
warning: redirecting to https://my-self-hosted-git.com/OrgName/repo2.git/
09:27:40.411977 run-command.c:667 trace: run_command: git index-pack --stdin --fix-thin '--keep=fetch-pack 1061622 on mydevice' --check-self-contained-and-connected
09:27:40.412021 run-command.c:759 trace: start_command: /nix/store/805a5wv1cyah5awij184yfad1ksmbh9f-git-2.49.0/libexec/git-core/git index-pack --stdin --fix-thin '--keep=fetch-pack 1061622 on mydevice' --check-self-contained-and-connected
09:27:40.415115 git.c:476 trace: built-in: git index-pack --stdin --fix-thin '--keep=fetch-pack 1061622 on mydevice' --check-self-contained-and-connected
fatal: write error: No space left on device
fatal: fetch-pack: invalid index-pack output
Running after_script
Running after script...
$ rm -rf $repo2
Cleaning up project directory and file based variables
ERROR: Job failed: exit status 1

The device on which the gitlab-runner is running has plenty of space, as does the Linode server. We tried increasing the RAM on the Linode server, but still found the 30th job onwards failed, after the first 29 passed. This behaviour suggests something is building up somewhere and then preventing further jobs from succeeding, but we are not sure what. We are looking for any help or pointers you might be able to give us to solve this issue.
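"No space left on device" is reported for the filesystem actually being written to (here the clone target under /tmp), which is not necessarily the partition that shows plenty of free space. The error is also raised when a filesystem runs out of inodes rather than blocks. A minimal sketch of a check that could be run from inside a job, assuming GNU coreutils; `report_fs_usage` is a hypothetical helper name, not part of gitlab-runner:

```shell
#!/usr/bin/env bash
# Hypothetical helper: show which filesystem backs a directory and how full it is.
report_fs_usage() {
  local dir="${1:-/tmp}"
  # Mount point of the filesystem containing "$dir" (GNU stat format %m).
  echo "mount point: $(stat -c %m "$dir" 2>/dev/null || echo unknown)"
  # Free space in blocks on that filesystem.
  df -h "$dir" || true
  # Free inodes on that filesystem; exhausting these also gives "No space left on device".
  df -i "$dir" || true
  # Space used by the directory's own contents (unreadable subdirectories are skipped).
  du -sh "$dir" 2>/dev/null || true
}

report_fs_usage /tmp
```

Running this at the start of a job shows the /tmp that the job itself sees, which can differ from the host's /tmp if the service has a private one.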

I wonder whether the failure was caused by an update to the GitLab app. Perhaps someone else has reported similar problems.

Thanks in advance.

nebulon
Staff
wrote

#2
The logs mention that the system ran out of disk space:
```
...
fatal: write error: No space left on device
....
```
I'm not sure how the server hosting the runner is set up (it should not be on Cloudron), but since it is the runner that is failing, this is a bit out of scope for Cloudron itself.

rlp10

wrote

#3

Thank you, Nebulon, for your reply.

The runners are on a different machine.

I don't think the machine with the runners is running out of space, because it has a very large amount available on all partitions (the root has 300GB available). I think the error is originating from the Cloudron machine (although I may be wrong).

To try to confirm the location of the error, I will set up another GitLab runner on a different machine and see whether it has the same problem. Hopefully that will isolate the error to either the Cloudron server or the GitLab runner machines.


davejgreen

wrote

We solved this in the end by adding some logging to the before_script and some cleanup to the after_script in our .gitlab-ci.yml file:

  before_script:
    # Log which filesystem backs /tmp and what it currently holds
    - stat -c %m /tmp
    - du -sh /tmp
    - ls -la /tmp
    - export XDG_CACHE_HOME=$(mktemp -d)

  # Each job leaks files into the gitlab-runner's private /tmp
  after_script:
    - rm -rf "$XDG_CACHE_HOME"
    - du -sh /tmp
    - ls -la /tmp
    # Remove every top-level entry in /tmp owned by the gitlab-runner user
    - find /tmp -mindepth 1 -maxdepth 1 -user gitlab-runner -exec rm -rf {} +
    - du -sh /tmp
    - ls -la /tmp

Once we added the logging lines (du and ls), we could see the files that earlier jobs had left behind.
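One possible reason temporary directories are left behind: GitLab's documentation describes after_script as running in a new shell, separate from before_script and script, so a variable exported earlier in the job is not set there. In the failing job log above, `export repo2=$(mktemp -d)` runs in the script and `rm -rf $repo2` runs in after_script, which under that assumption would remove nothing. A minimal sketch of cleaning up in the same shell that created the directory; `with_scratch_dir` is a hypothetical helper name, and the command to run is passed as arguments:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command with a fresh temporary directory appended
# as its last argument, and delete that directory when the command finishes.
with_scratch_dir() {
  (
    scratch=$(mktemp -d) || exit 1
    # The EXIT trap fires when this subshell ends, on success or failure,
    # so the directory is removed by the same shell that created it.
    trap 'rm -rf "$scratch"' EXIT
    "$@" "$scratch"
  )
}

# Example use in a job script (REPO_URL is a placeholder, not from the thread):
#   with_scratch_dir git clone --depth 1 "$REPO_URL"
```

The subshell keeps the trap from replacing any EXIT trap the calling script already has, and the helper returns the exit status of the wrapped command. The find-based sweep in after_script remains useful as a backstop for files created by other tools.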
