Fixing the "Chef Infra Client failed." Error in GitLab

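While upgrading GitLab, I repeatedly ran into the "Chef Infra Client failed." error, after which the web pages returned HTTP 500. My GitLab instance restarts automatically and came back up normally after each restart, so I never paid much attention to it. A quick search on the error, however, suggests that it is most likely caused by the database schema being out of sync, that is, by migrations that were not applied during the upgrade. The instance probably recovers after a restart because the database migrations then complete on their own. Even so, I collected the fixes I found online in case they are needed later.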

First, log in to the GitLab server and check whether all database migrations have been applied:

[root@GitLab ~]# gitlab-rake db:migrate:status
database: gitlabhq_production
 Status   Migration ID    Migration Name
--------------------------------------------------
   up     20210309181019  Add last used at to cluster agent token
   up     20210310000627  Add idx vulnerability occurrences dedup
   up     20210310111009  Add settings to group merge request approval settings
  down    20210311022012  Add text limits to dast site profiles
  down    20210311045138  Set traversal ids for gitlab org group staging
  down    20210311045139  Set traversal ids for gitlab org group com
  down    20210311093723  Add partial index on ci pipelines by cancelable status and users
  down    20210311120152  Add metrics to batched background migration jobs
  down    20210311120153  Initialize conversion of events id to bigint
  down    20210311120154  Initialize conversion of push event payloads event id to bigint
  down    20210311120155  Backfill events id for bigint conversion
  down    20210311120156  Backfill push event payload event id for bigint conversion
...
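
The status listing can be long, so it may help to filter out the pending rows rather than scan for them by eye. The following is a minimal sketch, assuming the output format shown above (status in the first column, migration ID in the second); `list_pending` is a hypothetical helper name, not a GitLab command:

```shell
# list_pending: hypothetical helper, not part of GitLab.
# Reads `gitlab-rake db:migrate:status` output on stdin and prints
# the migration ID of every row whose status column is "down".
list_pending() {
  awk '$1 == "down" { print $2 }'
}

# Usage on the GitLab server:
#   gitlab-rake db:migrate:status | list_pending
```

An empty result means no migration is reported as down.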

If any migrations show a status of down, as above, they have not been applied and need to be run manually:

gitlab-rake db:migrate

Once the migrations finish, restart GitLab or run reconfigure again.
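
The check and the fix can be combined so that the migration only runs when something is actually pending. This is a rough sketch rather than a recommended procedure: `fix_pending_migrations` is a hypothetical name, it assumes the status format shown earlier, and it picks `gitlab-ctl reconfigure` as the follow-up step (`gitlab-ctl restart` would be the other option mentioned above):

```shell
# fix_pending_migrations: hypothetical helper, not part of GitLab.
# Counts rows reported as "down"; if there are any, runs the migrations
# and then reconfigures GitLab. Assumes status is in the first column.
fix_pending_migrations() {
  pending_count=$(gitlab-rake db:migrate:status |
    awk '$1 == "down" { n++ } END { print n + 0 }')

  if [ "$pending_count" -eq 0 ]; then
    echo "No pending migrations."
    return 0
  fi

  echo "$pending_count pending migration(s); running db:migrate."
  # Only reconfigure if the migration itself succeeded.
  gitlab-rake db:migrate && gitlab-ctl reconfigure
}

# Usage on the GitLab server (as root):
#   fix_pending_migrations
```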

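To be safe, and to make future upgrades easier, it is also worth manually upgrading the bundled PostgreSQL database (after taking a backup first, of course):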

gitlab-ctl pg-upgrade
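
Since the backup is a precondition, one way to enforce it is to refuse to upgrade unless the backup succeeds. The sketch below assumes an Omnibus installation with the `gitlab-backup` command (GitLab 12.2 and later; older versions use `gitlab-rake gitlab:backup:create`). `backup_then_pg_upgrade` is a hypothetical helper name:

```shell
# backup_then_pg_upgrade: hypothetical helper, not part of GitLab.
# Creates an application backup first and runs pg-upgrade only if
# the backup command exits successfully.
backup_then_pg_upgrade() {
  if ! gitlab-backup create; then
    echo "Backup failed; skipping pg-upgrade." >&2
    return 1
  fi
  gitlab-ctl pg-upgrade
}

# Usage on the GitLab server (as root):
#   backup_then_pg_upgrade
```

Note that this backup does not include /etc/gitlab/gitlab.rb or /etc/gitlab/gitlab-secrets.json, so those files need to be copied separately.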