Contents

Be careful how and where you define your variables in GitLab

Summary

tl;dr Even when you use multiple configuration files for your GitLab CI pipeline, the variable namespace is shared between them. So you can accidentally override something.

Backstory

So I have this hobby serverless project with my friends. We use GitLab to manually (shame on me) trigger deployments of certain components. Also, because of bad memories from our previous workplace, we decided to go with a monorepo strategy. Sometimes we only want to redeploy one of the projects, and sometimes we initiate a full redeployment. We use the “usual” AWS serverless stack:

  • (Standard) CloudFormation
  • AWS SAM
  • Lambda
  • API Gateway
  • Route53
  • S3
  • CloudFront
  • Cognito
  • IAM

Since this is a hobby project, weeks can pass before anyone on the team touches it (we have lives, after all). So I’m not sure how or when, but it seems GitLab changed how variables are handled within a single repo (or I was blind for months). Let me demonstrate (or at least describe) what I mean.

The setup

Main .gitlab-ci.yml

To oversimplify, this is what our repo file hierarchy looks like:

monorepo/
├── deployment1/
│   └── .gitlab-ci.yml
├── deployment2/
│   └── .gitlab-ci.yml
├── deployment3/
│   └── .gitlab-ci.yml
└── .gitlab-ci.yml

The main/root .gitlab-ci.yml file originally looked like this:

stages:
  - deployment1_step1
  - deployment2_step1
  - deployment3_step1
  
include:
  - local: "/deployment1/.gitlab-ci.yml"
  - local: "/deployment2/.gitlab-ci.yml"
  - local: "/deployment3/.gitlab-ci.yml"

Subfolder .gitlab-ci.yml

And the original content of the deployment1/.gitlab-ci.yml file was something like this:

# defining variables
variables:
  DEV_DOMAIN_NAME: domain.test
  DEV_DEP_BUCKET: dep1.domain.test
  PROD_DOMAIN_NAME: domain.prod
  PROD_DEP_BUCKET: dep1.domain.prod

# image to use
image: busybox

# defining dev variables
.dev_variables: &dev_variables
  ENV: dev
  BUCKET: ${DEV_DEP_BUCKET}
  DOMAIN_NAME: ${DEV_DOMAIN_NAME}

# defining prod variables
.prod_variables: &prod_variables
  ENV: prod
  BUCKET: ${PROD_DEP_BUCKET}
  DOMAIN_NAME: ${PROD_DOMAIN_NAME}

# script to use in deployment
.echo_script: &echo_script
  script: |
    echo "Deployment name is Deployment 1"
    echo "Deployment environment: ${ENV}"
    echo "Deployment bucket name: ${BUCKET}"
    echo "Deployment domain name: ${DOMAIN_NAME}"    

# deployment job/stage which should run in case ENV=dev and DEP1=true
deployment1_step1:dev:
  stage: deployment1_step1
  variables:
    <<: *dev_variables
  <<: *echo_script
  rules:
    - if: $ENV == "dev" && $DEP1 == "true" 

# deployment job/stage which should run in case ENV=prod and DEP1=true
deployment1_step1:prod:
  stage: deployment1_step1
  variables:
    <<: *prod_variables
  <<: *echo_script
  rules:
    - if: $ENV == "prod" && $DEP1 == "true" 

The only difference in deployment2/.gitlab-ci.yml and deployment3/.gitlab-ci.yml is that the DEV_DEP_BUCKET variable targets a different bucket. Our actual project is, of course, much more complex, but I simplified the example as much as I could to show the unexpected issue I ran into.
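For completeness, this is roughly what the variables block of deployment2/.gitlab-ci.yml would look like (a sketch reconstructed from the description above, not the actual file; the rest of the file is identical apart from the job names and the echoed deployment name):

```yaml
# deployment2/.gitlab-ci.yml (sketch) - only the bucket names differ
variables:
  DEV_DOMAIN_NAME: domain.test
  DEV_DEP_BUCKET: dep2.domain.test
  PROD_DOMAIN_NAME: domain.prod
  PROD_DEP_BUCKET: dep2.domain.prod
```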

On the gitlab.com web user interface I defined four variables:

  • DEP1, DEP2, DEP3 with a default value of false
  • ENV with a default value of “dev”

So when we want to run any of the jobs manually from the WebUI (again, shame on me), we just have to specify which project we want to deploy.

The problem

The above setup worked well for us for the time being (or at least I hadn’t noticed anything) until a few weeks ago. Recently we got errors when trying to reach our different subdomains, which are basically different S3 buckets with Route53 entries pointing to dedicated CloudFront distributions in front of them. It turned out that all of the S3 bucket contents were messed up for no apparent reason. So we just went and redeployed everything. The pipeline jobs in GitLab were all green, yet we hit the same problem.

So I did what every experienced IT engineer would do in my place (after the initial panic and cursing): “echo”-ed and “print”-ed all the things I could think of. And the result for me was “shocking” (big word just to increase the drama): all three deployments went to the third bucket, the one defined in deployment3/.gitlab-ci.yml.

At this point I did what every experienced IT engineer would do: run it over and over again. After some time I was sure I should try a different approach, so I went ahead and created a new repo with simple test cases. It has the same file hierarchy introduced above, but with added/modified variables and script commands, just to prove my hypothesis.

Extra tests I did

Changes I made:

  • Removed all “prod”-related lines, since one version is enough for testing
  • Changed the DEV_DEP_BUCKET variable name to DEV_DEP1_BUCKET, DEV_DEP2_BUCKET, and DEV_DEP3_BUCKET respectively
  • Added a new variable, DEV_2ND_BUKET, with the values dep1-2nd.domain.test, dep2-2nd.domain.test, and dep3-2nd.domain.test. So the variable name is the same across all three files, but the values differ, reflecting the folder name

New .gitlab-ci.yml content

variables:
  DEV_DOMAIN_NAME: domain.test
  DEV_DEP1_BUCKET: dep1.domain.test # I have changed the variable names to reflect the deployment name
  DEV_2ND_BUKET: dep1-2nd.domain.test # I have introduced the new variable, with the same name in all 3 .gitlab-ci.yml files

image: busybox

.dev_variables: &dev_variables
  ENV: dev
  BUCKET: ${DEV_DEP1_BUCKET}
  BUCKET2: ${DEV_2ND_BUKET}
  DOMAIN_NAME: ${DEV_DOMAIN_NAME}
  
.echo_script: &echo_script
  script: |
    echo "Deployment name is Deployment 1"
    echo "Deployment environment: ${ENV}"
    echo "Deployment bucket name: ${BUCKET}"
    echo "Deployment domain name: ${DOMAIN_NAME}"
    echo "Deployment 2nd bucket name: ${BUCKET2}"
        
deployment1_step1:dev:
  stage: deployment1_step1
  variables:
    <<: *dev_variables
  <<: *echo_script
  rules:
    - if: $ENV == "dev" && $DEP1 == "true" 

After running with DEP1, DEP2, and DEP3 set to true, the results were as follows.

DEP1:

Deployment name is Deployment 1
Deployment environment: dev
Deployment bucket name: dep1.domain.test
Deployment domain name: domain.test
Deployment 2nd bucket name: dep3-2nd.domain.test

DEP2:

Deployment name is Deployment 2
Deployment environment: dev
Deployment bucket name: dep2.domain.test
Deployment domain name: domain.test
Deployment 2nd bucket name: dep3-2nd.domain.test

DEP3:

Deployment name is Deployment 3
Deployment environment: dev
Deployment bucket name: dep3.domain.test
Deployment domain name: domain.test
Deployment 2nd bucket name: dep3-2nd.domain.test

The result is the same if I set only DEP1=true and leave the others at their default of false. In my mind (and, I think, historically) the value of a variable should only change when those lines are executed/processed. On the contrary, all the included files are loaded and merged at the same time, and variables with the same name are overwritten, with the last included file winning.
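The behavior above can be boiled down to a two-file example (hypothetical file names; this mirrors how GitLab merges included configuration into a single pipeline definition before any job runs):

```yaml
# a/.gitlab-ci.yml
variables:
  SHARED: "from-a"

# b/.gitlab-ci.yml
variables:
  SHARED: "from-b"

# .gitlab-ci.yml
include:
  - local: "/a/.gitlab-ci.yml"
  - local: "/b/.gitlab-ci.yml"

# Every job in the merged pipeline sees SHARED == "from-b",
# including jobs defined in a/.gitlab-ci.yml, because the
# global `variables:` maps are merged and the last include wins.
```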

I really don’t know whether I somehow magically missed this behavior or it actually changed between versions. But I have learned my lesson: variable names should not overlap in a GitLab monorepo.
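One simple way to apply that lesson is to prefix each component’s global variables with the component name, and map them to generic names only inside the job itself, where variables are job-scoped (a sketch; the prefixed names are illustrative):

```yaml
# deployment1/.gitlab-ci.yml (sketch)
variables:
  DEP1_DEV_DOMAIN_NAME: domain.test   # globally unique name, safe to merge
  DEP1_DEV_BUCKET: dep1.domain.test

deployment1_step1:dev:
  stage: deployment1_step1
  variables:
    ENV: dev
    BUCKET: ${DEP1_DEV_BUCKET}        # job-level mapping to a generic name
    DOMAIN_NAME: ${DEP1_DEV_DOMAIN_NAME}
  script: |
    echo "Deployment bucket name: ${BUCKET}"
  rules:
    - if: $ENV == "dev" && $DEP1 == "true"
```

Since job-level variables never leak into other jobs, the generic names (BUCKET, DOMAIN_NAME) can stay the same across all three deployments without colliding.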