# Be careful how and where you define your variables in GitLab

## Summary

**tl;dr** Even when you split your GitLab CI pipeline across multiple configuration files, the variable namespace is shared between them. So you can accidentally override something.

## Backstory

So I have this hobby serverless project with my friends. We are using GitLab to manually (shame on me) trigger the deployments of certain components. Also, because of bad memories from our previous workplace, we decided to go with the [Monorepo strategy](https://en.wikipedia.org/wiki/Monorepo#:~:text=In%20version%20control%20systems%2C%20a,stored%20in%20the%20same%20repository.&text=Many%20attempts%20have%20been%20made,other%2C%20newer%20forms%20of%20monorepos.). There are times when we only want to redeploy one of the projects, and sometimes we initiate a full redeployment.

We are using the "usual" AWS serverless stack:

- (Standard) CloudFormation
- AWS SAM
- Lambda
- API Gateway
- Route53
- S3
- CloudFront
- Cognito
- IAM

Since this is a hobby project, weeks can pass before anyone from the team touches it (we have lives, after all). So I'm not sure how or when, but it seems GitLab changed how variables are handled in a single repo (or I was blind for months). Let me demonstrate (or at least describe) what I mean.
## The setup

### Main .gitlab-ci.yml

Oversimplified, this is what our repo file hierarchy looks like:

```bash
monorepo/
├── deployment1/
│   └── .gitlab-ci.yml
├── deployment2/
│   └── .gitlab-ci.yml
├── deployment3/
│   └── .gitlab-ci.yml
└── .gitlab-ci.yml
```

The main/root `.gitlab-ci.yml` file originally looked like this:

```yaml
stages:
  - deployment1_step1
  - deployment2_step1
  - deployment3_step1

include:
  - local: "/deployment1/.gitlab-ci.yml"
  - local: "/deployment2/.gitlab-ci.yml"
  - local: "/deployment3/.gitlab-ci.yml"
```

### Subfolder .gitlab-ci.yml

And the original content of the `deployment1/.gitlab-ci.yml` file was something like this:

```yaml
# defining variables
variables:
  DEV_DOMAIN_NAME: domain.test
  DEV_DEP_BUCKET: dep1.domain.test
  PROD_DOMAIN_NAME: domain.prod
  PROD_DEP_BUCKET: dep1.domain.prod

# image to use
image: busybox

# defining dev variables
.dev_variables: &dev_variables
  ENV: dev
  BUCKET: ${DEV_DEP_BUCKET}
  DOMAIN_NAME: ${DEV_DOMAIN_NAME}

# defining prod variables
.prod_variables: &prod_variables
  ENV: prod
  BUCKET: ${PROD_DEP_BUCKET}
  DOMAIN_NAME: ${PROD_DOMAIN_NAME}

# script to use in deployment
.echo_script: &echo_script
  script: |
    echo "Deployment name is Deployment 1"
    echo "Deployment environment: ${ENV}"
    echo "Deployment bucket name: ${BUCKET}"
    echo "Deployment domain name: ${DOMAIN_NAME}"

# deployment job/stage which should run in case ENV=dev and DEP1=true
deployment1_step1:dev:
  stage: deployment1_step1
  variables:
    <<: *dev_variables
  <<: *echo_script
  rules:
    - if: $ENV == "dev" && $DEP1 == "true"

# deployment job/stage which should run in case ENV=prod and DEP1=true
deployment1_step1:prod:
  stage: deployment1_step1
  variables:
    <<: *prod_variables
  <<: *echo_script
  rules:
    - if: $ENV == "prod" && $DEP1 == "true"
```

The only difference in `deployment2/.gitlab-ci.yml` and `deployment3/.gitlab-ci.yml` is that the variable `DEV_DEP_BUCKET` points to a different target bucket.
Of course, our actual project is much more complex, but I simplified the example as much as I could to show the unexpected issue I ran into.

On the gitlab.com web user interface I defined 4 variables:

* `DEP1`, `DEP2`, `DEP3` with a default value of `false`
* `ENV` with a default value of "DEV"

So when we wanted to run any of the jobs manually from the WebUI (again, shame on me), we just had to specify which project we wanted to deploy.

## The problem

The above settings worked well for us for a while (or at least I hadn't noticed anything) until a few weeks ago. Recently we started getting errors when we tried to reach our different subdomains, which are basically different S3 buckets with dedicated CloudFront distributions in front of them and Route53 entries pointing to those distributions. After checking, it turned out that the contents of all the S3 buckets were messed up for no apparent reason. So we just went and redeployed everything. The pipeline jobs in GitLab were all green, yet we hit the same problem.

So I did what every experienced IT Engineer would do in my place (after the initial panic and cursing): "`echo`"-ed and "`print`"-ed all the things I could think of. And the result was "shocking" to me (big word just to increase the drama): all three deployments went to the 3rd bucket, the one defined in `deployment3/.gitlab-ci.yml`.

So at this point I did what every experienced IT Engineer would do: ran it over and over again. After some time I was sure I should try a different approach, so I went ahead and created a new repo with simple test cases. It has the file hierarchy introduced above, but with added/modified variables and script commands, just to prove my hypothesis.
## Extra tests I did

Changes I made:

* Removed all "prod" related lines, since one version is enough for testing
* Changed the `DEV_DEP_BUCKET` variable name to `DEV_DEP1_BUCKET`, `DEV_DEP2_BUCKET` and `DEV_DEP3_BUCKET` in the respective files
* Added the new `DEV_2ND_BUKET` variable with the values `dep1-2nd.domain.test`, `dep2-2nd.domain.test` and `dep3-2nd.domain.test`. So the variable name is the same across all 3 files, but it has a different value in each, reflecting the folder name

### New .gitlab-ci.yml content

```yaml
variables:
  DEV_DOMAIN_NAME: domain.test
  DEV_DEP1_BUCKET: dep1.domain.test # I have changed the variable name to reflect the deployment name
  DEV_2ND_BUKET: dep1-2nd.domain.test # I have introduced the new variable, with the same name in all 3 .gitlab-ci.yml files

image: busybox

.dev_variables: &dev_variables
  ENV: dev
  BUCKET: ${DEV_DEP1_BUCKET}
  BUCKET2: ${DEV_2ND_BUKET}
  DOMAIN_NAME: ${DEV_DOMAIN_NAME}

.echo_script: &echo_script
  script: |
    echo "Deployment name is Deployment 1"
    echo "Deployment environment: ${ENV}"
    echo "Deployment bucket name: ${BUCKET}"
    echo "Deployment domain name: ${DOMAIN_NAME}"
    echo "Deployment 2nd bucket name: ${BUCKET2}"

deployment1_step1:dev:
  stage: deployment1_step1
  variables:
    <<: *dev_variables
  <<: *echo_script
  rules:
    - if: $ENV == "dev" && $DEP1 == "true"
```

After running with `DEP1,DEP2,DEP3=true`, the result:

DEP1:

```
Deployment name is Deployment 1
Deployment environment: dev
Deployment bucket name: dep1.domain.test
Deployment domain name: domain.test
Deployment 2nd bucket name: dep3-2nd.domain.test
```

DEP2:

```
Deployment name is Deployment 2
Deployment environment: dev
Deployment bucket name: dep2.domain.test
Deployment domain name: domain.test
Deployment 2nd bucket name: dep3-2nd.domain.test
```

DEP3:

```
Deployment name is Deployment 3
Deployment environment: dev
Deployment bucket name: dep3.domain.test
Deployment domain name: domain.test
Deployment 2nd bucket name: dep3-2nd.domain.test
```

***

The result is the same if I run with only `DEP1=true` and
leave the others at their default `false`.

So in my mind (and I think historically) the value of a variable should only change when the lines that define it are executed/processed. Instead, all the included files are loaded into memory at the same time, and variables with the same name get overwritten (in my tests the value from the last included file, `deployment3/.gitlab-ci.yml`, always won).

I really don't know whether I somehow magically missed this behavior or it actually changed between versions. But I have learned my lesson: variable names should not overlap in a GitLab monorepo.
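
Besides giving every variable a unique name, one way to sidestep the collision is to avoid the top-level `variables:` block in the included files and declare the values directly on the job. This is only a minimal sketch, not what we actually run: it assumes that job-level variables are visible only to the job that declares them, and it reuses the bucket and domain values from the test above.

```yaml
# deployment1/.gitlab-ci.yml (sketch, values taken from the test above)
deployment1_step1:dev:
  stage: deployment1_step1
  # Job-level variables: assumed to be scoped to this job only, so a
  # same-named variable in deployment2/3 cannot overwrite them.
  variables:
    ENV: dev
    BUCKET: dep1.domain.test
    BUCKET2: dep1-2nd.domain.test
    DOMAIN_NAME: domain.test
  script:
    - echo "Deployment bucket name: ${BUCKET}"
    - echo "Deployment 2nd bucket name: ${BUCKET2}"
  rules:
    - if: $ENV == "dev" && $DEP1 == "true"
```

With this layout, `deployment2/.gitlab-ci.yml` and `deployment3/.gitlab-ci.yml` could keep using the names `BUCKET` and `BUCKET2` with their own values, because nothing is defined in the shared global namespace anymore. The trade-off is some repetition between the dev and prod jobs.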