Continuous Integration and Continuous Deployment (CI/CD) is a software development practice that involves frequent code changes, testing, and deployment. GitLab is a popular platform that provides a complete DevOps toolchain, including CI/CD pipelines.
GitLab CI/CD pipelines help automate software building, testing, and deployment, saving developers a lot of time and effort.
GitLab CI/CD pipelines use artifacts to store build outputs, test results, and other files generated during the build process.
Artifacts are stored on the GitLab server and can be downloaded and used by subsequent jobs in the pipeline.
Job artifacts can be configured to include specific files or directories and can be named dynamically using CI/CD variables.
This makes sharing build outputs and other files easy across different pipeline stages.
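As a minimal sketch of dynamic naming, the job below builds the archive name from two predefined CI/CD variables. The job name, the script, and the build/ path are placeholders chosen for illustration.

```yaml
build-job:
  stage: build
  script:
    # Placeholder build step; a real job would run the actual build here.
    - mkdir -p build
    - echo "compiled output" > build/app.txt
  artifacts:
    # The archive name combines the job name with the branch or tag slug.
    name: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"
    paths:
      - build/
```

Naming the archive after the branch makes it easier to tell downloads apart when several branches build at the same time.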
Let’s see where we could potentially use these within our pipelines.
Understanding GitLab Artifacts, And Where You Will See Them
GitLab artifacts are files generated by a CI/CD pipeline job and stored on the GitLab server.
These files can be used by other jobs in the same pipeline or can be downloaded by users for further analysis.
Understanding how to use GitLab Artifacts is essential for creating efficient and effective CI/CD pipelines.
Artifacts Archive
The artifacts archive is a compressed (zip) file containing the files and directories a job is configured to save.
This archive is stored on the GitLab server and can be downloaded by users for analysis.
You decide what goes into the archive by listing specific files or directories in the job configuration.
GitLab Pages
GitLab Pages is a feature that allows users to host static websites directly from their GitLab repository. Artifacts generated by a job can be used to populate a GitLab Pages site.
This is useful for creating documentation or other static content that is generated by a CI/CD pipeline.
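A minimal sketch of a Pages job might look like the following, assuming the site is a single hand-written HTML file. The page content is a placeholder, and the rule restricting the job to the default branch is an optional choice, not a requirement.

```yaml
pages:
  stage: deploy
  script:
    # Placeholder site generation; a real job would run a static site generator.
    - mkdir -p public
    - echo "<h1>Project docs</h1>" > public/index.html
  artifacts:
    paths:
      # GitLab Pages publishes the contents of this directory.
      - public
  rules:
    # Only publish from the default branch.
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```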
API
The GitLab API provides a way to access Artifacts generated by a job programmatically.
This allows users to automate the download and analysis of Artifacts, making integrating them into other systems or workflows easier.
Artifacts for Parent and Child Jobs
When a job is part of a pipeline, it can generate artifacts that are used by other jobs in the same pipeline.
It helps to think of these jobs as parents and children, although GitLab does not use these terms for jobs.
A parent job runs in an earlier stage than the current job, while a child job runs in a later stage. By default, a job downloads the artifacts of the jobs in earlier stages, so artifacts generated by parent jobs can be used by child jobs, but not vice versa.
Job with the Same Name
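Job names are unique within a single pipeline configuration, so this situation mainly comes up with parent-child pipelines, where a parent and a child pipeline can each contain a job with the same name.
Since GitLab 13.5, when you request the latest artifacts by job name, the pipelines are searched in hierarchical order from parent to child, so the artifacts from the parent pipeline's job are returned. This can lead to confusion if you expected the child job's artifacts.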
Working with Parent and Child Pipelines
Parent-child pipelines are a feature GitLab CI/CD provides that helps manage complexity while keeping it all in a monorepo.
Splitting complex pipelines into multiple pipelines with a parent-child relationship can improve performance by allowing child pipelines to run concurrently.
A parent pipeline can trigger many child pipelines, and these child pipelines can trigger their own child pipelines.
Pipelines within GitLab have a maximum depth of two levels of child pipelines. Once this depth is reached, you cannot trigger another level of pipelines.
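As a rough sketch, a parent pipeline in a monorepo could trigger one child pipeline per component. The job names and the frontend/ and backend/ paths are assumptions for illustration; each referenced file would hold that component's own jobs.

```yaml
# Parent pipeline (.gitlab-ci.yml). Both trigger jobs sit in the same stage,
# so the two child pipelines can run concurrently.
trigger-frontend:
  stage: build
  trigger:
    include: frontend/.gitlab-ci.yml
    # Wait for the child pipeline and mirror its result on this job.
    strategy: depend
  rules:
    - changes:
        - frontend/**/*

trigger-backend:
  stage: build
  trigger:
    include: backend/.gitlab-ci.yml
    strategy: depend
  rules:
    - changes:
        - backend/**/*
```

With the changes rules in place, a commit that only touches one component starts only that component's child pipeline.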
When working with parent and child pipelines, it is important to understand how artifacts move between them.
The artifacts from the parent pipeline are not automatically passed to the child pipeline. To fetch them, a job in the child pipeline uses the needs keyword with a pipeline and a job entry (needs:pipeline:job), and the trigger job in the parent passes the parent's pipeline ID to the child as a variable.
The dependencies keyword does not help here, because it only selects jobs within the same pipeline.
Going the other way, the artifacts of a child pipeline can be downloaded through the UI or the API.
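One way to hand a parent job's artifacts to a child pipeline is the needs:pipeline:job form of the needs keyword. The sketch below assumes a child configuration file named child.yml and a variable named PARENT_PIPELINE_ID; both names are illustrative choices, and the two files are shown in one block for brevity.

```yaml
# --- Parent pipeline (.gitlab-ci.yml) ---
build-artifact:
  stage: build
  script:
    - mkdir -p build
    - echo "built by parent" > build/output.txt
  artifacts:
    paths:
      - build/

trigger-child:
  stage: test
  variables:
    # Pass the parent's pipeline ID down to the child pipeline.
    PARENT_PIPELINE_ID: $CI_PIPELINE_ID
  trigger:
    include: child.yml

# --- Child pipeline (child.yml) ---
use-artifact:
  script:
    - cat build/output.txt
  needs:
    # Download the artifacts of build-artifact from the parent pipeline.
    - pipeline: $PARENT_PIPELINE_ID
      job: build-artifact
```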
Another important aspect to consider when working with parent and child pipelines is how to prevent the artifacts from expiring.
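By default, artifacts expire after 30 days, unless the instance administrator has changed that default. To control this per job, you can use the expire_in keyword, which specifies the duration for which the artifacts should be kept. Setting expire_in to never prevents the artifacts from expiring at all.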
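A short sketch of both ends of the range, with made-up job names and paths: one job whose artifacts are discarded quickly and one whose artifacts never expire.

```yaml
scratch-output:
  script:
    - mkdir -p tmp-output
    - echo "scratch data" > tmp-output/log.txt
  artifacts:
    paths:
      - tmp-output/
    # Removed two hours after the job finishes.
    expire_in: 2 hours

release-build:
  script:
    - mkdir -p dist
    - echo "release binary" > dist/app.bin
  artifacts:
    paths:
      - dist/
    # Kept until someone deletes them manually.
    expire_in: never
```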
Understanding CI Artifacts
CI artifacts are files that are generated during a CI/CD pipeline and stored on the GitLab server.
These files can be used in subsequent jobs, allowing for faster and more efficient builds.
GitLab CI artifacts can be defined in the .gitlab-ci.yml file. The artifacts keyword is used to specify the files or directories that should be saved as artifacts.
To download job artifacts, navigate to the job page and click on the “Download” button next to the artifact you want to download. Artifacts can also be downloaded using the GitLab API.
GitLab Pages access control can be used to control who can access artifacts stored on GitLab Pages. Access control can be set to public, internal, or private.
GitLab can also be used as an artifact repository. This allows for easy sharing and distribution of artifacts across teams and projects.
Overall, understanding CI artifacts is an essential part of building efficient and effective CI/CD pipelines. By adequately defining and utilizing artifacts, developers can save time and improve the speed and reliability of their builds.
Creating and Managing Build Artifacts
To create and manage build artifacts in GitLab CI/CD, one can use the artifacts keyword in the .gitlab-ci.yml file.
This can be done by specifying the paths of the files and directories that need to be added to the job artifacts.
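For instance, a job could save a whole output directory while leaving out temporary files. This is a sketch with placeholder file names; artifacts:exclude is used here to filter what artifacts:paths would otherwise pick up.

```yaml
compile:
  stage: build
  script:
    # Placeholder build that produces one real output and one scratch file.
    - mkdir -p dist/tmp
    - echo "binary" > dist/app.bin
    - echo "scratch" > dist/tmp/cache.o
  artifacts:
    paths:
      # Everything under dist/ is added to the job artifacts...
      - dist/
    exclude:
      # ...except the temporary files.
      - dist/tmp/**/*
```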
GitLab also offers a video tutorial on how to create and manage build artifacts. This tutorial is available on the GitLab website and can be accessed by anyone who wants to learn how to use GitLab CI/CD to create and manage build artifacts.
For beginners, a CI pipeline tutorial is available on the GitLab website. This tutorial is designed to help beginners learn how to create and manage CI pipelines using GitLab. The tutorial covers creating a simple pipeline, running tests, and deploying code.
When creating and managing build artifacts, it is important to remember that artifacts are a list of files and directories attached to a job after it finishes.
This feature is enabled by default in all GitLab installations. Disabling job artifacts may result in losing important data.
You can use various tools and technologies such as Gradle, Maven, and Ant to create job artifacts.
These tools allow users to automate creating and managing build artifacts.
Using GitLab UI and Runner
Job artifacts pass through two parts of GitLab: the GitLab Runner, which uploads them according to your .gitlab-ci.yml file, and the GitLab UI, where you browse and download them.
Each has its own benefits and limits, so it’s important to understand what each one does.
GitLab UI
The GitLab UI provides an easy-to-use interface for accessing job artifacts. To download job artifacts from the UI, follow these steps:
- Navigate to the job that produced the artifacts.
- Find the “Job artifacts” section in the right sidebar.
- Click on “Browse” if you only want to look at individual files.
- Click on the “Download” button to get the whole archive.
It’s important to note that, by default, GitLab keeps the artifacts from the most recent successful pipeline on each branch or tag, even after they would otherwise expire.
Artifacts from failed jobs are a different matter: a job only uploads artifacts when it succeeds, unless you tell the GitLab Runner otherwise in your .gitlab-ci.yml file.
GitLab Runner (Our Preference)
The GitLab Runner provides a more flexible way of working with job artifacts, because everything is declared in code. For the Runner to upload artifacts, you’ll need to use the artifacts keyword in your .gitlab-ci.yml file. Here’s an example:
job:
  script:
    - echo "Your First Runner!"
  artifacts:
    paths:
      - build/
In this example, the job declares the build/ directory as its artifact (a real job would create files there in its script). Because of the artifacts keyword, the GitLab Runner automatically uploads the contents of build/ to the GitLab server once the job finishes successfully.
To download the artifact afterwards, follow the same steps described in the GitLab UI section above.
By default, a job uploads artifacts only when it succeeds. The GitLab Runner can also upload artifacts from a job regardless of whether it was successful or not. To do this, set when: always under the artifacts keyword in your .gitlab-ci.yml file.
Here’s an example:
job:
  script:
    - echo "Your First Runner!"
  allow_failure: true
  artifacts:
    name: "Example Artifact"
    paths:
      - build/
    when: always
    expire_in: 1 week
In this example, when is set to always, which means that the artifact will be uploaded even if the job fails, and allow_failure: true lets the pipeline continue past that failure.
The artifact will be kept for a week before it is automatically deleted. To keep an artifact indefinitely instead, set expire_in: never or select “Keep” on the job page.
Understanding Child Pipelines and Jobs
In GitLab, child pipelines are pipelines triggered by a parent pipeline. They are helpful for breaking down complex pipelines into smaller, more manageable pieces. Child pipelines have their own set of jobs, and each job can have its own set of artifacts.
When a child pipeline is triggered, it receives the variables defined on the trigger job in the parent pipeline, but it does not automatically receive the parent's artifacts. Jobs in the child pipeline have to request them explicitly, and jobs in the parent pipeline do not automatically receive the artifacts generated by jobs in the child pipeline either.
It’s also important to note that the latest artifacts are taken only from successful pipelines. When you download the latest artifacts for a branch, a child pipeline's files become available only once the pipeline has completed successfully.
GitLab Pages also builds on artifacts: a job named pages publishes the site from the files it uploads as artifacts, which is the public/ directory by default.
It’s also worth noting that artifacts are automatically deleted after a certain time. By default, artifacts are kept for 30 days; administrators can change this default for the instance, and each job can override it with expire_in. We mostly set expire_in explicitly within our pipelines to keep our stored artifacts tidy.
Specific Job and Artifact Storage
On a self-managed GitLab instance, the artifacts uploaded by jobs are stored in a directory on the server that the administrator can choose. Within that location, GitLab stores the artifacts of each job separately.
This allows for easy access to job-specific artifacts, even when multiple jobs are run in the same pipeline.
By default, the artifacts are stored in /var/opt/gitlab/gitlab-rails/shared/artifacts.
However, an administrator can change the storage path by editing the /etc/gitlab/gitlab.rb file and adding the line
gitlab_rails['artifacts_path'] = "/mnt/storage/artifacts"
Once the file is saved, GitLab must be reconfigured by running sudo gitlab-ctl reconfigure.
To download the artifacts archive, the user can use the GitLab UI or the API. The UI provides an easy-to-use interface for downloading artifacts, while the API allows for more programmatic access.
The API endpoint for downloading artifacts is /projects/:id/jobs/:job_id/artifacts.
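A job can call this endpoint itself. The sketch below assumes that curl and unzip are available on the runner, that API_TOKEN is a CI/CD variable holding an access token with API read access, and that UPSTREAM_JOB_ID holds the ID of the job whose artifacts you want; both variable names are hypothetical.

```yaml
fetch-artifacts:
  stage: test
  script:
    # The folded block (>) keeps the "PRIVATE-TOKEN: ..." header from being
    # parsed as a YAML key/value pair.
    - >
      curl --fail --location --output artifacts.zip
      --header "PRIVATE-TOKEN: $API_TOKEN"
      "$CI_API_V4_URL/projects/$CI_PROJECT_ID/jobs/$UPSTREAM_JOB_ID/artifacts"
    - unzip artifacts.zip -d downloaded/
```

CI_API_V4_URL and CI_PROJECT_ID are predefined variables, so the same job works unchanged across projects on the same instance.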
Artifacts can also be accessed directly from the job page.
When viewing a job, the user can find the “Job artifacts” section in the right sidebar, which offers options to keep, download, or browse the artifacts generated by that job.
From there, the user can download the artifacts directly.
For more information on job artifacts in GitLab, please refer to the GitLab product documentation.
Understanding Job Artifacts and Merge Requests
When a merge request is created or updated, GitLab can run a pipeline for the merge request, provided that jobs are configured to run on merge request events.
This pipeline includes only the jobs that are configured for merge requests.
The merge request pipeline runs separately from the branch pipeline. This allows developers to test their changes before merging them into the main branch.
We like to run things like PyTest and a few other final checks when a merge request pipeline runs.
Here’s an example of how we do this:
# Define the job that runs on merge requests
merge_request:
  stage: test
  only:
    - merge_requests
  script:
    - echo "Running tests on merge request..."
Job artifacts in the merge request pipeline are important because they allow developers to see the results of their changes.
Artifacts from the merge request pipeline capture the files each job produced, such as test reports and build outputs.
Developers can view the artifacts in the merge request UI. They can also download the artifacts using the API.
If a job fails in the merge request pipeline, developers can retry the failed job. When the job is retried, the artifacts from the previous run are still available.
This can be useful for debugging and testing purposes.
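As a hedged sketch of a merge request job that also keeps its output, the job below uses rules with the predefined CI_PIPELINE_SOURCE variable instead of only, and saves a log as an artifact. The job name and the reports/ path are illustrative, and pytest is assumed to be installed in the job's image.

```yaml
mr-tests:
  stage: test
  rules:
    # Run only in merge request pipelines.
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - mkdir -p reports
    # Write the test output to a file so it can be saved as an artifact.
    - pytest > reports/pytest.log
  artifacts:
    # Upload the log even when the tests fail.
    when: always
    paths:
      - reports/
    expire_in: 1 week
```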
Working with Specific Jobs and Artifacts Archive
When working with GitLab CI/CD pipelines, it is often necessary to work with specific jobs and artifacts archives. The artifacts archive is a collection of files and directories a job generates.
By default, GitLab Runner uploads the artifacts archive to GitLab only when a job succeeds.
However, it is possible to configure GitLab Runner to upload the artifacts archive on failure or always using the artifacts:when parameter.
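For example, a test job might upload its logs only when something goes wrong. In this sketch, run-integration-tests.sh is a hypothetical script in the repository and logs/ is an arbitrary directory name.

```yaml
integration-tests:
  stage: test
  script:
    - mkdir -p logs
    # Hypothetical test runner that exits non-zero on failure.
    - ./run-integration-tests.sh > logs/integration.log
  artifacts:
    # Upload the archive only if the job fails.
    when: on_failure
    paths:
      - logs/
    expire_in: 3 days
```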
When a job generates artifacts, jobs in later stages download them by default. The dependencies keyword narrows this down.
By listing job names under dependencies, a job downloads artifacts only from those jobs, and an empty list (dependencies: []) means it downloads none.
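A small sketch of how this plays out, using made-up job names and paths: two build jobs produce artifacts, one test job fetches only the artifacts it needs, and a lint job fetches none.

```yaml
build-linux:
  stage: build
  script:
    - mkdir -p out/linux
    - echo "linux binary" > out/linux/app
  artifacts:
    paths:
      - out/linux/

build-docs:
  stage: build
  script:
    - mkdir -p out/docs
    - echo "docs" > out/docs/index.html
  artifacts:
    paths:
      - out/docs/

test-linux:
  stage: test
  # Download artifacts from build-linux only, not from build-docs.
  dependencies:
    - build-linux
  script:
    - ls out/linux

lint:
  stage: test
  # An empty list skips downloading artifacts entirely.
  dependencies: []
  script:
    - echo "Linting without any artifacts"
```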
In addition to artifacts, GitLab allows for scanning report uploads and fuzzing report uploads to GitLab. These reports can be generated by specific jobs and uploaded to GitLab for further analysis.
When working with specific jobs and artifacts archives, it is important to consider the following questions:
- Which jobs generate artifacts that need to be used in subsequent jobs?
- How can the artifacts archive be uploaded to GitLab?
- How can scanning and fuzzing reports be uploaded to GitLab?
- What happens when a job succeeds or fails?
By answering these questions, developers can effectively work with specific jobs and artifact archives in GitLab CI/CD pipelines.
Retrieving and Keeping Job Artifacts
To retrieve job artifacts, users can navigate to the job details page and select the “Download” button next to the artifacts they wish to retrieve.
Alternatively, job artifacts can be retrieved programmatically using the GitLab API.
By default, GitLab keeps job artifacts for a limited time before automatically deleting them. However, users can specify a longer or shorter expiration time for job artifacts using the expire_in keyword in their job configuration.
If expire_in is not defined, the instance-wide setting is used. Users can select “Keep” from the job details page to prevent artifacts from expiring.
Test reports provide more details on the job’s performance and can be used to diagnose problems and improve the quality of the codebase. GitLab supports various test report formats, including JUnit, Cucumber JSON, and Cobertura.
For example, collecting a JUnit test report requires adding a script to the job that generates the report and then specifying the path to the report using the artifacts:reports:junit keyword in the job configuration.
The resulting JUnit test report can be viewed on the pipeline's Tests tab and in the merge request, and it can be used to track the progress of the test suite over time.
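A minimal sketch for a Python project, assuming the tests run with pytest; the image tag and the report file name are illustrative choices.

```yaml
pytest:
  stage: test
  image: python:3.12
  script:
    - pip install pytest
    # Write the results as JUnit XML so GitLab can parse them.
    - pytest --junitxml=report.xml
  artifacts:
    # Upload the report even when tests fail, so the failures are visible.
    when: always
    reports:
      junit: report.xml
```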