By Wes Dean
Overview
MegaLinter is a tool we use at Flexion to provide Static Code Analysis (SCA) of the software we write, support, and maintain. It provides us with a window to help our customers see the quality of what we do and verify that we’re collaborating with them in meeting our Quality Assurance Surveillance Program (QASP) goals. This helps us deliver on one of our Flexion Fundamentals, “Empower customers.”
In my previous article, I laid a step-wise foundation for configuring MegaLinter. Personally, I use MegaLinter in virtually every single repository on every single one of my projects. Regardless of what the purpose of the repository is, what language I’m using, or how complicated it is, there are always ways that MegaLinter can help. Throughout my MegaLinter journeys, I’ve amassed a variety of tips and tricks to make life easier and make MegaLinter even more effective.
1. Configure MegaLinter Variables
MegaLinter can be configured in a few different ways. First, MegaLinter supports environment variables. Environment variables may be set at the command line when running the tool locally or by the CI/CD system (e.g., GitHub Actions, Jenkins, etc.).
Second, MegaLinter supports reading configuration files. By default, MegaLinter expects its configuration file to be named .mega-linter.yml (note the hyphen) and to sit in the root of the repository.
Variables passed to MegaLinter through the environment take precedence over variables in the configuration file. That is, if APPLY_FIXES is true in the environment but false in the .mega-linter.yml file, the true from the environment variable is given priority.
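As a rough sketch of that precedence, the two fragments below set the same variable in both places. The workflow step is illustrative only: the action reference and version tag are assumptions and may differ in your setup.

```yaml
# File 1: .mega-linter.yml in the repository root
DISABLE_ERRORS: false
---
# File 2: fragment of a GitHub Actions workflow step (illustrative only;
# the action reference and version tag are assumptions)
- name: MegaLinter
  uses: oxsecurity/megalinter@v7
  env:
    # The environment value takes priority over the one in .mega-linter.yml
    DISABLE_ERRORS: true
```

With both in place, MegaLinter would run with DISABLE_ERRORS set to true.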
2. Validate Your MegaLinter Configuration
There are a bunch of tools included with MegaLinter. Each of those tools has a bunch of configuration options. Making sure a MegaLinter configuration file is valid before committing a configuration change to the repository can help reduce the number of “Fix typo” and “Update MegaLinter config” commits. Fortunately, we have a few tools that can help.
First, yamllint can make sure the .mega-linter.yml configuration file is valid YAML. If a MegaLinter configuration isn’t valid YAML, there’s a chance that MegaLinter won’t be able to parse it.
Second, v8r checks the validity of JSON and YAML files against published schemas. MegaLinter provides a schema file that can be used to validate not only that a MegaLinter configuration is valid YAML, but also that the variables and directives included in the file are valid. So, for example, this is correct:
VALIDATE_ALL_CODEBASE: true
But this is not:
VALIDATE-ALL-CODEBASE: true
In situations like those, yamllint will report that the file is valid YAML – because it is – but v8r will report that VALIDATE-ALL-CODEBASE is not a valid MegaLinter option because it is not.
To run v8r manually, run the following command:
v8r -s \
  "https://raw.githubusercontent.com/megalinter/megalinter/main/megalinter/descriptors/schemas/megalinter-configuration.jsonschema.json" \
  .mega-linter.yml
3. Specify an Alternative MegaLinter Configuration File
By default, MegaLinter looks in the repository’s root directory for a file named .mega-linter.yml for configuration details. That can be overridden by specifying an alternative location with the MEGALINTER_CONFIG environment variable.
Another nifty fact about MEGALINTER_CONFIG is that it can accept a filename – as one would expect – but it can also accept a URL.
If a URL is provided and the URL begins with ‘https://raw.githubusercontent.com’ and the GITHUB_TOKEN variable is set, an Authorization header is set with the token.
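A minimal sketch of what this might look like in a GitHub Actions step follows. The URL is a hypothetical placeholder, and the action reference and version tag are assumptions.

```yaml
# Fragment of a GitHub Actions workflow step (illustrative only)
- name: MegaLinter
  uses: oxsecurity/megalinter@v7
  env:
    # Hypothetical URL; replace with the location of your own configuration file
    MEGALINTER_CONFIG: https://raw.githubusercontent.com/example-org/example-repo/main/.mega-linter.yml
    # Sent as an Authorization header because the URL points at raw.githubusercontent.com
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```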
4. Extend a Common MegaLinter Configuration
Say you’re like me and you have a whole bunch of repositories, most of which share most of the same configuration across them with only a few repository-specific changes. You could make a copy of one configuration file for each repository and make sure each repository is kept current and spend the rest of your days tracking down changes. That’s one option.
Another option is to use a common, base MegaLinter configuration file with the stuff that’s the same across all repositories and use the EXTENDS variable in the configuration.
The EXTENDS parameter can be a string (for one configuration file) or an array (for multiple configuration files). In the case of several configuration files, configuration values specified earlier take precedence over values from later configuration files.
Just like with MEGALINTER_CONFIG, configuration files may be filenames or URLs. The GITHUB_TOKEN functionality from the MEGALINTER_CONFIG also applies here.
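Here is a small sketch of a repository-level configuration that extends shared files. Both EXTENDS entries are hypothetical placeholders, one a URL and one a filename.

```yaml
# .mega-linter.yml in an individual repository
# Both entries below are hypothetical placeholders
EXTENDS:
  - https://raw.githubusercontent.com/example-org/megalinter-config/main/base.mega-linter.yml
  - .mega-linter-shared.yml

# Repository-specific settings sit alongside EXTENDS
DISABLE:
  - SPELL
```

For a single base file, EXTENDS can instead be given as a plain string.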
5. Use MegaLinter Flavors
MegaLinter includes many different tools. At the time of writing, more than 100 language, tooling, and file format linters are included with more added regularly. That’s awesome! Some of my repositories use a few languages and tools, but nowhere near the full spectrum of what MegaLinter supports. Because of all of these tools, the full MegaLinter test suite is quite large (3.39 GB compressed, 9.36 GB uncompressed for the 7.11.1 image at the time of writing), so downloading and decompressing it can expend some resources (bandwidth, compute time).
Fortunately, MegaLinter includes several (19 at the time of writing) subsets of the full test suite called “flavors.” Suppose I’m working on a Java project and I don’t have any Rust code in my repository. There’s no need to download, decompress, and run the Rust tooling only to lint zero files. In this case, I can use the Java flavor of MegaLinter which includes a healthy 54 scanners (as compared to 122 for the full version). The Java flavor is much smaller (1.00 GB compressed, 2.53 GB uncompressed), a savings of roughly two-thirds. For more information, see the list of flavors on the MegaLinter site.
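As an illustration, selecting the Java flavor in a GitHub Actions workflow might look like the fragment below. The action path and version tag are assumptions based on how the flavors are published, so confirm them against the list of flavors before relying on them.

```yaml
# Fragment of a GitHub Actions workflow step (illustrative only)
- name: MegaLinter (Java flavor)
  # Assumed action path for the Java flavor; verify against the flavors list
  uses: oxsecurity/megalinter/flavors/java@v7
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```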
6. Enabling and Disabling Linters
Along the same lines as using flavors, individual linters or whole groups (descriptors) of linters can be enabled or disabled. Disabling linters isn’t the same as using flavors: a disabled linter is still downloaded, it just isn’t run.
The rules for enabling and disabling linters are a little tricky:
- By default, all linters are enabled.
- If the ENABLE variable is set, only the groups (descriptors) listed will be enabled.
- If ENABLE_LINTERS is set, only the individual linters listed will be enabled.
- If DISABLE is set, all groups (descriptors) are enabled except for those listed.
- If DISABLE_LINTERS is set, all linters are enabled except for those listed.
So, for example, if I’m configuring MegaLinter to run in a repository and I don’t need to worry about the accuracy of spelling in any of the files and the spell checkers are telling me that PERL isn’t a word (I’m looking at you, cspell), I can disable the whole family of spell checkers with:
DISABLE: SPELL
Similarly, if I want to lint my markdown files but don’t need to check their links, I would disable only the MARKDOWN_MARKDOWN_LINK_CHECK linter:
DISABLE_LINTERS: MARKDOWN_MARKDOWN_LINK_CHECK
All of these options accept either strings or lists, so this will also work:
DISABLE:
  - SPELL
DISABLE_LINTERS:
  - MARKDOWN_MARKDOWN_LINK_CHECK
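The opposite approach, an allow-list, can be sketched the same way. The descriptors chosen here are only an example; pick the ones that match your repository.

```yaml
# Run only the linters that belong to these groups (descriptors);
# everything not listed here is left out of the run
ENABLE:
  - BASH
  - MARKDOWN
  - REPOSITORY
```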
7. Only Scan Changed Files
Suppose you have a healthy-sized repository with a few thousand files. Someone reports a bug and the cause is one typo in a single file. The repo has been using MegaLinter for a while now so everything’s looking good. The only question now is, “Do we need to scan the entire repository? Can’t we just scan the changed files and leave the rest?”
Don’t worry. MegaLinter’s got you covered. There is a configuration variable named VALIDATE_ALL_CODEBASE that defaults to true (scan every file every time). When this variable is set to false, MegaLinter lints only the files that git reports as changed.
Scanners and linters under the REPOSITORY group (e.g., REPOSITORY_SECRETLINT) scan the entire repository, even when VALIDATE_ALL_CODEBASE is set to false.
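One possible arrangement, sketched below, is to scan everything on pushes to the main branch and only the changed files otherwise. The branch name, action reference, and version tag are assumptions for illustration.

```yaml
# Fragment of a GitHub Actions workflow step (illustrative only)
- name: MegaLinter
  uses: oxsecurity/megalinter@v7
  env:
    # Evaluates to true on pushes to main (full scan), false otherwise (changed files only)
    VALIDATE_ALL_CODEBASE: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}
```

To lint only changed files in every run, a plain VALIDATE_ALL_CODEBASE: false in .mega-linter.yml is enough.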
8. Run Commands Before or After Linters
It’s sometimes helpful to be able to run arbitrary commands before or after individual linters run, or even before or after all of the linters run. Once again, MegaLinter has you covered.
The PRE_COMMANDS variable accepts a list of shell commands to run before any of the linters run. The commands listed here are run in the MegaLinter container at either the root or the workspace (use the cwd variable to specify).
The command variable is used to specify what command to run. The Python subprocess library is used with the shell parameter set to true and the shell used by the subprocess is /bin/bash (under non-Win32). As a result, the command must be provided exactly as if it were being typed on the command line (i.e., shell quoting, backslash escapes, spaces in filenames, etc. all need to be addressed). Here’s an example:
PRE_COMMANDS:
  - command: "echo 'hello world'"
    cwd: "root"
  - command: "touch '.somefile'"
    cwd: "workspace"
POST_COMMANDS works the same way:
POST_COMMANDS:
  - command: "rm -f '.somefile'"
    cwd: "workspace"
There is also support for running commands before and/or after specific linters by specifying the group (descriptor), the linter, and then PRE_COMMANDS or POST_COMMANDS like this: BASH_SHELLCHECK_PRE_COMMANDS (the group (descriptor) is BASH, the linter is SHELLCHECK, etc.).
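A sketch of the linter-specific form, using the shellcheck example above, could look like this. The commands themselves are placeholders for illustration.

```yaml
# Runs only around the BASH_SHELLCHECK linter; commands are placeholders
BASH_SHELLCHECK_PRE_COMMANDS:
  - command: "echo 'about to run shellcheck'"
    cwd: "workspace"
BASH_SHELLCHECK_POST_COMMANDS:
  - command: "echo 'shellcheck finished'"
    cwd: "workspace"
```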
9. Using SARIF Reports
A bunch of the linters support writing SARIF reports to document warnings, errors, security findings, and more. MegaLinter supports having these linters write their reports to the reports directory along with the typical logs that are created.
For repositories that are either public or are private but have GitHub Advanced Security enabled (assuming the repository in question is hosted on GitHub), SARIF files can be uploaded right to GitHub. This functionality is based on the github/codeql-action/upload-sarif action and its invocation is built right into the default GitHub Action file. All you have to do is enable the SARIF reporter in the MegaLinter configuration file:
SARIF_REPORTER: true
Personally, I use the andstor/file-existence-action to verify that the SARIF report exists first before trying to upload it.
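The check-then-upload approach can be sketched roughly as follows. The version tags, the step id, and the report path are assumptions; adjust them to match your workflow and MegaLinter's actual output location.

```yaml
# Fragment of a GitHub Actions workflow, placed after the MegaLinter step
- name: Check whether the SARIF report exists
  id: sarif_check            # hypothetical step id
  if: always()               # run even when MegaLinter reported errors
  uses: andstor/file-existence-action@v3
  with:
    files: "megalinter-reports/megalinter-report.sarif"

- name: Upload SARIF report to GitHub
  # Skip the upload when no report was produced
  if: always() && steps.sarif_check.outputs.files_exists == 'true'
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: "megalinter-reports/megalinter-report.sarif"
```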
If a repository is private and GitHub Advanced Security is not enabled (or available for whatever reason), have no fear!
There are GitHub Actions available to add SARIF output to either Pull Request (PR) comments or to generate new issues.
SARIF files may also be uploaded into tools that support them, such as DefectDojo.
Another quick tip with SARIF files is that when a workspace is being reused (e.g., with a local Jenkins instance), if the issues in a file are fixed, a new SARIF report (i.e., one with zero findings) won’t overwrite an old SARIF file which could lead tools like DefectDojo to ingest old findings. In situations like these, the CLEAR_REPORT_FOLDER parameter can be used to remove old logs, SARIF files, etc. at the start of each MegaLinter run. This is usually a non-issue when ephemeral workspaces are used (e.g., in the context of a GitHub Action).
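For a reused workspace, a minimal configuration sketch combining the two settings might be:

```yaml
# .mega-linter.yml
SARIF_REPORTER: true
# Remove old logs and SARIF files at the start of each run
CLEAR_REPORT_FOLDER: true
```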
10. Disable Errors from Failing Builds
One of the biggest objections I’ve heard when trying to roll out MegaLinter onto projects is that developers don’t want MegaLinter to serve as a quality gate that prevents merges or promotions, especially when there’s a backlog of technical debt that MegaLinter may discover. Another is that they don’t want a deployment to stop for trivial reasons, like the wrong number of spaces between content and comments in a YAML file.
Once again – and this shouldn’t come as a surprise – MegaLinter can be configured to work with your situation. To get there, we have three variables we can set:
- When DISABLE_ERRORS is set to true, all “errors” are reclassified as “warnings” and the build will always succeed. It’ll still run and it’ll still find things and it’ll still report its findings, but the presence of findings won’t cause MegaLinter to fail.
- Individual linters can have errors reclassified as warnings with configuration for each of those linters. So, for example, to make shellcheck (from the bash group (descriptor)) errors into warnings, use:
BASH_SHELLCHECK_DISABLE_ERRORS: true
- The DISABLE_ERRORS_LINTERS variable can take an array of linters whose errors are reclassified as warnings, just like with the DESCRIPTOR_LINTER_DISABLE_ERRORS variables listed previously.
This can come in really handy when we leave DISABLE_ERRORS set to false (that is, errors will cause MegaLinter to fail the build) but then disable errors for the linters whose findings are less critical. Consider the following:
DISABLE_ERRORS: false
DISABLE_ERRORS_LINTERS:
- MARKDOWN_MARKDOWNLINT
- MARKDOWN_REMARK_LINT
- MARKDOWN_MARKDOWN_LINK_CHECK
- MARKDOWN_MARKDOWN_TABLE_FORMATTER
In situations like this, the REPOSITORY linters, many of which are security-related, will still allow MegaLinter to fail builds; however, problems with markdown files, no matter what they are, will not cause MegaLinter to fail builds.
Bonus Tip: APPLY_FIXES
For some tools, MegaLinter provides the ability to automatically fix errors that it encounters, reformat files to meet a standard, etc. This functionality is enabled with several variables:
- APPLY_FIXES: either set to all (for all linters) or a list of linters where fixes should be applied
- APPLY_FIXES_EVENT must be set to all, push, or pull_request as an environment variable, not in a configuration file
- APPLY_FIXES_MODE must be set to commit or pull_request as an environment variable, not in a configuration file
- PAT, which holds a Personal Access Token that allows write access to the repository. Note that if a MegaLinter fix is applied to a workflow file (i.e., those under .github/workflows/*.yml), then the workflow scope must be added to the PAT.
Also, keep in mind that if fixes are applied, the subsequent capture of updated files may grab the logs and reports from the megalinter-reports/ directory, so it’s helpful to tell git to ignore megalinter-reports/ with a .gitignore file.
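Putting those variables together, one possible setup is sketched below. The values chosen are just one combination, and the workflow fragment is illustrative rather than a complete workflow.

```yaml
# File 1: .mega-linter.yml
APPLY_FIXES: all
---
# File 2: fragment of a GitHub Actions workflow (illustrative only)
env:
  # These two must be environment variables, not configuration file entries
  APPLY_FIXES_EVENT: pull_request
  APPLY_FIXES_MODE: commit
```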
Summary
MegaLinter is an amazing tool ❤️. MegaLinter helps us to be more effective and deliver better results – results that we can be proud of. The tips and tricks laid out here help us to use MegaLinter more effectively and efficiently. Contact us to learn more about how Flexion can help ensure software quality for your projects and empower you every step of the way.
Wes Dean, a Senior DevSecOps Engineer at Flexion, brings his extensive experience in the UNIX and Linux world since the early 1990s to his role. He supports a variety of U.S. Federal agencies, helping them work safer, faster, more efficiently, and more securely. Wes’s unique position as a member of the CMS Open Source Program Office Advisory Board’s CMS Source Code Stewardship Taskforce underscores his expertise and credibility. He is also a staunch supporter of MegaLinter and a contributor to the tool’s prose scanning functionality, among other improvements.