Modern static application security testing (SAST) tools are typically used for two main purposes: finding bugs and finding violations of coding standards. CodeSonar's primary purpose is the former: it was originally designed to detect serious security flaws such as memory errors, API misuse, and concurrency issues. It is, however, fully capable of the latter as well, including checking code against the most popular coding standard, MISRA C 2012.
When developers are required to adhere to a coding standard, they look for a tool that can help them find violations. One of the metrics they use to compare tools is coverage: the proportion of rules that the tool claims to check. A naive strategy is to choose the tool that claims the highest coverage of the standard.
Unfortunately, the notion of coverage is not well defined. Since there is no independent source of information for comparing coverage between tools, customers must trust vendors to interpret the term reasonably and to report their coverage fairly. Some vendors shamelessly exaggerate coverage to gain a competitive advantage, and in doing so confuse customers.
In this article, I explain why coverage is a slippery notion, which I hope will in turn help customers make informed decisions about which tool to select.
Some coverage is easy
Some rules are so simple that it’s easy to write a checker that finds all violations with no false positives. For such rules, coverage is easy: the tool either detects violations or it does not, with no middle ground. In MISRA C 2012, these rules are labeled Decidable. If a violation can be detected by looking at only one compilation unit, the rule is also labeled Single Translation Unit. For example, Rule 4.2, which prohibits the use of trigraphs, is labeled that way. If a tool claims to cover this category of rules, then it is perfectly reasonable to believe that claim.
Things get a little murkier when a violation can only be found reliably by examining multiple compilation units together (such rules are labeled System in the standard). For example, MISRA C 2012 Rule 5.1, “External identifiers shall be distinct”, is certainly decidable, but the only way for a tool to reliably find a violation is to examine all compilation units and compare the identifiers found in each.
If a tool claims full coverage of a rule with System scope, then it is reasonable to expect that the tool can also find all the compilation units that contribute to the program. Over-approximating or under-approximating that set can lead to both false positives and false negatives. Humans routinely get it wrong, so a user of a tool that offers no automatic way to determine the set runs the risk of getting incorrect results.
Automatic techniques are surprisingly difficult to get right. The most effective approach is one that integrates tightly with the build system, since the build is usually the most reliable source of that information.
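To illustrate why a rule like 4.2 is easy to cover, here is a minimal sketch of a trigraph detector. The function name `count_trigraphs` is hypothetical and does not come from any real tool. The sketch assumes the source text is already in memory as a string and ignores where in the file the trigraph occurs.

```c
#include <stddef.h>
#include <string.h>

/* Count trigraph sequences in a buffer of source text.
 * A trigraph is two question marks followed by one of nine characters.
 * Only the third characters are listed below, so that this file itself
 * contains no trigraphs. */
size_t count_trigraphs(const char *text)
{
    static const char third_chars[] = "=/'()!<>-";
    const size_t len = strlen(text);
    size_t count = 0;
    size_t i = 0;

    /* i + 2 < len guarantees text[i + 2] is a real character,
     * never the terminating null. */
    while (i + 2 < len) {
        if (text[i] == '?' && text[i + 1] == '?' &&
            strchr(third_chars, text[i + 2]) != NULL) {
            count++;
            i += 3;   /* skip past the whole trigraph */
        } else {
            i++;
        }
    }
    return count;
}
```

A single pass over one file is enough, and every match is a genuine violation, which is what makes the coverage claim for this rule straightforward. When calling the function from C, write test strings with the `\?` escape so the compiler does not translate the trigraph before the detector sees it.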
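The cross-unit comparison can be sketched as follows. This is an illustration only: the `ext_ident` record and both function names are hypothetical, and the limit of 31 significant characters is an assumption (the C99 minimum for external identifiers) that a real tool would take from the implementation in use.

```c
#include <stddef.h>
#include <string.h>

/* Assumed number of significant characters in an external identifier
 * (the C99 minimum); a real tool would take this from the implementation. */
#define SIGNIFICANT_CHARS 31

/* Hypothetical record: one external identifier and the translation
 * unit that defines it (the unit is kept only for reporting). */
struct ext_ident {
    const char *name;
    const char *unit;
};

/* Two identifiers clash if they are different names that agree in
 * their first SIGNIFICANT_CHARS characters. */
static int idents_clash(const char *a, const char *b)
{
    return strcmp(a, b) != 0 && strncmp(a, b, SIGNIFICANT_CHARS) == 0;
}

/* Count clashing pairs across the identifiers of the whole program. */
size_t count_clashes(const struct ext_ident *ids, size_t n)
{
    size_t clashes = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (idents_clash(ids[i].name, ids[j].name)) {
                clashes++;
            }
        }
    }
    return clashes;
}
```

The comparison itself is trivial. The result is only as good as the list passed in: if an identifier from one compilation unit is missing from `ids`, a clash involving it is silently missed.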
In MISRA C 2012, some rules are labeled Undecidable, which means that it is fundamentally impossible to have a method that can, in general, say with certainty whether a violation is present. Because of this property, the author of a checker must balance the risk of false positives against the risk of false negatives. Most of these rules require analysis that can reason about program execution, so only the most sophisticated SAST tools can do a good job. A good example is MISRA C 2012 Rule 17.2, which prohibits recursion, both direct and indirect (for example, through calls made via function pointers).
The problem is that coverage claims are often made with no regard to whether the tool is good or bad at finding violations. If a tool can only find the most obvious and superficial violations, is it reasonable to claim that it covers the rule? The other side of the coin is also worth considering: if a tool finds all violations, but also reports so many false positives that it’s impossible to inspect them all, is it fair to say it has coverage?
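As an illustration of why Rule 17.2 is harder than it looks, the sketch below shows three forms of recursion in increasing order of difficulty for a tool. All function names are made up for this example.

```c
/* Direct recursion: visible from the function body alone. */
unsigned factorial(unsigned n)
{
    return (n <= 1u) ? 1u : n * factorial(n - 1u);
}

/* Indirect recursion through a second function: finding it needs a
 * call graph, possibly spanning several compilation units. */
int is_odd(unsigned n);
int is_even(unsigned n) { return (n == 0u) ? 1 : is_odd(n - 1u); }
int is_odd(unsigned n)  { return (n == 0u) ? 0 : is_even(n - 1u); }

/* Indirect recursion through a function pointer: the cycle is only
 * visible once the tool works out which functions `step` may point to. */
typedef unsigned (*step_fn)(unsigned);
static unsigned countdown(unsigned n);
static step_fn step = countdown;

static unsigned countdown(unsigned n)
{
    return (n == 0u) ? 0u : step(n - 1u);
}

unsigned run_countdown(unsigned n)
{
    return countdown(n);
}
```

A purely syntactic checker would likely report only `factorial`. Here `step` is never reassigned, so the pointer case is still simple; in real code the pointer's value may depend on program state, which is where the trade-off between false positives and false negatives comes from.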
Scope of coverage
The final aspect of rule coverage that complicates matters is that coding standards are usually defined quite loosely, whereas SAST tools must have a precise definition of the properties they are looking for. It is therefore common for a checker to find a property that is either a superset or a subset of what the rule requires.
For example, consider CodeSonar’s coverage of MISRA C 2012 Rule 2.2: “There shall be no dead code”. For the purposes of this rule, dead code is code that is executed, but whose removal cannot affect the behavior of the program. CodeSonar has an Unused Value checker, which finds places where a variable is assigned a value that is never used afterwards. All of these places violate the MISRA rule, but there are other ways to violate the rule that this checker does not detect. So the Unused Value checker covers only a subset of what the rule specifies, and other CodeSonar checkers fill in the gaps.
In some cases, the rule and the checker are not in a strict subset/superset relationship. They may overlap a lot or a little, or the checker may detect a property that is not a direct violation but is very likely to lead to a violation of the rule.
For CodeSonar, our policy is to claim coverage only if there is a large overlap between what the rule specifies and what our checker finds, and where the checker does not give warnings that would reasonably be judged false positives for this rule (notwithstanding that they may otherwise be true positives).
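The subset relationship can be illustrated with two small functions. This is a sketch with made-up names; it does not describe what any particular CodeSonar checker reports.

```c
/* The first value assigned to `scaled` is never read. An unused-value
 * style of checker would typically flag this. */
int scale_unused_value(int raw)
{
    int scaled = raw * 2;   /* overwritten before any use */
    scaled = raw * 4;
    return scaled;
}

/* Here every assigned value is used afterwards, so there is no unused
 * value to report. The addition of zero is still executed, and removing
 * it cannot change the program's behavior, so it is dead code in the
 * sense of Rule 2.2. */
int scale_no_effect(int raw)
{
    int scaled = raw * 4;
    scaled += 0;            /* executed, but has no effect */
    return scaled;
}
```

Both functions return the same result and both contain dead code, but only the first matches the unused-value pattern, which is why one checker alone gives partial coverage of the rule.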
One rule to ring them all
There is one particular MISRA C 2012 rule for which this problem is acute. Alarm bells should ring if you see a SAST tool (especially one of the more superficial ones) claiming coverage of MISRA C 2012 Rule 1.3: “There shall be no occurrence of undefined or critical unspecified behaviour”. This rule is so broad that it requires an additional 10-page appendix listing some of the specific things to avoid. This in turn refers to the C standards: C90 and C99 list 230 instances of undefined behavior (65 of which are not covered by any other MISRA rule) and 51 instances of critical unspecified behavior (17 of which are not covered by any other MISRA rule). Further, the rationale for Directive 4.1 adds: “the presence of a runtime error indicates a violation of rule 1.3.”
Rule 1.3 therefore prohibits an enormous range of behavior, including null pointer dereferences, buffer overflows, use of uninitialized memory, data races, use-after-free errors, and many other hazards of programming in C.
The problem with rule coverage claims should now be clear: although this one rule (of the standard’s 143) is only 0.7% of the standard, it covers perhaps 50% of the really nasty kinds of failure that can happen to a C program.
Additionally, if a tool is to claim coverage of Rule 1.3, it must have checkers that have a good-sized intersection with all of these undesirable behaviors. If a tool can only find a tiny percentage of them, it is unreasonable for it to claim coverage.
Many SAST tools that claim to find violations of MISRA rules are fairly superficial (such as those in the Lint family), and as such are very weak at finding violations of Rule 1.3, even though they claim coverage. In contrast, advanced SAST tools such as CodeSonar are explicitly designed to find the kinds of runtime errors that constitute violations of Rule 1.3. Analyses that can find such defects with reasonable precision must be whole-program, path-sensitive, aware of dangerous information flows, and able to reason about concurrently executing threads.
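A few of these behaviors can be sketched in code. The functions below are hypothetical and each is well defined for some inputs and undefined for others, which is the typical shape of a Rule 1.3 violation: nothing on any single line looks wrong.

```c
#include <stddef.h>

/* Buffer overrun when index >= len: the length is passed in but the
 * bound is never checked. */
int read_sample(const int *samples, size_t len, size_t index)
{
    (void)len;              /* available, but never consulted */
    return samples[index];
}

/* Null pointer dereference when config is NULL. */
int get_timeout(const int *config)
{
    return *config;
}

/* Use of an uninitialized variable when mode is neither 0 nor 1. */
int pick_gain(int mode)
{
    int gain;
    if (mode == 0) {
        gain = 1;
    } else if (mode == 1) {
        gain = 10;
    }
    return gain;
}
```

Whether any of these is an actual violation depends on the values that reach the function from its callers, so a tool has to follow paths and data flow through the program rather than match a local pattern. Some compilers may warn about `pick_gain`, but the other two compile silently.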
Evaluating claims of MISRA coverage
In conclusion, let me summarize the most important points:
- There is no good, widely accepted definition of MISRA coverage, even for rules that are decidable.
- It is fundamentally impossible to have a perfect checker for a rule that is undecidable. False positives and false negatives for these rules are inevitable.
- Checkers and rules don’t always intersect perfectly.
- Claims of high coverage of MISRA rules should not be taken at face value. Vendors have an incentive to exaggerate and face no penalty for doing so.
- Rule 1.3 of MISRA C 2012 encompasses a large number of behaviors to avoid. Only sophisticated SAST tools whose primary purpose is to find such bugs are good at finding violations of this rule.
When deciding which tool to use, the questions to ask are: “What are the real issues in my code? Are they just coding standard violations, or real bugs? Does this tool actually find the problems I need to find?” The best way to answer these questions is to try the tool on your own code and evaluate the results rationally.
To learn more, we invite you to download and read this white paper, “Accelerating MISRA Automotive Safety Compliance with Static Application Security Testing.”
*** This is a syndicated blog from the Security Bloggers Blog Network written by Christian Simko. Read the original post at: https://blogs.grammatech.com/the-minefields-of-misra-coverage-1