A bit late (the deadline for submission is today), but here are my notes on the version currently at http://projects.webappsec.
My comments/notes are marked as content to add in underscore, bold and italic, or [content to be deleted in red].
When I wanted to make a comment on a particular change or deletion, I did it on a new line:
DC Comment: ... a comment goes here in dark blue
- Operational Criteria - These are generic items that are desired of any application that wants to be deployed in an enterprise (or to a large number of users). Anything that is not specific to analysing an application for security issues (see next point) should be here. For example: installation, deployability, standards, licensing, etc. (in fact this could be a common document/requirement across the multiple WASC/OWASP published criteria)
- Static Analysis Criteria - Here is where all items that are relevant to a static analysis tool should exist. These items should be specific and non-generic. For example:
- 'the rules used by the engine should be exposed and consumable' is an operational criterion (all tools should allow that)
- 'the rules used by the engine should support taint-flow analysis' is a static analysis criterion (since only static analysis tools do taint-flow analysis)
Below I marked each topic with either [Operational Criteria] or [Static Analysis Criteria]
- Can the tool 'connect' traces from server-side code to traces in client-side code?
- Can the tool understand the context in which the server-side output is used on the client side (for example, the difference between a Response.Write/TagLib being used to output data into an HTML element vs. an HTML attribute)? See the sketch below.
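To make that context point concrete, here is a minimal servlet-style sketch (the parameter name and markup are illustrative, not taken from the criteria document): the same tainted value lands in two different HTML contexts, and each context needs a different encoding rule, so an engine that cannot tell them apart cannot rate the sink correctly.

```java
import java.io.PrintWriter;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ContextDemo {
    void render(HttpServletRequest request, HttpServletResponse response) throws Exception {
        String name = request.getParameter("name"); // tainted source
        PrintWriter out = response.getWriter();

        // Context 1: HTML element content - HTML-entity encoding is sufficient here
        out.println("<p>Hello " + name + "</p>");

        // Context 2: HTML attribute value - attribute encoding rules apply, and an
        // unquoted attribute is exploitable without any '<' or '>' characters at all
        out.println("<input name=user value=" + name + ">");
    }
}
```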
DC Comment: customisation is (from my point of view) THE most important differentiator of an engine, since out-of-the-box most commercial scanners are kind-of equivalent (i.e. they all work well in some areas and really struggle in others).
Here are some important areas to take into account when talking about customisation:
- Ability to access (or even better, to manipulate) the internal-representations of the code/app being analysed
- Ability to extend the current types of rules and findings (for example, being able to add app/framework-specific authorization analysis)
- Open (or at least known/published) schemas for the tool's rules, findings and intermediate representations
- Ability for the client to publish their own rules under a license of their choice
- REPL environment to test and develop those rules
- Clearly define and expose the types of findings/analysis that the tool's rules/engine are NOT able to find (ideally this should be application-specific)
- Provide the existing 'out-of-the-box' rules in an editable format (the best way to create a custom rule is to modify an existing one that does a similar job). This is a very important point, since (ideally) ALL rules and logic applied by the scanning engine should be customisable
- Ability to package rules, and to run selective sets of rules
- Ability to (re)run an analysis for one (1) type of issue
- Ability to (re)run an analysis for one (1) reported issue (or for a collection of the same issue)
- Ability to create unit tests that validate the existence of those rules
- Ability to create unit tests that validate the findings provided by the tool (see the test sketch below)
The last points are very important since they fit into how developers work (focused on a particular issue which they want to 'fix' before moving on to the next issue to 'fix')
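As a sketch of that last point, here is what such a finding-validation test could look like in JUnit. The RuleEngine and Finding names are entirely hypothetical (no real tool's API is being quoted); the point is only that rules and findings become assertable artefacts in the developer's normal workflow:

```java
import static org.junit.Assert.assertEquals;

import java.util.List;
import org.junit.Test;

public class CustomSqliRuleTest {

    @Test
    public void customRuleFlagsKnownVulnerableSample() {
        // RuleEngine and Finding are invented names standing in for a tool's API
        RuleEngine engine = new RuleEngine("rules/custom-sqli.rule");
        List<Finding> findings = engine.scan("test-samples/VulnerableDao.java");

        // The known-vulnerable sample must be flagged exactly once
        assertEquals(1, findings.size());
        assertEquals("SQL Injection", findings.get(0).type());
    }
}
```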
- Taint propagation (not all tools do this; FxCop, for example, does not)
- Handling of collections, setters/getters and HashMaps (for example, is the whole object tainted or just the exact key, and for how long? See the first sketch after this list)
- Event-driven flows (like the ones provided by ASP.NET HttpModules, ASP.NET MVC, Spring MVC, etc.)
- Memory/object manipulations (important for buffer overflows)
- String format analysis (i.e. what actually happens inside the format call, and what is being propagated)
- String Analysis (for regex and other string manipulations)
- Interfaces (and how they are mapped/used)
- Mapping views to controllers, and more importantly, mapping tainted data inserted in model objects used in views
- Views nesting (when a view uses another view)
- Views use of non-view APIs (or custom view controls/taglibs)
- Mapping of Authorization and Authentication models and strategies
- Mapping of internal methods that are exposed to the outside world (namely via WEB and REST services)
- Join traces (this is a massive topic, and one that, when supported, will allow the post-scan handling of a lot of the issues listed here)
- Modelling/visualization of the real size of the models used in MVC apps (to deal with Mass-Assignment/Auto-binding), and connecting them with the views used
- Mapping of multi-step data-flows (for example data in and out of the database, or multi-step forms/workflows). Think stored/second-order SQLi or XSS (see the second sketch after this list)
- Dependency injection
- AOP code (namely cross-cutting concerns)
- Validation/sanitisation code which can be applied via config changes, metadata or direct invocation
- Convention-based behaviours, where the app will behave in a particular way based on how (for example) a class is named (see the dispatcher sketch after this list)
- Ability to consume data from other tools (namely black-box scanners, threat modelling tools, risk assessment, CI, bug tracking, etc.), including other static analysis tools
- List the types of coding techniques that are 'scanner friendly' (and the ones that are not); for example, an app that uses HashMaps (to move data around) or has a strong event-driven architecture (with no direct connection between source and sink) is not very static-analysis friendly
- ....there are more, but hopefully this makes my point....
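To make a few of these points concrete, first a minimal sketch of the HashMap granularity question from the collections item above (the key names are made up): a precise engine taints only the value stored under 'comment', while a coarse engine taints everything read from the map and floods the results with false positives.

```java
import java.util.HashMap;
import java.util.Map;

public class TaintGranularityDemo {
    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("comment", args.length > 0 ? args[0] : ""); // tainted: attacker-controlled
        data.put("timestamp", "2013-01-01");                 // clean: constant

        String clean   = data.get("timestamp"); // false positive if the whole map is tainted
        String tainted = data.get("comment");   // true positive either way
        System.out.println(clean + " / " + tainted);
    }
}
```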
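Second, a sketch of the multi-step data-flow item (table and parameter names invented): the taint enters the database in one request and reaches an HTML sink in a later one, so an engine that only follows single-request flows never joins the two traces.

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.servlet.http.HttpServletRequest;

public class MultiStepFlowDemo {
    // Step 1 (request A): tainted data flows into persistent storage
    void saveComment(Connection db, HttpServletRequest req) throws SQLException {
        PreparedStatement ps = db.prepareStatement("INSERT INTO comments(body) VALUES (?)");
        ps.setString(1, req.getParameter("body")); // taint is stored; trace 1 ends here
        ps.executeUpdate();
    }

    // Step 2 (request B, possibly days later): the stored data reaches an HTML sink
    void showComments(Connection db, PrintWriter out) throws SQLException {
        ResultSet rs = db.createStatement().executeQuery("SELECT body FROM comments");
        while (rs.next()) {
            out.println("<p>" + rs.getString("body") + "</p>"); // stored XSS; trace 2 starts here
        }
    }
}
```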
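Finally, a toy convention-based dispatcher (entirely hypothetical, but in the spirit of many MVC stacks) illustrating the convention-based behaviours item: nothing in this code references UserController directly, yet it is reachable from a URL, so an engine that only follows explicit call edges never sees the entry point.

```java
import java.lang.reflect.Method;

public class ConventionDemo {
    public static class UserController {
        // web-reachable purely because of its name, not because anything calls it
        public String delete(String id) { return "deleted " + id; }
    }

    // "/user/delete" resolves to class "UserController" and method "delete" by naming convention
    static String dispatch(String path, String arg) throws Exception {
        String[] parts = path.substring(1).split("/");
        String controller = ConventionDemo.class.getName() + "$"
                + Character.toUpperCase(parts[0].charAt(0)) + parts[0].substring(1) + "Controller";
        Object instance = Class.forName(controller).getDeclaredConstructor().newInstance();
        Method action = instance.getClass().getMethod(parts[1], String.class);
        return (String) action.invoke(instance, arg); // 'arg' (tainted) reaches the action
    }

    public static void main(String[] args) throws Exception {
        System.out.println(dispatch("/user/delete", "42"));
    }
}
```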
As you can see, the list above is focused on the capabilities of a static analysis tool, not on the types of issues that tools 'claim' they can find.
All tools say they will detect SQL injection, but what is VERY IMPORTANT (and what matters) is the ability to map/rate all these 'capabilities' against the application being tested (i.e. to ask the question 'can vuln XYZ be found in the target application, given that it uses Framework XYZ and is coded using Technique XYZ?')
This last point is key, since most (if not all) tools today only provide results/information about what they found, not what they analysed.
I.e. if there are no findings of vuln XYZ, does that mean that there are no XYZ vulns in the app, or that the tool was not able to find them?
In a way, what we need is for tools to also report back the level of assurance that they have in their results (i.e. based on the code analysed, its coverage and the current set of rules, how sure is the tool that it found all issues?). A sketch of what such a report could contain follows.
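As a purely hypothetical sketch (no tool emits this today, and every field name here is invented), a result format that carried assurance alongside findings might look something like this:

```java
import java.util.List;
import java.util.Map;

// Invented report shape: findings plus an explicit account of what was
// (and was not) analysed, so 'no findings' can be told apart from 'no coverage'
public class ScanReport {
    public List<String> findings;           // what the tool found
    public List<String> rulesApplied;       // which rules actually ran
    public double modelledCodeFraction;     // how much of the app the engine understood
    public Map<String, String> blindSpots;  // e.g. "reflection calls" -> "not modelled"
    public double assurance;                // engine's own confidence that it found all issues
}
```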