Dataflow Analysis
What is dataflow analysis?
- Static analysis reasoning about the flow of data within a program
- Different kinds of data can be tracked with dataflow analysis:
- Constants
- Variables
- Expressions
- Dataflow analysis is used by bug-finding tools and compilers
Soundness, completeness, and termination
It is impossible for a software analysis to achieve all three of these objectives - dataflow analysis sacrifices completeness. Dataflow analysis is sound in that it will report all facts that could occur in actual runs. Dataflow analysis is incomplete in that it may report additional facts that can't occur in actual runs.
Abstracting control-flow conditions
Dataflow analysis abstracts away control-flow conditions that contain non-deterministic choices - it assumes that some conditions are able to evaluate to true or false. It considers all paths possible in actual runs (sound) and maybe paths that are never possible during (incomplete).
Applications of dataflow analysis
- Reaching definitions analysis
- This software analysis application using dataflow analysis is aimed at finding the usage of uninitialized variables in the program
- Very busy expressions analysis
- This software analysis application using dataflow analysis is aimed at reducing code size
- Available expressions analysis
- This software analysis application using dataflow analysis is aimed at avoiding recomputing expressions - useful for compiler design
- Live variables analysis
- This software analysis application using dataflow analysis is aimed at allocating registers efficiently for data storage - useful for compiler design
Reaching definitions analysis
The goal of reaching definitions analysis is to determine, for each program point, which assignments have been made and not overwritten when execution reaches that point along some path
Does the reaching definitions analysis always terminate?
Yes, the chaotic iteration algorithm used by RDA always terminates. This is because:
- The two operations of RDA are monotonic - meaning the IN and OUT sets never shrink, only grow
- Largest the IN and OUT sets can be are all definitions in the program
- they cannot grow forever
- IN and OUT will stop changing after some iteration
Very busy expressions analysis
The goal of very busy expressions analysis is to identify very busy expression at the exit from the point. An expression is very busy if, no matter what path is taken, the expression is used before any of the variables used in it are redefined.
Available expression analysis
The goal of available expressions analysis is to identify, for each program point, which expressions must already have been computed, and not later modified, on all paths to the program point.
Live variables analysis
The goal of live variables analysis is to identify, for each program point, which variables could be live at the point's exit. A variables is live if there is a path to a use of the variable that doesn't redefine the variable.