Update CompilerImprovements

John Johansen 2021-01-28 11:06:44 +00:00
parent 24bc57b9bf
commit 217e603ab0

@ -1,38 +1,37 @@
This page tracks improvements made to the AppArmor compiler
performance over each release.
Areas for improvement
=====================
# Areas for improvement
#. lexer - Front end duplicate include elimination
1. lexer - Front end duplicate include elimination
- per block cache which files have already been included so they don't need to be processed multiple times
#. file cache (memory for speed)
1. file cache (memory for speed)
* Include parse caching
- cache include files post parsing so that they don't need to be re-parsed
* generic file cache
- cache files that have already been read so they don't need to be reread
- requires multiple profiles be processed by one compiler call to be effective (especially if duplicate include elimination is in effect)
#. Rule duplicate removal (only done on files)
1. Rule duplicate removal (only done on files)
- almost unneeded if fix for 1 is done
  - requires: 1 and 4 to fully replace it (so the code can be dropped)
#. aare -> pcre conversion
1. aare -> pcre conversion
  - add native parsing of aare to the backend, eliminating this step
#. early tree node merging
1. early tree node merging
- as part of tree creation instead of blindly creating new nodes, merge into existing tree as parsing
- requires: 3
#. tree simplification
1. tree simplification
  - store alt nodes in sorted vector
- store cat nodes in vector
- do tree factoring on vectors, does not require tree rewrites or left/right manipulations
- merge char nodes into charset nodes
- replace char nodes with charset node to remove duplicate code
#. dfa creation - currently creates sets and removes duplicates which means a lot of creation and freeing
1. dfa creation - currently creates sets and removes duplicates which means a lot of creation and freeing
- push anode and nnode split into expr tree for nullable, firstpos, lastpos and follow calculations
- removes need to split sets, reducing allocation of set just to split it apart and lookup its constituent anode and nnode sets
- push anode and nnode caching into expr tree so it can be used by nullable, firstpos, lastpos
@ -45,12 +44,12 @@ Areas for improvement
- Eliminate nnode cache?
      - it's possible that, between the nodevec merge cache and the final combined nnode/anode cache, the dedicated nnode cache is no longer beneficial
#. comb compression
1. comb compression
- change to sliding window algorithm, to reduce set of slot comparisons done
  - Note: diff compression also reduces the number of slot comparisons being done by reducing the number of transitions
#. parallel compile
1. parallel compile
- compile individual profiles in parallel - separating the compile at the profile level should be fairly easy
- requires removal of global state
- compile profile components in parallel (see partial compiles)
@ -60,14 +59,13 @@ Areas for improvement
- tree optimization -> DFA construction -> minimization -> diff-encode -> comb compression
    - since each stage is separate but dependent, separate threads could each work on a given stage
#. partial compiles - shared/precompiled
1. partial compiles - shared/precompiled
- share and potentially cache partial compile components
- includes are a natural boundary to precompile
- Requires dfa set operations
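
The first two items above (per-block duplicate include elimination and a generic file cache) can be sketched roughly as follows. This is only an illustrative sketch, not the AppArmor parser's implementation: the `FileCache` class, the `expand_block` function, and the simplified `include <path>` line syntax are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Set


class FileCache:
    """Hypothetical generic file cache: each path is read at most once per run."""

    def __init__(self, reader: Callable[[str], str]):
        self._reader = reader              # function that actually reads a file
        self._contents: Dict[str, str] = {}
        self.reads = 0                     # count of real reads, for illustration

    def get(self, path: str) -> str:
        if path not in self._contents:
            self._contents[path] = self._reader(path)
            self.reads += 1
        return self._contents[path]


def expand_block(lines: List[str], cache: FileCache,
                 seen: Optional[Set[str]] = None) -> List[str]:
    """Expand `include <path>` lines of one block, skipping repeated includes.

    `seen` is the per-block record of files that have already been included,
    so nested and repeated includes are processed only once per block.
    """
    if seen is None:
        seen = set()
    out: List[str] = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("include "):
            path = stripped[len("include "):].strip("<>\" ")
            if path in seen:
                continue                   # duplicate include within this block
            seen.add(path)
            out.extend(expand_block(cache.get(path).splitlines(), cache, seen))
        else:
            out.append(line)
    return out
```

In this sketch the `seen` set is scoped to one block, while the `FileCache` is shared across blocks, which is why the file cache only pays off when several profiles are processed by one compiler call.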
Improvements per Release
========================
# Improvements per Release
- 2.1 DFA introduced
- 2.3 tree factoring