apparmor/parser/libapparmor_re
Alfonso Sánchez-Beato 5aab543a3b parser: replace dynamic_cast with is_type method
The dynamic_cast operator is slow as it needs to look at RTTI
information and even does some string comparisons, especially in deep
hierarchies like the one for Node. Profiling with callgrind showed
that dynamic_cast can eat a huge portion of the running time, as it
takes most of the time that is spent in the simplify_tree()
function. For some complex profiles, the number of calls to
dynamic_cast can be in the range of millions.

This commit replaces the use of dynamic_cast in the Node hierarchy
with a method called is_type(), which returns true if the pointer can
be cast to the specified type. It works by looking at a Node object
field that is an integer with bits set for each type up in the
hierarchy. Therefore, dynamic_cast is replaced by a simple bitwise
operation.
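
A minimal sketch of the idea follows. The class names, bit names and
bit values are illustrative assumptions, not the definitions used in
expr-tree.h; only the technique (one bit per class, OR-ed in by each
constructor, tested with a mask) is taken from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type bits, one per class in a reduced Node hierarchy.
enum NodeTypeBits : std::uint32_t {
    NODE_TYPE_NODE  = 1u << 0,
    NODE_TYPE_INNER = 1u << 1,
    NODE_TYPE_CAT   = 1u << 2,
    NODE_TYPE_LEAF  = 1u << 3,
    NODE_TYPE_CHAR  = 1u << 4,
};

class Node {
public:
    Node() : type_flags(NODE_TYPE_NODE) {}
    virtual ~Node() = default;
    // True if this object is of the given type or derives from it.
    bool is_type(std::uint32_t type) const { return (type_flags & type) != 0; }
protected:
    std::uint32_t type_flags;  // one bit per class up the hierarchy
};

// Each constructor adds its own bit on top of the bits set by its bases.
class InnerNode : public Node {
public:
    InnerNode() { type_flags |= NODE_TYPE_INNER; }
};

class CatNode : public InnerNode {
public:
    CatNode() { type_flags |= NODE_TYPE_CAT; }
};

class LeafNode : public Node {
public:
    LeafNode() { type_flags |= NODE_TYPE_LEAF; }
};

class CharNode : public LeafNode {
public:
    CharNode() { type_flags |= NODE_TYPE_CHAR; }
};

// Stand-in for dynamic_cast<InnerNode *>(n): bit test, then static_cast.
inline InnerNode *as_inner(Node *n) {
    return (n && n->is_type(NODE_TYPE_INNER)) ? static_cast<InnerNode *>(n)
                                              : nullptr;
}
```

With this layout a CatNode answers true for the Node, InnerNode and
CatNode bits and false for the leaf bits, without consulting RTTI.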

This change can reduce the compilation times for some profiles by more
than 50%, especially on arm/arm64. This opens the door to possibly
dropping "-O no-expr-simplify" in the snapd daemon, as that option
would now make compilation slower in almost all cases.

This is the example profile used in some of my tests. With this change,
the run time is around 1/3 of what it was before on an x86 laptop:

profile "test" (attach_disconnected,mediate_deleted) {
dbus send
    bus={fcitx,session}
    path=/inputcontext_[0-9]*
    interface=org.fcitx.Fcitx.InputContext
    member="{Close,Destroy,Enable}IC"
    peer=(label=unconfined),
dbus send
    bus={fcitx,session}
    path=/inputcontext_[0-9]*
    interface=org.fcitx.Fcitx.InputContext
    member=Reset
    peer=(label=unconfined),
dbus receive
    bus=fcitx
    peer=(label=unconfined),
dbus receive
    bus=session
    interface=org.fcitx.Fcitx.*
    peer=(label=unconfined),
dbus send
    bus={fcitx,session}
    path=/inputcontext_[0-9]*
    interface=org.fcitx.Fcitx.InputContext
    member="Focus{In,Out}"
    peer=(label=unconfined),
dbus send
    bus={fcitx,session}
    path=/inputcontext_[0-9]*
    interface=org.fcitx.Fcitx.InputContext
    member="{CommitPreedit,Set*}"
    peer=(label=unconfined),
dbus send
    bus={fcitx,session}
    path=/inputcontext_[0-9]*
    interface=org.fcitx.Fcitx.InputContext
    member="{MouseEvent,ProcessKeyEvent}"
    peer=(label=unconfined),
dbus send
    bus={fcitx,session}
    path=/inputcontext_[0-9]*
    interface=org.freedesktop.DBus.Properties
    member=GetAll
    peer=(label=unconfined),
dbus (send)
    bus=session
    path=/org/a11y/bus
    interface=org.a11y.Bus
    member=GetAddress
    peer=(label=unconfined),
dbus (send)
    bus=session
    path=/org/a11y/bus
    interface=org.freedesktop.DBus.Properties
    member=Get{,All}
    peer=(label=unconfined),
dbus (receive, send)
    bus=accessibility
    path=/org/a11y/atspi/**
    peer=(label=unconfined),
dbus (send)
    bus=system
    path=/org/freedesktop/Accounts
    interface=org.freedesktop.DBus.Introspectable
    member=Introspect
    peer=(label=unconfined),
dbus (send)
    bus=system
    path=/org/freedesktop/Accounts
    interface=org.freedesktop.Accounts
    member=FindUserById
    peer=(label=unconfined),
dbus (receive, send)
    bus=system
    path=/org/freedesktop/Accounts/User[0-9]*
    interface=org.freedesktop.DBus.Properties
    member={Get,PropertiesChanged}
    peer=(label=unconfined),
dbus (send)
    bus=session
    interface=org.gtk.Actions
    member=Changed
    peer=(name=org.freedesktop.DBus, label=unconfined),
dbus (receive)
    bus=session
    interface=org.gtk.Actions
    member={Activate,DescribeAll,SetState}
    peer=(label=unconfined),
dbus (receive)
    bus=session
    interface=org.gtk.Menus
    member={Start,End}
    peer=(label=unconfined),
dbus (send)
    bus=session
    interface=org.gtk.Menus
    member=Changed
    peer=(name=org.freedesktop.DBus, label=unconfined),
dbus (send)
    bus=session
    path="/com/ubuntu/MenuRegistrar"
    interface="com.ubuntu.MenuRegistrar"
    member="{Register,Unregister}{App,Surface}Menu"
    peer=(label=unconfined),
}
2021-02-16 10:23:10 +01:00
aare_rules.cc parser: replace dynamic_cast with is_type method 2021-02-16 10:23:10 +01:00
aare_rules.h parser: don't apply exec mapping computations to the policydb 2020-09-29 03:34:47 -07:00
apparmor_re.h Fix dfa minimization 2014-01-09 17:06:48 -08:00
chfa.cc parser: Fix warnings in chfa.cc 2020-06-03 16:29:58 -07:00
chfa.h add ability to use out of band transitions 2019-11-26 21:32:08 -08:00
expr-tree.cc parser: replace dynamic_cast with is_type method 2021-02-16 10:23:10 +01:00
expr-tree.h parser: replace dynamic_cast with is_type method 2021-02-16 10:23:10 +01:00
flex-tables.h add ability to use out of band transitions 2019-11-26 21:32:08 -08:00
hfa.cc parser: replace dynamic_cast with is_type method 2021-02-16 10:23:10 +01:00
hfa.h treewide: spelling/typo fixes in comments and docs 2020-12-01 12:47:11 -08:00
Makefile parser: allow overriding which ar(1) is invoked 2019-07-08 12:28:30 -07:00
parse.h Split out parsing and expression trees from regexp.y 2011-03-13 05:46:29 -07:00
parse.y treewide: spelling/typo fixes in comments and docs 2020-12-01 12:47:11 -08:00
README treewide: spelling/typo fixes in comments and docs 2020-12-01 12:47:11 -08:00

apparmor_re.h - control flags for hfa generation
expr-tree.{h,cc} - abstract syntax tree (ast) built from a regex parse
parse.{h,y} - code to parse a regex into an ast
hfa.{h,cc} - code to build and manipulate a hybrid finite automaton (state
             machine).
flex-tables.h - basic defines used by chfa
chfa.{h,cc} - code to build a highly compressed runtime readonly version
              of an hfa.
aare_rules.{h,cc} - code that binds parse -> expr-tree -> hfa generation
                    -> chfa generation into a basic interface for converting
                    rules to a runtime ready state machine.

Regular Expression Scanner Generator
====================================

Notes on the Scanner File Format
--------------------------------

The file format used is based on the GNU flex table file format
(--tables-file option; see Table File Format in the flex info pages and
the flex sources for documentation). The magic number used in the header
is set to 0x1B5E783D instead of 0xF13C57B1 though, which is meant to
indicate that the file format logically is not the same: the YY_ID_CHK
(check) and YY_ID_DEF (default) tables are used differently.

Flex uses state compression to store only the differences between states
for states that are similar. The amount of compression influences the parse
speed.

The following two states could be stored as in the tables outlined
below:

States and transitions on specific characters to next states
------------------------------------------------------------
 1: ('a' => 2, 'b' => 3, 'c' => 4)
 2: ('a' => 2, 'b' => 3, 'd' => 5)

Flex-like table format
----------------------
index: (default, base)
    0: (      0,    0)  <== dummy state (nonmatching)
    1: (      0,    0)
    2: (      1,  256)

  index: (next, check)
      0: (   0,     0)  <== unused entry
	 (   0,     1)  <== ord('a') identical entries
  0+'a': (   2,     1)
  0+'b': (   3,     1)
  0+'c': (   4,     1)
	 (   0,     1)  <== (255 - ord('c')) identical entries
256+'c': (   0,     2)
256+'d': (   5,     2)

Here, state 2 is described as ('c' => 0, 'd' => 5), and everything else
as in state 1. The matching algorithm is as follows.

Flex-like scanner algorithm
---------------------------
  /* current state is in <state>, input character <c> */
  while (check[base[state] + c] != state)
    state = default[state];
  state = next[base[state] + c];
  /* continue with the next input character */

This state compression algorithm performs well, except when there are
many inverted or wildcard matches ("[^x]", "."). Each input character
may cause several iterations in the while loop.
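
The flex-like lookup can be sketched in C++ using the example tables
above. This is only an illustration: the struct and function names are
made up here, the default table is called def because default is a C++
keyword, and the guards for states without a table row (including the
dummy state 0) are additions of this sketch so that the loop always
terminates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Tables for states 1 and 2 from the flex-like example.
struct FlexTables {
    std::array<unsigned, 3> def{{0, 0, 1}};
    std::array<unsigned, 3> base{{0, 0, 256}};
    std::array<unsigned, 512> next{};
    std::array<unsigned, 512> check{};

    FlexTables() {
        // State 1 owns the whole block at base 0, except reserved entry 0.
        for (std::size_t i = 1; i < 256; ++i)
            check[i] = 1;
        next['a'] = 2;
        next['b'] = 3;
        next['c'] = 4;
        // State 2 stores only its differences from state 1.
        check[256 + 'c'] = 2;  next[256 + 'c'] = 0;
        check[256 + 'd'] = 2;  next[256 + 'd'] = 5;
    }
};

// One transition: follow default (parent) links until an entry owned by
// the current state is found, then read next from that same slot.
inline unsigned flex_step(const FlexTables &t, unsigned state, unsigned char c) {
    // Guard added in this sketch: no table row means nonmatching.
    if (state == 0 || state >= t.base.size())
        return 0;
    while (t.check[t.base[state] + c] != state) {
        state = t.def[state];
        if (state == 0)  // fell through to the dummy state
            return 0;
    }
    return t.next[t.base[state] + c];
}
```

For state 2 and input 'a' the first probe misses, so the loop falls
back to state 1 before finding the entry; that extra iteration is the
cost the paragraph above refers to.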


We will have many inverted character classes ("[^/]") that wouldn't
compress very well. Therefore, the regexp matcher uses no state
compression, and uses the check and default tables differently. The
above states could be stored as follows:

Regexp table format
-------------------

index: (default, base)
    0: (      0,    0)  <== dummy state (nonmatching)
    1: (      0,    0)
    2: (      0,    3)

  index: (next, check)
      0: (   0,     0)  <== unused entry
	 (   0,     0)  <== ord('a') identical, unused entries
  0+'a': (   2,     1)
  0+'b': (   3,     1)
  0+'c': (   4,     1)
  3+'a': (   2,     2)
  3+'b': (   3,     2)
  3+'c': (   0,     0)  <== entry is unused
  3+'d': (   5,     2)
	 (   0,     0)  <== (255 - ord('d')) identical, unused entries

All the entries with 0 in check (except the first entry, which is
deliberately reserved) are still available for other states that
fit in there.

Regexp scanner algorithm
------------------------
  /* current state is in <state>, matching character <c> */
  if (check[base[state] + c] == state)
    state = next[base[state] + c];
  else
    state = default[state];
  /* continue with the next input character */
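
As a rough C++ sketch of this lookup over the example states 1 and 2:
the names below are invented for illustration, def stands in for the
default table (default is a C++ keyword), and next is read from the
same slot (base[state] + c) that check was read from. The sketch
assumes that the default entry of state 2 is 0, so that 'c', which
state 2 does not match, leads to the nonmatching state.

```cpp
#include <array>
#include <cassert>
#include <string>

// Uncompressed tables: every state stores all of its own transitions.
struct RegexpTables {
    std::array<unsigned, 3> def{{0, 0, 0}};   // next state when check misses
    std::array<unsigned, 3> base{{0, 0, 3}};
    std::array<unsigned, 3 + 256> next{};
    std::array<unsigned, 3 + 256> check{};

    RegexpTables() {
        set(1, 'a', 2);  set(1, 'b', 3);  set(1, 'c', 4);
        set(2, 'a', 2);  set(2, 'b', 3);  set(2, 'd', 5);
    }

    // Claim the slot base[state] + c for this state.
    void set(unsigned state, unsigned char c, unsigned target) {
        next[base[state] + c] = target;
        check[base[state] + c] = state;
    }
};

// One transition: a single probe, no loop.
inline unsigned regexp_step(const RegexpTables &t, unsigned state,
                            unsigned char c) {
    if (state >= t.base.size())  // guard added here: no row, nonmatching
        return 0;
    const unsigned pos = t.base[state] + c;
    return t.check[pos] == state ? t.next[pos] : t.def[state];
}

// Feed a whole string through the tables, starting from the given state.
inline unsigned regexp_run(const RegexpTables &t, unsigned state,
                           const std::string &input) {
    for (unsigned char c : input)
        state = regexp_step(t, state, c);
    return state;
}
```

Note how 0+'d' and 3+'a' are the same slot: state 1 has no 'd'
transition, so state 2 can claim the entry, and the check value keeps
the two states apart.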

This representation and algorithm allow states that match more
characters than they reject to be represented as their inverse.
For example, a third state that accepts everything other than 'a' can
be added to the tables as one entry in (default, base) and one entry in
(next, check):

State
-----
 3: ('a' => 0, everything else => 5)

Regexp tables
-------------
index: (default, base)
    0: (      0,    0)  <== dummy state (nonmatching)
    1: (      0,    0)
    2: (      0,    3)
    3: (      5,    7)

  index: (next, check)
      0: (   0,     0)  <== unused entry
	 (   0,     0)  <== ord('a') identical, unused entries
  0+'a': (   2,     1)
  0+'b': (   3,     1)
  0+'c': (   4,     1)
  3+'a': (   2,     2)
  3+'b': (   3,     2)
  3+'c': (   0,     0)  <== entry is unused
  3+'d': (   5,     2)
  7+'a': (   0,     3)
	 (   0,     0)  <== (255 - ord('a')) identical, unused entries

While the current code does not implement any form of state compression,
the flex state compression representation could be combined with it by
remembering (in a bit per state, for example) which default entries
refer to inverted matches, and which refer to parent states.
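
One way such a combination might look is sketched below. Nothing here
reflects existing code: the def_is_parent flag, the table layout and
the function name are assumptions made for illustration. State 2 is
stored flex-style as a difference from its parent state 1, state 3 is
stored inverted with 5 as its default target, and the per-state flag
tells the matcher how to read the default entry.

```cpp
#include <array>
#include <cassert>

struct HybridTables {
    // Index:                              0      1      2     3
    std::array<unsigned, 4> def{{          0,     0,     1,    5}};
    std::array<unsigned, 4> base{{         0,     0,     2,    6}};
    // true:  def[] names a parent state to retry the lookup in
    // false: def[] is itself the next state (inverted match)
    std::array<bool, 4> def_is_parent{{false, false, true, false}};
    std::array<unsigned, 6 + 256> next{};
    std::array<unsigned, 6 + 256> check{};

    HybridTables() {
        set(1, 'a', 2);  set(1, 'b', 3);  set(1, 'c', 4);
        set(2, 'c', 0);  set(2, 'd', 5);  // differences from state 1
        set(3, 'a', 0);                   // the only exception in state 3
    }

    // Claim the slot base[state] + c for this state.
    void set(unsigned state, unsigned char c, unsigned target) {
        next[base[state] + c] = target;
        check[base[state] + c] = state;
    }
};

inline unsigned hybrid_step(const HybridTables &t, unsigned state,
                            unsigned char c) {
    while (state < t.base.size()) {
        const unsigned pos = t.base[state] + c;
        if (t.check[pos] == state)
            return t.next[pos];      // explicit entry for this state
        if (!t.def_is_parent[state])
            return t.def[state];     // inverted match: default is the target
        state = t.def[state];        // compressed state: retry in the parent
    }
    return 0;                        // no table row: nonmatching
}
```

The loop only repeats for states flagged as compressed, so inverted
states such as state 3 still resolve in a single probe.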