Commit graph

33 commits

Author SHA1 Message Date
John Johansen
1ebd991155 parser: change priority so that it accumulates based on permissions
The current behavior of priority rules can be non-intuitive, with
higher priority rules completely overriding lower priority rules even in
permissions not held in common. This behavior does have use cases, but
it can be very confusing and does not match normal policy behavior.

Eg.
  priority=0 allow r /**,
  priority=1 deny  w /**,

will result in no allowed permissions, even though the deny rule is
only removing the w permission, because the higher priority rule
completely overrides the lower priority permission set (including
permissions not held in common).

Instead, move to tracking the priority at a per-permission level. This
allows the w permission to still override at priority 1, while the
read permission is allowed at priority 0.

The final constructed state will still drop the priority information for
the final permission set on the state.
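
To illustrate the idea, a minimal sketch (hypothetical names and layout,
not the parser's actual code) of accumulating rule permissions with a
priority tracked per permission bit rather than per rule:

  #include <array>
  #include <cstdint>

  constexpr int NBITS = 32;

  struct PermSet {
      uint32_t allow = 0, deny = 0;
      std::array<int, NBITS> prio;      // priority recorded per permission bit
      PermSet() { prio.fill(-1); }      // -1: no rule has touched this bit yet
  };

  void accumulate(PermSet &acc, int priority, uint32_t allow, uint32_t deny)
  {
      for (int i = 0; i < NBITS; i++) {
          uint32_t bit = 1u << i;
          if (!((allow | deny) & bit))
              continue;                 // this rule says nothing about the bit
          if (priority < acc.prio[i])
              continue;                 // lower priority never overrides the bit
          if (priority > acc.prio[i]) { // strictly higher: take over this bit
              acc.allow &= ~bit;
              acc.deny &= ~bit;
              acc.prio[i] = priority;
          }
          acc.allow |= allow & bit;     // equal priority: accumulate; deny
          acc.deny |= deny & bit;       // still masks allow when perms are
      }                                 // extracted for the final state
  }

With this, the priority=1 deny w claims only the w bit, while the r bit
keeps its priority 0 allow.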

Note: this patch updates the equality tests for the cases where
the complete override behavior was being tested for.

The complete override behavior will be reintroduced in a future
patch with a keyword extension, enabling that behavior to be used
for ordered blocks etc.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-02-06 11:02:20 -08:00
John Johansen
50452e1147 parser: add a hfa dump that matches the renumbered chfa
Construction of the chfa can reorder states from the numbering given
during hfa construction, because of reordering for better compression,
dead state removal to ensure better packing, etc.

This however means the dfa dump is difficult to match up to the chfa
that the kernel is using (it is possible using multiple dumps). Make
this easier by allowing the dfa dump to take the remapping as input,
and provide an option to dump the chfa-equivalent hfa.

Renumbered states will show up as {new == {orig}} in the dump

Eg.
-D dfa-states
{1} <== priority (allow/deny/prompt/audit/quiet)
{5} 0 (0x 4/0//0/0/0)

{1} perms: none
    0x2 -> {5}  0 (0x 4/0//0/0/0)
    0x4 -> {5}  0 (0x 4/0//0/0/0)
    \a 0x7 -> {5}  0 (0x 4/0//0/0/0)
    \t 0x9 -> {5}  0 (0x 4/0//0/0/0)
    \n 0xa -> {5}  0 (0x 4/0//0/0/0)
    \  0x20 -> {5}  0 (0x 4/0//0/0/0)
    4 0x34 -> {3}
{3} perms: none
    0x0 -> {6}
{6} perms: none
    1 0x31 -> {5}  0 (0x 4/0//0/0/0)

-D dfa-compressed-states
{1} <== priority (allow/deny/prompt/audit/quiet)
{2 == {5}} 0 (0x 4/0//0/0/0)

{1} perms: none
    0x2 -> {2 == {5}}  0 (0x 4/0//0/0/0)
    0x4 -> {2 == {5}}  0 (0x 4/0//0/0/0)
    \a 0x7 -> {2 == {5}}  0 (0x 4/0//0/0/0)
    \t 0x9 -> {2 == {5}}  0 (0x 4/0//0/0/0)
    \n 0xa -> {2 == {5}}  0 (0x 4/0//0/0/0)
    \  0x20 -> {2 == {5}}  0 (0x 4/0//0/0/0)
    4 0x34 -> {3}
{3} perms: none
    0x0 -> {4 == {6}}
{4 == {6}} perms: none
    1 0x31 -> {2 == {5}}  0 (0x 4/0//0/0/0)

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-03 14:18:50 -08:00
John Johansen
5d2a38e816 parser: add some new dfa dump options.
The dfa goes through several stages during the build. Allow dumping it
at the various stages instead of only at the end.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-12-22 15:02:04 -08:00
John Johansen
f86fda02f5 parser: fix 16 bit state limitation
The hfa stores next/check transitions in 16 bit fields to reduce memory
usage. However this means the state machine can only contain 2^16
states.

Allow the next/check tables to be 32 bit. This theoretically could allow
for 2^32 states; however, the base table uses the top 8 bits as flags,
leaving only 24 bits to index into the next/check tables. With
most states having at least 1 transition, this effectively caps the
number of states at 2^24.

To obtain 2^32 possible states a flags table needs to be added. Add
a skeleton around supporting a flags table, so we can note the remaining
work that needs to be done. This patch will only allow for 2^24 states.
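
A minimal sketch of the split described above (names are illustrative,
not the parser's actual identifiers):

  #include <cstdint>

  constexpr uint32_t BASE_FLAG_MASK  = 0xff000000; // top 8 bits hold flags
  constexpr uint32_t BASE_INDEX_MASK = 0x00ffffff; // low 24 bits index next/check

  inline uint32_t base_flags(uint32_t b) { return (b & BASE_FLAG_MASK) >> 24; }
  inline uint32_t base_index(uint32_t b) { return b & BASE_INDEX_MASK; }

Moving the flags out into their own table would free the full 32 bits of
a base entry for indexing.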

Bug: https://gitlab.com/apparmor/apparmor/-/issues/419

Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-08-14 17:01:30 -07:00
John Johansen
501e87a3f2 parser: Cleanup parser control flags so they display as expected to the user
Since we have room after the split of optimization and dump flags,
move all the optimization and dump flags into a common table instead
of having multiple tables.

If needed, we can switch the flag entry size to a long in the future.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-08 19:58:59 -07:00
John Johansen
e84e481263 parser: cleanup and rework optimization and dump flag handling
In preparation for more flags (not all of the backend dfa based),
rework the optimization and dump flag handling which has been exclusively
around the dfa up to this point.

- split dfa control and dump flags into separate fields. This gives more
  room for new flags in the existing DFA set
- rename DFA_DUMP and DFA_CONTROL to DUMP_DFA and CONTROL_DFA, as
  this provides more uniform naming for non-dfa flags
- group dump and control flags into a structure so they can be passed
  together (a minimal sketch follows below).
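
A rough sketch of the grouping, with hypothetical type and field names
rather than the parser's actual ones:

  #include <cstdint>

  typedef uint64_t flags_t;

  struct optflags {
      flags_t control;   // CONTROL_* optimization flags
      flags_t dump;      // DUMP_* debug dump flags
  };

Passing one struct keeps the two flag sets from being threaded through
every call signature separately.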

Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-07 17:47:41 -07:00
John Johansen
f0b154528d Fix dfa minimization
DFA minimization has a bug and feature that keeps it from minimizing
some dfas completely. This feature/bug did not result in incorrect dfas;
it just failed to achieve full minimization.

The same-mappings comparison is wrong. Or, more correctly, it is right when
transitions are not remapped to minimization partitions, but it may be
wrong when states are remapped. This means it can cause excess
partitioning (not removing all the states it should).

The trans hashing does a "guess" at partition splitting as a performance
enhancement. Basically it leverages the information that states that have
different transitions, or transitions on different characters, are not the
same. However this isn't always the case, because minimization can cause
some of those transitions to be altered. In previous testing this was
always a win, with only a few extra states being added sometimes. However
this changes when the same-mappings comparison is fixed, as the hashing
was based on the same flawed mapping as the broken same-mappings comparison.

If the same mappings are fixed and the hashing is not removed, then there
is little to no change. However with both changes applied some dfas see
significant improvements. These improvements often result in performance
improvements despite minimization doing more work, because there is less
work to be done in the chfa comb compression.
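
A minimal sketch (illustrative, not the parser's actual code) of the
corrected comparison: two states can share a partition only if, on every
input, their transition targets fall in the same partition, not merely
on the same original state:

  #include <map>
  #include <vector>

  using State = int;
  using Partition = int;
  using Trans = std::map<unsigned char, State>;

  // default transitions, compared the same way, are omitted for brevity
  bool same_mappings(const Trans &a, const Trans &b,
                     const std::vector<Partition> &part_of)
  {
      if (a.size() != b.size())
          return false;
      auto ib = b.begin();
      for (auto ia = a.begin(); ia != a.end(); ++ia, ++ib) {
          if (ia->first != ib->first)
              return false;   // transitions on different characters
          if (part_of[ia->second] != part_of[ib->second])
              return false;   // targets land in different partitions
      }
      return true;
  }

Comparing raw target states instead of their current partitions is what
kept equivalent states apart, causing the excess partitioning.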

eg. test case that raised the issue (thanks tyler)
  /t { mount fstype=ext2, mount, }

  used to be minimized to
   {1} <== (allow/deny/audit/quiet)
   {6} (0x 2/0/0/0)

   {1} -> {2}: 0x7
   {2} -> {3}: 0x0
   {2} -> {2}: []
   {3} -> {4}: 0x0
   {3} -> {3}: []
   {4} -> {6}: 0x0
   {4} -> {7}: 0x65 e
   {4} -> {5}: []
   {5} -> {6}: 0x0
   {5} -> {5}: []
   {6}  (0x 2/0/0/0) -> {6}: [^\0x0]
   {7} -> {6}: 0x0
   {7} -> {8}: 0x78 x
   {7} -> {5}: []
   {8} -> {6}: 0x0
   {8} -> {5}: 0x74 t
   {8} -> {5}: []

  with the patch it is now properly minimized to
    {1} <== (allow/deny/audit/quiet)
    {6} (0x 2/0/0/0)

    {1} -> {2}: 0x7
    {2} -> {3}: 0x0
    {2} -> {2}: []
    {3} -> {4}: 0x0
    {3} -> {3}: []
    {4} -> {6}: 0x0
    {4} -> {4}: []
    {6}  (0x 2/0/0/0) -> {6}: [^\0x0]


The evince profile set sees some significant improvements. Picking a couple of
examples from its "minimized" dfas (it has 12), we see a reduction from 9720
states to 6232 states, and from 6537 states to 3653 states. All told, the
performance/profile size goes from
  2.8 parser: 4.607s 1007267 bytes
  dev head:   3.48s  1007267 bytes
  min fix:    2.68s  549603 bytes

Of course evince is an extreme example, so a few more:

firefox
   2.066s   404549 bytes
 to
   1.336s   250585 bytes


cupsd
   0.365s   90834 bytes
 to
   0.293s   58855 bytes

dnsmasq
   0.118s   35689 bytes
 to
   0.112s   27992 bytes


smbd
   0.187s   40897 bytes
 to
   0.162s   33665 bytes


weather applet profile from ubuntu touch
   0.618s   105673 bytes
 to
   0.432s   89300 bytes


I have not seen a case where the parser regresses on performance, but it is
possible. This patch will not cause a regression in generated policy size;
at worst it will result in policy that is the same size.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
2014-01-09 17:06:48 -08:00
John Johansen
22855508e8 Add Differential State Compression to the DFA
Differential state compression encodes a state's transitions as the
difference between the state and its default state (the state it is
relative to).

This reduces the number of transitions that need to be stored in the
transition table, hence reducing the size of the dfa.  There is a
trade off in that a single input character may have to traverse more
than one state.  This is somewhat offset by reduced table sizes providing
better locality and caching properties.

With careful encoding we can still make constant match time guarantees.
This patch guarantees that a state that is differentially encoded will do at
most 3m state traversals to match an input of length m (as opposed to a
non-differentially compressed dfa doing exactly m state traversals).
In practice the actual number of extra traversals is less than this because
we selectively choose which states are differentially encoded.

In addition to reducing the size of the dfa by reducing the number of
transitions that have to be stored, differential encoding reduces the
number of transitions that need to be considered by comb compression,
which can result in tighter packing due to a reduction in sparseness, and
also reduces the time spent in comb compression, which currently uses an
O(n^2) algorithm.

Differential encoding will always result in a DFA that is smaller than or
equal in size to the non-differentially encoded DFA, and will usually improve
compilation times, with the performance improvements increasing as the DFA
gets larger.
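
A minimal sketch (illustrative, not the parser's actual code) of how
matching walks differentially encoded states, assuming every default
chain ends at a state with a complete transition set:

  #include <string>
  #include <unordered_map>
  #include <vector>

  struct DiffState {
      std::unordered_map<unsigned char, int> trans; // explicit transitions only
      int def = -1;   // state this one is encoded relative to (-1: none)
  };

  int match(const std::vector<DiffState> &dfa, const std::string &input)
  {
      int s = 0;                          // start state
      for (unsigned char c : input) {
          int cur = s;
          // follow the default-state chain until an explicit transition
          // on c is found; capping chain walks per input character is
          // what yields the at-most-3m traversal guarantee
          while (!dfa[cur].trans.count(c))
              cur = dfa[cur].def;
          s = dfa[cur].trans.at(c);
      }
      return s;
  }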

Eg. Given an example DFA that created 8991 states after minimization.
* If only comb compression (current default) is used

 52057 transitions are packed into a table of 69591 entries, achieving an
 efficiency of about 75% (an average of about 7.74 table entries per state),
 with a resulting compressed dfa16 size of 404238 bytes and a run time for
 the dfa compilation of
   real 0m9.037s
   user 0m8.893s
   sys  0m0.036s

* If differential encoding + comb compression is used, 8292 of the 8991
  states are differentially encoded, with 31557 transitions removed.  Resulting in

  20500 transitions packed into a table of 20675 entries, achieving an
  efficiency of about 99.2% (an average of about 2.3 table entries per state),
  with a resulting compressed dfa16 size of 207874 bytes (about 48.6%
  reduction) and a run time for the dfa compilation of
   real 0m5.416s (about 40% faster)
   user 0m5.280s
   sys  0m0.040s

Repeating with a larger DFA that has 17033 states after minimization.
* If only comb compression (current default) is used

 102992 transitions are packed into a table of 137987 entries, achieving
 an efficiency of about 75% (an average of about 8.10 entries per state),
 with a resultant compressed dfa16 size of 790410 bytes and a run time for dfa
 compilation of
  real  0m28.153s
  user  0m27.634s
  sys   0m0.120s

* with differential encoding
 39374 transitions are packed into a table of 39594 entries, achieving an
 efficiency of about 99.4% (an average of about 2.32 entries per state),
 with a resultant compressed dfa16 size of 396838 bytes (about 50% reduction)
 and a run time for dfa compilation of
  real  0m11.804s (about 58% faster)
  user  0m11.657s
  sys   0m0.084s

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2014-01-09 16:55:55 -08:00
John Johansen
a34059b1e5 Convert the parser to C++
This conversion is nothing more than what is required to get it to
compile. Further improvements will come as the code is refactored.

Unfortunately, due to C++ not supporting designated initializers, the auto
generation of af names needed to be reworked, and "netlink" and "unix"
domain socket keywords leaked in. Since these were going to be added in
separate patches, I have not bothered to do the extra work to replace them
with a temporary placeholder.

Signed-off-by: John Johansen <john.johansen@canonical.com>
[tyhicks: merged with dbus changes and memory leak fixes]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
2013-09-27 16:13:22 -07:00
John Johansen
e61b7b9241 Update the copyright dates for the apparmor_parser
Signed-off-by: John Johansen <john.johansen@canonical.com>
2012-02-24 04:21:59 -08:00
John Johansen
e7c550243c Make second minimization pass optional
The removal of deny information is a one-way operation that can result
in a smaller dfa, but also results in a dfa that should not be used in
future operations, because the deny rules from the precomputed dfa would
not get applied.

For now, default the filtering out of deny information to off, as it takes
extra time and seldom results in further state reduction.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2012-02-16 07:43:02 -08:00
John Johansen
6f95ff5637 Track full permission set through all stages of DFA construction.
Previously permission information was thrown away early and permissions
were packed into their CHFA form at the start of DFA construction.  Because
of this, permission hashing to set up the initial DFA partitions was
required, as x transition conflicts, etc. could not otherwise be resolved.

Move the mapping of permissions to CHFA construction, and track the full
permission set through DFA construction.  This allows removal of the
perm_hashing hack, which prevented a full minimization from happening
in some DFAs.  The hack could also result in x conflicts not being correctly
detected, and deny rules not being fully applied in some situations.

Eg.
 pre full minimization
   Created dfa: states 33451
   Minimized dfa: final partitions 17033

 with full minimization
   Created dfa: states 33451
   Minimized dfa: final partitions 9550
   Dfa minimization no states removed: partitions 9550

The tracking of deny rules through to the completed DFA construction creates
a new class of states: states that are marked as accepting
(they carry permission information) but in fact are non-accepting, as they
only carry deny information.  We add a second minimization pass where such
states have their permission information cleared and are thus moved into the
non-accepting partition.
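
A minimal sketch (hypothetical names, not the parser's actual code) of
that second pass:

  #include <cstdint>
  #include <vector>

  struct StatePerms {
      uint32_t allow = 0;   // allowed permission bits
      uint32_t deny  = 0;   // deny information tracked through construction
  };

  // clear states whose only accept information is deny bits, so a rerun
  // of minimization can fold them into the non-accepting partition
  size_t clear_deny_only(std::vector<StatePerms> &states)
  {
      size_t cleared = 0;
      for (auto &p : states) {
          if (p.allow == 0 && p.deny != 0) {
              p.deny = 0;   // now indistinguishable from non-accepting
              ++cleared;
          }
      }
      return cleared;
  }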

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2012-02-16 07:41:40 -08:00
John Johansen
627638a6cf Add debugging dump for DFA partition minimization
Allow dumping out which states were dropped during partition minimization
and which state became the partition's representative state.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2011-05-20 09:26:44 -07:00
John Johansen
099f19f99c Update the licence for apparmor_re.h
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2011-03-13 05:59:48 -07:00
John Johansen
6aad970d1c Split out compressed dfa "transition table" compression
Split hfa into hfa and compressed_hfa files.  The hfa portion focuses on
creating and manipulating hfas, while compressed_hfa is used for creating
compressed hfas that can be used/reused at run time with much less memory
usage than the full blown hfa.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2011-03-13 05:50:34 -07:00
John Johansen
298a36bffb Split out aare_rules which are used to encapsulate creating the dfa
Split out the aare_rule bits that encapsulate the conversion of apparmor
rules into the final compressed dfa.

This patch will not compile, because it needs hfa to export an interface,
but hfa is going to be split, so just delay until hfa and transtable are
split and they can each export their own interface.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2011-03-13 05:49:15 -07:00
John Johansen
846cee5066 Split out parsing and expression trees from regexp.y
Start of splitting regexp.y into logical components instead of the mess
it is today.  Split out the expr-tree and parsing components from regexp.y
into expr-tree.x and parse.y, and since regexp.y no longer does parsing,
rename it to hfa.cc.

Some code cleanups snuck their way into this patch, and since I am too
lazy to redo it, I have left them in.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2011-03-13 05:46:29 -07:00
Kees Cook
723a20ba7d as ACKed on IRC, drop the unused $Id$ tags everywhere 2010-12-20 12:29:10 -08:00
John Johansen
14e7d94701 Add ability to dump unique permission sets 2010-11-09 11:56:28 -08:00
John Johansen
318351376c Add the ability to dump NodeSet to dfa state mapping 2010-11-09 11:55:40 -08:00
John Johansen
a949b075b4 The dfa flags currently are a weird mix of positive and negative assertions.
It's cleaner just to have them all assert one way and let the cmd line
options apply them correctly.
2010-11-09 11:23:45 -08:00
John Johansen
36e99af7fb Split dfa minimizing hashing into two separately controllable hashes. The
first hash does hashing on just state transitions, which always results
in a performance improvement.

The second does hashing based off of accept permissions, which can create
more initial states but can result in not being able to achieve a true
minimum dfa.  This can also lead to slowing down total dfa creation,
because compression can take longer if the dfa isn't completely
minimized.

Permission hashing is currently required, as minimization does not accumulate
redundant Node permissions.
2010-11-09 11:22:54 -08:00
John Johansen
4be07c3265 This adds a basic debug dump for the conversion of each rule in a profile to its expression
tree.  It is limited in that it doesn't currently handle the permissions of a rule.

conversion output presents an aare -> pcre conversion followed by 1 or more expression
tree rules, governed by what the rule does.
eg.
  aare: /**   ->   /[^/\x00][^\x00]*
  rule: /[^/\x00][^\x00]*  ->  /[^\0000/]([^\0000])*

eg.
echo "/foo { /** rwlkmix, } " | ./apparmor_parser -QT -D rule-exprs -D expr-tree

aare: /foo   ->   /foo
aare: /**   ->   /[^/\x00][^\x00]*
rule: /[^/\x00][^\x00]*  ->  /[^\0000/]([^\0000])*

rule: /[^/\x00][^\x00]*\x00/[^/].*  ->  /[^\0000/]([^\0000])*\0000/[^/](.)*


DFA: Expression Tree
(/[^\0000/]([^\0000])*(((((((((((((<513>|<2>)|<4>)|<8>)|<16>)|<32>)|<64>)|<8404992>)|<32768>)|<65536>)|<131072>)|<262144>)|<524288>)|<1048576>)|/[^\0000/]([^\0000])*\0000/[^/](.)*((<16>|<32>)|<262144>))


This simple example shows many things:
1. The profile name undergoes pcre conversion.  But since no regular expressions were found,
   it doesn't generate any expr rules.
2. /** is converted into the pcre expression /[^\0000/]([^\0000])*
3. The pcre expression /[^\0000/]([^\0000])* is converted into two rules that are then
   converted into expression trees.

   The reason for this cannot be seen in the output, as it is actually triggered by
   permission separation for the rule.  In this case the link permission is separated
   into what is shown as the second rule: statement.
4. DFA: Expression Tree dump shows how these rules are combined together

You will notice that the rule conversion statement is fairly redundant currently, as it just
shows pcre to expression tree pcre.  This will change when direct aare parsing occurs,
but currently it serves to verify the pcre conversion step.


It is not the prettiest patch, as it touches some ugly code that is scheduled to be cleaned
up/replaced.  Eg. convert_aaregex_to_pcre is going to be replaced with native parse conversion
from an aare straight to the expression tree, and dfaflag passing will become part of the
rule set.
2010-07-23 13:29:35 +02:00
John Johansen
80c7ee74a2 Speedup transition table compression. This is a basic improvement and
not an algorithmic improvement.  It does the same basic algorithm of
testing until it can insert the data, but instead of only tracking the
first free entry (and recomputing it each pass), it tracks all
free entries, reducing the number of comparisons done as the table
grows in size.

This may actually result in a small loss on small tables, but is a win
for larger tables.
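
A minimal sketch (illustrative, not the parser's actual code) of packing
with a tracked free list:

  #include <set>
  #include <vector>

  // place one state's row into the shared table: offsets are the row's
  // occupied positions; returns the base index chosen for the state
  size_t insert_row(std::vector<bool> &used, std::set<size_t> &free_slots,
                    const std::vector<size_t> &offsets)
  {
      for (size_t base : free_slots) {   // try each tracked free slot as a base
          bool fits = true;
          for (size_t off : offsets)
              if (base + off < used.size() && used[base + off]) {
                  fits = false;
                  break;
              }
          if (!fits)
              continue;
          for (size_t off : offsets) {   // claim the cells
              if (base + off >= used.size())
                  used.resize(base + off + 1, false);
              used[base + off] = true;
              free_slots.erase(base + off);
          }
          return base;
      }
      size_t base = used.size();         // nothing fits: grow the table
      for (size_t off : offsets) {
          if (base + off >= used.size())
              used.resize(base + off + 1, false);
          used[base + off] = true;
          // (gaps opened by the resize could be added to free_slots;
          //  omitted here for brevity)
      }
      return base;
  }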
2010-01-27 17:20:13 -08:00
John Johansen
91dd7527d9 Dfa minimization and unreachable state removal
Add basic Hopcroft based dfa minimization.  It currently does a simple
straight state comparison that can be quadratic in time to split partitions.
This is offset, however, by using hashing to set up the initial partitions so
that the number of states within a partition is relatively small.

The hashing of states for initial partition setup is linear in time.  This
means the closer the initial partition set is to the final set, the closer
the algorithm is to completing in a linear time.  The hashing works as
follows:  For each state we know the number of transitions that are not
the default transition.  For each of of these we hash the set of letters
it can transition on using a simple djb2 hash algorithm.  This creates
a unique hash based on the number of transitions and the input it can
transition on.  If a state does not have the same hash we know it can not
the same as another because it either has a different number of transitions
or or transitions on a different set.

To further distinguish states, the number of transitions of each transition's
target state is added into the hash.  This serves to further distinguish
states, as a transition to a state with a different number of transitions
cannot possibly be reduced to an equivalent state.
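
A minimal sketch (illustrative, not the parser's actual code) of such a
hash:

  #include <cstdint>
  #include <map>

  struct HState {
      std::map<unsigned char, HState *> trans;  // non-default transitions only
  };

  uint64_t state_hash(const HState &s)
  {
      uint64_t h = 5381;                        // djb2 initial value
      for (const auto &t : s.trans) {
          h = h * 33 + t.first;                 // the character transitioned on
          h = h * 33 + t.second->trans.size();  // target's transition count
      }
      return h;
  }

States with equal hashes may still differ and get split later; states
with different hashes are known distinct, so they can safely start in
different partitions.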

A further distinction of states is made for accepting states in that
we know each state with a unique set of accept permissions must be in
its own partition to ensure the unique accept permissions are in the
final dfa.

The unreachable state removal is a basic walk of the dfa from the start
state, marking all states that are reached.  It then sweeps away any state
not reached.  This does not do dead state removal, where a non-accepting
state gets into a loop that will never result in an accepting state.
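
A minimal sketch (illustrative) of the mark phase:

  #include <queue>
  #include <vector>

  // next_states[s] lists the transition targets of state s; state 0 is start
  std::vector<bool> reachable(const std::vector<std::vector<int>> &next_states)
  {
      std::vector<bool> seen(next_states.size(), false);
      std::queue<int> work;
      seen[0] = true;
      work.push(0);
      while (!work.empty()) {
          int s = work.front();
          work.pop();
          for (int t : next_states[s])
              if (!seen[t]) {       // first time this state is reached
                  seen[t] = true;
                  work.push(t);
              }
      }
      return seen;                  // unmarked states are swept away
  }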
2010-01-20 03:32:34 -08:00
John Johansen
dce395e7ad Add basic controls for dfa optimization 2010-01-08 04:30:56 -08:00
John Johansen
4f044e753c Add basic dfa stats and debug dumps for
equivalence classes
expr tree (add stats, update parser switch)
dfa
transition table
2010-01-08 02:17:45 -08:00
John Johansen
17a67d7227 Update parser to allow for multiple debug dump options via -D or --dump.
This will allow turning various debug dumps on and off as needed.
Multiple dump options can be specified by using the option multiple
times.
  eg. apparmor_parser -D variables
      apparmor_parser -D dfa-tree -D dfa-simple-tree


The help option has also been updated to take an optional argument
to display help about given parameters; currently only dump is supported.

  eg.  apparmor_parser -h       # standard help
       apparmor_parser -h=dump  # dump info about --dump options

Also enable the dfa expression tree dumps.
2010-01-07 16:21:02 -08:00
Kees Cook
da6c9246f5 clear remaining $Id$ tags, since bzr does not support them 2009-11-11 10:44:26 -08:00
John Johansen
397ead10af add aare_reset_matchflags() to reset match flags
Signed-Off-By: Kees Cook <kees.cook@canonical.com>
2009-07-24 07:33:09 +00:00
John Johansen
814773b2e1 Update the parser to emit a 0 to separate pairs in the dfa.
This was always the intended behavior, and fixes a bug where
the dfa would match change profile rules using the // separator.
2008-03-13 16:46:19 +00:00
John Johansen
cd79c1ac77 update copyright dates 2007-04-11 08:12:51 +00:00
John Johansen
f3ba454d8c Add dfa support to the parser 2007-02-27 02:29:16 +00:00