Fixes two resource leaks. https://scan.coverity.com/projects/apparmor
I don't actually know how to link to the individual reports, but the
first one comes from an early return. The second comes from an iterator
potentially being empty.
The current encoding makes every xattr optional and uses this to
propagate the permission from the tail to the individual rule match
points.
This, however, is wrong. Instead, change the encoding so that an xattr
(unless optional) is required to match before moving on to the next
xattr match.
The permission is carried at the end of each rule portion: file match,
xattr 1, xattr 2, ...
Signed-off-by: John Johansen <john.johansen@canonical.com>
xattrs can contain NULL characters in their values, which means we
cannot use regular NULL transitions to separate values. To fix this,
use out of band transitions instead.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Currently the NULL character is used as an out of band transition
for string/path elements. This works for them as the NULL character
is not valid for this data. However this does not work for binary
data that can contain a NULL character.
So far we have only dealt with fixed length fields of binary data,
making the NULL separator unnecessary.
However binary data like in the xattr match and mount data field are
variable length and can contain NULL characters. To deal with this
add the ability to specify out of band transitions, that can only
be triggered by code not input data.
The out of band transition can be used to separate variable length
data fields just as the NULL transition has been used to separate
variable length strings.
In the compressed hfa, out of band transitions are expressed as a
negative offset from the state's base. This leaves us room to expand
the character match range in the future if desired, and on average
makes the range between the out of band transition and the input
transitions smaller than it would be if the out of band transition
had been stored after the valid input transitions.
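As a rough illustration of the negative offset idea (the table layout
and names below are assumptions for the sketch, not the actual chfa
code):

#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative model of a base/next/check style transition table with a
// single out of band transition stored at offset -1 from the state's base.
struct chfa_tables {
    std::vector<uint32_t> base;   // per-state offset into next/check
    std::vector<uint16_t> next;   // transition targets
    std::vector<uint16_t> check;  // owning state of each table slot
};

static int oob_next(const chfa_tables &t, uint16_t state)
{
    size_t pos = t.base[state] - 1;   // oob slot sits below the input range
    if (t.check[pos] == state)
        return t.next[pos];           // take the out of band transition
    return -1;                        // this state has no oob transition
}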
Out of band transitions in the dfa will not break old kernels
that don't know about them, but they won't be able to trigger
the out of band transition match. So they should not be used unless
the kernel indicates that it supports them.
It should be noted that this patch only adds support for a single
out of band transition. If multiple out of band transitions are
required, it is trivial to extend:
- Add a tag indicating support in the kernel
- add an oob max range field to the dfa header so the kernel knows
what max range needs verifying.
- extend the oob generation fns to generate oob based on a value instead
of a fixed -1.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Currently the parser is not correctly setting the dfa flag value
and it hasn't been caught because base policy uses a flag value
of 0.
Signed-off-by: John Johansen <john.johansen@canonical.com>
As a step in preparing for out of band transitions and double walk
transitions, rework the backend from using a char index to a class
with a larger range than char.
Signed-off-by: John Johansen <john.johansen@canonical.com>
When cross compiling apparmor-parser, the Makefile will use ar for
creating the static library. However, that ar produces libraries for
the build platform. The right ar is prefixed with the target
platform triple.
Signed-off-by: Xiang Fei Ding <dingxiangfei2009@gmail.com>
Signed-off-by: Steve Beattie <steve.beattie@canonical.com>
Ref: https://github.com/NixOS/nixpkgs/pull/63999
Bug: https://gitlab.com/apparmor/apparmor/issues/41
Add userland support for matching based on extended file attributes. This
leverages DFA based matching already in the kernel:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8e51f908
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=73f488cd
Matching is exposed via flags on the profile:
/usr/bin/* xattrs=(user.foo=bar user.bar=foo) {
  # ...
}
xattr values are appended to the existing xmatch via a null transition.
$ echo '/usr/bin/* xattrs=(user.foo=foo user.bar=bar) {}' | \
    ./parser/apparmor_parser -QT -D expr-tree
DFA: Expression Tree
/usr/bin/[^\0000/]([^\0000/])*(\0000bar)?(\0000foo)?< 0x1>
DFA: Expression Tree
(\a|(\n|(\0002|\t)))< 0x4>
Tested manually on a 4.19 kernel via QEMU+KVM.
TODO:
* ~~Add regression tests~~ (EDIT: done)
* ~~EDIT: add support in the tools~~ (EDIT: done)
Questions for reviewers:
* ~~parser/libapparmor: regex construction probably needs cleaning up~~ (EDIT: done)
* ~~parser/parser_regex.c: confused what xmatch length is for~~ (EDIT: done)
/cc @mjg59
PR: https://gitlab.com/apparmor/apparmor/merge_requests/270
Signed-off-by: John Johansen <john.johansen@canonical.com>
Compiling the parser currently prints a deprecation warning. Remove
throw(int) annotations from function signatures. These aren't required
to catch exceptions.
For example, the following program catches the exception without a
throw(int) annotation:
#include <iostream>

void throw_an_error()
{
    throw 3;
    return;
}

int main()
{
    try
    {
        throw_an_error();
    }
    catch (int e)
    {
        std::cout << "caught exception " << e << '\n';
    }
    return 0;
}
This program prints:
$ g++ -o error error.cc
$ ./error
caught exception 3
Signed-off-by: Eric Chiang <ericchiang@google.com>
The length of an xmatch is used to prioritize multiple profiles that
match the same path, with the intent that the more specific match wins.
Currently, the length of an xmatch is computed by the position of the
first regex character.
While trying to work around issues with no_new_privs by combining
profiles, we noticed that the xmatch length computation doesn't work as
expected for multiple regexes. Consider the following two profiles:
profile all /** { }
profile bins /{,usr/,usr/local/}bin/** { }
xmatch_len is currently computed as "1" for both profiles, even though
"bins" is clearly more specific.
When determining the length of a regex, compute the smallest possible
match and use that for xmatch priority instead of the position of the
first regex character.
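A toy sketch of the new computation (the node types are illustrative,
not the parser's actual expr-tree classes): the smallest possible match
is computed bottom-up, with '*'-style nodes contributing zero and
alternation taking its shorter branch.

#include <algorithm>
#include <cstddef>

struct Node {
    enum Kind { CHAR, STAR, ALT, CAT } kind;
    const Node *left = nullptr, *right = nullptr;
};

static size_t min_match_len(const Node *n)
{
    switch (n->kind) {
    case Node::CHAR: return 1;   // a literal consumes one character
    case Node::STAR: return 0;   // ** and friends can match the empty string
    case Node::ALT:              // {a,b}: the shortest alternative wins
        return std::min(min_match_len(n->left), min_match_len(n->right));
    case Node::CAT:              // concatenation: lengths add
        return min_match_len(n->left) + min_match_len(n->right);
    }
    return 0;
}

Under this measure the "all" pattern has a minimal match of 1, while the
"bins" pattern's is 5 ("/bin/"), so "bins" correctly wins as the more
specific match.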
Expr tree simplification makes multiple passes at simplifying the
expression tree, trying to use factoring rules and heuristics to achieve
the minimum tree, so that dfa construction has fewer nodes to deal
with.
Unfortunately expr tree simplification can slow some policy compiles
down, depending on the type of expressions generated, and even worse it
is currently subject to never terminating on some expressions, as the
left and right passes keep undoing each other's work.
Limiting the number of passes that expr tree simplification does
provides most of its benefits (later passes generally have diminishing
returns), reduces the overhead it has on simple policy where it is of
little benefit, and ensures that simplification cannot get stuck in
an infinite loop due to the left and right passes ping-ponging on each
other's factoring.
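A minimal sketch of the pass cap (the function names, stub bodies, and
cap are placeholders, not the parser's actual code):

struct Node;   // expr tree node; definition elided in this sketch

static bool simplify_left(Node *)  { return false; }   // real pass elided
static bool simplify_right(Node *) { return false; }   // real pass elided

// Bounding the passes keeps most of the benefit of simplification while
// guaranteeing termination even if the passes keep undoing each other.
static void simplify(Node *tree, int max_passes)
{
    for (int i = 0; i < max_passes; i++) {
        bool changed = simplify_right(tree);
        changed = simplify_left(tree) || changed;
        if (!changed)
            break;   // fixed point reached before hitting the cap
    }
}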
Note: This also results in a performance improvement in evince
compiles, and general policy compiles because it achieves a better
balance between time spent on simplifying the tree to remove nodes and
time the dfa build requires to build with extra nodes and then
eliminate with minimization.
$ time apparmor_parser -QT /etc/apparmor.d/usr.bin.evince
real 0m2.744s
user 0m2.714s
sys 0m0.028s
vs.
$ time apparmor_parser -QT /etc/apparmor.d/usr.bin.evince
real 0m2.992s
user 0m2.979s
sys 0m0.012s
and
$ time apparmor_parser -QT /etc/apparmor.d/
real 0m3.568s
user 0m14.529s
sys 0m0.152s
vs.
$ time apparmor_parser -QT /etc/apparmor.d/
real 0m3.741s
user 0m15.400s
sys 0m0.179s
PR: https://gitlab.com/apparmor/apparmor/merge_requests/246
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Elaborate on the class comments for the firstpos, lastpos, followpos,
and nullable fields beyond just referencing the Dragon book. Also add
the section of the book where these are explained.
The current rule simplification algorithm has issues that need to be
addressed in a rewrite, but it is still often a win, especially for
larger profiles.
However doing rule simplification as a single pass limits what it can
do. We default to right simplification first because this has historically
shown the most benefit, for two reasons:
1. It allowed better grouping of the split out accept nodes that we
used to do (changed in previous patches)
2. because trailing regexes like
/foo/**,
/foo/**.txt,
can be combined and they are the largest source of node set
explosion.
However the move to unique node sets eliminates 1, and forces 2 to
work within only the single unique permission set on the right side
factoring pass, but it still incurs the penalty of walking the whole
tree looking for potential nodes to factor.
Moving tree simplification into the construction phase gets rid of
the need for the right side factoring pass to walk other node sets
that will never combine, and since we are doing simplification we can
do it before the cat and permission nodes are added, reducing the
set of nodes to look at by another two.
We do lose the ability to combine nodes from different sets during
the left factoring pass, but experimentation shows that doing
simplification only within the unique permission sets achieves most of
the factoring that a single global pass would achieve.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
Currently rules are added to the expression tree in order, and then
tree simplification and factoring is done. This forces simplification
to "search" through the tree to find rules with the same permissions
during right factoring, which, depending on the order of factoring, may
not be able to group all rules with the same permissions.
Instead of having tree factoring do the work to regroup rules with the
same permissions, pregroup them as part of the expr tree construction.
And only build the full tree when the dfa is constructed.
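A sketch of the pregrouping (types and names here are made up for
illustration): rules accumulate per permission set as they are added,
and the per-set subtrees are only stitched together when the dfa is
built.

#include <cstdint>
#include <map>
#include <string>
#include <vector>

using perms_t = uint32_t;   // stand-in for the real permission set type

// Group rules by permission set at insertion time, so right factoring
// never has to search across sets that could never combine.
static std::map<perms_t, std::vector<std::string>> rules_by_perms;

static void add_rule(perms_t perms, const std::string &regex)
{
    rules_by_perms[perms].push_back(regex);
}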
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
Accept nodes per perm bit were done from the very beginning in the
false belief that they would help produce minimized dfas, because
nfa states could share partially overlapping permissions.
In reality they make tree factoring harder, result in longer nfa
state sets during dfa construction, and do not result in a minimized
dfa.
Moving to unique permission sets allows us to minimize the number
of node sets, and helps reduce recreating each set type multiple
times during dfa construction.
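For illustration only (the real node and permission types differ), the
unique-set approach amounts to caching one accept node per full
permission set and sharing it between rules:

#include <cstdint>
#include <map>
#include <tuple>

struct AcceptNode;   // expr tree accept node; definition elided

struct perms_t {
    uint32_t allow, deny, audit, quiet;
    bool operator<(const perms_t &o) const
    {
        return std::tie(allow, deny, audit, quiet) <
               std::tie(o.allow, o.deny, o.audit, o.quiet);
    }
};

// One accept node per unique permission set, instead of one per perm bit;
// rules with identical permissions share the same node.
static std::map<perms_t, AcceptNode *> unique_perms;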
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
This fixes the incorrect compilation of audit modifiers for exec and
pivot_root as detailed in
https://launchpad.net/bugs/1431717
https://launchpad.net/bugs/1432045
The permission accumulation routine on the backend was incorrectly setting
the audit mask based off of the exec type bits (info about the exec) and
not the actual exec permission.
This bug could have also caused permission issues around overlapping
generic exec and exact match exec rules, except that the encoding of
EXEC_MODIFIERS ensured that the
exact_match_allow & AA_USER/OTHER_EXEC_TYPE
test would never fail for a permission accumulation with the exec permission
set.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
This patch adjusts the bison grammar in libapparmor and the parser
to use the %define api.pure directive instead of the deprecated
%pure_parser and %pure-parser keywords. Bison had been warning about
the former:
libraries/libapparmor/src/grammar.y:71.1-12: warning: deprecated directive, use ‘%pure-parser’ [-Wdeprecated]
%pure_parser
^^^^^^^^^^^^
Signed-off-by: Steve Beattie <steve@nxnw.org>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Refactor add_new_state into two versions: one that splits anodes from
nnodes, and one for use when anodes and nnodes are presplit.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
The shared node type will be used in the future to add new capabilities
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
We need to rework permission type mapping to nodesets, which means we
need to move the nodeset computations earlier in the dfa creation
process, instead of as a post step of follow(), so move the nodeset
into the expr-tree.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Also, for characters that are not recognized as a valid escape sequence,
make sure that the character is emitted.
Previously
\$ resulted in \
where it should have been \$, since $ isn't a recognized escape character.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
This cleans things up a bit and fixes a bug where not all rules were
getting properly counted, so that the addition of policy_mediation
rules failed to generate the policy dfa in some cases.
Because the policy dfa is being generated correctly now, we need to
fix some tests to use the new -M flag to specify the expected feature
set of the test.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Fix the octal escape sequence handling that was broken, so that short
escapes (\0, \00, \xa) didn't work and actually resulted in some
encoding bugs.
Also we were missing support for the decimal # conversion \d123
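A minimal sketch of octal parsing that accepts the short forms
(illustrative only, not the actual lib fn):

// Parse up to three octal digits after a backslash; accepts the short
// forms \0 and \00 as well as \000. Returns -1 if no octal digit follows.
static int parse_octal(const char *&p)
{
    int val = 0, digits = 0;
    while (digits < 3 && *p >= '0' && *p <= '7') {
        val = val * 8 + (*p++ - '0');
        digits++;
    }
    return digits ? val : -1;
}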
Incorporate and update Steve Beattie's unit tests of escape sequences
patch
v2
- unify escape sequence processing, creating lib fns.
- address Steve Beattie's feedback
- incorporate Steve Beattie's feedback
v3
- address Seth's feedback
- add missing strn_escseq tests
- expand strn_escseq to take a 3rd parameter to allow specifying chars to
convert straight across, e.g. "+" will cause it to convert \+ as +
- fix libapparmor/parse.y failed escape pass through to match processunqoted
Unit tests by Steve Beattie
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
This patch eliminates the bison warning about "%name-prefix =" being
deprecated.
Signed-off-by: Steve Beattie <steve@nxnw.org>
Acked-by: John Johansen <john.johansen@canonical.com>
std::max in C++ requires that both arguments be the same type. The
previous fix added std::max comparisons between unsigned long numeric
constants and size_t; this fix casts the numeric constants to size_t.
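For illustration (the values are made up):

#include <algorithm>
#include <cstddef>

int main()
{
    size_t len = 123;
    // std::max(4096UL, len) fails to deduce a single type on platforms
    // where unsigned long and size_t differ; the cast resolves that:
    size_t n = std::max(static_cast<size_t>(4096), len);
    return n == 4096 ? 0 : 1;
}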
Signed-off-by: Steve Beattie <steve@nxnw.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Make the accept information dump output be in hexadecimal like the
other dumps, so it's easier to cross-reference between them.
Signed-off-by: John Johansen <john.johansen@canonical.com>
This diff is part of the diffencode patch but was dropped when it was
applied to bzr. I have no idea why, and status showed a clean tree.
Signed-off-by: John Johansen <john.johansen@canonical.com>
So DFA minimization has a bug and feature that keeps it from minimizing
some dfas completely. This feature/bug did not result in incorrect dfas;
it just failed to achieve full minimization.
The same mappings comparison is wrong. Or more correctly, it is right when
transitions are not remapped to minimization partitions, but it may be
wrong when states are remapped. This means it will cause excess
partitioning (not removing all the states it should).
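A sketch of the corrected comparison (the types are illustrative):
transition targets have to be compared through the partition they
currently belong to, since the targets themselves may already have been
remapped.

#include <array>
#include <vector>

struct State {
    std::array<int, 256> next;   // transition target per input byte
};

static bool same_mappings(const State &a, const State &b,
                          const std::vector<int> &partition_of)
{
    for (int c = 0; c < 256; c++)
        // Comparing raw targets is only valid before remapping; comparing
        // their partitions stays correct as minimization proceeds.
        if (partition_of[a.next[c]] != partition_of[b.next[c]])
            return false;
    return true;
}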
The trans hashing does a "guess" at partition splitting as a performance
enhancement. Basically it leverages the information that states that have
different transitions, or transitions on different characters, are not the
same. However this isn't always the case, because minimization can cause
some of those transitions to be altered. In previous testing this was
always a win, with only a few extra states being added sometimes. However
this changes when the same mappings comparison is fixed, as the hashing
that was done was based on the same flawed mapping as the broken same
mappings comparison.
If the same mappings are fixed and the hashing is not removed then there
is little to no change. However with both changes applied some dfas see
significant improvements. These improvements often result in performance
improvements despite minimization doing more work, because it means less
work to be done in the chfa comb compression.
e.g. the test case that raised the issue (thanks Tyler):
/t { mount fstype=ext2, mount, }
used to be minimized to
{1} <== (allow/deny/audit/quiet)
{6} (0x 2/0/0/0)
{1} -> {2}: 0x7
{2} -> {3}: 0x0
{2} -> {2}: []
{3} -> {4}: 0x0
{3} -> {3}: []
{4} -> {6}: 0x0
{4} -> {7}: 0x65 e
{4} -> {5}: []
{5} -> {6}: 0x0
{5} -> {5}: []
{6} (0x 2/0/0/0) -> {6}: [^\0x0]
{7} -> {6}: 0x0
{7} -> {8}: 0x78 x
{7} -> {5}: []
{8} -> {6}: 0x0
{8} -> {5}: 0x74 t
{8} -> {5}: []
with the patch it is now properly minimized to
{1} <== (allow/deny/audit/quiet)
{6} (0x 2/0/0/0)
{1} -> {2}: 0x7
{2} -> {3}: 0x0
{2} -> {2}: []
{3} -> {4}: 0x0
{3} -> {3}: []
{4} -> {6}: 0x0
{4} -> {4}: []
{6} (0x 2/0/0/0) -> {6}: [^\0x0]
The evince profile set sees some significant improvements. Picking a couple
of examples from its "minimized" dfas (it has 12), we see a reduction from
9720 states to 6232 states, and from 6537 states to 3653 states. All told,
the performance/profile size goes from
2.8 parser: 4.607s 1007267 bytes
dev head: 3.48s 1007267 bytes
min fix: 2.68s 549603 bytes
Of course evince is an extreme example, so a few more:
firefox
2.066s 404549 bytes
to
1.336s 250585 bytes
cupsd
0.365s 90834 bytes
to
0.293s 58855 bytes
dnsmasq
0.118s 35689 bytes
to
0.112s 27992 bytes
smbd
0.187s 40897 bytes
to
0.162s 33665 bytes
weather applet profile from ubuntu touch
0.618s 105673 bytes
to
0.432s 89300 bytes
I have not seen a case where the parser regresses on performance, but it is
possible. This patch will not cause a regression in generated policy size;
at worst it will result in policy that is the same size.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
Differential state compression encodes a state's transitions as the
difference between the state and its default state (the state it is
relative to).
This reduces the number of transitions that need to be stored in the
transition table, hence reducing the size of the dfa. There is a
trade off in that a single input character may have to traverse more
than one state. This is somewhat offset by reduced table sizes providing
better locality and caching properties.
With careful encoding we can still make constant match time guarantees.
This patch guarantees that a state that is differentially encoded will do at
most 3m state traversals to match an input of length m (as opposed to a
non-differentially compressed dfa doing exactly m state traversals).
In practice the actual number of extra traversals is less than this because
we selectively choose which states are differentially encoded.
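An illustrative model (not the real chfa structures) of how a match
steps through differentially encoded states:

#include <array>
#include <vector>

struct State {
    std::array<int, 256> next;   // -1 = transition not stored in this state
    int def;                     // default state to fall back to, -1 = none
};

// Follow default-state links until a stored transition is found. Carefully
// choosing which states to encode bounds this chain, preserving the
// at-most-3m traversal guarantee for an input of length m.
static int next_state(const std::vector<State> &dfa, int s, unsigned char c)
{
    while (dfa[s].next[c] == -1 && dfa[s].def != -1)
        s = dfa[s].def;
    return dfa[s].next[c];
}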
In addition to reducing the size of the dfa by reducing the number of
transitions that have to be stored, differential encoding reduces the
number of transitions that need to be considered by comb compression,
which can result in tighter packing due to a reduction in sparseness, and
also reduces the time spent in comb compression, which currently uses an
O(n^2) algorithm.
Differential encoding will always result in a DFA that is smaller or equal
in size to the encoded DFA, and will usually improve compilation times,
with the performance improvements increasing as the DFA gets larger.
E.g. given an example DFA that created 8991 states after minimization:
* If only comb compression (current default) is used,
52057 transitions are packed into a table of 69591 entries, achieving an
efficiency of about 75% (an average of about 7.74 table entries per state).
With a resulting compressed dfa16 size of 404238 bytes and a run time for
the dfa compilation of
real 0m9.037s
user 0m8.893s
sys 0m0.036s
* If differential encoding + comb compression is used, 8292 of the 8991
states are differentially encoded, with 31557 transitions removed. The
resulting 20500 transitions are packed into a table of 20675 entries,
achieving an efficiency of about 99.2% (an average of about 2.3 table
entries per state). With a resulting compressed dfa16 size of 207874 bytes
(about 48.6% reduction) and a run time for the dfa compilation of
real 0m5.416s (about 40% faster)
user 0m5.280s
sys 0m0.040s
Repeating with a larger DFA that has 17033 states after minimization:
* If only comb compression (current default) is used,
102992 transitions are packed into a table of 137987 entries, achieving
an efficiency of about 75% (an average of about 8.10 entries per state).
With a resultant compressed dfa16 size of 790410 bytes and a run time for
the dfa compilation of
real 0m28.153s
user 0m27.634s
sys 0m0.120s
* With differential encoding,
39374 transitions are packed into a table of 39594 entries, achieving an
efficiency of about 99.4% (an average of about 2.32 entries per state).
With a resultant compressed dfa16 size of 396838 bytes (about 50%
reduction) and a run time for dfa compilation of
real 0m11.804s (about 58% faster)
user 0m11.657s
sys 0m0.084s
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
This patch adds a parser make variable and a make target for building
the compiler with coverage compilation flags. With this, coverage
information can be generated by running tests/test suites against the
built parser and run through tools like gcovr.
Patch History:
v1: initial version
v2: refreshed/no change
v3: address feedback from sarnold:
- mark coverage target as phony
- correct missing '.' typo in clean target
- make coverage extensions consistent in clean targets
Signed-off-by: Steve Beattie <steve@nxnw.org>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
This patch tries to reduce the number of dynamic_cast<>s needed
during normalization by pushing the operations of normalize_tree()
into the expr-tree classes themselves, rather than performing them as
an external function. This eliminates the need for dynamic_cast<>
checks on the current object under inspection and reduces the number
of checks needing to be performed on child Nodes as well.
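The shape of the change, roughly (class and method names are simplified
stand-ins for the actual expr-tree classes):

#include <memory>

// Before: an external normalize_tree() probed every node with
// dynamic_cast<>. After: each node class overrides a virtual method, so
// dispatch goes through the vtable instead.
class Node {
public:
    virtual ~Node() = default;
    virtual void normalize(int) { }   // leaf nodes: nothing to do
};

class TwoChildNode : public Node {
public:
    std::unique_ptr<Node> left, right;
    void normalize(int depth) override
    {
        // The reordering/rotation logic of the real pass is elided here.
        if (left)
            left->normalize(depth + 1);
        if (right)
            right->normalize(depth + 1);
    }
};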
In non-strict benchmarking, doing the dynamic_cast<> reduction
for just the tree normalization operation resulted in a ~10-15%
improvement in overall time on a couple of different hosts (amd64,
armel), as measured against apparmor_parser -Q. Valgrind's callgrind
tool indicated a reduction in the number of calls to dynamic_cast<>
on the tst/simple_tests/vars/dbus_vars_9.sd test profile from ~19
million calls to ~12 million.
In comparisons with dumped expr trees over both the entire
tst/simple_tests/ tree and from 1000 randomly generated profiles via
stress.rb, the generated trees were identical.
Patch history:
v1: initial version of patch
v2: update patch to take into account the infinite loop fix in
trunk rev 1975 and refresh against current code.
v3: no change
Signed-off-by: Steve Beattie <steve@nxnw.org>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Acked-by: John Johansen <john.johansen@canonical.com>
This patch converts the problematic-with-g++-4.6 state_names array
into a C++ unordered_map type. Using this depends on using the c++0x
(aka c++11) standard, and as we have gnuisms elsewhere (using the
typeof builtin), the patch also adds/converts to using -std=gnu++0x
in the build rules (which conveniently eliminates some other warnings
we had due to other c++11-isms).
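The conversion looks roughly like this (the entries shown are
placeholders, not the real table):

#include <string>
#include <unordered_map>

// Replaces the array that g++ 4.6 could not handle; the initializer-list
// construction is what requires -std=gnu++0x (c++11).
static std::unordered_map<int, std::string> state_names = {
    { 0, "none" },
    { 1, "create" },
};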
Signed-off-by: Steve Beattie <steve@nxnw.org>
Acked-By: Seth Arnold <seth.arnold@canonical.com>
This patch is needed to fix the build.
Patch from: Jan Rękorajski <baggins@pld-linux.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>
This patch is a very minor optimization to the search that determines
whether a given rule is an exact match or not. If a wildcard rule
(i.e. an inexact match) is discovered, exact_match is set to 0,
so we don't need to continue the tree traversal.
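In rough form (toy node type, not the parser's actual classes), the
early exit looks like:

struct Node {
    bool wildcard;               // true for an inexact (regex) node
    const Node *left, *right;
};

// Return false as soon as any wildcard node is seen; there is no point
// continuing the traversal once the rule is known to be inexact.
static bool is_exact(const Node *n)
{
    if (!n)
        return true;
    if (n->wildcard)
        return false;            // early exit
    return is_exact(n->left) && is_exact(n->right);
}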
Signed-off-by: Steve Beattie <steve@nxnw.org>
Acked-by: John Johansen <john.johansen@canonical.com>
This patch addresses a bunch of the compiler string conversion warnings
that were introduced with the C++-ification patch.
Signed-off-by: Steve Beattie <steve@nxnw.org>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
This conversion is nothing more than what is required to get it to
compile. Further improvements will come as the code is refactored.
Unfortunately, due to C++ not supporting designated initializers, the auto
generation of af names needed to be reworked, and the "netlink" and "unix"
domain socket keywords leaked in. Since these were going to be added in
separate patches, I have not bothered to do the extra work to replace them
with a temporary placeholder.
Signed-off-by: John Johansen <john.johansen@canonical.com>
[tyhicks: merged with dbus changes and memory leak fixes]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Acked-by: Steve Beattie <steve@nxnw.org>