-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Policies versioning (map of maps) #3305
Conversation
34f4c5e
to
7c569b1
Compare
e3f65d3
to
062bcbe
Compare
469a4c7
to
03d7ef9
Compare
1c717b0
to
d04dde7
Compare
6e27a92
to
c19e788
Compare
cc4f828
to
45ebe93
Compare
It's ready for review again. |
45ebe93
to
5b58ac8
Compare
Im literally missing only the eBPF filters portion (file) now, will finish tomorrow. Things I like from the PR so far:
Left some questions/comments. |
5b58ac8
to
debccc4
Compare
@geyslan Im done with my review. I cannot find any flaw (have not tested it, just read and made logic judgement). I believe it touches sensitive filtering logic, so I would make sure to do lots of tests before merging (not only unit testing, but making sure each policy has a specific filter, or set of filters, including == and !=s and making sure at least 5 consecutive version changes are good, things like that). Either way, it was a very nice work of you, as usual. Congrasts! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after concurrency issues are addressed, documentation is improved and some of my comments are taken into consideration. I would recommend some testing like said before. Other than that, it's a very nice work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice work. I hope what I've written below will be useful.
@@ -981,6 +989,7 @@ int uprobe_lkm_seeker_submitter(struct pt_regs *ctx) | |||
|
|||
// Uprobes are not triggered by syscalls, so we need to override the false value. | |||
p.event->context.syscall = NO_SYSCALL; | |||
p.event->context.policies_version = p.config->policies_version; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A note about uprobe triggered events in general:
The way you added policies versions to them is by checking the config at time of ebpf context. But recall that triggering them is done in userland, meaning that it may be more appropriate to pass the policies version at time of userland trigger to the ebpf program. Because note that there can be a policy version change in the time between the uprobe trigger and the ebpf program starting.
I'm not saying this isn't a reasonable pragmatic choice to do as you did, but I would argue that there is a kind of inaccuracy in doing it like this. @OriGlassman @AlonZivony since uprobe events are basically all for your domain, your take would be good.
Edit: I see that you reinject the policy version later in the "trigger context store". Note how this doesn't solve the issue here, that the applied policy filters in eBPF are not guaranteed to be with the same policy version later injected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding this, I'll wait @OriGlassman @AlonZivony feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because note that there can be a policy version change in the time between the uprobe trigger and the ebpf program starting.
I had put a check on the last stage of the event submission, checking the event context version against the current one from config_entry; if different discard the event. We had discussions about that and agreed that it would be better to send the event even with wrong version. But we could reintroduce it again to set the version as 0, so userland would know that it's a rogue event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking the event context version against the current one from config_entry;
But even the event context version can be later than the one that should be injected from the trigger. So that kind of corruption would be missed. This isn't a huge problem for our trigger events right now since they are infrequent, but if we had events which are more frequent, it may be more noticeable.
I'd at least document the limitation, it may not be worth fixing (I think eBPF cookies in later kernels could be used to solve this for example).
@@ -815,7 +832,7 @@ func (t *Tracee) getOptionsConfig() uint32 { | |||
return cOptVal | |||
} | |||
|
|||
func (t *Tracee) computeConfigValues() []byte { | |||
func (t *Tracee) computeConfigValues(newPolicies *policy.Policies) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function could (and should) be moved to the policy snapshots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps to Policies
, as it's more correlated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the problem is this would reinject the bpf logic into the Policies/Snapshots types, which is what we're trying to avoid: coupling things together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the existence of UpdateBPF
in snapshots show that the snapshot object is the coupling point. That's not a bad thing, there must be some point of coupling if we want to make policies meaningful in bpf logic.
Maybe, the UpdateBPF
function should be taken out of the snapshots object so that the coupling will be out of the object and into some "effect function".
I'm keeping the point on because the tracee file is huge, and we should aim to refactor these kind of functions out given the opportunity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NDStrahilevitz after analyse this better I agree with you. We really need to split logic and diminish the tracee file. However, the complexity involved is also huge. For example, the computeConfigValues
takes care of "config_map" which has a lot from policies but it's a tracee generic config place. And splitting or moving its logic may compromise the configVal config_entry
encode understanding. What I propose now is to update this EPIC with next steps, so we can tackle then meanwhile.
This comment was marked as outdated.
This comment was marked as outdated.
Moving it to v0.21.0 as discussed offline. |
Make use of map of maps to versionize policies. This is a first step towards supporting runtime policy updates.
a9d6a3b
to
d80c9a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very nice work Geyslan!
A few comments ahead and we can merge this soon
@@ -189,6 +217,10 @@ statfunc u64 compute_scopes(program_data_t *p) | |||
return 0; | |||
} | |||
|
|||
// | |||
// unversioned filters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, we should add boolean filters versioning in the future by adding a dedicated map for each one of them. If we won't do it, there might by a mismatch between the policies version used by the versioned filters to those that are not versioned. Is that right?
If so, it is ok that we postpone this for a future PR, but we should add a comment explaining this (and an issue) to track this required change
@@ -153,8 +153,10 @@ cgrpctxmap_t cgrpctxmap_eg SEC(".maps"); // saved info SKB caller <=> SKB egr | |||
typedef struct net_task_context { | |||
struct task_struct *task; | |||
task_context_t taskctx; | |||
s32 syscall; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for using s32 and not u32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC due to respective userland EventContext field:
tracee/pkg/bufferdecoder/protocol.go
Line 47 in c6d88a0
Syscall int32 |
BPF_HASH_INNER(uid_filter, u32, eq_t, 256); // filter events by UID prototype, for specific UIDs either by == or != | ||
BPF_HASH_OUTER(uid_filter_version, uid_filter, 64); // map of UID filters maps | ||
BPF_HASH_INNER(pid_filter, u32, eq_t, 256); // filter events by PID prototype | ||
BPF_HASH_OUTER(pid_filter_version, pid_filter, 64); // map of PID filters maps | ||
BPF_HASH_INNER(mnt_ns_filter, u64, eq_t, 256); // filter events by mount namespace id prototype | ||
BPF_HASH_OUTER(mnt_ns_filter_version, mnt_ns_filter, 64); // map of mount namespace filters maps | ||
BPF_HASH_INNER(pid_ns_filter, u64, eq_t, 256); // filter events by pid namespace id prototype | ||
BPF_HASH_OUTER(pid_ns_filter_version, pid_ns_filter, 64); // map of pid namespace filters maps | ||
BPF_HASH_INNER(uts_ns_filter, string_filter_t, eq_t, 256); // filter events by uts namespace name prototype | ||
BPF_HASH_OUTER(uts_ns_filter_version, uts_ns_filter, 64); // map of uts namespace filters maps | ||
BPF_HASH_INNER(comm_filter, string_filter_t, eq_t, 256); // filter events by command name prototype | ||
BPF_HASH_OUTER(comm_filter_version, comm_filter, 64); // map of command name filters maps | ||
BPF_HASH_INNER(cgroup_id_filter, u32, eq_t, 256); // filter events by cgroup id prototype | ||
BPF_HASH_OUTER(cgroup_id_filter_version, cgroup_id_filter, 64); // map of cgroup id filters maps | ||
BPF_HASH_INNER(binary_filter, binary_t, eq_t, 256); // filter events by binary path and mount namespace prototype | ||
BPF_HASH_OUTER(binary_filter_version, binary_filter, 64); // map of binary filters maps | ||
BPF_HASH_INNER(process_tree_map, u32, eq_t, 10240); // filter events by the ancestry of the traced process | ||
BPF_HASH_OUTER(process_tree_map_version, process_tree_map, 64); // map of process tree maps | ||
BPF_HASH_INNER(events_map, u32, event_config_t, MAX_EVENT_ID); // map to persist event configuration data | ||
BPF_HASH_OUTER(events_map_version, events_map, 64); // map of events maps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean those filter maps now take 64 times more space than before? Or is it according to the amount of inner maps that are populated in a given time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's according the new map creation/insertion. Rafael also noticed this size. It was put as 64 just to align the max number of policies, however, we can reduce it to 4-8. I assume it would be unlikely to have to keep more than that, even if we compute new policies very quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, it would keep up to 64 references to the respective filter map:
❯ sudo bpftool map show name events_map_vers
78: hash_of_maps name events_map_vers flags 0x0
key 2B value 4B max_entries 64 memlock 9280B
pids tracee(51042)
Changing max_entries to 2 only reduced memlock by ~53%. The usage is minimal.
❯ sudo bpftool map show name events_map_vers
188: hash_of_maps name events_map_vers flags 0x0
key 2B value 4B max_entries 2 memlock 4320B
pids tracee(53850)
pkg/ebpf/tracee.go
Outdated
@@ -436,6 +443,16 @@ func (t *Tracee) Init(ctx gocontext.Context) error { | |||
return errfmt.Errorf("error initializing event derivation map: %v", err) | |||
} | |||
|
|||
// Initialize events parameter types map | |||
t.eventsParams = make(map[events.ID][]bufferdecoder.ArgType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was first confused from the name eventsParams
, but now I understand its about the event parameter types,
so maybe rename to eventsParamTypes
?
@@ -1691,6 +1677,7 @@ func (t *Tracee) triggerMemDump(event trace.Event) []error { | |||
|
|||
errs := []error{} | |||
|
|||
// TODO: consider to iterate over given policies when policies are changed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So do I understand correctly that this will break the Subscribe
API if merged as is?
Add policy versioning, which will support runtime policy updates. This also: - Adds the singleton Snapshots circular buffer (for policy versioning). - Introduces the Cloner interface (used for all policy objects). - Moves the filters' ebpf logic to a new ebpf file (filter types are now ebpf-agnostic). - Splits filters_test.go into multiple files. - Removes the unused containers filter (it was just a wrapper for StringFilter). - Removes unused RandStringRunes function from sets_test.go. - Makes other minor refactorings/changes.
d80c9a4
to
df1f1f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Close #3469
1. Explain what the PR does
df1f1f9 feat(userland): add support for policy versioning
53dd343 feat(ebpf): add support for policy versioning
2. Explain how to test it
3. Other comments