Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support antctl command for packetcapture #6884

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

hangyan
Copy link
Member

@hangyan hangyan commented Dec 24, 2024

TODO items:

after #3659 , the CodeQL alert should be fixed and UT coverage will also be improved.

Sorry, something went wrong.

@hangyan hangyan force-pushed the topic/yhang/packetcapture-antctl branch from 4a616d6 to bf4cb68 Compare January 14, 2025 09:03
@hangyan hangyan changed the title [WIP] Support antctl command for packetcapture Support antctl command for packetcapture Jan 14, 2025
@hangyan
Copy link
Member Author

hangyan commented Feb 6, 2025

cc @antoninbas Can you help review this? also maybe merge #3659 first

cc @luolanzone

@hangyan hangyan force-pushed the topic/yhang/packetcapture-antctl branch from 676981f to b27b72b Compare February 6, 2025 10:21
Signed-off-by: Hang Yan <[email protected]>
@hangyan hangyan force-pushed the topic/yhang/packetcapture-antctl branch from b27b72b to 43a9451 Compare February 7, 2025 10:54
restInterface rest.Interface
}

func (p *podFile) CopyFromPod(ctx context.Context, namespace, name, containerName, srcPath, dstDir string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have ExecInPod defined in pkg/antctl/raw/check/. Can we unify a bit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will try.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense to move the common code to pkg/antctl/raw/helper.go

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved. I only verified this with single file in container, not directory.

return fmt.Sprintf("%s-%s", prefix, rand.String(8))
}

func parseFlow() (*v1alpha1.Packet, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you group all input parsing functions together?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getFlowFields is not grouped with the other ones

@luolanzone luolanzone added this to the Antrea v2.4 release milestone Feb 8, 2025
@luolanzone
Copy link
Contributor

Please fix the antctl e2e test failures: https://github.com/antrea-io/antrea/actions/runs/13198383877?pr=6884

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Co-authored-by: Lan <[email protected]>
Co-authored-by: Antonin Bas <[email protected]>
@luolanzone
Copy link
Contributor

@hangyan there are lots of failures in github checks, please check and fix them.

@hangyan hangyan force-pushed the topic/yhang/packetcapture-antctl branch from e1ba1b2 to 3436bfe Compare February 11, 2025 10:19
@hangyan
Copy link
Member Author

hangyan commented Feb 12, 2025

Please fix the antctl e2e test failures: https://github.com/antrea-io/antrea/actions/runs/13198383877?pr=6884

I disable the flag to run packetcapture sub-command in agentMode but controllerMode will still fail. One must be chosen, or this sub-command wont be registered to the rootCmd. However, seems packetcapture doens't make much sense in agentMode or controllerMode(controllerMode also contains outside of cluster case I think, but the e2e will check the command in controller pod).

The tests expect run antctl packetcapture to be successful. the first error comes from packet number should be larger than 0, that's the default value and it will cause error. Even we set with a valid default value, no src/dest endpoint ( i think Traceflow support this?), or packetcapture is not enabled , the command will still fail.

Can we make an exception in the test-case or we need to provide some sane defaults for it to working in the controllerMode?

cc @luolanzone @antoninbas

update
Signed-off-by: Hang Yan <[email protected]>
@hangyan hangyan force-pushed the topic/yhang/packetcapture-antctl branch from 5ac92d9 to e44de62 Compare February 12, 2025 07:15
return false, nil
})

if wait.Interrupted(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my concern is that when the timeout is hit, these 2 events will happen at roughly the same time:

  1. the PacketCapture expires and the agent writes the partial pcap to disk
  2. the wait.PollUntilContextTimeout is interrupted and the antctl command exits right away, without retrieving the capture results

Do you think this can happen?
Maybe after the wait.PollUntilContextTimeout expires, we could wait for the results for an extra (at most) 5 seconds. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the concern make sense.updated with increased timeout for PollUntilContextTimeout. If the PC timeout, and we have a pacap file path, it will be copied

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, can you add a comment before the wait.PollUntilContextTimeout line to explain why we are doing this

restInterface rest.Interface
}

func (p *podFile) CopyFromPod(ctx context.Context, namespace, name, containerName, srcPath, dstDir string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense to move the common code to pkg/antctl/raw/helper.go

update
Signed-off-by: Hang Yan <[email protected]>
Signed-off-by: Hang Yan <[email protected]>
### PacketCapture

`antctl packetcapture` (or `antctl pc`) command is used to start a `PacketCapture`
and retrieve the captured result. After the result packet file is copied out,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the result packet file (in pcapng format)


The `--flow` (or `-f`) argument can be used to specify the PacketCapture packet
headers with the [ovs-ofctl](http://www.openvswitch.org//support/dist-docs/ovs-ofctl.8.txt)
flow syntax(This argument works similar as Traceflow). The supported flow fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
flow syntax(This argument works similar as Traceflow). The supported flow fields
flow syntax. This argument works the same way as the one for `antctl traceflow`. The supported flow fields

(`tcp_src`, `tcp_dst`, `udp_src`, `udp_dst`).

By default, the command will wait for the PacketCapture to succeed or fail, or
timeout. The default timeout is 10 seconds, but can be changed with the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or to timeout

(`tcp_src`, `tcp_dst`, `udp_src`, `udp_dst`).

By default, the command will wait for the PacketCapture to succeed or fail, or
timeout. The default timeout is 10 seconds, but can be changed with the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong default timeout value, it is 60s

More examples of `antctl packetcapture`:

```bash
Start capturing packets from pod1 to pod2, both Pods are in Namespace default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you use the bash syntax highlighting, add the comment sign # before each description sentence

return false, nil
})

if wait.Interrupted(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, can you add a comment before the wait.PollUntilContextTimeout line to explain why we are doing this

Comment on lines +215 to +218
splits := strings.Split(latestPC.Status.FilePath, ":")
if len(splits) < 2 {
return errors.New("no packets file generated, this maybe caused by timeout")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think we should check the Status field first?

if len(splits) < 2 {
return errors.New("no packets file generated, this maybe caused by timeout")
}
fileName := filepath.Base(splits[1])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as earlier, path would make more sense

defer func() {
if !option.nowait {
if err = antreaClient.CrdV1alpha1().PacketCaptures().Delete(context.TODO(), pc.Name, metav1.DeleteOptions{}); err != nil {
fmt.Fprintf(cmd.OutOrStdout(), "error when deleting PacketCapture: %s", err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe define out := cmd.OutOrStdout() instead of repeating cmd.OutOrStdout() multiple times (whenever you call fmt.Printf)

return fmt.Sprintf("%s-%s", prefix, rand.String(8))
}

func parseFlow() (*v1alpha1.Packet, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getFlowFields is not grouped with the other ones

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants