Skip to content

Performance issues with nested PacketFields #4705

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
alxroyer-thales opened this issue Mar 28, 2025 · 5 comments
Open

Performance issues with nested PacketFields #4705

alxroyer-thales opened this issue Mar 28, 2025 · 5 comments

Comments

@alxroyer-thales
Copy link

Brief description

When PacketFields are nested on several stages,
Scapy races exponential performance issues.

Scapy version

2.6.1

Python version

3.8.6

Operating system

Windows

Additional environment information

No response

How to reproduce

The following .uts test demonstrates the issue:

% Packet field performance issue

############
############
+ Test utils

= Prepare a `check_exec_time()` function.

def check_exec_time(description, t0, tf, max):
    print(f"{description}: {tf - t0:.3f} seconds")
    assert tf - t0 <= max, f"{description!r} FAILED: {tf - t0:.3f} > {max:.3f}"


############
############
+ Test data

= Define the `F` final packet class.

class F(Packet):
    fields_desc = [
        ByteField(name="value", default=0),
    ]
    def extract_padding(self, s):
        return (b'', s)

= Define the `I` intermediate packet class.

class I(Packet):
    NUMBER_OF_F_PER_I = 100
    fields_desc = [
        PacketField(name=f"f{i}", pkt_cls=F, default=F(value=i))
        for i in range(NUMBER_OF_F_PER_I)
    ]
    def extract_padding(self, s):
        return (b'', s)

= Define the `M` main packet class.

class M(Packet):
    NUMBER_OF_I_PER_M = 100
    fields_desc = [
        PacketField(name=f"i{i}", pkt_cls=I, default=I())
        for i in range(NUMBER_OF_I_PER_M)
    ]

= Define `MAX_INSTANTIATION_TIME_EXPECTED`.

MAX_INSTANTIATION_TIME_EXPECTED = 1.0

= Define `MAX_SERIALIZATION_TIME_EXPECTED`.

MAX_SERIALIZATION_TIME_EXPECTED = 0.250

= Define `MAX_PARSING_TIME_EXPECTED`.

MAX_PARSING_TIME_EXPECTED = 0.250


############
############
+ Default instantiations

= Build a default instance of `F`.

t0 = time.time()
f = F()
tf = time.time()
check_exec_time("Default instantiation of `F`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)

= Build a default instance of `F` again.

t0 = time.time()
f = F()
tf = time.time()
check_exec_time("Default instantiation of `F`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)

= Build a default instance of `I`.

t0 = time.time()
i = I()
tf = time.time()
check_exec_time("Default instantiation of `I`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)

= Build a default instance of `B` again.

t0 = time.time()
i = I()
tf = time.time()
check_exec_time("Default instantiation of `I`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)

= Build a default instance of `M`.

t0 = time.time()
m = M()
tf = time.time()
check_exec_time("Default instantiation of `M`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)

= Build a default instance of `M` again.

t0 = time.time()
m = M()
tf = time.time()
check_exec_time("Default instantiation of `M`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)


############
############
+ Serialization

= Launch serialization from the latest instance of `M` created.

t0 = time.time()
buf = m.build()
tf = time.time()
check_exec_time("Serialization of `M`", t0, tf, MAX_SERIALIZATION_TIME_EXPECTED)


############
############
+ Parsing

= Parse the buffer serialized before with the latest instance of `M` created.

t0 = time.time()
m.dissect(buf)
tf = time.time()
check_exec_time("Parsing of `M`", t0, tf, MAX_PARSING_TIME_EXPECTED)

Actual result

No response

Expected result

No response

Related resources

No response

@alxroyer-thales
Copy link
Author

Possibly related to #3894?

@alxroyer-thales
Copy link
Author

Analysis

Let N the number of packet fields at a given stage.
Let D (depth) the number of stages of nested packet fields.
The expected complexity for this kind of packets should be O(N ^ D).

As far as I've noticed, this is due to useless copies of packets during
initializations, parsing and serialization operations.

Let K the factor of time lost induced by these useless instantiations.
It leads to the following complexity degradation: O((K.N) ^ (K.D)).

@alxroyer-thales
Copy link
Author

Successfully fixed in a Scapy wrapper.

I'll provide a PR soon to propose the integration of it directly in Scapy implementation.

Notes:

@gpotter2
Copy link
Member

Thanks for the report ! And thank you for the future PR.

alxroyer-thales added a commit to ThalesGroup/scapy that referenced this issue Apr 18, 2025
alxroyer-thales added a commit to ThalesGroup/scapy that referenced this issue Apr 18, 2025
alxroyer-thales added a commit to ThalesGroup/scapy that referenced this issue Apr 18, 2025
…4706, secdev#4707)

- secdev#4705: Performance issues with nested `PacketField`s:
    - Cache saved in `Packet.self_build()`.
    - `Packet`s marked as explicit systematically in `Packet.build()`
      to avoid a useless clone in `do_build()`.
    - `Packet.do_init_cached_fields()` optimized:
        - `init_fields` parameter added to `Packet.init_fields()` and
          `do_init_cached_fields()`.
        - Misc: `do_init_cached_fields()` fixed for list field copies.
    - `Packet.prepare_cached_fields()` optimized:
        - Default fields not copied anymore.
    - `Packet.copy()` and `clone_with()` optimized with `_fast_copy()`:
        - Avoid instantiating *ref fields* with default instantiation.
        - `Packet.copy_fields_dict()`
          optimized with new `use_fields_already_set` parameter
          and reworked as an inner method to work directly
          on the `clone` instance.
        - Misc: `_T` variable type at the module level made local
          and renamed as `_VarDictFieldType` in `_fast_copy()`.
    - `Packet.setfieldval()` optimized:
        - New value discarded if unchanged.:
    - `_PacketField.m2i()` optimized:
        - Avoid useless copies of fields with default instance.
        - Then `dissect()` is used on it.
- secdev#4707: [enhancement] Enable clear_cache() to spread upward
  in the packet tree:
    - `Packet.clear_cache()` reworked.
- secdev#4706: Force cache reset as soon as the payload or a packet field
  is modified:
    - `Packet.clear_cache()` (reworked as per secdev#4707) called in
      `setfieldval()`, `delfieldval()`, `add_payload()` and
      `remove_payload()`.
    - `Packet.raw_packet_cache_fields` now useless, thus removed.
    - Same for `Packet._raw_packet_cache_field_value()`.
    - `Packet.self_build()` simplified by the way.
    - `Packet.no_cache` flag added to avoid setting `raw_packet_cache`
      when specified, used in `Packet.self_build()` and `do_dissect()`.
alxroyer-thales added a commit to ThalesGroup/scapy that referenced this issue Apr 18, 2025
Useless now that `Packet.init_fields()` has a `for_dissect_only` parameter.
alxroyer-thales added a commit to ThalesGroup/scapy that referenced this issue Apr 23, 2025
- Set step numbers.
- Clarify max times.
- Use regular parsing through instantiation in step 014.
alxroyer-thales added a commit to ThalesGroup/scapy that referenced this issue Apr 23, 2025
- Test configurations extracted as global constants.
- Base `TestPacket` class added. Made `M`, `I` and `F` inherit from it.
- Max time computed from expected number of operations.
- Default instantiations ran once (not twice anymore).
- Serialization test cases added to highlight cache improvements.
alxroyer-thales added a commit to ThalesGroup/scapy that referenced this issue Apr 23, 2025
…4706, secdev#4707)

- secdev#4705: Performance issues with nested `PacketField`s:
    - Cache saved in `Packet.self_build()`.
    - `Packet`s marked as explicit systematically in `Packet.build()`
      to avoid a useless clone in `do_build()`.
    - `Packet.do_init_cached_fields()` optimized:
        - `init_fields` parameter added to `Packet.init_fields()` and
          `do_init_cached_fields()`.
        - Misc: `do_init_cached_fields()` fixed for list field copies.
    - `Packet.prepare_cached_fields()` optimized:
        - Default fields not copied anymore.
    - `Packet.copy()` and `clone_with()` optimized with `_fast_copy()`:
        - Avoid instantiating *ref fields* with default instantiation.
        - `Packet.copy_fields_dict()`
          optimized with new `use_fields_already_set` parameter
          and reworked as an inner method to work directly
          on the `clone` instance.
        - Misc: `_T` variable type at the module level made local
          and renamed as `_VarDictFieldType` in `_fast_copy()`.
    - `Packet.setfieldval()` optimized:
        - New value discarded if unchanged.:
    - `_PacketField.m2i()` optimized:
        - Avoid useless copies of fields with default instance.
        - Then `dissect()` is used on it.
- secdev#4707: [enhancement] Enable clear_cache() to spread upward
  in the packet tree:
    - `Packet.clear_cache()` reworked.
- secdev#4706: Force cache reset as soon as the payload or a packet field
  is modified:
    - `Packet.clear_cache()` (reworked as per secdev#4707) called in
      `setfieldval()`, `delfieldval()`, `add_payload()` and
      `remove_payload()`.
    - `Packet.raw_packet_cache_fields` now useless, thus removed.
    - Same for `Packet._raw_packet_cache_field_value()`.
    - `Packet.self_build()` simplified by the way.
    - `Packet.no_cache` flag added to avoid setting `raw_packet_cache`
      when specified, used in `Packet.self_build()` and `do_dissect()`.
alxroyer-thales added a commit to ThalesGroup/scapy that referenced this issue Apr 23, 2025
Useless now that `Packet.init_fields()` has a `for_dissect_only` parameter.
@alxroyer-thales
Copy link
Author

@gpotter2 Draft-PR #4727 initiated.
Could you please send me your remarks on it? so that I can make it clean.
Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants