Skip to content

Fix done message parse #30

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

zhaofeng0019
Copy link

Though the document of kernel (https://kernel.org/doc/html/next/userspace-api/netlink/intro.html#netlink-message-types) specify the format of netlink message with NLMSG_DONE, it also says that "Note that some implementations may issue custom NLMSG_DONE messages in reply to do action requests. In that case the payload is implementation-specific and may also be absent.".

After searching the source code of kernel, we can find that

  1. the formant specified in the document is obeyed by most generic netlink but some generic netlink like this (https://elixir.bootlin.com/linux/v6.15/source/drivers/net/team/team_core.c#L2494) has no payload, so as a generic lib, we should not suppose the format of DoneMessage. it's sensible to just save the payload.

  2. when use NLMSG_DONE as an end of multi messages, there will always be a NLM_F_MULTIPART in the flag and only in this case should we parse it as a DoneMessage, in other occassion like connector netlink (https://elixir.bootlin.com/linux/v6.15/source/drivers/connector/connector.c#L101), we should parse it as a common message.

@cathay4t
Copy link
Member

cathay4t commented Jul 5, 2025

@zhaofeng0019 Thanks for the detailed explanation!

Please give me sometime to test this PR as this might impact all other rust-netlink crates on handling DONE message.

Meanwhile:

  1. Please fix the cargo fmt CI failure.
  2. Please provide example on kernel code using custom NLMSG_DONE messages instead of 0 size blob.

@cathay4t
Copy link
Member

cathay4t commented Jul 5, 2025

It is not clear to me why current code is not compliant with linux kernel:

pub struct DoneMessage {
    pub code: i32,
    pub extended_ack: Vec<u8>,
}

Could you explain on what use case this become a problem?

Though the document of kernel (https://kernel.org/doc/html/next/userspace-api/netlink/intro.html#netlink-message-types)
specify the format of netlink message with NLMSG_DONE, it also says that "Note that some implementations may issue custom NLMSG_DONE messages in reply to do action requests.
In that case the payload is implementation-specific and may also be absent.".

After searching the source code of kernel, we can find that
1. the format specified in the document is obeyed by most generic
   netlink but some generic netlink like this (https://elixir.bootlin.com/linux/v6.15/source/drivers/net/team/team_core.c#L2494) has no payload,
   so as a generic lib, we should not suppose the format of DoneMessage.
   it's sensible to just save the payload.

2. when use NLMSG_DONE as an end of multi messages, there will always be
   a NLM_F_MULTIPART in the flag and only in this case should we parse it
   as a DoneMessage, in other occassion like connector netlink
   (https://elixir.bootlin.com/linux/v6.15/source/drivers/connector/connector.c#L101),
   we should parse it as a common message.

Signed-off-by: feng zhao <[email protected]>
@zhaofeng0019 zhaofeng0019 force-pushed the fix-done-message-parse branch from c9afcf8 to a92e93a Compare July 6, 2025 00:46
@zhaofeng0019
Copy link
Author

zhaofeng0019 commented Jul 6, 2025

@zhaofeng0019 Thanks for the detailed explanation!

Please give me sometime to test this PR as this might impact all other rust-netlink crates on handling DONE message.

Meanwhile:

  1. Please fix the cargo fmt CI failure.
  2. Please provide example on kernel code using custom NLMSG_DONE messages instead of 0 size blob.
  1. fixed
  2. it seems that in current kernel tree, custom NLMSG_DONE messages instead of 0 size blob are all without NLM_F_MULTIPART, which can be covered by flag check

so what's your opinion? maybe a DoneBuffer which can endure empty buffer to avoid break change? It's fine to me.
BTW,some third-party kernel module use netlink to communicate can still use any format they want to mark an end for mulit messages.

@cathay4t
Copy link
Member

cathay4t commented Jul 6, 2025

Let me consult our kernel team for this next week.

@cathay4t
Copy link
Member

cathay4t commented Jul 6, 2025

I am a no-issue-no-change guy. Do we have any actual issue for current code or just your imaginary based on kernel document and code?

Of course, if you are coding a proprietary kernel module, no need to share the detail.

@zhaofeng0019
Copy link
Author

I am a no-issue-no-change guy. Do we have any actual issue for current code or just your imaginary based on kernel document and code?

Of course, if you are coding a proprietary kernel module, no need to share the detail.

try this

use std::{error::Error, fmt};

use netlink_packet_core::{
    NLMSG_DONE, NetlinkDeserializable, NetlinkHeader, NetlinkMessage, NetlinkPayload,
    NetlinkSerializable,
};

use netlink_sys::{Socket, SocketAddr, protocols::NETLINK_CONNECTOR};

#[derive(Debug, Clone)]
pub struct CnMessage {
    pub idx: u32,
    pub val: u32,
    pub seq: u32,
    pub ack: u32,
    pub len: u16,
    pub flags: u16,
    pub data: Vec<u8>,
}

#[derive(Debug, Clone, Eq, PartialEq)]
pub struct DeserializeError(&'static str);
impl Error for DeserializeError {
    fn description(&self) -> &str {
        self.0
    }
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        None
    }
}

impl fmt::Display for DeserializeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl NetlinkDeserializable for CnMessage {
    type Error = DeserializeError;

    fn deserialize(_header: &NetlinkHeader, payload: &[u8]) -> Result<Self, Self::Error> {
        let idx = u32::from_ne_bytes(payload[0..4].try_into().unwrap());
        let val = u32::from_ne_bytes(payload[4..8].try_into().unwrap());
        let seq = u32::from_ne_bytes(payload[8..12].try_into().unwrap());
        let ack = u32::from_ne_bytes(payload[12..16].try_into().unwrap());
        let len = u16::from_ne_bytes(payload[16..18].try_into().unwrap());
        let flags = u16::from_ne_bytes(payload[18..20].try_into().unwrap());
        let date = payload[20..(20 + len as usize)].to_vec();
        Ok(Self {
            idx,
            val,
            seq,
            ack,
            len,
            flags,
            data: date,
        })
    }
}

impl NetlinkSerializable for CnMessage {
    fn message_type(&self) -> u16 {
        NLMSG_DONE
    }

    fn buffer_len(&self) -> usize {
        20 + self.data.len()
    }

    fn serialize(&self, buffer: &mut [u8]) {
        buffer[0..4].copy_from_slice(&self.idx.to_ne_bytes());
        buffer[4..8].copy_from_slice(&self.val.to_ne_bytes());
        buffer[8..12].copy_from_slice(&self.seq.to_ne_bytes());
        buffer[12..16].copy_from_slice(&self.ack.to_ne_bytes());
        buffer[16..18].copy_from_slice(&self.len.to_ne_bytes());
        buffer[18..20].copy_from_slice(&self.flags.to_ne_bytes());
        buffer[20..(20 + self.data.len())].copy_from_slice(&self.data);
    }
}

fn test_cn_proc() {
    let mut socket = Socket::new(NETLINK_CONNECTOR).unwrap();
    socket.bind(&SocketAddr::new(0, 1)).unwrap();
    let cn_msg_req = CnMessage {
        idx: 1,
        val: 1,
        len: 4,
        seq: 1,
        ack: 1,
        flags: 1,
        data: vec![1, 0, 0, 0],
    };
    let mut request = NetlinkMessage::new(
        NetlinkHeader::default(),
        NetlinkPayload::InnerMessage(cn_msg_req),
    );
    request.finalize();
    let mut buff = vec![0; request.buffer_len()];
    request.serialize(&mut buff);
    socket.send(&buff, 0).unwrap();
    let mut recv_buff = Vec::with_capacity(128);
    let l = socket.recv(&mut recv_buff, 0).unwrap();
    let message = NetlinkMessage::<CnMessage>::deserialize(&recv_buff[0..l]).unwrap();
    println!("{message:?}");

    let cn_msg_req = CnMessage {
        idx: 1,
        val: 1,
        len: 4,
        seq: 1,
        ack: 1,
        flags: 1,
        data: vec![2, 0, 0, 0],
    };
    let mut request = NetlinkMessage::new(
        NetlinkHeader::default(),
        NetlinkPayload::InnerMessage(cn_msg_req),
    );
    request.finalize();
    let mut buff = vec![0; request.buffer_len()];
    request.serialize(&mut buff);
    socket.send(&buff, 0).unwrap();
}

fn main() {
    test_cn_proc();
}

@zhaofeng0019
Copy link
Author

zhaofeng0019 commented Jul 6, 2025

I am a no-issue-no-change guy. Do we have any actual issue for current code or just your imaginary based on kernel document and code?

Of course, if you are coding a proprietary kernel module, no need to share the detail.

if you like, i can still give you an example that a DoneMessage without payload thus fail to be parsed. it's easy to write according to the kernel code.

try this

prerequisite

sudo nmcli connection add type team con-name team0 ifname team0 config '{"runner": {"name": "activebackup"}}'
sudo nmcli connection up team0
sudo ip a

get the index of team0

// change to the index of team0 get by ip
const TEAM_INDEX: u32 = 3;
use netlink_packet_core::{NLM_F_REQUEST, NetlinkMessage, NetlinkPayload};

use netlink_packet_generic::{
    GenlFamily, GenlMessage,
    ctrl::{GenlCtrl, GenlCtrlCmd, nlas::GenlCtrlAttrs},
};
use netlink_packet_utils::nla::Nla;
use netlink_sys::{Socket, protocols::NETLINK_GENERIC};

use genetlink::message::RawGenlMessage;

fn resolve(family_name: &str) -> Option<u16> {
    let mut genlmsg: GenlMessage<GenlCtrl> = GenlMessage::from_payload(GenlCtrl {
        cmd: GenlCtrlCmd::GetFamily,
        nlas: vec![GenlCtrlAttrs::FamilyName(family_name.to_owned())],
    });
    genlmsg.finalize();
    let mut nlmsg = NetlinkMessage::from(genlmsg);
    nlmsg.header.flags = NLM_F_REQUEST;
    nlmsg.finalize();
    let socket = Socket::new(NETLINK_GENERIC).unwrap();
    let mut buff = vec![0; nlmsg.buffer_len()];
    nlmsg.serialize(&mut buff);
    socket.send(&buff, 0).unwrap();
    let mut recv_buff = Vec::with_capacity(1024);
    loop {
        recv_buff.clear();
        let l = socket.recv(&mut recv_buff, 0).unwrap();
        let message =
            NetlinkMessage::<GenlMessage<GenlCtrl>>::deserialize(&recv_buff[0..l]).unwrap();
        if let NetlinkPayload::InnerMessage(msg) = message.payload {
            let family_id = msg.payload.nlas.iter().find_map(|nla| {
                if let GenlCtrlAttrs::FamilyId(id) = nla {
                    Some(*id)
                } else {
                    None
                }
            });
            if family_id.is_some() {
                return family_id;
            }
        }
    }
}



#[derive(Debug, Clone)]
struct TeamIndex {
    pub index: u32,
}

impl Nla for TeamIndex {
    fn value_len(&self) -> usize {
        4
    }

    fn kind(&self) -> u16 {
        1
    }

    fn emit_value(&self, buffer: &mut [u8]) {
        buffer[0..4].copy_from_slice(&self.index.to_ne_bytes());
    }
}

impl GenlFamily for TeamIndex {
    fn family_name() -> &'static str {
        "team"
    }

    fn command(&self) -> u8 {
        3
    }

    fn version(&self) -> u8 {
        1
    }
}

fn test_team() {
    let family_id = resolve("team").unwrap();
    let nla = TeamIndex { index: TEAM_INDEX };
    let mut gen_msg: GenlMessage<TeamIndex> = GenlMessage::from_payload(nla);
    gen_msg.set_resolved_family_id(family_id);
    let mut nlmsg = NetlinkMessage::from(gen_msg);
    nlmsg.header.flags = NLM_F_REQUEST;
    nlmsg.finalize();

    let socket = Socket::new(NETLINK_GENERIC).unwrap();
    let mut buff = vec![0; nlmsg.buffer_len()];
    nlmsg.serialize(&mut buff);
    socket.send(&buff, 0).unwrap();
    let mut recv_buff = Vec::with_capacity(128);
    recv_buff.clear();
    let l = socket.recv(&mut recv_buff, 0).unwrap();

    let mut parsed = 0;
    while parsed < l {
        let message = NetlinkMessage::<RawGenlMessage>::deserialize(&recv_buff[parsed..]);
        println!("{message:?}");
        /*
                will get Done message with length 0

                Caused by:
            Decode error occurred: invalid DoneBuffer: length is 0 but DoneBuffer are at least 4 bytes })
        Err(DecodeError { inner: failed to parse NLMSG_DONE

                 */
        if let Ok(msg) = message {
            parsed += msg.buffer_len();
        } else {
            break;
        }
    }
}

fn main() {
    test_team();
}

clean up

sudo nmcli connection down team0
sudo nmcli connection delete team0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants