CYW43439

The CYW43439 is a single-chip 802.11b, 802.11g, and 802.11n radio with integrated Bluetooth 5.2 compatibility from Infineon.

This chip is used, for example, on the Raspberry Pi Pico W (Wireless) microcontroller board.

Infineon acquired Cypress, which had earlier acquired Broadcom's Wireless Internet of Things business. I mention this because documentation and source code often refer to Cypress, and frequently to Broadcom.

Architecture

The CYW43439 is a system on chip (SoC) which has its own CPU (ARM Cortex M3), RAM and ROM:

            +-------------------------------------------+
clock ----→ | ARM Cortex M3         ROM                 |
sdio  ←---→ | RAM                   Wireless interface  |
power ----→ | Power Management      Bluetooth           |
            |                       I/O                 |
            +-------------------------------------------+

The processor is a 32-bit ARM Cortex M3 RISC processor. It runs an embedded operating system named HNDRTE, which is from Infineon. My understanding is that the device only contains a minimal boot ROM which brings up the device. The host, which is the device that the CYW43439 is connected to or mounted on (for example a Raspberry Pi Pico W), is responsible for copying the ARM firmware into the device's memory. More on this later.

If we take the Pi Pico W as an example, I think it communicates with the CYW43439 using SPI or SDIO to send and receive frames. Control requests are also passed from the host to the CYW43439, which answers with device responses.

The choice between SDIO and SPI is made by the strapping option SDIO_DATA_2, the WLAN host interface select. If it is 1 (the default) SDIO is chosen, and if it is 0 gSPI is selected.

There is a host driver (DHD) that presents a driver interface. There is a bus for this communication and there is a control channel and a data channel. The message format is BDC protocol. TODO: what is this BDC protocol?

The transfers between the host and the CYW43439 are message based (framed). The SDIO bus is an addressable bus and each message must contain an explicit device target address.

SDIO

The following functions are supported:

  • Function 0, standard SDIO function
  • Function 1, backplane function access to the internal System-on-Chip (SoC)
  • Function 2, WLAN function for efficient WLAN packet transfer through DMA
  • Function 3, BT function for efficient Bluetooth packet transfer through DMA

Function 1 enables access to the internal address space of the device.

Boot sequence

Upon power-up, the gSPI host has to wait 50 ms for the device to come out of reset. Whether the device is ready is checked by reading the gSPI register at address 0x14, which is a function 0 register (more on functions later).
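The readiness check can be sketched as a small polling loop. This is only an illustration: `wait_for_bus` and the `read_test_reg` closure are hypothetical names standing in for a real function 0 read of address 0x14, and the expected value 0xFEEDBEAD is the default listed for the test register in the gSPI register table further down.

```rust
/// Value the read-only gSPI test register (address 0x14) is documented to
/// return by default, per the register table in these notes.
const TEST_PATTERN: u32 = 0xFEED_BEAD;

/// Polls the test register until it returns the expected pattern.
///
/// `read_test_reg` is a hypothetical stand-in for a real function 0 read of
/// address 0x14. Returns the number of reads that were needed, or `None` if
/// the pattern never showed up within `max_attempts` reads.
fn wait_for_bus(mut read_test_reg: impl FnMut() -> u32, max_attempts: u32) -> Option<u32> {
    for attempt in 1..=max_attempts {
        if read_test_reg() == TEST_PATTERN {
            return Some(attempt);
        }
        // A real driver would sleep for a short while here before retrying.
    }
    None
}
```

A caller would pass a closure that performs the actual bus read and treat `None` as "the device never came out of reset".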

SDPCM

This is a protocol, and in https://infineon.github.io/wifi-host-driver/html/index.html#overview we find this:

The Bus layer provides bus level protocol handling. For SDIO, a bus protocol
known as SDPCM is used.

SDPCM - SDIO/SPI Bus Layer SDPCM Layer takes care of:
* Adding a Sequence number to packet sent to WLAN Chip
* Flow control between WHD and WLAN Chip

So this adds credit management. Is that perhaps what CM stands for? SDIO Packet and Credit Management layer? SDIO Packet Control Management layer?
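To make the idea of sequence numbers plus credit-based flow control concrete, here is a minimal sketch. The field names mirror `sdpcm_seq` and `sdpcm_seq_max` from the Runner struct shown later in these notes, but the `Credit` type, its methods, and the exact comparison are my assumptions about how such a scheme can work, not a description of the SDPCM specification.

```rust
/// Sketch of credit-based flow control with wrapping 8-bit sequence numbers.
/// `seq` is the next sequence number the host will use and `seq_max` is the
/// limit the device has most recently granted (assumed semantics).
struct Credit {
    seq: u8,
    seq_max: u8,
}

impl Credit {
    /// The host may send while `seq` has not caught up with `seq_max`.
    /// The subtraction wraps around, and a distance with the top bit set
    /// is treated as "behind", meaning no credit.
    fn has_credit(&self) -> bool {
        let distance = self.seq_max.wrapping_sub(self.seq);
        distance != 0 && (distance & 0x80) == 0
    }

    /// Consumes one credit and returns the sequence number for the next frame.
    fn next_seq(&mut self) -> Option<u8> {
        if !self.has_credit() {
            return None;
        }
        let seq = self.seq;
        self.seq = self.seq.wrapping_add(1);
        Some(seq)
    }

    /// Records a new limit reported by the device.
    fn update(&mut self, seq_max: u8) {
        self.seq_max = seq_max;
    }
}
```

With `seq: 0, seq_max: 1` exactly one frame can be sent before the host has to wait for the device to grant more credit.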

Country Locale Matrix (CLM)

This is a database maintained by Cypress and contains regulatory configuration information like target power outputs of WiFi modules with respect to country, bands, data rates, and channels. This blob is loaded dynamically by the Cypress FMAC driver on module activation.

generic SPI (gSPI)

This is a signaling mode, in addition to the normal signaling modes. It provides fixed delays for responses and data from the device, alignment of host gSPI frames (16 or 32 bits), little/big endian configuration, and packet transfer through DMA for WLAN.

This mode is selected with the strapping option SDIO_DATA_2. If this is 1, which is the default, SDIO is used. Pulling it low enables gSPI.

    RPi Pico W                                CYW43439 (WiFi device)
  +-----------------+                      +----------------------+
  |     Host        |     SCLK             |      Device          |
  |                 |--------------------->|                      |
  |                 |     DI               |                      |
  |                 |<-------------------->|                      |
  |                 |     DO               |                      |
  |                 |<-------------------->|                      |
  |                 |     IRQ              |                      |
  |                 |<-------------------->|                      |
  |                 |     CS               |                      |
  |                 |<-------------------->|                      |
  |                 |                      |                      |
  |                 |                      |                      |
  |                 |                      |                      |
  +-----------------+                      +----------------------+

DI = Data Input
DO = Data Output
CS = Chip select

gSPI commands are 32-bit words:

31                                                         0
 +---------------------------------------------------------+
 |C|A|F1|F0| Address (17 bits)   | Packet length (11 bits) |
 +---------------------------------------------------------+

C = Command: 0=Read, 1= Write
A = Access: 0=Fixed, 1=Incremental address
F1/F0 = Function number:
          Func 0 (F0) 00=All SPI-specific registers
          Func 1 (F1) 01=Registers and memory belonging to other blocks on the device (the backplane?)
          Func 2 (F2) 10=DMA channel 1. WLAN packets up to 2048 bytes
          Func 3 (F3) 11=DMA channel 2 (optional), like Func 2 up to 2048 bytes

In the documentation I was confused by the references to F0, F1, F2, and F3, but I think they refer to the SDIO functions above. A command is a read or a write, and the function it targets is specified in the function bits.
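As a way of checking my reading of the field layout, here is a small sketch that splits a 32-bit command word back into its fields. `GspiCommand` and `decode_command` are names I made up for illustration; the bit positions are the ones from the diagram above (1 + 1 + 2 + 17 + 11 bits).

```rust
/// The fields of a 32-bit gSPI command word (illustrative type).
#[derive(Debug, PartialEq)]
struct GspiCommand {
    write: bool,       // C: false = read, true = write
    incremental: bool, // A: false = fixed address, true = incremental address
    function: u8,      // F1/F0: function number 0..=3
    address: u32,      // 17-bit address
    length: u16,       // 11-bit packet length
}

/// Splits a command word into its fields according to the layout above.
fn decode_command(word: u32) -> GspiCommand {
    GspiCommand {
        write: (word >> 31) & 1 == 1,
        incremental: (word >> 30) & 1 == 1,
        function: ((word >> 28) & 0b11) as u8,
        address: (word >> 11) & 0x1_FFFF,
        length: (word & 0x7FF) as u16,
    }
}
```

For example, `decode_command(0xC000_0004)` describes an incremental write to function 0, address 0, with a length of 4.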

Notice that the addresses are only 17-bits. The chip actually uses 32-bit addresses though. So are some of the addresses inaccessible?
No, it turns out there is a windowing scheme used to help with the addressing.

So reading from one of the SPI registers would look something like this, I think:

31                                                         0
 +-----------------------------------------------------------+
 |0|0|00|   0x0002               | 0                         |
 +-----------------------------------------------------------+
          0x0002 = Status Enable   Read field/bit 0

After a read/write transaction the gSPI interface supports a status notification to the host, which includes information about packet errors, protocol errors, etc. This is made available on the Data Input line as a 32-bit word with the following format:

 Bit 0: Data Not Available
 Bit 1: Underflow
 Bit 2: Overflow
 Bit 3: F2 Interrupt
 Bit 4: Reserved?
 Bit 5: F2 RX ready         - F2 FIFO ready to receive data (the FIFO is empty)
 Bit 6: Reserved?
 Bit 7: Reserved
 Bit 8: F2 packet available - Packet is available in F2 TX FIFO
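The status bits listed above can be decoded with a few bit tests. This is a sketch only: the `Status` struct and `decode_status` are hypothetical names, and only the bits named in the list are covered (the word carries more information than is shown here).

```rust
/// Flags from the 32-bit gSPI status word, limited to the bits listed above.
#[derive(Debug, PartialEq, Default)]
struct Status {
    data_not_available: bool,  // bit 0
    underflow: bool,           // bit 1
    overflow: bool,            // bit 2
    f2_interrupt: bool,        // bit 3
    f2_rx_ready: bool,         // bit 5
    f2_packet_available: bool, // bit 8
}

/// Extracts the documented flag bits from a raw status word.
fn decode_status(word: u32) -> Status {
    let bit = |n: u32| (word & (1 << n)) != 0;
    Status {
        data_not_available: bit(0),
        underflow: bit(1),
        overflow: bit(2),
        f2_interrupt: bit(3),
        f2_rx_ready: bit(5),
        f2_packet_available: bit(8),
    }
}
```

A status word of 0x120 would, under this reading, mean that the F2 FIFO is ready to receive and that an F2 packet is available.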

gSPI Register

Section 4.2.3:

Address  Bit position     Name
0x0000   0                Word length      0=16-bit word (default), 1=32-bit word
         1                Endianness       0=little (default), 1=big
         4                High speed mode  1=High speed (default)
         5                Interrupt polarity 1=Active high (default)
         7                Wake-up          1=Wake up command from host. 0 (default)
0x0002   0                Status enable    0=No status sent after read/write, 1=status sent after read/write (default)
         1                Interrupt with status
0x0003   Reserved
0x0004                    Interrupt register
         0                Request data not available
         1                F2/F3 FIFO underflow from last read
         2                F2/F3 FIFO overflow from last write
         5                F2 packet available
         6                F3 packet available
         7                F1 FIFO overflow from last write
0x0005                    Interrupt register
         5                F1 Interrupt 
         6                F2 Interrupt 
         7                F3 Interrupt 
0x0006                    Interrupt register enable
0x0007                    Interrupt register enable
0x0008-0x000B             Status Register
...
0x0014                    Test-Read only register (default=0xFEEDBEAD) used by the host to check if the gSPI interface is working as expected
...
0x0018                    Test-R/W Register Can be used by the host to check if the gSPI interface is working as expected.

Take the following example from embassy/cyw43:

const REG_BUS_CTRL: u32 = 0x0;
...

  self.write32_swapped(REG_BUS_CTRL, 0x00010031).await;

If we look at write32_swapped it does the following:

    async fn write32_swapped(&mut self, addr: u32, val: u32) {
        let cmd = cmd_word(true, true, FUNC_BUS, addr, 4);

        self.spi
            .transaction(|bus| {
                let bus = unsafe { &mut *bus };
                async {
                    bus.write(&[swap16(cmd), swap16(val)]).await?;
                    Ok(())
                }
            })
            .await
            .unwrap();
    }

fn swap16(x: u32) -> u32 {                                                      
    x.rotate_left(16)                                                           
}

fn cmd_word(write: bool, incr: bool, func: u32, addr: u32, len: u32) -> u32 {   
    (write as u32) << 31 | (incr as u32) << 30 | (func & 0b11) << 28 | (addr & 0x1FFFF) << 11 | (len & 0x7FF)
} 

Let's go through this. We are basically calling cmd_word with the following arguments:

  cmd_word(true, true, FUNC_BUS, REG_BUS_CTRL, 4)
                          0           0

(true as u32) = 1 so that becomes
  (1 << 31)
  10000000000000000000000000000000
And incr is also true so that will become
  (1 << 30)
  01000000000000000000000000000000
| 10000000000000000000000000000000
  11000000000000000000000000000000
After that we have func:
   (0 & 0b11) << 28
  00000000000000000000000000000000
| 11000000000000000000000000000000
  11000000000000000000000000000000
After that we have addr which is REG_BUS_CTRL which is 0:
   (0 & 0x1FFFF) << 11
  00000000000000000000000000000000
| 11000000000000000000000000000000
  11000000000000000000000000000000
And finally we have the len which is 4:
  0b100 & 0x7FF
  000000000100 & 011111111111 = 000000000100
  00000000000000000000000000000100
| 11000000000000000000000000000000
  11000000000000000000000000000100

So this function will return:
cmd = 11000000000000000000000000000100
0.273836 INFO  cmd: 1001100000000000000

And that will map to the following fields in the command packet:
      C A F1 F0  Address           Packet Length
cmd = 1 1 0  0   00000000000000000 00000000100

Now, the value being written is 0x00010031, which in binary is 10000000000110001.

This is a log running the example in embassy-cyw43:

0.273811 INFO  before setting word_length 32 (little endian)
└─ cyw43::{impl#4}::init::{async_fn#0} @ /home/danielbevenius/work/iot/embassy/cyw43/src/fmt.rs:138
0.273835 INFO  cmd: 11000000000000000000000000000100, swapped: 1001100000000000000
└─ cyw43::{impl#4}::write32_swapped::{async_fn#0} @ /home/danielbevenius/work/iot/embassy/cyw43/src/fmt.rs:138
0.273870 INFO  val: 10000000000110001, swapped: 1100010000000000000001
└─ cyw43::{impl#4}::write32_swapped::{async_fn#0} @ /home/danielbevenius/work/iot/embassy/cyw43/src/fmt.rs:138

So if we assume that we are sending these bits using the SPI write protocol and using 16-bit word operation little endian (which I think might be the default) then our bytes should be packed like this:

  +--+--+--+--+
  |C1|C0|C3|C2|
  +--+--+--+--+

cmd:         11000000000000000000000000000100
rotated(16): 00000000000001001100000000000000

If we split cmd into bytes and rotate by 16 bits we get:

   C3     C2         C1       C0
11000000 00000000 00000000 00000100
   
   C1      C0        C3       C2
00000000 00000100 11000000 00000000

We can do the same with the data to be written

   D3      D2        D1       D0
00000000 00000001 00000000 00110001

   D1      D0        D3       D2 
00000000 00110001 00000000 00000001
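The numbers above can be checked with a few lines of Rust. `cmd_word` and `swap16` are copied from the embassy snippet shown earlier; `rotated_bytes` is a helper I added just for this illustration, and it lists the bytes of the rotated word most significant first so that they line up with the C1 C0 C3 C2 diagram. It does not claim anything about the order in which the SPI peripheral actually shifts them out.

```rust
/// Builds a gSPI command word (same expression as in the embassy snippet).
fn cmd_word(write: bool, incr: bool, func: u32, addr: u32, len: u32) -> u32 {
    (write as u32) << 31 | (incr as u32) << 30 | (func & 0b11) << 28 | (addr & 0x1FFFF) << 11 | (len & 0x7FF)
}

/// Swaps the two 16-bit halves of a 32-bit word.
fn swap16(x: u32) -> u32 {
    x.rotate_left(16)
}

/// Illustration helper: the bytes of the rotated word, most significant first.
fn rotated_bytes(x: u32) -> [u8; 4] {
    swap16(x).to_be_bytes()
}
```

With these, `cmd_word(true, true, 0, 0, 4)` is 0xC0000004, which rotates to 0x0004C000, and the value 0x00010031 rotates to 0x00310001. Both agree with the `swapped:` values in the log above.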

When we create the command we specify len, which is the size of the packet to be written. It is 4, which I think is in bytes because the function is named write32_swapped. Looking at the write protocol it seems to me that the above will indeed set Word Length to 1, but it will also set other fields (all to their default values):

                                  Status enable (1=default)
   D1      D0        D3       D2  ↓
00000000 00110001 00000000 00000001
           ↑↑  ↑↑                 ↑
           ||  |Word Length (1=32-bit)
           ||  |
           ||  Endianness (0=Little)
           ||
           |High-speed mode (1=High speed (default))
           |
           Interrupt polarity (1=high (default))

Could this instead be changed to use a constant like

const WORD_LENGTH_32: u32 = 0x1;

And then the call to write32_swapped could be updated to be:

self.write32_swapped(REG_BUS_CTRL, WORD_LENGTH_32).await;

   D3      D2        D1       D0
00000000 00000000 00000000 00000001
   D1      D0        D3       D2
00000000 00000001 00000000 00000000
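One thing to keep in mind with a lone WORD_LENGTH_32 constant is that the write covers all four bytes, so every other field would be written as 0, which differs from the defaults listed in the register table (high speed, interrupt polarity, and status enable all default to 1). A middle ground, sketched below, is to name each bit and OR them together. Apart from WORD_LENGTH_32 the constant names are my own invention, and the bit positions are taken from the register table above, with the register at address 0x0002 landing in bits 16 and up of the 32-bit value.

```rust
// Bits of the register at address 0x0000 (bits 0..=7 of the 32-bit write).
const WORD_LENGTH_32: u32 = 1 << 0;
#[allow(dead_code)]
const BIG_ENDIAN: u32 = 1 << 1;
const HIGH_SPEED: u32 = 1 << 4;
const INTERRUPT_POLARITY_HIGH: u32 = 1 << 5;
#[allow(dead_code)]
const WAKE_UP: u32 = 1 << 7;

// Bits of the register at address 0x0002 (bits 16..=23 of the 32-bit write).
const STATUS_ENABLE: u32 = 1 << 16;
#[allow(dead_code)]
const INTERRUPT_WITH_STATUS: u32 = 1 << 17;

/// The value written to REG_BUS_CTRL during init, spelled out by field.
fn bus_ctrl_init() -> u32 {
    WORD_LENGTH_32 | HIGH_SPEED | INTERRUPT_POLARITY_HIGH | STATUS_ENABLE
}
```

`bus_ctrl_init()` evaluates to 0x00010031, the same literal the embassy code passes to write32_swapped, but the intent of each bit is visible at the call site.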

Backplane

Function 1 provides access to the internal System-on-Chip (SoC). If we take a look at the block diagram of the CYW43439 we can see this backplane.

It is accessed by using a command structure with F1F0 being 01.

This allows direct access to memory addresses. I'm used to being able to look up register addresses in the datasheets of microcontrollers, but I could not find any such information in the datasheet for the CYW43439 and did not understand where these addresses are documented. My current understanding is that they are not publicly available. I'm guessing here, but I think an older model was reverse engineered and that work was then ported to newer models.

But the following addresses I've found in whd_sdio.h

/* SDIO Function 1 (Backplane) register addresses */
/* Addresses 0x00000000 - 0x0000FFFF are directly access the backplane
 * throught the backplane window. Addresses above 0x0000FFFF are
 * registers relating to backplane access, and do not require a backpane
 * clock to access them
 */
 #define SDIO_GPIO_SELECT              ( (uint32_t)0x10005 )
 #define SDIO_GPIO_OUTPUT              ( (uint32_t)0x10006 )
 #define SDIO_GPIO_ENABLE              ( (uint32_t)0x10007 )
 #define SDIO_FUNCTION2_WATERMARK      ( (uint32_t)0x10008 )
 #define SDIO_DEVICE_CONTROL           ( (uint32_t)0x10009 )
 #define SDIO_BACKPLANE_ADDRESS_LOW    ( (uint32_t)0x1000A )
 #define SDIO_BACKPLANE_ADDRESS_MID    ( (uint32_t)0x1000B )
 #define SDIO_BACKPLANE_ADDRESS_HIGH   ( (uint32_t)0x1000C )
 #define SDIO_FRAME_CONTROL            ( (uint32_t)0x1000D )
 #define SDIO_CHIP_CLOCK_CSR           ( (uint32_t)0x1000E )
 #define SDIO_PULL_UP                  ( (uint32_t)0x1000F )
 #define SDIO_READ_FRAME_BC_LOW        ( (uint32_t)0x1001B )
 #define SDIO_READ_FRAME_BC_HIGH       ( (uint32_t)0x1001C )
 #define SDIO_WAKEUP_CTRL              ( (uint32_t)0x1001E )
 #define SDIO_SLEEP_CSR                ( (uint32_t)0x1001F )
 #define I_HMB_SW_MASK                 ( (uint32_t)0x000000F0 )

 #define SBSDIO_ALP_AVAIL_REQ       ( (uint32_t)0x08 )     /* Make ALP ready (power up xtal) */

And if we look in embassy-rs/cyw43 we can see that these match.

Let's take a look at the following write8 call (from Embassy):

const FUNC_BACKPLANE: u32 = 1;
...
const BACKPLANE_ALP_AVAIL_REQ: u8 = 0x08;

self.write8(FUNC_BACKPLANE, REG_BACKPLANE_CHIP_CLOCK_CSR, BACKPLANE_ALP_AVAIL_REQ)
            .await; 

If we take a look at write8:

    async fn write8(&mut self, func: u32, addr: u32, val: u8) {
        self.writen(func, addr, val as u32, 1).await
    }

func in this case is FUNC_BACKPLANE which is Function 1, which recall enables access to the internal addresses of the cyw43 SoC. And addr is 0x1000E which is the address to write to. The value we want to write is BACKPLANE_ALP_AVAIL_REQ which has the value 0x08.

Now, let's look at writen (write n, as in the number of things to write, or the length to write). The call would be something like this:

    self.writen(1, 0x1000E, 0x08, 1);

And writen itself looks like this:

    async fn writen(&mut self, func: u32, addr: u32, val: u32, len: u32) {
        let cmd = cmd_word(true, true, func, addr, len);

        self.spi
            .transaction(|bus| {
                let bus = unsafe { &mut *bus };
                async {
                    bus.write(&[cmd, val]).await?;
                    Ok(())
                }
            })
            .await
            .unwrap();
    }

The cmd_word function was described earlier. This is using SPI which we can see.

There are also functions named bp_writexx (bp = backplane), for example:

    // Upload firmware.                                                     
    self.core_disable(Core::WLAN).await;
    self.core_reset(Core::SOCSRAM).await;
    self.bp_write32(CHIP.socsram_base_address + 0x10, 3).await;
    self.bp_write32(CHIP.socsram_base_address + 0x44, 0).await;

    let ram_addr = CHIP.atcm_ram_base_address;

    info!("loading fw");
    self.bp_write(ram_addr, firmware).await;

Now bp_write32 looks like this:

    async fn bp_write32(&mut self, addr: u32, val: u32) {
        self.backplane_writen(addr, val, 4).await
    }

    async fn backplane_writen(&mut self, addr: u32, val: u32, len: u32) {
        self.backplane_set_window(addr).await;
                                                                                
        let mut bus_addr = addr & BACKPLANE_ADDRESS_MASK;
        if len == 4 {
            bus_addr |= BACKPLANE_ADDRESS_32BIT_FLAG
        }
        self.writen(FUNC_BACKPLANE, bus_addr, val, len).await
    }

So let's take the first call to bp_write32:

struct Chip {                                                                   
    arm_core_base_address: u32,
    socsram_base_address: u32,                                                  
    ...
}

// Data for CYW43439                                                            
const CHIP: Chip = Chip {                                                       
    arm_core_base_address: 0x18003000 + WRAPPER_REGISTER_OFFSET,                
    socsram_base_address: 0x18004000,                                           
    ...
};

    self.bp_write32(CHIP.socsram_base_address + 0x10, 3).await;
    self.bp_write32(0x18004010, 3).await;
    self.backplane_writen(0x18004010, 3, 4).await

Now, let's look at backplane_set_window:

const BACKPLANE_ADDRESS_MASK: u32 = 0x7FFF;

   async fn backplane_set_window(&mut self, addr: u32) {
        // start by masking out the lower 15 address bits
        let new_window = addr & !BACKPLANE_ADDRESS_MASK;

        // If the high part of the offset/window changed then update the
        // high register.
        if (new_window >> 24) as u8 != (self.backplane_window >> 24) as u8 {
            self.write8(
                FUNC_BACKPLANE,
                REG_BACKPLANE_BACKPLANE_ADDRESS_HIGH,
                (new_window >> 24) as u8,
            )
            .await;
        }
        // If the middle part of the offset/window changed then update the
        // mid register.
        if (new_window >> 16) as u8 != (self.backplane_window >> 16) as u8 {
            self.write8(
                FUNC_BACKPLANE,
                REG_BACKPLANE_BACKPLANE_ADDRESS_MID,
                (new_window >> 16) as u8,
            )
            .await;
        }
        // If the low part of the offset/window changed then update the
        // low register.
        if (new_window >> 8) as u8 != (self.backplane_window >> 8) as u8 {
            self.write8(
                FUNC_BACKPLANE,
                REG_BACKPLANE_BACKPLANE_ADDRESS_LOW,
                (new_window >> 8) as u8,
            )
            .await;
        }
        // Store the value of the offset/window, which is now in the three
        // registers on the chip, so we can compare against it next time. I think
        // it would be possible to always write the registers, but by checking
        // whether they have changed we save a call to write (and awaiting the
        // response).
        self.backplane_window = new_window;
    }

Notice that BACKPLANE_ADDRESS_MASK masks off the 17 highest bits of the address, and the ARM Cortex M3 is a 32-bit CPU with 32-bit memory addresses. There are three offset/window registers which allow the complete address space to be accessed.

Initially self.backplane_window is set to 0xAAAAAAAA:

pub struct Runner<'a, PWR, SPI> {
    state: &'a State,

    pwr: PWR,
    spi: SPI,

    ioctl_id: u16,
    sdpcm_seq: u8,
    backplane_window: u32,

    sdpcm_seq_max: u8,
}

let mut runner = Runner {                                                   
        state,                                                                  
        pwr,                                                                    
        spi,                                                                    
                                                                                
        ioctl_id: 0,                                                            
        sdpcm_seq: 0,                                                           
        backplane_window: 0xAAAA_AAAA,                                          
                                                                                
        sdpcm_seq_max: 1,

That is 10101010101010101010101010101010 in binary.

The following is a snippet of a log after adding some log statements:

0.274369 INFO  addr                    = 0b00011000000000000000000000000000, dec= 402653184
0.274427 INFO  BACKPLANE_ADDRESS_MASK  = 0b00000000000000000111111111111111
0.274462 INFO  !BACKPLANE_ADDRESS_MASK = 0b11111111111111111000000000000000
0.274500 INFO  new_window              = 0b00011000000000000000000000000000, dec= 402653184
0.274543 INFO  backplane_window        = 0b10101010101010101010101010101010, dec= 2863311530
0.274596 INFO  update high offset reg. new_window >> 24 != backplane_window >> 24. writing: 0b00011000
0.274777 INFO  update mid offset reg. new_window >> 16 != backplane_window >> 16. writing: 0b00000000
0.274943 INFO  update low offset reg. new_window >> 8 != backplane_window >> 8. writing: 0b00000000
0.275105 INFO  backplane_window set    = 0b00011000000000000000000000000000
0.275143 INFO  bus_addr = 0b00000000000000000000000000000000, dec= 0

Recall that what we are passing into this function is an address:

    async fn backplane_writen(&mut self, addr: u32, val: u32, len: u32) {
        self.backplane_set_window(addr).await;
                                                                                
        let mut bus_addr = addr & BACKPLANE_ADDRESS_MASK;
        if len == 4 {
            bus_addr |= BACKPLANE_ADDRESS_32BIT_FLAG
        }
        self.writen(FUNC_BACKPLANE, bus_addr, val, len).await

So we are actually writing to these addresses/registers:

  self.write8(FUNC_BACKPLANE, REG_BACKPLANE_BACKPLANE_ADDRESS_HIGH, 0b00011000);
  self.write8(FUNC_BACKPLANE, REG_BACKPLANE_BACKPLANE_ADDRESS_MID,  0b00000000);
  self.write8(FUNC_BACKPLANE, REG_BACKPLANE_BACKPLANE_ADDRESS_LOW,  0b00000000);

  const REG_BACKPLANE_BACKPLANE_ADDRESS_LOW: u32 = 0x1000A;                       
  const REG_BACKPLANE_BACKPLANE_ADDRESS_MID: u32 = 0x1000B;                       
  const REG_BACKPLANE_BACKPLANE_ADDRESS_HIGH: u32 = 0x1000C;
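The decision about which window registers to write can be isolated into a pure function, which makes it easy to replay the log above. This is a sketch: `window_writes` is a hypothetical helper that returns the (register, byte) pairs instead of performing bus writes, and the shorter `REG_*` names are mine. The comparison logic follows backplane_set_window.

```rust
const BACKPLANE_ADDRESS_MASK: u32 = 0x7FFF;
const REG_ADDRESS_LOW: u32 = 0x1000A;
const REG_ADDRESS_MID: u32 = 0x1000B;
const REG_ADDRESS_HIGH: u32 = 0x1000C;

/// Returns the (register, value) writes needed to move the backplane window
/// from `current_window` to the window that contains `addr`.
fn window_writes(current_window: u32, addr: u32) -> Vec<(u32, u8)> {
    // Clear the lower 15 bits, which are sent in the command word instead.
    let new_window = addr & !BACKPLANE_ADDRESS_MASK;
    let mut writes = Vec::new();
    for (reg, shift) in [(REG_ADDRESS_HIGH, 24), (REG_ADDRESS_MID, 16), (REG_ADDRESS_LOW, 8)] {
        let new_byte = (new_window >> shift) as u8;
        // Only registers whose byte actually changed need to be written.
        if new_byte != (current_window >> shift) as u8 {
            writes.push((reg, new_byte));
        }
    }
    writes
}
```

Starting from the initial window 0xAAAAAAAA, address 0x18000000 needs all three writes (0x18, 0x00, 0x00), as in the log. A following access to 0x18004010 stays inside the same window and needs none.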

So in total we are writing 3x8 bits, 8 bits to each of three different registers. These registers act as an offset into the internal address space. The reason for doing this is that a command only has 17 bits available for the address (see above for details) but the actual memory address space on the chip is 32 bits.

So after writing those values we would then do:

    let mut bus_addr = addr & BACKPLANE_ADDRESS_MASK;
    00011000000000000000000000000000
    00000000000000000111111111111111;

So the 15 lowest bits are the address within the window, and the offset is provided by the three registers, which need to be written before reading from or writing to an address.

      offset 17-bit   address 15-bits  
    |---------------||-------------|
    00000000000000000000000000000000

BACKPLANE_ADDRESS_MASK:
    00000000000000000111111111111111;

!BACKPLANE_ADDRESS_MASK:
    11111111111111111000000000000000;
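The two masks split a 32-bit backplane address into a window part and an in-window part. A minimal sketch, where `split_backplane_addr` is a name I made up for illustration:

```rust
const BACKPLANE_ADDRESS_MASK: u32 = 0x7FFF;

/// Splits a 32-bit backplane address into (window, offset):
/// the upper 17 bits go into the window registers and the lower 15 bits
/// are sent as the address in the command word.
fn split_backplane_addr(addr: u32) -> (u32, u32) {
    (addr & !BACKPLANE_ADDRESS_MASK, addr & BACKPLANE_ADDRESS_MASK)
}
```

For the SOCSRAM write from earlier, `split_backplane_addr(0x18004010)` gives the window 0x18000000 and the offset 0x4010, and OR-ing the two parts gives back the original address.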

So to summarize the above section: we have 17-bit addresses available to use when writing/reading, but the chip has a 32-bit address space and we use 32-bit addresses in the code we write. Before we actually read or write a value, we have to update the offset/window registers on the device to contain the correct offset/window. The offset/window is obtained by masking out the lower 15 bits of the address.

At this point the offset has been written. One thing that confused me a little at first was that the offset/window registers are all 8 bits wide, but we only use 17 bits for the offset:

new_window = 00011000 00000000 00000000 00000000

When we write the low register we right shift by 8, but remember that we masked out the lower 15 bits, so of the bits written to the low register only the topmost one can be set.

Input-Output Control (IOCTL)

Normally userland programs issue system calls to request that the kernel perform operations. We might think of file I/O operations as system calls, but most often these are in fact C library functions which then make the system call for us. On Linux a system call can also be made directly with syscall (see man syscall), which takes a long number identifying the system call to be made.

IOCTL calls are system calls for device-specific operations which take device-specific request codes. They are similar to system calls, but since there are many different devices it is not possible to implement them all as separate system calls.

Firmware

In the previous section we saw instructions that loaded the firmware to the device:

    // Upload firmware.                                                     
    self.core_disable(Core::WLAN).await;
    self.core_reset(Core::SOCSRAM).await;
    self.bp_write32(CHIP.socsram_base_address + 0x10, 3).await;
    self.bp_write32(CHIP.socsram_base_address + 0x44, 0).await;

    let ram_addr = CHIP.atcm_ram_base_address;

    info!("loading fw");
    self.bp_write(ram_addr, firmware).await;

So what does this firmware contain?

There is a WiFi Host Driver that contains a set of APIs and there are binary distributions of these for specific devices from Infineon. The one used by embassy-rs/cyw43 is COMPONENT_43439.

There is an API ref guide which contains details of WiFi Host Driver (WHD). So we are writing this binary onto the cyw43 device.

Next, we have the following which is writing nvram directly after the binary blob:

     info!("loading nvram");
     // Round up to 4 bytes.
     let nvram_len = (NVRAM.len() + 3) / 4 * 4;
     self.bp_write(ram_addr + CHIP.chip_ram_size - 4 - nvram_len as u32, NVRAM)
         .await;

      let nvram_len_words = nvram_len as u32 / 4;                             
      let nvram_len_magic = (!nvram_len_words << 16) | nvram_len_words;       
      self.bp_write32(ram_addr + CHIP.chip_ram_size - 4, nvram_len_magic)     
          .await;
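The arithmetic in the NVRAM snippet is easier to follow with concrete numbers. In this sketch `nvram_layout` is a hypothetical helper and the RAM base and size used in the example are made up (the real values come from CHIP and are not shown here). The formulas are the ones from the code above: the length is rounded up to a multiple of 4, the NVRAM is placed just below the last word of RAM, and that last word holds the length in words in its low half and the bitwise complement of it in its high half.

```rust
/// Returns (nvram_addr, magic_addr, magic) for an NVRAM blob of
/// `nvram_bytes` bytes placed at the end of a RAM region.
fn nvram_layout(ram_addr: u32, ram_size: u32, nvram_bytes: usize) -> (u32, u32, u32) {
    // Round up to a multiple of 4 bytes.
    let nvram_len = ((nvram_bytes + 3) / 4 * 4) as u32;
    // The very last 32-bit word of RAM holds the length "magic".
    let magic_addr = ram_addr + ram_size - 4;
    // The NVRAM itself sits directly below that word.
    let nvram_addr = magic_addr - nvram_len;
    let words = nvram_len / 4;
    // Upper 16 bits: complement of the word count. Lower 16 bits: word count.
    let magic = (!words << 16) | words;
    (nvram_addr, magic_addr, magic)
}
```

With a made-up RAM of 0x1000 bytes starting at 0 and a 10-byte NVRAM, the blob is padded to 12 bytes and stored at 0xFF0, and the word at 0xFFC becomes 0xFFFC0003. I assume the complement in the upper half lets the firmware sanity-check the length, but I have not seen that documented.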

Now, what gets written is the following:

static NVRAM: &'static [u8] = &*nvram!(
    b"NVRAMRev=$Rev$",
    b"manfid=0x2d0",
    b"prodid=0x0727",
    b"vendid=0x14e4",
    b"devid=0x43e2",
    b"boardtype=0x0887",
    b"boardrev=0x1100",
    b"boardnum=22",
    b"macaddr=00:A0:50:b5:59:5e",
    b"sromrev=11",
    b"boardflags=0x00404001",
    b"boardflags3=0x04000000",
    b"xtalfreq=37400",
    b"nocrc=1",
    b"ag0=255",
    b"aa2g=1",
    b"ccode=ALL",
    b"pa0itssit=0x20",
    b"extpagain2g=0",
    b"pa2ga0=-168,6649,-778",
    b"AvVmid_c0=0x0,0xc8",
    b"cckpwroffset0=5",
    b"maxp2ga0=84",
    b"txpwrbckof=6",
    b"cckbw202gpo=0",
    b"legofdmbw202gpo=0x66111111",
    b"mcsbw202gpo=0x77711111",
    b"propbw202gpo=0xdd",
    b"ofdmdigfilttype=18",
    b"ofdmdigfilttypebe=18",
    b"papdmode=1",
    b"papdvalidtest=1",
    b"pacalidx2g=45",
    b"papdepsoffset=-30",
    b"papdendidx=58",
    b"ltecxmux=0",
    b"ltecxpadnum=0x0102",
    b"ltecxfnsel=0x44",
    b"ltecxgcigpio=0x01",
    b"il0macaddr=00:90:4c:c5:12:38",
    b"wl0id=0x431b",
    b"deadman_to=0xffffffff",
    b"muxenab=0x100",
    b"spurconfig=0x3",
    b"glitch_based_crsmin=1",
    b"btc_mode=1",
);

And if we search for one of these variables/values we can see that the firmware blob contains them:

$ strings firmware/43439A0.bin | grep macaddr
macaddr=%s

So I think this is just a way to configure the blob with values external to the blob.

To understand more I'm going to go through some of the code in embassy-rs/cyw43.

In the example we have:

   let state = singleton!(cyw43::State::new());
   let (mut control, runner) = cyw43::new(state, pwr, spi, fw).await;

   spawner.spawn(wifi_task(runner)).unwrap();

Now, let's take a look at cyw43::State:

pub struct State {                                                              
    ioctl_state: Cell<IoctlState>,

    tx_channel: Channel<NoopRawMutex, PacketBuf, 8>,
    rx_channel: Channel<NoopRawMutex, PacketBuf, 8>,
    link_up: AtomicBool,
}

ioctl_state is of type Cell as this field needs to be modified without the reference to the struct having to be mutable. And IoctlState looks like this:

enum IoctlState {
    Idle,
    Pending {
        kind: u32,
        cmd: u32,
        iface: u32,
        buf: *mut [u8],
    },
    Sent {
        buf: *mut [u8],
    },
    Done {
        resp_len: usize,
    },
}
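To see why a Cell is enough here, the following is a simplified sketch of the same state machine using std's Cell. It is not the embassy code: the raw buffer pointers are left out, the variants carry less data, and the `request`, `poll_send`, and `complete` functions are hypothetical names for the steps that the control side and the runner side would perform. The point is that every function only needs a shared `&State`.

```rust
use std::cell::Cell;

/// Simplified ioctl state (no buffers), small enough to be `Copy` so that
/// `Cell::get` can hand out a copy of the current state.
#[derive(Clone, Copy, Debug, PartialEq)]
enum IoctlState {
    Idle,
    Pending { cmd: u32 },
    Sent,
    Done { resp_len: usize },
}

struct State {
    ioctl_state: Cell<IoctlState>,
}

/// Control side: queue a request. Fails if one is already in flight.
fn request(state: &State, cmd: u32) -> bool {
    if state.ioctl_state.get() != IoctlState::Idle {
        return false;
    }
    state.ioctl_state.set(IoctlState::Pending { cmd });
    true
}

/// Runner side: pick up a pending request and mark it as sent.
fn poll_send(state: &State) -> Option<u32> {
    if let IoctlState::Pending { cmd } = state.ioctl_state.get() {
        state.ioctl_state.set(IoctlState::Sent);
        Some(cmd)
    } else {
        None
    }
}

/// Runner side: record that a response of `resp_len` bytes has arrived.
fn complete(state: &State, resp_len: usize) {
    if state.ioctl_state.get() == IoctlState::Sent {
        state.ioctl_state.set(IoctlState::Done { resp_len });
    }
}
```

The state moves Idle, Pending, Sent, Done, and a second request is rejected while one is still in flight.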

A Channel which is part of the embassy-sync crate is a queue for sending values between async tasks. It can have multiple producers and multiple consumers.

pub struct Channel<M, T, const N: usize>
where M: RawMutex, {
    inner: Mutex<M, RefCell<ChannelState<T, N>>>,
}

N is the size of the double ended queue (deque):

struct ChannelState<T, const N: usize> {
    queue: heapless::Deque<T, N>,
    receiver_waker: WakerRegistration,
    senders_waker: WakerRegistration,
}
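The core behaviour of such a fixed-capacity queue can be sketched with the standard library. This is a stand-in, not the embassy-sync implementation: it uses std's VecDeque instead of heapless::Deque, and it leaves out the mutex and the wakers that let async tasks wait for space or for data. `BoundedQueue` is a name I made up.

```rust
use std::collections::VecDeque;

/// A queue that holds at most `N` items (illustrative stand-in for the
/// deque inside a channel).
struct BoundedQueue<T, const N: usize> {
    queue: VecDeque<T>,
}

impl<T, const N: usize> BoundedQueue<T, N> {
    fn new() -> Self {
        Self { queue: VecDeque::with_capacity(N) }
    }

    /// Enqueues `value`, or hands it back if the queue is full.
    fn try_send(&mut self, value: T) -> Result<(), T> {
        if self.queue.len() == N {
            return Err(value);
        }
        self.queue.push_back(value);
        Ok(())
    }

    /// Dequeues the oldest value, if there is one.
    fn try_recv(&mut self) -> Option<T> {
        self.queue.pop_front()
    }
}
```

With `N = 8`, as in tx_channel and rx_channel, a ninth `try_send` fails until something has been received, which puts back-pressure on the producer.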

A State instance has one channel for transmitting, tx_channel, and one for receiving, rx_channel. Finally there is an AtomicBool with the name link_up.

Notice that in the example we are spawning a new task that will be executed by Embassy's executor via wifi_task.

  #[embassy_executor::task]
  async fn wifi_task(
      runner: cyw43::Runner<'static, Output<'static, PIN_23>, ExclusiveDevice<MySpi, Output<'static, PIN_25>>>,) -> ! {
      runner.run().await
  }

The Runner struct looks like this:

pub struct Runner<'a, PWR, SPI> {
    state: &'a State,

    pwr: PWR,
    spi: SPI,

    ioctl_id: u16,
    sdpcm_seq: u8,
    backplane_window: u32,

    sdpcm_seq_max: u8,
}

Now if we look in runner.run() we have:

    pub async fn run(mut self) -> ! {
        let mut buf = [0; 512];
        loop {
            // Send stuff
            // TODO flow control not yet complete
            if !self.has_credit() {
                warn!("TX stalled");
            } else {
                // Match the current ioctl_state and if there are any pending
                // commands send them to the device.
                if let IoctlState::Pending { kind, cmd, iface, buf } = self.state.ioctl_state.get() {
                    self.send_ioctl(kind, cmd, iface, unsafe { &*buf }).await;
                    self.state.ioctl_state.set(IoctlState::Sent { buf });
                }
                if !self.has_credit() {
                    warn!("TX stalled");
                } else {
                    if let Ok(p) = self.state.tx_channel.try_recv() {
                        self.send_packet(&p).await;
                    }
                }
            }

            // Receive stuff
            // REG_BUS_INTERRUPT is const REG_BUS_INTERRUPT: u32 = 0x04;
            // which is the address of the Interrupt Register (see gSPI registers above)
            let irq = self.read16(FUNC_BUS, REG_BUS_INTERRUPT).await;

            // const IRQ_F2_PACKET_AVAILABLE: u16 = 0x0020;
            // which is 0000000000100000 and if we look at the gSPI registers
            // above we find that bit 5 is 'F2 packet available'. We AND the
            // register value with this mask, and if bit 5 was set (the result
            // is then non-zero) a packet is available. Also recall that F2 is
            // the WLAN function.
            if irq & IRQ_F2_PACKET_AVAILABLE != 0 {
                let mut status = 0xFFFF_FFFF;
                while status == 0xFFFF_FFFF {
                    // REG_BUS_STATUS is gSPI register 8 which contains the
                    // status bit definitions (32-bits)
                    status = self.read32(FUNC_BUS, REG_BUS_STATUS).await;
                }

                // STATUS_F2_PKT_AVAILABLE is bit 8 (100000000, 0x00000100) in
                // the status.
                if status & STATUS_F2_PKT_AVAILABLE != 0 {
                    let len = (status & STATUS_F2_PKT_LEN_MASK) >> STATUS_F2_PKT_LEN_SHIFT;

                    let cmd = cmd_word(false, true, FUNC_WLAN, 0, len);

                    self.spi
                        .transaction(|bus| {
                            let bus = unsafe { &mut *bus };
                            async {
                                bus.write(&[cmd]).await?;
                                bus.read(&mut buf[..(len as usize + 3) / 4]).await?;
                                Ok(())
                            }
                        })
                        .await
                        .unwrap();

                    trace!("rx {:02x}", &slice8_mut(&mut buf)[..(len as usize).min(48)]);

                    self.rx(&slice8_mut(&mut buf)[..len as usize]);
                }
            }

            // TODO use IRQs
            // yield to allow other task to make progress.
            yield_now().await;
        }
    }
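
The receive half of the loop boils down to two bit tests and a shift. Below is a small sketch of just that arithmetic, with no SPI involved. The helper names pending_f2_len and words_for are my own, and the values of STATUS_F2_PKT_LEN_MASK and STATUS_F2_PKT_LEN_SHIFT are assumptions (an 11-bit length field starting at bit 9); the other two constants are the ones quoted in the comments above.

```rust
// Values quoted in the comments of run().
const IRQ_F2_PACKET_AVAILABLE: u16 = 0x0020; // bit 5 of the interrupt register
const STATUS_F2_PKT_AVAILABLE: u32 = 0x0000_0100; // bit 8 of the status register
// Assumed values: the length field is taken to occupy bits 9..=19.
const STATUS_F2_PKT_LEN_MASK: u32 = 0x000F_FE00;
const STATUS_F2_PKT_LEN_SHIFT: u32 = 9;

/// Returns the length in bytes of the pending F2 packet, if there is one.
fn pending_f2_len(irq: u16, status: u32) -> Option<u32> {
    if irq & IRQ_F2_PACKET_AVAILABLE == 0 {
        return None;
    }
    if status & STATUS_F2_PKT_AVAILABLE == 0 {
        return None;
    }
    Some((status & STATUS_F2_PKT_LEN_MASK) >> STATUS_F2_PKT_LEN_SHIFT)
}

/// Number of 32-bit words needed to hold `len` bytes (rounds up),
/// matching the (len + 3) / 4 expression used when reading into buf.
fn words_for(len: u32) -> usize {
    (len as usize + 3) / 4
}
```

The rounding matters because buf is an array of 32-bit words: a 65 byte packet needs 17 words, not 16.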

I've added comments above to try to clarify what this function is doing, but basically it will loop forever, first trying to send data from the host to the device if there are any pending commands to be sent. It will then see if there is anything to receive from Function 2, which is the WLAN function. If there is something to receive it will be read using SPI into buf, which is then passed to the self.rx method:

    fn rx(&mut self, packet: &[u8]) {
        let sdpcm_header = SdpcmHeader::from_bytes(packet[..SdpcmHeader::SIZE].try_into().unwrap());
        trace!("rx {:?}", sdpcm_header);
        if sdpcm_header.len != !sdpcm_header.len_inv {
            warn!("len inv mismatch");
            return;
        }
        if sdpcm_header.len as usize != packet.len() {
            // TODO: is this guaranteed??
            warn!("len from header doesn't match len from spi");
            return;
        }

        self.update_credit(&sdpcm_header);

        let channel = sdpcm_header.channel_and_flags & 0x0f;

        let payload = &packet[sdpcm_header.header_length as _..];

So notice that the byte slice that is passed in, packet, is first turned into an SdpcmHeader. This is just a transmute, which is basically a way of reinterpreting the bytes as an SdpcmHeader.

This header is defined in the SDPCM Header section:

typedef struct
{
  uint16_t  frametag[2];      // SDPCM packet size
  uint8_t sequence;           // Sequence number of pkt
  uint8_t channel_and_flags;  // IOCTL/IOVAR or User Data or Event
  uint8_t next_length;
  uint8_t header_length;      // Offset to BDC or CDC header
  uint8_t wireless_flow_control;
  uint8_t bus_data_credit;    // Credit from WLAN Chip
  uint8_t _reserved[2];
} sdpcm_header_t;

Note that frametag is an array and is represented as two fields in embassy:

#[repr(C)]
pub struct SdpcmHeader {
    pub len: u16,
    pub len_inv: u16,
    /// Rx/Tx sequence number
    pub sequence: u8,
    ///  4 MSB Channel number, 4 LSB arbitrary flag
    pub channel_and_flags: u8,
    /// Length of next data frame, reserved for Tx
    pub next_length: u8,
    /// Data offset
    pub header_length: u8,
    /// Flow control bits, reserved for Tx
    pub wireless_flow_control: u8,
    /// Maximum Sequence number allowed by firmware for Tx
    pub bus_data_credit: u8,
    /// Reserved
    pub reserved: [u8; 2],
}

Notice that the field channel_and_flags contains 4 bits that indicate what type of channel this packet belongs to. These can be:

0000 = Control
0001 = Event
0010 = Data
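
To make the header layout and the channel bits concrete, here is a minimal sketch that parses the 12 bytes field by field and checks the frame tag. It is not how embassy does it (that is a transmute): I'm assuming little-endian byte order, and the parse and channel methods and the Channel enum are names I made up for this example.

```rust
/// Safe, field-by-field parse of the 12-byte SDPCM header.
#[derive(Debug)]
struct SdpcmHeader {
    len: u16,
    len_inv: u16,
    sequence: u8,
    channel_and_flags: u8,
    next_length: u8,
    header_length: u8,
    wireless_flow_control: u8,
    bus_data_credit: u8,
    reserved: [u8; 2],
}

#[derive(Debug, PartialEq)]
enum Channel {
    Control,
    Event,
    Data,
}

impl SdpcmHeader {
    const SIZE: usize = 12;

    /// Returns None if the packet is too short or the frame tag is invalid.
    fn parse(packet: &[u8]) -> Option<SdpcmHeader> {
        if packet.len() < Self::SIZE {
            return None;
        }
        let h = SdpcmHeader {
            // Assumption: multi-byte fields are little-endian.
            len: u16::from_le_bytes([packet[0], packet[1]]),
            len_inv: u16::from_le_bytes([packet[2], packet[3]]),
            sequence: packet[4],
            channel_and_flags: packet[5],
            next_length: packet[6],
            header_length: packet[7],
            wireless_flow_control: packet[8],
            bus_data_credit: packet[9],
            reserved: [packet[10], packet[11]],
        };
        // The second half of the frame tag must be the bitwise inverse of the first.
        if h.len != !h.len_inv {
            return None;
        }
        Some(h)
    }

    /// Decodes the lower 4 bits using the table above.
    fn channel(&self) -> Option<Channel> {
        match self.channel_and_flags & 0x0f {
            0 => Some(Channel::Control),
            1 => Some(Channel::Event),
            2 => Some(Channel::Data),
            _ => None,
        }
    }
}
```

For example, a len of 64 (0x0040) must be paired with a len_inv of 65471 (0xFFBF), otherwise the header is rejected.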

For control packets there is a header named cdc_header:

typedef struct
{
    uint32_t cmd;     // ioctl command value
    uint32_t len;     // lower 16: output buflen; upper 16: input buflen (excludes header)
    uint32_t flags;   // flag defns given in bcmcdc.h
    uint32_t status;  // status code returned from the device
} cdc_header_t;

This header follows directly after the SdpcmHeader in the frame:

  +-------------+------------+---------------------+
  | SdpcmHeader | cdc_header | IOCTL/IOVAR Message |
  +-------------+------------+---------------------+

So it is the Bus Layer of the WHD that assembles/de-assembles these headers, and also handles communication with the device. This is what the firmware blob implementation does.

Next, in the example we have the update_credit call:

        self.update_credit(&sdpcm_header);

This is credit-based flow control: the credit is communicated by the device to the host to indicate that it is alright to send to it. The WHD has a packet queue and it might not have room for more entries without having to overwrite packets in the queue, which would cause loss of packets. Instead it can send command messages with a credit, or the credit can be piggybacked on user data packets.

    fn update_credit(&mut self, sdpcm_header: &SdpcmHeader) {
        // channel_and_flags contains both the channel and the flags. Here we
        // keep only the lower 4 bits (the same bits that rx uses as the
        // channel number) and check that the value is less than 3, that is,
        // one of Control (0), Event (1), or Data (2). In that case the
        // bus_data_credit field of this sdpcm_header is used.
        if sdpcm_header.channel_and_flags & 0xf < 3 {
            // Get the value of bus_data_credit from the header
            let mut sdpcm_seq_max = sdpcm_header.bus_data_credit;
            // if the value sent from the device - the current sequence is 
            // greater than 0x40 (64), use self.sdpcm_seq + 2 as the new
            // self.sdpcm_seq_max value.
            if sdpcm_seq_max.wrapping_sub(self.sdpcm_seq) > 0x40 {
                sdpcm_seq_max = self.sdpcm_seq + 2;
            }
            self.sdpcm_seq_max = sdpcm_seq_max;
        }
    }
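
To see how the numbers behave, here is the same logic as a free function that can be played with outside of the driver. This is a sketch: next_seq_max is a made-up name, and I use wrapping_add for the fallback so that the example cannot overflow when the sequence number is close to 255.

```rust
/// Returns the new value for sdpcm_seq_max given the current state and the
/// relevant fields of a received header.
fn next_seq_max(seq: u8, seq_max: u8, channel_and_flags: u8, bus_data_credit: u8) -> u8 {
    // Only frames on channels 0, 1 and 2 are taken to carry a credit.
    if channel_and_flags & 0x0f >= 3 {
        return seq_max;
    }
    // Distance from our sequence number to the advertised limit, modulo 256.
    if bus_data_credit.wrapping_sub(seq) > 0x40 {
        // Implausibly large window: fall back to a small one.
        seq.wrapping_add(2)
    } else {
        bus_data_credit
    }
}
```

With seq = 10, a credit of 20 is accepted as is, while a credit of 200 is 190 ahead of us (more than 0x40) and is replaced by 12. The subtraction wraps, so seq = 250 with a credit of 4 is a distance of 10 and is accepted.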

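I've not been able to figure out where the channel_and_flags values, like the 3 above, are defined. Note that the check is for a value less than 3, so it passes for the channel values 0 (Control), 1 (Event), and 2 (Data) listed earlier. My reading is that only packets on these known channels are trusted to carry a valid bus_data_credit from the device.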

    fn has_credit(&mut self) -> bool {
        self.sdpcm_seq != self.sdpcm_seq_max && self.sdpcm_seq_max.wrapping_sub(self.sdpcm_seq) & 0x80 == 0
    }

sdpcm_seq and sdpcm_seq_max are both fields in Runner and are of type u8:

   sdpcm_seq: u8,
   sdpcm_seq_max: u8,

The check above ANDs the wrapped difference with 0x80 (128 decimal, 10000000 in binary) and compares the result to zero, which makes sure that the difference is less than 128:

     10000000    10000000
   & 10000000    01000000
    ---------   ---------
     10000000    00000000

My initial thought was why not just compare using < 128, but thinking a little more it is probably more efficient to use a bitwise AND operation and compare to zero than to compare with a non-zero value.
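
Another way to read the check: together with the != test it says that the wrapped difference is in the range 1 to 127, which is the same as saying that the difference is positive when interpreted as a signed 8-bit value. The sketch below puts the two formulations side by side; has_credit_signed is my own name and is not part of the driver.

```rust
/// The check from the driver, as a free function.
fn has_credit(seq: u8, seq_max: u8) -> bool {
    seq != seq_max && seq_max.wrapping_sub(seq) & 0x80 == 0
}

/// The same test written as a signed comparison of the wrapped difference.
fn has_credit_signed(seq: u8, seq_max: u8) -> bool {
    (seq_max.wrapping_sub(seq) as i8) > 0
}
```

For example seq = 250 and seq_max = 2 gives a wrapped difference of 8, so there is credit, while seq = 10 and seq_max = 5 gives 251 (bit 7 set), so there is none.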

ioctl (input/output control)

Is a system call for device specific I/O operations. This takes a parameter specifying the operation/command.

The firmware that we download onto the chip is what handles these calls.

Writing these data packets is like writing them to the WiFi RAM, waiting for a response (an ack), and then reading the response.

I've added the IoctlType:

    let res_len = self
         .ioctl(IoctlType::Get, IOCTL_CMD_GET_VAR, 0, &mut buf[..total_len])
         .await;

Now, this reads a little strange: if we know that the command is IOCTL_CMD_GET_VAR, why do we have to specify IoctlType::Get?
Well, the command sent is just an integer and it is whatever the device implementation says it should be. It does not say anything about if this is reading something from WiFi RAM or writing to it.

To explain this let's take a look at ioctl in Linux.

#include <sys/ioctl.h>

int ioctl(int fd, unsigned long request, ...);

Where fd would be an open file descriptor for the device in question. The request is a device specific request code. This request code has the direction of the request encoded in it, whether it is a read/write (get/set). The request is usually defined using one of the macros _IO, _IOR, _IOW, or _IOWR and this is what takes care of the direction.
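
As an illustration of how the direction ends up inside the request code, here is a sketch of the bit packing those macros perform. I'm assuming the generic Linux layout (8 bits number, 8 bits type, 14 bits size, 2 bits direction), which is what x86 and ARM use; some architectures use different widths. The function names ioc and direction are made up for this example.

```rust
// Field positions in a Linux ioctl request number (generic layout, assumed).
const NR_SHIFT: u32 = 0;
const TYPE_SHIFT: u32 = 8;
const SIZE_SHIFT: u32 = 16;
const DIR_SHIFT: u32 = 30;

const DIR_NONE: u32 = 0; // _IO: no data transfer
const DIR_WRITE: u32 = 1; // _IOW: user space writes data to the driver
const DIR_READ: u32 = 2; // _IOR: user space reads data from the driver

/// Packs direction, type character, command number and argument size
/// into a single request code.
fn ioc(dir: u32, ty: u8, nr: u8, size: u32) -> u32 {
    (dir << DIR_SHIFT)
        | (size << SIZE_SHIFT)
        | ((ty as u32) << TYPE_SHIFT)
        | ((nr as u32) << NR_SHIFT)
}

/// Extracts the direction bits from a request code.
fn direction(request: u32) -> u32 {
    request >> DIR_SHIFT
}
```

A read of a 4 byte value with type 'E' and number 1 becomes 0x80044501, where the leading 0x8 is the read direction. The cyw43 command numbers carry no such bits, so the direction has to travel separately.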

And this is the reason for us specifying the direction using the IoctlType enum:

.359543 TRACE tx SdpcmHeader { len: 64, len_inv: 65471, sequence: 32, channel_and_flags: 0, next_length: 0, header_length: 12, wireless_flow_control: 0, bus_data_credit: 0, reserved: [0, 0] }
└─ cyw43::{impl#4}::send_ioctl::{async_fn#0} @ /home/danielbevenius/work/iot/embassy/cyw43/src/fmt.rs:112
5.359598 TRACE     CdcHeader { cmd: 26, len: 36, flags: 2, id: 33, status: 0 }

The command in question, 26 is IOCTL_CMD_SET_SSID and comes from this line:

   self.ioctl(CdcCmdType::Set, IOCTL_CMD_SET_SSID, 0, &mut i.to_bytes())   
             .await; // set_ssid 

Notice also that the CdcHeader::flags field is set to 2, which is the direction write/set.

Let's take a look at the parameters of send_ioctl:

async fn send_ioctl(&mut self, kind: IoctlType, cmd: u32, iface: u32, data: &[u8]) {

We discussed kind and cmd earlier and data is the data to be sent. That leaves iface which will become part of the CdcHeader field flags:

 let cdc_header = CdcHeader {
       cmd: cmd,
       len: data.len() as _,
       flags: kind as u16 | (iface as u16) << 12,
       id: self.ioctl_id,
       status: 0,
 };
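
The flags expression and the lengths in the trace above can be checked with a few lines. This is a sketch under assumptions: Set = 2 is what the trace shows, but Get = 0 is my assumption, and the 16 byte size of the CdcHeader (cmd 4 + len 4 + flags 2 + id 2 + status 4) is inferred from the field types rather than taken from the source. The function names are made up.

```rust
#[derive(Clone, Copy)]
enum IoctlType {
    Get = 0, // assumed value; only Set = 2 is visible in the trace
    Set = 2,
}

/// Same expression as in send_ioctl: direction in the low bits,
/// interface number in the top 4 bits.
fn cdc_flags(kind: IoctlType, iface: u32) -> u16 {
    kind as u16 | (iface as u16) << 12
}

const SDPCM_HEADER_SIZE: usize = 12;
// cmd (4) + len (4) + flags (2) + id (2) + status (4), assumed layout.
const CDC_HEADER_SIZE: usize = 16;

/// Total frame length for an ioctl with the given payload length.
fn frame_len(payload_len: usize) -> usize {
    SDPCM_HEADER_SIZE + CDC_HEADER_SIZE + payload_len
}
```

This agrees with the trace: a Set on interface 0 gives flags 2, and a payload of 36 bytes gives 12 + 16 + 36 = 64, which is the len in the SdpcmHeader.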

Just like the IoctlType, the iface is part of the flags field and specific to the device (think of the request code in the Linux ioctl call).

IOVARs

IOVARs are really a way of calling IOCTL using one of two commands, IOCTL_CMD_SET_VAR and IOCTL_CMD_GET_VAR. For example in embassy cyw43 we have:

    self.set_iovar_u32("bus:txglom", 0).await;

which calls set_iovar_u32:

async fn set_iovar_u32(&mut self, name: &str, val: u32) {
     self.set_iovar(name, &val.to_le_bytes()).await
}

async fn set_iovar(&mut self, name: &str, val: &[u8]) { 
     ...
    self.ioctl(IoctlType::Set, IOCTL_CMD_SET_VAR, 0, &mut buf[..total_len]) 
        .await
}

So the name and the value of the variable to set are passed together as the data of the command.
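
The body of set_iovar is elided above, so the following is only a sketch of how such a buffer could be laid out, assuming the name is followed by a NUL byte and then the value. The function iovar_payload is hypothetical, and I use a Vec for simplicity where the driver would use a fixed buffer.

```rust
/// Builds an IOVAR payload as: name bytes, a NUL terminator, then the value.
/// (Assumed layout, not taken from the elided driver code.)
fn iovar_payload(name: &str, val: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(name.len() + 1 + val.len());
    buf.extend_from_slice(name.as_bytes());
    buf.push(0); // NUL byte separating the name from the value
    buf.extend_from_slice(val);
    buf
}
```

Under that assumption, setting "bus:txglom" to 0 would produce 15 bytes: 10 for the name, 1 for the terminator, and 4 for the little-endian u32.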

tx glomming

Glom is short for "conglomerate", which means "gather together into a compact mass". In the SDIO driver the txglom feature transfers multiple transmit packets in one MMC request.

Clocks

  • Idle Low Power (ILP) Generated by either a low-power oscillator (LPO) or by dividing the ALP clock frequency by a programmable value. Use of this clock maximizes power savings during idle states.

  • Active Low Power (ALP) Supplied by an internal or external oscillator. This clock is requested by cores when accessing backplane registers in other cores or when performing minor computations. When an external crystal is used to provide reference clock, ALP clock frequency is determined by the frequency of the external oscillator. A 37.4 MHz reference clock is recommended.

  • High Throughput (HT) Supplied by an on-chip PLL. This clock is requested by cores when they transfer blocks of data to or from memory, perform computation-intensive operations, or need to meet the requirements of external devices. Cores that cannot tolerate operations at less than the HT clock frequency, such as the memory controller, may assert the HT clock request continuously.

WLAN Software Architecture

The host driver provides the connection between the host (in our case the Pico Pi W) and the cyw43 device. This is presented as a network driver interface over an SDIO bus.