Follow-up items from PR #132 (merge raft_proto into nvforest namespace)

## Overview

This issue tracks unresolved review comments from PR #132 (https://github.com/rapidsai/nvforest/pull/132) that were flagged during code review but not addressed before merge.

---

## Critical

### `cpp/include/nvforest/buffer.hpp` — size_t underflow in bounds check (lines 329-339, 361-372)
The `copy` overloads subtract `size_t` values before verifying offsets are in-range. If `src_offset > src.size()` or `dst_offset > dst.size()`, the subtraction wraps and the out-of-bounds copy proceeds silently.

**Fix:** Add `src_offset > src.size() || dst_offset > dst.size()` as pre-conditions before the subtraction, in both the lvalue and rvalue overloads.

---

## Major

### `cpp/include/nvforest/decision_forest.hpp` — buffers copied instead of moved (line 163)
The constructor accepts rvalue references (`buffer<T>&&`) but the member initializers use the parameter names as lvalues, causing deep copies of every model buffer on import.

**Fix:** Use `std::move(nodes)`, `std::move(root_node_indexes)`, etc. in the member initializer list.

### `cpp/include/nvforest/decision_forest.hpp` — categorical storage size formula bug (line 463)
`detail::ceildiv(max_num_categories, max_local_categories) + 1 * num_categorical_nodes` evaluates as `bins_per_node + num_categorical_nodes` due to operator precedence. This underestimates the required index range.

**Fix:** Add parentheses: `(detail::ceildiv(max_num_categories, max_local_categories) + 1) * num_categorical_nodes`.

### `cpp/include/nvforest/detail/infer/gpu.cuh` — no early return for `row_count == 0` (lines 190-193, 229-320)
When `row_count == 0`, `num_blocks` becomes 0 and the kernel launches / workspace allocations below are invalid.

**Fix:** Add `if (row_count == 0) { return; }` before any buffer allocation or kernel launch.

### `cpp/include/nvforest/detail/infer/gpu.cuh` — divide-by-zero when `shared_mem_per_block == 0` (lines 153-163)
`ceildiv(max_shared_mem_per_sm, shared_mem_per_block)` and the subsequent "divide shared mem evenly" step divide by zero when `shared_mem_per_block == 0`.

**Fix:** Guard the occupancy math with `if (shared_mem_per_block != 0)` and fall back to `resident_blocks_per_sm = max_resident_blocks` when zero.

### `cpp/include/nvforest/cuda_stream.hpp` — CUDA sync error silently dropped (lines 16-20)
`synchronize(cuda_stream)` calls `cudaStreamSynchronize` and discards the `cudaError_t` return value.

**Fix:** Capture the result and call `detail::cuda_check` (or equivalent) to propagate failures.

### `cpp/include/nvforest/handle.hpp` — null `raft_handle_` crashes in GPU builds (lines 16-29)
The default constructor allows `handle_ptr = nullptr`, but methods dereference it unconditionally. `nvforest::handle_t{}` will crash on the first inference in GPU builds.

**Fix:** Remove the default argument and throw `std::invalid_argument` when `handle_ptr == nullptr`, or delete the default constructor for GPU builds.

### `cpp/include/nvforest/treelite_importer.hpp` — null `TreeliteModelHandle` not guarded (lines 515-529)
`import_from_treelite_handle` dereferences `tl_handle` without a null check.

**Fix:** Add a null check and throw `model_import_error` when `tl_handle == nullptr`.

### `python/nvforest/nvforest/detail/forest_inference.pyx` — unknown `device` strings silently fall back to CPU (line 106)
Any `device` value other than `"gpu"` is silently mapped to CPU.

**Fix:** Raise `ValueError(f"device must be 'cpu' or 'gpu', got {device!r}")` for unknown values.

---

## Minor

### `cpp/include/nvforest/detail/infer/gpu.hpp` — missing `<type_traits>` include (lines 13-24)
The header uses `std::enable_if_t` without including `<type_traits>`.

**Fix:** Add `#include <type_traits>`.

### `cpp/include/nvforest/README.md` — inconsistent parameter name (lines 46-70)
The example uses `mem_type` but the parameter description uses `dev_type`.

**Fix:** Use one name consistently.

---

_Reported by CodeRabbit during review of PR #132. Requested by @hcho3._

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Follow-up items from PR #132 (merge raft_proto into nvforest namespace) #133

Overview

Critical

`cpp/include/nvforest/buffer.hpp` — size_t underflow in bounds check (lines 329-339, 361-372)

Major

`cpp/include/nvforest/decision_forest.hpp` — buffers copied instead of moved (line 163)

`cpp/include/nvforest/decision_forest.hpp` — categorical storage size formula bug (line 463)

`cpp/include/nvforest/detail/infer/gpu.cuh` — no early return for `row_count == 0` (lines 190-193, 229-320)

`cpp/include/nvforest/detail/infer/gpu.cuh` — divide-by-zero when `shared_mem_per_block == 0` (lines 153-163)

`cpp/include/nvforest/cuda_stream.hpp` — CUDA sync error silently dropped (lines 16-20)

`cpp/include/nvforest/handle.hpp` — null `raft_handle_` crashes in GPU builds (lines 16-29)

`cpp/include/nvforest/treelite_importer.hpp` — null `TreeliteModelHandle` not guarded (lines 515-529)

`python/nvforest/nvforest/detail/forest_inference.pyx` — unknown `device` strings silently fall back to CPU (line 106)

Minor

`cpp/include/nvforest/detail/infer/gpu.hpp` — missing `<type_traits>` include (lines 13-24)

`cpp/include/nvforest/README.md` — inconsistent parameter name (lines 46-70)

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Follow-up items from PR #132 (merge raft_proto into nvforest namespace) #133

Description

Overview

Critical

cpp/include/nvforest/buffer.hpp — size_t underflow in bounds check (lines 329-339, 361-372)

Major

cpp/include/nvforest/decision_forest.hpp — buffers copied instead of moved (line 163)

cpp/include/nvforest/decision_forest.hpp — categorical storage size formula bug (line 463)

cpp/include/nvforest/detail/infer/gpu.cuh — no early return for row_count == 0 (lines 190-193, 229-320)

cpp/include/nvforest/detail/infer/gpu.cuh — divide-by-zero when shared_mem_per_block == 0 (lines 153-163)

cpp/include/nvforest/cuda_stream.hpp — CUDA sync error silently dropped (lines 16-20)

cpp/include/nvforest/handle.hpp — null raft_handle_ crashes in GPU builds (lines 16-29)

cpp/include/nvforest/treelite_importer.hpp — null TreeliteModelHandle not guarded (lines 515-529)

python/nvforest/nvforest/detail/forest_inference.pyx — unknown device strings silently fall back to CPU (line 106)

Minor

cpp/include/nvforest/detail/infer/gpu.hpp — missing <type_traits> include (lines 13-24)

cpp/include/nvforest/README.md — inconsistent parameter name (lines 46-70)

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

`cpp/include/nvforest/buffer.hpp` — size_t underflow in bounds check (lines 329-339, 361-372)

`cpp/include/nvforest/decision_forest.hpp` — buffers copied instead of moved (line 163)

`cpp/include/nvforest/decision_forest.hpp` — categorical storage size formula bug (line 463)

`cpp/include/nvforest/detail/infer/gpu.cuh` — no early return for `row_count == 0` (lines 190-193, 229-320)

`cpp/include/nvforest/detail/infer/gpu.cuh` — divide-by-zero when `shared_mem_per_block == 0` (lines 153-163)

`cpp/include/nvforest/cuda_stream.hpp` — CUDA sync error silently dropped (lines 16-20)

`cpp/include/nvforest/handle.hpp` — null `raft_handle_` crashes in GPU builds (lines 16-29)

`cpp/include/nvforest/treelite_importer.hpp` — null `TreeliteModelHandle` not guarded (lines 515-529)

`python/nvforest/nvforest/detail/forest_inference.pyx` — unknown `device` strings silently fall back to CPU (line 106)

`cpp/include/nvforest/detail/infer/gpu.hpp` — missing `<type_traits>` include (lines 13-24)

`cpp/include/nvforest/README.md` — inconsistent parameter name (lines 46-70)