Commit ca8a194

GH-50063: [C++] Validate buffer size for row-major tensors (#50064)
### Rationale for this change

`ValidateTensorParameters` in cpp/src/arrow/tensor.cc only runs the `CheckTensorStridesValidity` buffer-overrun guard when strides are passed explicitly. With implicit (row-major) strides it computes strides for overflow but never checks the data buffer is large enough for the shape, so a tensor whose shape exceeds its buffer is accepted and later read out of bounds. This is reachable from IPC `ReadTensor`, where the shape comes from the flatbuffer and the body size is independent of it.

### What changes are included in this PR?

Run `CheckTensorStridesValidity` on the computed row-major strides too.

### Are these changes tested?

Added a case to `TestTensor.MakeFailureCases`.

### Are there any user-facing changes?

No.

**This PR contains a "Critical Fix".** Crafted IPC tensor metadata (or any caller building a row-major tensor over an undersized buffer) bypassed the bounds check, enabling an out-of-bounds read.

* GitHub Issue: #50063

Authored-by: metsw24-max <metsw24@gmail.com>
Signed-off-by: Rok Mihevc <rok@mihevc.org>
1 parent d9bc3b9 commit ca8a194

2 files changed

Lines changed: 4 additions & 0 deletions

cpp/src/arrow/tensor.cc

Lines changed: 1 addition & 0 deletions
@@ -216,6 +216,7 @@ Status ValidateTensorParameters(const std::shared_ptr<DataType>& type,
     std::vector<int64_t> tmp_strides;
     RETURN_NOT_OK(ComputeRowMajorStrides(checked_cast<const FixedWidthType&>(*type),
                                          shape, &tmp_strides));
+    RETURN_NOT_OK(CheckTensorStridesValidity(data, shape, tmp_strides, type));
   }
   if (dim_names.size() > shape.size()) {
     return Status::Invalid("too many dim_names are supplied");

cpp/src/arrow/tensor_test.cc

Lines changed: 3 additions & 0 deletions
@@ -268,6 +268,9 @@ TEST(TestTensor, MakeFailureCases) {
   ASSERT_RAISES(Invalid, Tensor::Make(float64(), data, shape,
                                       {sizeof(double) * 12, sizeof(double)}));
 
+  // row-major (implicit strides) shape larger than the backing buffer
+  ASSERT_RAISES(Invalid, Tensor::Make(float64(), data, {3, 100}));
+
   // too many dim_names are supplied
   ASSERT_RAISES(Invalid, Tensor::Make(float64(), data, shape, {}, {"foo", "bar", "baz"}));
 }
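
To illustrate the kind of check that now also covers implicit strides, here is a minimal standalone sketch. It is not Arrow's implementation: `RowMajorStrides` and `FitsInBuffer` are hypothetical helpers written for this note, they skip the overflow handling the real code performs, and they assume non-negative strides. The idea is that the last addressable element, at byte offset sum((shape[i] - 1) * strides[i]), must end within the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Arrow): byte strides for a row-major
// (C-contiguous) layout, where the last dimension varies fastest.
std::vector<int64_t> RowMajorStrides(const std::vector<int64_t>& shape,
                                     int64_t byte_width) {
  std::vector<int64_t> strides(shape.size());
  int64_t stride = byte_width;
  for (std::size_t i = shape.size(); i-- > 0;) {
    strides[i] = stride;
    stride *= shape[i];
  }
  return strides;
}

// Hypothetical helper (not part of Arrow): returns true when the element
// with the largest byte offset still ends inside a buffer of `buffer_size`
// bytes. Assumes non-negative strides and ignores integer overflow.
bool FitsInBuffer(const std::vector<int64_t>& shape,
                  const std::vector<int64_t>& strides,
                  int64_t byte_width, int64_t buffer_size) {
  int64_t last_offset = 0;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    if (shape[i] == 0) return true;  // an empty tensor touches no bytes
    last_offset += (shape[i] - 1) * strides[i];
  }
  return last_offset + byte_width <= buffer_size;
}
```

As a worked case, assume the test's `data` buffer holds 18 doubles, i.e. 144 bytes (the diff does not show how `data` is built, so this size is an assumption). For shape {3, 100} the row-major strides are {800, 8}, so the last element starts at 2 * 800 + 99 * 8 = 2392 and ends at byte 2400, far past 144, and `FitsInBuffer` returns false. Under the same assumption, shape {3, 6} with strides {48, 8} ends exactly at byte 144 and passes.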
