[tmva] Implementation of a new shuffling strategy in RBatchGenerator #20071

martinfoell · 2025-10-09T12:23:26Z

This Pull request:

Introduces a new shuffling strategy for creating training batches, ensuring that each batch consists of data from different parts of the RDataFrame. Each chunk loaded into memory, which is used to create batches, now contains blocks of data drawn from different parts of the dataframe.
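To illustrate the idea described above, here is a minimal sketch in plain Python. It is not the code in this PR: the function names (`make_blocks`, `make_chunks`, `iter_batches`) and the parameters (`block_size`, `blocks_per_chunk`) are hypothetical and chosen only for illustration. It assumes the dataset is split into contiguous blocks of entries, that the block order is shuffled before blocks are grouped into chunks, and that entries are shuffled again inside each chunk before batches are cut.

```python
import random


def make_blocks(num_entries, block_size):
    """Split [0, num_entries) into contiguous (start, stop) blocks."""
    return [(start, min(start + block_size, num_entries))
            for start in range(0, num_entries, block_size)]


def make_chunks(blocks, blocks_per_chunk, rng):
    """Shuffle the block order, then group the blocks into chunks.

    Each chunk therefore holds blocks from different parts of the dataset.
    """
    shuffled = list(blocks)
    rng.shuffle(shuffled)
    chunks = []
    for i in range(0, len(shuffled), blocks_per_chunk):
        # Sorting by start entry lets a chunk be read in dataset order
        # (an assumption about why sorted blocks would help loading).
        chunks.append(sorted(shuffled[i:i + blocks_per_chunk]))
    return chunks


def iter_batches(chunks, batch_size, rng):
    """Yield batches of entry indices, shuffled within each chunk."""
    for chunk in chunks:
        entries = [e for start, stop in chunk for e in range(start, stop)]
        rng.shuffle(entries)
        for i in range(0, len(entries), batch_size):
            # The last batch of a chunk may be shorter than batch_size.
            yield entries[i:i + batch_size]


rng = random.Random(42)  # fixed seed for reproducible shuffling
blocks = make_blocks(num_entries=100, block_size=10)
chunks = make_chunks(blocks, blocks_per_chunk=5, rng=rng)
batches = list(iter_batches(chunks, batch_size=25, rng=rng))
```

In this sketch only one chunk's worth of entries has to be held in memory at a time, while each batch can still mix entries from distant parts of the dataset. How the actual implementation handles leftover entries at chunk boundaries is not covered here.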

github-actions · 2025-10-09T18:45:08Z

Test Results

22 files, 22 suites, total duration 3d 16h 2m 25s ⏱️
3 692 tests: 3 691 passed ✅, 0 skipped 💤, 1 failed ❌
79 273 runs: 79 268 passed ✅, 0 skipped 💤, 5 failed ❌

For more details on these failures, see this check.

Results for commit 57396ba.

♻️ This comment has been updated with latest results.

vepadulano

Thanks a lot for this great work! This is a first iteration of comments from my side.

bindings/pyroot/pythonizations/python/ROOT/_pythonization/_tmva/_batchgenerator.py

bindings/pyroot/pythonizations/test/rbatchgenerator_completeness.py

tmva/tmva/inc/TMVA/BatchGenerator/RBatchGenerator.hxx

tmva/tmva/inc/TMVA/BatchGenerator/RChunkConstructor.hxx

martinfoell · 2025-10-17T14:33:37Z

Thank you for the review @vepadulano! I have implemented the changes that you suggested and left some comments where it was unclear what the code was doing.

Martin Foll added 12 commits October 9, 2025 14:10

Add RChunkConstructor.hxx for constructing chunks from blocks of data in the dataframe

805b23a

Add RChunkConstructor.hxx to CMakeLists.txt

6712fcd

Update RChunkLoader.hxx for loading chunks into memory

a0c2503

Update RBatchLoader.hxx for creating batches from the chunks

da69018

Update RBatchGenerator.hxx for generating batches from a dataframe

b3e3e53

Update pythonization of RBatchGenerator

d854092

Adjust RBatchGenerator tests and add tests for set_seed and vector padding

b9782ef

Update RBatchGenerator tutorials

f8c0904

Fix typos and clean up

18936f7

Add documentation and comments

13eef9d

Optimization of loading chunks by sorting BlocksInChunks first

4d27b1e

Enable generating batches from RNTuple

1b14def

martinfoell self-assigned this Oct 9, 2025

martinfoell requested review from bellenot, couet, dpiparo, lmoneta and vepadulano as code owners October 9, 2025 12:23

vepadulano requested changes Oct 17, 2025

Add comments, remove unused variables and clean up code

57396ba
