Monte carlo block added #113
base: main
Conversation
@@ -166,7 +168,19 @@ impl Regressor {
        if mi.ffm_k > 0 {
            let block_ffm = block_ffm::new_ffm_block(&mut bg, mi).unwrap();
            let triangle_ffm = block_misc::new_triangle_block(&mut bg, block_ffm).unwrap();
            output = block_misc::new_join_block(&mut bg, vec![output, triangle_ffm]).unwrap();
            if mi.ffm_mc_iteration_count == 0 || mi.ffm_mc_dropout_rate <= 0.0 {
Negative dropout rate makes little sense imo?
It indeed doesn't make sense, but people make mistakes ;) That's why I am creating the triangulation there.
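To make the guard from the diff above concrete, here is a minimal sketch (the function name is a hypothetical extraction, not code from the PR): a non-positive dropout rate or a zero iteration count is treated as "Monte Carlo disabled", so a user mistake such as a negative rate falls back to the plain FFM path instead of misbehaving.

```rust
// Hypothetical helper mirroring the condition in the diff:
// `mi.ffm_mc_iteration_count == 0 || mi.ffm_mc_dropout_rate <= 0.0`
// means the Monte Carlo block is NOT added.
fn monte_carlo_enabled(iteration_count: u32, dropout_rate: f32) -> bool {
    iteration_count > 0 && dropout_rate > 0.0
}

fn main() {
    assert!(!monte_carlo_enabled(0, 0.5)); // no iterations: disabled
    assert!(!monte_carlo_enabled(5, -0.1)); // negative rate (user mistake): disabled
    assert!(monte_carlo_enabled(5, 0.5)); // valid configuration: enabled
}
```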
    dropout_rate: f32,
) -> Result<graph::BlockPtrOutput, Box<dyn Error>> {
    let num_inputs = bg.get_num_output_values(vec![&input]);
    assert_ne!(num_inputs, 0);
Should this be a debug assert?
Why debug? This runs at creation time; it should also break when running.
Assertions at runtime cost something - I guess this is fine as it's just during init?
        num_inputs,
    });
    let mut block_outputs = bg.add_node(block, vec![input])?;
    assert_eq!(block_outputs.len(), 1);
Assertions - perhaps restrict them explicitly to debug mode?
Why debug? This runs at creation time; it should also break when running.
Same answer as above -- as this is just init, it's fine; for hotter parts of the code, assertions would imply some kind of perf penalty.
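The trade-off in this thread can be sketched in a few lines (the function below is illustrative, not from the PR): `assert!` always runs, while `debug_assert!` compiles away in release builds. Since the checks above execute once at graph construction, an unconditional assert costs effectively nothing; in a per-example hot path, `debug_assert!` would avoid the runtime cost.

```rust
// Illustrative init-style function contrasting the two assertion flavors.
fn new_block(num_inputs: usize) -> usize {
    // Init-time check: always on, even in release builds.
    assert_ne!(num_inputs, 0);
    // Hot-path style check: compiled out in release builds.
    debug_assert!(num_inputs < 1_048_576);
    num_inputs
}

fn main() {
    assert_eq!(new_block(4), 4);
}
```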
src/block_monte_carlo.rs
Outdated
}

fn create_seed_from_input_tape(input_tape: &[f32]) -> u64 {
    (input_tape.iter().sum::<f32>() * 1000.0) as u64
Please add a constant at the top of the file, plus a brief explanation of what it is exactly.
Extracted it to a constant & added an explanation. I will keep it close to the actual usage; otherwise you constantly need to jump up & down.
Interesting view on that -- I guess the initial repo's convention is constants on top (if you check older files), even though both approaches are fine
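The extraction discussed in this thread might look like the following sketch; the constant's name and doc comment are assumptions (the PR's actual naming may differ), only the `1000.0` value and the seed function come from the diff above.

```rust
/// Hypothetical name for the extracted constant: scales the float sum before
/// truncation so small differences in the input tape yield distinct seeds.
const SEED_SCALING_FACTOR: f32 = 1000.0;

fn create_seed_from_input_tape(input_tape: &[f32]) -> u64 {
    // Deterministic seed: same input tape always produces the same seed.
    (input_tape.iter().sum::<f32>() * SEED_SCALING_FACTOR) as u64
}

fn main() {
    assert_eq!(create_seed_from_input_tape(&[1.0, 2.0]), 3000);
}
```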
) {
    unsafe {
        if self.number_of_inputs_to_skip == 0 {
            self.copy_input_tape_to_output_tape(pb);
AFAIR a "direct" implementation of this idea does not require this copy - is there an associated overhead?
We added an intermediate step, so we still need to copy the input tape to the output tape so it is used in the next step. It's copying values, so the cost isn't that large.
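A minimal sketch of the copy being discussed, assuming the tape is one flat float buffer and the input/output spans are index ranges into it (the function name and layout are assumptions, not the PR's actual API):

```rust
// Forward the input span to the output span on the same tape so the next
// block reads from the expected location.
fn copy_input_to_output(tape: &mut [f32], input_start: usize, output_start: usize, len: usize) {
    // copy_within handles potentially overlapping ranges within one slice.
    tape.copy_within(input_start..input_start + len, output_start);
}

fn main() {
    let mut tape = vec![1.0, 2.0, 3.0, 0.0, 0.0, 0.0];
    copy_input_to_output(&mut tape, 0, 3, 3);
    assert_eq!(tape, vec![1.0, 2.0, 3.0, 1.0, 2.0, 3.0]);
}
```

This is a plain memmove-style copy, which is why the overhead stays small relative to a full forward pass.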
}

fn fill_stats(&self, pb: &mut PortBuffer) {
    let mean: f32 = pb.observations.iter().sum::<f32>() / self.num_iterations as f32;
With 5 iterations we get to 0.5 overhead per prediction. So not that much :)
Right, assuming we stay in that range, it's fine then.
}
let stats_slice = std::slice::from_raw_parts_mut(stats, stats_size);
*stats_slice.get_unchecked_mut(0) = prediction_stats.mean;
*stats_slice.get_unchecked_mut(1) = prediction_stats.variance;
No need to report both std and var; var is enough.
We actually need the std dev. But I am reporting both just in case we mess up the calculation.
Merely saying that stddev is just sqrt(var), computable more or less everywhere (e.g., during analysis)
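The point in this thread can be illustrated with a small sketch; the struct mirrors the PredictionStats described in the PR, but its exact fields and methods here are assumptions. Since stddev is just sqrt(var), it can be derived on demand rather than stored and reported separately.

```rust
// Illustrative stand-in for the PR's PredictionStats.
struct PredictionStats {
    mean: f32,
    variance: f32,
}

impl PredictionStats {
    fn from_observations(obs: &[f32]) -> Self {
        let n = obs.len() as f32;
        let mean = obs.iter().sum::<f32>() / n;
        // Population variance over the Monte Carlo sub-runs.
        let variance = obs.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
        PredictionStats { mean, variance }
    }

    // Std dev recomputed from variance instead of stored alongside it.
    fn std_dev(&self) -> f32 {
        self.variance.sqrt()
    }
}

fn main() {
    let s = PredictionStats::from_observations(&[1.0, 3.0]);
    assert_eq!(s.mean, 2.0);
    assert_eq!(s.variance, 1.0);
    assert_eq!(s.std_dev(), 1.0);
}
```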
if cmd_arguments.is_present("ffm_mc_iteration_count") {
    if let Some(val) = cmd_arguments.value_of("ffm_mc_iteration_count") {
        let hvalue = val.parse()?;
        mi.ffm_mc_iteration_count = hvalue;
What exactly is the meaning of the two new hpos?
ffm_mc_iteration_count - the number of iterations the Monte Carlo will be run for
ffm_mc_dropout_rate - the dropout rate for it
Or did I misunderstand?
Ok cool, just wanted to make sure I understood :) Intuitively mc_iteration_count=1 + mc_dropout_rate=0.0 would imply the current mode of operation then -- this is not entirely true anymore though, right?
current = current main
Created a separate MonteCarlo block which executes the run of a specific continuation of blocks n times (where n is the number of iterations), masking a configurable number of inputs with a random function whose seed is based on the sum of the input. For each run we pick which indices should be masked instead of iterating over the whole array & checking whether each element should be masked or not.

In the case of FW this block can be configured to run as part of the FFM after the triangle block, where it re-uses the input from the FFM/triangle on each run, which reduces CPU usage & execution time.

The results of the sub-runs are stored in a separate PredictionStats struct, which contains the mean, standard deviation, variance and number of iterations. Additionally we allow exposing the PredictionStats information via FFI as an array of outputs.
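The masking strategy the description mentions - drawing the indices to mask instead of flipping a coin per element - can be sketched as follows. The RNG here is a plain LCG stand-in and all names are illustrative; the PR's actual block will differ in details.

```rust
// Draw `num_to_mask` distinct indices into a slice of length `len`,
// deterministically from `seed` (LCG used here purely for illustration).
fn pick_masked_indices(len: usize, num_to_mask: usize, seed: u64) -> Vec<usize> {
    let mut state = seed;
    let mut picked = Vec::with_capacity(num_to_mask);
    while picked.len() < num_to_mask {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let idx = (state >> 33) as usize % len;
        if !picked.contains(&idx) {
            picked.push(idx);
        }
    }
    picked
}

// Dropout-style masking: zero out only the chosen inputs, so the cost is
// O(num_to_mask) picks rather than O(len) coin flips.
fn apply_mask(values: &mut [f32], num_to_mask: usize, seed: u64) {
    for idx in pick_masked_indices(values.len(), num_to_mask, seed) {
        values[idx] = 0.0;
    }
}

fn main() {
    let mut v = vec![1.0f32; 8];
    apply_mask(&mut v, 3, 42);
    // Exactly `num_to_mask` entries were zeroed.
    assert_eq!(v.iter().filter(|&&x| x == 0.0).count(), 3);
}
```

Because the seed is derived from the input (see create_seed_from_input_tape above), the same example masks the same indices on every prediction, which keeps the Monte Carlo runs reproducible.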