Conversation
klemenkenda
left a comment
There was a problem hiding this comment.
Great pull request making the code much more consistent. Some minor changes are requested.
| | **retrain_period**| integer | None | A number of received samples after which the model will be re-trained. This is an optional parameter. If it is not specified no re-training will be done.| | ||
| | **samples_for_retrain**| integer | None | A number of samples that will be used for re-training. If retrain_period is not specified this parameter will be ignored. This is an optional parameter. If it is not specified (and retrain_period is) the re-train will be done on all samples received since the component was started.| | ||
| | **time_offset**| string | H | [String alias](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases) to define the data time offsets. The aliases used in training and topic names are lowercase for backwards compatibility.| | ||
| | **learning_rate**| float| 4E-5 | Learning rate for the torch model.| |
There was a problem hiding this comment.
Separate PyTorch specific configuration parameters.
README.md
Outdated
| | **evaluation_split_point**| float | 0.8 | Define training and testing splitting point in the dataset, for model evaluation during learning phase (fit takes twice as long time).| | ||
| | **retrain_period**| integer | None | A number of received samples after which the model will be re-trained. This is an optional parameter. If it is not specified no re-training will be done.| | ||
| | **samples_for_retrain**| integer | None | A number of samples that will be used for re-training. If retrain_period is not specified this parameter will be ignored. This is an optional parameter. If it is not specified (and retrain_period is) the re-train will be done on all samples received since the component was started.| | ||
| | **time_offset**| string | H | [String alias](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases) to define the data time offsets. The aliases used in training and topic names are lowercase for backwards compatibility.| |
There was a problem hiding this comment.
This is not very clear to me since it is new, does it affect ScikitLearn models?
There was a problem hiding this comment.
Is this just the unit for prediction_horizons? If so - move these descriptions closer together.
There was a problem hiding this comment.
Yes. It is as you said a unit for prediction_horizons. I will also write about time_offset in prediction_horizons and move them together.
The sk-learn models can now also use different time offsets but if nothing is specified it defaults to hours for backwards compatibility.
| ## Requirements | ||
|
|
||
| * Python 3.6+ | ||
| * Python 3.9+ |
There was a problem hiding this comment.
3.9 is the version which I have installed and the version on which the code was tested. The sk-learn code should work on both versions as I didn't change that but I can't guarantee for any of new pip packages used. Do I find compatible versions of packages for 3.6 python and update this also along with requirements.txt?
src/lib/predictive_model.py
Outdated
|
|
||
| def __init__(self, algorithm, sensor, prediction_horizon, evaluation_periode, error_metrics, split_point, | ||
| retrain_period = None, samples_for_retrain = None, retrain_file_location = None): | ||
| def rmse(self, true, pred): |
There was a problem hiding this comment.
There is a file called regression_metrics.py. We should include this in there?
src/lib/predictive_model.py
Outdated
| #print "Loaded model from", filename | ||
| # print "Loaded model from", filename | ||
|
|
||
| class TorchNetwork: |
There was a problem hiding this comment.
Should we put this in a separate file?
src/main.py
Outdated
| port= 3001 | ||
| path= "/ping?id=5&secret=b9347c25aba4d3ba6e8f61d05fd1c011" | ||
| port = 3001 | ||
| path = "/ping?id=5&secret=b9347c25aba4d3ba6e8f61d05fd1c011" |
There was a problem hiding this comment.
Can we put secret in some config file?
There was a problem hiding this comment.
I don't know the purpose of the watchdog in this module but I moved all the hardcoded parameters to the config file.
src/main.py
Outdated
| timestamp = rec['timestamp'] | ||
| ftr_vector = rec['ftr_vector'] | ||
| measurement = ftr_vector[0] # first feature is the target measurement | ||
| # try: |
There was a problem hiding this comment.
try-catch was there for a reason ... probably because of potential crash due to misformatted messages. Please add it back.
There was a problem hiding this comment.
Yeah, that was an oversight. I wanted full error stack while debugging so I just put the try block in comments and forgot to uncomment after will fix it.
Two major updates.