1 file changed, +36 -14 lines changed
"metadata": {},
"source": [
"<h3>A production-ready linear regression modeling workflow</h3>\n",
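- "1. Load the dataset\n",
- "2. Preview the data\n",
- "3. Extract the feature data and the target data\n",
- "4. Handle missing values and outliers (if any)\n",
- "5. Split the dataset into a training set and a validation set\n",
- "6. Standardize the training set\n",
- "7. Train models on the training set with different parameters (cross-validation on the training set is optional)\n",
- "8. Check the MSE and RMSE on the training set\n",
- "9. Standardize the validation set\n",
- "10. Predict on the validation set with the trained model and check the validation MSE and RMSE\n",
- "11. Plot the results for inspection (optional)\n",
- "12. Repeat steps 7-11 and select the model with the lowest validation MSE or RMSE as the final model\n",
- "13. Save the model, then load it to make predictions (3 methods in total)\n",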
+ "<h4>Data preparation stage</h4>\n",
+ "\n",
+ "1.Load the dataset\n",
+ "\n",
+ "2.Preview the data\n",
+ "\n",
+ "3.Extract the feature data and the target data\n",
+ "\n",
+ "<h4>Feature engineering stage</h4>\n",
+ "\n",
+ "4.Handle missing values and outliers, and generate advanced features (if any)\n",
+ "\n",
+ "5.Split the dataset into a training set and a validation set\n",
+ "\n",
+ "6.Standardize the training set\n",
+ "\n",
+ "<h4>Model training stage</h4>\n",
+ "\n",
+ "7.Train models on the training set with different parameters (cross-validation on the training set is optional)\n",
+ "\n",
+ "8.Check the MSE and RMSE on the training set\n",
+ "\n",
+ "<h4>Model validation stage</h4>\n",
+ "\n",
+ "9.Standardize the validation set\n",
+ "\n",
+ "10.Predict on the validation set with the trained model and check the validation MSE and RMSE\n",
+ "\n",
+ "11.Plot the results for inspection (optional)\n",
+ "\n",
+ "12.Repeat steps 7-11 and select the model with the lowest validation MSE or RMSE as the final model\n",
+ "\n",
+ "<h4>Deployment stage</h4>\n",
+ "\n",
+ "13.Save the model, then load it to make predictions (3 methods in total)\n",
"* Save with the Python pickle API\n",
"* Load the pickle file in Python to make predictions\n",
"* Save with the sklearn joblib API\n",
"* Load the sklearn model with the sklearn joblib API to make predictions\n",
"* Save as a PMML file\n",
"* Load the model with the PMML Java API to make predictions\n",
"\n",
- "<h5>by Pinard Liu</h5>"
+ "<h5>by Pinard Liu @ 20190131 </h5>"
]
},
{
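Steps 5-12 of the workflow listed in this diff can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the notebook's own code. The synthetic data, the one-feature ridge fit, and the `alpha` grid are all assumptions introduced here as a stand-in for "different parameters". It also assumes the validation set is scaled with the mean and standard deviation learned from the training set, which is the usual way to keep validation data from leaking into training.

```python
import math
import random
import statistics


def fit_scaler(xs):
    """Learn the mean and standard deviation from the training feature only."""
    return statistics.fmean(xs), statistics.pstdev(xs)


def scale(xs, mean, std):
    """Standardize values with previously learned statistics."""
    return [(x - mean) / std for x in xs]


def fit_ridge_1d(xs_scaled, ys, alpha):
    """Closed-form one-feature ridge fit with an unpenalized intercept.

    Assumes xs_scaled has zero mean, which holds for the standardized
    training feature. alpha=0 reduces to ordinary least squares.
    """
    y_mean = statistics.fmean(ys)
    sxy = sum(x * (y - y_mean) for x, y in zip(xs_scaled, ys))
    sxx = sum(x * x for x in xs_scaled)
    return sxy / (sxx + alpha), y_mean  # (coefficient, intercept)


def predict(model, xs_scaled):
    coef, intercept = model
    return [coef * x + intercept for x in xs_scaled]


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical synthetic data: y = 3x + 2 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]

# Step 5: split into training and validation sets.
x_train, x_val = xs[:150], xs[150:]
y_train, y_val = ys[:150], ys[150:]

# Step 6: standardize the training set.
mean, std = fit_scaler(x_train)
x_train_s = scale(x_train, mean, std)

# Step 9: standardize the validation set with the training statistics.
x_val_s = scale(x_val, mean, std)

# Steps 7, 8, 10, 12: train with several parameters and compare errors.
results = {}
for alpha in (0.0, 1.0, 100.0):
    model = fit_ridge_1d(x_train_s, y_train, alpha)
    train_mse = mse(y_train, predict(model, x_train_s))
    val_mse = mse(y_val, predict(model, x_val_s))
    results[alpha] = (model, train_mse, val_mse)
    print(f"alpha={alpha}: "
          f"train MSE={train_mse:.3f} RMSE={math.sqrt(train_mse):.3f}, "
          f"val MSE={val_mse:.3f} RMSE={math.sqrt(val_mse):.3f}")

# Keep the model with the lowest validation MSE.
best_alpha = min(results, key=lambda a: results[a][2])
best_model = results[best_alpha][0]
```

With scikit-learn, `StandardScaler`, `Ridge`, and `mean_squared_error` would be the usual counterparts of the helper functions above. Selecting on validation error rather than training error matters because the training MSE can only grow as `alpha` increases, so it cannot be used to choose between the candidates.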
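The pickle option from step 13 follows a save-then-load pattern that can be sketched as below. The "model" here is a plain dict of hypothetical parameters and `linear_model.pkl` is an illustrative file name, both chosen so the example runs with the standard library alone. A fitted scikit-learn estimator can be passed to `pickle.dump` in the same way.

```python
import os
import pickle
import tempfile

# Hypothetical fitted parameters standing in for a trained model.
params = {"coef": 3.0, "intercept": 2.0}


def predict(model_params, xs):
    """Apply the linear model y = coef * x + intercept."""
    return [model_params["coef"] * x + model_params["intercept"] for x in xs]


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "linear_model.pkl")

    # Save: serialize the model to disk in binary mode.
    with open(model_path, "wb") as f:
        pickle.dump(params, f)

    # Load: read the model back, as a separate serving process would.
    with open(model_path, "rb") as f:
        loaded_params = pickle.load(f)

predictions = predict(loaded_params, [0.0, 1.0, 2.0])
print(predictions)
```

`joblib.dump(model, path)` and `joblib.load(path)` follow the same pattern for the joblib option. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.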