Commit 421f0b4

update some readme
Author: pinard.liu
1 parent d055c1b commit 421f0b4

1 file changed, +36 -14 lines changed

classic-machine-learning/regression_production_example.ipynb

@@ -5,27 +5,49 @@
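 "metadata": {},
 "source": [
 "<h3>Production-ready linear regression modeling workflow</h3>\n",
-"1. Load the dataset\n",
-"2. Preview the data\n",
-"3. Extract the feature data and the output data\n",
-"4. Handle missing values and outliers (if any)\n",
-"5. Split the dataset into a training set and a validation set\n",
-"6. Standardize the training set data\n",
-"7. Train models on the training set with different parameters (cross-validation on the training set is optional)\n",
-"8. Check the MSE and RMSE on the training set\n",
-"9. Standardize the validation set data\n",
-"10. Predict on the validation set with the trained model, and check the validation MSE and RMSE\n",
-"11. Plot the results (optional)\n",
-"12. Repeat steps 7-11 and select the model with the lowest validation MSE or RMSE as the output model\n",
-"13. Save the model, then load it to make predictions (3 methods)\n",
+"<h4>Data preparation stage</h4>\n",
+"\n",
+"1.Load the dataset\n",
+"\n",
+"2.Preview the data\n",
+"\n",
+"3.Extract the feature data and the output data\n",
+"\n",
+"<h4>Feature engineering stage</h4>\n",
+"\n",
+"4.Handle missing values and outliers, and generate advanced features (if any)\n",
+"\n",
+"5.Split the dataset into a training set and a validation set\n",
+"\n",
+"6.Standardize the training set data\n",
+"\n",
+"<h4>Model training stage</h4>\n",
+"\n",
+"7.Train models on the training set with different parameters (cross-validation on the training set is optional)\n",
+"\n",
+"8.Check the MSE and RMSE on the training set\n",
+"\n",
+"<h4>Model validation stage</h4>\n",
+"\n",
+"9.Standardize the validation set data\n",
+"\n",
+"10.Predict on the validation set with the trained model, and check the validation MSE and RMSE\n",
+"\n",
+"11.Plot the results (optional)\n",
+"\n",
+"12.Repeat steps 7-11 and select the model with the lowest validation MSE or RMSE as the output model\n",
+"\n",
+"<h4>Deployment stage</h4>\n",
+"\n",
+"13.Save the model, then load it to make predictions (3 methods)\n",
 " * Save with the Python pickle API\n",
 " * Load the pickle file in Python to predict\n",
 " * Save with the sklearn joblib API\n",
 " * Load the sklearn model with the sklearn joblib API to predict\n",
 " * Save as a PMML file\n",
 " * Load the model with the PMML Java API to predict\n",
 "\n",
-"<h5>by Pinard Liu</h5>"
+"<h5>by Pinard Liu @ 20190131</h5>"
 ]
 },
 {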
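
For readers who want to see steps 5-12 of the workflow in the hunk above in code, here is a minimal sketch. It is not the notebook's own code. The synthetic data, the choice of `Ridge` with an `alpha` grid as the "different parameters", and the reuse of the training-set scaler for the validation set are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Steps 1-3 stand-in: hypothetical synthetic data, not the notebook's dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0 + rng.normal(scale=0.1, size=200)

# Step 5: split into a training set and a validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 6: fit the scaler on the training set only.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
# Step 9 (assumption): reuse the training-set statistics on the validation set.
X_val_std = scaler.transform(X_val)

results = {}
for alpha in (0.01, 1.0, 100.0):  # Step 7: try different parameters.
    model = Ridge(alpha=alpha).fit(X_train_std, y_train)
    # Steps 8 and 10: MSE and RMSE on the training and validation sets.
    train_mse = mean_squared_error(y_train, model.predict(X_train_std))
    val_mse = mean_squared_error(y_val, model.predict(X_val_std))
    results[alpha] = (model, train_mse, val_mse)
    print(
        f"alpha={alpha}: train MSE={train_mse:.4f}, RMSE={np.sqrt(train_mse):.4f}; "
        f"validation MSE={val_mse:.4f}, RMSE={np.sqrt(val_mse):.4f}"
    )

# Step 12: keep the model with the lowest validation MSE.
best_alpha = min(results, key=lambda a: results[a][2])
best_model = results[best_alpha][0]
print("selected alpha:", best_alpha)
```

Fitting the scaler on the training set alone keeps validation statistics out of training, so the validation MSE remains an honest estimate.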

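
The first two save/load methods listed under step 13 can be sketched as follows. This is an illustrative sketch, not the notebook's code: the toy model and file names are hypothetical. It also assumes a current environment where joblib is imported as the standalone `joblib` package, since the `sklearn.externals.joblib` import that was common in 2019 has since been removed from scikit-learn. The PMML route is left out because it needs a Java runtime.

```python
import os
import pickle
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy model standing in for the model selected in step 12.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # y = 2x + 1
model = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Method 1: save and load with the Python pickle API.
    pickle_path = os.path.join(tmp_dir, "model.pkl")
    with open(pickle_path, "wb") as f:
        pickle.dump(model, f)
    with open(pickle_path, "rb") as f:
        pickle_model = pickle.load(f)

    # Method 2: save and load with joblib.
    joblib_path = os.path.join(tmp_dir, "model.joblib")
    joblib.dump(model, joblib_path)
    joblib_model = joblib.load(joblib_path)

# Both reloaded models should predict the same values as the original.
X_new = np.array([[4.0]])
pickle_pred = pickle_model.predict(X_new)
joblib_pred = joblib_model.predict(X_new)
print(pickle_pred, joblib_pred)
```

Both formats rely on Python pickling, so only load files from a trusted source, and ideally with the same scikit-learn version that wrote them.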