https://github.com/lambdaji/tf_repos/tree/master/DeepMTL/Feature_pipeline #29

Outstandingwinner · 2019-09-02T08:04:06Z

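There is a pile of feature-processing scripts under this path and they are very messy; reading through them gave me a headache. What is the exact order in which these scripts should be run? get_join_sample.sh produces libsvm-format samples without feature-frequency filtering. To get the final libsvm-format samples with feature-frequency filtering applied, in what order should the scripts be executed?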

Outstandingwinner · 2019-09-02T08:05:37Z

Which file is the input data source for the get_feat_cnts.py script?

hexingjay · 2019-09-12T16:06:51Z

Same question here. After finishing the first step, get_join_sample, I could not get any further.

hexingjay · 2019-09-12T16:07:25Z

@Outstandingwinner Have you found a solution? This Tianchi dataset is really hard to use.

lambdaji · 2019-11-23T02:58:08Z

#step1 log to libsvm sample
sh get_join_sample.sh

#step2 stat sample & feature (optional, can be skipped)
sh get_stat_feat.sh

#step3 remap feat_id (drops low-frequency features; optional, can be skipped)
sh get_remap_fid.sh

#step4 libsvm to tfrecords
python get_tfrecord.py --threads=10 --input_dir=./ --output_dir=./
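
For readers wondering what the "remap feat_id" step in step3 amounts to, here is a minimal Python sketch of frequency filtering plus id remapping on libsvm-style lines. It is not the code in get_remap_fid.sh or get_feat_cnts.py. The line layout (`label fid:value ...`), the function names, the `min_count` threshold, and the choice to number kept features from 1 are all assumptions made for illustration; the repo's actual sample format may differ.

```python
from collections import Counter


def count_features(lines):
    """Count how many times each feature id occurs across all samples."""
    counts = Counter()
    for line in lines:
        tokens = line.split()
        # tokens[0] is assumed to be the label; the rest are "fid:value" pairs.
        for tok in tokens[1:]:
            fid = tok.split(":", 1)[0]
            counts[fid] += 1
    return counts


def build_remap(counts, min_count):
    """Map each feature id seen at least min_count times to a new dense id."""
    kept = [fid for fid, c in counts.items() if c >= min_count]
    # Most frequent first, ties broken by the original id, so the result is deterministic.
    kept.sort(key=lambda fid: (-counts[fid], fid))
    return {fid: new_id for new_id, fid in enumerate(kept, start=1)}


def remap_line(line, remap):
    """Rewrite one sample, dropping features that are missing from the remap table."""
    tokens = line.split()
    out = [tokens[0]]
    for tok in tokens[1:]:
        fid, val = tok.split(":", 1)
        if fid in remap:
            out.append(f"{remap[fid]}:{val}")
    return " ".join(out)


def filter_and_remap(lines, min_count=2):
    """Two passes: count feature frequencies, then rewrite every sample."""
    lines = list(lines)
    remap = build_remap(count_features(lines), min_count)
    return [remap_line(line, remap) for line in lines], remap


if __name__ == "__main__":
    samples = ["1 101:1 205:1", "0 101:1 307:1", "1 101:1 205:1"]
    filtered, table = filter_and_remap(samples, min_count=2)
    for row in filtered:
        print(row)
```

In this toy input, feature 307 occurs only once, so it is dropped, while 101 and 205 are renumbered to 1 and 2. Because the sketch needs the full counts before it can rewrite any line, it mirrors the order above: the stat step has to run before the remap step when low-frequency filtering is wanted.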
