https://github.com/lambdaji/tf_repos/tree/master/DeepMTL/Feature_pipeline #29

Open
Outstandingwinner opened this issue Sep 2, 2019 · 4 comments


Outstandingwinner commented Sep 2, 2019

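There are a lot of feature-processing scripts under this path, and they are disorganized and hard to follow. In what order should they be run? get_join_sample.sh produces libsvm-format samples without any feature-frequency filtering. To get the final libsvm output with low-frequency features filtered out, in what order should these scripts be executed?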

@Outstandingwinner
Author

Which file is the input data source for the get_feat_cnts.py script?

@hexingjay

I have the same question. After the first step, get_join_sample, I can't get any further.

@hexingjay

@Outstandingwinner Have you found a solution? This Tianchi dataset is really hard to use.

@lambdaji
Owner

#step1 log to libsvm sample
sh get_join_sample.sh

#step2 stat sample & feature (can be skipped)
sh get_stat_feat.sh

#step3 remap feat_id (removes low-frequency features; can be skipped)
sh get_remap_fid.sh

#step4 libsvm to tfrecords
python get_tfrecord.py --threads=10 --input_dir=./ --output_dir=./
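
For readers wondering what step 3 amounts to, here is a rough Python sketch of frequency filtering plus id remapping on libsvm-style lines. It is not the code behind get_remap_fid.sh. The function names (`count_features`, `build_remap`, `remap_lines`), the `min_count` threshold, and the plain `label fid:value ...` line layout with integer feature ids are all assumptions made for illustration, and the repo's actual sample format may differ.

```python
from collections import Counter


def count_features(lines):
    """Count occurrences of each feature id across all samples."""
    counts = Counter()
    for line in lines:
        tokens = line.split()
        # tokens[0] is the label; the rest are assumed to be "fid:value" pairs.
        for tok in tokens[1:]:
            fid = tok.split(":", 1)[0]
            counts[fid] += 1
    return counts


def build_remap(counts, min_count):
    """Map features seen at least min_count times to contiguous ids from 1."""
    # Assumes feature ids are integers so they can be sorted numerically.
    kept = sorted((fid for fid, c in counts.items() if c >= min_count), key=int)
    return {fid: str(new_id) for new_id, fid in enumerate(kept, start=1)}


def remap_lines(lines, remap):
    """Rewrite each sample, dropping features that are missing from remap."""
    out = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        kept = []
        for tok in tokens[1:]:
            fid, val = tok.split(":", 1)
            if fid in remap:
                kept.append(f"{remap[fid]}:{val}")
        out.append(" ".join([tokens[0]] + kept))
    return out


if __name__ == "__main__":
    # Hypothetical toy samples, not taken from the dataset.
    sample = ["1 10:1 20:1 30:1", "0 10:1 30:1", "1 10:1 40:1"]
    remap = build_remap(count_features(sample), min_count=2)
    for row in remap_lines(sample, remap):
        print(row)
```

Counting first and remapping in a second pass keeps the new ids contiguous, which is usually what an embedding table downstream expects. If step 3 is skipped, the original feature ids from step 1 are presumably passed straight to get_tfrecord.py.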


3 participants