I have some questions about running the DCN model on the dataset below:
http://labs.criteo.com/2014/02/download-kaggle-display-advertising-challenge-dataset/
Kaggle Display Advertising Challenge Dataset
The data format is described as:
The columns are tab separated with the following schema:
<integer feature 1> ... <integer feature 13> <categorical feature 1> ... <categorical feature 26>
The data does not distinguish user IDs from item IDs, so how can it be used to make recommendations for users? Also, when get_criteo_feature.py processes the data, many categorical values are simply cut off. How can individual users then be told apart?
parser.add_argument(
    "--cutoff",
    type=int,
    default=200,
    help="cutoff long-tailed categorical values"
)
Thanks!
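The cutoff is there to limit the size of the embedding table for ID-type features: all long-tail IDs are mapped to index 0. If you know how to handle large-scale sparse ID features with a parameter server, you can also include all of them in training.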
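To illustrate the idea, here is a minimal sketch of frequency-based cutoff for one categorical column. It is not the code in get_criteo_feature.py: the helper names build_vocab and encode are hypothetical, and treating "at least cutoff occurrences" as the threshold is an assumption (the actual script may compare differently).

```python
from collections import Counter


def build_vocab(values, cutoff):
    """Assign indices 1..N to values seen at least `cutoff` times.

    Index 0 is reserved for everything else (the long tail).
    Hypothetical helper; the >= comparison is an assumption.
    """
    counts = Counter(values)
    frequent = sorted(v for v, c in counts.items() if c >= cutoff)
    return {v: i + 1 for i, v in enumerate(frequent)}


def encode(value, vocab):
    """Return the embedding index for `value`, or 0 if it was cut off or never seen."""
    return vocab.get(value, 0)


# Toy column: "a" appears 5 times, "b" 3 times, "c" once.
column = ["a"] * 5 + ["b"] * 3 + ["c"]
vocab = build_vocab(column, cutoff=3)
encoded = [encode(v, vocab) for v in column]
```

With this scheme the embedding table needs only len(vocab) + 1 rows, no matter how many distinct rare values the raw column contains. All rare values share the row at index 0, so they can no longer be told apart from one another.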