DLRM OneEmbedding Graph Train and Eval#323

Merged
ShawnXuan merged 199 commits into main from dev_dlrm_graph_train
Mar 30, 2022

@ShawnXuan commented Mar 20, 2022

Adds training and evaluation in graph mode for the OneFlow DLRM model with OneEmbedding.

  • README
  • parquet dataset script

tail = [rglist[i][pos:] for i in range(self.C_end)]


class Dense(nn.Module):
Contributor

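I have an idea (though I'm not sure whether it would work): drop Dense and use FusedMLP by default. If the environment does not meet its requirements, FusedMLP already handles that internally by chaining multiple matrix multiplications instead.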

Contributor Author

We can remove Dense soon, once more tests have made us fully confident in FusedMLP.
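The fallback the reviewer describes can be pictured as a plain chain of layers, each one a matrix multiply, a bias add, and an activation. The sketch below is only a conceptual illustration in pure Python, not OneFlow's FusedMLP and not the PR's Dense class; the function names, the ReLU activation, and the `skip_final_activation` flag are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Layer = Tuple[Matrix, List[float]]  # (weight of shape [in, out], bias of shape [out])


def matmul_bias_relu(x: Matrix, w: Matrix, b: Sequence[float], apply_relu: bool = True) -> Matrix:
    """Compute one layer as separate steps: y = x @ w + b, then optional ReLU."""
    out = []
    for row in x:
        y = [
            sum(value * w[k][j] for k, value in enumerate(row)) + b[j]
            for j in range(len(b))
        ]
        out.append([max(v, 0.0) for v in y] if apply_relu else y)
    return out


def unfused_mlp(x: Matrix, layers: Sequence[Layer], skip_final_activation: bool = False) -> Matrix:
    """Chain the layers one by one, which is what an unfused fallback amounts to."""
    last_index = len(layers) - 1
    for i, (w, b) in enumerate(layers):
        is_last = i == last_index
        x = matmul_bias_relu(x, w, b, apply_relu=not (is_last and skip_final_activation))
    return x


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    layers = [(identity, [0.0, 0.0]), ([[1.0], [1.0]], [0.5])]
    print(unfused_mlp([[1.0, -2.0]], layers))
```

A fused kernel would be expected to produce the same numbers while launching fewer operations, which is why swapping Dense for FusedMLP should be checkable by comparing outputs on identical weights.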

)

def forward(self, dense_fields, sparse_fields) -> flow.Tensor:
    dense_fields = flow.log(dense_fields + 1.0)
Contributor

As I recall, this is strongly tied to the dataset, isn't it? Does it need special handling?

Contributor Author

Yes. During data preprocessing we add a constant 1 to the dense fields for the criteo1t dataset and 3 for the Criteo Kaggle dataset.
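To make the interaction concrete: the offset is applied once during preprocessing, and the model then applies `log(x + 1.0)` as in the `forward` snippet above. A minimal sketch, using `math.log` in place of `flow.log`; the dictionary keys and function names are hypothetical, and only the offsets 1 and 3 come from the reply.

```python
import math

# Offsets quoted in the reply; the dataset keys are illustrative names.
PREPROCESS_OFFSET = {"criteo1t": 1, "criteo_kaggle": 3}


def preprocess_dense(value: float, dataset: str) -> float:
    """Dataset-specific step: shift the raw dense value by the dataset's offset."""
    return value + PREPROCESS_OFFSET[dataset]


def model_dense_transform(value: float) -> float:
    """Dataset-independent step, mirroring flow.log(dense_fields + 1.0)."""
    return math.log(value + 1.0)


if __name__ == "__main__":
    for dataset, raw in [("criteo1t", -1), ("criteo_kaggle", -3), ("criteo_kaggle", 2)]:
        shifted = preprocess_dense(raw, dataset)
        print(dataset, raw, shifted, model_dense_transform(shifted))
```

Keeping the dataset-specific offset in preprocessing means the model code stays the same for both datasets.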

@@ -0,0 +1,3 @@
petastorm
Contributor

Is there a specific version constraint?

Contributor Author

It worked on the latest version.
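Since no version is pinned, one way to record which version "latest" was in a given environment is to query the installed distribution. A small sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name is hypothetical, and nothing here reflects a constraint stated in the PR.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None for the version when petastorm is not installed.
    print("petastorm:", installed_version("petastorm"))
```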

@ShawnXuan marked this pull request as ready for review March 26, 2022 05:18
@ShawnXuan merged commit 09d520d into main Mar 30, 2022
@ShawnXuan deleted the dev_dlrm_graph_train branch March 30, 2022 07:44

6 participants