Unable to replicate performance #4
Comments
Thanks for sharing your results. I didn't run the experiment with HyperOpt. Does your result stay stable under these params? In my experience, the reg weight for the best performance should be around 1e-2.
I realized that I wasn't disabling the dropout during evaluation. Fixed it and ran Hyperopt for 100 iterations this time, to be extra sure. Results are about the same:

DGCF: [results image]

LightGCN: [results image]
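For reference, a minimal sketch of the kind of fix I mean, assuming a standard PyTorch `nn.Module`; `compute_batch_metrics` and `aggregate` are hypothetical placeholders for the actual metric code:

```python
import torch

@torch.no_grad()  # also disables gradient tracking during evaluation
def evaluate(model: torch.nn.Module, eval_loader):
    model.eval()  # switches nn.Dropout (and batch norm, if any) to inference mode
    batch_metrics = [compute_batch_metrics(model, batch)  # hypothetical helper
                     for batch in eval_loader]
    model.train()  # restore training mode so dropout is active again next epoch
    return aggregate(batch_metrics)  # hypothetical helper
```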
I'm using Hyperopt (instead of a naive grid search) to speed up the evaluation. A reg weight of around 1e-2 was tested during Hyperopt's search. It's possible there is some other mistake in my implementation; I'm just not sure what it could be.
Actually, I use early stopping to control the training. Also, from my experience, the results are unstable on the ml-100k dataset. I don't know how Hyperopt solves this problem. The reported results are the best ones across all the different models.
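For concreteness, a minimal sketch of an early-stopping loop of this kind, assuming the validation metric is Recall@20; `train_one_epoch` and `eval_recall_at_20` are hypothetical placeholders, and `patience=10` is an arbitrary choice:

```python
import copy

def train_with_early_stopping(model, train_one_epoch, eval_recall_at_20,
                              patience=10, max_epochs=1000):
    best_recall, best_state, bad_epochs = 0.0, None, 0
    for epoch in range(max_epochs):
        train_one_epoch(model)
        recall = eval_recall_at_20(model)  # computed on the validation split
        if recall > best_recall:
            best_recall, bad_epochs = recall, 0
            best_state = copy.deepcopy(model.state_dict())  # keep the best weights
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # no improvement for `patience` epochs
                break
    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best epoch
    return best_recall
```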
I've attempted a reimplementation in PyTorch for the recsys framework RecBole (RUCAIBox/RecBole#594), so that it's convenient to compare with other algorithms.
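For anyone who wants to reproduce this once the PR is merged, something along these lines should launch a comparable run in RecBole (a sketch only; the config keys assume a recent RecBole version, and the split/metric values mirror the setup described below):

```python
from recbole.quick_start import run_recbole

# Assumed config: random 70/20/10 split, full-ranking evaluation,
# early stopping on validation Recall@20.
config_dict = {
    "eval_args": {"split": {"RS": [0.7, 0.2, 0.1]}, "order": "RO", "mode": "full"},
    "metrics": ["Recall", "NDCG"],
    "topk": [20],
    "valid_metric": "Recall@20",
}

run_recbole(model="DGCF", dataset="ml-100k", config_dict=config_dict)
```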
I replicated your experiment almost exactly, as far as I can tell: MovieLens-100K, a 70-20-10 split, and early stopping on Recall@20. The only difference I see is that I didn't remove users with few interactions, as you describe in the paper.
I used HyperOpt to search over the hyperparameter ranges specified in the paper (with an added option for dropout probability between 0.1 and 0.5), limited to 50 trials.
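Roughly, the search looked like the sketch below (illustrative only: `train_and_validate` is a hypothetical helper, and the lr/reg bounds here are placeholders, not the paper's exact ranges):

```python
import math
from hyperopt import fmin, tpe, hp, Trials

def objective(params):
    # HyperOpt minimizes the objective, so negate the validation Recall@20.
    return -train_and_validate(params)  # hypothetical helper

space = {
    "learning_rate": hp.loguniform("learning_rate", math.log(1e-4), math.log(1e-2)),
    "reg_weight": hp.loguniform("reg_weight", math.log(1e-5), math.log(1e-1)),
    "dropout_prob": hp.uniform("dropout_prob", 0.1, 0.5),  # the added option
}

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)  # limited to 50 trials, as above
print(best)
```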
DGCF results: [results image]

I did the same for LightGCN.

LightGCN results: [results image]
These figures are quite different from the ones in your paper, the NDCG especially; in particular, LightGCN is winning on every metric.
Is there anything not written in the paper that I might be missing in my implementation?
And, by the way, are you applying node dropout to LightGCN (even though it wasn't part of the original algorithm, as far as I know)?
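For context, node dropout on graph models like this is often implemented by dropping nonzero entries of the sparse normalized adjacency and rescaling, along the lines of the sketch below (assuming a PyTorch sparse COO adjacency; whether DGCF uses exactly this scheme is an assumption on my part):

```python
import torch

def sparse_dropout(adj: torch.Tensor, keep_prob: float) -> torch.Tensor:
    """Randomly drop entries of a sparse COO adjacency, rescaling the rest."""
    adj = adj.coalesce()
    nnz = adj.values().size(0)
    mask = torch.rand(nnz, device=adj.device) < keep_prob
    indices = adj.indices()[:, mask]
    values = adj.values()[mask] / keep_prob  # rescale to keep the expectation
    return torch.sparse_coo_tensor(indices, values, adj.shape)
```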
Thanks for any help!