Use of temperature param #148

Open
rishabhm12 opened this issue Nov 28, 2024 · 1 comment
Comments

@rishabhm12

Hey folks,
Have we tested the use of the temperature param? I don't see it in the loss function, nor in ColBERT's official implementation.

@rishabhm12
Author
rishabhm12 commented Nov 28, 2024

One reason I can think of is that multi-token similarity might produce scores that are already well separated between positive and negative items for a given query after a few training steps (positives having a higher score). Dividing the logits by a temperature might skew the distribution further, making the post-softmax probabilities very high for the positive query-item pair and lowering the loss significantly, thereby preventing the model from learning meaningful embeddings. The model would think it is already producing meaningful embeddings and would not end up learning efficiently.
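
For reference, here is a minimal sketch of what a temperature would look like in an in-batch contrastive (InfoNCE-style) loss over ColBERT-style MaxSim scores. This is not the repository's actual loss; the names `maxsim_scores`, `in_batch_contrastive_loss`, and the `temperature` argument are hypothetical and only illustrate the mechanism being discussed.

```python
import torch
import torch.nn.functional as F

def maxsim_scores(q_emb, d_emb):
    """Late-interaction (MaxSim) scores.

    q_emb: (B, Lq, D) query token embeddings
    d_emb: (B, Ld, D) document token embeddings
    Returns a (B, B) matrix scoring every query against every document.
    """
    # (B, B, Lq, Ld): similarity of every query token to every doc token
    sim = torch.einsum("qid,pjd->qpij", q_emb, d_emb)
    # Max over doc tokens, then sum over query tokens
    return sim.max(dim=-1).values.sum(dim=-1)

def in_batch_contrastive_loss(q_emb, d_emb, temperature=1.0):
    """Cross-entropy over in-batch negatives; diagonal entries are positives.

    Dividing by a temperature < 1 sharpens the softmax over the scores.
    """
    scores = maxsim_scores(q_emb, d_emb) / temperature
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

# Toy illustration of the sharpening effect described above:
scores = torch.tensor([[12.0, 7.0, 6.5]])        # one query vs. 3 docs
print(F.softmax(scores, dim=-1))                  # already ~0.99 on the positive
print(F.softmax(scores / 0.05, dim=-1))           # ~1.0; loss ~ 0, tiny gradient
```

Because MaxSim scores are sums over query tokens, the positive-vs-negative margin tends to be large already; a temperature below 1 would mostly push the softmax toward one-hot, collapse the loss toward zero, and shrink the gradient, which matches the concern above.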
