Minor Scorer Bugs #40

Open
hijohnnylin opened this issue Nov 22, 2024 · 3 comments
Comments

@hijohnnylin
Contributor

hijohnnylin commented Nov 22, 2024

Hey all! I'm working on deeper integration of this into Neuronpedia and ran into a few things. They're not blockers, but may be good to resolve at some point:

  1. When FuzzingScorer or DetectionScorer receives fewer activations than the batch_size, all predictions seem to be -1, which causes the resulting dataframe to be empty:

     Empty DataFrame
     Columns: []
     Index: []

  2. When all activations passed into FuzzingScorer have a zero activation value, we get a "division by zero" error (a minimal sketch of this failure follows the list):

       File "sae-auto-interp/.venv/lib/python3.11/site-packages/sae_auto_interp/scorers/classifier/fuzz.py", line 41, in average_n_activations
         avg = sum(
               ^^^^
     ZeroDivisionError: division by zero
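
For reference, here is a rough, hypothetical reconstruction of the second failure. It is not the library's actual code, just my reading of the traceback: averaging over activating examples divides by zero when every example has only zero activations.

```python
# Hypothetical illustration only -- not the real average_n_activations.
# Each example is assumed to be a list of per-token activation values.
def average_n_activations(examples):
    # Only examples that actually activate are used to decide how many
    # tokens to highlight; if none do, the divisor below is zero.
    activating = [ex for ex in examples if any(v > 0 for v in ex)]
    return sum(sum(1 for v in ex if v > 0) for ex in activating) / len(activating)

average_n_activations([[0.0, 0.0], [0.0, 0.0]])  # raises ZeroDivisionError
```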

Thanks for all your work on this!

@SrGonao
Collaborator

SrGonao commented Nov 25, 2024

In regard to the first point, I think this is the intended behavior.
Can you explain the use case where you are always sending fewer than batch_size examples to the scorer? We are currently using batch_size to guarantee that the response is the expected size, but I see that this could be changed.

On the second point, is it because you are passing no test examples and only extra examples? I can only see the divisor being zero when there are no test examples. If that is the case, we actually have an assert that is not doing anything and could be moved. We use the examples with non-zero activations to decide how many tokens to highlight for that feature, so it is expected that you never pass only zero activations.

@hijohnnylin
Contributor Author

Thanks for the quick response! Again, I agree both of these are minor issues and may not even qualify as bugs.

> In regard to the first point, I think this is the intended behavior. Can you explain the use case where you are always sending fewer than batch_size examples to the scorer? We are currently using batch_size to guarantee that the response is the expected size, but I see that this could be changed.

  • Good point. It's pretty rare that we would do this; I ran into it while manually testing the method.
  • It might be good to return an explicit error when the number of activations is lower than the batch size, so that the user knows the result was not processed, rather than getting an empty dataframe (a rough sketch of such a check follows this list).
  • I'm not sure whether there are related issues with processing the last batch when the number of activations is not an exact multiple of the batch size. E.g., if I give it 13 activations with a batch size of 5, the remainder of 3 will be processed, right? I have not checked this explicitly, and apologies if it's not an issue.
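
To be concrete, something along these lines is what I had in mind. The names are made up; this only sketches the explicit check and the remainder behavior, not the scorer's actual API.

```python
# Hypothetical names -- a sketch of the check and remainder batching discussed
# above, not the scorer's actual API.
def validate_and_batch(activation_records, batch_size):
    if len(activation_records) < batch_size:
        # An explicit error is easier to act on than a silently empty dataframe.
        raise ValueError(
            f"Expected at least {batch_size} activation records, "
            f"got {len(activation_records)}."
        )
    # Full batches plus a final partial batch: 13 records with batch_size 5
    # give batches of 5, 5 and 3.
    return [
        activation_records[start:start + batch_size]
        for start in range(0, len(activation_records), batch_size)
    ]

print([len(batch) for batch in validate_and_batch(list(range(13)), 5)])  # [5, 5, 3]
```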

> On the second point, is it because you are passing no test examples and only extra examples? I can only see the divisor being zero when there are no test examples. If that is the case, we actually have an assert that is not doing anything and could be moved. We use the examples with non-zero activations to decide how many tokens to highlight for that feature, so it is expected that you never pass only zero activations.

  • Ah, yeah, this is again an edge case triggered by my manual testing. You are correct that in this case it would be passing zero test examples and only non-activating examples. If I were to nitpick, I'd add a warning message or, as you suggested, something like an assert instead of a divide-by-zero error (see the sketch below).
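
Roughly this kind of guard, wherever the scorer first sees its examples (again, purely hypothetical names, not the scorer's code):

```python
import warnings

# Hypothetical guard -- a sketch of the warning/assert idea, not the scorer's code.
def has_activating_examples(examples):
    if not any(any(v > 0 for v in ex) for ex in examples):
        warnings.warn(
            "All provided examples have zero activations; at least one "
            "activating example is needed to decide how many tokens to highlight."
        )
        return False
    return True
```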

@SrGonao
Collaborator

SrGonao commented Nov 27, 2024

I think your suggestions make sense. Before, we threw errors and stopped the program when some of these things happened, but once we wanted it to run over tons of latents we just logged the errors and kept going with default outputs.
I do want to change that so errors are only ignored if you pass a flag somewhere. Some of the logging that was in place also seems to have been displaced, and that should be fixed.
I will work on this for the v0.2 release.
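
Roughly the behavior I have in mind (the flag and function names are placeholders, not the actual v0.2 interface):

```python
import logging

logger = logging.getLogger(__name__)

# Placeholder names -- a sketch of the opt-in error handling described above,
# not the actual v0.2 interface.
def score_latent(latent, scorer, raise_on_error=True, default_output=None):
    try:
        return scorer(latent)
    except Exception:
        if raise_on_error:
            raise
        # With the flag unset, log the error and keep going with a default
        # output so a long run over many latents is not interrupted.
        logger.exception("Scoring failed for latent %r; using default output.", latent)
        return default_output
```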
