-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend PrepPipeline to allow performing PREP in stages #73
Comments
I agree. Usually I dont do the remove line noise at all because the signals were already filtered. |
I think I forgot how the robust re referencing works ... I am surprised that it can work without interpolation. Can you explain? Other than that, the api proposal sounds good to me. As long as we have good documentation an warnings in place to prevent users from accidentally screwing up their data, it should be fine. |
I understood that @a-hurst meant without the "final" interpolation. Not "without any interpolation at all". Now, I do wonder how it could be done that way: The interpolation inside the robust referencing is always done, but it doesn't affect the real signal but a dummy-temporal one , that interpolation is only done to calculate the robust reference. Line 116 in 051edd6
Line 318 in 051edd6
What I get is that at the end of robust referencing PREP once again searches for bad channels but this time he interpolates it on the "real" signal . These channels then become the "interpolated channels": Line 153 in 051edd6
Then prep once again searches for bad channels and what he finds is named "still_noisy_channels". |
Yep, that's what I was referring to! As you said, during referencing NoisyChannels only ever does interpolation to make predictions during RANSAC. Then once it does its loops of NoisyChannels, it interpolates all 'bad' channels on a dummy copy of the data and uses that to calculate an average reference signal free of bad channels, and then re-references the data. After that, NoisyChannels is run again, and any channels that are still flagged as 'bad' get interpolated. The signal is then average-referenced to account for the interpolated channels, and final run of NoisyChannels is used to detect any 'remaining bad' channels even after interpolating bads. What I'm proposing is to have a method EDIT: Nevermind, I totally missed it! See the lines here: Lines 315 to 325 in 051edd6
I guess that means reference signal calculation is going to depend on MNE's interpolation code being identical to EEGLAB's (which, confusingly, uses a different function than interpolation does during RANSAC). Still, that doesn't change that initial re-referencing and final interpolation can be done separately (which is what I originally meant). |
@sappelhoff @yjmantilla So I started taking a crack at this thinking it might simplify #95 a bit, but I've run into a couple snags I'd like to get your thoughts on. For one, a lot of the pipeline attributes are named in a way that assumes interpolation will always happen: pyprep/pyprep/prep_pipeline.py Lines 226 to 237 in 1c9589c
With final interpolation of bads being made optional, should all the Relatedly, I'm trying to decide whether it makes sense for |
I think that the upcoming release should be the one where we go wild in terms of API changes, with the hopes of having a much more stable API after 0.4.0 ... and eventually a fixed API (with deprecation cycles if necessary) starting at 1.0.0 --> so I agree with renaming attributes that don't make sense semantically (any more). Re: Your second question I cannot think of a use case right now 🤔 if a user really wanted to do that wouldn't they be able to pass |
Regarding the renaming to "reference"... Yeap, I dont see any more proper name, after all the reference does indeed split those two "before" and "after" objects regardless the interpolation being done or not.
I dont know if this is too hacky or confusing but what about using a decorator to redirect EEG_X_interpolation to EEG_X_reference (the later would be the one in the code). It would be similar to what was done here: pyprep/pyprep/prep_pipeline.py Lines 168 to 175 in 1c9589c
That way we wouldn't break the previous notation... Downside is that it may be confusing (but could be just clarified in the docstring).
I agree with @sappelhoff , in that case he would pass false and then interpolate with his preferred way. Now, I understand dummy.interpolate() in the following code is just mne stuff, so maybe we can expose the _eeglab_interpolate_bads ; I mean, this function is what diverges from MNE interpolation so in a way is the only thing I see we could provide as a "different interpolation" from what is already on MNE. Nevertheless , this would be a too technical use-case and an hypothetical user that needs it wouldn't have trouble with just using the private function. So, I think overall is not necessary to make interpolated_bads a method. Lines 151 to 154 in 1c9589c
|
@sappelhoff @yjmantilla Okay, I've got a rough draft for reorganized/renamed attributes for the # NOTE: 'original' refers to before initial average reference, not first
# pass afterwards. Not necessarily comparable to later values?
self.noisy_info = {
"original": None, "post-reference": None, "post-interpolation": None
}
self.bad_channels = {
"original": None, "post-reference": None, "post-interpolation": None
}
self.reference_signals = {"robust": None, "post-interpolation": None}
self.robust_reference_signal = None
self._interpolated_reference_signal = None
@property
def current_noisy_info(self):
post_ref = self.noisy_info["post-reference"]
post_interp = self.noisy_info["post-interpolation"]
return post_interp if post_interp else post_ref
@property
def remaining_bad_channels(self):
post_ref = self.bad_channels["post-reference"]
post_interp = self.bad_channels["post-interpolation"]
return post_interp if post_interp else post_ref
@property
def current_reference_signal(self):
post_ref = self.robust_reference_signal
post_interp = self._interpolated_reference_signal
return post_interp if post_interp else post_ref Basically I've grouped all the bad channel lists & noisy channels dicts into dictionaries with keys for each major timepoint, with the values being
I was thinking about that: it's definitely possible (including with the above reorganization), I'm just not sure whether the project has enough users at this stage to warrant properly deprecating methods/attributes instead of just dropping them. Do we have any PyPi download stats that would give us an idea?
In the |
https://pypistats.org/packages/pyprep But any person who reasonably uses pyprep currently is not installing it via pip, I think - but rather directly from GitHub.
IMHO we should not properly deprecate ... let's do what we think is best without looking back - for now. We have a prominent warning in the README that this will be the case (alpha software). Regarding the proposed getter methods: I think as a user I'd like to have access to all data, and not "being served a part of the data based on the state my object is in", while the other data is present, but hidden from me. Or at least, I'd like to have control over what data I can get (with reasonable defaults so that I don't have to give it much thought). |
I was thinking that the If you think that separate attributes for the original/post-reference/post-interpolation info makes for a better API than the dict approach though I'll change it accordingly 🙂 |
ahh I see, makes sense - thanks for the explanation :) |
While chugging along on the PyPREP/MatPREP comparison unit tests, I had a thought about the PyPREP API: since there are more-or-less three distinct stages to the PREP pipeline (adaptive line noise removal, robust re-referencing, interpolation of remaining bad channels), what if the
PrepPipeline
object had separate methods for each of these? Thefit()
method would still exist, but just run all of these in sequence.Separate methods would be handy for situations like my own, where I basically copy/pasted the guts of
fit()
into my own script in order to do progress messages and plotting between CleanLine and robust referencing. Also for my script the project lead wanted the option to ignore bad channels instead of interpolating them, so I had to write extra code to revert interpolation and the names of bad channels instead of having the option to not interpolate in the first place.Here's my basic proposal for
PrepPipeline
methods:fit
: Performs the whole PREP pipeline, like ususalremove_line_noise
: Performs adaptive line noise removal.robust_reference
: Performs robust re-referencing without interpolation. Will default to printing a warning ifremove_line_noise
isn't done first.interpolate_bads
: Interpolates any remaining bad channels and adjusts the average reference accordingly.Let me know what you think!
EDIT: With the addition of #96, a PR addressing this issue should also add MATLAB comparison tests for final bad channel interpolation.
The text was updated successfully, but these errors were encountered: