Creating a new fitting toolkit and examples of how to use it with Screen Profiles. #143
base: main
Conversation
…rojection_fit and roi), also updated image_processing.py
…_measurements.py
…plan is to build out screen_beam_profile and make test notebook a unit test.
…dded to the test and example for screen_beam_profile_measurement.py
…nce there is a conflict with python 3.8
Thanks for all your hard work on this! Some initial comments after looking at a few files:
Thanks again, great work!
Same comments as above. I'll look more carefully at the measurement and test files after we chat.
@abstractmethod
def find_init_values(self, data: np.ndarray) -> list:
    pass
Change all 'pass'es to NotImplementedError.
Even for abstract methods?
I think we want to make sure no one calls this without something that overrides the behavior? So there should be a guard for users who call this expecting it to do something. I haven't used these a lot though, so maybe it's fine; we can discuss at the meeting.
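For reference, a minimal sketch of the trade-off (class and method bodies here are hypothetical, not the PR's actual code): `@abstractmethod` already blocks instantiating the base class, while `raise NotImplementedError` additionally guards the case where a subclass calls `super().find_init_values()` and would otherwise silently get `None` back.

```python
from abc import ABC, abstractmethod

import numpy as np


class MethodBase(ABC):
    @abstractmethod
    def find_init_values(self, data: np.ndarray) -> list:
        # raising here (instead of `pass`) protects a subclass that calls
        # super().find_init_values() from silently receiving None
        raise NotImplementedError


class GaussianModel(MethodBase):
    def find_init_values(self, data: np.ndarray) -> list:
        data = np.asarray(data)
        # illustrative init values: peak height and normalized peak location
        return [float(data.max()), float(np.argmax(data)) / len(data)]
```

Either way, `MethodBase()` itself raises `TypeError` because of `@abstractmethod`; the `NotImplementedError` only changes what happens on an explicit `super()` call.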
def plot_init_values(self):
    """Plots init values as a function of forward and visually compares it to the initial distribution"""
    fig, axs = plt.subplots(1, 1, figsize=(10, 5))
I'm not a fan of hard-coding figure subplots and sizes. Can we either rework this to be more general, or move the plotting out into a function somewhere else?
def plot_priors(self):
    """Plots prior distributions for each param in param_names"""
    num_plots = len(self.priors)
    fig, axs = plt.subplots(num_plots, 1, figsize=(10, 10))
Same here with the hard-coded fig size etc. I don't know if this belongs in a base model?
pass
def log_likelihood(self, x, y, params):
    return -np.sum((y - self.forward(x, params)) ** 2)
This uses a function that is not implemented? Can we get this from scipy instead of calculating it ourselves? Something like: https://stackoverflow.com/questions/59869759/log-likelihood-function-generated-by-scipy-stats-rv-continuous-fit ?
I will look into this.
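One option, sketched here under the assumption of a Gaussian noise model (the `scale=1/sqrt(2)` choice is mine, picked so the exponent matches the hand-rolled sum of squares): let `scipy.stats.norm.logpdf` compute the log-likelihood. The two versions agree up to an additive constant, so minimizing either gives the same best-fit parameters.

```python
import numpy as np
from scipy.stats import norm


def log_likelihood_manual(y, y_model):
    # the hand-rolled version from this PR: negative sum of squared residuals
    return -np.sum((y - y_model) ** 2)


def log_likelihood_scipy(y, y_model):
    # Gaussian noise model; scale = 1/sqrt(2) makes the exponent exactly -(y - mu)^2
    return np.sum(norm.logpdf(y, loc=y_model, scale=1 / np.sqrt(2)))


y = np.array([1.0, 2.0, 3.0])
y_model = np.array([1.1, 1.9, 3.2])

# the two agree up to an additive constant (the Gaussian normalization),
# which does not depend on the model parameters
diff = log_likelihood_scipy(y, y_model) - log_likelihood_manual(y, y_model)
```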
loss_temp = -self.log_likelihood(x, y, params)
if use_priors:
    loss_temp = loss_temp - self.log_prior(params)
return loss_temp
Are these functions for ML-related calcs? Loss/log likelihood/priors, etc.?
If so, maybe they belong in an ML-specific file.
Yes, these functions are essentially what we want to minimize; the minimization gives us the best-fit parameters. In the case where the log prior is used, our minimization problem turns into a Bayesian regression problem.
Technically, the use of priors/Bayesian inference isn't really ML and doesn't add much overhead; I'm not sure we should separate it into different classes. We should discuss.
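The point above can be made concrete with a toy sketch (function names, the standard-normal prior, and the linear forward model are all illustrative, not the PR's code): without the prior term the loss is plain least squares / maximum likelihood; subtracting the log prior turns it into a MAP (Bayesian regression) objective.

```python
import numpy as np


def neg_log_likelihood(params, x, y, forward):
    # least-squares misfit == negative Gaussian log-likelihood (up to a constant)
    return np.sum((y - forward(x, params)) ** 2)


def log_prior(params):
    # hypothetical standard-normal prior on each parameter
    return -0.5 * np.sum(np.asarray(params) ** 2)


def loss(params, x, y, forward, use_priors=False):
    # use_priors=False: maximum likelihood; True: MAP / Bayesian regression
    val = neg_log_likelihood(params, x, y, forward)
    if use_priors:
        val -= log_prior(params)
    return val


# toy linear forward model, params = [slope, offset]
forward = lambda x, p: p[0] * x + p[1]
x = np.linspace(0, 1, 5)
y = 2.0 * x + 0.5
```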
The matlab2py file still exists, so this file should still exist too. It may need to be updated or renamed, but the functions being tested here are still in the repo? Unless I'm missing something?
class ROI(BaseModel, ABC):
    roi_type: str
    center: List[PositiveFloat]
I see from the indexing below that center is two floats (which makes sense, x/y); maybe show that shape here or add a comment. Should we force this to always be a list of two floats?
reminder: ask Ryan
Is this the ROI for an image? If so, should these be integers so you get the center x/y pixel? Or if it's in millimeters, I would add the units to the variable name.
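One way to encode the "exactly two positive values" constraint, sketched here with a plain dataclass rather than the repo's pydantic models (the class and field names mirror the snippet above, but the validation style is illustrative):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ROI:
    # hypothetical stand-in for the pydantic model in this PR; the real class
    # uses pydantic's BaseModel and PositiveFloat instead of manual checks
    roi_type: str
    center: List[float]  # exactly two positive values: [x, y]

    def __post_init__(self):
        if len(self.center) != 2:
            raise ValueError("center must be exactly two values: [x, y]")
        if any(c <= 0 for c in self.center):
            raise ValueError("center values must be positive")
```

With pydantic, a constrained-length list field would express the same shape requirement declaratively in the model itself.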
note to self: look at this file again after next commits.
note to self: look at this file again after next commits.
Will there be a test for these?
…oi and image processor
…sing and test_roi
…ile_data, working through request to change self.init_values in methodBase from a list a dictionary
self.param_names: list = None
self.param_bounds: np.ndarray = None
self.init_values: dict = None
self.init_values_list: list = None
Recall that projection_fit.py needs a list in order to use fit_model. If you change the type of self.init_values to a dict and unpack it into an ordered list inside fit_model, then ProjectionFit becomes model-dependent instead of a template (knowledge of the Gaussian model params should not be in projection_fit). I recommend instead making both a dictionary and a list of param values inside MethodBase.
see my other comment on why init_values_list is not needed
if not isinstance(profile_data, np.ndarray):
    raise TypeError("Input must be ndarray")
self._profile_data = profile_data
self.find_init_values(self._profile_data)
Changed the function call here from self.find_priors to self.find_init_values. Please see the following comment for changes in self.find_init_values.
mean = np.argmax(gaussian_filter(data, sigma=5)) / (len(data))
sigma = 0.1
self.init_values_list = np.array([amplitude, mean, sigma, offset])
self.init_values = {"amplitude": amplitude, "mean": mean, "sigma": sigma, "offset": offset}
It was requested that users have the option to grab init values without needing to know which parameter belonged to which position in the ordered np.array, so self.init_values was changed from a list to a dictionary. However, scipy.optimize.minimize still needs an np.array, so that is saved in a separate variable.
We shouldn't save it in a separate variable; we should just do the mapping inside the scipy.optimize.minimize call. It should be a one-liner.
I apologize, I did not realize Python dictionaries are ordered.
While technically dicts are ordered, we don't have to rely on that; we can use
np.array([self.init_values[name] for name in self.param_names])
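That one-liner in context looks like the following (the concrete parameter values here are made up for illustration): the ordering comes from param_names alone, so no second list attribute needs to be stored on the model.

```python
import numpy as np

param_names = ["amplitude", "mean", "sigma", "offset"]
init_values = {"amplitude": 1.0, "mean": 0.5, "sigma": 0.1, "offset": 0.0}

# build the ordered array scipy.optimize.minimize needs without keeping a
# separate init_values_list attribute in sync with the dict
x0 = np.array([init_values[name] for name in param_names])
```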
self.init_values_list = np.array([amplitude, mean, sigma, offset])
self.init_values = {"amplitude": amplitude, "mean": mean, "sigma": sigma, "offset": offset}
# if use_priors = True in projection_fit then find priors? use case where projection fit is instantiated with use_priors = False then flag is changed but you have no priors
self.find_priors()
I highly recommend calling self.find_priors when init_values are found.
Let's consider the workflow for projection_fit.fit_projection(some_projection_data):
1.) This method passes normalized projection data to projection.setup_model.
2.) projection.setup_model calls the setter method for gaussian_model.profile_data.
3.) The setter method for gaussian_model.profile_data triggers self.find_init_values.
4.) Those init values are used in projection_fit.fit_model().
5.) Now consider the case when projection_fit.use_priors = True. Everything works smoothly, right? Well, no.
6.) projection_fit.use_priors = True can be set upon initialization or after. Also, the advantage of having gaussian_model.profile_data be a @property is that it can be updated easily, meaning we can call projection_fit.fit_projection( with_any_1d_array).
7.) So use_priors = True but the priors have not changed, which is not good. Calling gauss_model.find_priors inside projection_fit.fit_model() is also bad.
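The workflow above can be sketched as a property setter that refreshes both the init values and the priors whenever new data arrives (method bodies here are illustrative placeholders, not the PR's actual fitting code):

```python
import numpy as np


class GaussianModel:
    """Sketch: setting profile_data refreshes init values AND priors together."""

    def __init__(self):
        self._profile_data = None
        self.init_values = None
        self.priors = None

    @property
    def profile_data(self):
        return self._profile_data

    @profile_data.setter
    def profile_data(self, profile_data):
        if not isinstance(profile_data, np.ndarray):
            raise TypeError("Input must be ndarray")
        self._profile_data = profile_data
        # priors stay in sync with the data even if use_priors is flipped later
        self.find_init_values(self._profile_data)
        self.find_priors()

    def find_init_values(self, data):
        # illustrative placeholder: just the peak height
        self.init_values = {"amplitude": float(data.max())}

    def find_priors(self):
        # illustrative (center, width) prior derived from the fresh init values
        self.priors = {"amplitude": (self.init_values["amplitude"], 1.0)}


model = GaussianModel()
model.profile_data = np.array([0.0, 2.0, 1.0])
```

With this arrangement, projection_fit never has to know when (or whether) to call find_priors; it is guaranteed fresh by the setter.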
# making an ordered list here is making this fit_model model dependent since it needs knowledge of the params list ordering
res = scipy.optimize.minimize(
    self.model.loss,
I think I see a potential problem with making forward and log_prior receive a dictionary. model.loss calls forward and log_prior, and the problem is that scipy.optimize.minimize would pass an array of length == len(model.init_values) to model.loss on each iteration.
I can't think of how to resolve this.
Change the calls in model.loss to model._forward and model._log_prior, which take np arrays instead. The public versions can take dicts as arguments.
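A sketch of that split (class layout and signatures are hypothetical): the private array-based methods serve scipy.optimize.minimize, while the public dict-based wrapper serves users who don't want to track parameter ordering.

```python
import numpy as np


class GaussianModel:
    # hypothetical sketch: private methods take the ordered array that
    # scipy.optimize.minimize passes around; public ones take a dict
    param_names = ["amplitude", "mean", "sigma", "offset"]

    @staticmethod
    def _forward(x, params):
        # private version: params is the ordered array from the optimizer
        amplitude, mean, sigma, offset = params
        return amplitude * np.exp(-((x - mean) ** 2) / (2 * sigma**2)) + offset

    def forward(self, x, params):
        # public version: params is a dict keyed by param_names
        ordered = np.array([params[name] for name in self.param_names])
        return self._forward(x, ordered)

    def loss(self, params, x, y):
        # called by scipy.optimize.minimize with a plain array, so it uses
        # the private array-based _forward, never the dict-based public API
        return np.sum((y - self._forward(x, params)) ** 2)
```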
… to working code on main
… will be used in merge
…d maybe make norm a private variable of gaussModel.py and stop using _forward as a staticmethod
…iob pydantic fields.
…on to fit without displaying normalized values, changed fit_model method in scipy.optimize.minimize from default (BFGS) to Powell
@phys-cgarnier @eloiseyang should we close this PR now that it's been reworked/merged elsewhere?
Let's leave it for now. I might need to use this branch to pull out the last of the changes we haven't merged yet. Once everything is merged we can close.
Updating fitting tools, image processing, and screen beam profile measurement classes, draft PR