-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fitting Base Class and Gaussian Model from PR 143 #172
Conversation
Added GaussianModel class using MethodBase Added ProjectionFit class Added tests for all the above including datasets Separated from 8c85a56 on phys-cgarnier/lcls-tools/dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fitting classes were written with ML in mind. Can we find the intersection between the ML and physics needs? Otherwise, we may have to separate out the ML fitting tools and the physics fitting tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should complete this before merging.
weighted_sigma = np.sqrt(np.cov(x, aweights=data)) | ||
|
||
self.init_values = { | ||
self.param_names[0]: weighted_mean, # data.mean() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The text keys are more clear than the numbered indices here. I don't imagine these param names are going to change all that often. Maybe a pydantic structure would be a better way of accomplishing what this does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. We can discuss what that would look like in our meeting.
# Creating a normal distribution of points around initial offset. | ||
offset_prior = norm(self.init_values["offset"], 0.5) | ||
self.priors = { | ||
self.param_names[0]: mean_prior, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Numbered indices are not clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe Nicole did not want hardcoded names though, so the compromise was we made a class attribute than contained the param names as a list that wouldn't get overwritten easily. I agree the using indices on that list might be confusing to a reviewer though, I am not sure what the best way to resolve both problems is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that we want to avoid hard coded EPICS names. Hard coded keys of a predefined dictionary should be okay. I think defining a struct-like class could be a good solution here.
def _forward(self, x: np.array, params: np.array): | ||
# Load distribution parameters | ||
# needs to be array for scipy.minimize | ||
mean = params[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Numbered indices are not clear. What is the params list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Params list is the list of parameters the fit function uses. At the method class level, they need to be reference arbitrarily Furthermore, scipy.optimize.minimize requires they be stored in an ordered list.
params_dict = self.unnormalize_model_params(fitted_params_dict, projection_data) | ||
return params_dict | ||
|
||
def plot_fit(self, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should separate out plotting functions.
self.assertAlmostEqual(fit.min(), 0.0, places=self.decimals) | ||
return | ||
|
||
# def super_gaussian(self, x, amp, mu, sig, P, offset): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should delete this test before we push.
and upper bound on for acceptable values of each parameter) | ||
""" | ||
|
||
param_names: list = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This design pattern is exactly the kind of thing pydantic is meant to simplify. Can we a simpler way to accomplish this?
…lving params, might have to remove priors still but its working
made changes as requested in the comment (minus removing _forward)
Created Pydantic Fitting Parameters Class
@phys-cgarnier added some changes to the way we set parameters in the model. Waiting on a single function fitting tool for physics applications before merging. |
… have a default field, also set method in projection_fit.py to have a default file. This will make using the tool much easier. All that has to be down is call gaussian fit and give it an image to use
Gaussian FIt
Thanks for pushing the beamsize code, @phys-cgarnier. It looks good to me, we just have some flake8 tests failing. If you could fix those I think we're ready to push this branch. We'll want to write some regression tests for the gaussian fit and beamsize scripts too, but I think that can wait for now. @nneveu can you take a look too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks much better, thanks for the the edits @phys-cgarnier and @eloiseyang! I'd like to see a few more comments for functions, remove the commented out test, and an issue written on circling back to the naming conventions, but otherwise ok to merge. Thanks so much!
} | ||
return result | ||
else: | ||
centroid = beamspot_chars['centroid'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we comment the two cases here for calc beamsize, i.e. you do the first if the intensity is low and the else if intensity is high enough? and how was the default threshold picked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will comment both on both cases. The default was selected by the ML group, I just used theirs so I am sure how the decided on it.
"Sx": np.NaN, | ||
"Sy": np.NaN, | ||
"bb_penalty": np.NaN, | ||
"total_intensity": beamspot_chars["total_intensity"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is repeated except the 10**? maybe combine with above?
|
||
|
||
print(pts) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to leave this print?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No thats a typo, lol.
n_stds = self.bounding_box_half_width | ||
|
||
pts = np.array( | ||
( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we be more descriptive than pts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will rename it
temp = pts - np.array((roi_radius, roi_radius)) | ||
distances = np.linalg.norm(temp, axis=1) | ||
# subtract radius to get penalty value | ||
bb_penalty = np.max(distances) - roi_radius |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bounding box penalty or something else? can you add a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I will add a comment with more description.
return priors | ||
|
||
def _forward(self, x: np.array, method_parameter_list: np.array): | ||
# Load distribution parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the function doing the fit right? can we rename it to something with 'fit' in the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a #TODO comment to revisit would be enough for now.
return normalized_data | ||
|
||
def unnormalize_model_params( | ||
self, method_params_dict: dict, projection_data: np.ndarray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does 'model' need to be in this function name?
"""sets up the model and init_values/priors""" | ||
self.model.profile_data = projection_data | ||
|
||
def fit_model(self) -> scipy.optimize._optimize.OptimizeResult: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to say fit_model or fit_method here?
def fit_model(self) -> scipy.optimize._optimize.OptimizeResult: | ||
""" | ||
Fits model params to distribution data and plots the fitted params | ||
as a function of the model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we still plotting? maybe update this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the empty file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know I believe there was a test there originally. I will remove it then write the test in another PR
Added MethodBase as an abstract class for all fit methods.
Added GaussianModel class using MethodBase.
Added ProjectionFit class.
Added tests for all the above including datasets.
Separated from pull request 143.