-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Creating a new fitting toolkit and examples of how to use it with Screen Profiles. #143
Draft
phys-cgarnier
wants to merge
41
commits into
slaclab:main
Choose a base branch
from
phys-cgarnier:dev
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from all commits
Commits
Show all changes
41 commits
Select commit
Hold shift + click to select a range
79ac28f
init commit of screen_beam_profile.py and supporting python files (p…
phys-cgarnier 49088e7
same
phys-cgarnier eacda2a
updated measurement.py, renamed test_screen_beam_profile, added test…
phys-cgarnier 204e14d
initial commit of screen_beam_profile measurement and test notebook, …
phys-cgarnier 4545cc3
gave docstring to measurement.py simplified test_measurement.ipynb
phys-cgarnier 8a6adcd
updating doc strings, and very minor details of python files
phys-cgarnier 732a972
updated return structure of screen_beam_profile_measurement.py, and a…
phys-cgarnier a402fd8
Merge branch 'dev' of https://github.com/phys-cgarnier/lcls-tools int…
phys-cgarnier a9ed3d6
removed unused dependencies in an attemp to resolve some flake8 errors
phys-cgarnier 26fc338
more issues with flake8
phys-cgarnier 4b0523b
more flake8
phys-cgarnier b14877e
black was unresolving some flake8 errors.... fixed that
phys-cgarnier f2f0cb3
changed return type type hinting for projection_fit.fit_projection si…
phys-cgarnier 50d6263
changed return type type hinting for projection_fit.fit_projection si…
phys-cgarnier c4b347a
cleaning up code for PR
phys-cgarnier f955c01
added docstrings in screenbeamprofile measurement started tests for r…
phys-cgarnier 44223a3
updating unittests for test_image_processing and test_roi, still not …
phys-cgarnier 363ed99
updated files to have TODO comments, also completed test_image_proces…
phys-cgarnier 713c324
updating code to no longer use distribution_data but instead use prof…
phys-cgarnier a576c0a
implemented init_values dictionary correctly.
phys-cgarnier 89dba3d
code is in broken state made a branch to save changes but revert back…
phys-cgarnier f343813
cleaning example notebook
phys-cgarnier 9ee6c08
committing working version (plots included) resolved bug, next commit…
phys-cgarnier 36a46bb
setup fitting_tool.plot_fit()
phys-cgarnier 76af0d6
merged feature requests discussed in PR notes
phys-cgarnier 9ad190e
moved example_measurement to examples folder
phys-cgarnier 92f4011
changed model._forward to use scipy.stats.norm, fit looks okay, shoul…
phys-cgarnier 631541d
updated test_image_processing to include clipping, removed visualizat…
phys-cgarnier e5cb4bf
ran black
phys-cgarnier 69349b8
removed whitespace and unused dependencies
phys-cgarnier 5be4f7c
reorg fit methods, changed base model init to require data when class…
nneveu 84abf86
fixing formatting
nneveu 8b8b645
one more rename of fitting files before PR
nneveu 611eda1
gaussian fit updates + test
nneveu 29eef37
made changes to GaussianModel find init values.
phys-cgarnier 0523fe2
skipped tests in test_methods.py will fix
phys-cgarnier 382e502
fixed flake8 syntax errors
phys-cgarnier 4ad8f45
fixed flake8 syntax errors
phys-cgarnier 964bcd8
fixed flake8 syntax errors
phys-cgarnier f686f49
committing changes made during previous meeting
phys-cgarnier 8c85a56
added kwargs to projection.plot_fit so that user can compare projecti…
phys-cgarnier File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
from abc import ABC, abstractmethod | ||
import numpy as np | ||
from matplotlib import pyplot as plt | ||
|
||
|
||
class MethodBase(ABC): | ||
""" | ||
Base abstract class for all fit methods, which serves as the bare minimum skeleton code needed. | ||
Should be used only as a parent class to all method models. | ||
--------------------------- | ||
Arguments: | ||
param_names: list (list all of param names that the model will contain) | ||
param_guesses: np.ndarray (array that contains a guess as to what | ||
each param value is organized in the same order as param_names) | ||
param_bounds: np.ndarray (array that contains the lower | ||
and upper bound on for acceptable values of each parameter) | ||
""" | ||
|
||
param_names: list = None | ||
param_bounds: np.ndarray = None | ||
|
||
def __init__(self): | ||
|
||
self.init_values: dict = None | ||
self.fitted_params_dict: dict = None | ||
|
||
@abstractmethod | ||
def find_init_values(self) -> list: | ||
... | ||
|
||
@abstractmethod | ||
def find_priors(self, data: np.ndarray) -> dict: | ||
... | ||
|
||
# TODO: move to plotting file | ||
def plot_init_values(self): | ||
init_values = np.array(list(self.init_values.values())) | ||
"""Plots init values as a function of forward and visually compares it to the initial distribution""" | ||
fig, axs = plt.subplots(1, 1) | ||
x = np.linspace(0, 1, len(self.profile_data)) | ||
y_fit = self._forward(x, init_values) | ||
axs.plot(x, self.profile_data, label="Projection Data") | ||
axs.plot(x, y_fit, label="Initial Guess Fit Data") | ||
axs.set_xlabel("x") | ||
axs.set_ylabel("Forward(x)") | ||
axs.set_title("Initial Fit Guess") | ||
return fig, axs | ||
|
||
# TODO: move to plotting file | ||
def plot_priors(self): | ||
"""Plots prior distributions for each param in param_names""" | ||
num_plots = len(self.priors) | ||
fig, axs = plt.subplots(num_plots, 1) | ||
for i, (param, prior) in enumerate(self.priors.items()): | ||
x = np.linspace(0, self.param_bounds[i][-1], len(self.profile_data)) | ||
axs[i].plot(x, prior.pdf(x)) | ||
axs[i].axvline( | ||
self.param_bounds[i, 0], | ||
ls="--", | ||
c="k", | ||
) | ||
axs[i].axvline(self.param_bounds[i, 1], ls="--", c="k", label="bounds") | ||
axs[i].set_title(param + " prior") | ||
axs[i].set_ylabel("Density") | ||
axs[i].set_xlabel(param) | ||
fig.tight_layout() | ||
return fig, axs | ||
|
||
def forward(self, x: np.ndarray, params: dict) -> np.ndarray: | ||
# TODO:test new usage | ||
|
||
params_list = np.array([params[name] for name in self.param_names]) | ||
return self._forward(x, params_list) | ||
|
||
@staticmethod | ||
@abstractmethod | ||
def _forward(x: np.ndarray, params: np.ndarray) -> np.ndarray: | ||
... | ||
|
||
def log_prior(self, params: dict): | ||
# TODO:test new usage | ||
print(params) | ||
params_list = np.array([params[name] for name in self.param_names]) | ||
return self._log_prior(params_list) | ||
|
||
@abstractmethod | ||
def _log_prior(self, params: np.ndarray): | ||
... | ||
|
||
def log_likelihood(self, x: np.ndarray, y: np.ndarray, params: dict): | ||
# TODO:test new usage | ||
params_list = np.array([params[name] for name in self.param_names]) | ||
return self._log_likelihood(x, y, params_list) | ||
|
||
def _log_likelihood(self, x: np.ndarray, y: np.ndarray, params: np.ndarray): | ||
# reducing error between data and fit | ||
return -np.sum((y - self._forward(x, params)) ** 2) | ||
|
||
def loss(self, params, x, y, use_priors=False): | ||
# TODO:implement using private functions _log_likelihood and _log_prior | ||
# ML group way of iterating over priors/ reducing difference in fit and data | ||
loss_temp = -self._log_likelihood(x, y, params) | ||
if use_priors: | ||
loss_temp = loss_temp - self._log_prior(params) | ||
return loss_temp | ||
|
||
@property | ||
def priors(self): | ||
""" | ||
Initial Priors store in a dictionary where the keys are the | ||
complete set of parameters of the Model | ||
""" | ||
return self._priors | ||
|
||
@priors.setter | ||
def priors(self, priors): | ||
if not isinstance(priors, dict): | ||
raise TypeError("Input must be a dictionary") | ||
self._priors = priors | ||
|
||
@property | ||
def profile_data(self): | ||
"""1D array typically projection data""" | ||
return self._profile_data | ||
|
||
@profile_data.setter | ||
def profile_data(self, profile_data): | ||
if not isinstance(profile_data, np.ndarray): | ||
raise TypeError("Input must be ndarray") | ||
self._profile_data = profile_data | ||
# should change these to private methods only called when profile data is set | ||
self.find_init_values() | ||
self.find_priors() | ||
self.fitted_params_dict = {} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
import numpy as np | ||
from scipy.stats import norm, gamma | ||
from lcls_tools.common.data_analysis.fit.method_base import MethodBase | ||
|
||
|
||
class GaussianModel(MethodBase): | ||
""" | ||
GaussianModel Class that finds initial parameter values for gaussian distribution | ||
and builds probability density functions for the likelyhood a parameter | ||
to be that value based on those initial parameter values. | ||
Passing this class the variable profile_data automatically updates | ||
the initial values and and probability density functions to match that data. | ||
""" | ||
|
||
param_names: list = ["mean", "sigma", "amplitude", "offset"] | ||
param_bounds: np.ndarray = np.array( | ||
[ | ||
[0.01, 1.0], | ||
[0.01, 5.0], | ||
[0.01, 1.0], | ||
[0.01, 1.0], | ||
] | ||
) | ||
|
||
def find_init_values(self) -> dict: | ||
"""Fit data without optimization, return values.""" | ||
|
||
data = self._profile_data | ||
x = np.linspace(0, 1, len(data)) | ||
# init_fit = norm.pdf(data) | ||
offset = data.min() | ||
amplitude = data.max() - offset | ||
|
||
weighted_mean = np.average(x, weights=data) | ||
weighted_sigma = np.sqrt(np.cov(x, aweights=data)) | ||
|
||
self.init_values = { | ||
self.param_names[0]: weighted_mean, # data.mean() | ||
self.param_names[1]: weighted_sigma, # data.std() | ||
self.param_names[2]: amplitude, | ||
self.param_names[3]: offset, | ||
} | ||
return self.init_values | ||
|
||
def find_priors(self) -> dict: | ||
"""Do initial guesses based on data and make distribution from that guess.""" | ||
# Creating a gamma distribution around the inital amplitude. | ||
# TODO: add to comments on why gamma vs. normal dist used for priors. | ||
amplitude_mean = self.init_values["amplitude"] | ||
amplitude_var = 0.05 | ||
amplitude_alpha = (amplitude_mean**2) / amplitude_var | ||
amplitude_beta = amplitude_mean / amplitude_var | ||
amplitude_prior = gamma(amplitude_alpha, loc=0, scale=1 / amplitude_beta) | ||
|
||
# Creating a normal distribution of points around the inital mean. | ||
mean_prior = norm(self.init_values["mean"], 0.1) | ||
# TODO: remove hard coded numbers? | ||
sigma_alpha = 2.5 | ||
sigma_beta = 5.0 | ||
sigma_prior = gamma(sigma_alpha, loc=0, scale=1 / sigma_beta) | ||
|
||
# Creating a normal distribution of points around initial offset. | ||
offset_prior = norm(self.init_values["offset"], 0.5) | ||
self.priors = { | ||
self.param_names[0]: mean_prior, | ||
self.param_names[1]: sigma_prior, | ||
self.param_names[2]: amplitude_prior, | ||
self.param_names[3]: offset_prior, | ||
} | ||
return self.priors | ||
|
||
def _forward(self, x: np.array, params: np.array): | ||
# Load distribution parameters | ||
# needs to be array for scipy.minimize | ||
mean = params[0] | ||
sigma = params[1] | ||
amplitude = params[2] | ||
offset = params[3] | ||
return ( | ||
np.sqrt(2 * np.pi) * amplitude * norm.pdf((x - mean) / sigma) + offset | ||
) # norm.pdf(x, loc=mean, scale=sigma) #( | ||
|
||
# TODO: remove when above is confirmed the same/below not needed. | ||
# @staticmethod | ||
# def _forward(x: np.ndarray, params_list: np.ndarray): | ||
# amplitude = params_list[0] | ||
# mean = params_list[1] | ||
# sigma = params_list[2] | ||
# offset = params_list[3] | ||
# normal = norm() | ||
# return ((np.sqrt(2 * np.pi)) * amplitude) * normal.pdf( | ||
# (x - mean) / sigma | ||
# ) + offset | ||
|
||
def _log_prior(self, params: np.ndarray) -> float: | ||
return np.sum( | ||
[ | ||
prior.logpdf(params[i]) | ||
for i, (key, prior) in enumerate(self.priors.items()) | ||
] | ||
) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,146 @@ | ||
import numpy as np | ||
import scipy.optimize | ||
import scipy.signal | ||
from matplotlib import pyplot as plt | ||
from pydantic import BaseModel, ConfigDict | ||
from lcls_tools.common.data_analysis.fit.method_base import MethodBase | ||
|
||
|
||
class ProjectionFit(BaseModel): | ||
""" | ||
1d fitting class that allows users to choose the model with which the fit | ||
is performed, and if prior assumptions (bayesian regression) about | ||
the data should be used when performing the fit. | ||
Additionally there is an option to visualize the fitted data and priors. | ||
-To perform a 1d fit, call fit_projection(projection_data={*data_to_fit*}) | ||
------------------------ | ||
Arguments: | ||
model: MethodBase (this argument is a child class object of method base | ||
e.g GaussianModel & DoubleGaussianModel) | ||
visualize_priors: bool (shows plots of the priors and init guess | ||
distribution before fit) | ||
use_priors: bool (incorporates prior distribution information into fit) | ||
visualize_fit: bool (visualize the parameters as a function of the | ||
forward function | ||
from our model compared to distribution data) | ||
""" | ||
|
||
# TODO: come up with better name | ||
model_config = ConfigDict(arbitrary_types_allowed=True) | ||
model: MethodBase | ||
use_priors: bool = False | ||
|
||
def normalize(self, data: np.ndarray) -> np.ndarray: | ||
""" | ||
Normalize a 1d array by scaling and shifting data | ||
s.t. data is between 0 and 1 | ||
""" | ||
data_copy = data.copy() | ||
normalized_data = (data_copy - np.min(data)) / (np.max(data) - np.min(data)) | ||
return normalized_data | ||
|
||
def unnormalize_model_params( | ||
self, params_dict: dict, projection_data: np.ndarray | ||
) -> np.ndarray: | ||
""" | ||
Takes fitted and normalized params and returns them | ||
to unnormalized values i.e the true fitted values of the distribution | ||
""" | ||
|
||
projection_data_range = np.max(projection_data) - np.min(projection_data) | ||
length = len(projection_data) | ||
for key, val in params_dict.items(): | ||
if "sigma" in key or "mean" in key: | ||
true_fitted_val = val * length | ||
elif "offset" in key: | ||
true_fitted_val = val * projection_data_range + np.min(projection_data) | ||
else: | ||
true_fitted_val = val * projection_data_range | ||
temp = {key: true_fitted_val} | ||
params_dict.update(temp) | ||
return params_dict | ||
|
||
def model_setup(self, projection_data=np.ndarray) -> None: | ||
"""sets up the model and init_values/priors""" | ||
self.model.profile_data = projection_data | ||
|
||
def fit_model(self) -> scipy.optimize._optimize.OptimizeResult: | ||
""" | ||
Fits model params to distribution data and plots the fitted params | ||
as a function of the model. | ||
Returns optimizeResult object | ||
""" | ||
x = np.linspace(0, 1, len(self.model.profile_data)) | ||
y = self.model.profile_data | ||
|
||
init_values = np.array( | ||
[self.model.init_values[name] for name in self.model.param_names] | ||
) | ||
|
||
res = scipy.optimize.minimize( | ||
self.model.loss, | ||
init_values, | ||
args=(x, y, self.use_priors), | ||
bounds=self.model.param_bounds, | ||
method="Powell" | ||
) | ||
return res # TODO:optional argument to return fig,ax | ||
|
||
def fit_projection(self, projection_data: np.ndarray) -> dict: | ||
""" | ||
type is dict[str, float] | ||
Wrapper function that does all necessary steps to fit 1d array. | ||
Returns a dictionary where the keys are the model params and their | ||
values are the params fitted to the data | ||
""" | ||
assert len(projection_data.shape) == 1 | ||
fitted_params_dict = {} | ||
normalized_data = self.normalize(projection_data) | ||
self.model_setup(projection_data=normalized_data) | ||
res = self.fit_model() | ||
for i, param in enumerate(self.model.param_names): | ||
fitted_params_dict[param] = (res.x)[i] | ||
self.model.fitted_params_dict = fitted_params_dict.copy() | ||
params_dict = self.unnormalize_model_params(fitted_params_dict, projection_data) | ||
return params_dict | ||
|
||
def plot_fit(self, **kwargs): | ||
""" | ||
Plots the normalized fitted forward function against the normed data. | ||
To plot unnormalized projection data, pass the unnormalized forward | ||
function params and data as kwargs. | ||
""" | ||
fitted_params_dict = {} | ||
projection_data = [] | ||
if kwargs: | ||
for key, val in kwargs.items(): | ||
if key in self.model.fitted_params_dict: | ||
fitted_params_dict[key] = val | ||
else: | ||
projection_data = val | ||
x = np.linspace(0, len(projection_data), len(projection_data)) | ||
else: | ||
fitted_params_dict.update(self.model.fitted_params_dict) | ||
projection_data = self.model.profile_data | ||
x = np.linspace(0, 1, len(self.model.profile_data)) | ||
|
||
y = projection_data | ||
fig, ax = plt.subplots() | ||
y_fit = self.model.forward(x, fitted_params_dict) | ||
ax.plot(x, y, label="data") | ||
ax.plot(x, y_fit, label="fit") | ||
|
||
textstr = "\n".join( | ||
[ | ||
r"$\mathrm{%s}=%.2f$" % (key, val) | ||
for key, val in fitted_params_dict.items() | ||
] | ||
) | ||
ax.text( | ||
0.05, | ||
0.95, | ||
textstr, | ||
transform=ax.transAxes, | ||
verticalalignment="top", | ||
) | ||
return fig, ax |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the extra () here? Was it added by black/flake8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a typo on my end, I don't remember editing this file, thank you.