
Enthalpic and Entropic Contributions from BAR #540

Closed
wants to merge 3 commits

Conversation

jaclark5
Contributor

This PR adds a function to calculate the enthalpic and entropic contributions of the free energy from BAR according to a previously published derivation.

@mrshirts
Collaborator

mrshirts commented Sep 18, 2024

The derivation in this paper is not really correct, or rather, it has a chunk of extra stuff whose expectation is zero and adds noise to the calculation (I talked to them a while back about this).

Are there reasons the existing MBAR tools for entropy and enthalpy aren't useful/working? Or just not exposed correctly?

(i.e. MBAR with 2 states gives the same expectations as BAR, you use the 2 state MBAR entropy/enthalpy calculations).
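
For concreteness, here is a minimal sketch of that 2-state route. The inputs u_kn and N_k are hypothetical, and the method/key names are assumptions based on the pymbar >= 4 API:

from pymbar import MBAR

def two_state_entropy_enthalpy(u_kn, N_k):
    # u_kn: (2, N_total) reduced-potential matrix for the two ADJACENT states;
    # N_k: number of samples drawn from each of the two states (hypothetical inputs).
    # Method and key names assume pymbar >= 4 (compute_entropy_and_enthalpy);
    # older releases spell it computeEntropyAndEnthalpy.
    mbar = MBAR(u_kn, N_k)
    res = mbar.compute_entropy_and_enthalpy()
    # With only two states the free energy difference matches BAR, and
    # Delta_f = Delta_u - Delta_s in reduced units.
    return res["Delta_f"][0, 1], res["Delta_u"][0, 1], res["Delta_s"][0, 1]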

@jaclark5
Contributor Author

jaclark5 commented Sep 18, 2024

@mrshirts It's exposed fine, but I have a use case where the free energy estimate from MBAR isn't aligning with the BAR and TI estimates, which are in close agreement with each other. I then decided to stick with BAR, but I still want the enthalpy/entropy contributions.

That's a good point about the 2-state MBAR, I'll check that out.

@jaclark5
Contributor Author

jaclark5 commented Sep 18, 2024

This PR also contains some small changes that improve documentation and remove someone's .vscode settings. I can close this PR and open another one with those changes.

Alternatively, if the derivation is somewhat correct, can we adapt it to be completely correct?

@mrshirts
Collaborator

but I have a use case where the free energy estimate from MBAR isn't aligning with the BAR and TI estimates, which are in close agreement with each other. I then decided to stick with BAR, but I still want the enthalpy/entropy contributions.

BAR and MBAR not agreeing in value would generally indicate there is either something suspicious with the data or the solver isn't working correctly. I would be happy to take a look if that's the case!

Also, BAR and MBAR have DIFFERENT error estimates (i.e. if you do 2-state MBAR and BAR you should get the SAME number but DIFFERENT error estimates, especially for poor overlap), and that is something I have been working to address. Bootstrap errors will be the same (and fall somewhere in between the two analytical error estimates).

@mrshirts
Collaborator

mrshirts commented Sep 18, 2024

Alternatively, if the derivation is somewhat correct, can we adapt it to be completely correct?

IIRC (it's been quite a while since I worked through it), it has a term whose expectation is zero - if you leave that term out, it's the same as the MBAR 2-state entropy/enthalpy estimate.
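
(For reference, the 2-state MBAR entropy/enthalpy estimate mentioned here is the standard reduced-unit decomposition, with every expectation evaluated using the MBAR weights; this is background, not the paper's derivation:

\Delta f_{ij} = \Delta u_{ij} - \Delta s_{ij}, \qquad \Delta u_{ij} = \langle u_j \rangle_j - \langle u_i \rangle_i

so the entropic term follows by subtraction once the two enthalpic expectations are computed.)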

@mrshirts
Collaborator

This PR also contains some small changes that improve documentation and remove someone's .vscode settings. I can close this PR and open another one with those changes.

That would be great!

jaclark5 closed this Sep 18, 2024
@ijpulidos
Contributor

IIRC (it's been quite a while since I worked through it), it has a term whose expectation is zero - if you leave that term out, it's the same as the MBAR 2-state entropy/enthalpy estimate.

That does sound like a good check. Makes me wonder if this is something that we are testing in our CI or unit tests. If not, maybe that's also worth adding to the tests.

@jaclark5
Contributor Author

jaclark5 commented Sep 19, 2024

@mrshirts @ijpulidos I implemented the option to output the enthalpy/entropy for MBAR and BAR (via 2-state MBAR) in a PR to alchemlyb.

From the solvated benzene example in the tests, the free energies for MBAR and stitched 2-state MBAR are close:
MBAR: [0.0, 1.619069, 2.557990, 2.986302, 3.041156]
"BAR": [0.0, 1.609778, 2.547866, 2.984183, 3.044385]

The entropies and enthalpies are less satisfying.
MBAR U: [0.0, 1.241970, 1.695000, 1.706555, 1.388906]
"BAR" U: [0.0, 1.337356, 2.140400, 2.512384, 2.563935]
MBAR S: [0.0, -0.377099, -0.862990, -1.279746, -1.652250]
"BAR" S: [0.0, -0.272422, -0.407467, -0.471798, -0.480450]

@mrshirts
Collaborator

Entropy and enthalpy are always statistically far more noisy than the free energy. I wrote about 60% of a paper on this a few years ago that never got finished. Did you calculate the uncertainty for these quantities, preferably by bootstrap?

@jaclark5
Contributor Author

jaclark5 commented Sep 19, 2024

@mrshirts The previous numbers were cumulative across states. Here is the output for adjacent-state values and uncertainties (default, without bootstrapping); they're pretty comparable. Here "BAR" is a series of 2-state MBAR computations. (I haven't coded up the error propagation for BAR with bootstrapping; I would love it if someone has that.)
MBAR F: [-1.619069, -0.938921, -0.428311, -0.054854] [0.008802, 0.006642, 0.005362, 0.005133]
"BAR" F: [-1.609778, -0.938088, -0.436317, -0.060202] [0.009879, 0.008740, 0.007372, 0.006381]
MBAR U: [-1.241970, -0.453030, -0.011556, 0.317650] [0.01160, 0.008035, 0.005988, 0.005625]
"BAR" U: [-1.337356, -0.803044, -0.371984, -0.051550] [0.011062, 0.008801, 0.007036, 0.006106]
MBAR S: [0.377099, 0.485891, 0.416756, 0.372504] [0.012242, 0.009487, 0.007131, 0.005362]
"BAR" S: [0.272422, 0.135045, 0.064332, 0.008652] [0.008932, 0.006298, 0.004190, 0.002941]

Here are the values and uncertainties with n_bootstrapping = 100:
MBAR F: [-1.619069, -0.938921, -0.428311, -0.054854] [0.008993, 0.00693 , 0.005482, 0.004979]
"BAR" F: [-1.609778, -0.938088, -0.436317, -0.060202] [0.010079, 0.007803, 0.007863, 0.006922]
MBAR U: [-1.24197 , -0.45303 , -0.011556, 0.31765 ] [0.011248, 0.006933, 0.005399, 0.00513 ]
"BAR" U: [-1.337356, -0.803044, -0.371984, -0.051550] [0.011885, 0.009307, 0.008083, 0.006947]
MBAR S: [0.377099, 0.485891, 0.416756, 0.372504] [0.011365, 0.007181, 0.005037, 0.003795]
"BAR" S: [0.272422, 0.135045, 0.064332, 0.008652] [0.008418, 0.005638, 0.003662, 0.002183]

@jaclark5
Contributor Author

Entropy and enthalpy are always statistically far more noisy than the free energy. I wrote about 60% of a paper on this a few years ago that never got finished. Did you calculate the uncertainty for these quantities, preferably by bootstrap?

The paper referenced in this PR covers that ;)

@mrshirts
Collaborator

I haven't coded up the error propagation for BAR with bootstrapping

You should not need to do error propagation for BAR with bootstrapping. You calculate the values (given the ENTIRE PROCESS of adding up N-1 BAR estimates) with N bootstrap samples. One of the biggest advantages of bootstrapping is avoiding error propagation.
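
To illustrate bootstrapping the entire process rather than propagating errors, a sketch only: estimate_total is a hypothetical callable (e.g. the cumulative result of the N-1 adjacent BAR estimates), and the u_nk index layout (time plus lambda levels) is assumed from alchemlyb's conventions.

import numpy as np
import pandas as pd

def bootstrap_uncertainty(u_nk, estimate_total, n_boot=100, seed=0):
    # Resample frames with replacement within each sampled lambda state,
    # rerun the FULL stitched estimate on each resample, and report the
    # spread of the replicates as the uncertainty (no error propagation).
    rng = np.random.default_rng(seed)
    lambda_levels = u_nk.index.names[1:]  # all index levels except time
    groups = [g for _, g in u_nk.groupby(level=lambda_levels)]
    replicates = []
    for _ in range(n_boot):
        resampled = [g.iloc[rng.integers(0, len(g), len(g))] for g in groups]
        replicates.append(estimate_total(pd.concat(resampled)))
    return np.std(replicates, axis=0, ddof=1)

The spread of the replicates is the uncertainty of the full stitched result, so no per-segment propagation is needed.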

@mrshirts
Collaborator

Something is up with those uncertainties. They should be larger than the free energy uncertainties. There are correlations between the U and the S that should be showing up.
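
(For context: in reduced units \Delta s = \Delta u - \Delta f, so standard variance propagation gives

\sigma^2_{\Delta s} = \sigma^2_{\Delta u} + \sigma^2_{\Delta f} - 2\,\mathrm{cov}(\Delta u, \Delta f)

and with the covariance neglected one would expect \sigma_{\Delta s} \ge \sigma_{\Delta f}; since \Delta u and \Delta f are estimated from the same samples, that covariance is exactly the correlation that should be showing up.)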

Free energies are all behaving really nicely (within uncertainties between BAR and MBAR), U and S are not.

I probably won't have time to look at it today, maybe Friday or the weekend.

@jaclark5
Contributor Author

jaclark5 commented Sep 19, 2024

You should not need to do error propagation for BAR with bootstrapping. You calculate the values (given the ENTIRE PROCESS of adding up N-1 BAR estimates) with N bootstrap samples. One of the biggest advantages of bootstrapping is avoiding error propagation.

I meant that I don't have the code to populate the cumulative FE matrix in alchemlyb, which is why I put the adjacent states.

One can reproduce these numbers using my branch for this alchemlyb PR and the following code:

import numpy as np
import alchemlyb
from alchemlyb.parsing import gmx
from alchemlyb.estimators import MBAR, BAR
from alchemtest.gmx import load_benzene

n_bootstrap = 100

# Benzene Coulomb-decoupling dataset shipped with alchemtest
dataset = load_benzene()
u_nk = alchemlyb.concat([gmx.extract_u_nk(file, T=300) for file in dataset["data"]["Coulomb"]])


def print_adjacent(label, delta, d_delta):
    # Report adjacent-state differences delta[i + 1, i] and their uncertainties
    lx = len(delta) - 1
    print("{}: {} {}".format(
        label,
        np.round([delta.to_numpy()[i + 1, i] for i in range(lx)], decimals=6),
        np.round([d_delta.to_numpy()[i + 1, i] for i in range(lx)], decimals=6),
    ))


mbar = MBAR(n_bootstraps=n_bootstrap).fit(u_nk, compute_entropy_enthalpy=True)
print_adjacent("MBAR F", mbar.delta_f_, mbar.d_delta_f_)
print_adjacent("MBAR U", mbar.delta_h_, mbar.d_delta_h_)
print_adjacent("MBAR S", mbar.delta_s_, mbar.d_delta_s_)

# "BAR" here is a series of 2-state MBAR computations (use_mbar=True)
bar = BAR(n_bootstraps=n_bootstrap).fit(u_nk, compute_entropy_enthalpy=True, use_mbar=True)
print_adjacent("BAR F", bar.delta_f_, bar.d_delta_f_)
print_adjacent("BAR U", bar.delta_h_, bar.d_delta_h_)
print_adjacent("BAR S", bar.delta_s_, bar.d_delta_s_)
