Docs missing tutorial using LMDBDatabase and custom properties #788

Open
zulissimeta opened this issue Aug 1, 2024 · 4 comments
Assignees
Labels
enhancement New feature or request

Comments

@zulissimeta
Collaborator

zulissimeta commented Aug 1, 2024

#787 highlights a gap in our docs for users who want to train on molecular properties with custom outputs, such as HOMO-LUMO gaps.

We should add a simple example to the tutorials, perhaps:

  1. Download QM9
  2. Write an ASE db
  3. Fine-tune a checkpoint for the HOMO-LUMO gap
@zulissimeta zulissimeta added the enhancement New feature or request label Aug 1, 2024
@siyu-g

siyu-g commented Aug 2, 2024

Hi Zach,

Thanks for your quick response. Long time no see! This is Bruno from Noa's group. I would be happy to contribute to the tutorial. During my internship last year, I believe I wrote the data preprocessing script and documentation to convert both QM9 and OE62 data to LMDBs. I am writing to ask whether those scripts and docs are still available. If so, it would be much easier for me to generate the LMDBs, write the tutorials, and use the OCP models in further applications.

Thanks,
Bruno

@siyu-g

siyu-g commented Aug 8, 2024

Hi,
I am just following up on my previous message. Is there a file I can refer to when training on a molecular property?

Bruno

@zulissimeta
Collaborator Author

> Hi, I am just following up on my previous message. Is there a file I can refer to when training on a molecular property?
>
> Bruno

Sorry I missed this!

To write an ASE LMDB:

from fairchem.core.datasets.lmdb_database import LMDBDatabase

# atoms_list is assumed to be an iterable of ase.Atoms objects prepared beforehand
with LMDBDatabase("my_dataset.aselmdb") as db:
    for atoms in atoms_list:
        db.write(atoms)
        # Optionally use db.write(atoms, data=atoms.info) to store atoms.info as data

Then refer to https://fair-chem.github.io/core/ase_dataset_creation.html for training on the resulting dataset. We should definitely iterate on this!


github-actions bot commented Sep 8, 2024

This issue has been marked as stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Sep 8, 2024
@lbluque lbluque assigned lbluque and unassigned misko Sep 13, 2024
@github-actions github-actions bot removed the stale label Sep 14, 2024