Translating Cancer Data Aggregator (CDA) data to 🔥 FHIR (Fast Healthcare Interoperability Resources) format.
- from source
# clone repo & setup virtual env
python3 -m venv venv
. venv/bin/activate
pip install -e .
To run the transformer, ensure that CDA raw data is located in the ./data/raw/ directory. If you need to retrieve the raw data, please contact cancerdataaggregator @ gmail.
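Before starting a long run, it can help to confirm that the raw data is actually in place. The following is a minimal sketch, assuming only that ./data/raw/ should contain at least one file; the helper name `raw_data_ready` is hypothetical and not part of cda2fhir, and the check says nothing about whether the files are the ones the transformer expects.

```python
from pathlib import Path


def raw_data_ready(raw_dir="./data/raw"):
    """Return True if raw_dir exists and holds at least one file at any depth.

    Hypothetical pre-flight helper; it does not inspect file names or contents.
    """
    root = Path(raw_dir)
    if not root.is_dir():
        return False
    return any(p.is_file() for p in root.rglob("*"))
```

Calling `raw_data_ready()` from the repository root before `cda2fhir transform` gives an early, cheap failure signal instead of an error partway through the run.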
Usage: cda2fhir transform [OPTIONS]

Options:
  -s, --save               Save FHIR ndjson to CDA2FHIR/data/META folder.
                           [default: True]
  -v, --verbose
  -ns, --n_samples TEXT    Number of samples to randomly select - max 100.
  -nd, --n_diagnosis TEXT  Number of diagnosis to randomly select - max 100.
  -p, --path TEXT          Path to save the FHIR NDJSON files. default is
                           CDA2FHIR/data/META.
  --help                   Show this message and exit.
- example
cda2fhir transform
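When the transform is driven from a script, the documented options can be assembled programmatically. This is a rough sketch, not part of cda2fhir: the helper names `build_transform_command` and `run_transform` are hypothetical, only the flags listed in the help text above are used, and the lower bound of 1 on the sample counts is an assumption (the help text only states a maximum of 100). The help text also does not say whether the options may be combined, so treat the combinations as untested.

```python
import subprocess


def build_transform_command(n_samples=None, n_diagnosis=None, path=None, verbose=False):
    """Build an argument list for `cda2fhir transform` from the documented flags."""
    cmd = ["cda2fhir", "transform"]
    for flag, value in (("-ns", n_samples), ("-nd", n_diagnosis)):
        if value is not None:
            # Help text says "max 100"; the minimum of 1 is an assumption.
            if not 1 <= int(value) <= 100:
                raise ValueError(f"{flag} must be between 1 and 100, got {value}")
            cmd += [flag, str(value)]
    if path is not None:
        cmd += ["-p", str(path)]
    if verbose:
        cmd.append("-v")
    return cmd


def run_transform(**options):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_transform_command(**options), check=True)
```

For example, `build_transform_command(n_samples=10, path="out")` yields the list for `cda2fhir transform -ns 10 -p out`, which `run_transform` would then execute in the active virtual environment.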
The current integration tests run on all data and may take approximately 2 hours.
pytest --cov
For FHIR data validation please run:
g3t meta validate <path to data/META folder with ndjson files>
>>>> resources={'summary': {'Specimen': 715864, 'Observation': 724999, 'ResearchStudy': 423, 'BodyStructure': 180, 'Condition': 95288, 'ResearchSubject': 160662, 'Patient': 137522}}
NOTE: This process may take more than 5 minutes due to the size of the current data.
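The per-type totals in the summary can be cross-checked independently of the validator. The sketch below is an illustration under stated assumptions: it assumes the output folder holds files ending in `.ndjson` with one FHIR resource per line, and the function name `count_resource_types` is hypothetical. It only counts resources by their `resourceType` field and performs no FHIR validation.

```python
import json
from collections import Counter
from pathlib import Path


def count_resource_types(meta_dir):
    """Count resources per resourceType across all *.ndjson files in meta_dir."""
    counts = Counter()
    for ndjson_file in sorted(Path(meta_dir).glob("*.ndjson")):
        with ndjson_file.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:  # tolerate blank lines between records
                    continue
                resource = json.loads(line)
                counts[resource.get("resourceType", "unknown")] += 1
    return counts
```

Running `count_resource_types("data/META")` after a transform returns a mapping of resource type to count that can be compared with the `summary` values reported by the validation step.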