Points of Contact
For 2022, the DataJoint Elements POC is Dr. Kushal Bakshi (kushal@datajoint.com)
For 2022, the DANDI POC is Dr. Satrajit Ghosh (satra@mit.edu)
To achieve the aims of coordinated development, the principal developers conduct a joint …
For 2022, the DataJoint Elements POC is Dr. Kushal Bakshi (kushal@datajoint.com)
For 2022, the Facemap POC is Dr. Carsen Stringer (stringerc@janelia.hhmi.org)
To achieve the aims of coordinated development, the principal developers conduct a joint …
For 2022, the DataJoint Elements POC is Dr. Kushal Bakshi (kushal@datajoint.com)
For 2022, the NWB POC is Dr. Ryan Ly (Lawrence Berkeley National Laboratory)
To achieve the aims of coordinated development, the principal developers conduct a joint …
For 2022, the DataJoint Elements POC is Dr. Kushal Bakshi (kushal@datajoint.com)
For 2022, the Suite2p POC is Dr. Carsen Stringer (stringerc@janelia.hhmi.org)
To achieve the aims of coordinated development, the principal developers conduct a joint …
Welcome to the DataJoint Documentation
Open-source framework for defining, operating, and querying data pipelines
Learn more
Open-source implementation of data pipelines for neuroscience studies
Learn more
A cloud platform for automated analysis workflows. It relies on DataJoint Python and DataJoint Elements.
Learn more | Sign-in
Projects and research teams supported by DataJoint software
Learn more
A collection of additional open-source tools for building and operating scientific data pipelines.
","location":"additional-resources/"},{"title":"APIs","text":"A MATLAB client for defining, operating, and querying data pipelines.
Legacy docs | Source code
A REST API server for interacting with DataJoint pipelines.
Docs | Source code
A browser-based graphical user interface for data entry and navigation.
Legacy docs | Source code
A framework for making low-code web apps for data visualization.
Legacy docs | Source code
graph
  %% Give short names
  dj["datajoint/datajoint"]
  base["datajoint/djbase"]
  lab["datajoint/djlab"]
  hub["datajoint/djlabhub"]
  test["datajoint/djtest"]
  conda3["datajoint/miniconda3"]
  mysql["datajoint/mysql"]
  %% Define connections
  conda3 --> base --> test;
  base --> dj;
  base --> lab --> hub;
  %% Add all to class
  class dj,base,lab,hub,test,conda3,mysql boxes;
  classDef boxes stroke:#333; %% Grey stroke for class
MySQL server configured to work with DataJoint.
Docker image | Source code
Minimal Python Docker image with conda.
Docker image | Legacy docs | Source code
Minimal base Docker image with DataJoint Python dependencies installed.
Docker image | Legacy docs | Source code
Docker image for running tests related to DataJoint Python.
Docker image | Legacy docs | Source code
Official DataJoint Docker image.
Docker image | Source code
Docker image optimized for running a JupyterLab environment with DataJoint Python.
Docker image | Legacy docs | Source code
Docker image optimized for deploying to JupyterHub a JupyterLab environment with DataJoint Python.
Docker image | Legacy docs | Source code
Find us at the following workshops and conferences!
linking_module, and key management in make functions.
If your work uses DataJoint Python, MATLAB, or Elements, please cite the respective manuscripts and Research Resource Identifiers (RRIDs).
","location":"about/citation/"},{"title":"DataJoint Python and MATLAB","text":"Thank you for your interest in contributing! \ud83e\udd1d
To help keep everyone in alignment and coordinated in the community effort, we\u2019ve created this document. It serves as the contribution guideline that outlines how open-source software development is to be conducted. Any software development that makes reference to this document can be assumed to adopt the policies outlined below. We\u2019ve structured the guideline in a FAQ (frequently asked questions) format to make it easier to digest. Feel free to review the questions below to determine any specific policy.
The principal maintainer of DataJoint and associated tools is the DataJoint company. The pronouns \u201cwe\u201d and \u201cus\u201d in this guideline refer to the principal maintainers. We invite reviews and contributions of the open-source software. We compiled these guidelines to make this work clear and efficient.
","location":"about/contribute/"},{"title":"Feedback","text":"DataJoint APIs, DataJoint Web GUIs, and DataJoint Elements are supported by NIH grant U24 NS116470 for disseminating open-source software for neuroscience research. Your feedback is essential for continued funding. Your feedback also helps shape the technology development roadmap for the DataJoint ecosystem. Please tell us about your projects by filling out the DataJoint Census.
","location":"about/contribute/#feedback"},{"title":"1) Which issue should I contribute towards?","text":"There are three primary things to consider when looking to contribute.
Availability: An indication of whether anyone is currently working on a fix for the given issue. Availability is indicated by who is assigned. Issues that are unassigned mean that no one is yet working on resolving the issue, and the issue is available for someone to work on. If an issue has been assigned, then any additional work on that issue should be coordinated with the assignee.
Specification: In order for issues to be properly addressed, the requirements for satisfying and closing the issue should be clear. If they are not, the label unspecified will be added. This could be due to more debug info being necessary, more details on intended behavior, or perhaps that further discussion is required to determine a good solution. Feel free to help us arrive at a proper specification.
Priority: As a community, we work in a concerted effort to bring about the realization of the milestones. We utilize milestones as a planning tool to help focus a group of changes around a release. To determine the priority of issues, simply have a look at the next milestone that is expected to arrive; each milestone following it can be understood as lower in priority, respectively. Bear in mind that, much like a hurricane forecast, the execution plan is much more likely to be accurate closer to today's date than for milestones further out. Extremely low-priority issues are assigned to the Backburner milestone. Since Backburner does not have a target date, its issues may be deferred indefinitely. Occasionally the maintainers will move issues from Backburner as it makes sense to address them within a release. Also, issues unassigned to a milestone can be understood as new issues which have not yet been triaged.
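Purely as an illustration, the three checks above (availability, specification, priority) might be summarized in a small helper; the `issue` dictionary keys here are hypothetical, not a real GitHub API shape:

```python
def ready_to_work_on(issue: dict) -> bool:
    """Illustrative triage check mirroring the guidance above: an issue is a
    good candidate when it is unassigned, specified, and on a dated milestone."""
    return (
        issue.get("assignee") is None                         # availability
        and "unspecified" not in issue.get("labels", [])      # specification
        and issue.get("milestone") not in (None, "Backburner")  # priority
    )
```

After confirming an issue passes checks like these, comment on it so a maintainer can assign it to you.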
After considering the above, you may comment on the issue you\u2019d like to help fix and a maintainer will assign it to you.
","location":"about/contribute/#1-which-issue-should-i-contribute-towards"},{"title":"2) What is the proper etiquette for proposing changes as contribution?","text":"What is generally expected from new contributions are the following:
Any proposed contributor changes should be introduced in the form of a pull request (PR) from their fork.
Proper branch target specified. Generally, the following branches are available to target:
- main or master: Represents the single source of truth and the latest in completed development.
- pre: Represents the source at the point of the last stable release.
For larger, more involved changes, a maintainer may determine it best to create a feature-specific branch and adjust the PR accordingly.
A summary description that describes the overall intent behind the PR.
Proper links to the issue(s) that the PR serves to resolve.
Newly introduced changes must pass any required checks. Typically as it relates to tests, this means:
Additional documentation to reflect new feature or behavior introduced.
Necessary updates to the changelog following Keep a Changelog convention.
A contributor should not approve or merge their own PR.
Reviewer suggestions or feedback should not be directly committed to a branch on a contributor\u2019s fork. A less intrusive way to collaborate would be for the reviewer to PR to the contributor\u2019s fork/branch that is associated with the main PR currently in review.
Maintainers will also ensure that PRs have the appropriate assignment for reviewer, milestone, and project.
","location":"about/contribute/#2-what-is-the-proper-etiquette-for-proposing-changes-as-contribution"},{"title":"3) How can I track the progress of an issue that has been assigned?","text":"Since milestones represent the development plan, projects represent the actual execution. Projects are typically fixed-time sprints (1-2 weeks). A \u2018workable\u2019 number of issues that have been assigned to developers and assigned to the next milestone are selected and tracked in each project to provide greater granularity in the week-to-week progress. Automation is included observing the Automated kanban with reviews
template. Maintainers will adjust the project assignment to reflect the order in which to resolve the milestone issues.
Releases follow the standard definition of semantic versioning, MAJOR.MINOR.PATCH:
- MAJOR version when you make incompatible API changes,
- MINOR version when you add functionality in a backwards-compatible manner, and
- PATCH version when you make backwards-compatible bug fixes.
Each release requires tagging the commit appropriately; the release is then issued through the normal medium, e.g., PyPI, npm, Yarn, GitHub Releases, etc.
Minor releases are triggered when all the issues assigned to a milestone are resolved and closed. Patch releases are triggered periodically from main or master after a reasonable number of PR merges have come in.
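The version-bump rules above can be sketched as a tiny helper (hypothetical, not part of any DataJoint tooling):

```python
def bump(version: str, change: str) -> str:
    """Return the next semantic version given a change type:
    'major' (incompatible API change), 'minor' (backwards-compatible
    feature), or 'patch' (backwards-compatible bug fix)."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```

For example, `bump("0.14.3", "minor")` yields `"0.15.0"` — note that MINOR and MAJOR bumps reset the lower-order fields to zero.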
In order to follow the appropriate process and setting, please reference the following flow for your desired mode of engagement:
","location":"about/contribute/#5-i-am-not-yet-too-comfortable-contributing-but-would-like-to-engage-the-community-what-is-the-policy-on-community-engagement"},{"title":"5a) Generally, how do I perform ____?","text":"If the documentation does not provide clear enough instruction, please see StackOverflow posts related to the datajoint tag or ask a new question tagging it appropriately. You may refer to our datajoint tag wiki for more details on its proper use.
","location":"about/contribute/#5a-generally-how-do-i-perform-____"},{"title":"5b) I just encountered this error, how can I resolve it?","text":"Please see StackOverflow posts related to the datajoint tag or ask a new question tagging it appropriately. You may refer to our datajoint tag wiki for more details on its proper use.
","location":"about/contribute/#5b-i-just-encountered-this-error-how-can-i-resolve-it"},{"title":"5c) I just encountered this error and I am sure it is a bug, how do I report it?","text":"Please file it under the issue tracker associated with the open-source software.
","location":"about/contribute/#5c-i-just-encountered-this-error-and-i-am-sure-it-is-a-bug-how-do-i-report-it"},{"title":"5d) I have an idea or new feature request, how do I submit it?","text":"Please file it under the issue tracker associated with the open-source software.
","location":"about/contribute/#5d-i-have-an-idea-or-new-feature-request-how-do-i-submit-it"},{"title":"5e) I am curious why the maintainers choose to ____? i.e. questions that are \u2018opinionated\u2019 in nature with answers that some might disagree.","text":"Please join the community on the DataJoint Slack and ask on the most relevant channel. There, you may engage directly with the maintainers for proper discourse.
","location":"about/contribute/#5e-i-am-curious-why-the-maintainers-choose-to-____-ie-questions-that-are-opinionated-in-nature-with-answers-that-some-might-disagree"},{"title":"5f) What is the timeline or roadmap for the release of certain supported features?","text":"Please refer to milestones and projects associated with the open-source software.
","location":"about/contribute/#5f-what-is-the-timeline-or-roadmap-for-the-release-of-certain-supported-features"},{"title":"5g) I need urgent help best suited for live debugging, how can I reach out directly?","text":"Please join the community on the DataJoint Slack and ask on the most relevant channel. Please bear in mind that as open-source community software, availability of the maintainers might be limited.
","location":"about/contribute/#5g-i-need-urgent-help-best-suited-for-live-debugging-how-can-i-reach-out-directly"},{"title":"Team","text":"The project is performed by DataJoint with Dimitri Yatsenko as Principal Investigator.
","location":"about/datajoint-team/"},{"title":"Scientists","text":"The first-person pronouns \"we\" and \"our\" in these documents refer to those listed above.
","location":"about/datajoint-team/#past-contributors"},{"title":"External contributors","text":"The principal components of the Resource are developed and distributed as open-source projects and external contributions are welcome. We have adopted a Contribution Guide for DataJoint, DataJoint Elements, and related open-source tools.
","location":"about/datajoint-team/#external-contributors"},{"title":"History","text":"Dimitri Yatsenko began development of DataJoint in Andreas S.Tolias' lab in the Neuroscience Department at Baylor College of Medicine in the fall of 2009. Initially implemented as a thin MySQL API in MATLAB, it defined the major principles of the DataJoint model.
Many students and postdocs in the lab as well as collaborators and early adopters have contributed to the project. Jacob Reimer and Emmanouil Froudarakis became early adopters in Andreas Tolias' Lab and propelled development. Alexander S. Ecker, Philipp Berens, Andreas Hoenselaar, and R. James Cotton contributed to the formulation of the overall requirements for the data model and critical reviews of DataJoint development.
Outside the Tolias lab, the first labs to adopt DataJoint (approx. 2010) were the labs of Athanassios G. Siapas at CalTech, Laura Busse and Steffen Katzner at the University of T\u00fcbingen.
In 2015, the Python implementation gained momentum with Edgar Y. Walker and Fabian Sinz joining as principal contributors.
In 2016, Andreas Tolias Lab joined the MICrONS project, using DataJoint to process volumes of neurophysiology and neuroanatomical data shared across large teams.
In 2016, Vathes LLC was founded to provide support to groups using DataJoint.
In 2017, DARPA awarded a small-business innovation research grant to Vathes LLC (Contract D17PC00162) to further develop and publicize the DataJoint framework.
In June 2018, the Princeton Neuroscience Institute, under the leadership of Prof. Carlos Brody, began funding a project to generate a detailed DataJoint user manual.
","location":"about/history/"},{"title":"DataJoint Elements for Neurophysiology","text":"DataJoint Elements provides an efficient approach for neuroscience labs to create and manage scientific data workflows: the complex multi-step methods for data collection, preparation, processing, analysis, and modeling that researchers must perform in the course of an experimental study. Elements are a collection of curated modules for assembling workflows for several modalities of neurophysiology experiments and are designed for ease of integration into diverse custom workflows. This work is derived from the developments in leading neuroscience projects and uses the DataJoint API for defining, deploying, and sharing their data workflows.
An overview of the principles of DataJoint workflows and the goals of DataJoint Elements are described in the position paper \"DataJoint Elements: Data Workflows for Neurophysiology\".
Below are the projects that make up the family of open-source DataJoint Elements:
A data pipeline for calcium imaging microscopy.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for Neuropixels probes.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for electrode localization of Neuropixels probes.
Docs
A data pipeline for miniscope calcium imaging.
Interactive tutorial
Docs
A data pipeline for segmenting volumetric microscopy data with Cellpose, uploading to BossDB, and visualizing with Neuroglancer.
Interactive tutorial
Docs
A data pipeline for pose estimation with DeepLabCut.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for motion sequencing with Keypoint-MoSeq.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for pose estimation with Facemap.
Docs
A data pipeline for managing data from optogenetics experiments.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for visual stimulation with Psychtoolbox.
Docs
A data pipeline for lab management.
Docs
A data pipeline for subject management.
Docs
A data pipeline for session management.
Docs
A data pipeline for event- and trial-based experiments.
Docs
Common functions for the DataJoint Elements.
Docs
The following conventions describe the DataJoint Python API implementation.
","location":"elements/concepts/"},{"title":"DataJoint Schemas","text":"The DataJoint Python API allows creating database schemas, which are namespaces for collections of related tables.
The following commands declare a new schema and create the object named schema to reference the database schema.
import datajoint as dj
schema = dj.schema('<schema_name>')
We follow the convention of having only one schema defined per Python module. Such a module then becomes a DataJoint schema: a Python module with a corresponding database schema.
The module's schema object is then used as the decorator for classes that define tables in the database.
dj.createSchema
In MATLAB, we list one table per file and place schemas in folders.
","location":"elements/concepts/#datajoint-schemas"},{"title":"Elements","text":"An Element is a software package defining one or more DataJoint schemas serving a particular purpose. By convention, such packages are hosted in individual GitHub repositories. For example, Element element_calcium_imaging
is hosted at this GitHub repository and contains two DataJoint schemas: scan and imaging.
The following YouTube videos provide information on basic design principles and file organization.
Some videos feature outdated versions of the respective GitHub repositories. For the most updated information, check the documentation page for the corresponding Element.
","location":"elements/concepts/#youtube-tutorials"},{"title":"Deferred schemas","text":"A deferred schema is one in which the name of the database schema name is not specified. This module does not declare schema and tables upon import. Instead, they are declared by calling schema.activate('<schema_name>')
after import.
By convention, all modules corresponding to deferred schemas must declare the function activate, which in turn calls schema.activate.
Thus, Element modules begin with:
import datajoint as dj

schema = dj.schema()

def activate(schema_name):
    schema.activate(schema_name)
However, many activate functions perform other work associated with activating the schema such as activating other schemas upstream.
","location":"elements/concepts/#deferred-schemas"},{"title":"Linking Module","text":"To make the code more modular with fewer dependencies, Element modules do not import
upstream schemas directly. Instead, all required classes and functions must be defined in a linking_module
and passed to the module's activate
function. By keeping all upstream requirements in the linking module, all Elements can be activated as part of any larger pipeline.
For instance, the Scan module receives its required functions from the linking module passed into the module's activate
function. See the example notebooks for an example of how the linking module is passed into the Element's module.
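The dependency-injection idea behind the linking module can be sketched without DataJoint itself; here the function name `get_session_directory` and the namespace object are purely illustrative stand-ins for the upstream requirements a real Element would declare:

```python
import types

# Hypothetical linking module: a plain namespace supplying the upstream
# functions this element requires, instead of importing upstream schemas.
linking_module = types.SimpleNamespace(
    get_session_directory=lambda key: f"/data/{key['subject']}/session{key['session']}"
)

def activate(schema_name, *, linking_module):
    """Sketch of an Element-style activate(): verify and collect the
    required upstream functions from the provided linking module."""
    required = ["get_session_directory"]
    missing = [name for name in required if not hasattr(linking_module, name)]
    if missing:
        raise ValueError(f"linking module must define: {missing}")
    return {name: getattr(linking_module, name) for name in required}
```

Because the Element only ever touches names handed to it through `activate`, the same module can be wired into any larger pipeline that provides those names.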
Clone the relevant workflows (e.g., workflow-array-ephys) and their dependencies (e.g., element-array-ephys).
Change to your projects directory (e.g., cd ~/Projects).
Check requirements.txt in the workflow for the list of Elements to clone and install as editable. You will also need to install element-interface.
deps=(\"lab\" \"animal\" \"session\" \"interface\" \"<others>\")\nfor repo in $deps # clone each\ndo \n git clone https://github.com/datajoint/element-$repo\ndone\nfor repo in $(ls -d ./{element,workflow}*) # editable install \ndo \n pip install -e ./$repo\ndone\n
If you need to drop all schemas to start fresh, you'll need to do so following the dependency order. Refer to the workflow's notebook (notebooks/06-drop-optional.ipynb
) for the drop order.
Download the test dataset to a local directory (e.g., /tmp/testset).
Create an .env file within the docker directory with the following content, replacing /tmp/testset with the directory where you have the test dataset downloaded: TEST_DATA_DIR=/tmp/testset
To test a specific element, your fork of an element, or the workflow, uncomment the corresponding lines from the different options presented within the Dockerfile. This will allow you to install the repositories of interest and run the integration tests on those packages. Be sure that the element package version matches the version in the requirements.txt of the workflow.
docker-compose -f ./docker/docker-compose-test.yaml up --build
The following document describes how to set up a development environment and connect to a database so that you can use the DataJoint Elements to build and run a workflow on your local machine.
Any of the DataJoint Elements can be combined together to create a workflow that matches your experimental setup. We have a number of example workflows to get you started. Each focuses on a specific modality, but they can be adapted for your custom workflow.
Getting up and running will require a couple items for a good development environment. If any of these items are already familiar to you and installed on your machine, you can skip the corresponding section.
1. Python
2. Conda
3. Integrated Development Environment
4. Version Control (git)
5. Visualization packages
Next, you'll need to download one of the example workflows and corresponding example data.
Finally, there are a few different approaches to connecting to a database. Here, we highlight three:
1. First Time: Beginner. Temporary storage to learn the ropes.
2. Local Database: Intermediate. Deployed on local hardware, managed by you.
3. Central Database: Advanced. Deployed on dedicated hardware.
This diagram describes the general components for a local DataJoint environment.
flowchart LR
  py_interp -->|DataJoint| db_server[("Database Server\n(e.g., MySQL)")]
  subgraph conda["Conda environment"]
    direction TB
    py_interp[Python Interpreter]
  end
  subgraph empty1[" "] %% Empty subgraphs prevent overlapping titles
    direction TB
    style empty1 fill:none, stroke-dasharray: 0 1
    conda
  end
  subgraph term["Terminal or Jupyter Notebook"]
    direction TB
    empty1
  end
  subgraph empty2[" "] %% Empty subgraphs prevent overlapping titles
    direction TB
    style empty2 fill:none, stroke-dasharray: 0 1
    term
  end
  class py_interp,conda,term,ide,db_server,DataJoint boxes;
  classDef boxes fill:#ddd, stroke:#333;
","location":"elements/user-guide/#development-environment"},{"title":"Python","text":"DataJoint Elements are written in Python. The DataJoint Python API supports Python versions 3.7 and up. We recommend downloading the latest stable release of 3.9 here, and following the install instructions.
","location":"elements/user-guide/#python"},{"title":"Conda","text":"Python projects each rely on different dependencies, which may conflict across projects. We recommend working in a Conda environment for each project to isolate the dependencies. For more information on why Conda, and setting up the version of Conda that best suits your needs, see this article.
To get going quickly, we recommend you ...
Download Miniconda and go through the setup, including adding Miniconda to your PATH
(full instructions here).
Declare and initialize a new conda environment with the following commands. Edit <name>
to reflect your project.
conda create --name datajoint-workflow-<name> python=3.9
conda activate datajoint-workflow-<name>
Running analyses with Element DeepLabCut or Element Calcium Imaging may require TensorFlow, which can cause issues on M1 machines. By saving the yaml file below, this environment can be loaded with conda env create -f my-file.yaml. If you encounter errors related to clang, try launching Xcode and retrying.
name: dj-workflow-<name>
channels:
  - apple
  - conda-forge
  - defaults
dependencies:
  - tensorflow-deps
  - opencv
  - python=3.9
  - pip>=19.0
  - pip:
    - tensorflow-macos
    - tensorflow-metal
    - datajoint
","location":"elements/user-guide/#conda"},{"title":"Integrated Development Environment (IDE)","text":"Development and use can be done with a plain text editor in the terminal. However, an integrated development environment (IDE) can improve your experience. Several IDEs are available. We recommend Microsoft's Visual Studio Code, also called VS Code. To set up VS Code with Python for the first time, follow this tutorial.
","location":"elements/user-guide/#integrated-development-environment-ide"},{"title":"Version Control (git)","text":"Table definitions and analysis code can change over time, especially with multiple collaborators working on the same project. Git is an open-source, distributed version control system that helps keep track of what changes where made when, and by whom. GitHub is a platform that hosts projects managed with git. The example DataJoint Workflows are hosted on GitHub, we will use git to clone (i.e., download) this repository.
git --version
in a terminal window.
To run the demo notebooks and generate visualizations associated with an example workflow, you'll need a couple of extra packages.
Jupyter Notebooks help structure code (see here for full instructions on Jupyter within VS Code).
Install Jupyter packages
conda install jupyter ipykernel nb_conda_kernels
Ensure your VS Code python interpreter is set to your Conda environment path.
DataJoint Diagrams rely on additional packages. To install these packages, enter the following command...
conda install graphviz python-graphviz pydotplus
","location":"elements/user-guide/#visualization-packages-jupyter-notebooks-datajoint-diagrams"},{"title":"Example Config, Workflows and Data","text":"Of the options below, pick the workflow that best matches your needs.
Change the directory to where you want to download the workflow.
cd ~/Projects
Clone the relevant repository, and change directories to this new directory.
git clone https://github.com/datajoint/<repository>
cd <repository>
Install this directory as editable with the -e
flag.
pip install -e .
Why editable? This lets you modify the code after installation and experiment with different designs or adding additional tables. You may wish to edit pipeline.py or paths.py to better suit your needs. If no modification is required, using pip install . is sufficient.
Install element-interface, which has utilities used across different Elements and Workflows.
pip install \"element-interface @ git+https://github.com/datajoint/element-interface\"\n
\u200bSet up a local DataJoint config file by saving the following block as a json in your workflow directory as dj_local_conf.json
. Not sure what to put for the < >
values below? We'll cover this when we connect to the database
{\n \"database.host\": \"<hostname>\",\n \"database.user\": \"<username>\",\n \"database.password\": \"<password>\",\n \"loglevel\": \"INFO\",\n \"safemode\": true,\n \"display.limit\": 7,\n \"display.width\": 14,\n \"display.show_tuple_count\": true,\n \"custom\": {\n \"database.prefix\": \"<username_>\"\n }\n}\n
An example workflow for session management.
Clone from GitHub
An example workflow for Neuropixels probes.
Clone from GitHub
An example workflow for calcium imaging microscopy.
Clone from GitHub
An example workflow for miniscope calcium imaging.
Clone from GitHub
An example workflow for pose estimation with DeepLabCut.
Clone from GitHub
The first notebook in each workflow will guide you through downloading example data from DataJoint's AWS storage archive. You can also process your own data. To use the example data, you would ...
Install djarchive-client
pip install git+https://github.com/datajoint/djarchive-client.git
Use a Python terminal to import the djarchive client and view available datasets and revisions.
import djarchive_client
client = djarchive_client.client()
list(client.datasets())   # List available datasets, select one
list(client.revisions())  # List available revisions, select one
Prepare a directory to store the downloaded data, for example in /tmp, then download the data with the djarchive client. This may take some time with larger datasets.
import os
os.makedirs('/tmp/example_data/', exist_ok=True)
client.download(
    '<workflow-dataset>',
    target_directory='/tmp/example_data',
    revision='<revision>'
)
The example subject6/session1 data was recorded with SpikeGLX and processed with Kilosort2.
/tmp/example_data/
- subject6
  - session1
    - towersTask_g0_imec0
    - towersTask_g0_t0_nidq.meta
    - towersTask_g0_t0.nidq.bin
Element and Workflow Array Ephys also support data recorded with OpenEphys.
Calcium Imaging: The example subject3 data was recorded with Scanbox. The example subject7 data was recorded with ScanImage. Both datasets were processed with Suite2p.
/tmp/example_data/
- subject3/
  - 210107_run00_orientation_8dir/
    - run00_orientation_8dir_000_000.sbx
    - run00_orientation_8dir_000_000.mat
    - suite2p/
      - combined
      - plane0
      - plane1
      - plane2
      - plane3
- subject7/
  - session1
    - suite2p
      - plane0
Element and Workflow Calcium Imaging also support data collected with: Nikon, Prairie View, CaImAn.
DeepLabCut: The example data includes both training data and pretrained models.
/tmp/test_data/from_top_tracking/
- config.yml
- dlc-models/iteration-0/from_top_trackingFeb23-trainset95shuffle1/
  - test/pose_cfg.yaml
  - train/
    - checkpoint
    - checkpoint_orig
    - learning_stats.csv
    - log.txt
    - pose_cfg.yaml
    - snapshot-10300.data-00000-of-00001
    - snapshot-10300.index
    - snapshot-10300.meta  # same for 103000
- labeled-data/
  - train1/
    - CollectedData_DJ.csv
    - CollectedData_DJ.h5
    - img00674.png  # and others
  - train2/  # similar to above
- videos/
  - test.mp4
  - train1.mp4
Facemap: The associated workflow is still under development.
Each workflow makes assumptions about how your file directory is organized and how certain files are named.
Array Ephys: In your DataJoint config, add another item under custom, ephys_root_data_dir, for your local root data directory. This can include multiple roots.
\"custom\": {\n \"database.prefix\": \"<username_>\",\n \"ephys_root_data_dir\": [\"/local/root/dir1\", \"/local/root/dir2\"]\n}\n
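When multiple roots are configured, the workflow must discover which root actually holds a given session's files. The sketch below illustrates that lookup using only the standard library; the helper name `find_root_for` is hypothetical (Element Interface ships its own path utilities):

```python
from pathlib import Path


def find_root_for(relative_path, root_dirs):
    """Return the first root directory containing relative_path.

    Mirrors how a workflow might resolve entries listed under
    `ephys_root_data_dir`; raises FileNotFoundError when no root
    contains the path.
    """
    for root in map(Path, root_dirs):
        if (root / relative_path).exists():
            return root
    raise FileNotFoundError(f"{relative_path} not found under any root")
```

Because the first matching root wins, relative paths should be unique across roots to avoid ambiguity.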
subject directory names must match the subject IDs in your subjects table. The ingest.py script (demo ingestion notebook) can help load these values from ./user_data/subjects.csv.
session directories can have any naming convention, but must be specified in the session table (see also the demo ingestion notebook).
probe directory names must end in a one-digit number corresponding to the probe number. Each probe directory should contain:
- One neuropixels meta file named *[0-9].ap.meta
- Optionally, one Kilosort output folder
Folder structure:
<ephys_root_data_dir>/\n\u2514\u2500\u2500\u2500<subject1>/ # Subject name in `subjects.csv`\n\u2502 \u2514\u2500\u2500\u2500<session0>/ # Session directory in `sessions.csv`\n\u2502 \u2502 \u2514\u2500\u2500\u2500imec0/\n\u2502 \u2502 \u2502 \u2502 *imec0.ap.meta\n\u2502 \u2502 \u2502 \u2514\u2500\u2500\u2500ksdir/\n\u2502 \u2502 \u2502 \u2502 spike_times.npy\n\u2502 \u2502 \u2502 \u2502 templates.npy\n\u2502 \u2502 \u2502 \u2502 ...\n\u2502 \u2502 \u2514\u2500\u2500\u2500imec1/\n\u2502 \u2502 \u2502 *imec1.ap.meta\n\u2502 \u2502 \u2514\u2500\u2500\u2500ksdir/\n\u2502 \u2502 \u2502 spike_times.npy\n\u2502 \u2502 \u2502 templates.npy\n\u2502 \u2502 \u2502 ...\n\u2502 \u2514\u2500\u2500\u2500<session1>/\n\u2502 \u2502 \u2502 ...\n\u2514\u2500\u2500\u2500<subject2>/\n\u2502 \u2502 ...\n
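The one-digit probe-number convention can be checked mechanically. A minimal sketch (the helper name `probe_number` is illustrative, not part of the Element):

```python
import re


def probe_number(dir_name):
    """Extract the probe number from a directory name ending in one digit.

    Returns the digit as an int, or None when the name does not follow
    the convention (e.g. 'imec0' -> 0, 'ksdir' -> None).
    """
    match = re.search(r"(\d)$", dir_name)
    return int(match.group(1)) if match else None
```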
Calcium Imaging: Note: While Element Calcium Imaging can accommodate multiple scans per session, Workflow Calcium Imaging assumes there is only one scan per session.
In your DataJoint config, add another item under custom, imaging_root_data_dir, for your local root data directory.
\"custom\": {\n \"database.prefix\": \"<username_>\",\n \"imaging_root_data_dir\": \"/local/root/dir1\"\n}\n
subject directory names must match the subject IDs in your subjects table. The ingest.py script (tutorial notebook) can help load these values from ./user_data/subjects.csv.
session directories can have any naming convention, but must be specified in the session table (see also the [tutorial notebook](https://github.com/datajoint/element-calcium-imaging/blob/main/notebooks/tutorial.ipynb)).
Each session directory should contain:
- All .tif or .sbx files for the scan, with any naming convention.
- One suite2p subfolder, containing the analysis outputs in the default naming convention.
- One caiman subfolder, containing the analysis output .hdf5 file, with any naming convention.
Folder structure:
imaging_root_data_dir/\n\u2514\u2500\u2500\u2500<subject1>/ # Subject name in `subjects.csv`\n\u2502 \u2514\u2500\u2500\u2500<session0>/ # Session directory in `sessions.csv`\n\u2502 \u2502 \u2502 scan_0001.tif\n\u2502 \u2502 \u2502 scan_0002.tif\n\u2502 \u2502 \u2502 scan_0003.tif\n\u2502 \u2502 \u2502 ...\n\u2502 \u2502 \u2514\u2500\u2500\u2500suite2p/\n\u2502 \u2502 \u2502 ops1.npy\n\u2502 \u2502 \u2514\u2500\u2500\u2500plane0/\n\u2502 \u2502 \u2502 \u2502 ops.npy\n\u2502 \u2502 \u2502 \u2502 spks.npy\n\u2502 \u2502 \u2502 \u2502 stat.npy\n\u2502 \u2502 \u2502 \u2502 ...\n\u2502 \u2502 \u2514\u2500\u2500\u2500plane1/\n\u2502 \u2502 \u2502 ops.npy\n\u2502 \u2502 \u2502 spks.npy\n\u2502 \u2502 \u2502 stat.npy\n\u2502 \u2502 \u2502 ...\n\u2502 \u2502 \u2514\u2500\u2500\u2500caiman/\n\u2502 \u2502 \u2502 analysis_results.hdf5\n\u2502 \u2514\u2500\u2500\u2500<session1>/ # Session directory in `sessions.csv`\n\u2502 \u2502 \u2502 scan_0001.tif\n\u2502 \u2502 \u2502 scan_0002.tif\n\u2502 \u2502 \u2502 ...\n\u2514\u2500\u2500\u2500<subject2>/ # Subject name in `subjects.csv`\n\u2502 \u2502 ...\n
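Given those conventions, a session directory can be inventoried with a few standard-library calls. A sketch under the assumptions above (the function name `inventory_session` is hypothetical, not part of the Element):

```python
from pathlib import Path


def inventory_session(session_dir):
    """Summarize a calcium-imaging session directory.

    Collects scan files (.tif or .sbx, any naming convention) and flags
    the presence of suite2p and caiman output subfolders.
    """
    session_dir = Path(session_dir)
    scans = sorted(p.name for p in session_dir.iterdir()
                   if p.suffix in (".tif", ".sbx"))
    return {
        "scans": scans,
        "has_suite2p": (session_dir / "suite2p").is_dir(),
        "has_caiman": (session_dir / "caiman").is_dir(),
    }
```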
DeepLabCut: Note: Element DeepLabCut assumes you've already used the DeepLabCut GUI to set up your project and label your data. In your DataJoint config, add another item under custom, dlc_root_data_dir, for your local root data directory. This can include multiple roots.
\"custom\": {\n \"database.prefix\": \"<username_>\",\n \"dlc_root_data_dir\": [\"/local/root/dir1\", \"/local/root/dir2\"]\n}\n
Ensure that the yaml files reflect the current folder structure, and that the pickle and mat training files are present. If not, follow the DeepLabCut guide to create a training dataset. Folder structure:
/dlc_root_data_dir/your_project/\n- config.yaml # Including correct path information\n- dlc-models/iteration-*/your_project_date-trainset*shuffle*/\n - test/pose_cfg.yaml # Including correct path information\n - train/pose_cfg.yaml # Including correct path information\n- labeled-data/any_names/*{csv,h5,png}\n- training-datasets/iteration-*/UnaugmentedDataSet_your_project_date/\n - your_project_*shuffle*.pickle\n - your_project_scorer*shuffle*.mat\n- videos/any_names.mp4\n
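A quick sanity check that a project folder matches this layout can catch missing pieces before ingestion. A sketch under the assumptions above (`check_dlc_project` is a hypothetical helper, not a DeepLabCut or DataJoint API):

```python
from pathlib import Path


def check_dlc_project(project_dir):
    """Return a list of expected items missing from a DLC project folder.

    Checks only the top-level names shown in the folder structure;
    deeper contents (snapshots, labeled frames) are not validated.
    """
    project_dir = Path(project_dir)
    expected = ["config.yaml", "dlc-models", "labeled-data",
                "training-datasets", "videos"]
    return [name for name in expected
            if not (project_dir / name).exists()]
```

An empty return value means the top-level layout looks complete.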
Miniscope: In your DataJoint config, add another item under custom, miniscope_root_data_dir, for your local root data directory.
\"custom\": {\n \"database.prefix\": \"<username_>\",\n \"miniscope_root_data_dir\": \"/local/root/dir\"\n}\n
DataJoint helps you connect to a database server from your programming environment (e.g., Python or MATLAB), granting a number of benefits over traditional file hierarchies (see the YouTube Explainer). We offer two options:
Temporary storage. Not for production use. Create a json file called dj_local_conf.json using your DataJoint account information and tutorial-db.datajoint.io
as the host. {\n \"database.host\": \"tutorial-db.datajoint.io\",\n \"database.user\": \"<datajoint-username>\",\n \"database.password\": \"<datajoint-password>\",\n \"loglevel\": \"INFO\",\n \"safemode\": true,\n \"display.limit\": 7,\n \"display.width\": 14,\n \"display.show_tuple_count\": true,\n \"custom\": {\n \"database.prefix\": \"<datajoint-username_>\"\n }\n}\n
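Because the database prefix must begin with your username to permit declaring new tables, it can help to validate dj_local_conf.json before connecting. A standard-library sketch (`check_prefix` is a hypothetical helper, not part of DataJoint):

```python
import json


def check_prefix(config_text):
    """Verify that custom.database.prefix starts with database.user.

    Accepts the raw text of dj_local_conf.json and returns True when
    the prefix rule is satisfied.
    """
    cfg = json.loads(config_text)
    user = cfg["database.user"]
    prefix = cfg.get("custom", {}).get("database.prefix", "")
    return prefix.startswith(user)
```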
Note: Your database prefix must begin with your username in order to have permission to declare new tables. Install Docker. Why Docker? Docker makes it easy to package a program, including the file system and related code libraries, in a container. This container can be distributed to any machine, both automating and standardizing the setup process.
Test that Docker is installed by running the following command:
docker run --rm hello-world\n
docker run -p 3306:3306 -e MYSQL_ROOT_PASSWORD=tutorial datajoint/mysql\n
What's this doing? This command runs the datajoint/mysql image as a container, publishing port 3306 and setting the MySQL root password to tutorial. Create a json file called dj_local_conf.json using the following details. The prefix can be set to any value. {\n \"database.host\": \"localhost\",\n \"database.password\": \"tutorial\",\n \"database.user\": \"root\",\n \"database.port\": 3306,\n \"loglevel\": \"INFO\",\n \"safemode\": true,\n \"display.limit\": 7,\n \"display.width\": 14,\n \"display.show_tuple_count\": true,\n \"custom\": {\n \"database.prefix\": \"neuro_\"\n }\n}\n
This document is written to apply to all example workflows. Many have a docker folder used by developers to set up both a database and a local environment for integration tests. Simply docker compose up
the relevant file and docker exec
into the relevant container.
A database on dedicated hardware may require expertise to set up and maintain. DataJoint's MySQL Docker image project provides all the information required to set up a dedicated database.
Interacting with the Workflow

In Python

Connect to the database and import tables
from <relevant-workflow>.pipeline import *\n
View the declared tables. For a more in-depth explanation of how to run the workflow and explore the data, refer to the Jupyter notebooks in the workflow directory. Array Ephys:
subject.Subject()\nsession.Session()\nephys.ProbeInsertion()\nephys.EphysRecording()\nephys.Clustering()\nephys.Clustering.Unit()\n
Calcium Imaging: subject.Subject()\nsession.Session()\nscan.Scan()\nscan.ScanInfo()\nimaging.ProcessingParamSet()\nimaging.ProcessingTask()\n
DeepLabCut: subject.Subject()\nsession.Session()\ntrain.TrainingTask()\nmodel.VideoRecording.File()\nmodel.Model()\nmodel.PoseEstimation.BodyPartPosition()\n
DataJoint LabBook is a graphical user interface to facilitate data entry for existing DataJoint tables.
If your pipeline is hosted on the tutorial database (tutorial-db) and you have access, you can view the contents here. You have several options for adopting DataJoint workflows for your own experiments.
Adopt independently

DataJoint Elements are designed for adoption by independent users with moderate software development skills, good understanding of DataJoint principles, and adequate IT expertise or support.
If you have not yet used DataJoint, we recommend completing our online training tutorials or attending a workshop either online or in person. Interactive tutorials can be found on the DataJoint Tutorials repository.
Support from DataJoint

Our team provides support to labs to adopt DataJoint workflows in their research.
This includes:
These services may be subsidized by grant funding for qualified research groups.
","location":"elements/management/adoption/#support-from-datajoint"},{"title":"Dissemination Plan","text":"","location":"elements/management/dissemination/"},{"title":"1. Dissemination","text":"We conduct activities to disseminate Resource components for adoption in diverse neuroscience labs. These activities include
To measure the effectiveness of the Resource, we conduct several activities to estimate its adoption and use:
This Resource is supported by the National Institute Of Neurological Disorders And Stroke of the National Institutes of Health under Award Number U24NS116470. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Scientific Steering Group

Project oversight and guidance are provided by the Scientific Steering Group, comprising:
Broad engagement with the neuroscience community is necessary for the optimization, integration, and adoption of the Resource components.
We conduct five types of outreach activities that require different approaches:
1. Precursor Projects

Our Selection Process requires a "Precursor Project" for any new experiment modality to be included in DataJoint Elements. A precursor project is a project that develops a DataJoint pipeline for its own experiments, either independently or in collaboration with our team. We reach out to teams who develop DataJoint pipelines for new experiment paradigms and modalities to identify essential design motifs, analysis tools, and related tools and interfaces. We interview the core team to learn about their collaborative culture, practices, and procedures. We jointly review their open-source code and their plans for dissemination. In many cases, our team already collaborates with such teams through our other projects and we have a good understanding of their process. As we develop a new Element to support the new modality, we remain in contact with the team to include their contribution, solicit feedback, and evaluate design tradeoffs. When the new Element is released, full attribution is given to the Precursor Project.
Rationale: The Resource does not aim to develop fundamentally new solutions for neurophysiology data acquisition and analysis. Rather, it aims to systematize and disseminate existing open-source tools proven in leading research projects.
2. Tool Developers

DataJoint pipelines rely on analysis tools, atlases, data standards, archives and catalogs, and other neuroinformatics resources developed and maintained by the broader scientific community. To ensure the sustainability of the Resource, we reach out to tool developers to establish joint sustainability roadmaps.
Management Plan

DataJoint Elements has established a Resource Management Plan to select projects for development, to assure quality, and to disseminate its output, as summarized in the figure below:
The following sections provide detailed information.
DataJoint and DataJoint Elements serve as a framework and starting points for numerous new projects, setting the standard of quality for data architecture and software design. To ensure higher quality, the following policies have been adopted into the software development lifecycle (SDLC).
Coding Standards

When writing code, the following principles should be observed.
All components and their revisions must include appropriate automated software testing to be considered for release. The core framework must undergo thorough performance evaluation and comprehensive integration testing.
Generally, this includes tests related to:
When introducing new code to the code base, the following will be required for acceptance by the DataJoint core team into the main code repository.
main branch once ready for review.
Upon satisfactory adherence to the above Coding Standards, Automated Testing, and Code Reviews:
Major.Minor.Patch number. main branch.
branch.For external research teams that reach out to us, we will provide engineering support to help users adopt the updated software, collect feedback, and resolve issues following the processes described in the section below. If the updates require changes in the design of the database schema or formats, a process for data migration will be provided upon request.
User Feedback & Issue Tracking

All components will be organized in GitHub repositories with guidelines for contribution, feedback, and issue submission to the issue tracker. For more information on the general policy around issue filing, tracking, and escalation, see the DataJoint Open-Source Contribute policy. For research groups that reach out to us, our team will work closely with them to collect feedback and resolve issues. Typically, issues will be prioritized based on their criticality and impact. If new feature requirements become apparent, this may trigger the creation of a separate workflow or a major revision of an existing workflow.
Project Selection Process

The project milestones are set annually by the team under the stewardship of the NIH programmatic staff and with the guidance of the project's Scientific Steering Group.
We have adopted the following general criteria for selecting and accepting new projects to be included in the Resource.
Open Precursor Projects
At least one open-source DataJoint-based precursor project must exist for any new experiment modality to be accepted for support as part of the Resource. The precursor project team must be open to interviews to describe in detail their process for the experiment workflow, tools, and interfaces.
The precursor projects must provide sample data for testing during development and for tutorials. The precursor projects will be acknowledged in the development of the component.
Rationale: This Resource does not aim to develop fundamentally new solutions for neurophysiology data acquisition and analysis. Rather, it seeks to systematize and disseminate existing open-source tools proven in leading research projects.
Impact
New components proposed for support in the project must be shown to be in demand by a substantial population of research groups, on the order of 100+ labs globally.
Sustainability
For all third-party tools or resources included in the proposed component, their long-term maintenance roadmap must be established. When possible, we will contact the developer team and work with them to establish a sustainability roadmap. If no such roadmap can be established, alternative tools and resources must be identified as replacements.
Aim
DataJoint Elements and The DANDI Archive (DANDI) are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.
Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and the DataJoint framework is the company DataJoint \u2014 https://datajoint.com.
Distributed Archives for Neurophysiology Data Integration (DANDI)

DANDI — https://dandiarchive.org — is an archive for neurophysiology data, providing neuroscientists with a common platform to share, archive, and process data. The project is funded by the NIH grant R24 MH117295 and led by Dr. Satrajit S. Ghosh and Dr. Yaroslav O. Halchenko.
The principal developers of DANDI are at the Massachusetts Institute of Technology, Dartmouth College, Catalyst Neuro, and Kitware.
General Principles

No obligation

The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them, but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.
Coordinated Development

The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and DANDI.
For 2022, the DataJoint Elements POC is Dr. Kabilar Gunalan (kabilar@datajoint.com)
For 2022, the DANDI POC is Dr. Satrajit Ghosh (satra@mit.edu)
Annual Review

To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.
Licensing

The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements and DANDI.
Development Roadmap

Aim
DataJoint Elements and Facemap are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.
Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and the DataJoint framework is the company DataJoint \u2014 https://datajoint.com.
Facemap

Facemap — https://github.com/MouseLand/facemap — is a pipeline for processing imaging data. The project is funded by HHMI Janelia Research Campus and led by Dr. Carsen Stringer and Atika Syeda.
The principal developers of Facemap are at the Janelia Research Campus.
General Principles

No obligation

The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them, but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.
Coordinated Development

The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and Facemap.
For 2022, the DataJoint Elements POC is Dr. Kabilar Gunalan (kabilar@datajoint.com)
For 2022, the Facemap POC is Dr. Carsen Stringer (stringerc@janelia.hhmi.org)
Annual Review

To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.
Licensing

The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements and Facemap.
Development Roadmap

If you use Facemap, please cite Stringer, Pachitariu, et al., Science 2019 in your publications.
INCF

DataJoint is a company member of the INCF.
Sustainability Roadmap between DataJoint Elements and Neurodata Without Borders

Aim
DataJoint Elements and Neurodata Without Borders (NWB) are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.
Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and the DataJoint framework is the company DataJoint \u2014 https://datajoint.com.
Neurodata Without Borders (NWB)

NWB — https://www.nwb.org — is a data standard for neurophysiology, providing neuroscientists with a common standard to share, archive, use, and build analysis tools for neurophysiology data. The project is funded by the NIH grant U24 NS120057 and led by Dr. Oliver Rubel (Lawrence Berkeley National Laboratory) and Dr. Benjamin Dichter (Catalyst Neuro).
The principal developers of NWB are the Lawrence Berkeley National Laboratory and Catalyst Neuro.
General Principles

No obligation

The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them, but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.
Coordinated Development

The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and NWB.
For 2022, the DataJoint Elements POC is Dr. Kabilar Gunalan (kabilar@datajoint.com)
For 2022, the NWB POC is Dr. Ryan Ly (Lawrence Berkeley National Laboratory)
Annual Review

To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.
Licensing

The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements workflows and NWB utilities.
Sustainability Roadmap between DataJoint Elements and Open Ephys GUI

Aim
DataJoint Elements and Open Ephys GUI are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.
Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint Core — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and DataJoint Core is the company DataJoint \u2014 https://datajoint.com.
Open Ephys GUI

Open Ephys GUI — https://open-ephys.org/gui — is an open-source, plugin-based application for processing, visualizing, and recording data from extracellular electrodes. The project is funded by the NIH grant U24 NS109043 and led by Dr. Josh Siegle.
The principal developers of the Open Ephys GUI are at the Allen Institute.
General Principles

No obligation

The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them, but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.
Coordinated Development

The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and Open Ephys GUI.
For 2023, the DataJoint Elements POC is Dr. Thinh Nguyen (thinh@datajoint.com).
For 2023, the Open Ephys GUI POC is Dr. Josh Siegle (joshs@alleninstitute.org).
Annual Review

To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.
Licensing

The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements and Open Ephys GUI.
Development Roadmap

If you use this package, please cite the Open Ephys paper in your publications.
Sustainability Roadmap between DataJoint Elements and Suite2p

Aim
DataJoint Elements and Suite2p are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.
Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and the DataJoint framework is the company DataJoint \u2014 https://datajoint.com.
","location":"partnerships/suite2p/#datajoint"},{"title":"Suite2p","text":"Suite2p \u2014 https://www.suite2p.org \u2014 is a pipeline for processing calcium imaging data. The project is funded by HHMI Janelia Research Campus and led by Dr. Carsen Stringer and Dr. Marius Pachitariu.
The principal developers of Suite2p are at the Janelia Research Campus.
","location":"partnerships/suite2p/#suite2p"},{"title":"General Principles","text":"","location":"partnerships/suite2p/#general-principles"},{"title":"No obligation","text":"The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.
","location":"partnerships/suite2p/#no-obligation"},{"title":"Coordinated Development","text":"The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and Suite2p.
For 2022, the DataJoint Elements POC is Dr. Kabilar Gunalan (kabilar@datajoint.com)
For 2022, the Suite2p POC is Dr. Carsen Stringer (stringerc@janelia.hhmi.org)
","location":"partnerships/suite2p/#points-of-contact"},{"title":"Annual Review","text":"To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.
","location":"partnerships/suite2p/#annual-review"},{"title":"Licensing","text":"The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements and Suite2p.
","location":"partnerships/suite2p/#licensing"},{"title":"Development Roadmap","text":"If you use Suite2p please cite Pachitariu et al., bioRxiv 2017 in your publications.
","location":"partnerships/suite2p/#citation"},{"title":"Project Showcase","text":"Catalog
Teams
Publications
The following publications relied on DataJoint open-source software for data analysis. If your work uses DataJoint or DataJoint Elements, please cite the respective manuscripts and RRIDs.
","location":"projects/publications/"},{"title":"2024","text":"DataJoint was originally developed by working systems neuroscientists at Baylor College of Medicine to meet the needs of their own research. Below is a partial list of known teams who use DataJoint.
","location":"projects/teams/#projects"},{"title":"Multi-lab collaboratives","text":"Open-source framework for defining, operating, and querying data pipelines
Learn more
Open-source implementation of data pipelines for neuroscience studies
Learn more
A cloud platform for automated analysis workflows. It relies on DataJoint Python and DataJoint Elements.
Learn more | Sign-in
Projects and research teams supported by DataJoint software
Learn more
A collection of additional open-source tools for building and operating scientific data pipelines.
","location":"additional-resources/"},{"title":"APIs","text":"A MATLAB client for defining, operating, and querying data pipelines.
Legacy docs | Source code
A REST API server for interacting with DataJoint pipelines.
Docs | Source code
A browser-based graphical user interface for data entry and navigation.
Legacy docs | Source code
A framework for making low-code web apps for data visualization.
Legacy docs | Source code
graph\n %% Give short names\n dj[\"datajoint/datajoint\"]\n base[\"datajoint/djbase\"]\n lab[\"datajoint/djlab\"]\n hub[\"datajoint/djlabhub\"]\n test[\"datajoint/djtest\"]\n conda3[\"datajoint/miniconda3\"]\n mysql[\"datajoint/mysql\"]\n %% Define connections\n conda3 --> base --> test;\n base --> dj;\n base --> lab --> hub;\n %% Add all to class\n class dj,base,lab,hub,test,conda3,mysql boxes;\n classDef boxes stroke:#333; %% Grey stroke for class
MySQL server configured to work with DataJoint.
Docker image | Source code
Minimal Python Docker image with conda.
Docker image | Legacy docs | Source code
Minimal base Docker image with DataJoint Python dependencies installed.
Docker image | Legacy docs | Source code
Docker image for running tests related to DataJoint Python.
Docker image | Legacy docs | Source code
Official DataJoint Docker image.
Docker image | Source code
Docker image optimized for running a JupyterLab environment with DataJoint Python.
Docker image | Legacy docs | Source code
Docker image optimized for deploying to JupyterHub a JupyterLab environment with DataJoint Python.
Docker image | Legacy docs | Source code
Find us at the following workshops and conferences!
linking_module
, and key
management in make
functions. If your work uses DataJoint Python, MATLAB, or Elements, please cite the respective manuscripts and Research Resource Identifiers (RRIDs).
","location":"about/citation/"},{"title":"DataJoint Python and MATLAB","text":"Thank you for your interest in contributing! \ud83e\udd1d
To keep everyone aligned and coordinated in the community effort, we\u2019ve created this document. It serves as the contribution guideline that outlines how open-source software development is to be conducted. Any software development that makes reference to this document can be assumed to adopt the policies outlined below. We\u2019ve structured the guideline in a FAQ (frequently asked questions) format to make it easier to digest. Feel free to review the questions below to find any specific policy.
The principal maintainer of DataJoint and associated tools is the DataJoint company. The pronouns \u201cwe\u201d and \u201cus\u201d in this guideline refer to the principal maintainers. We invite reviews and contributions of the open-source software. We compiled these guidelines to make this work clear and efficient.
","location":"about/contribute/"},{"title":"Feedback","text":"DataJoint APIs, DataJoint Web GUIs, and DataJoint Elements are supported by NIH grant U24 NS116470 for disseminating open-source software for neuroscience research. Your feedback is essential for continued funding. Your feedback also helps shape the technology development roadmap for the DataJoint ecosystem. Please tell us about your projects by filling out the DataJoint Census.
","location":"about/contribute/#feedback"},{"title":"1) Which issue should I contribute towards?","text":"There are three primary things to consider when looking to contribute.
Availability: An indication of whether anyone is currently working on a fix for the given issue. Availability is indicated by who is assigned
. Issues that are unassigned
have no one working on them yet and are available for someone to pick up. If an issue has been assigned, then any additional work on that issue should be coordinated with the assignee.
Specification: For issues to be properly addressed, the requirements for satisfying and closing the issue should be clear. If they are not, the issue will be labeled unspecified
. This could be because more debug info is necessary, more details on the intended behavior are needed, or further discussion is required to determine a good solution. Feel free to help us arrive at a proper specification.
Priority: As a community, we make a concerted effort to reach our milestones, which we use as a planning tool to focus a group of changes around a release. To determine an issue\u2019s priority, look at which milestone it is assigned to: the next milestone due is the highest priority, and each milestone after it is correspondingly lower. Bear in mind that, much like a hurricane forecast, the execution plan is more reliable near today\u2019s date than for milestones further out. Extremely low-priority issues are assigned to the Backburner
milestone. Since Backburner
has no target date, its issues may be deferred indefinitely. Occasionally the maintainers will move issues out of Backburner
when it makes sense to address them within a release. Also, issues unassigned
to a milestone are new issues that have not yet been triaged.
After considering the above, you may comment on the issue you\u2019d like to help fix and a maintainer will assign it to you.
","location":"about/contribute/#1-which-issue-should-i-contribute-towards"},{"title":"2) What is the proper etiquette for proposing changes as contribution?","text":"What is generally expected from new contributions are the following:
Any proposed changes should be introduced in the form of a pull request (PR) from the contributor's fork.
Proper branch target specified. Generally, the following branches can be targeted:
main
or master
: Represents the single source of truth and the latest in completed development. pre
: Represents the source at the point of the last stable release. For larger, more involved changes, a maintainer may determine it best to create a feature-specific branch and adjust the PR accordingly.
A summary description that describes the overall intent behind the PR.
Proper links to the issue(s) that the PR serves to resolve.
Newly introduced changes must pass any required checks. Typically as it relates to tests, this means:
Additional documentation to reflect new feature or behavior introduced.
Necessary updates to the changelog following the Keep a Changelog convention.
A contributor should not approve or merge their own PR.
Reviewer suggestions or feedback should not be directly committed to a branch on a contributor\u2019s fork. A less intrusive way to collaborate would be for the reviewer to PR to the contributor\u2019s fork/branch that is associated with the main PR currently in review.
Maintainers will also ensure that PRs have the appropriate assignment for reviewer, milestone, and project.
","location":"about/contribute/#2-what-is-the-proper-etiquette-for-proposing-changes-as-contribution"},{"title":"3) How can I track the progress of an issue that has been assigned?","text":"Since milestones represent the development plan, projects represent the actual execution. Projects are typically fixed-time sprints (1-2 weeks). A \u2018workable\u2019 number of issues that have been assigned to developers and assigned to the next milestone are selected and tracked in each project to provide greater granularity in the week-to-week progress. Automation is included observing the Automated kanban with reviews
template. Maintainers will adjust the project assignment to reflect the order in which to resolve the milestone issues.
Releases follow the standard definition of semantic versioning. Meaning:
MAJOR
. MINOR
. PATCH
MAJOR
version when you make incompatible API changes, MINOR
version when you add functionality in a backwards-compatible manner, and PATCH
version when you make backwards-compatible bug fixes. Each release requires tagging the commit appropriately; the release is then issued through the usual medium, e.g., PyPI, npm, Yarn, GitHub Releases, etc.
Minor releases are triggered when all the issues assigned to a milestone are resolved and closed. Patch releases are triggered periodically from main
or master
after a reasonable number of PR merges have come in.
To follow the appropriate process, please reference the following flow for your desired mode of engagement:
","location":"about/contribute/#5-i-am-not-yet-too-comfortable-contributing-but-would-like-to-engage-the-community-what-is-the-policy-on-community-engagement"},{"title":"5a) Generally, how do I perform ____?","text":"If the documentation does not provide clear enough instruction, please see StackOverflow posts related to the datajoint tag or ask a new question tagging it appropriately. You may refer to our datajoint tag wiki for more details on its proper use.
","location":"about/contribute/#5a-generally-how-do-i-perform-____"},{"title":"5b) I just encountered this error, how can I resolve it?","text":"Please see StackOverflow posts related to the datajoint tag or ask a new question tagging it appropriately. You may refer to our datajoint tag wiki for more details on its proper use.
","location":"about/contribute/#5b-i-just-encountered-this-error-how-can-i-resolve-it"},{"title":"5c) I just encountered this error and I am sure it is a bug, how do I report it?","text":"Please file it under the issue tracker associated with the open-source software.
","location":"about/contribute/#5c-i-just-encountered-this-error-and-i-am-sure-it-is-a-bug-how-do-i-report-it"},{"title":"5d) I have an idea or new feature request, how do I submit it?","text":"Please file it under the issue tracker associated with the open-source software.
","location":"about/contribute/#5d-i-have-an-idea-or-new-feature-request-how-do-i-submit-it"},{"title":"5e) I am curious why the maintainers choose to ____? i.e. questions that are \u2018opinionated\u2019 in nature with answers that some might disagree.","text":"Please join the community on the DataJoint Slack and ask on the most relevant channel. There, you may engage directly with the maintainers for proper discourse.
","location":"about/contribute/#5e-i-am-curious-why-the-maintainers-choose-to-____-ie-questions-that-are-opinionated-in-nature-with-answers-that-some-might-disagree"},{"title":"5f) What is the timeline or roadmap for the release of certain supported features?","text":"Please refer to milestones and projects associated with the open-source software.
","location":"about/contribute/#5f-what-is-the-timeline-or-roadmap-for-the-release-of-certain-supported-features"},{"title":"5g) I need urgent help best suited for live debugging, how can I reach out directly?","text":"Please join the community on the DataJoint Slack and ask on the most relevant channel. Please bear in mind that as open-source community software, availability of the maintainers might be limited.
","location":"about/contribute/#5g-i-need-urgent-help-best-suited-for-live-debugging-how-can-i-reach-out-directly"},{"title":"Team","text":"The project is performed by DataJoint with Dimitri Yatsenko as Principal Investigator.
","location":"about/datajoint-team/"},{"title":"Scientists","text":"The first-person pronouns \"we\" and \"our\" in these documents refer to those listed above.
","location":"about/datajoint-team/#past-contributors"},{"title":"External contributors","text":"The principal components of the Resource are developed and distributed as open-source projects and external contributions are welcome. We have adopted a Contribution Guide for DataJoint, DataJoint Elements, and related open-source tools.
","location":"about/datajoint-team/#external-contributors"},{"title":"History","text":"Dimitri Yatsenko began development of DataJoint in Andreas S.Tolias' lab in the Neuroscience Department at Baylor College of Medicine in the fall of 2009. Initially implemented as a thin MySQL API in MATLAB, it defined the major principles of the DataJoint model.
Many students and postdocs in the lab as well as collaborators and early adopters have contributed to the project. Jacob Reimer and Emmanouil Froudarakis became early adopters in Andreas Tolias' Lab and propelled development. Alexander S. Ecker, Philipp Berens, Andreas Hoenselaar, and R. James Cotton contributed to the formulation of the overall requirements for the data model and critical reviews of DataJoint development.
Outside the Tolias lab, the first labs to adopt DataJoint (approx. 2010) were those of Athanassios G. Siapas at Caltech and of Laura Busse and Steffen Katzner at the University of T\u00fcbingen.
In 2015, the Python implementation gained momentum with Edgar Y. Walker and Fabian Sinz joining as principal contributors.
In 2016, Andreas Tolias Lab joined the MICrONS project, using DataJoint to process volumes of neurophysiology and neuroanatomical data shared across large teams.
In 2016, Vathes LLC was founded to provide support to groups using DataJoint.
In 2017, DARPA awarded a small-business innovation research grant to Vathes LLC (Contract D17PC00162) to further develop and publicize the DataJoint framework.
In June 2018, the Princeton Neuroscience Institute, under the leadership of Prof. Carlos Brody, began funding a project to generate a detailed DataJoint user manual.
","location":"about/history/"},{"title":"DataJoint Elements for Neurophysiology","text":"DataJoint Elements provides an efficient approach for neuroscience labs to create and manage scientific data workflows: the complex multi-step methods for data collection, preparation, processing, analysis, and modeling that researchers must perform in the course of an experimental study. Elements are a collection of curated modules for assembling workflows for several modalities of neurophysiology experiments and are designed for ease of integration into diverse custom workflows. This work is derived from the developments in leading neuroscience projects and uses the DataJoint API for defining, deploying, and sharing their data workflows.
An overview of the principles of DataJoint workflows and the goals of DataJoint Elements are described in the position paper \"DataJoint Elements: Data Workflows for Neurophysiology\".
Below are the projects that make up the family of open-source DataJoint Elements:
A data pipeline for calcium imaging microscopy.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for Neuropixels probes.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for electrode localization of Neuropixels probes.
Docs
A data pipeline for miniscope calcium imaging.
Interactive tutorial
Docs
A data pipeline for segmenting volumetric microscopy data with Cellpose, uploading to BossDB, and visualizing with Neuroglancer.
Interactive tutorial
Docs
A data pipeline for pose estimation with DeepLabCut.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for motion sequencing with Keypoint-MoSeq.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for pose estimation with Facemap.
Docs
A data pipeline for managing data from optogenetics experiments.
Interactive tutorial on GitHub Codespaces
Docs
A data pipeline for visual stimulation with Psychtoolbox.
Docs
A data pipeline for lab management.
Docs
A data pipeline for subject management.
Docs
A data pipeline for session management.
Docs
A data pipeline for event- and trial-based experiments.
Docs
Common functions for the DataJoint Elements.
Docs
The following conventions describe the DataJoint Python API implementation.
","location":"elements/concepts/"},{"title":"DataJoint Schemas","text":"The DataJoint Python API allows creating database schemas, which are namespaces for collections of related tables.
The following commands declare a new schema and create the object named schema
to reference the database schema.
import datajoint as dj\nschema = dj.schema('<schema_name>')\n
We follow the convention of having only one schema defined per Python module. Such a module then constitutes a DataJoint schema: a Python module paired with a corresponding database schema.
The module's schema
object is then used as the decorator for classes that define tables in the database.
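Schematically, such a decorator registers each decorated class with its schema object. The toy sketch below mimics that registration pattern in plain Python; it illustrates the mechanism only and is not the actual DataJoint implementation (`Schema` and `Session` are hypothetical names):

```python
class Schema:
    """Toy stand-in for dj.schema: collects the table classes declared under it."""

    def __init__(self, name):
        self.name = name
        self.tables = {}

    def __call__(self, cls):
        # Used as a decorator: record the class, then return it unchanged.
        self.tables[cls.__name__] = cls
        return cls

schema = Schema("my_schema")

@schema
class Session:
    # In DataJoint, the definition string declares the table's attributes.
    definition = """
    session_id : int
    ---
    session_note : varchar(255)
    """

print(sorted(schema.tables))  # the schema object now knows about Session
```

In real DataJoint code, the decorator additionally declares the table in the database; the registration step shown here is what ties each class to its schema namespace.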
dj.createSchema\n
In MATLAB, we list one table per file and place schemas in folders.
","location":"elements/concepts/#datajoint-schemas"},{"title":"Elements","text":"An Element is a software package defining one or more DataJoint schemas serving a particular purpose. By convention, such packages are hosted in individual GitHub repositories. For example, Element element_calcium_imaging
is hosted at this GitHub repository and contains two DataJoint schemas: scan
and imaging
.
The following YouTube videos provide information on basic design principles and file organization.
Some videos feature outdated versions of the respective GitHub repositories. For the most updated information, check the documentation page for the corresponding Element.
","location":"elements/concepts/#youtube-tutorials"},{"title":"Deferred schemas","text":"A deferred schema is one in which the name of the database schema name is not specified. This module does not declare schema and tables upon import. Instead, they are declared by calling schema.activate('<schema_name>')
after import.
By convention, all modules corresponding to deferred schema must declare the function activate
which in turn calls schema.activate
.
Thus, Element modules begin with:
import datajoint as dj\nschema = dj.schema()\n\ndef activate(schema_name):\n    schema.activate(schema_name)\n
However, many activate functions perform other work associated with activating the schema such as activating other schemas upstream.
","location":"elements/concepts/#deferred-schemas"},{"title":"Linking Module","text":"To make the code more modular with fewer dependencies, Element modules do not import
upstream schemas directly. Instead, all required classes and functions must be defined in a linking_module
and passed to the module's activate
function. By keeping all upstream requirements in the linking module, all Elements can be activated as part of any larger pipeline.
For instance, the Scan module receives its required functions from the linking module passed into the module's activate
function. See the example notebooks for how the linking module is passed into the Element's module.
workflow-array-ephys
) and their dependencies (e.g., element-array-ephys
). Change to the directory where you keep projects (e.g., cd ~/Projects
). See requirements.txt
in the workflow for the list of Elements to clone and install as editable. You will also need to install element-interface
deps=(\"lab\" \"animal\" \"session\" \"interface\" \"<others>\")\nfor repo in $deps # clone each\ndo \n git clone https://github.com/datajoint/element-$repo\ndone\nfor repo in $(ls -d ./{element,workflow}*) # editable install \ndo \n pip install -e ./$repo\ndone\n
If you need to drop all schemas to start fresh, you'll need to do so following the dependency order. Refer to the workflow's notebook (notebooks/06-drop-optional.ipynb
) for the drop order.
/tmp/testset
). Create a .env
file within the docker
directory with the following content. Replace /tmp/testset
with the directory where you have the test dataset downloaded. TEST_DATA_DIR=/tmp/testset
element
or your fork of an element
or the workflow
, within the Dockerfile
uncomment the lines from the different options presented. This will allow you to install the repositories of interest and run the integration tests on those packages. Be sure that the element
package version matches the version in the requirements.txt
of the workflow
.
docker-compose -f ./docker/docker-compose-test.yaml up --build\n
The following document describes how to set up a development environment and connect to a database so that you can use the DataJoint Elements to build and run a workflow on your local machine.
Any of the DataJoint Elements can be combined together to create a workflow that matches your experimental setup. We have a number of example workflows to get you started. Each focuses on a specific modality, but they can be adapted for your custom workflow.
Getting up and running will require a few items for a good development environment. If any of these items are already familiar to you and installed on your machine, you can skip the corresponding section.
1. Python
2. Conda
3. Integrated Development Environment
4. Version Control (git)
5. Visualization packages
Next, you'll need to download one of the example workflows and corresponding example data.
Finally, there are a few different approaches to connecting to a database. Here, we highlight three:
1. First Time: Beginner. Temporary storage to learn the ropes.
2. Local Database: Intermediate. Deployed on local hardware, managed by you.
3. Central Database: Advanced. Deployed on dedicated hardware.
This diagram describes the general components for a local DataJoint environment.
flowchart LR\n py_interp -->|DataJoint| db_server[(\"Database Server\\n(e.g., MySQL)\")]\n subgraph conda[\"Conda environment\"]\n direction TB\n py_interp[Python Interpreter]\n end\n subgraph empty1[\" \"] %% Empty subgraphs prevent overlapping titles\n direction TB\n style empty1 fill:none, stroke-dasharray: 0 1\n conda\n end\n subgraph term[\"Terminal or Jupyter Notebook\"]\n direction TB\n empty1\n end\n subgraph empty2[\" \"] %% Empty subgraphs prevent overlapping titles\n direction TB\n style empty2 fill:none, stroke-dasharray: 0 1\n term\n end\n class py_interp,conda,term,ide,db_server,DataJoint boxes;\n classDef boxes fill:#ddd, stroke:#333;
","location":"elements/user-guide/#development-environment"},{"title":"Python","text":"DataJoint Elements are written in Python. The DataJoint Python API supports Python versions 3.7 and up. We recommend downloading the latest stable release of 3.9 here, and following the install instructions.
","location":"elements/user-guide/#python"},{"title":"Conda","text":"Python projects each rely on different dependencies, which may conflict across projects. We recommend working in a Conda environment for each project to isolate the dependencies. For more information on why Conda, and setting up the version of Conda that best suits your needs, see this article.
To get going quickly, we recommend you ...
Download Miniconda and go through the setup, including adding Miniconda to your PATH
(full instructions here).
Declare and initialize a new conda environment with the following commands. Edit <name>
to reflect your project.
conda create --name datajoint-workflow-<name> python=3.9 \nconda activate datajoint-workflow-<name> \n
Running analyses with Element DeepLabCut or Element Calcium imaging may require tensorflow, which can cause issues on M1 machines. By saving the yaml
file below, this environment can be loaded with conda create -f my-file.yaml
. If you encounter errors related to clang
, try launching xcode and retrying.
name: dj-workflow-<name>\nchannels:\n - apple \n - conda-forge\n - defaults\ndependencies:\n - tensorflow-deps\n - opencv\n - python=3.9\n - pip>=19.0 \n - pip:\n - tensorflow-macos\n - tensorflow-metal\n - datajoint\n
","location":"elements/user-guide/#conda"},{"title":"Integrated Development Environment (IDE)","text":"Development and use can be done with a plain text editor in the terminal. However, an integrated development environment (IDE) can improve your experience. Several IDEs are available. We recommend Microsoft's Visual Studio Code, also called VS Code. To set up VS Code with Python for the first time, follow this tutorial.
","location":"elements/user-guide/#integrated-development-environment-ide"},{"title":"Version Control (git)","text":"Table definitions and analysis code can change over time, especially with multiple collaborators working on the same project. Git is an open-source, distributed version control system that helps keep track of what changes where made when, and by whom. GitHub is a platform that hosts projects managed with git. The example DataJoint Workflows are hosted on GitHub, we will use git to clone (i.e., download) this repository.
git --version
in a terminal window.To run the demo notebooks and generate visualizations associated with an example workflow, you'll need a couple extra packages.
Jupyter Notebooks help structure code (see here for full instructions on Jupyter within VS Code).
Install Jupyter packages
conda install jupyter ipykernel nb_conda_kernels\n
Ensure your VS Code python interpreter is set to your Conda environment path.
Click to expand more details.
DataJoint Diagrams rely on additional packages. To install these packages, enter the following command...
conda install graphviz python-graphviz pydotplus\n
","location":"elements/user-guide/#visualization-packages-jupyter-notebooks-datajoint-diagrams"},{"title":"Example Config, Workflows and Data","text":"Of the options below, pick the workflow that best matches your needs.
Change the directory to where you want to download the workflow.
cd ~/Projects\n
Clone the relevant repository, and change directories to this new directory.
git clone https://github.com/datajoint/<repository>\ncd <repository>\n
Install this directory as editable with the -e
flag.
pip install -e .\n
Why editable? Click for details This lets you modify the code after installation and experiment with different designs or adding additional tables. You may wish to edit pipeline.py
or paths.py
to better suit your needs. If no modification is required, using pip install .
is sufficient. Install element-interface
, which has utilities used across different Elements and Workflows.
pip install \"element-interface @ git+https://github.com/datajoint/element-interface\"\n
\u200bSet up a local DataJoint config file by saving the following block as a json in your workflow directory as dj_local_conf.json
. Not sure what to put for the < >
values below? We'll cover this when we connect to the database
{\n \"database.host\": \"<hostname>\",\n \"database.user\": \"<username>\",\n \"database.password\": \"<password>\",\n \"loglevel\": \"INFO\",\n \"safemode\": true,\n \"display.limit\": 7,\n \"display.width\": 14,\n \"display.show_tuple_count\": true,\n \"custom\": {\n \"database.prefix\": \"<username_>\"\n }\n}\n
An example workflow for session management.
Clone from GitHub
An example workflow for Neuropixels probes.
Clone from GitHub
An example workflow for calcium imaging microscopy.
Clone from GitHub
An example workflow for miniscope calcium imaging.
Clone from GitHub
An example workflow for pose estimation with DeepLabCut.
Clone from GitHub
The first notebook in each workflow will guide you through downloading example data from DataJoint's AWS storage archive. You can also process your own data. To use the example data, you would ...
Install djarchive-client
pip install git+https://github.com/datajoint/djarchive-client.git\n
Use a python terminal to import the djarchive
client and view available datasets, and revisions.
import djarchive_client\nclient = djarchive_client.client()\nlist(client.datasets()) # List available datasets, select one\nlist(client.revisions()) # List available revisions, select one\n
Prepare a directory to store the download data, for example in /tmp
, then download the data with the djarchive
client. This may take some time with larger datasets.
import os\nos.makedirs('/tmp/example_data/', exist_ok=True)\nclient.download(\n '<workflow-dataset>',\n target_directory='/tmp/example_data',\n revision='<revision>'\n)\n
The example subject6/session1
data was recorded with SpikeGLX and processed with Kilosort2.
/tmp/example_data/\n- subject6\n- session1\n - towersTask_g0_imec0\n - towersTask_g0_t0_nidq.meta\n - towersTask_g0_t0.nidq.bin\n
Element and Workflow Array Ephys also support data recorded with Open Ephys. Calcium Imaging: Click to expand details The example subject3
data was recorded with Scanbox. The example subject7
data was recorded with ScanImage. Both datasets were processed with Suite2p.
/tmp/example_data/\n- subject3/\n - 210107_run00_orientation_8dir/\n - run00_orientation_8dir_000_000.sbx\n - run00_orientation_8dir_000_000.mat\n - suite2p/\n - combined\n - plane0\n - plane1\n - plane2\n - plane3\n- subject7/\n - session1\n - suite2p\n - plane0\n
Element and Workflow Calcium Imaging also support data collected with ... - Nikon - Prairie View - CaImAn DeepLabCut: Click to expand details The example data includes both training data and pretrained models.
/tmp/test_data/from_top_tracking/\n- config.yml\n- dlc-models/iteration-0/from_top_trackingFeb23-trainset95shuffle1/\n - test/pose_cfg.yaml\n - train/\n - checkpoint\n - checkpoint_orig\n \u2500 learning_stats.csv\n \u2500 log.txt\n \u2500 pose_cfg.yaml\n \u2500 snapshot-10300.data-00000-of-00001\n \u2500 snapshot-10300.index\n \u2500 snapshot-10300.meta # same for 103000\n- labeled-data/\n - train1/\n - CollectedData_DJ.csv\n - CollectedData_DJ.h5\n - img00674.png # and others\n - train2/ # similar to above\n- videos/\n - test.mp4\n - train1.mp4\n
Facemap

The associated workflow is still under development.
Some of the workflows make assumptions about how your file directories are organized and how some files are named.
Array Ephys

In your DataJoint config, add another item under custom, ephys_root_data_dir, for your local root data directory. This can include multiple roots.

"custom": {
    "database.prefix": "<username_>",
    "ephys_root_data_dir": ["/local/root/dir1", "/local/root/dir2"]
}
- subject directory names must match the subject IDs in your subjects table. The ingest.py script (demo ingestion notebook) can help load these values from ./user_data/subjects.csv.
- session directories can have any naming convention, but must be specified in the session table (see also the demo ingestion notebook).
- probe directory names must end in a one-digit number corresponding to the probe number. Each probe directory should contain:
  - One Neuropixels meta file named *[0-9].ap.meta
  - Optionally, one Kilosort output folder

Folder structure:
<ephys_root_data_dir>/
└───<subject1>/                 # Subject name in `subjects.csv`
│   └───<session0>/             # Session directory in `sessions.csv`
│   │   └───imec0/
│   │   │   │   *imec0.ap.meta
│   │   │   └───ksdir/
│   │   │       spike_times.npy
│   │   │       templates.npy
│   │   │       ...
│   │   └───imec1/
│   │       │   *imec1.ap.meta
│   │       └───ksdir/
│   │           spike_times.npy
│   │           templates.npy
│   │           ...
│   └───<session1>/
│       ...
└───<subject2>/
    ...
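Before ingestion, it can help to sanity-check a session directory against these conventions. A hypothetical checker, not part of the workflow itself:

```python
from pathlib import Path

def check_probe_dirs(session_dir):
    """For each probe subdirectory of a session, report whether it
    contains exactly one SpikeGLX meta file matching *[0-9].ap.meta."""
    report = {}
    for probe_dir in sorted(Path(session_dir).iterdir()):
        if probe_dir.is_dir():
            meta_files = list(probe_dir.glob("*[0-9].ap.meta"))
            report[probe_dir.name] = len(meta_files) == 1
    return report
```

A probe folder that maps to False is either missing its meta file or holds more than one, both of which would trip up ingestion.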
Calcium Imaging

Note: While Element Calcium Imaging can accommodate multiple scans per session, Workflow Calcium Imaging assumes there is only one scan per session.
In your DataJoint config, add another item under custom, imaging_root_data_dir, for your local root data directory.

"custom": {
    "database.prefix": "<username_>",
    "imaging_root_data_dir": "/local/root/dir1"
}
- subject directory names must match the subject IDs in your subjects table. The ingest.py script (tutorial notebook) can help load these values from ./user_data/subjects.csv.
- session directories can have any naming convention, but must be specified in the session table (see also the tutorial notebook: https://github.com/datajoint/element-calcium-imaging/blob/main/notebooks/tutorial.ipynb).
- Each session directory should contain:
  - All .tif or .sbx files for the scan, with any naming convention.
  - One suite2p subfolder, containing the analysis outputs in the default naming convention.
  - One caiman subfolder, containing the analysis output .hdf5 file, with any naming convention.

Folder structure:
imaging_root_data_dir/
└───<subject1>/                 # Subject name in `subjects.csv`
│   └───<session0>/             # Session directory in `sessions.csv`
│   │   │   scan_0001.tif
│   │   │   scan_0002.tif
│   │   │   scan_0003.tif
│   │   │   ...
│   │   └───suite2p/
│   │   │   │   ops1.npy
│   │   │   └───plane0/
│   │   │   │   ops.npy
│   │   │   │   spks.npy
│   │   │   │   stat.npy
│   │   │   │   ...
│   │   │   └───plane1/
│   │   │       ops.npy
│   │   │       spks.npy
│   │   │       stat.npy
│   │   │       ...
│   │   └───caiman/
│   │       analysis_results.hdf5
│   └───<session1>/             # Session directory in `sessions.csv`
│       scan_0001.tif
│       scan_0002.tif
│       ...
└───<subject2>/                 # Subject name in `subjects.csv`
    ...
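A short sketch can locate the processing outputs for a session under this layout. This is a hypothetical helper for pre-flight checks; the workflow's own loaders handle output discovery internally:

```python
from pathlib import Path

def find_processing_outputs(session_dir):
    """Return (suite2p_dir, caiman_file): the suite2p folder if it
    holds a plane0/ops.npy, and the first .hdf5 file in a caiman
    folder. Either element is None when that output is absent."""
    session_dir = Path(session_dir)
    suite2p_dir = session_dir / "suite2p"
    if not (suite2p_dir / "plane0" / "ops.npy").exists():
        suite2p_dir = None
    caiman_dir = session_dir / "caiman"
    caiman_files = sorted(caiman_dir.glob("*.hdf5")) if caiman_dir.is_dir() else []
    caiman_file = caiman_files[0] if caiman_files else None
    return suite2p_dir, caiman_file
```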
DeepLabCut

Note: Element DeepLabCut assumes you've already used the DeepLabCut GUI to set up your project and label your data.

In your DataJoint config, add another item under custom, dlc_root_data_dir, for your local root data directory. This can include multiple roots.

"custom": {
    "database.prefix": "<username_>",
    "dlc_root_data_dir": ["/local/root/dir1", "/local/root/dir2"]
}
- Ensure that the project's yaml files reflect the current folder structure.
- Ensure the pickle and mat training files are present. If not, follow the DeepLabCut guide to create a training dataset.

Folder structure:
/dlc_root_data_dir/your_project/
- config.yaml                 # Including correct path information
- dlc-models/iteration-*/your_project_date-trainset*shuffle*/
  - test/pose_cfg.yaml        # Including correct path information
  - train/pose_cfg.yaml       # Including correct path information
- labeled-data/any_names/*{csv,h5,png}
- training-datasets/iteration-*/UnaugmentedDataSet_your_project_date/
  - your_project_*shuffle*.pickle
  - your_project_scorer*shuffle*.mat
- videos/any_names.mp4
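Snapshots in the train/ folder are numbered by training iteration, and downstream steps typically evaluate the latest one. A small sketch to find it (hypothetical helper, not part of the Element):

```python
import re
from pathlib import Path

def latest_snapshot(train_dir):
    """Return the highest iteration number among snapshot-*.index
    files in a DLC train/ folder, or None if none are present."""
    iterations = []
    for f in Path(train_dir).glob("snapshot-*.index"):
        match = re.fullmatch(r"snapshot-(\d+)\.index", f.name)
        if match:
            iterations.append(int(match.group(1)))
    return max(iterations) if iterations else None
```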
Miniscope

In your DataJoint config, add another item under custom, miniscope_root_data_dir, for your local root data directory.

"custom": {
    "database.prefix": "<username_>",
    "miniscope_root_data_dir": "/local/root/dir"
}
DataJoint helps you connect to a database server from your programming environment (e.g., Python or MATLAB), granting a number of benefits over traditional file hierarchies (see the YouTube explainer). We offer two options:
Temporary storage. Not for production use.

Create a json file called dj_local_conf.json using your DataJoint account information and tutorial-db.datajoint.io as the host.

{
    "database.host": "tutorial-db.datajoint.io",
    "database.user": "<datajoint-username>",
    "database.password": "<datajoint-password>",
    "loglevel": "INFO",
    "safemode": true,
    "display.limit": 7,
    "display.width": 14,
    "display.show_tuple_count": true,
    "custom": {
        "database.prefix": "<datajoint-username_>"
    }
}
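Rather than editing the file by hand, it can also be generated with Python's standard json module. A minimal sketch; keep the <...> placeholders until you substitute your own credentials:

```python
import json

# Write a starter dj_local_conf.json; the <...> placeholders must be
# replaced with your actual DataJoint account details.
config = {
    "database.host": "tutorial-db.datajoint.io",
    "database.user": "<datajoint-username>",
    "database.password": "<datajoint-password>",
    "loglevel": "INFO",
    "safemode": True,
    "display.limit": 7,
    "display.width": 14,
    "display.show_tuple_count": True,
    "custom": {"database.prefix": "<datajoint-username_>"},
}
with open("dj_local_conf.json", "w") as f:
    json.dump(config, f, indent=2)
```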
Note: Your database prefix must begin with your username in order to have permission to declare new tables.

Install Docker. Why Docker? Docker makes it easy to package a program, including the file system and related code libraries, in a container. This container can be distributed to any machine, both automating and standardizing the setup process.
Test that Docker has been installed by running the following command:

docker run --rm hello-world

Next, launch the DataJoint MySQL server with the following command:

docker run -p 3306:3306 -e MYSQL_ROOT_PASSWORD=tutorial datajoint/mysql
What's this doing? The command above starts a MySQL server inside a Docker container, maps port 3306 to your local machine, and sets the root password to tutorial.

Next, create a json file called dj_local_conf.json using the following details. The prefix can be set to any value.

{
    "database.host": "localhost",
    "database.password": "tutorial",
    "database.user": "root",
    "database.port": 3306,
    "loglevel": "INFO",
    "safemode": true,
    "display.limit": 7,
    "display.width": 14,
    "display.show_tuple_count": true,
    "custom": {
        "database.prefix": "neuro_"
    }
}
This document applies to all example workflows. Many include a docker folder used by developers to set up both a database and a local environment for integration tests. Simply run docker compose up with the relevant file and docker exec into the relevant container.
A database on dedicated hardware may require expertise to set up and maintain. DataJoint's MySQL Docker image project provides all the information required to set up a dedicated database.

Interacting with the Workflow

In Python

Connect to the database and import tables:
from <relevant-workflow>.pipeline import *
View the declared tables. For a more in-depth explanation of how to run the workflow and explore the data, refer to the Jupyter notebooks in the workflow directory.

Array Ephys
subject.Subject()
session.Session()
ephys.ProbeInsertion()
ephys.EphysRecording()
ephys.Clustering()
ephys.Clustering.Unit()
Calcium Imaging

subject.Subject()
session.Session()
scan.Scan()
scan.ScanInfo()
imaging.ProcessingParamSet()
imaging.ProcessingTask()
DeepLabCut

subject.Subject()
session.Session()
train.TrainingTask()
model.VideoRecording.File()
model.Model()
model.PoseEstimation.BodyPartPosition()
DataJoint LabBook is a graphical user interface that facilitates data entry into existing DataJoint tables. If the database is hosted publicly (e.g., tutorial-db) and you have access, you can view its contents there.

You have several options for adopting DataJoint workflows for your own experiments.

Adopt independently

DataJoint Elements are designed for adoption by independent users with moderate software development skills, a good understanding of DataJoint principles, and adequate IT expertise or support.
If you have not yet used DataJoint, we recommend completing our online training tutorials or attending a workshop, either online or in person. Interactive tutorials can be found in the DataJoint Tutorials repository.

Support from DataJoint

Our team provides support to labs adopting DataJoint workflows in their research.
This includes:
These services may be subsidized by grant funding for qualified research groups.

Dissemination Plan

1. Dissemination

We conduct activities to disseminate Resource components for adoption in diverse neuroscience labs. These activities include:
In order to measure the effectiveness of the Resource, we conduct several activities to estimate the adoption and use of the Resource:
This Resource is supported by the National Institute Of Neurological Disorders And Stroke of the National Institutes of Health under Award Number U24NS116470. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Scientific Steering Group

Project oversight and guidance are provided by the Scientific Steering Group, comprising:
Broad engagement with the neuroscience community is necessary for the optimization, integration, and adoption of the Resource components.
We conduct five types of outreach activities that require different approaches:

1. Precursor Projects

Our Selection Process requires a "Precursor Project" for any new experiment modality to be included in DataJoint Elements. A precursor project is a project that develops a DataJoint pipeline for its own experiments, either independently or in collaboration with our team. We reach out to teams who develop DataJoint pipelines for new experiment paradigms and modalities to identify essential design motifs, analysis tools, and related tools and interfaces. We interview the core team to learn about their collaborative culture, practices, and procedures. We jointly review their open-source code and their plans for dissemination. In many cases, our team already collaborates with such teams through our other projects and we have a good understanding of their process. As we develop a new Element to support the new modality, we remain in contact with the team to include their contribution, solicit feedback, and evaluate design tradeoffs. When the new Element is released, full attribution is given to the Precursor Project.
Rationale: The Resource does not aim to develop fundamentally new solutions for neurophysiology data acquisition and analysis. Rather, it aims to systematize and disseminate existing open-source tools proven in leading research projects.

2. Tool Developers

DataJoint pipelines rely on analysis tools, atlases, data standards, archives and catalogs, and other neuroinformatics resources developed and maintained by the broader scientific community. To ensure the sustainability of the Resource, we reach out to tool developers to establish joint sustainability roadmaps.

Management Plan

DataJoint Elements has established a Resource Management Plan to select projects for development, to assure quality, and to disseminate its output, as summarized in the figure below:
The following sections provide detailed information.
DataJoint and DataJoint Elements serve as a framework and starting point for numerous new projects, setting the standard of quality for data architecture and software design. To ensure high quality, the following policies have been adopted into the software development lifecycle (SDLC).

Coding Standards

When writing code, the following principles should be observed.
All components and their revisions must include appropriate automated software testing to be considered for release. The core framework must undergo thorough performance evaluation and comprehensive integration testing.
Generally, this includes tests related to:
When introducing new code to the code base, the following will be required for acceptance by the DataJoint core team into the main code repository. Changes are submitted as pull requests against the main branch once ready for review.

Upon satisfactory adherence to the above Coding Standards, Automated Testing, and Code Reviews:

- The release is assigned a Major.Minor.Patch version number.
- The release is merged into the main branch.
branch.For external research teams that reach out to us, we will provide engineering support to help users adopt the updated software, collect feedback, and resolve issues following the processes described in the section below. If the updates require changes in the design of the database schema or formats, a process for data migration will be provided upon request.
","location":"elements/management/quality-assurance/#release-process"},{"title":"User Feedback & Issue Tracking","text":"All components will be organized in GitHub repositories with guidelines for contribution, feedback, and issue submission to the issue tracker. For more information on the general policy around issue filing, tracking, and escalation, see the DataJoint Open-Source Contribute policy. For research groups that reach out to us, our team will work closely to collect feedback and resolve issues. Typically issues will be prioritized based on their criticality and impact. If new feature requirements become apparent, this may trigger the creation of a separate workflow or a major revision of an existing workflow.
","location":"elements/management/quality-assurance/#user-feedback-issue-tracking"},{"title":"Project Selection Process","text":"The project milestones are set annually by the team under the stewardship of the NIH programmatic staff and with the guidance of the project's Scientific Steering Group
We have adopted the following general criteria for selecting and accepting new projects to be included in the Resource.
Open Precursor Projects
At least one open-source DataJoint-based precursor project must exist for any new experiment modality to be accepted for support as part of the Resource. The precursor project team must be open to interviews to describe in detail their process for the experiment workflow, tools, and interfaces.
The precursor projects must provide sample data for testing during development and for tutorials. The precursor projects will be acknowledged in the development of the component.
Rationale: This Resource does not aim to develop fundamentally new solutions for neurophysiology data acquisition and analysis. Rather it seeks to systematize and disseminate existing open-source tools proven in leading research projects.
Impact
New components proposed for support in the project must be shown to be in demand by a substantial population of research groups, on the order of 100+ labs globally.
Sustainability
For all third-party tools or resources included in the proposed component, a long-term maintenance roadmap must be established. When possible, we will contact the developer team and work with them to establish a sustainability roadmap. If no such roadmap can be established, alternative tools and resources must be identified as replacements.

Aim
DataJoint Elements and The DANDI Archive (DANDI) are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.

Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and the DataJoint framework is the company DataJoint — https://datajoint.com.

Distributed Archives for Neurophysiology Data Integration (DANDI)

DANDI — https://dandiarchive.org — is an archive for neurophysiology data, providing neuroscientists with a common platform to share, archive, and process data. The project is funded by the NIH grant R24 MH117295 and led by Dr. Satrajit S. Ghosh and Dr. Yaroslav O. Halchenko.
The principal developers of DANDI are at the Massachusetts Institute of Technology, Dartmouth College, Catalyst Neuro, and Kitware.

General Principles

No obligation

The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them, but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.

Coordinated Development

The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and DANDI.

Points of Contact

For 2022, the DataJoint Elements POC is Dr. Kushal Bakshi (kushal@datajoint.com).
For 2022, the DANDI POC is Dr. Satrajit Ghosh (satra@mit.edu).

Annual Review

To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.

Licensing

The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements and DANDI.

Development Roadmap

Aim
DataJoint Elements and Facemap are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.

Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and the DataJoint framework is the company DataJoint — https://datajoint.com.

Facemap

Facemap — https://github.com/MouseLand/facemap — is a pipeline for processing imaging data. The project is funded by HHMI Janelia Research Campus and led by Dr. Carsen Stringer and Atika Syeda.
The principal developers of Facemap are at the Janelia Research Campus.

General Principles

No obligation

The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them, but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.

Coordinated Development

The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and Facemap.

Points of Contact

For 2022, the DataJoint Elements POC is Dr. Kushal Bakshi (kushal@datajoint.com).
For 2022, the Facemap POC is Dr. Carsen Stringer (stringerc@janelia.hhmi.org).

Annual Review

To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.

Licensing

The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements and Facemap.

Development Roadmap

If you use Facemap, please cite Stringer, Pachitariu, et al., Science 2019 in your publications.

INCF

DataJoint is a company member of the INCF.

Sustainability Roadmap between DataJoint Elements and Neurodata Without Borders

Aim
DataJoint Elements and Neurodata Without Borders (NWB) are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.

Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and the DataJoint framework is the company DataJoint — https://datajoint.com.

Neurodata Without Borders (NWB)

NWB — https://www.nwb.org — is a data standard for neurophysiology, providing neuroscientists with a common standard to share, archive, use, and build analysis tools for neurophysiology data. The project is funded by the NIH grant U24 NS120057 and led by Dr. Oliver Rubel (Lawrence Berkeley National Laboratory) and Dr. Benjamin Dichter (Catalyst Neuro).
The principal developers of NWB are the Lawrence Berkeley National Laboratory and Catalyst Neuro.

General Principles

No obligation

The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them, but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.

Coordinated Development

The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and NWB.

Points of Contact

For 2022, the DataJoint Elements POC is Dr. Kushal Bakshi (kushal@datajoint.com).
For 2022, the NWB POC is Dr. Ryan Ly (Lawrence Berkeley National Laboratory).

Annual Review

To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.

Licensing

The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements workflows and NWB utilities.

Sustainability Roadmap between DataJoint Elements and Open Ephys GUI

Aim
DataJoint Elements and Open Ephys GUI are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.

Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint Core — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and DataJoint Core is the company DataJoint — https://datajoint.com.

Open Ephys GUI

Open Ephys GUI — https://open-ephys.org/gui — is an open-source, plugin-based application for processing, visualizing, and recording data from extracellular electrodes. The project is funded by the NIH grant U24 NS109043 and led by Dr. Josh Siegle.
The principal developers of the Open Ephys GUI are at the Allen Institute.

General Principles

No obligation

The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them, but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.

Coordinated Development

The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and Open Ephys GUI.

Points of Contact

For 2023, the DataJoint Elements POC is Dr. Thinh Nguyen (thinh@datajoint.com).
For 2023, the Open Ephys GUI POC is Dr. Josh Siegle (joshs@alleninstitute.org).

Annual Review

To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.

Licensing

The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements and Open Ephys GUI.

Development Roadmap

If you use this package, please cite the Open Ephys paper in your publications.

Sustainability Roadmap between DataJoint Elements and Suite2p

Aim
DataJoint Elements and Suite2p are two neuroinformatics initiatives in active development. The projects develop independently yet they have complementary aims and overlapping user communities. This document establishes key processes for coordinating development and communications in order to promote integration and interoperability across the two ecosystems.

Projects and Teams

DataJoint

DataJoint Elements — https://datajoint.com/docs/elements/ — is a collection of open-source reference database schemas and analysis workflows for neurophysiology experiments, supported by DataJoint — https://datajoint.com/docs/core/ — an open-source software framework. The project is funded by the NIH grant U24 NS116470 and led by Dr. Dimitri Yatsenko.
The principal developer of DataJoint Elements and the DataJoint framework is the company DataJoint — https://datajoint.com.

Suite2p

Suite2p — https://www.suite2p.org — is a pipeline for processing calcium imaging data. The project is funded by HHMI Janelia Research Campus and led by Dr. Carsen Stringer and Dr. Marius Pachitariu.
The principal developers of Suite2p are at the Janelia Research Campus.
","location":"partnerships/suite2p/#suite2p"},{"title":"General Principles","text":"","location":"partnerships/suite2p/#general-principles"},{"title":"No obligation","text":"The developers of the two ecosystems acknowledge that this roadmap document creates no contractual relationship between them but they agree to work together in the spirit of partnership to ensure that there is a united, visible, and responsive leadership and to demonstrate administrative and managerial commitment to coordinate development and communications.
","location":"partnerships/suite2p/#no-obligation"},{"title":"Coordinated Development","text":"The two projects will coordinate their development approaches to ensure maximum interoperability. This includes:
To achieve the aims of coordinated development, both projects appoint a primary point of contact (POC) to respond to questions relating to the integration and interoperability of DataJoint Elements and Suite2p.
For 2022, the DataJoint Elements POC is Dr. Kushal Bakshi (kushal@datajoint.com)
For 2022, the Suite2p POC is Dr. Carsen Stringer (stringerc@janelia.hhmi.org)
","location":"partnerships/suite2p/#points-of-contact"},{"title":"Annual Review","text":"To achieve the aims of coordinated development, the principal developers conduct a joint annual review of this roadmap document to ensure that the two programs are well integrated and not redundant. The contents and resolutions of the review will be made publicly available.
","location":"partnerships/suite2p/#annual-review"},{"title":"Licensing","text":"The two parties ensure that relevant software components are developed under licenses that avoid any hindrance to integration and interoperability between DataJoint Elements and Suite2p.
","location":"partnerships/suite2p/#licensing"},{"title":"Development Roadmap","text":"If you use Suite2p please cite Pachitariu et al., bioRxiv 2017 in your publications.
","location":"partnerships/suite2p/#citation"},{"title":"Project Showcase","text":"Catalog
Teams
Publications
The following publications have relied on DataJoint open-source software for data analysis. If your work uses DataJoint or DataJoint Elements, please cite the respective manuscripts and RRIDs.
","location":"projects/publications/"},{"title":"2024","text":"DataJoint was originally developed by working systems neuroscientists at Baylor College of Medicine to meet the needs of their own research. Below is a partial list of known teams who use DataJoint.
","location":"projects/teams/#projects"},{"title":"Multi-lab collaboratives","text":"