CoderData is designed to be a customizable resource that can be altered and extended for your own needs.
CoderData is a work in progress. If you have a specific request or find a bug, please file an issue on our GitHub page and we will start a conversation about the details and how to address it. If you would like to build a new feature yourself, you are welcome to fork the repository and open a pull request so we can discuss it in more detail. Issues and pull requests are triaged by the CoderData team as they are received.
CoderData is designed to be federated, so you can build your own dataset and access it locally. Below is an image of the current CoderData framework. Each dataset is processed by a single Docker image with a series of standard scripts.
To add your own data, you must add a Docker image that meets the following constraints:
The Dockerfile must be named Dockerfile.[yourdataset] and reside in the /build/docker directory.
The image must provide the scripts build_omics.sh, build_samples.sh, build_drugs.sh and build_exp.sh.
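Before opening a pull request, it can help to check that the files named in the constraints above are in place. The following is a minimal sketch, not part of CoderData itself: the function name missing_pieces and the script_dir argument are hypothetical, and because the text above does not say where the build scripts live, their location is passed in as an assumption.

```python
from pathlib import Path

# Script names taken from the constraints listed above.
REQUIRED_SCRIPTS = (
    "build_omics.sh",
    "build_samples.sh",
    "build_drugs.sh",
    "build_exp.sh",
)


def missing_pieces(repo_root, dataset, script_dir):
    """Return the expected files that do not exist yet.

    Paths are returned as strings relative to repo_root.
    script_dir is an assumed location for the build scripts (hypothetical);
    adjust it to wherever your scripts actually live.
    """
    root = Path(repo_root)
    # The Dockerfile name and directory follow the first constraint.
    expected = [Path("build") / "docker" / f"Dockerfile.{dataset}"]
    # The four standard scripts follow the second constraint.
    expected += [Path(script_dir) / name for name in REQUIRED_SCRIPTS]
    return [p.as_posix() for p in expected if not (root / p).is_file()]
```

For example, calling missing_pieces(".", "yourdataset", "build/yourdataset") from the repository root returns an empty list once the Dockerfile and all four scripts exist; the build/yourdataset path is only a placeholder.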
The full process is documented on our GitHub site under ‘Adding a new dataset’.
Considerations to include are:
… Sample table of the linkML schema?
Lastly, check out examples! We have numerous Docker files in our Dockerfile directory, and multiple datasets in our build directory.
Your contributions are essential to the growth and improvement of CoderData. We look forward to collaborating with you!