DataLad project
This is the DataLad project’s (meta) documentation. It should have everything there is to know about DataLad – the project, not (just) the software. This includes how things are done, what we are planning on doing, and maybe why we are no longer doing things in particular ways.
Select any topic from the menu or search this site for information.
About the DataLad Project
DataLad is a Python-based distributed data management system that keeps track of your data with version control, creates structure, ensures reproducibility, supports collaboration, and integrates with widely used data infrastructure. DataLad (the software) is developed and maintained as a free and open source project by a global and interdisciplinary community of scientists.
The primary goal of the DataLad project is to support the collaborative process of distilling knowledge from data according to the FAIR Guiding Principles: Findability, Accessibility, Interoperability, and Reusability. We emphasize creating an inclusive, supportive space where users are empowered to make the most of our products and contribute to the project and community. We also strive to foster an interconnected network through interoperable software development within a larger ecosystem, the organization of community events, and participation in collaborative research initiatives.
Historically, the DataLad project was established for researchers in medicine and the neurosciences, and it is hosted through a collaboration between the Brain and Behaviour division of the Institute of Neuroscience and Medicine (INM-7) at Forschungszentrum Jülich and the Center for Open Neuroscience affiliated with Dartmouth College. The project’s domain-agnostic focus on software interoperability through integrations and extensions now extends its reach into diverse disciplines, to anyone seeking to work responsibly with data. DataLad is governed as a consensus-based meritocracy relying on its thousands of users and its dedicated contributors.
Users and developers can ask questions and support each other asynchronously in the community Matrix chat or via Q&A (Question and Answer) portals, or interact live during an online office hour call. Contributors engage with the project through many avenues, including the various communication channels: they submit issues, documentation, and code for consideration in project code repositories, participate in discussions on development, project management, and strategic planning, and vote on the development mailing list. Community members are encouraged to show their support for the project by following the DataLad blog and social media accounts. All members of the DataLad community adhere to its Code of Conduct.
DataLad development is primarily funded by the U.S. National Science Foundation (NSF) and the German Federal Ministry of Education and Research (BMBF). Additional support has been provided by the Helmholtz Research Center Jülich (FZJ), the U.S. National Institute of Biomedical Imaging and Bioengineering (NIBIB) via ReproNim, the European Union’s Horizon 2020 research and innovation programme, the Deutsche Forschungsgemeinschaft (DFG), and the German federal state of Saxony-Anhalt and the European Regional Development Fund. All DataLad code and learning resources are available under open-source licenses, and the software itself and its associated documentation are published under the MIT license.