This is the DataLad project’s (meta) documentation. It should have everything there is to know about DataLad – the project, not (just) the software. This includes how things are done, what we are planning on doing, and maybe why we are no longer doing things in particular ways.
Select any topic from the menu or search this site for information.
About the DataLad Project
DataLad is a Python-based distributed data management system that keeps track of your data with version control, creates structure, ensures reproducibility, supports collaboration, and integrates with widely used data infrastructure. DataLad (the software) is developed and maintained as a free and open source project by a global and interdisciplinary community of scientists.
The primary goal of the DataLad project is to support the collaborative process of distilling knowledge from data according to the FAIR Guiding Principles — Findability, Accessibility, Interoperability, and Reusability. We emphasize creating an inclusive, supportive space where users are empowered to make the most of our products and contribute to the project and community, and we strive to foster an interconnected network through interoperable software development within a larger ecosystem, the organization of community events, and participation in collaborative research initiatives.
Historically, the DataLad project was established for researchers in medicine and the neurosciences, and it is hosted through a collaboration between the Brain and Behaviour division of the Institute of Neuroscience and Medicine (INM-7) at Forschungszentrum Jülich and the Center for Open Neuroscience affiliated with Dartmouth College. The project’s domain-agnostic focus on software interoperability through integrations and extensions now extends its reach into diverse disciplines to anyone seeking to work responsibly with data. DataLad is governed as a consensus-based meritocracy relying on its thousands of users and its dedicated contributors.
Users and developers can ask questions and support each other asynchronously in the community Matrix chat or via Q&A (Question and Answer) portals, or interface live during an online office hour call. Contributors engage with the project through many avenues — including the various communication channels — by submitting issues, documentation, and code for consideration in project code repositories and participating in discussions on development, project management and strategic planning, and voting on the development mailing list. Community members are encouraged to show their support for the project by following the DataLad blog and social media accounts. All members of the DataLad community adhere to its Code of Conduct.
While the DataLad software package can stand alone as a lightweight and portable tool for distributed data management, the DataLad project represents a larger ecosystem of free and open source tools. The DataLad project makes decisions on a subset of these tools,
as defined in the scope of the governance structure; tools outside of the governance scope are also contributed to by
other entities.
Learn more about the many products offered under the DataLad umbrella:
DataLad is a Python-based command line tool that makes data management and data distribution more accessible. Built on the shoulders of Git and git-annex, DataLad delivers a decentralized system for data exchange and version control, providing structure as Git(-annex) repositories - or datasets. DataLad datasets are structured just like computer directories, where files are collected into folders that can be nested into modular units. At its core, DataLad is a general, multi-purpose tool, but several extensions are available to provide additional domain-specific functionality.
This is a pure-Python library with a collection of utilities for working with
data in the vicinity of Git and
git-annex. While this is a foundational
library from and for the DataLad project, its implementations are standalone,
and are meant to be equally well usable outside the DataLad system.
A focus of this library is efficient communication with subprocesses, such as
Git or git-annex commands, which read and produce data in some format.
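To make the idea of efficient subprocess communication concrete, here is a minimal sketch that uses only the Python standard library. It is not the library's own API; the helper name `iter_output_lines` is hypothetical, and a child Python process stands in for a Git or git-annex command. The point is that output is consumed line by line as it is produced, rather than collected in full first.

```python
import subprocess
import sys
from typing import Iterator, List


def iter_output_lines(cmd: List[str]) -> Iterator[str]:
    """Yield the stdout lines of `cmd` one at a time (hypothetical helper)."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        # Iterating over the pipe reads lazily, so a large output never
        # has to be held in memory all at once.
        for line in proc.stdout:
            yield line.rstrip("\n")
    # Leaving the `with` block waits for the process, so returncode is set.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


# Stand-in for a Git or git-annex call: a child process printing two records.
demo_cmd = [sys.executable, "-c", "print('a'); print('b')"]
records = list(iter_output_lines(demo_cmd))
```

A generator keeps the calling code simple: the consumer can stop early, filter, or chain further processing without changing how the subprocess is started.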
DataLad Framework
datalad-core is a lightweight core library for the DataLad framework. It can be used for the development of any DataLad packages.
DataLad Concepts
datalad-concepts collects standardized schemas/vocabulary for implementing metadata-driven workflows.
The schemas express metadata in simple data structures rather than requiring the use of vocabulary specific
to databases and query languages. All schemas are implemented in LinkML.
Installer
The DataLad installer is a utility for installing DataLad, git-annex, and
related components all in a single invocation. It requires no third-party
Python libraries, though it does make heavy use of external packaging commands.
DataLad Extensions
DataLad is not just a single software package. Numerous extension packages can equip the base
package with additional functionality, or even tailor and tune the way the base package works.
DataLad extensions are shipped as separate Python packages. The installation is
typically done with standard Python package managers, such as pip. For some extensions
it may be necessary to perform additional setup steps before they become fully functional.
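After installing an extension with a package manager such as pip, it can be useful to check which extensions the current Python environment actually provides. The sketch below assumes that extensions register themselves under the `datalad.extensions` entry point group, which is the packaging convention we are aware of; the function name `installed_datalad_extensions` is hypothetical. It reports entry point names, which may differ from the package names on PyPI.

```python
import sys
from importlib import metadata
from typing import List


def installed_datalad_extensions(group: str = "datalad.extensions") -> List[str]:
    """Return the sorted entry point names registered under `group`.

    The group name is an assumption about how extensions are packaged.
    """
    if sys.version_info >= (3, 10):
        eps = metadata.entry_points(group=group)
    else:
        # Older Python versions return a mapping of group name to entry points.
        eps = metadata.entry_points().get(group, [])
    return sorted(ep.name for ep in eps)


# An empty list simply means that no extension is installed in this environment.
extensions = installed_datalad_extensions()
```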
Here is a list of available extension packages for the DataLad software:
Additional DataLad extensions found on PyPI may represent other community packages and/or packages
that are deprecated, unmaintained, and/or archived.
Subsections of DataLad Extensions
datalad-catalog extension
This is a DataLad extension that allows you to automatically generate a user-friendly data browser from structured metadata.
datalad-container extension
This extension equips DataLad’s run/rerun functionality with the ability to
transparently execute commands in containerized computational environments.
This is a DataLad extension that allows you to crawl external web resources into an automated data distribution.
It provides functionality for tracking data on a website and making its files available on a local machine, as well
as for querying for potential updates to the website and obtaining any changes.
datalad-dataverse extension
This extension provides interoperability with Dataverse to support data transport (along with full version history) to and from Dataverse instances.
Dataverse is an open source research data repository software that is deployed all over the world. It supports sharing, preserving, citing, exploring, and analyzing research data with descriptive metadata, and thus contributes greatly to open, reproducible, and FAIR science.
datalad-debian extension
This is a DataLad extension that allows for working with Debian packages and package repositories.
datalad-deprecated extension
This is a DataLad extension that provides functionality that has been phased out of the core DataLad package.
datalad-fuse extension
This is a DataLad extension that allows for reading files in a DataLad dataset from their remote web URLs without having to download them in their entirety first. Instead, fsspec is used to sparsely download and locally cache the files as needed.
datalad-gooey extension
This extension provides a graphical user interface (GUI) for DataLad, making key data management tasks more accessible and convenient, without requiring familiarity with the command line. This simplified interface to DataLad is built on a foundation that is capable of providing GUIs for any DataLad command, including those provided by extension packages. Moreover, extension packages can even provide their own GUI suites, by mixing and tuning a custom set of commands and parameters. DataLad Gooey is compatible with all major operating systems.
datalad-metalad extension
This extension equips DataLad with an alternative command suite and advanced tooling for metadata handling (extraction, aggregation, filtering, and reporting).
datalad-neuroimaging extension
This extension enhances DataLad for working with neuroimaging data and workflows. It provides metadata extraction support for a range of standards common to neuroimaging data.
datalad-next extension
This DataLad extension can be thought of as a staging area for additional
functionality, or for improved performance and user experience. Unlike other
topical or more experimental extensions, the focus here is on functionality
with broad applicability. This extension is a suitable dependency for other
software packages that intend to build on this improved set of functionality.
This extension enables DataLad datasets to be hosted on the Open Science Framework (OSF). Use it to store, share, retrieve, and collaborate on DataLad datasets via the OSF.
datalad-ukbiobank extension
This extension equips DataLad with a set of commands to obtain, monitor, and restructure imaging data releases of the UKBiobank. It is designed to download MRI bulk data, track additions/redactions/fixes from the UK Biobank, and (optionally) restructure into BIDS layout.
UKBiobank is a national and international health resource with unparalleled research opportunities, open to all bona fide health researchers. UKBiobank aims to improve the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses – including cancer, heart diseases, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression and forms of dementia. It is following the health and well-being of 500,000 volunteer participants and provides health information, which does not identify participants, to approved researchers in the UK and overseas, from academia and industry.
datalad-xnat extension
This extension package equips DataLad with a set of commands to track XNAT projects.
XNAT is an open source imaging informatics platform developed by the Neuroinformatics Research Group at Washington University. It facilitates common management, productivity, and quality assurance tasks for imaging and associated data. XNAT can be used to support a wide range of neuro/medical imaging-based projects.
Support
The DataLad project has several community support channels for users and developers to get in touch with other community members and DataLad core developers. Learn more about how these support channels are managed by visiting the Support section of the Governance page.
The main chat is a general purpose room for any topic. The office hour chat is more focused on user support and also used
to coordinate the online office hour.
Office Hour
We run a weekly online office hour, where anybody can pop in to ask questions or to demo challenges.
The location/link is communicated via the office hour chatroom. This is also the
place where cancellations or holiday breaks are announced.
The current office hour slot is every Monday 14:00 CE(S)T.
Q&A Portals
There is some monitoring of Q&A portals:
neurostars.org using the datalad
tag. This is the most active Q&A site
for DataLad-related questions, due to the historically large user community in
this field.
Anyone is welcome to report bugs or request new features by filing issues through the relevant issue trackers for DataLad products.
Mailing List
Communications on management, strategic planning, decision-making, and voting for the DataLad project take place publicly on the datalad-devel mailing list; all DataLad contributors are encouraged to subscribe.
Documentation
The DataLad community maintains several important resources to help researchers and DataLad users/developers understand and effectively engage with DataLad tools.
The DataLad Handbook is a user guide including information on getting started with DataLad; implementing more advanced use cases; and on using the handbook as a training tool.
The DataLad community has developed (and made publicly available) training materials and educational resources to help teach best practices for using DataLad.
The DataLad Tutorials repository is a collection of tutorials, workshop materials, and talks for learning more about using DataLad and applying it to specific use cases.
In order to support the DataLad community in diverse ways and to create infrastructure to support DataLad’s many functionalities, the project provides a number of websites and services.
https://blog.datalad.org/ is a blog on data management and DataLad that acts as a forum for people to share developments in the community. It is generated from a DataLad dataset with Hugo.
https://project.datalad.org/ serves as the DataLad project documentation.
Rather than providing technical documentation for datalad, the software,
this site provides metadocumentation for DataLad, the project.
This is a meritocratic, consensus-based community project. Anyone with an interest in the project can join the community, contribute to the project design and participate in the decision making process. This document describes how that participation takes place and how to set about earning merit within the project community.
Scope
The DataLad project develops software that is part of a larger ecosystem of interoperable components, also contributed by other entities. This document is exclusively concerned with the governance of a subset of those components that are collectively developed and maintained by the DataLad project. The components presently are:
The main communication channel for contributors is the development mailing list, presently at datalad-devel@fz-juelich.de. Mailing list messages are archived. The list archive is public, presently at https://lists.fz-juelich.de/hyperkitty/list/datalad-devel@lists.fz-juelich.de.
The project maintains a number of additional communication channels for a variety of purposes and audiences. However, all communication on management, strategic planning, and voting takes place on the development mailing list. All contributors are encouraged to subscribe to this mailing list.
Users are community members who have a need for the project.
They are the most important members of the community and without them the project would have no purpose.
Anyone can be a user; there are no special requirements.
The project asks its users to participate in the project and community as much as possible.
User contributions help to ensure that the project outputs satisfy the needs of those users.
Common user contributions include (but are not limited to):
evangelizing about the project (e.g. a link on a website and word-of-mouth awareness raising)
informing developers of strengths and weaknesses from a new user perspective
providing moral support (a “thank you” goes a long way)
showing support, e.g., by “starring” the project on GitHub or subscribing/liking/following the project on social media.
Users who continue to engage with the project and its community will often become more and more involved.
Such users may find themselves becoming contributors.
Contributors
Contributors are community members who contribute in concrete ways to the project.
Anyone can become a contributor, and contributions can take many forms.
There is no expectation of commitment to the project, no specific skill requirements and no selection process.
In addition to their actions as users, contributors may also find themselves doing one or more of the following:
supporting new users (existing users are often the best people to support new users)
reporting bugs
identifying requirements
providing graphics and web design
programming
assisting with project infrastructure
writing documentation
fixing bugs
adding features
Contributors engage with the project through issue trackers or other communication channels, or by writing or editing documentation.
They submit changes to the project code repositories, which will be considered for inclusion by existing committers.
The development mailing list is the most appropriate place to ask for help when making that first contribution.
As contributors gain experience and familiarity with the project, their profile within, and commitment to, the community will increase.
At some stage, they may find themselves being nominated for committership.
Committers
Committers are community members who have shown that they are committed to the continued development of the project through ongoing engagement with the community.
Committership allows contributors to more easily carry on with their project-related activities by giving them direct access to the project’s resources.
That is, they can make changes directly to project outputs, without having to submit changes via patches.
This does not mean that a committer is free to do what they want.
In fact, committers have no more authority over the project than contributors.
While committership indicates a valued member of the community who has demonstrated a healthy respect for the project’s aims and objectives, their work continues to be reviewed by the community before acceptance in an official release.
The key difference between a committer and a contributor is when this approval is sought from the community.
A committer seeks approval after the contribution is made, rather than before.
Seeking approval after making a contribution is known as a commit-then-review process.
It is more efficient to allow trusted people to make direct contributions, as the majority of those contributions will be accepted by the project.
The project employs various communication mechanisms to ensure that all contributions are reviewed by the community as a whole.
By the time a contributor is invited to become a committer, they will have become familiar with the project’s various tools as a user and then as a contributor.
Anyone can become a committer; there are no special requirements, other than to have shown a willingness and ability to participate in the project as a team player.
Typically, a potential committer will need to show that they have an understanding of the project, its objectives and its strategy.
They will also have provided valuable contributions to the project over a period of time.
New committers can be nominated by any existing committer.
Once they have been nominated, there will be a vote by the project management committee (PMC).
Committer voting is one of the few activities that takes place on the project’s private management channel.
This is to allow PMC members to freely express their opinions about a nominee without causing embarrassment.
Once the vote has been held, the outcome of the vote is communicated to the project via the development mailing list.
The nominee is entitled to request an explanation of any ’no’ votes against them, regardless of the outcome of the vote.
This explanation will be provided by the PMC Chair and will be anonymous and constructive in nature.
Nominees may decline their appointment as a committer.
However, this is unusual, as the project does not expect any specific time or resource commitment from its community members.
The intention behind the role of committer is to allow people to contribute to the project more easily, not to tie them in to the project in any formal way.
It is important to recognize that committership is a privilege, not a right.
That privilege must be earned and once earned it can be removed by the PMC in extreme circumstances.
However, under normal circumstances committership exists for as long as the committer wishes to continue engaging with the project.
Effective commit access to project repositories for an individual person may be disabled for security reasons when they have not contributed within the last 12 months, but is reinstated upon request by the committer.
A committer who shows an above-average level of contribution to the project, particularly with respect to its strategic direction and long-term health, may be nominated to become a member of the PMC.
Project management committee (PMC)
The project management committee has additional responsibilities over and above those of a committer.
These responsibilities ensure the smooth running of the project.
PMC members are expected to review code contributions, participate in strategic planning, approve changes to the governance model and manage the copyrights within the project outputs.
Members of the PMC do not have significant authority over other members of the community, although it is the PMC that votes on new committers.
It also makes decisions when community consensus cannot be reached.
The PMC also decides whether to include additional components of the DataLad ecosystem under this governance umbrella, or whether to remove components that have been covered previously.
In addition, the PMC has access to the project’s private communication channels.
These are used for sensitive issues, such as votes for new committers and legal matters that cannot be discussed in public.
They are never used for project management or planning.
Membership of the PMC is by invitation from the existing PMC members.
A nomination will result in discussion and then a vote by the existing PMC members.
PMC membership votes are subject to consensus approval of the current PMC members.
PMC Chair
The PMC Chair is a single individual, voted for by the PMC members.
Once someone has been appointed Chair, they remain in that role until they choose to retire, or the PMC casts a two-thirds majority vote to remove them.
The PMC Chair has no additional authority over other members of the PMC: the role is one of coordinator and facilitator.
The Chair is also expected to ensure that all governance processes are adhered to.
If there is no external PMC member that can function as a tie breaker, the Chair has the casting vote when the project fails to reach consensus.
External PMC member
The PMC may have one member who is not a committer.
This person shall represent the larger ecosystem and community that DataLad is part of.
Membership is by invitation from PMC members, and voted for by PMC members.
Membership continues until the person chooses to retire, or the PMC casts a two-thirds majority vote to remove them.
The external PMC member does not have a binding vote.
However, in case of a tie between choices in the outcome of a vote, the external PMC member is asked to select one of these options as the winner.
Project management committee
The project management committee presently comprises the following individuals (in alphabetical order), with their specific roles (if any) listed:
All participants in the community are encouraged to provide support for new users.
This support is provided as a way of growing the community.
Those seeking support should recognize that all support activity within the project is voluntary and is therefore provided as and when time allows.
A user requiring guaranteed response times or results should therefore seek to purchase a support contract from a community member.
However, for those willing to engage with the project on its own terms, and willing to help support other users, the community support channels are ideal.
Contribution process
Anyone can contribute to the project, regardless of their skills, as there are many ways to contribute.
For instance, a contributor might be active on the project mailing list and issue tracker, or might supply patches.
The development mailing list is the most appropriate place for a contributor to ask for help when making their first contribution.
Decision making process
Decisions about the future of the project are made through discussion with all members of the community, from the newest user to the most experienced PMC member.
All non-sensitive project management discussion takes place on the development mailing list.
Occasionally, sensitive discussion occurs on a private channel.
In order to ensure that the project is not bogged down by endless discussion and continual voting, the project operates a policy of lazy consensus.
This allows the majority of decisions to be made without resorting to a formal vote.
Lazy consensus
Decision making typically involves the following steps:
Proposal
Discussion
Vote (if consensus is not reached through discussion)
Decision
Any community member can make a proposal for consideration by the community.
In order to initiate a discussion about a new idea, they should send an email to the development mailing list or submit a patch implementing the idea to the issue tracker (or version-control system if they have commit access).
This will prompt a review and, if necessary, a discussion of the idea.
The goal of this review and discussion is to gain approval for the contribution.
Since most people in the project community have a shared vision, there is often little need for discussion in order to reach consensus.
In general, as long as nobody explicitly opposes a proposal or patch, it is recognized as having the support of the community.
This is called lazy consensus - that is, those who have not stated their opinion explicitly have implicitly agreed to the implementation of the proposal.
Lazy consensus is a very important concept within the project.
It is this process that allows a large group of people to efficiently reach consensus, as someone with no objections to a proposal need not spend time stating their position, and others need not spend time reading such mails.
For lazy consensus to be effective, it is necessary to allow at least 96 hours before assuming that there are no objections to the proposal.
This requirement ensures that everyone is given enough time to read, digest and respond to the proposal.
This time period is chosen so as to be as inclusive as possible of all participants, regardless of their location and time commitments.
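As a small worked example of the 96-hour rule, the following sketch checks whether lazy consensus may be assumed for a proposal. The function name and the way objections are counted are illustrative assumptions, not part of the governance text.

```python
from datetime import datetime, timedelta, timezone

# Minimum waiting period before silence counts as agreement.
LAZY_CONSENSUS_WINDOW = timedelta(hours=96)


def lazy_consensus_reached(proposed_at: datetime, now: datetime,
                           objections: int = 0) -> bool:
    """True once the window has passed without any explicit opposition."""
    return objections == 0 and now >= proposed_at + LAZY_CONSENSUS_WINDOW


# A proposal sent on 1 January at 12:00 UTC can be assumed accepted
# no earlier than 5 January at 12:00 UTC (96 hours = 4 days).
proposed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
earliest = proposed + LAZY_CONSENSUS_WINDOW
```

Using timezone-aware timestamps avoids ambiguity for a community that is spread across time zones.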
Voting
Not all decisions can be made using lazy consensus.
Issues such as those affecting the strategic direction or legal standing of the project must gain explicit approval in the form of a vote.
Every member of the community is encouraged to express their opinions in all discussion and all votes.
However, only project committers and/or PMC members have binding votes for the purposes of decision making.
Procedure
If a formal vote on a proposal is called (signaled simply by sending an email with [VOTE] in the subject line), all subscribers of the development mailing list may express an opinion and vote.
They do this by sending an email in reply to the original [VOTE] email, with the following vote and information:
+1 (yes, agree): also willing to help bring about the proposed action
+0 (yes, agree): not willing or able to help bring about the proposed action
-0 (no, disagree): but will not oppose the action’s going forward
-1 (no, disagree): opposes the action going forward and must propose an alternative action to address the issue (or a justification for not addressing the issue)
To abstain from the vote, participants simply do not respond to the email.
However, it can be more helpful to cast a +0 or -0 than to abstain, since this allows the team to gauge the general feeling of the community if the proposal should be controversial.
Every member of the community, from interested user to the most active developer, has a vote.
The project encourages all members to express their opinions in all discussion and all votes.
However, only some members have binding votes for the purposes of decision making (see below).
It is therefore their responsibility to ensure that the opinions of all community members are considered.
While not all members may have a binding vote, a well-justified -1 from a non-committer must be considered by the community, and if appropriate, supported by a binding -1.
A -1 can also indicate a veto, depending on the type of vote and who is using it.
Someone without a binding vote cannot veto a proposal, so in their case a -1 would simply indicate an objection.
When a [VOTE] receives a -1, it is the responsibility of the community as a whole to address the objection.
Such discussion will continue until the objection is either rescinded, overruled (in the case of a non-binding veto) or the proposal itself is altered in order to achieve consensus (possibly by withdrawing it altogether).
In the rare circumstance that consensus cannot be achieved, the PMC will decide the forward course of action.
In summary:
Those who don’t agree with the proposal and think they have a better idea should vote -1 and defend their counter-proposal.
Those who don’t agree but don’t have a better idea should vote -0.
Those who agree but will not actively assist in implementing the proposal should vote +0.
Those who agree and will actively assist in implementing the proposal should vote +1.
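The summary above amounts to a two-question decision: does the voter agree, and will they either help (if they agree) or defend an alternative (if they do not)? A minimal sketch of that mapping follows; the function and parameter names are hypothetical.

```python
def choose_vote(agrees: bool, will_help: bool = False,
                has_alternative: bool = False) -> str:
    """Map a voter's stance to one of the four vote values."""
    if agrees:
        # Agreement: +1 when willing to help implement, +0 otherwise.
        return "+1" if will_help else "+0"
    # Disagreement: -1 requires a counter-proposal to defend, -0 does not.
    return "-1" if has_alternative else "-0"
```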
Types of approval
Different actions require different types of approval, ranging from lazy consensus to a majority decision by the PMC.
These are summarised in the table below.
The section after the table describes which type of approval should be used in common situations.
Type
Description
Duration
Lazy consensus
An action with lazy consensus is implicitly allowed, unless a binding -1 vote is received. Depending on the type of action, a vote will then be called. Note that even though a binding -1 is required to prevent the action, all community members are encouraged to cast a -1 vote with supporting argument. Committers are expected to evaluate the argument and, if necessary, support it with a binding -1.
N/A
Lazy majority
A lazy majority vote requires more binding +1 votes than binding -1 votes.
72 hours
Consensus approval
Consensus approval requires three binding +1 votes and no binding -1 votes.
72 hours
Unanimous consensus
All of the binding votes that are cast are to be +1 and there can be no binding vetoes (-1).
120 hours
2/3 majority
Some strategic actions require a 2/3 majority of PMC members; in addition, 2/3 of the binding votes cast must be +1. Such actions typically affect the foundation of the project (e.g. adopting a new codebase to replace an existing product).
120 hours
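The voting rules in this table can be expressed as a short function over the binding votes that were cast. This is a sketch of one reading of the rules, not an official tool: the function name is hypothetical, "three binding +1 votes" is read as "at least three", unanimous consensus is read literally (every binding vote cast must be +1), and the 2/3 majority is read as requiring the +1 votes to reach two thirds of both the PMC size and the binding votes cast.

```python
from collections import Counter
from typing import Iterable


def approved(approval_type: str, binding_votes: Iterable[str],
             pmc_size: int = 0) -> bool:
    """Evaluate binding votes ("+1", "+0", "-0", "-1") for an approval type."""
    tally = Counter(binding_votes)
    plus, minus = tally["+1"], tally["-1"]
    cast = sum(tally.values())
    if approval_type == "lazy majority":
        return plus > minus
    if approval_type == "consensus approval":
        return plus >= 3 and minus == 0
    if approval_type == "unanimous consensus":
        # Assumption: at least one binding vote has to be cast.
        return cast > 0 and plus == cast
    if approval_type == "2/3 majority":
        if pmc_size <= 0:
            raise ValueError("pmc_size is required for a 2/3 majority vote")
        # Integer arithmetic avoids rounding issues with the 2/3 threshold.
        return 3 * plus >= 2 * pmc_size and 3 * plus >= 2 * cast
    raise ValueError(f"unknown approval type: {approval_type!r}")
```

Non-binding votes are deliberately left out of the tally; they inform the discussion, and a well-justified non-binding -1 may lead a committer to cast a binding one.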
When is a vote required?
Every effort is made to allow the majority of decisions to be taken through lazy consensus.
That is, simply stating one’s intentions is assumed to be enough to proceed, unless an objection is raised.
However, some activities require a more formal approval process in order to ensure fully transparent decision making.
The table below describes some of the actions that will require a vote.
It also identifies which type of vote should be called.
Action
Description
Approval type
Release plan
Defines the timetable and actions for a release. A release plan cannot be vetoed (hence lazy majority).
Lazy majority
Product release
When a release of one of the project’s products is ready, a vote is required to accept the release as an official release of the project. A release cannot be vetoed (hence lazy majority).
The DataLad project stands on the shoulders of an enthusiastic and diverse community of people who are dedicated to supporting it in various ways. Comprised of people from all over the globe, the DataLad community welcomes and encourages participation from everyone.
Visit the sections listed below for details on individual aspects of the project’s community.
DataLad started out as a rather monolithic code base that mixed a Python library, a Python API geared towards interactive use, and a command line interface (CLI).
The general development trajectory is to disentangle the code, and form a more modular, layered software system that comprises:
dedicated applications providing a CLI, a graphical user interface (GUI), and a Python-based command API for interactive use and scripting
a collection of topical extension packages
utility libraries for a DataLad framework of closely aligned implementations
another utility library with generic algorithms and implementations, not considered to be part of the DataLad framework
The schema below depicts the envisioned relationships and dependencies between
these components (solid arrows indicate dependencies and dashed ones optional usage).
graph LR;
subgraph "Non-framework<br>utility libraries"
salad("datasalad")
end
subgraph DataLad framework
core("datalad-core")
next("datalad-next")
end
subgraph "DataLad<br>applications"
dlcmd("dlcmd (CLI)")
gooey("gooey (GUI)")
py("Python<br>Command API")
end
subgraph "(3rd-party)<br>Extension packages"
extension("datalad-${extension}")
end
salad ---> core
salad ---> next
salad ---> extension
core --> next
core --> extension
next -.-> extension
core --> dlcmd
next -.-> dlcmd
extension -.-> dlcmd
core --> gooey
next -.-> gooey
extension -.-> gooey
core --> py
next -.-> py
extension -.-> py
%% node links to websites
click salad href "https://github.com/datalad/datasalad"
click core href "https://github.com/datalad/datalad-core"
click next href "https://github.com/datalad/datalad-next"
click dlcmd href "/dev/dlcmd/"
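The relationships in the schema can also be written down as plain data, which makes the layering easy to check. The following is a minimal sketch, not part of any DataLad package: the component labels (`extension`, `python-api`) and the helper `layered_order` are made up for illustration, and the edges are transcribed from the schema above, reading each arrow as "is used by".

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Solid arrows: component -> components it requires.
REQUIRED = {
    "datasalad": set(),
    "datalad-core": {"datasalad"},
    "datalad-next": {"datasalad", "datalad-core"},
    "extension": {"datasalad", "datalad-core"},
    "dlcmd": {"datalad-core"},
    "gooey": {"datalad-core"},
    "python-api": {"datalad-core"},
}

# Dashed arrows: component -> components it may optionally use.
OPTIONAL = {
    "extension": {"datalad-next"},
    "dlcmd": {"datalad-next", "extension"},
    "gooey": {"datalad-next", "extension"},
    "python-api": {"datalad-next", "extension"},
}


def layered_order(required, optional):
    """Return one valid bottom-up ordering of all components.

    Raises graphlib.CycleError if required and optional edges together
    contain a cycle, i.e. if the layering is violated.
    """
    combined = {
        name: set(deps) | optional.get(name, set())
        for name, deps in required.items()
    }
    return list(TopologicalSorter(combined).static_order())


order = layered_order(REQUIRED, OPTIONAL)
print(" -> ".join(order))
```

Because several orderings are valid, only the relative positions are meaningful: `datasalad` always comes first, and the applications always come after the libraries and extensions they may use.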
Targeted components
Non-framework utility libraries
Such libraries hold implementations developed by the DataLad project and for the DataLad project that are nevertheless so generic that they are not considered to be part of the DataLad framework. This means:
no DataLad jargon in messages
no dependencies on other DataLad components
no use of DataLad facilities (e.g., recognition of DataLad-specific configuration)
A concrete library example is datasalad, which provides tooling to work with subprocesses.
DataLad framework libraries
These libraries provide everything necessary to implement DataLad commands and have them work in a uniform fashion.
This includes aspects like configuration management, particular workflows (e.g., credential input and storage), and working with git(-annex) repositories in a particular “DataLad way”.
We distinguish two different libraries: core and next.
The core library provides the essential set of DataLad functionality that is broadly applicable to the widest range of use cases.
It aims to have a lean dependency footprint to enable deploying DataLad in a wide range of environments.
The current development state is available at https://github.com/datalad/datalad-core.
The next library serves the same purpose and scope as the core library.
It is, however, a staging area for making new and improved implementations available before they may migrate into the core library.
While core evolves at a comparatively slow pace, next is expected to have a much higher frequency of feature releases.
The current development state is available at https://github.com/datalad/datalad-next.
Topical DataLad extension packages can use both libraries to implement their functionality.
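Since `next` is an optional dependency for extensions while `core` is a required one, an extension may want to prefer an implementation from `next` and fall back to `core` when `next` is not installed. A minimal sketch of that pattern follows. The helper `first_available` is hypothetical, and the import names `datalad_next` and `datalad_core` are assumed from the package names; the sketch does not use any actual API of either library.

```python
import importlib


def first_available(module_names):
    """Return the first importable module from `module_names`, or None."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or not importable): try the next candidate.
            continue
    return None


if __name__ == "__main__":
    # Prefer the staging library, fall back to the slower-moving core.
    # Both import names are assumptions made for this illustration.
    backend = first_available(("datalad_next", "datalad_core"))
    print(backend.__name__ if backend else "no DataLad library installed")
```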
User interfaces
The libraries are accompanied by applications that provide concrete user interfaces. These (can) include:
command line interface (CLI)
graphical user interface (GUI)
language-bindings or scripting interfaces
Such interface applications could be lean (only proxying library functionality), or heavily tailored for a specific purpose.
There is no assumption of exclusivity.
For example, there can be any number of CLI implementations.
In order to cleanly separate the underlying requirements and dependencies, even a “Python command API” is distinguished from the framework libraries (also written in Python).
Only the former will define aspects like a uniform logging/messaging behavior.
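One common way to realize this split in Python is for library code to only emit log records, while the application decides how (and whether) they are shown. The sketch below illustrates the general pattern with the standard `logging` module. The logger name `example_framework_lib` and both functions are hypothetical and do not mirror actual DataLad code.

```python
import io
import logging

# Library side: emit records, but never decide how they are displayed.
lib_logger = logging.getLogger("example_framework_lib")  # hypothetical name
lib_logger.addHandler(logging.NullHandler())


def library_operation(path):
    """Stand-in for a library function that reports what it did."""
    lib_logger.info("processed %s", path)
    return path.upper()


# Application side: define one uniform message format for its users.
def configure_app_logging(stream, level=logging.INFO):
    """Attach a handler with the application's message format."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
    lib_logger.addHandler(handler)
    lib_logger.setLevel(level)
    return handler


buffer = io.StringIO()
configure_app_logging(buffer)
library_operation("data.csv")
print(buffer.getvalue(), end="")
```

With this arrangement, a CLI, a GUI, and a Python command API can each present the same library messages in their own way without the library knowing about any of them.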
Topical extension packages
Extension packages extend DataLad with additional functionality.
Many extensions are provided by the DataLad project, but they can be implemented completely independently of the project, require no approval, and generally need no coordination with the DataLad project.
Any functionality that is out-of-scope for the DataLad framework libraries can be implemented in an extension package.
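Independent extension packages are typically made discoverable through Python package entry points, so that no central registry or approval is needed. The sketch below lists installed packages that register under the `datalad.extensions` entry point group, which is the group name used by current DataLad releases; whether the modular system described here keeps that name is an assumption. The helper `discover_extensions` is hypothetical.

```python
from importlib.metadata import entry_points

# Group name used by current DataLad releases; an assumption for future ones.
EXTENSION_GROUP = "datalad.extensions"


def discover_extensions(group=EXTENSION_GROUP):
    """Return the sorted names of all entry points registered under `group`."""
    all_eps = entry_points()
    if hasattr(all_eps, "select"):
        # Python 3.10 and later
        selected = all_eps.select(group=group)
    else:
        # Python 3.8/3.9: a plain dict mapping group name -> entry points
        selected = all_eps.get(group, [])
    return sorted(ep.name for ep in selected)


print(discover_extensions())
```

The result is an empty list in an environment without any extension installed.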
We have a paid subscription to the AppVeyor CI/CD service. It is administered by mih.
Logs for projects can be found at URLs of the pattern https://ci.appveyor.com/project/mih/<project-name>.
GitHub actions
Forgejo actions
The hub is set up to run Forgejo actions. These are, to some degree, compatible with GitHub actions. This means that Forgejo can (attempt to) run GitHub actions, but not the other way round.
dlcmd is a command line interface (CLI) for DataLad that aims to provide a modern and convenient approach to using DataLad in a terminal.
DataLad functionality provided via dlcmd is separated into two different categories:
tailored, stable commands for a finite set of features
auto-generated interfaces for any DataLad command available in an installation
For the second category, dlcmd provides no guarantees regarding API stability or the accessibility of particular functionality in the terminal.
The first category, however, comprises individually tuned and documented commands that are specifically tailored and integrated for their joint use in a terminal.
Here dlcmd is not serving as a thin layer between the terminal and a Python implementation of a command, but as a fully featured application with consistent (error) messaging and behavior.
These implementations are individually tested to work via the CLI.
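The difference between the two categories can be illustrated with a small `argparse` sketch. This is not how dlcmd is implemented; it only shows one way a single CLI could combine a hand-written command with wrappers generated from function signatures. The program name `example-cli`, the commands `info`, `copy`, and `remove`, and their signatures are all stand-ins invented for this illustration.

```python
import argparse
import inspect


def info(path="."):
    """Tailored command: hand-written arguments and help text."""
    return f"info on {path}"


# Stand-ins for commands found in an installation (hypothetical signatures).
def copy(source, target="."):
    return f"copied {source} to {target}"


def remove(path, force="no"):
    return f"removed {path} (force={force})"


DISCOVERED = {"copy": copy, "remove": remove}


def add_generated_command(subparsers, name, func):
    """Derive a subcommand from a function signature, without any tuning."""
    parser = subparsers.add_parser(name, help=f"auto-generated wrapper for {name}")
    for param in inspect.signature(func).parameters.values():
        if param.default is inspect.Parameter.empty:
            parser.add_argument(param.name)  # required -> positional
        else:
            parser.add_argument(f"--{param.name}", default=param.default)
    parser.set_defaults(func=func)


def build_parser():
    parser = argparse.ArgumentParser(prog="example-cli")
    sub = parser.add_subparsers(dest="command", required=True)
    # Category 1: individually designed and documented
    tailored = sub.add_parser("info", help="show information about a location")
    tailored.add_argument("path", nargs="?", default=".")
    tailored.set_defaults(func=info)
    # Category 2: generated for whatever happens to be available
    for name, func in DISCOVERED.items():
        add_generated_command(sub, name, func)
    return parser


def run(argv):
    """Parse `argv` and call the selected command with the parsed arguments."""
    args = vars(build_parser().parse_args(argv))
    func = args.pop("func")
    args.pop("command")
    return func(**args)


print(run(["info"]))
print(run(["copy", "a.dat", "--target", "backup"]))
```

The generated wrappers change whenever the underlying signatures change, which is why stability guarantees are practical only for the tailored category.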