Roadmap
DataLad started out as a rather monolithic code base that mixed a Python library, a Python API geared towards interactive use, and a command line interface (CLI). The general development trajectory is to disentangle the code, and form a more modular, layered software system that comprises:
- dedicated applications providing a CLI, a graphical user interface (GUI), and a Python-based command API for interactive use and scripting
- a collection of topical extension packages
- utility libraries for a DataLad framework of closely aligned implementations
- another utility library with generic algorithms and implementations, not considered to be part of the DataLad framework
The schema below depicts the envisioned relationships and dependencies between these components (solid arrows indicate dependencies, dashed arrows indicate optional usage).
graph LR;
    subgraph "Non-framework<br>utility libraries"
        salad("datasalad")
    end
    subgraph DataLad framework
        core("datalad-core")
        next("datalad-next")
    end
    subgraph "DataLad<br>applications"
        dlcmd("dlcmd (CLI)")
        gooey("gooey (GUI)")
        py("Python<br>Command API")
    end
    subgraph "(3rd-party)<br>Extension packages"
        extension("datalad-${extension}")
    end

    salad ---> core
    salad ---> next
    salad ---> extension
    core --> next
    core --> extension
    next -.-> extension
    core --> dlcmd
    next -.-> dlcmd
    extension -.-> dlcmd
    core --> gooey
    next -.-> gooey
    extension -.-> gooey
    core --> py
    next -.-> py
    extension -.-> py

    %% node links to websites
    click salad href "https://github.com/datalad/datasalad"
    click core href "https://github.com/datalad/datalad-core"
    click next href "https://github.com/datalad/datalad-next"
    click dlcmd href "/dev/dlcmd/"
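The layering in the diagram can also be written down as data and checked mechanically. The following is a minimal sketch, not part of any DataLad package: it assumes that an arrow points from a dependency to the component that uses it (so `salad ---> core` is read as "datalad-core depends on datasalad"), and the short labels `extension` and `python-api` are shorthand introduced here for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Edges transcribed from the diagram, stored as: component -> what it needs.
# Solid arrows (required dependencies):
REQUIRED = {
    "datasalad": set(),
    "datalad-core": {"datasalad"},
    "datalad-next": {"datasalad", "datalad-core"},
    "extension": {"datasalad", "datalad-core"},
    "dlcmd": {"datalad-core"},
    "gooey": {"datalad-core"},
    "python-api": {"datalad-core"},
}
# Dashed arrows (optional usage):
OPTIONAL = {
    "extension": {"datalad-next"},
    "dlcmd": {"datalad-next", "extension"},
    "gooey": {"datalad-next", "extension"},
    "python-api": {"datalad-next", "extension"},
}


def layer_order(required, optional=None):
    """Return components ordered so that each one follows everything it uses.

    Raises graphlib.CycleError if the combined graph contains a cycle,
    which would mean the layering is violated.
    """
    merged = {name: set(deps) for name, deps in required.items()}
    for name, deps in (optional or {}).items():
        merged.setdefault(name, set()).update(deps)
    return list(TopologicalSorter(merged).static_order())
```

Calling `layer_order(REQUIRED, OPTIONAL)` yields one valid bottom-up ordering of the components. The relative order of components within the same layer (for example the three applications) is not fixed.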
Targeted components
Non-framework utility libraries
Such libraries hold implementations developed by the DataLad project and for the DataLad project that are nevertheless so generic that they are not considered to be part of the DataLad framework. This means:
- no DataLad jargon in messages
- no dependencies on other DataLad components
- no use of DataLad facilities (e.g., recognition of DataLad-specific configuration)
A concrete library example is datasalad, which provides tooling to work with subprocesses.
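To illustrate what the three constraints above imply for code, here is a minimal sketch of a generic subprocess helper. It is not datasalad's actual API; the names `run_lines` and `CommandError` are hypothetical. It uses only the Python standard library, reads no DataLad configuration, and its error message contains no DataLad-specific terms.

```python
import subprocess


class CommandError(RuntimeError):
    """Raised when a command exits with a non-zero status (hypothetical name)."""


def run_lines(cmd):
    """Run `cmd` (a list of strings) and return its stdout as a list of lines."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if proc.returncode != 0:
        # Generic wording only: nothing about datasets, remotes, or annexes.
        raise CommandError(
            f"command {cmd!r} exited with status {proc.returncode}: "
            f"{proc.stderr.strip()}"
        )
    return proc.stdout.splitlines()
```

A caller inside the DataLad framework could wrap such a helper and translate `CommandError` into framework-specific messaging, which keeps the jargon out of the generic layer.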
DataLad framework libraries
These libraries provide everything necessary to implement DataLad commands and have them work in a uniform fashion. This includes aspects like configuration management, particular workflows (e.g., credential input and storage), and working with git(-annex) repositories in a particular “DataLad way”.
We distinguish two different libraries: core and next.
The core library provides the essential set of DataLad functionality that is broadly applicable to the widest range of use cases.
It aims to have a lean dependency footprint to enable deploying DataLad in a wide range of environments.
The current development state is available at https://github.com/datalad/datalad-core.
The next library serves the same purpose and scope as the core library.
It is, however, a staging area for making new and improved implementations available before they may migrate into the core library.
While core evolves at a comparatively slow pace, next is expected to have a much higher frequency of feature releases.
The current development state is available at https://github.com/datalad/datalad-next.
Topical DataLad extension packages can use both libraries to implement their functionality.
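Because next is an optional, faster-moving companion to core, an extension may want to detect at runtime which of the two is installed. The sketch below shows one way to do that with the standard library only. The helper name `first_available` is hypothetical, and the import names `datalad_next` and `datalad_core` mentioned in the usage note are assumptions about how the packages are importable; nothing here calls into either library.

```python
from importlib.util import find_spec


def first_available(candidates):
    """Return the first importable top-level module name, or None.

    Only top-level names are supported; find_spec() would try to import
    the parent package when given a dotted name.
    """
    for name in candidates:
        if find_spec(name) is not None:
            return name
    return None
```

An extension could call `first_available(["datalad_next", "datalad_core"])` to prefer the staging library when present and fall back to the core library otherwise.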
User interfaces
The libraries are accompanied by applications that provide concrete user interfaces. These (can) include:
- command line interface (CLI)
- graphical user interface (GUI)
- language-bindings or scripting interfaces
Such interface applications could be lean (only proxying library functionality), or heavily tailored for a specific purpose. There is no assumption of exclusivity. For example, there can be any number of CLI implementations.
In order to cleanly separate the underlying requirements and dependencies, even a “Python command API” is distinguished from the framework libraries (also written in Python). Only the former will define aspects like a uniform logging/messaging behavior.
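The separation described above follows a common Python pattern: library code emits log records but never decides how they are displayed, while the application layer configures output. The following is a minimal sketch of that pattern under stated assumptions; the logger name `example_lib` and the functions `count_paths` and `configure_messaging` are hypothetical and do not correspond to actual DataLad code.

```python
import logging

# --- library layer (hypothetical) -----------------------------------------
lib_log = logging.getLogger("example_lib")
# A NullHandler keeps the library silent unless an application opts in.
lib_log.addHandler(logging.NullHandler())


def count_paths(paths):
    """Library code: return data and log, but never configure output."""
    lib_log.debug("counting %d paths", len(paths))
    return len(paths)


# --- application layer (hypothetical) -------------------------------------
def configure_messaging(stream, level=logging.DEBUG):
    """Application code: decide the format and destination of messages."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
    lib_log.addHandler(handler)
    lib_log.setLevel(level)
    return handler
```

With this split, a CLI, a GUI, and a Python command API can each call `configure_messaging` (or an equivalent of their own) with different formats, while the library function stays unchanged.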
Topical extension packages
Extension packages extend DataLad with additional functionality. Many extensions are provided by the DataLad project, but they can also be implemented completely independently of the project: they require no approval and generally need no coordination with the DataLad project.
Any functionality that is out-of-scope for the DataLad framework libraries can be implemented in an extension package.
Extension development is facilitated by a project template at https://github.com/datalad/datalad-extension-template.
Examples of extension packages are