The Imaging Platform is meant to support the developer throughout the typical stages of an AI project workflow. The following is a simple representation of the common AI workflow:
The content of the Datasets is stored outside the platform: several Storage Sites can be configured and connected to the Imaging Platform as external repositories. Within the Data acquisition stage, the platform provides a common way of creating an image collection, ensures that the collection is managed under a version control tool (DVC), and integrates it into the project workflow by means of GitLab.
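To make the split between versioned metadata and externally stored content more concrete, here is a minimal Python sketch. It is not the platform's or DVC's actual implementation: the helpers `make_pointer` and `external_key` and the key layout are hypothetical. It only illustrates the general idea that the repository keeps a small, content-derived pointer while the image bytes live in a Storage Site.

```python
import hashlib
from pathlib import Path


def make_pointer(data_file: Path) -> dict:
    """Build a small pointer record for a file whose content is stored elsewhere.

    Hypothetical helper: the record is what would be committed to the
    repository instead of the (potentially large) image file itself.
    """
    content = data_file.read_bytes()
    return {
        "path": data_file.name,
        "md5": hashlib.md5(content).hexdigest(),  # content hash identifies the data
        "size": len(content),
    }


def external_key(pointer: dict) -> str:
    """Derive a content-addressed key inside a Storage Site.

    The two-character prefix layout is an assumption made for illustration.
    """
    digest = pointer["md5"]
    return f"{digest[:2]}/{digest[2:]}"
```

Because the pointer depends only on the file content, two versions of a dataset that share an image would also share the same stored object under this scheme, and the repository history stays small.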
The platform is also designed to define and promote a common structure for project development and experiment tracking, so that the output components can be easily understood and reused.
More specifically, and according to the typical stages depicted above, the Imaging Platform provides support via the following workflow mappings:
As with Storage Sites, the platform can rely on external Execution endpoints, intended to perform resource-intensive tasks (training and testing).
The platform distinguishes between Datasets and Projects, but both are git repositories hosted on GitLab with DVC enabled. The relationship between them is tracked through fork relationships.
As an example, this diagram illustrates what a real structure looks like.
The idea behind the pattern is to easily track which projects are based on a dataset and which projects generate different datasets.
This fork relationship offers a great advantage for code synchronization.
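The fork pattern can be sketched as a small in-memory graph. The following Python example is illustrative only and does not call GitLab: the `Repo` class, the helper functions, and the repository names are hypothetical. It also assumes that a Project is created as a fork of a Dataset and that a Dataset generated by a Project is created as a fork of that Project.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Repo:
    """A repository node in the fork graph (hypothetical model)."""
    name: str
    kind: str  # "dataset" or "project"
    forked_from: Optional["Repo"] = None
    forks: List["Repo"] = field(default_factory=list)

    def fork(self, name: str, kind: str) -> "Repo":
        """Create a child repository and record the fork relationship."""
        child = Repo(name, kind, forked_from=self)
        self.forks.append(child)
        return child


def projects_based_on(dataset: Repo) -> List[Repo]:
    """Projects that were forked directly from the given dataset."""
    return [r for r in dataset.forks if r.kind == "project"]


def datasets_generated_by(project: Repo) -> List[Repo]:
    """Datasets that were forked directly from the given project."""
    return [r for r in project.forks if r.kind == "dataset"]


def lineage(repo: Repo) -> List[str]:
    """Names from the root repository down to the given one."""
    names = []
    node: Optional[Repo] = repo
    while node is not None:
        names.append(node.name)
        node = node.forked_from
    return list(reversed(names))


# Example with made-up repository names
base = Repo("chest-xray", "dataset")
project = base.fork("segmentation", "project")
derived = project.fork("chest-xray-masks", "dataset")
```

With this structure, `projects_based_on(base)` answers "which projects use this dataset", `datasets_generated_by(project)` answers "which datasets did this project produce", and `lineage(derived)` walks the fork chain back to the original dataset.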