Project
This page covers a dltHub feature that requires a license. Join our early access program for a trial license.
A dlt+ Project offers developers a declarative approach for defining data workflow components: sources, destinations, pipelines, transformations, parameters, etc. It follows an opinionated structure centered around a YAML manifest file, dlt.yml, where all dlt entities are defined and configured in an organized way. The manifest file acts as a single source of truth for data pipelines, keeping all teams aligned.
The project layout has the following components:
- A dlt manifest file (dlt.yml), which specifies data platform entities such as sources, destinations, pipelines, and transformations, along with their respective configurations.
- A .dlt folder with secrets and other information, backward compatible with open source dlt.
- Python modules with source code and tests. We propose a strict layout of the modules (e.g., source code is in the sources/ folder).
- A _data folder (excluded from git) where pipeline working directories and local destination files (e.g., filesystem, duckdb databases) are kept.
A general dlt+ project has the following structure:
.
├── .dlt/                 # your dlt secrets
│   ├── dev.secrets.toml
│   └── secrets.toml
├── _data/                # local storage for your project, excluded from git
├── sources/              # your sources, contains the code for the arrow source
│   └── arrow.py
├── .gitignore
├── requirements.txt
└── dlt.yml               # the main project manifest
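As a quick sanity check, the layout above can be verified with a few lines of standard-library Python. This is only a minimal sketch and not part of dlt+ itself: the helper name missing_entries and the EXPECTED_ENTRIES tuple are hypothetical, and the check assumes that all four top-level entries from the tree are required. In practice _data/ may only appear after a pipeline has run, so a missing _data/ is not necessarily an error.

```python
from pathlib import Path

# Top-level entries named in the layout above; a trailing slash marks a directory.
# Assumption: all four are treated as required for this illustrative check.
EXPECTED_ENTRIES = ("dlt.yml", ".dlt/", "sources/", "_data/")


def missing_entries(project_root: Path) -> list[str]:
    """Return the expected entries that are absent from project_root."""
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = project_root / entry.rstrip("/")
        # Directories and files are checked differently so that a file named
        # "sources" does not pass for the sources/ folder.
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    # Report anything missing from the current working directory.
    for entry in missing_entries(Path.cwd()):
        print(f"missing: {entry}")
```

Running the script from the project root prints one line per missing entry and prints nothing when the layout is complete.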
Read more about dlt+ Project in the project feature description.
To get started with a dlt+ Project and learn how to manage it using CLI commands, check out our tutorial.