Developer Guide¶
All commits to main should go through a PR. CI checks should pass before the PR is merged. Commits are squashed on merge. PR titles should follow Conventional Commits.
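To check a PR title locally before opening the PR, a small shell function along these lines may help. This is only a sketch: the function name `is_conventional_title` is hypothetical, and the list of accepted types is an assumption (the commonly used set), not necessarily what the CI check in this repository enforces.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the title looks like a
# Conventional Commits header of the form "type(scope)!: description".
# The type list below is an assumption; adjust it to what CI accepts.
is_conventional_title() {
    printf '%s\n' "$1" | grep -Eq \
        '^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\([^()]+\))?!?: .+'
}

# Example usage
is_conventional_title "feat(catalog): add undrop endpoint" && echo "ok"
is_conventional_title "Add undrop endpoint" || echo "rejected"
```

The scope in parentheses and the `!` marker for breaking changes are both optional in this pattern.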
Foundation & CLA¶
We hate red tape. Currently, all committers need to sign the CLA on GitHub. To ensure the future of Lakekeeper, we want to donate the project to a foundation. We are not sure yet whether this is going to be Apache, Linux, a Lakekeeper foundation or something else. For now, we prefer to spend our time on adding cool new features to Lakekeeper, but we will revisit this topic during 2026.
Quickstart¶
# start postgres
docker run -d --name postgres-15 -p 5432:5432 -e POSTGRES_PASSWORD=postgres postgres:15
# set envs
echo 'export DATABASE_URL=postgresql://postgres:postgres@localhost:5432/postgres' > .env
echo 'export ICEBERG_REST__PG_ENCRYPTION_KEY="abc"' >> .env
echo 'export ICEBERG_REST__PG_DATABASE_URL_READ="postgresql://postgres:postgres@localhost/postgres"' >> .env
echo 'export ICEBERG_REST__PG_DATABASE_URL_WRITE="postgresql://postgres:postgres@localhost/postgres"' >> .env
source .env
# migrate db
cd crates/iceberg-catalog
sqlx database create && sqlx migrate run
cd ../..
# run tests
cargo test --all-features --all-targets
# run clippy
cargo clippy --all-features --all-targets
This quickstart does not run tests against cloud-storage providers or KV2. For that, please refer to the sections below.
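If tests fail right away, one thing worth ruling out is that `.env` was not sourced in the current shell. The sketch below lists any of the quickstart variables that are unset or empty. The helper name `missing_env` is hypothetical; the variable names are the ones written to `.env` above.

```shell
#!/bin/sh
# Hypothetical helper: prints the names of unset or empty variables,
# one per line. Returns 0 if all given variables are set, 1 otherwise.
missing_env() {
    status=0
    for name in "$@"; do
        # Indirect variable lookup that also works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

# Variable names taken from the quickstart above.
if missing_env DATABASE_URL ICEBERG_REST__PG_ENCRYPTION_KEY \
    ICEBERG_REST__PG_DATABASE_URL_READ ICEBERG_REST__PG_DATABASE_URL_WRITE; then
    echo "environment looks complete"
else
    echo "variables listed above are missing; try: source .env" >&2
fi
```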
Developing with docker compose¶
The following shell snippet will start a full development environment including the catalog plus its dependencies and a jupyter server with spark. The iceberg-catalog and its migrations will be built from source. This can be useful for development and testing.
You may then head to localhost:8888 and try out one of the notebooks.
Working with SQLx¶
This crate uses sqlx. For development and compilation, a Postgres database is required. You can use Docker to launch one:
The crates/iceberg-catalog folder contains a .env.sample file. Copy this file to .env and add your database credentials if they differ.
Run:
KV2 / Vault¶
This catalog supports KV2 as a backend for secrets. Tests for KV2 are disabled by default. To enable them, you need to run the following commands:
docker run -d -p 8200:8200 --cap-add=IPC_LOCK -e 'VAULT_DEV_ROOT_TOKEN_ID=myroot' -e 'VAULT_DEV_LISTEN_ADDRESS=0.0.0.0:8200' hashicorp/vault
# append some more env vars to the .env file, it should already have PG related entries defined above.
# this will enable the KV2 tests
echo 'export TEST_KV2=1' >> .env
# the values below configure KV2
echo 'export ICEBERG_REST__KV2__URL="http://localhost:8200"' >> .env
echo 'export ICEBERG_REST__KV2__USER="test"' >> .env
echo 'export ICEBERG_REST__KV2__PASSWORD="test"' >> .env
echo 'export ICEBERG_REST__KV2__SECRET_MOUNT="secret"' >> .env
source .env
# setup vault
./tests/vault-setup.sh http://localhost:8200
cargo test --all-features --all-targets
Test cloud storage profiles¶
Currently, we're not aware of a good way of testing cloud storage integration against local deployments. That means, in order to test against AWS S3, GCS and ADLS Gen2, you need to set the following environment variables. For more information take a look at the Storage Guide. A sample .env could look like this:
# TEST_AZURE=<some-value> controls a proc macro which either includes or excludes the azure tests
# if you compiled without TEST_AZURE, you'll have to change a file or do a cargo clean before rerunning tests. The same applies for the TEST_AWS and TEST_MINIO env vars.
export TEST_AZURE=1
export AZURE_TENANT_ID=<your tenant id>
export AZURE_CLIENT_ID=<your entra id app registration client id>
export AZURE_CLIENT_SECRET=<your entra id app registration client secret>
export AZURE_STORAGE_ACCOUNT_NAME=<your azure storage account name>
export AZURE_STORAGE_FILESYSTEM=<your azure adls filesystem name>
export TEST_AWS=1
export AWS_S3_BUCKET=<your aws s3 bucket>
export AWS_S3_REGION=<your aws s3 region>
# replace with actual values
export AWS_S3_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_S3_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_S3_STS_ROLE_ARN=arn:aws:iam::123456789012:role/role-name
# the values below should work with the default minio in our docker-compose
export TEST_MINIO=1
export LAKEKEEPER_TEST__S3_BUCKET=tests
export LAKEKEEPER_TEST__S3_REGION=local
export LAKEKEEPER_TEST__S3_ACCESS_KEY=minio-root-user
export LAKEKEEPER_TEST__S3_SECRET_KEY=minio-root-password
export LAKEKEEPER_TEST__S3_ENDPOINT=http://localhost:9000
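Because the TEST_AZURE, TEST_AWS and TEST_MINIO flags are evaluated at compile time, it is easy to forget the `cargo clean` after toggling one of them. The sketch below remembers the flag state from the previous run and reports when it changed. Both function names and the state file are hypothetical, and it assumes a flag counts as enabled when it is set to a non-empty value.

```shell
#!/bin/sh
# Hypothetical helper: summarises which cloud test flags are enabled.
# Assumption: a flag is "on" when it is set to a non-empty value.
test_flags_state() {
    printf 'azure=%s aws=%s minio=%s\n' \
        "${TEST_AZURE:+on}" "${TEST_AWS:+on}" "${TEST_MINIO:+on}"
}

# Hypothetical helper: succeeds (exit 0) if the flags differ from the
# state recorded in the given file, then records the current state.
flags_changed() {
    state_file="${1:-.test-flags}"
    current="$(test_flags_state)"
    previous=""
    if [ -f "$state_file" ]; then
        previous="$(cat "$state_file")"
    fi
    printf '%s\n' "$current" > "$state_file"
    [ "$current" != "$previous" ]
}

# Example usage (writes the hypothetical state file .test-flags):
#   source .env
#   if flags_changed; then cargo clean; fi
```

The first call always reports a change, since no previous state has been recorded yet.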
You may then run a test via:
Running integration tests¶
Please check the Integration Test Docs.
Extending Authz¶
When adding a new endpoint, you may need to extend the authorization model. Please check the Authorization Docs for more information. For OpenFGA, you'll have to perform the following steps:
- extend the respective enum in crate::service::authz by adding the new action, e.g. crate::service::authz::CatalogViewAction::CanUndrop
- add the relation to crate::service::authz::implementations::openfga::relations, e.g. add ViewRelation::CanUndrop
- add the mapping from the implementations type to the service type in openfga::relations, e.g. CatalogViewAction::CanUndrop => ViewRelation::CanUndrop
- create a new authz schema version by copying the latest existing one, e.g. authz/openfga/v1/ to authz/openfga/v2/
- apply your changes, e.g. add define can_undrop: modify to the view type in authz/openfga/v2/schema.fga
- create a diff between the old and new schema via diff -u authz/openfga/v1/schema.fga authz/openfga/v2/schema.fga > authz/openfga/v2/changed.diff to help your reviewers
- regenerate schema.json via ./fga model transform --file authz/openfga/v2/schema.fga > authz/openfga/v2/schema.json (download the fga binary from the OpenFGA repo)
- Head to crate::service::authz::implementations::openfga::models.rs, extend CollaborationModels with a field for your version, e.g., v2, and then add your new model version at the top of the file, like:
    const V2_MODEL: &str = include_str!("../../../../../../../authz/openfga/v2/schema.json");
    static MODEL: LazyLock<CollaborationModels> = LazyLock::new(|| CollaborationModels {
        v1: serde_json::from_str(V1_MODEL).expect("Failed to parse OpenFGA model V1 as JSON"),
        // this is your added model below
        v2: serde_json::from_str(V2_MODEL).expect("Failed to parse OpenFGA model V2 as JSON"),
    });
- set your model as the active model like: const ACTIVE_MODEL: ModelVersion = ModelVersion::V2;
- implement the migration in crate::service::authz::implementations::openfga::migrations::migrate like:
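For the schema diff step in the list above (not the migration itself), note that `diff` exits with status 1 whenever the two files differ, which is the expected outcome here but aborts scripts running under `set -e`. A minimal sketch of a wrapper that tolerates this is shown below; the function name `write_schema_diff` is hypothetical, and the directory layout follows the authz/openfga/v1 and authz/openfga/v2 example paths used in the list.

```shell
#!/bin/sh
# Hypothetical helper: writes a unified diff between two schema versions
# into <new_dir>/changed.diff.
# diff exits with 0 (identical) or 1 (different) on success; anything
# above 1 signals a real error such as a missing file.
write_schema_diff() {
    old_dir="$1"
    new_dir="$2"
    rc=0
    diff -u "$old_dir/schema.fga" "$new_dir/schema.fga" \
        > "$new_dir/changed.diff" || rc=$?
    [ "$rc" -le 1 ]
}

# Example usage, run from the repository root:
#   write_schema_diff authz/openfga/v1 authz/openfga/v2
```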