Configuration¶
Lakekeeper is configured via environment variables. Settings listed on this page are shared between all projects and warehouses. For Lakekeeper versions prior to 0.5.0, prefix all environment variables with ICEBERG_REST__ instead of LAKEKEEPER__.
For most deployments, we recommend setting at least the following variables: LAKEKEEPER__PG_DATABASE_URL_READ, LAKEKEEPER__PG_DATABASE_URL_WRITE, LAKEKEEPER__PG_ENCRYPTION_KEY.
Routing and Base-URL¶
Some Lakekeeper endpoints return links pointing at Lakekeeper itself. By default, these links are generated using the x-forwarded-host, x-forwarded-proto, x-forwarded-port and x-forwarded-prefix headers. If these are not present, the host header is used. If this does not work for you, you may set the LAKEKEEPER__BASE_URI environment variable to the base-URL where Lakekeeper is externally reachable. This may be necessary if Lakekeeper runs behind a reverse proxy or load balancer and you cannot set the headers accordingly. In general, we recommend relying on the headers. To respect the host header but not the x-forwarded-* headers, set LAKEKEEPER__USE_X_FORWARDED_HEADERS to false.
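The precedence described above can be sketched roughly as follows. This is an illustration only, not Lakekeeper's implementation: the function name `resolve_base_url` is hypothetical, and the assumptions that a configured base URI overrides the headers, that the scheme falls back to `http`, and that the forwarded port is appended only when the host carries none are ours.

```python
from typing import Mapping, Optional


def resolve_base_url(
    headers: Mapping[str, str],
    base_uri: Optional[str] = None,
    use_x_forwarded_headers: bool = True,
) -> str:
    """Return the base URL used for self-referencing links (illustrative only)."""
    # Header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}

    # 1. Assumption: an explicitly configured base URI wins over all headers.
    if base_uri:
        return base_uri.rstrip("/")

    # 2. Otherwise prefer the x-forwarded-* headers, if enabled and present.
    if use_x_forwarded_headers and "x-forwarded-host" in h:
        proto = h.get("x-forwarded-proto", "http")  # assumed fallback scheme
        host = h["x-forwarded-host"]
        port = h.get("x-forwarded-port")
        prefix = h.get("x-forwarded-prefix", "").rstrip("/")
        # Assumption: append the port only if the host does not carry one.
        authority = f"{host}:{port}" if port and ":" not in host else host
        return f"{proto}://{authority}{prefix}"

    # 3. Fall back to the plain host header.
    return f"http://{h['host']}"
```

With `use_x_forwarded_headers=False`, the forwarded headers are ignored and only the host header is consulted, which mirrors setting LAKEKEEPER__USE_X_FORWARDED_HEADERS to false.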
General¶
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__BASE_URI | https://example.com:8181 | Optional base-URL where the catalog is externally reachable. Default: None. See Routing and Base-URL. |
| LAKEKEEPER__ENABLE_DEFAULT_PROJECT | true | If true, the NIL Project ID ("00000000-0000-0000-0000-000000000000") is used as a default if the user does not specify a project when connecting. This option is enabled by default, which we recommend for all single-project (single-tenant) setups. Default: true. |
| LAKEKEEPER__RESERVED_NAMESPACES | system,examples,information_schema | Reserved namespaces that cannot be created via the REST interface. |
| LAKEKEEPER__METRICS_PORT | 9000 | Port where the Prometheus metrics endpoint is reachable. Default: 9000 |
| LAKEKEEPER__LISTEN_PORT | 8181 | Port Lakekeeper listens on. Default: 8181 |
| LAKEKEEPER__BIND_IP | 0.0.0.0, ::1, :: | IP address Lakekeeper binds to. Default: 0.0.0.0 (listen to all incoming IPv4 packets) |
| LAKEKEEPER__SECRET_BACKEND | postgres | The secret backend to use. If kv2 (Hashicorp KV Version 2) is chosen, you need to provide additional parameters. Default: postgres, one-of: [postgres, kv2] |
| LAKEKEEPER__SERVE_SWAGGER_UI | true | If true, Lakekeeper serves a Swagger UI for the management & catalog OpenAPI specs under /swagger-ui |
| LAKEKEEPER__ALLOW_ORIGIN | * | A comma separated list of allowed origins for CORS. |
| LAKEKEEPER__USE_X_FORWARDED_HEADERS | false | If true, Lakekeeper respects the x-forwarded-host, x-forwarded-proto, x-forwarded-port and x-forwarded-prefix headers in incoming requests. This is mostly relevant for the /config endpoint. Default: true (Headers are respected.) |
Pagination¶
Lakekeeper has default values for default and max page sizes of paginated queries. These are safeguards against malicious requests and the problems related to large page sizes described below.
The REST catalog spec requires servers to return all results if pageToken is not set in the request. To obtain that behavior, set LAKEKEEPER__PAGINATION_SIZE_MAX to 4294967295, which corresponds to u32::MAX. Larger page sizes would lead to practical problems. Things to keep in mind:
- Retrieving huge numbers of rows is expensive, which might be exploited by malicious requests.
- Requests may time out or responses may exceed size limits for huge numbers of results.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__PAGINATION_SIZE_DEFAULT | 1024 | The default page size used for paginated queries. This value is used if the request's pageToken is set but empty. Default: 100 |
| LAKEKEEPER__PAGINATION_SIZE_MAX | 2048 | The max page size used for paginated queries. This value is used if the request's pageToken is not set. Default: 1000 |
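As a rough sketch of how the two settings interact, the function below picks a page size from the request parameters. The name `effective_page_size` is hypothetical, and the handling of an explicit page size (clamping it to the maximum) is an assumption; only the two "pageToken" cases are taken from the descriptions above.

```python
from typing import Optional

PAGINATION_SIZE_DEFAULT = 100  # LAKEKEEPER__PAGINATION_SIZE_DEFAULT
PAGINATION_SIZE_MAX = 1000     # LAKEKEEPER__PAGINATION_SIZE_MAX


def effective_page_size(page_token: Optional[str], page_size: Optional[int]) -> int:
    """Pick the page size for one request (illustrative only)."""
    if page_size is not None:
        # Assumption: an explicit page size is clamped to the configured max.
        return min(page_size, PAGINATION_SIZE_MAX)
    if page_token is None:
        # pageToken not set: the max page size applies.
        return PAGINATION_SIZE_MAX
    # pageToken set (for example, empty): the default page size applies.
    return PAGINATION_SIZE_DEFAULT
```

Raising the maximum to 4294967295 in this sketch makes the "pageToken not set" branch return effectively everything, which is the behavior the REST catalog spec asks for.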
Storage¶
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__ENABLE_AWS_SYSTEM_CREDENTIALS | true | Lakekeeper supports using AWS system identities (i.e. through AWS_* environment variables or EC2 instance profiles) as storage credentials for warehouses. This feature is disabled by default to prevent accidental access to restricted storage locations. To enable AWS system identities, set LAKEKEEPER__ENABLE_AWS_SYSTEM_CREDENTIALS to true. Default: false (AWS system credentials disabled) |
| LAKEKEEPER__S3_ENABLE_DIRECT_SYSTEM_CREDENTIALS | true | By default, when using AWS system credentials, users must specify an assume-role-arn for Lakekeeper to assume when accessing S3. Setting this option to true allows Lakekeeper to use system credentials directly without role assumption, meaning the system identity must have direct access to warehouse locations. Default: false (direct system credential access disabled) |
| LAKEKEEPER__S3_REQUIRE_EXTERNAL_ID_FOR_SYSTEM_CREDENTIALS | true | Controls whether an external-id is required when assuming a role with AWS system credentials. External IDs provide additional security when cross-account role assumption is used. Default: true (external ID required) |
| LAKEKEEPER__ENABLE_AZURE_SYSTEM_CREDENTIALS | true | Lakekeeper supports using Azure system identities (i.e. through AZURE_* environment variables or VM managed identities) as storage credentials for warehouses. This feature is disabled by default to prevent accidental access to restricted storage locations. To enable Azure system identities, set LAKEKEEPER__ENABLE_AZURE_SYSTEM_CREDENTIALS to true. Default: false (Azure system credentials disabled) |
| LAKEKEEPER__ENABLE_GCP_SYSTEM_CREDENTIALS | true | Lakekeeper supports using GCP system identities (i.e. through GOOGLE_APPLICATION_CREDENTIALS environment variables or the Compute Engine Metadata Server) as storage credentials for warehouses. This feature is disabled by default to prevent accidental access to restricted storage locations. To enable GCP system identities, set LAKEKEEPER__ENABLE_GCP_SYSTEM_CREDENTIALS to true. Default: false (GCP system credentials disabled) |
Persistence Store¶
Currently Lakekeeper supports only Postgres as a persistence store. You may either provide connection strings using PG_DATABASE_URL_* or use the PG_* environment variables. Connection strings take precedence. Postgres needs to be Version 15 or higher.
Lakekeeper supports configuring separate database URLs for read and write operations, allowing you to utilize read replicas for better scalability. By directing read queries to dedicated replicas via LAKEKEEPER__PG_DATABASE_URL_READ, you can significantly reduce load on your database primary (specified by LAKEKEEPER__PG_DATABASE_URL_WRITE), improving overall system performance as your deployment scales. This separation is particularly beneficial for read-heavy workloads.
When using read replicas, be aware that replication lag may occur between the primary and replica databases, depending on your database setup. Immediately after a write operation, the changes might therefore not yet be visible when querying a read-only Lakekeeper endpoint (which uses the read replica). Consider this potential lag when designing applications that require immediate read-after-write consistency. For deployments where read-after-write consistency is critical, omit the LAKEKEEPER__PG_DATABASE_URL_READ setting, which causes all operations to use the primary database connection.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__PG_DATABASE_URL_READ | postgres://postgres:password@localhost:5432/iceberg | Postgres Database connection string used for reading. Defaults to LAKEKEEPER__PG_DATABASE_URL_WRITE. |
| LAKEKEEPER__PG_DATABASE_URL_WRITE | postgres://postgres:password@localhost:5432/iceberg | Postgres Database connection string used for writing. If LAKEKEEPER__PG_DATABASE_URL_READ is not specified, this connection is also used for reading. |
| LAKEKEEPER__PG_ENCRYPTION_KEY | This is unsafe, please set a proper key | If LAKEKEEPER__SECRET_BACKEND=postgres, this key is used to encrypt secrets. It is required to change this for production deployments. |
| LAKEKEEPER__PG_READ_POOL_CONNECTIONS | 10 | Number of connections in the read pool |
| LAKEKEEPER__PG_WRITE_POOL_CONNECTIONS | 5 | Number of connections in the write pool |
| LAKEKEEPER__PG_HOST_R | localhost | Hostname for read operations. Defaults to LAKEKEEPER__PG_HOST_W. |
| LAKEKEEPER__PG_HOST_W | localhost | Hostname for write operations |
| LAKEKEEPER__PG_PORT | 5432 | Port number |
| LAKEKEEPER__PG_USER | postgres | Username for authentication |
| LAKEKEEPER__PG_PASSWORD | password | Password for authentication |
| LAKEKEEPER__PG_DATABASE | iceberg | Database name |
| LAKEKEEPER__PG_SSL_MODE | require | SSL mode (disable, allow, prefer, require) |
| LAKEKEEPER__PG_SSL_ROOT_CERT | /path/to/root/cert | Path to SSL root certificate |
| LAKEKEEPER__PG_ENABLE_STATEMENT_LOGGING | true | Enable SQL statement logging |
| LAKEKEEPER__PG_TEST_BEFORE_ACQUIRE | true | Test connections before acquiring from the pool |
| LAKEKEEPER__PG_CONNECTION_MAX_LIFETIME | 1800 | Maximum lifetime of connections in seconds |
| LAKEKEEPER__PG_ACQUIRE_TIMEOUT | 10 | Timeout to acquire a new postgres connection in seconds. Default: 5 |
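The fallback rules for the read and write connections can be sketched as below. This is a hedged illustration, not Lakekeeper's code: `resolve_pg_urls` is a hypothetical helper, the URL layout assembled from the PG_* parts and the port fallback of 5432 are assumptions, and credentials are not URL-encoded here.

```python
from typing import Mapping, Tuple


def resolve_pg_urls(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (read_url, write_url) from LAKEKEEPER__PG_* settings (illustrative)."""
    p = "LAKEKEEPER__PG_"

    def build(host: str) -> str:
        # Assumption: a plain postgres:// URL assembled from the PG_* parts.
        return "postgres://{user}:{pw}@{host}:{port}/{db}".format(
            user=env[p + "USER"],
            pw=env[p + "PASSWORD"],
            host=host,
            port=env.get(p + "PORT", "5432"),
            db=env[p + "DATABASE"],
        )

    # Connection strings take precedence over the individual PG_* variables.
    write_url = env.get(p + "DATABASE_URL_WRITE") or build(env[p + "HOST_W"])

    read_url = env.get(p + "DATABASE_URL_READ")
    if not read_url:
        if env.get(p + "DATABASE_URL_WRITE"):
            # No read URL given: reads use the write connection.
            read_url = write_url
        else:
            # PG_HOST_R defaults to PG_HOST_W.
            read_url = build(env.get(p + "HOST_R") or env[p + "HOST_W"])
    return read_url, write_url
```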
Vault KV Version 2¶
Configuration parameters if a Vault KV version 2 (i.e. Hashicorp Vault) compatible storage is used as a backend. Currently, we only support the userpass authentication method. Configuration may be passed as single values like LAKEKEEPER__KV2__URL=http://vault.local or as a compound value:
LAKEKEEPER__KV2='{url="http://localhost:1234", user="test", password="test", secret_mount="secret"}'
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__KV2__URL | https://vault.local | URL of the KV2 backend |
| LAKEKEEPER__KV2__USER | admin | Username to authenticate against the KV2 backend |
| LAKEKEEPER__KV2__PASSWORD | password | Password to authenticate against the KV2 backend |
| LAKEKEEPER__KV2__SECRET_MOUNT | kv/data/iceberg | Path to the secret mount in the KV2 backend |
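To make the relationship between the compound value and the single values concrete, the sketch below expands a compound string into the equivalent individual variable names. The helper `expand_compound` and its small regex parser are hypothetical and only handle flat `key="value"` pairs; Lakekeeper's own parsing may differ.

```python
import re
from typing import Dict


def expand_compound(prefix: str, compound: str) -> Dict[str, str]:
    """Turn '{key="value", ...}' into {PREFIX__KEY: value} (illustrative parser)."""
    # Matches key="value" pairs; keys may optionally be quoted.
    pairs = re.findall(r'"?([\w.]+)"?\s*=\s*"([^"]*)"', compound)
    return {f"{prefix}__{key.upper()}": value for key, value in pairs}


compound = '{url="http://localhost:1234", user="test", password="test", secret_mount="secret"}'
single_values = expand_compound("LAKEKEEPER__KV2", compound)
```

Here `single_values` maps LAKEKEEPER__KV2__URL, LAKEKEEPER__KV2__USER, LAKEKEEPER__KV2__PASSWORD and LAKEKEEPER__KV2__SECRET_MOUNT to the values from the compound string.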
Task Queues¶
Lakekeeper uses task queues internally to remove soft-deleted tabulars and purge tabular files. The following global configuration options are available:
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__TASK_POLL_INTERVAL | 3600ms/30s | Interval between polling for new tasks. Default: 10s. Supported units: ms (milliseconds) and s (seconds). Leaving the unit out is deprecated: the value then defaults to seconds, but this behavior is due to be removed in a future release. |
| LAKEKEEPER__TASK_TABULAR_EXPIRATION_WORKERS | 2 | Number of workers spawned to expire soft-deleted tables and views. |
| LAKEKEEPER__TASK_TABULAR_PURGE_WORKERS | 2 | Number of workers spawned to purge table files after dropping a table with the purge option. |
| LAKEKEEPER__TASK_EXPIRE_SNAPSHOTS_WORKERS | 2 | Number of workers spawned that work on expire Snapshots tasks. See Expire Snapshots Docs for more information. |
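The accepted formats for LAKEKEEPER__TASK_POLL_INTERVAL can be illustrated with a small parser. This is a sketch under the rules stated in the table (units ms and s, a bare number meaning seconds); the function name `parse_poll_interval` is hypothetical and the exact validation Lakekeeper performs is not shown here.

```python
import re
from datetime import timedelta


def parse_poll_interval(value: str) -> timedelta:
    """Parse values such as '3600ms', '30s' or the deprecated bare '10'."""
    match = re.fullmatch(r"\s*(\d+)\s*(ms|s)?\s*", value)
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if unit == "ms":
        return timedelta(milliseconds=amount)
    # 's' or no unit at all (deprecated): interpret the number as seconds.
    return timedelta(seconds=amount)
```

Always writing the unit explicitly (for example `10s`) avoids relying on the deprecated unit-less form.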
NATS¶
Lakekeeper can publish change events to NATS. The following configuration options are available:
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__NATS_ADDRESS | nats://localhost:4222 | The URL of the NATS server to connect to |
| LAKEKEEPER__NATS_TOPIC | iceberg | The subject to publish events to |
| LAKEKEEPER__NATS_USER | test-user | User to authenticate against NATS, needs LAKEKEEPER__NATS_PASSWORD |
| LAKEKEEPER__NATS_PASSWORD | test-password | Password to authenticate against NATS, needs LAKEKEEPER__NATS_USER |
| LAKEKEEPER__NATS_CREDS_FILE | /path/to/file.creds | Path to a file containing NATS credentials |
| LAKEKEEPER__NATS_TOKEN | xyz | NATS token to use for authentication |
Kafka¶
Lakekeeper uses rust-rdkafka to enable publishing events to Kafka.
The following features of rust-rdkafka are enabled:
- tokio
- zstd
- gssapi-vendored
- curl-static
- ssl-vendored
- libz-static
This means that all features of librdkafka are usable. All necessary dependencies are statically linked and cannot be disabled. If you want to use dynamic linking or disable a feature, you'll have to fork Lakekeeper and change the features accordingly. Please refer to the documentation of rust-rdkafka for details on how to enable dynamic linking or disable certain features.
To publish events to Kafka, set the following environment variables:
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__KAFKA_TOPIC | lakekeeper | The topic to which events are published |
| LAKEKEEPER__KAFKA_CONFIG | {"bootstrap.servers"="host1:port,host2:port","security.protocol"="SSL"} | librdkafka Configuration as "Dictionary". Note that you cannot use "JSON-Style-Syntax". Also see notes below |
| LAKEKEEPER__KAFKA_CONFIG_FILE | /path/to/config_file | librdkafka Configuration to be loaded from a file. Also see notes below |
Notes¶
LAKEKEEPER__KAFKA_CONFIG and LAKEKEEPER__KAFKA_CONFIG_FILE are mutually exclusive; their values are not merged. If both variables are set, LAKEKEEPER__KAFKA_CONFIG is used.
A LAKEKEEPER__KAFKA_CONFIG_FILE could look like this:
{
"bootstrap.servers"="host1:port,host2:port",
"security.protocol"="SASL_SSL",
"sasl.mechanisms"="PLAIN",
}
Checking configuration parameters is deferred to rdkafka.
Logging Cloudevents¶
Cloudevents can also be logged if you do not have NATS or Kafka up and running. This feature can be enabled by setting
LAKEKEEPER__LOG_CLOUDEVENTS=true
Authentication¶
To prohibit unwanted access to data, we recommend enabling Authentication.
Authentication is enabled if:
- LAKEKEEPER__OPENID_PROVIDER_URI is set, OR
- LAKEKEEPER__ENABLE_KUBERNETES_AUTHENTICATION is set to true
In Lakekeeper multiple Authentication mechanisms can be enabled together, for example OpenID + Kubernetes. Lakekeeper builds an internal Authenticator chain of up to three identity providers. Incoming tokens need to be JWT tokens - Opaque tokens are not yet supported. Incoming tokens are introspected, and each Authentication provider checks if the given token can be handled by this provider. If it can be handled, the token is authenticated against this provider, otherwise the next Authenticator in the chain is checked.
The following Authenticators are available. Enabled Authenticators are checked in order:
- OpenID / OAuth2
  Enabled if: LAKEKEEPER__OPENID_PROVIDER_URI is set
  Validates Token with: Locally with JWKS Keys fetched from the well-known configuration.
  Accepts JWT if (both must be true):
  - Issuer matches the issuer provided in the .well-known/openid-configuration of the LAKEKEEPER__OPENID_PROVIDER_URI OR issuer matches any of the LAKEKEEPER__OPENID_ADDITIONAL_ISSUERS.
  - If LAKEKEEPER__OPENID_AUDIENCE is specified, any of the configured audiences must be present in the token.
- Kubernetes
  Enabled if: LAKEKEEPER__ENABLE_KUBERNETES_AUTHENTICATION is true
  Validates Token with: Kubernetes TokenReview API
  Accepts JWT if:
  - Token audience matches any of the audiences provided in LAKEKEEPER__KUBERNETES_AUTHENTICATION_AUDIENCE
  - If LAKEKEEPER__KUBERNETES_AUTHENTICATION_AUDIENCE is not set, all tokens proceed to validation! We highly recommend configuring audiences; for most deployments, https://kubernetes.default.svc works.
- Kubernetes Legacy Tokens
  Enabled if: LAKEKEEPER__ENABLE_KUBERNETES_AUTHENTICATION is true and LAKEKEEPER__KUBERNETES_AUTHENTICATION_ACCEPT_LEGACY_SERVICEACCOUNT is true
  Validates Token with: Kubernetes TokenReview API
  Accepts JWT if:
  - The token's issuer is kubernetes/serviceaccount or https://kubernetes.default.svc.cluster.local
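The chain described above can be pictured as a routing step that inspects the unverified token claims and picks the first authenticator that accepts them. The sketch below is an illustration under that reading, not Lakekeeper's implementation: the function names and the `cfg` dictionary keys are hypothetical, and the actual verification (JWKS signature check or Kubernetes TokenReview) is deliberately left out.

```python
import base64
import json
from typing import Any, Dict, List, Optional

LEGACY_ISSUERS = {
    "kubernetes/serviceaccount",
    "https://kubernetes.default.svc.cluster.local",
}


def unverified_claims(token: str) -> Dict[str, Any]:
    """Decode the JWT payload WITHOUT verifying the signature (routing only)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _audiences(claims: Dict[str, Any]) -> List[str]:
    aud = claims.get("aud", [])
    return [aud] if isinstance(aud, str) else list(aud)


def pick_authenticator(claims: Dict[str, Any], cfg: Dict[str, Any]) -> Optional[str]:
    """Return the name of the first authenticator that would handle the token."""
    aud = set(_audiences(claims))
    iss = claims.get("iss")
    # 1. OpenID / OAuth2: trusted issuer and, if configured, a matching audience.
    if cfg.get("openid_issuers"):
        wanted = set(cfg.get("openid_audience", []))
        if iss in cfg["openid_issuers"] and (not wanted or aud & wanted):
            return "openid"
    if cfg.get("kubernetes"):
        # 2. Kubernetes: matching audience, or any token if none is configured.
        wanted = set(cfg.get("kubernetes_audience", []))
        if not wanted or aud & wanted:
            return "kubernetes"
        # 3. Kubernetes legacy tokens: matched by issuer only.
        if cfg.get("kubernetes_legacy") and iss in LEGACY_ISSUERS:
            return "kubernetes-legacy"
    return None
```

Note how an unset Kubernetes audience makes the second step accept every token that reaches it, which is why configuring audiences is recommended.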
If LAKEKEEPER__OPENID_PROVIDER_URI is specified, Lakekeeper will verify access tokens against this provider. The provider must provide the .well-known/openid-configuration endpoint and the openid-configuration needs to have jwks_uri and issuer defined.
Typical values for LAKEKEEPER__OPENID_PROVIDER_URI are:
- Keycloak: https://keycloak.local/realms/{your-realm}
- Entra-ID: https://login.microsoftonline.com/{your-tenant-id-here}/v2.0/
Please check the Authentication Guide for more details.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__OPENID_PROVIDER_URI | https://keycloak.local/realms/{your-realm} | OpenID Provider URL. Lakekeeper expects to find <LAKEKEEPER__OPENID_PROVIDER_URI>/.well-known/openid-configuration and load JWKS keys from there. Do not include the /.well-known/openid-configuration in the provided URL. |
| LAKEKEEPER__OPENID_AUDIENCE | the-client-id-of-my-app | If set, the aud of the provided token must match the value provided. Multiple allowed audiences can be provided as a comma separated list. |
| LAKEKEEPER__OPENID_ADDITIONAL_ISSUERS | https://sts.windows.net/<Tenant>/ | A comma separated list of additional issuers to trust. The issuer defined in the issuer field of the .well-known/openid-configuration is always trusted. LAKEKEEPER__OPENID_ADDITIONAL_ISSUERS has no effect if LAKEKEEPER__OPENID_PROVIDER_URI is not set. |
| LAKEKEEPER__OPENID_SCOPE | lakekeeper | Specify a scope that must be present in provided tokens received from the openid provider. |
| LAKEKEEPER__OPENID_SUBJECT_CLAIM | sub or oid,sub | Specify the claim(s) in the user's JWT used to identify a User. Accepts a single claim name or a comma-separated list of claim names; the first claim present in the token is used. By default Lakekeeper tries oid first, then falls back to sub. We strongly recommend setting this configuration explicitly in production deployments. Entra-ID users want to use oid; users from all other IdPs most likely want to use sub. |
| LAKEKEEPER__OPENID_ROLES_CLAIM | resource_access.lakekeeper.roles | Specify the claim to use in provided JWT tokens to extract roles. The field should contain an array of strings or a single string. Supports nested claims using dot notation, e.g., "resource_access.account.roles". Currently only has an effect when using the Cedar Authorizer. Requires a project ID to be set via the x-project-id header or LAKEKEEPER__DEFAULT_PROJECT_ID. |
| LAKEKEEPER__ENABLE_KUBERNETES_AUTHENTICATION | true | If true, kubernetes service accounts can authenticate to Lakekeeper. This option is compatible with LAKEKEEPER__OPENID_PROVIDER_URI - multiple IdPs (OIDC and Kubernetes) can be enabled simultaneously. |
| LAKEKEEPER__KUBERNETES_AUTHENTICATION_AUDIENCE | https://kubernetes.default.svc | Audiences that are expected in Kubernetes tokens. Only has an effect if LAKEKEEPER__ENABLE_KUBERNETES_AUTHENTICATION is true. |
| LAKEKEEPER__KUBERNETES_AUTHENTICATION_ACCEPT_LEGACY_SERVICEACCOUNT | false | Add an authenticator that handles tokens with no audiences and the issuer set to kubernetes/serviceaccount. Only has an effect if LAKEKEEPER__ENABLE_KUBERNETES_AUTHENTICATION is true. |
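The subject and roles claim settings can be illustrated with two small helpers. This is a sketch only: `subject_from_claims` and `roles_from_claims` are hypothetical names, and the behavior for malformed claims (returning nothing) is an assumption.

```python
from typing import Any, List, Mapping, Optional, Sequence


def subject_from_claims(
    claims: Mapping[str, Any], subject_claims: Sequence[str] = ("oid", "sub")
) -> Optional[str]:
    """Return the first configured subject claim that is present in the token."""
    for name in subject_claims:
        if name in claims:
            return str(claims[name])
    return None


def roles_from_claims(claims: Mapping[str, Any], roles_claim: str) -> List[str]:
    """Follow a dot-separated claim path and return the roles found there."""
    node: Any = claims
    for part in roles_claim.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return []  # assumption: a missing path yields no roles
        node = node[part]
    if isinstance(node, str):
        return [node]  # a single string is treated as one role
    if isinstance(node, list):
        return [role for role in node if isinstance(role, str)]
    return []
```

For a token with `{"sub": "abc", "resource_access": {"lakekeeper": {"roles": ["admin"]}}}`, the subject resolves to `abc` (no `oid` present) and the roles claim `resource_access.lakekeeper.roles` resolves to `["admin"]`.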
Authorization¶
Authorization is only effective if Authentication is enabled. Authorization must not be enabled after Lakekeeper has been bootstrapped! Please create a new Lakekeeper instance, bootstrap it with authorization enabled, and migrate your tables.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__AUTHZ_BACKEND | allowall | The authorization backend to use. If openfga or cedar is chosen, additional parameters are required (see below). The allowall backend disables authorization - authenticated users can access all endpoints. Default: allowall, one-of: [openfga, allowall, cedar] |
OpenFGA¶
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__OPENFGA__ENDPOINT | http://localhost:35081 | OpenFGA Endpoint (gRPC). |
| LAKEKEEPER__OPENFGA__STORE_NAME | lakekeeper | The OpenFGA Store to use. Default: lakekeeper |
| LAKEKEEPER__OPENFGA__API_KEY | my-api-key | The API Key used for Pre-shared key authentication to OpenFGA. If LAKEKEEPER__OPENFGA__CLIENT_ID is set, the API Key is ignored. If neither API Key nor Client ID is specified, no authentication is used. |
| LAKEKEEPER__OPENFGA__CLIENT_ID | 12345 | The Client ID to use for Authenticating if OpenFGA is secured via OIDC. |
| LAKEKEEPER__OPENFGA__CLIENT_SECRET | abcd | Client Secret for the Client ID. |
| LAKEKEEPER__OPENFGA__TOKEN_ENDPOINT | https://keycloak.example.com/realms/master/protocol/openid-connect/token | Token Endpoint to use when exchanging client credentials for an access token for OpenFGA. Required if Client ID is set |
| LAKEKEEPER__OPENFGA__SCOPE | openfga | Additional scopes to request in the Client Credential flow. |
| LAKEKEEPER__OPENFGA__AUTHORIZATION_MODEL_PREFIX | collaboration | Explicitly set the Authorization model prefix. Defaults to collaboration if not set. We recommend using this setting only in combination with LAKEKEEPER__OPENFGA__AUTHORIZATION_MODEL_VERSION. |
| LAKEKEEPER__OPENFGA__AUTHORIZATION_MODEL_VERSION | 3.1 | Version of the model to use. If specified, the specified model version must already exist. This can be used to roll-back to previously applied model versions or to connect to externally managed models. Migration is disabled if the model version is set. The version should follow the format of the example (e.g. 3.1). |
| LAKEKEEPER__OPENFGA__MAX_BATCH_CHECK_SIZE | 50 | The maximum number of checks that can be handled by a batch check request. This is a configuration option of the OpenFGA server with default value 50. |
Cedar¶
Please check the Authorization User Guide for more information on Cedar.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__CEDAR__POLICY_SOURCES__LOCAL_FILES | [/path/to/policies1.cedar,/path/to/policies2.cedar] | List of local file paths containing Cedar policies in Cedar format (not JSON). |
| LAKEKEEPER__CEDAR__ENTITY_JSON_SOURCES__LOCAL_FILES | [/path/to/entities1.json,/path/to/entities2.json] | List of local JSON file paths containing additional Cedar entities (typically roles). |
| LAKEKEEPER__CEDAR__POLICY_SOURCES__K8S_CM | [my-cm-1, my-cm-2] | List of Kubernetes ConfigMap names in the same namespace as Lakekeeper. Every key ending with .cedar is treated as a policy source in Cedar format (not JSON). |
| LAKEKEEPER__CEDAR__ENTITY_JSON_SOURCES__K8S_CM | [my-cm-1, my-cm-2] | List of Kubernetes ConfigMap names in the same namespace as Lakekeeper. Every key ending with .cedarentities.json is treated as an entity source. |
| LAKEKEEPER__CEDAR__REFRESH_INTERVAL_SECS | 5 | Refresh interval in seconds for reloading policies and entities from Kubernetes ConfigMaps and local files. Default: 5 seconds. See Cedar Authorization for more information. |
| LAKEKEEPER__CEDAR__REFRESH_DISABLED | false | When set to true, disables periodic reloading of policies and entities entirely. Useful in environments where Cedar configuration is known to be static and the polling overhead is undesirable. Default: false. |
| LAKEKEEPER__CEDAR__EXTERNALLY_MANAGED_USER_AND_ROLES | false | When set to true, Lakekeeper expects all roles and users to be managed externally via entities.json and does not extract Lakekeeper::Role or Lakekeeper::User entities from the user's token. When set to false (default), Lakekeeper automatically provides Lakekeeper::Role and Lakekeeper::User entities to Cedar based on information extracted from the user's token. When set to false, ensure LAKEKEEPER__OPENID_ROLES_CLAIM is configured to specify which claim in the token contains role information. |
| LAKEKEEPER__CEDAR__SCHEMA_FILE | /path/to/custom/schema.cedarschema | Path to a custom Cedar schema file that replaces the embedded default schema entirely. Use this only when you need complete control over the schema definition. Your custom schema must maintain compatibility with all Lakekeeper-provided entities (Server, Project, Warehouse, Namespace, Table, View, and optionally User & Role). For most use cases, prefer LAKEKEEPER__CEDAR__SCHEMA_FRAGMENT_FILE to extend the built-in schema. |
| LAKEKEEPER__CEDAR__SCHEMA_FRAGMENT_FILE | /path/to/schema-fragment.cedarschema | Path to a Cedar schema fragment file that extends the embedded default schema. This is the recommended approach for adding custom entity types or grouped actions while preserving compatibility with Lakekeeper's built-in schema. The fragment is merged with the default schema at startup. |
| LAKEKEEPER__CEDAR__PROPERTY_PARSE_PREFIXES | ["access_", "access-"] | List of property key prefixes that trigger entity-reference parsing for ABAC. Table, Namespace, and View properties whose key starts with one of these prefixes are parsed as JSON arrays of role: / role-full: / user: references. Parsed values are exposed in Cedar as roles: Set<Role> and users: Set<User> on each ResourcePropertyValue. Set to [] to disable parsing entirely. Default: ["access_", "access-"]. See Property-Based Access Control. |
| LAKEKEEPER__CEDAR__GLOBAL_ROLE_IDS_ENABLED | false | When true, the global_role_ids: Set<String> attribute on every Lakekeeper::User entity is populated with the source_id of every provider-resolved role (token claims, LDAP, etc.). This enables simpler policies such as principal.global_role_ids.contains("admins") without needing to specify a provider_id. Only meaningful when all configured role providers use globally unique source_id values (i.e. no two providers assign the same source_id to different roles). When false (default), global_role_ids is always an empty set. |
| LAKEKEEPER__CEDAR__USER_DERIVATIONS__<NAME>__SOURCE | source_id | Source field for a user identity derivation rule. Supported values: source_id (the user's subject in the IdP) or provider_id (e.g. oidc, kubernetes). <NAME> is a human-readable key (e.g. EMAIL_PARTS) used in error messages. See User Identity Derivations. |
| LAKEKEEPER__CEDAR__USER_DERIVATIONS__<NAME>__PATTERN | ^(?<username>[^@]+)@(?<domain>.+)$ | Regex pattern with named capture groups for a user identity derivation rule. Each named group that matches a non-empty substring becomes a string tag on the UserDerivedAttributes entity, accessible in policies via principal.derived_attributes.hasTag("…") / principal.derived_attributes.getTag("…"). Invalid patterns cause a startup error. See User Identity Derivations. |
| LAKEKEEPER__CEDAR__USER_DERIVATIONS__<NAME>__TRANSFORM | lowercase | Optional transformation applied to all captured values before they become Cedar tags. Supported values: none (default, keep as-is), lowercase, uppercase. Because Cedar string comparison is case-sensitive, use lowercase to normalize captured values so policies can compare against a known-case literal (e.g. getTag("domain") == "example.com"). If different capture groups need different transforms, use separate derivation entries with distinct regexes. See User Identity Derivations. |
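A user derivation rule (pattern plus transform) can be approximated in a few lines to see which tags it would produce. This is a sketch, not Lakekeeper's implementation: `derive_tags` is a hypothetical helper, it uses Python's `re` module (which spells named groups `(?P<name>...)`, so the pattern is translated first), and whether the real matching searches or requires a full match is an assumption.

```python
import re
from typing import Dict


def derive_tags(value: str, pattern: str, transform: str = "none") -> Dict[str, str]:
    """Return the tags a derivation rule would yield for one source value."""
    # Translate (?<name>...) to Python's (?P<name>...), leaving the
    # lookbehind forms (?<= and (?<! untouched.
    py_pattern = re.sub(r"\(\?<(?![=!])", "(?P<", pattern)
    match = re.search(py_pattern, value)
    if match is None:
        return {}
    convert = {"none": str, "lowercase": str.lower, "uppercase": str.upper}[transform]
    # Only named groups that matched a non-empty substring become tags.
    return {name: convert(text) for name, text in match.groupdict().items() if text}


tags = derive_tags(
    "Alice@Example.COM", r"^(?<username>[^@]+)@(?<domain>.+)$", "lowercase"
)
```

With the lowercase transform, `tags` holds `username` and `domain` in lower case, so a policy can compare `getTag("domain")` against the literal `"example.com"` regardless of how the IdP capitalizes the address.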
Debug configurations for Cedar
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__CEDAR__DEBUG__LOG_ENTITIES | false | If true, logs all internal entities (excluding externally managed entities) for each authorization request at debug level. This is useful for debugging authorization issues but can be verbose and impacts performance. Logging only occurs when both this flag is true AND debug logging is enabled (RUST_LOG=debug). Default: false. |
UI¶
When using the built-in UI which is hosted as part of the Lakekeeper binary, most values are pre-set with the corresponding values of Lakekeeper itself. Customization is typically required if Authentication is enabled. Please check the Authentication guide for more information.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__UI__OPENID_PROVIDER_URI | https://keycloak.local/realms/{your-realm} | OpenID provider URI used for login in the UI. Defaults to LAKEKEEPER__OPENID_PROVIDER_URI. Set this only if the IdP is reachable under different URIs from the user's browser and from Lakekeeper. |
| LAKEKEEPER__UI__OPENID_CLIENT_ID | lakekeeper-ui | Client ID to use for the Authorization Code Flow of the UI. Required if Authentication is enabled. Defaults to lakekeeper |
| LAKEKEEPER__UI__OPENID_REDIRECT_PATH | /callback | Path where the UI receives the callback including the tokens from the user's browser. Defaults to: /callback |
| LAKEKEEPER__UI__OPENID_SCOPE | openid email | Scopes to request from the IdP. Defaults to openid profile email. |
| LAKEKEEPER__UI__OPENID_RESOURCE | lakekeeper-api | Resources to request from the IdP. If not specified, the resource field is omitted (default). |
| LAKEKEEPER__UI__OPENID_POST_LOGOUT_REDIRECT_PATH | /logout | Path the UI calls when users are logged out from the IdP. Defaults to /logout |
| LAKEKEEPER__UI__LAKEKEEPER_URL | https://example.com/lakekeeper | URI where the user's browser can reach Lakekeeper. Defaults to the value of LAKEKEEPER__BASE_URI. |
| LAKEKEEPER__UI__OPENID_TOKEN_TYPE | access_token | The token type to use for authenticating to Lakekeeper. The default value access_token works for most IdPs. Some IdPs, such as the Google Identity Platform, recommend the use of the OIDC ID Token instead. To use the ID token instead of the access token for Authentication, specify a value of id_token. Possible values are access_token and id_token. |
Caching¶
Lakekeeper uses in-memory caches to speed up certain operations.
Short-Term Credentials (STC) Cache
When Lakekeeper vends short-term credentials for cloud storage access (S3 STS, Azure SAS tokens, or GCP access tokens), these credentials can be cached to reduce load on cloud identity services and improve response times.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__CACHE__STC__ENABLED | true | Enable or disable the short-term credentials cache. Default: true |
| LAKEKEEPER__CACHE__STC__CAPACITY | 10000 | Maximum number of credential entries to cache. Default: 10000 |
Expiry Mechanism: Cached credentials automatically expire based on the validity period of the underlying cloud credentials. Lakekeeper caches credentials for half their lifetime (e.g., if GCP STS returns credentials valid for 1 hour, they're cached for 30 minutes) with a maximum cache duration of 1 hour. This ensures credentials remain fresh while reducing unnecessary identity service calls.
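The expiry rule amounts to a simple formula: half of the credential's validity, capped at one hour. A minimal sketch, with the hypothetical helper name `stc_cache_duration`:

```python
from datetime import timedelta

MAX_CACHE_DURATION = timedelta(hours=1)


def stc_cache_duration(validity: timedelta) -> timedelta:
    """Cache for half of the credential lifetime, but never longer than 1 hour."""
    return min(validity / 2, MAX_CACHE_DURATION)
```

Credentials valid for 1 hour are therefore cached for 30 minutes, while credentials valid for 12 hours hit the 1 hour cap.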
Metrics: The STC cache exposes Prometheus metrics for monitoring:
- lakekeeper_cache_size{cache_type="stc"}: Current number of entries in the cache
- lakekeeper_cache_hits_total{cache_type="stc"}: Total number of cache hits
- lakekeeper_cache_misses_total{cache_type="stc"}: Total number of cache misses
Warehouse Cache
Caches warehouse metadata to reduce database queries for warehouse lookups.
| Configuration Key | Type | Default | Description |
|---|---|---|---|
| LAKEKEEPER__CACHE__WAREHOUSE__ENABLED | boolean | true | Enable/disable warehouse caching. Default: true |
| LAKEKEEPER__CACHE__WAREHOUSE__CAPACITY | integer | 1000 | Maximum number of warehouses to cache. Default: 1000 |
| LAKEKEEPER__CACHE__WAREHOUSE__TIME_TO_LIVE_SECS | integer | 60 | Time-to-live for cache entries in seconds. Default: 60 |
If the cache is enabled, changes to a Storage Profile may take up to the configured TTL (default: 60 seconds) to be reflected in all Lakekeeper workers. If a single worker is used, the cache is always up to date. For load table & view operations, warehouse metadata is guaranteed to be fresh, including in multi-worker deployments.
Metrics: The Warehouse cache exposes Prometheus metrics for monitoring:
- lakekeeper_cache_size{cache_type="warehouse"}: Current number of entries in the cache
- lakekeeper_cache_hits_total{cache_type="warehouse"}: Total number of cache hits
- lakekeeper_cache_misses_total{cache_type="warehouse"}: Total number of cache misses
Namespace Cache
Caches namespace metadata and hierarchies to reduce database queries for namespace lookups. Namespace lookups are also required for table & view operations.
| Configuration Key | Type | Default | Description |
|---|---|---|---|
| LAKEKEEPER__CACHE__NAMESPACE__ENABLED | boolean | true | Enable/disable namespace caching. Default: true |
| LAKEKEEPER__CACHE__NAMESPACE__CAPACITY | integer | 1000 | Maximum number of namespaces to cache. Default: 1000 |
| LAKEKEEPER__CACHE__NAMESPACE__TIME_TO_LIVE_SECS | integer | 60 | Time-to-live for cache entries in seconds. Default: 60 |
If the cache is enabled, changes to namespace properties may take up to the configured TTL (default: 60 seconds) to be reflected in all Lakekeeper workers. If a single worker is used, the Cache is always up to date. The namespace cache stores both individual namespaces and their parent hierarchies for efficient lookups.
Metrics: The Namespace cache exposes Prometheus metrics for monitoring:
- lakekeeper_cache_size{cache_type="namespace"}: Current number of entries in the cache
- lakekeeper_cache_hits_total{cache_type="namespace"}: Total number of cache hits
- lakekeeper_cache_misses_total{cache_type="namespace"}: Total number of cache misses
Secrets Cache
Caches storage secrets to reduce load on the secret store. Since Lakekeeper never updates secrets, long TTLs can significantly increase resilience against secret store outages, especially when the secret store is external to the main database backend.
| Configuration Key | Type | Default | Description |
|---|---|---|---|
| LAKEKEEPER__CACHE__SECRETS__ENABLED | boolean | true | Enable/disable secrets caching. Default: true |
| LAKEKEEPER__CACHE__SECRETS__CAPACITY | integer | 500 | Maximum number of secrets to cache. Default: 500 |
| LAKEKEEPER__CACHE__SECRETS__TIME_TO_LIVE_SECS | integer | 600 | Time-to-live for cache entries in seconds. Default: 600 (10 minutes) |
Metrics: The Secrets cache exposes Prometheus metrics for monitoring:
- lakekeeper_cache_size{cache_type="secrets"}: Current number of entries in the cache
- lakekeeper_cache_hits_total{cache_type="secrets"}: Total number of cache hits
- lakekeeper_cache_misses_total{cache_type="secrets"}: Total number of cache misses
Role Cache
Caches role metadata to reduce database queries for role lookups. The role cache uses a two-tier caching mechanism: a primary cache indexed by role ID and a secondary index by project ID and role identifier, enabling efficient lookups from both identifiers. Note that this cache only stores role definitions and does not include any information about role assignments to users or principals.
| Configuration Key | Type | Default | Description |
|---|---|---|---|
| LAKEKEEPER__CACHE__ROLE__ENABLED | boolean | true | Enable/disable role caching. Default: true |
| LAKEKEEPER__CACHE__ROLE__CAPACITY | integer | 10000 | Maximum number of roles to cache. Default: 10000 |
| LAKEKEEPER__CACHE__ROLE__TIME_TO_LIVE_SECS | integer | 120 | Time-to-live for cache entries in seconds. Default: 120 (2 minutes) |
If the cache is enabled, changes to role metadata may take up to the configured TTL (default: 120 seconds) to be reflected in all Lakekeeper workers. If a single worker is used, the cache is always up to date. The cache is automatically invalidated when roles are updated or deleted.
Metrics: The Role cache exposes Prometheus metrics for monitoring:
- lakekeeper_cache_size{cache_type="role"}: Current number of entries in the cache
- lakekeeper_cache_hits_total{cache_type="role"}: Total number of cache hits
- lakekeeper_cache_misses_total{cache_type="role"}: Total number of cache misses
User Assignments Cache
Caches the set of roles assigned to each user (UserId → role assignments). This is the hot-path cache checked on every authorization request and is also the in-memory layer used by the LDAP role provider's two-layer caching scheme. The TTL must not exceed LAKEKEEPER__CACHE__ROLE__TIME_TO_LIVE_SECS to bound the window in which a deleted role can still appear in assignment results.
| Configuration Key | Type | Default | Description |
|---|---|---|---|
| LAKEKEEPER__CACHE__USER_ASSIGNMENTS__ENABLED | boolean | true | Enable/disable user-assignments caching. Default: true |
| LAKEKEEPER__CACHE__USER_ASSIGNMENTS__CAPACITY | integer | 50000 | Maximum number of users whose assignments are held in memory. Default: 50000 |
| LAKEKEEPER__CACHE__USER_ASSIGNMENTS__TIME_TO_LIVE_SECS | integer | 120 | Time-to-live for cache entries in seconds. Must not exceed LAKEKEEPER__CACHE__ROLE__TIME_TO_LIVE_SECS. Default: 120 (2 minutes) |
Metrics: The User Assignments cache exposes Prometheus metrics for monitoring:
- lakekeeper_cache_size{cache_type="user_assignments"}: Current number of entries in the cache
- lakekeeper_cache_hits_total{cache_type="user_assignments"}: Total number of cache hits
- lakekeeper_cache_misses_total{cache_type="user_assignments"}: Total number of cache misses
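The constraint between the user-assignments TTL and the role TTL can be checked before deployment. The sketch below is a hypothetical pre-flight check written for illustration, not something Lakekeeper ships. It reads the two variables with the defaults documented above and raises if the constraint is violated. How Lakekeeper itself reacts to a violating configuration is not covered here.

```python
import os


def ttl_secs(cache: str, default: int, env=os.environ) -> int:
    """Read LAKEKEEPER__CACHE__<cache>__TIME_TO_LIVE_SECS, falling back to the default."""
    return int(env.get(f"LAKEKEEPER__CACHE__{cache}__TIME_TO_LIVE_SECS", default))


def check_assignment_ttl(env=os.environ) -> None:
    """Raise ValueError if the user-assignments TTL exceeds the role TTL."""
    role_ttl = ttl_secs("ROLE", 120, env)
    assignments_ttl = ttl_secs("USER_ASSIGNMENTS", 120, env)
    if assignments_ttl > role_ttl:
        raise ValueError(
            f"USER_ASSIGNMENTS TTL ({assignments_ttl}s) must not exceed "
            f"ROLE TTL ({role_ttl}s)"
        )


# With no overrides, both TTLs default to 120 seconds and the check passes.
check_assignment_ttl({})
```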
Endpoint Statistics¶
Lakekeeper collects statistics about the usage of its endpoints. Every Lakekeeper instance accumulates endpoint calls for a certain duration in memory before writing them into the database. The following configuration options are available:
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__ENDPOINT_STAT_FLUSH_INTERVAL | 30s | Interval at which endpoint statistics are written to the database. Default: 30s. Valid units are s and ms. |
SSL Dependencies¶
You may be running Lakekeeper in an environment that uses self-signed certificates, for example for MinIO. Lakekeeper is built with reqwest's rustls-tls-native-roots feature activated, which means the SSL_CERT_FILE and SSL_CERT_DIR environment variables are respected. If neither is set, the system's default CA store is used. To use a custom CA store, set SSL_CERT_FILE to the path of the CA file or SSL_CERT_DIR to the path of the CA directory. The certificate used by the server cannot be a CA certificate. It must be an end-entity certificate, otherwise you may run into CaUsedAsEndEntity errors.
Request Limits¶
Lakekeeper allows you to configure limits on incoming requests to protect against resource exhaustion and denial-of-service attacks.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__MAX_REQUEST_BODY_SIZE | 2097152 | Maximum request body size in bytes. Default: 2097152 (2 MB) |
| LAKEKEEPER__MAX_REQUEST_TIME | 30s | Maximum time allowed for a request to complete. Accepts format {number}{ms\|s}. Default: 30s |
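To make the {number}{ms|s} format concrete, here is a minimal sketch of a parser for it. It follows the format as documented above; Lakekeeper's own parser may accept or reject additional forms, and the function name is illustrative.

```python
import re
from datetime import timedelta

# An integer followed by one of the two documented units.
_DURATION = re.compile(r"(\d+)(ms|s)")


def parse_duration(value: str) -> timedelta:
    """Parse values such as '30s' or '500ms' into a timedelta."""
    match = _DURATION.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"expected {{number}}{{ms|s}}, got {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if unit == "ms":
        return timedelta(milliseconds=amount)
    return timedelta(seconds=amount)


print(parse_duration("30s"))
print(parse_duration("500ms"))
```

Under this reading, a value with any other unit, such as 30m, is rejected instead of being guessed at.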
Idempotency¶
Lakekeeper supports the Iceberg REST Catalog Idempotency specification. When enabled, clients can send an Idempotency-Key header on mutation requests to guarantee at-most-once execution. The server advertises support via the idempotency-key-lifetime field in the GET /v1/config response.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__IDEMPOTENCY__ENABLED | true | Enable idempotency key support. When enabled, idempotency-key-lifetime is advertised in getConfig. Default: true |
| LAKEKEEPER__IDEMPOTENCY__LIFETIME | PT30M | How long idempotency records are kept, in ISO-8601 duration format. This value is advertised to clients. Default: PT30M (30 minutes) |
| LAKEKEEPER__IDEMPOTENCY__GRACE_PERIOD | PT5M | Grace period added on top of lifetime for clock skew and transit delays, in ISO-8601 duration format. Default: PT5M (5 minutes) |
| LAKEKEEPER__IDEMPOTENCY__CLEANUP_TIMEOUT | PT30S | Maximum time a background cleanup task may run before being considered dead. If exceeded, the next attempt takes over. Default: PT30S (30 seconds) |
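Because the grace period is added on top of the lifetime, the effective retention of an idempotency record is the sum of the two durations. The sketch below works this out for the defaults. It parses only the hours, minutes, and seconds subset of ISO-8601 durations used in the table above, and the function names are illustrative.

```python
import re
from datetime import timedelta

# Time-only subset of ISO-8601 durations: PT[nH][nM][nS].
_ISO_TIME = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def parse_iso_duration(value: str) -> timedelta:
    """Parse durations such as PT30M, PT5M or PT30S."""
    match = _ISO_TIME.fullmatch(value)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def effective_retention(lifetime: str = "PT30M", grace_period: str = "PT5M") -> timedelta:
    """Lifetime plus grace period, using the documented defaults."""
    return parse_iso_duration(lifetime) + parse_iso_duration(grace_period)


print(effective_retention())
```

With the defaults, a record is therefore retained for 35 minutes, while clients are only told about the 30-minute lifetime.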
Audit Logging¶
Lakekeeper can generate detailed audit logs for all authorization events. Audit logs are written to the standard logging output and can be filtered by the event_source = "audit" field. For more information, see the Audit Logging Guide.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__AUDIT__TRACING__ENABLED | true | Enable audit logging for authorization events. When enabled, all authorization checks (both successful and failed) are logged at the INFO level with event_source = "audit". Audit logs include the actor, action, resource, and outcome. Default: false |
Role Provider¶
Authorizers such as Cedar support pluggable role providers that resolve a user's group memberships from an external directory (e.g. LDAP / Active Directory). Multiple providers can be configured in parallel, each with a unique identifier. OpenFGA does not use role providers — roles are stored directly in OpenFGA.
Chain settings¶
| Variable | Default | Description |
|---|---|---|
| LAKEKEEPER__ROLE_PROVIDER_CHAIN__LOG_UNHANDLED_USERS | true | When true, an audit event is emitted whenever a user is not matched by any configured role provider. Useful for detecting misconfigured domain filters. Set to false to suppress these events for deployments where some users are intentionally not covered by any provider. |
| LAKEKEEPER__ROLE_PROVIDER_CHAIN__LOG_ROLE_ASSIGNMENTS | false | When true, an audit event listing every resolved role name is emitted after each successful role resolution. Very noisy; intended for debugging role-provider configuration only. See the Operational Audit Events section of the Logging documentation for the event schema. |
Token role provider¶
When LAKEKEEPER__OPENID_ROLES_CLAIM is set, Lakekeeper extracts roles directly from the authenticated user's JWT. A built-in token role provider is added to the chain automatically — no additional configuration is required.
The token role provider only applies to OIDC-authenticated users (those whose identity was established via the configured OpenID Connect provider). It is a no-op for users authenticated through other mechanisms (e.g. Kubernetes service accounts).
The provider uses the reserved identifier oidc. If you declare a role provider with this identifier in your configuration, the automatic provider is suppressed and your custom provider takes its place.
LDAP role provider¶
Each LDAP provider is configured under a unique <ID> of your choosing. All variables below use the prefix LAKEKEEPER__ROLE_PROVIDER__<ID>__.
Required fields:
| Variable | Example | Description |
|---|---|---|
| …__TYPE | ldap | Provider type. Must be ldap. |
| …__URL | ldaps://ldap.example.com:636 | LDAP server URL. Use ldap:// for plain-text or STARTTLS, ldaps:// for TLS. |
| …__DOMAINS | ["example.com","*.corp.example.com"] | JSON array of domain patterns. Only users whose login name ends with one of these domains are resolved via this provider. Supports * (any number of characters) and ? (exactly one character). |
| …__USER_BASE_DN | ou=people,dc=example,dc=com | Base DN for the LDAP user search. |
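The wildcard semantics of DOMAINS can be illustrated with a small matcher. This is a sketch under stated assumptions, not Lakekeeper's implementation: it assumes the pattern is compared against the part of the login name after the last @, that the whole domain must match, and that matching is case-insensitive. The function names are illustrative.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a domain pattern: * is any run of characters, ? is exactly one."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    # Case-insensitive matching is an assumption of this sketch.
    return re.compile("".join(parts), re.IGNORECASE)


def domain_matches(login: str, patterns: list) -> bool:
    """Return True if the login's domain matches any of the patterns."""
    _, sep, domain = login.rpartition("@")
    if not sep:
        return False  # no domain part, so no provider is selected
    return any(pattern_to_regex(p).fullmatch(domain) for p in patterns)


patterns = ["example.com", "*.corp.example.com"]
print(domain_matches("alice@example.com", patterns))
print(domain_matches("bob@eu.corp.example.com", patterns))
print(domain_matches("carol@other.org", patterns))
```

Under this reading, *.corp.example.com covers subdomains such as eu.corp.example.com but not corp.example.com itself, so the bare domain would need its own entry.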
Authentication:
| Variable | Default | Description |
|---|---|---|
| …__BIND_DN | (anonymous) | Distinguished name of the service account used to bind. Omit for anonymous bind. |
| …__BIND_PASSWORD | | Password for the service account. Required when …__BIND_DN is set; can also be supplied via …__BIND_PASSWORD_FILE. |
User search:
| Variable | Default | Description |
|---|---|---|
| …__USER_SEARCH_FILTER | (uid=${USER}) | LDAP filter used to locate a user entry. The literal ${USER} is replaced with the subject portion of the user's login name (the part before @). |
| …__USER_SEARCH_SCOPE | sub | Search scope: sub (entire subtree), one (one level below base), or base. |
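The ${USER} substitution can be sketched as follows. The sketch splits the login name at the last @ and escapes LDAP filter metacharacters as described in RFC 4515 before substituting. Both choices are assumptions made for illustration; whether and how Lakekeeper escapes the value is not stated here. The function names are hypothetical.

```python
# RFC 4515 escapes for characters that are special inside a filter value.
_ESCAPES = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}


def escape_filter_value(value: str) -> str:
    """Escape a value so it cannot change the structure of the filter."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def build_user_filter(login: str, template: str = "(uid=${USER})") -> str:
    """Replace the literal ${USER} with the part of the login before the @."""
    subject, sep, _ = login.rpartition("@")
    if not sep:
        subject = login  # no @ present: use the whole login name
    return template.replace("${USER}", escape_filter_value(subject))


print(build_user_filter("alice@example.com"))
print(build_user_filter("alice@example.com", "(sAMAccountName=${USER})"))
```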
Group / role mapping:
| Variable | Default | Description |
|---|---|---|
| …__USER_MEMBER_OF_ATTRIBUTE | memberOf | Multi-valued attribute on the user entry that lists the groups the user belongs to. The default (memberOf) is correct for Active Directory and OpenLDAP with the memberof overlay. |
| …__GROUP_NAME_SOURCE | dn_cn | How to derive the role name from a group entry. dn_cn extracts the CN= component from the group's distinguished name (recommended for AD/ADFS). |
| …__GROUP_CASE | keep | Case transformation applied to the resolved group name before it is stored as a role. One of keep, upper, or lower. |
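The combined effect of dn_cn and GROUP_CASE can be sketched as below. This is a simplified illustration that assumes the leftmost CN component is used. It does not unescape special characters or handle multi-valued RDNs, and the function names and the sample DN are hypothetical.

```python
import re


def cn_from_dn(dn: str) -> str:
    """Return the value of the first CN component of a distinguished name."""
    # Split on commas that are not escaped with a backslash.
    for rdn in re.split(r"(?<!\\),", dn):
        key, sep, value = rdn.strip().partition("=")
        if sep and key.strip().lower() == "cn":
            return value.strip()
    raise ValueError(f"no CN component in {dn!r}")


def role_name(group_dn: str, group_case: str = "keep") -> str:
    """Derive the role name from a group DN and apply the case transformation."""
    name = cn_from_dn(group_dn)
    if group_case == "keep":
        return name
    if group_case == "upper":
        return name.upper()
    if group_case == "lower":
        return name.lower()
    raise ValueError(f"group_case must be keep, upper or lower, got {group_case!r}")


dn = "CN=Data-Engineers,OU=Groups,DC=example,DC=com"
print(role_name(dn))
print(role_name(dn, "lower"))
```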
Connection and TLS:
| Variable | Default | Description |
|---|---|---|
| …__STARTTLS | false | Upgrade a plain TCP connection with STARTTLS before binding. Only applies to ldap:// URLs. |
| …__ALLOW_INSECURE | false | Skip TLS certificate verification. Do not use in production. |
| …__CONNECT_TIMEOUT_SECS | 30 | Seconds to wait when establishing the initial connection. |
| …__READ_TIMEOUT_SECS | 60 | Seconds to wait for an LDAP response. |
Caching & performance:
Each LDAP provider uses a two-layer cache to avoid a network round-trip to the LDAP server on every request:
- In-memory layer: role assignments are held in a per-node moka cache (see User Assignments Cache above). Reads that hit this layer incur no I/O at all.
- Database layer: on an in-memory miss, role assignments are read from (and re-populate) the database. The database record includes a synced_at timestamp that is compared against SYNC_INTERVAL_SECS to decide whether the data is still fresh.
If the database record is older than SYNC_INTERVAL_SECS, Lakekeeper contacts LDAP, writes the fresh assignments back to both the database and the in-memory cache, and returns the result. If LDAP is temporarily unreachable, the stale database record is served instead and an audit warning is emitted — the request is never failed solely due to an LDAP outage.
| Variable | Default | Description |
|---|---|---|
| …__SYNC_INTERVAL_SECS | 300 | Maximum age (in seconds) of a cached role-assignment record before Lakekeeper re-fetches from LDAP. Increase to reduce LDAP traffic; decrease when group membership changes must propagate more quickly. Also controls the TTL of the corresponding database record. |
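The lookup order described above can be sketched as a small in-process model. All names here (TwoLayerRoleLookup, DbRecord, fetch_from_ldap) are hypothetical, plain dictionaries stand in for the moka cache and the database, and in-memory TTL and eviction are omitted. Two details are choices of this sketch, not documented behaviour: stale data is not copied into the memory layer, and a user with no database record at all fails when LDAP is down.

```python
import time
from dataclasses import dataclass


@dataclass
class DbRecord:
    roles: list
    synced_at: float  # epoch seconds of the last successful LDAP sync


class TwoLayerRoleLookup:
    def __init__(self, fetch_from_ldap, sync_interval_secs=300, clock=time.time):
        self.memory = {}    # stand-in for the in-memory layer
        self.database = {}  # stand-in for the database layer
        self.fetch_from_ldap = fetch_from_ldap
        self.sync_interval_secs = sync_interval_secs
        self.clock = clock

    def roles_for(self, user):
        # Layer 1: in-memory hit, no I/O.
        if user in self.memory:
            return self.memory[user]
        # Layer 2: database record, fresh if younger than the sync interval.
        record = self.database.get(user)
        now = self.clock()
        if record is not None and now - record.synced_at <= self.sync_interval_secs:
            self.memory[user] = record.roles
            return record.roles
        # Missing or stale: contact LDAP and write back to both layers.
        try:
            roles = self.fetch_from_ldap(user)
        except ConnectionError:
            if record is None:
                raise  # nothing to fall back to (assumption of this sketch)
            return record.roles  # serve the stale record during the outage
        self.database[user] = DbRecord(roles, now)
        self.memory[user] = roles
        return roles


lookup = TwoLayerRoleLookup(lambda user: ["analysts"])
print(lookup.roles_for("alice"))
```

The second call for the same user is answered from memory without touching the database or LDAP.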
Startup and resilience:
| Variable | Default | Description |
|---|---|---|
| …__REQUIRE_CONNECTED_ON_STARTUP | false | When true, Lakekeeper refuses to start if this provider cannot connect. Useful for catching misconfiguration early. When false, the provider starts in a disconnected state and reconnects automatically on first use. |
| …__RECONNECT_COOLDOWN_SECS | 30 | Minimum seconds between reconnection attempts after a failure. |
IDP filtering (optional):
| Variable | Default | Description |
|---|---|---|
| …__IDP_IDS | (all IDPs) | JSON array of identity provider IDs. When set, only users from these IDPs are resolved via this provider. Omit to allow all IDPs. |
Example — minimal LDAP provider (env vars):
LAKEKEEPER__ROLE_PROVIDER__MY_LDAP__TYPE=ldap
LAKEKEEPER__ROLE_PROVIDER__MY_LDAP__URL=ldaps://ldap.corp.example.com:636
LAKEKEEPER__ROLE_PROVIDER__MY_LDAP__DOMAINS=["corp.example.com"]
LAKEKEEPER__ROLE_PROVIDER__MY_LDAP__USER_BASE_DN=ou=people,dc=corp,dc=example,dc=com
LAKEKEEPER__ROLE_PROVIDER__MY_LDAP__BIND_DN=cn=svc-lakekeeper,ou=service-accounts,dc=corp,dc=example,dc=com
LAKEKEEPER__ROLE_PROVIDER__MY_LDAP__BIND_PASSWORD_FILE=/run/secrets/ldap-password
File-based configuration¶
All providers can alternatively be configured through a single TOML file. This is convenient when secrets management or config management tools produce a single artefact (e.g. Vault agent, Kubernetes projected volumes, Ansible templates).
Point LAKEKEEPER__ROLE_PROVIDER_FILE at a standard TOML file. Each provider is a section [role_provider.<id>] where <id> is the provider ID you choose. Multiple providers can be defined in the same file.
Example — two LDAP providers in one file:
/etc/lakekeeper/role-providers.toml:
[role_provider.corporate]
type = "ldap"
url = "ldaps://ldap.corp.example.com:636"
domains = ["corp.example.com"]
user_base_dn = "ou=people,dc=corp,dc=example,dc=com"
bind_dn = "cn=svc-lakekeeper,ou=service-accounts,dc=corp,dc=example,dc=com"
bind_password = "s3cr3t"
[role_provider.subsidiary]
type = "ldap"
url = "ldaps://ldap.subsidiary.example.com:636"
domains = ["subsidiary.example.com"]
user_base_dn = "ou=users,dc=subsidiary,dc=example,dc=com"
bind_dn = "cn=svc-lakekeeper,ou=service-accounts,dc=subsidiary,dc=example,dc=com"
bind_password = "s3cr3t"
Then set the single environment variable LAKEKEEPER__ROLE_PROVIDER_FILE to the path of this file (here /etc/lakekeeper/role-providers.toml).
Combining file and environment variables: The file and env-var approaches can be combined. The file is loaded first and env vars are merged on top — env vars override individual fields for the same provider while unset fields are preserved from the file. This makes it easy to store non-sensitive configuration in the file and inject secrets via env vars:
# /etc/lakekeeper/role-providers.toml (checked in, no secrets)
[role_provider.corporate]
type = "ldap"
url = "ldaps://ldap.corp.example.com:636"
domains = ["corp.example.com"]
user_base_dn = "ou=people,dc=corp,dc=example,dc=com"
bind_dn = "cn=svc-lakekeeper,ou=service-accounts,dc=corp,dc=example,dc=com"
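The merge order (file first, environment variables on top) can be modelled with a few lines of Python. This is an illustrative model, not Lakekeeper's loader. It assumes the file has already been parsed into a dictionary keyed by provider ID, and that the <ID> and field names in environment variables map to their lowercase forms in the file. Values are kept as strings, so JSON-array fields such as DOMAINS are not decoded here.

```python
PREFIX = "LAKEKEEPER__ROLE_PROVIDER__"


def merge_providers(file_config: dict, env: dict) -> dict:
    """Overlay LAKEKEEPER__ROLE_PROVIDER__<ID>__<FIELD> variables on file settings."""
    # Copy the file settings so the input is left untouched.
    merged = {pid: dict(fields) for pid, fields in file_config.items()}
    for name, value in env.items():
        if not name.startswith(PREFIX):
            continue  # e.g. LAKEKEEPER__ROLE_PROVIDER_FILE is not a provider field
        provider_id, sep, field = name[len(PREFIX):].partition("__")
        if not sep or not field:
            continue
        # Lowercasing the ID and field name is an assumption of this sketch.
        merged.setdefault(provider_id.lower(), {})[field.lower()] = value
    return merged


file_config = {"corporate": {"type": "ldap", "url": "ldaps://ldap.corp.example.com:636"}}
env = {"LAKEKEEPER__ROLE_PROVIDER__CORPORATE__BIND_PASSWORD": "s3cr3t"}
print(merge_providers(file_config, env)["corporate"])
```

Fields that appear only in the file, such as url, survive the merge, while the secret comes from the environment.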
Debug¶
Lakekeeper provides debugging options to help troubleshoot issues during development. These options should not be enabled in production environments as they can expose sensitive data and impact performance.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__DEBUG__LOG_REQUEST_BODIES | true | If set to true, Lakekeeper will log all incoming and outgoing request bodies at debug level. This is useful for debugging API interactions but should never be enabled in production as it can expose sensitive data (credentials, tokens, etc.) and significantly impact performance. Default: false |
| LAKEKEEPER__DEBUG__MIGRATE_BEFORE_SERVE | true | If set to true, Lakekeeper waits for the DB (30s) and runs migrations when serve is called. Default: false |
| LAKEKEEPER__DEBUG__AUTO_SERVE | true | If set to true, Lakekeeper will automatically start the server when no subcommand is provided (i.e., when running the binary without arguments). This is useful for development environments to quickly start the server without explicitly specifying the serve command. Default: false |
| LAKEKEEPER__DEBUG__EXTENDED_LOGS | false | Controls whether file names and line numbers are included in JSON log output. When set to false, these fields are omitted for cleaner logs. When set to true, each log entry includes filename and line_number fields for easier debugging. Default: false |
| LAKEKEEPER__DEBUG__LOG_AUTHORIZATION_HEADER | false | If set to true, the Authorization header is included in request trace spans for the /catalog/v1/config and /management/v1/info endpoints. This exposes sensitive credentials (tokens, passwords) and should never be enabled in production. Default: false |
Warning: Debug options can expose sensitive information in logs and should only be used in secure development environments.
Test Configurations¶
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__SKIP_STORAGE_VALIDATION | true | If set to true, Lakekeeper does not validate the provided storage configuration & credentials when creating or updating Warehouses. This is not suitable for production. Default: false |
License Configuration¶
The license key can be supplied either directly or as a path to a file. If both variables are set, LAKEKEEPER__LICENSE__KEY takes precedence.
| Variable | Example | Description |
|---|---|---|
| LAKEKEEPER__LICENSE__KEY | <license-key> | License key as a string. Takes precedence over LAKEKEEPER__LICENSE__KEY_PATH if both are set. |
| LAKEKEEPER__LICENSE__KEY_PATH | /path/to/license.lic | Path to a file containing the license key. |