Configuration

Review the Trama configuration surface for workers, Redis topology, queue sharding, telemetry, and deployment profiles. Start with local defaults, then move into worker scaling and Redis Cluster settings when preparing production environments.

Reference page | Operators and platform teams | Local to production

Install: Get a local environment running first.
Production: See how these settings shape real deployments.
Observability: Connect runtime tuning to what you monitor.

Start here

For local development, focus on runtime enablement, Redis/Postgres connectivity, and metrics. For production, focus on worker count, Redis topology, queue behavior, telemetry, and shard ownership settings.

Core Keys

Key	Default	Purpose
RUNTIME
`runtime.enabled`	`true`	Enable queue workers on this instance.
`runtime.workerCount`	`4`	Number of concurrent worker coroutines.
`runtime.bufferSize`	`200`	Internal dispatch channel buffer size.
`runtime.emptyPollDelayMillis`	`50`	Delay between polls when the queue is empty.
`runtime.maxStepsPerExecution`	`25`	Maximum nodes processed per worker cycle before re-enqueue.
`runtime.store`	`REDIS`	Execution state backend: `REDIS` or `POSTGRES`.
`runtime.callback.baseUrl`	`""`	Public base URL injected as `{{runtime.callback.url}}` in async task requests. Required for async tasks.
`runtime.callback.hmacSecret`	`""`	HMAC-SHA256 secret for signing and validating callback tokens. Required for async tasks.
`runtime.callback.hmacKid`	`default`	Key ID embedded in the token header for future key rotation.
REDIS
`redis.topology`	`STANDALONE`	Redis client mode: `STANDALONE` or `CLUSTER`.
`redis.url`	`redis://localhost:6379`	Connection URL used in standalone mode.
`redis.pool.maxTotal`	`16`	Maximum total connections in the Redis connection pool.
`redis.pool.maxIdle`	`16`	Maximum idle connections kept in the pool.
`redis.pool.minIdle`	`0`	Minimum idle connections maintained in the pool.
`redis.cluster.nodes`	`[]`	Seed nodes used when `redis.topology=CLUSTER`.
`redis.queue.keyPrefix`	`saga:executions`	Prefix for shard-aware ready and in-flight queue keys.
`redis.consumer.batchSize`	`50`	Messages claimed per poll cycle.
`redis.consumer.processingTimeoutMillis`	`60000`	In-flight lease timeout before a message is requeued.
`redis.consumer.requeueIntervalMillis`	`5000`	How often expired in-flight messages are recovered.
`redis.sharding.virtualShardCount`	`1024`	Virtual shard fan-out used by rendezvous allocation.
`redis.sharding.membershipKey`	`saga:runtime:pods`	Redis ZSET tracking live worker pods.
`redis.sharding.membershipTtlMillis`	`10000`	Worker membership entry TTL.
`redis.sharding.heartbeatIntervalMillis`	`3000`	How often each worker refreshes its membership entry.
`redis.sharding.refreshIntervalMillis`	`2000`	How often each worker recalculates shard ownership.
`redis.sharding.claimerCount`	`4`	Parallel shard claimers per worker process.
DATABASE
`database.host`	`db`	Postgres host. Used when `runtime.store=POSTGRES`.
`database.port`	`5432`	Postgres port.
`database.database`	`saga`	Postgres database name.
`database.user`	`saga`	Postgres user.
`database.password`	`saga`	Postgres password.
`database.pool.maxPoolSize`	`10`	Maximum JDBC connection pool size.
`database.pool.minIdle`	`1`	Minimum idle JDBC connections.
HTTP CLIENT
`http.connectTimeoutMillis`	`10000`	Outbound HTTP connect timeout for workflow task requests.
`http.requestTimeoutMillis`	`30000`	Total outbound HTTP request timeout.
`http.socketTimeoutMillis`	`30000`	Outbound HTTP socket read timeout.
RATE LIMIT
`rateLimit.enabled`	`true`	Enable per-saga rate limiting.
`rateLimit.maxFailures`	`5`	Failure count within the window before an execution is blocked.
`rateLimit.windowMillis`	`60000`	Sliding window duration for failure counting.
`rateLimit.blockMillis`	`60000`	How long a blocked execution is held before retry is allowed.
`rateLimit.keyPrefix`	`saga:rate`	Redis key prefix for rate-limit counters.
MAINTENANCE
`maintenance.enabled`	`true`	Enable the background partition maintenance job (Postgres only).
`maintenance.partitionLookaheadMonths`	`13`	How many months ahead to pre-create Postgres partitions.
`maintenance.partitionStartOffsetMonths`	`1`	How many months back to start partition creation.
`maintenance.retentionDays`	`15`	Days of execution history to retain before purging.
`maintenance.intervalMillis`	`3600000`	How often the maintenance job runs (1 hour default).
METRICS & TELEMETRY
`metrics.enabled`	`true`	Expose Prometheus metrics at `/metrics`.
`telemetry.enabled`	`false`	Enable OpenTelemetry span export.
`telemetry.serviceName`	`trama`	Service name attached to all exported spans.
`telemetry.otlpEndpoint`	`http://localhost:4317`	OTLP gRPC endpoint for trace export.
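
Several of the interval defaults above are easier to reason about relative to each other. The following is a minimal sketch of that arithmetic, not part of Trama itself: the helper names are hypothetical, and the two rules in `check_timing` are assumptions inferred from the key descriptions (a fixed-interval recovery sweep, and a lease that should outlast one outbound request), not behavior the runtime is documented to enforce.

```python
# Defaults copied from the Core Keys table (all values in milliseconds).
DEFAULTS = {
    "redis.consumer.processingTimeoutMillis": 60000,
    "redis.consumer.requeueIntervalMillis": 5000,
    "redis.sharding.membershipTtlMillis": 10000,
    "redis.sharding.heartbeatIntervalMillis": 3000,
    "http.requestTimeoutMillis": 30000,
}


def heartbeats_per_ttl(cfg):
    """Whole heartbeat intervals that fit inside one membership TTL."""
    return (cfg["redis.sharding.membershipTtlMillis"]
            // cfg["redis.sharding.heartbeatIntervalMillis"])


def worst_case_requeue_delay(cfg):
    """Assumed upper bound before a stuck message is recovered:
    the lease timeout plus one full recovery sweep interval."""
    return (cfg["redis.consumer.processingTimeoutMillis"]
            + cfg["redis.consumer.requeueIntervalMillis"])


def check_timing(cfg):
    """Return warnings for settings that break the assumed rules below."""
    warnings = []
    # Assumed rule: refresh membership at least twice per TTL,
    # so that a single late heartbeat does not evict a live pod.
    if heartbeats_per_ttl(cfg) < 2:
        warnings.append("fewer than two heartbeats fit in one membership TTL")
    # Assumed rule: the in-flight lease should outlast one outbound HTTP request.
    if (cfg["redis.consumer.processingTimeoutMillis"]
            <= cfg["http.requestTimeoutMillis"]):
        warnings.append("in-flight lease can expire during a single HTTP request")
    return warnings


print(heartbeats_per_ttl(DEFAULTS))        # 3
print(worst_case_requeue_delay(DEFAULTS))  # 65000
print(check_timing(DEFAULTS))              # []
```

With the defaults, a pod gets three heartbeat opportunities per TTL, and under these assumptions a message held by a crashed worker would be recovered within roughly 65 seconds.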

Async Callback

Async task nodes pause a workflow and wait for an external system to POST to a callback URL. For this to work, set `runtime.callback.baseUrl` and `runtime.callback.hmacSecret`, which are empty by default; `runtime.callback.hmacKid` can stay at `default`. The runtime injects the `{{runtime.callback.url}}` and `{{runtime.callback.token}}` template variables into the outbound request body, so the external service knows where to call back and which token to present.

runtime:
  callback:
    baseUrl: "https://trama.internal"   # must be reachable by downstream services
    hmacSecret: "change-me-in-production"
    hmacKid: "default"
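
To illustrate what signing and validating a callback token with an HMAC-SHA256 secret and a key ID can look like, here is a minimal sketch. The token layout (`header.payload.signature`, base64url-encoded), the `executionId` claim, and both function names are assumptions made for illustration; this page does not specify Trama's actual token format.

```python
import base64
import hashlib
import hmac
import json


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_callback_token(secret: str, kid: str, execution_id: str) -> str:
    """Build a token whose header carries the key ID (hypothetical layout)."""
    header = _b64(json.dumps({"alg": "HS256", "kid": kid}, sort_keys=True).encode())
    payload = _b64(json.dumps({"executionId": execution_id}, sort_keys=True).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64(signature)}"


def verify_callback_token(secret: str, token: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    header, payload, signature = parts
    expected = hmac.new(secret.encode(), f"{header}.{payload}".encode(),
                        hashlib.sha256).digest()
    return hmac.compare_digest(_b64(expected).encode(), signature.encode())


token = sign_callback_token("change-me-in-production", "default", "exec-123")
print(verify_callback_token("change-me-in-production", token))  # True
print(verify_callback_token("some-other-secret", token))        # False
```

Carrying the key ID in the header lets a verifier choose the matching secret during a rotation, which is the purpose the table gives for `runtime.callback.hmacKid`.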

Redis Cluster

Trama supports Redis Cluster-aware queue sharding. Ready and in-flight queue keys, execution metadata, and rate-limit keys are written with hash tags so that multi-key Lua operations stay inside one Redis hash slot.
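
The hash-tag behavior can be checked with a short sketch of the Redis Cluster slot calculation: CRC16 (XMODEM variant) of the key, or of the non-empty text between the first `{` and the next `}`, modulo 16384. The key names below are hypothetical; Trama's actual key layout is not documented on this page.

```python
import binascii


def hash_slot(key: str) -> int:
    """Redis Cluster hash slot for a key, honoring hash tags."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' is used.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with initial value 0 is CRC16/XMODEM (polynomial 0x1021).
    return binascii.crc_hqx(data, 0) % 16384


# Hypothetical key names sharing the hash tag "shard-17".
ready = "saga:executions:{shard-17}:ready"
inflight = "saga:executions:{shard-17}:inflight"

print(hash_slot(ready) == hash_slot(inflight))  # True
```

Because both keys hash only the tag, a Lua script that touches both of them runs against a single slot and avoids a cross-slot error.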

Use `redis.topology: CLUSTER` to enable the Lettuce cluster client.
Set `redis.cluster.nodes` to one or more reachable cluster seed nodes.
Keep the same `redis.queue.keyPrefix` on every worker pod.
Give each worker process a unique `redis.sharding.podId` so rendezvous ownership is stable.
Tune `redis.sharding.virtualShardCount` and `redis.sharding.claimerCount` together when scaling out.

redis:
  topology: "CLUSTER"
  cluster:
    nodes:
      - "redis://redis-cluster-0.redis:6379"
      - "redis://redis-cluster-1.redis:6379"
      - "redis://redis-cluster-2.redis:6379"
  queue:
    keyPrefix: "saga:executions"
  consumer:
    batchSize: 100
    processingTimeoutMillis: 60000
    requeueIntervalMillis: 5000
  sharding:
    podId: "${HOSTNAME}"
    virtualShardCount: 1024
    membershipKey: "saga:runtime:pods"
    membershipTtlMillis: 10000
    heartbeatIntervalMillis: 3000
    refreshIntervalMillis: 2000
    claimerCount: 4
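
To show why stable, unique pod IDs matter for rendezvous allocation, here is a minimal sketch of highest-random-weight (rendezvous) hashing over virtual shards. The SHA-256 scoring function and the names `owner_of` and `assign` are assumptions for illustration; Trama's actual scoring function is not described here.

```python
import hashlib


def owner_of(shard, pods):
    """Rendezvous choice: every pod scores the shard, the highest score owns it."""
    def score(pod):
        digest = hashlib.sha256(f"{pod}:{shard}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(pods, key=score)


def assign(pods, virtual_shard_count=1024):
    """Map every virtual shard to its owning pod."""
    return {shard: owner_of(shard, pods) for shard in range(virtual_shard_count)}


before = assign(["pod-a", "pod-b", "pod-c"])
after = assign(["pod-a", "pod-b"])  # pod-c leaves the membership set

moved = [s for s in before if before[s] != after[s]]
# Only shards that pod-c owned change hands; the rest keep their owner.
print(all(before[s] == "pod-c" for s in moved))  # True
```

Each pod can compute the same assignment independently from the membership set, and a membership change only moves the shards of the pod that joined or left. If a pod restarted under a different ID, its shards would be reassigned as though it were a new member.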

Environment Overrides

Environment variables can override selected configuration keys. Supported overrides include:

RUNTIME_ENABLED
METRICS_ENABLED
TELEMETRY_ENABLED
REDIS_URL
REDIS_TOPOLOGY
REDIS_CLUSTER_NODES
DATABASE_HOST
DATABASE_PORT
DATABASE_DATABASE
DATABASE_USER
DATABASE_PASSWORD
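
The sketch below shows one way such overrides can be layered over file-based defaults. The variable-to-key mapping is inferred from the names alone, and the comma-separated format for `REDIS_CLUSTER_NODES` and the type coercion rules are assumptions; `apply_overrides` is a hypothetical helper, not Trama's loader.

```python
# Assumed mapping, inferred from the variable names listed above.
ENV_TO_KEY = {
    "RUNTIME_ENABLED": "runtime.enabled",
    "METRICS_ENABLED": "metrics.enabled",
    "TELEMETRY_ENABLED": "telemetry.enabled",
    "REDIS_URL": "redis.url",
    "REDIS_TOPOLOGY": "redis.topology",
    "REDIS_CLUSTER_NODES": "redis.cluster.nodes",
    "DATABASE_HOST": "database.host",
    "DATABASE_PORT": "database.port",
    "DATABASE_DATABASE": "database.database",
    "DATABASE_USER": "database.user",
    "DATABASE_PASSWORD": "database.password",
}


def coerce(raw, default):
    """Convert the raw string to the type of the existing default."""
    if isinstance(default, bool):  # check bool before int: bool is an int subclass
        return raw.strip().lower() == "true"
    if isinstance(default, int):
        return int(raw)
    if isinstance(default, list):  # assumed comma-separated list format
        return [item.strip() for item in raw.split(",") if item.strip()]
    return raw


def apply_overrides(config, environ):
    """Return a copy of the flat config with environment values applied."""
    merged = dict(config)
    for var, key in ENV_TO_KEY.items():
        if var in environ and key in merged:
            merged[key] = coerce(environ[var], merged[key])
    return merged


defaults = {"runtime.enabled": True, "database.port": 5432, "redis.cluster.nodes": []}
env = {"RUNTIME_ENABLED": "false", "DATABASE_PORT": "6432"}
print(apply_overrides(defaults, env))
# {'runtime.enabled': False, 'database.port': 6432, 'redis.cluster.nodes': []}
```

In a real process, `environ` would be `os.environ`; a plain dictionary is used here so the example stays deterministic.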

Profiles

Dev profile

runtime:
  workerCount: 2
  emptyPollDelayMillis: 100
metrics:
  enabled: true
telemetry:
  enabled: false

Prod profile (split API and workers)

In production, run API pods separately from worker pods. API instances should not process queue jobs.

API deployment profile

runtime:
  enabled: false
metrics:
  enabled: true
telemetry:
  enabled: true
database:
  pool:
    maxPoolSize: 20

Worker deployment profile

runtime:
  enabled: true
  workerCount: 8
  bufferSize: 500
  maxStepsPerExecution: 25
redis:
  consumer:
    batchSize: 100
    processingTimeoutMillis: 60000
rateLimit:
  enabled: true
metrics:
  enabled: true
telemetry:
  enabled: true

Worker deployment profile with Redis Cluster

runtime:
  enabled: true
  workerCount: 8
  bufferSize: 500
redis:
  topology: "CLUSTER"
  cluster:
    nodes:
      - "redis://redis-cluster-0.redis:6379"
      - "redis://redis-cluster-1.redis:6379"
      - "redis://redis-cluster-2.redis:6379"
  consumer:
    batchSize: 100
    processingTimeoutMillis: 60000
    requeueIntervalMillis: 5000
  sharding:
    podId: "${HOSTNAME}"
    virtualShardCount: 1024
    membershipKey: "saga:runtime:pods"
    membershipTtlMillis: 10000
    heartbeatIntervalMillis: 3000
    refreshIntervalMillis: 2000
    claimerCount: 4
metrics:
  enabled: true
telemetry:
  enabled: true

Scale workers horizontally by running multiple replicas of the worker deployment with this same profile. Each replica still needs its own `redis.sharding.podId` so that rendezvous ownership stays stable.