feat(webapp): add 60s/60s SWR cache to getEntitlement#3388
Conversation
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
apps/webapp/app/services/platform.v3.server.ts (1)
539-550: IncludeorganizationIdin entitlement error logs for faster incident triage.Both new error paths log the error but omit tenant context, which makes debugging cross-tenant billing incidents slower.
📋 Suggested logging tweak
- logger.error("Error getting entitlement - no success", { error: response.error }); + logger.error("Error getting entitlement - no success", { + organizationId, + error: response.error, + }); ... - logger.error("Error getting entitlement - caught error", { error: e }); + logger.error("Error getting entitlement - caught error", { + organizationId, + error: e, + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/services/platform.v3.server.ts` around lines 539 - 550, The entitlement error logs inside the platformCache.entitlement.swr callback omit tenant context; update both logger.error calls (the one in the !response.success branch and the catch branch around client.getEntitlement) to include organizationId in the structured metadata so logs contain the tenant for faster triage—i.e., when logging in the platformCache.entitlement.swr block referencing client.getEntitlement, add organizationId to the object passed to logger.error alongside existing error/response.error fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@apps/webapp/app/services/platform.v3.server.ts`:
- Around line 539-550: The entitlement error logs inside the
platformCache.entitlement.swr callback omit tenant context; update both
logger.error calls (the one in the !response.success branch and the catch branch
around client.getEntitlement) to include organizationId in the structured
metadata so logs contain the tenant for faster triage—i.e., when logging in the
platformCache.entitlement.swr block referencing client.getEntitlement, add
organizationId to the object passed to logger.error alongside existing
error/response.error fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 82bfad1d-c81a-4ff8-9372-fd4c6ba52be8
📒 Files selected for processing (2)
.server-changes/getEntitlement-swr-cache.mdapps/webapp/app/services/platform.v3.server.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: sdk-compat / Bun Runtime
- GitHub Check: sdk-compat / Cloudflare Workers
- GitHub Check: sdk-compat / Deno Runtime
- GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
- GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Files:
apps/webapp/app/services/platform.v3.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Add crumbs as you write code using
//@Crumbscomments or `// `#region` `@crumbsblocks. These are temporary debug instrumentation and must be stripped usingagentcrumbs stripbefore merge.
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier before committing
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.ts{,x}
📄 CodeRabbit inference engine (CLAUDE.md)
Always import from
@trigger.dev/sdkwhen writing Trigger.dev tasks. Never use@trigger.dev/sdk/v3or deprecatedclient.defineJob.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.server.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
apps/webapp/**/*.server.ts: Access environment variables viaenvexport fromapp/env.server.ts. Never useprocess.envdirectly.
Always usefindFirstinstead offindUniquein Prisma queries.findUniquehas implicit DataLoader batching with active bugs (uppercase UUIDs returning null, composite key SQL correctness, 5-10x worse performance).findFirstavoids these issues.
Access Prisma database via the singleton instance exported fromapp/db.server.ts.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{server,test,spec}.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
For testable code, never import
env.server.tsin test files. Pass configuration as options instead. Separate testable services (e.g.,realtimeClient.server.ts) from singleton factories (e.g.,realtimeClientGlobal.server.ts).
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
Use named constants for sentinel/placeholder values (e.g.
const UNSET_VALUE = "__unset__") instead of raw string literals scattered across comparisons.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: Access environment variables through theenvexport ofenv.server.tsinstead of directly accessingprocess.env
Use subpath exports from@trigger.dev/corepackage instead of importing from the root@trigger.dev/corepath
Files:
apps/webapp/app/services/platform.v3.server.ts
🧠 Learnings (4)
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-03-26T09:02:07.973Z
Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 3274
File: apps/webapp/app/services/runsReplicationService.server.ts:922-924
Timestamp: 2026-03-26T09:02:07.973Z
Learning: When parsing Trigger.dev task run annotations in server-side services, keep `TaskRun.annotations` strictly conforming to the `RunAnnotations` schema from `trigger.dev/core/v3`. If the code already uses `RunAnnotations.safeParse` (e.g., in a `#parseAnnotations` helper), treat that as intentional/necessary for atomic, schema-accurate annotation handling. Do not recommend relaxing the annotation payload schema or using a permissive “passthrough” parse path, since the annotations are expected to be written atomically in one operation and should not contain partial/legacy payloads that would require a looser parser.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-04-15T15:39:22.659Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-04-15T15:39:22.659Z
Learning: Applies to apps/webapp/{app/v3/services/triggerTask.server.ts,app/v3/services/batchTriggerV3.server.ts,app/v3/queues.server.ts} : Do NOT add database queries to `triggerTask.server.ts` or `batchTriggerV3.server.ts` hot paths. Task defaults (TTL, etc.) are resolved via `backgroundWorkerTask.findFirst()` in `queues.server.ts` — piggyback on the existing query instead of adding new ones.
Applied to files:
.server-changes/getEntitlement-swr-cache.md
🔇 Additional comments (2)
apps/webapp/app/services/platform.v3.server.ts (1)
74-78: SWR entitlement namespace configuration looks solid.The dedicated
entitlementcache namespace with 60s fresh/60s stale TTL cleanly matches the intended behavior and existing cache architecture..server-changes/getEntitlement-swr-cache.md (1)
1-6: Change note is clear and accurately scoped.The note documents the behavior change, fallback behavior, and outage trade-off in a way that is easy to reason about.
b2e9108 to
dda13d9
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
apps/webapp/app/services/platform.v3.server.ts (1)
376-376: Consider wrapping cache removal in try-catch to prevent Redis failures from breaking the user flow.At this point,
client.setPlanhas already succeeded. IfplatformCache.entitlement.remove()throws (e.g., Redis unavailable), the user sees an error despite their plan change succeeding. Wrapping in try-catch with a warning log would improve resilience.♻️ Suggested improvement
if (result.accepted) { // Invalidate billing cache since plan changed opts?.invalidateBillingCache?.(organization.id); - await platformCache.entitlement.remove(organization.id); + try { + await platformCache.entitlement.remove(organization.id); + } catch (e) { + logger.warn("Failed to invalidate entitlement cache", { organizationId: organization.id, error: e }); + } return redirect(newProjectPath(organization, "You're on the Free plan."));Apply the same pattern to lines 393 and 399.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/services/platform.v3.server.ts` at line 376, Wrap the cache removal calls to platformCache.entitlement.remove(...) in a try-catch so Redis failures don't surface after client.setPlan has succeeded: surround each platformCache.entitlement.remove(organization.id) invocation with try { await platformCache.entitlement.remove(organization.id); } catch (err) { logger.warn?.("Failed to remove entitlement cache for organization", { orgId: organization.id, err }); } (or use the module's existing logger/ processLogger) and do not rethrow; apply the same pattern for the other two platformCache.entitlement.remove calls so cache errors are logged as warnings and do not break the user flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@apps/webapp/app/services/platform.v3.server.ts`:
- Line 376: Wrap the cache removal calls to
platformCache.entitlement.remove(...) in a try-catch so Redis failures don't
surface after client.setPlan has succeeded: surround each
platformCache.entitlement.remove(organization.id) invocation with try { await
platformCache.entitlement.remove(organization.id); } catch (err) {
logger.warn?.("Failed to remove entitlement cache for organization", { orgId:
organization.id, err }); } (or use the module's existing logger/ processLogger)
and do not rethrow; apply the same pattern for the other two
platformCache.entitlement.remove calls so cache errors are logged as warnings
and do not break the user flow.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 22729c9d-264f-4ebf-883a-6afb0d5703e9
📒 Files selected for processing (2)
.server-changes/getEntitlement-swr-cache.mdapps/webapp/app/services/platform.v3.server.ts
✅ Files skipped from review due to trivial changes (1)
- .server-changes/getEntitlement-swr-cache.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: typecheck / typecheck
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: sdk-compat / Bun Runtime
- GitHub Check: sdk-compat / Cloudflare Workers
- GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
- GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: sdk-compat / Deno Runtime
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Files:
apps/webapp/app/services/platform.v3.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Add crumbs as you write code using
//@Crumbscomments or `// `#region` `@crumbsblocks. These are temporary debug instrumentation and must be stripped usingagentcrumbs stripbefore merge.
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier before committing
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.ts{,x}
📄 CodeRabbit inference engine (CLAUDE.md)
Always import from
@trigger.dev/sdkwhen writing Trigger.dev tasks. Never use@trigger.dev/sdk/v3or deprecatedclient.defineJob.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.server.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
apps/webapp/**/*.server.ts: Access environment variables viaenvexport fromapp/env.server.ts. Never useprocess.envdirectly.
Always usefindFirstinstead offindUniquein Prisma queries.findUniquehas implicit DataLoader batching with active bugs (uppercase UUIDs returning null, composite key SQL correctness, 5-10x worse performance).findFirstavoids these issues.
Access Prisma database via the singleton instance exported fromapp/db.server.ts.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{server,test,spec}.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
For testable code, never import
env.server.tsin test files. Pass configuration as options instead. Separate testable services (e.g.,realtimeClient.server.ts) from singleton factories (e.g.,realtimeClientGlobal.server.ts).
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
Use named constants for sentinel/placeholder values (e.g.
const UNSET_VALUE = "__unset__") instead of raw string literals scattered across comparisons.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: Access environment variables through theenvexport ofenv.server.tsinstead of directly accessingprocess.env
Use subpath exports from@trigger.dev/corepackage instead of importing from the root@trigger.dev/corepath
Files:
apps/webapp/app/services/platform.v3.server.ts
🧠 Learnings (9)
📚 Learning: 2026-03-26T17:27:09.938Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3264
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.settings.private-connections._index/route.tsx:176-176
Timestamp: 2026-03-26T17:27:09.938Z
Learning: In `apps/webapp/app/routes/_app.orgs.$organizationSlug.settings.private-connections._index/route.tsx`, the variable `hasPrivateNetworking` is intentionally hardcoded to `true` as a placeholder. Plan-level gating will be wired to actual billing data (e.g., `plan?.v3Subscription?.plan?.limits?.hasPrivateNetworking`) once the billing integration is complete. The route is already guarded at the feature-flag level via `hasPrivateConnections` in the loader. Do not flag this hardcoded value as dead code or a bug until the billing integration is in place.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2025-11-14T16:03:06.917Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2681
File: apps/webapp/app/services/platform.v3.server.ts:258-302
Timestamp: 2025-11-14T16:03:06.917Z
Learning: In `apps/webapp/app/services/platform.v3.server.ts`, the `getDefaultEnvironmentConcurrencyLimit` function intentionally throws an error (rather than falling back to org.maximumConcurrencyLimit) when the billing client returns undefined plan limits. This fail-fast behavior prevents users from receiving more concurrency than their plan entitles them to. The org.maximumConcurrencyLimit fallback is only for self-hosted deployments where no billing client exists.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-03-26T10:02:25.354Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3254
File: apps/webapp/app/services/platformNotifications.server.ts:363-385
Timestamp: 2026-03-26T10:02:25.354Z
Learning: In `triggerdotdev/trigger.dev`, the `getNextCliNotification` fallback in `apps/webapp/app/services/platformNotifications.server.ts` intentionally uses `prisma.orgMember.findFirst` (single org) when no `projectRef` is provided. This is acceptable for v1 because the CLI (`dev` and `login` commands) always passes `projectRef` in normal usage, making the fallback a rare edge case. Do not flag the single-org fallback as a multi-org correctness bug in this file.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-04-07T14:12:59.018Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3331
File: apps/webapp/app/runEngine/concerns/batchPayloads.server.ts:112-136
Timestamp: 2026-04-07T14:12:59.018Z
Learning: In `apps/webapp/app/runEngine/concerns/batchPayloads.server.ts`, the `pRetry` call wrapping `uploadPacketToObjectStore` intentionally retries **all** error types (no `shouldRetry` filter / `AbortError` guards). The maintainer explicitly prefers over-retrying to under-retrying because multiple heterogeneous object store backends are supported and it is impractical to enumerate all permanent error signatures. Do not flag this as an issue in future reviews.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-02-19T18:09:23.944Z
Learnt from: samejr
Repo: triggerdotdev/trigger.dev PR: 3095
File: apps/webapp/app/components/navigation/DashboardDialogs.tsx:47-47
Timestamp: 2026-02-19T18:09:23.944Z
Learning: In the triggerdotdev/trigger.dev codebase, the pattern `isPaying === false` is used consistently to explicitly check for free plan users. This is a project convention that distinguishes between `isPaying === true` (paying), `isPaying === false` (free), and `isPaying === undefined` (no subscription data). Do not suggest changing this to negation pattern.
```
<!-- <review_comment_addressed>
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-04-13T21:44:00.032Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/services/taskIdentifierRegistry.server.ts:24-67
Timestamp: 2026-04-13T21:44:00.032Z
Learning: In `apps/webapp/app/services/taskIdentifierRegistry.server.ts`, the sequential upsert/updateMany/findMany writes in `syncTaskIdentifiers` are intentionally NOT wrapped in a Prisma transaction. This function runs only during deployment-change events (low-concurrency path), and any partial `isInLatestDeployment` state is acceptable because it self-corrects on the next deployment. Do not flag this as a missing-transaction/atomicity issue in future reviews.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-03-26T09:02:07.973Z
Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 3274
File: apps/webapp/app/services/runsReplicationService.server.ts:922-924
Timestamp: 2026-03-26T09:02:07.973Z
Learning: When parsing Trigger.dev task run annotations in server-side services, keep `TaskRun.annotations` strictly conforming to the `RunAnnotations` schema from `trigger.dev/core/v3`. If the code already uses `RunAnnotations.safeParse` (e.g., in a `#parseAnnotations` helper), treat that as intentional/necessary for atomic, schema-accurate annotation handling. Do not recommend relaxing the annotation payload schema or using a permissive “passthrough” parse path, since the annotations are expected to be written atomically in one operation and should not contain partial/legacy payloads that would require a looser parser.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
🔇 Additional comments (2)
apps/webapp/app/services/platform.v3.server.ts (2)
542-570: LGTM! Well-designed SWR implementation with proper error handling.The approach of catching errors inside the loader and returning
undefined(to prevent caching the fail-open value) while applying the fail-open logic outside the SWR call is sound. The comment clearly explains the rationale for this design, particularly around preventing unhandled rejections in background revalidation.
74-78: Verify the type ofReportUsageResultimported from@trigger.dev/platform.The namespace caches entitlement responses (with
hasAccessandplanproperties) using theReportUsageResulttype imported from@trigger.dev/platform. While the code compiles and functions correctly, the type name suggests it's intended for usage reporting rather than entitlement responses. Confirm with the platform package maintainers whetherReportUsageResultis the intended type for entitlement responses, or if there's a more semantically appropriate type available.
Wraps getEntitlement in platform.v3.server.ts with the existing
platformCache (LRU memory + Redis) under a new `entitlement` namespace.
Eliminates a synchronous billing-service HTTP round trip on every
trigger.
Cache config: 60s fresh / 60s stale SWR. Cache key is the
organization id. Errors are caught inside the loader and return the
existing permissive { hasAccess: true } fallback, which is also
cached to prevent thundering-herd on billing outages.
Trade-off: plan upgrade/downgrade is now visible after up to ~120s
worst-case (60s fresh + 60s stale revalidation). Acceptable since
the existing limits and usage namespaces use 5min/10min, and the
defensive hasAccess: true fallback already exists.
Devin caught a correctness bug in the previous commit. Returning
{ hasAccess: true } from inside the SWR loader on error caused that
fail-open value to be cached for 60-120s, which could overwrite a
legitimate hasAccess: false during a transient billing outage and
grant a blocked org access for up to 120s.
Fix: catch errors inside the loader (so we don't trigger the @unkey/cache
unhandled-rejection issue during background revalidation) and return
undefined. Apply the fail-open default *outside* the SWR call so it
never becomes a cached access decision.
Trade-off: returning undefined from the loader still overwrites the
previous cached entry with an undefined value, but @unkey/cache's
swr() treats an undefined cached value as a miss and re-fetches on the
next request — so on billing recovery, the cache picks up the real
result immediately rather than serving a stale fail-open for up to
120s.
The stale parameter on @unkey/cache Namespace is the TOTAL ttl, not an additional window beyond fresh. Setting fresh: 60_000 and stale: 60_000 gave a fresh-only entry that expired entirely at 60s with no SWR window. Setting stale: 120_000 yields the intended behavior: fresh 0-60s, stale-revalidate 60-120s. Verified locally with agentcrumbs by running through cache miss, fresh hit, stale revalidate, and stale-with-failed-bg-revalidate scenarios against a live billing server.
When setPlan transitions a customer's plan (free_connected, updated_subscription, canceled_subscription), invalidate the new entitlement cache alongside the existing billing cache invalidation. Without this, a downgrade or cancellation could leave hasAccess: true served from cache for up to 120s — meaning revoked access would linger. Per Devin's review on the swr cache PR.
dda13d9 to
58a87f4
Compare
Cache removal in setPlan should not block or throw after the billing
mutation has already committed. Switch from await to fire-and-forget
with .catch(() => {}) so a Redis failure during cache invalidation
cannot 500 the plan-change response.
Per CodeRabbit review.
## Summary 8 new features, 18 improvements, 11 bug fixes. ## Breaking changes - Add server-side deprecation gate for deploys from v3 CLI versions (gated by `DEPRECATE_V3_CLI_DEPLOYS_ENABLED`). v4 CLI deploys are unaffected. ([#3415](#3415)) ## Improvements - Add `--no-browser` flag to `init` and `login` to skip auto-opening the browser during authentication. Also error loudly when `init` is run without `--yes` under non-TTY stdin (previously default-and-exited silently, leaving the project half-initialized). Both commands now show an `Examples` section in `--help`. ([#3483](#3483)) - Add `isReplay` boolean to the run context (`ctx.run.isReplay`), derived from the existing `replayedFromTaskRunFriendlyId` database field. Defaults to `false` for backwards compatibility. ([#3454](#3454)) - Redact the `resolveWaitpoint` runtime log so it only emits `id` and `type` instead of the full completed waitpoint. Previously the log printed the entire waitpoint (including `output`) to stdout in production runs, which could leak sensitive payloads. The value returned by `wait.forToken()` is unchanged. ([#3490](#3490)) - Add `SessionId` friendly ID generator and schemas for the new durable Session primitive. Exported from `@trigger.dev/core/v3/isomorphic` alongside `RunId`, `BatchId`, etc. Ships the `CreateSessionStreamWaitpoint` request/response schemas alongside the main Session CRUD. ([#3417](#3417)) - Truncate large error stacks and messages to prevent OOM crashes. Stack traces are capped at 50 frames (keeping top 5 + bottom 45 with an omission notice), individual stack lines at 1024 chars, and error messages at 1000 chars. Applied in parseError, sanitizeError, and OTel span recording. ([#3405](#3405)) ## Server changes These changes affect the self-hosted Docker image and Trigger.dev Cloud: - Add a "Back office" tab to `/admin` and a per-organization detail page at `/admin/back-office/orgs/:orgId`. The first action available on that page is editing the org's API rate limit: admins can save a `tokenBucket` override (refill rate, interval, max tokens) and see a plain-English preview of the resulting sustained rate and burst allowance. Writes are audit-logged via the server logger. ([#3434](#3434)) - Optional `DEPLOY_REGISTRY_ECR_DEFAULT_REPOSITORY_POLICY` env var to apply a default repository policy when the webapp creates new ECR repos ([#3467](#3467)) - Ship the Errors page to all users, with a polish + bug-fix pass: pinned "No channel" item in the Slack alert channel picker, viewer-timezone alert timestamps via Slack's `<!date^>` token, Activity sparkline peak tooltip, centered loading spinner and bug-icon empty state on the error detail page, ellipsis on the Configure alerts trigger. ([#3477](#3477)) - Configure the set of machine presets to build boot snapshots for at deploy time via `COMPUTE_TEMPLATE_MACHINE_PRESETS` (CSV of preset names, default `small-1x`). Use `COMPUTE_TEMPLATE_MACHINE_PRESETS_REQUIRED` (CSV, default = full PRESETS list) to scope which preset failures fail a required-mode deploy. Optional preset failures are logged and don't block the deploy. ([#3492](#3492)) - Regenerating a RuntimeEnvironment API key no longer invalidates the previous key immediately. The old key is recorded in a new `RevokedApiKey` table with a 24 hour grace window, and `findEnvironmentByApiKey` falls back to it when the submitted key doesn't match any live environment. The grace window can be ended early (or extended) by updating `expiresAt` on the row. ([#3420](#3420)) - Add the `Session` primitive — a durable, task-bound, bidirectional I/O channel that outlives a single run and acts as the run manager for `chat.agent`. Ships the Postgres `Session` + `SessionRun` tables, ClickHouse `sessions_v1` + replication service, the `sessions` JWT scope, and the public CRUD + realtime routes (`/api/v1/sessions`, `/realtime/v1/sessions/:session/:io`) including `end-and-continue` for server-orchestrated run handoffs and session-stream waitpoints. ([#3417](#3417)) - Add `KUBERNETES_POD_DNS_NDOTS_OVERRIDE_ENABLED` flag (off by default) that overrides the cluster default and sets `dnsConfig.options.ndots` on runner pods (defaulting to 2, configurable via `KUBERNETES_POD_DNS_NDOTS`). Kubernetes defaults pods to `ndots: 5`, so any name with fewer than 5 dots — including typical external domains like `api.example.com` — is first walked through every entry in the cluster search list (`<ns>.svc.cluster.local`, `svc.cluster.local`, `cluster.local`) before being tried as-is, turning one resolution into 4+ CoreDNS queries (×2 with A+AAAA). Using a lower `ndots` value reduces DNS query amplification in the `cluster.local` zone. Note: before enabling, make sure no code path relies on search-list expansion for names with dots ≥ the configured value — those names will hit their as-is form first and could resolve externally before falling back to the cluster search path. ([#3441](#3441)) - Vercel integration option to disable auto promotions ([#3376](#3376)) - Make it clear in the admin that feature flags are global and should rarely be changed. ([#3408](#3408)) - Admin worker groups API: add GET loader and expose more fields on POST. ([#3390](#3390)) - Add 60s fresh / 60s stale SWR cache to `getEntitlement` in `platform.v3.server.ts`. Eliminates a synchronous billing-service HTTP round trip on every trigger. Reuses the existing `platformCache` (LRU memory + Redis) pattern already used for `limits` and `usage`. Cache key is `${orgId}`. Errors return a permissive `{ hasAccess: true }` fallback (existing behavior) and are also cached to prevent thundering-herd on billing outages. ([#3388](#3388)) - Show a `MicroVM` badge next to the region name on the regions page. ([#3407](#3407)) - Increase default maximum project count per organization from 10 to 25 ([#3409](#3409)) - Merge execution snapshot creation into the dequeue taskRun.update transaction, reducing 2 DB commits to 1 per dequeue operation ([#3395](#3395)) - Add per-worker Node.js heap metrics to the OTel meter — `nodejs.memory.heap.used`, `nodejs.memory.heap.total`, `nodejs.memory.heap.limit`, `nodejs.memory.external`, `nodejs.memory.array_buffers`, `nodejs.memory.rss`. Host-metrics only publishes RSS, which overstates V8 heap by the external + native footprint; these give direct heap visibility per cluster worker so `NODE_MAX_OLD_SPACE_SIZE` can be sized against observed heap peaks rather than RSS. ([#3437](#3437)) - Tag Prisma spans with `db.datasource: "writer" | "replica"` so monitors and trace queries can distinguish the writer pool from the replica pool. Applies to all `prisma:engine:*` spans (including `prisma:engine:connection` used by the connection-pool monitors) and the outer `prisma:client:operation` span. ([#3422](#3422)) - Clarify the cross-region intent in the Terraform and AI-prompt helpers on the Add Private Connection page. Both already default `supported_regions` to `["us-east-1", "eu-central-1"]`; added an inline comment / parenthetical so the user understands why both regions are listed (Trigger.dev runs in both, so the service must be consumable from either). ([#3465](#3465)) - Add `RUN_ENGINE_READ_REPLICA_SNAPSHOTS_SINCE_ENABLED` flag (default off) to route the Prisma reads inside `RunEngine.getSnapshotsSince` through the read-only replica client. Offloads the snapshot polling queries (fired by every running task runner) from the primary. When disabled, behavior is unchanged. ([#3423](#3423)) - Stop creating TaskRunTag records and _TaskRunToTaskRunTag join table entries during task triggering. The denormalized runTags string array on TaskRun already stores tag names, making the M2M relation redundant write overhead. ([#3369](#3369)) - Stop writing per-tick state (`lastScheduledTimestamp`, `nextScheduledTimestamp`, `lastRunTriggeredAt`) on `TaskSchedule` and `TaskScheduleInstance`. The schedule engine now carries the previous fire time forward via the worker queue payload, eliminating ~270K dead-tuple-driven autovacuums per year on these hot tables and the associated `IO:XactSync` mini-spikes on the writer. Customer-facing `payload.lastTimestamp` semantics are unchanged. ([#3476](#3476)) - Replace the expensive DISTINCT query for task filter dropdowns with a dedicated TaskIdentifier registry table backed by Redis. Environments migrate automatically on their next deploy, with a transparent fallback to the legacy query for unmigrated environments. Also fixes duplicate dropdown entries when a task changes trigger source, and adds active/archived grouping for removed tasks. Moves BackgroundWorkerTask reads in the trigger hot path to the read replica. ([#3368](#3368)) - Public Access Tokens (PATs) minted before an API key rotation now keep working during the 24h grace window. `validatePublicJwtKey` falls back to any non-expired `RevokedApiKey` rows for the signing environment when the primary signature check against the env's current `apiKey` fails. The fallback query only runs on the failure path, so the hot success path is unchanged. ([#3464](#3464)) - Batch items that hit the environment queue size limit now fast-fail without retries and without creating pre-failed TaskRuns. ([#3352](#3352)) - Show the cancel button in the runs list for runs in `DEQUEUED` status. `DEQUEUED` was missing from `NON_FINAL_RUN_STATUSES` so the list hid the button even though the single run page allowed it. ([#3421](#3421)) - Reduce 5xx feedback loops on hot debounce keys by quantizing `delayUntil`, adding an unlocked fast-path skip, and gracefully handling redlock contention in `handleDebounce` so the SDK no longer retries into a herd. ([#3453](#3453)) - Fix RSS memory leak in the realtime proxy routes. `/realtime/v1/runs`, `/realtime/v1/runs/:id`, and `/realtime/v1/batches/:id` called `fetch()` into Electric with no abort signal, so when a client disconnected mid long-poll, undici kept the upstream socket open and buffered response chunks that would never be consumed — retained only in RSS, invisible to V8 heap tooling. Thread `getRequestAbortSignal()` through `RealtimeClient.streamRun/streamRuns/streamBatch` to `longPollingFetch` and cancel the upstream body in the error path. Isolated reproducer showed ~44 KB retained per leaked request; signal propagation releases it cleanly. ([#3442](#3442)) - Fix memory leak where every aborted SSE connection pinned the full request/response graph on Node 20, caused by `AbortSignal.any()` in `sse.ts` retaining its source signals indefinitely (see nodejs/node#54614, nodejs/node#55351). Also clear the `setTimeout(abort)` timer in `entry.server.tsx` so successful HTML renders don't pin the React tree for 30s per request. ([#3430](#3430)) - Preserve filters on the queues page when submitting modal actions. ([#3471](#3471)) - Fix Redis connection leak in realtime streams and broken abort signal propagation. **Redis connections**: Non-blocking methods (ingestData, appendPart, getLastChunkIndex) now share a single Redis connection instead of creating one per request. streamResponse still uses dedicated connections (required for XREAD BLOCK) but now tears them down immediately via disconnect() instead of graceful quit(), with a 15s inactivity fallback. **Abort signal**: request.signal is broken in Remix/Express due to a Node.js undici GC bug (nodejs/node#55428) that severs the signal chain when Remix clones the Request internally. Added getRequestAbortSignal() wired to Express res.on("close") via httpAsyncStorage, which fires reliably on client disconnect. All SSE/streaming routes updated to use it. ([#3399](#3399)) - Prevent dashboard crash (React error #31) when span accessory item text is not a string. Filters out malformed accessory items in SpanCodePathAccessory instead of passing objects to React as children. ([#3400](#3400)) - Upgrade Remix packages from 2.1.0 to 2.17.4 to address security vulnerabilities in React Router ([#3372](#3372)) - Fix Vercel integration settings page (remove redundant section toggles) and improve the Vercel onboarding flow so the modal closes after connecting a GitHub repo and the marketplace `next` URL is preserved across the GitHub app install redirect. ([#3424](#3424)) <details> <summary>Raw changeset output</summary> # Releases ## @trigger.dev/build@4.4.5 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.4.5` ## trigger.dev@4.4.5 ### Patch Changes - Add `--no-browser` flag to `init` and `login` to skip auto-opening the browser during authentication. Also error loudly when `init` is run without `--yes` under non-TTY stdin (previously default-and-exited silently, leaving the project half-initialized). Both commands now show an `Examples` section in `--help`. ([#3483](#3483)) - Updated dependencies: - `@trigger.dev/core@4.4.5` - `@trigger.dev/build@4.4.5` - `@trigger.dev/schema-to-json@4.4.5` ## @trigger.dev/core@4.4.5 ### Patch Changes - Add `isReplay` boolean to the run context (`ctx.run.isReplay`), derived from the existing `replayedFromTaskRunFriendlyId` database field. Defaults to `false` for backwards compatibility. ([#3454](#3454)) - Redact the `resolveWaitpoint` runtime log so it only emits `id` and `type` instead of the full completed waitpoint. Previously the log printed the entire waitpoint (including `output`) to stdout in production runs, which could leak sensitive payloads. The value returned by `wait.forToken()` is unchanged. ([#3490](#3490)) - Add `SessionId` friendly ID generator and schemas for the new durable Session primitive. Exported from `@trigger.dev/core/v3/isomorphic` alongside `RunId`, `BatchId`, etc. Ships the `CreateSessionStreamWaitpoint` request/response schemas alongside the main Session CRUD. ([#3417](#3417)) - Truncate large error stacks and messages to prevent OOM crashes. Stack traces are capped at 50 frames (keeping top 5 + bottom 45 with an omission notice), individual stack lines at 1024 chars, and error messages at 1000 chars. Applied in parseError, sanitizeError, and OTel span recording. ([#3405](#3405)) ## @trigger.dev/python@4.4.5 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.4.5` - `@trigger.dev/build@4.4.5` - `@trigger.dev/sdk@4.4.5` ## @trigger.dev/react-hooks@4.4.5 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.4.5` ## @trigger.dev/redis-worker@4.4.5 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.4.5` ## @trigger.dev/rsc@4.4.5 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.4.5` ## @trigger.dev/schema-to-json@4.4.5 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.4.5` ## @trigger.dev/sdk@4.4.5 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.4.5` </details> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Wraps getEntitlement in platform.v3.server.ts with the existing
platformCache (LRU memory + Redis) under a new
entitlementnamespace.Eliminates a synchronous billing-service HTTP round trip on every
trigger.
Cache config: 60s fresh / 60s stale SWR. Cache key is the
organization id. Errors are caught inside the loader and return the
existing permissive { hasAccess: true } fallback, which is also
cached to prevent thundering-herd on billing outages.
Trade-off: plan upgrade/downgrade is now visible after up to ~120s
worst-case (60s fresh + 60s stale revalidation). Acceptable since
the existing limits and usage namespaces use 5min/10min, and the
defensive hasAccess: true fallback already exists.