Bug: Webhooks stopped triggering updates + Sonarr season packs cause incomplete download pickup and app crashes (v1.7.31) #61

Closed
opened 2026-05-28 08:49:01 +01:00 by Gandalf · 0 comments
Owner

Description:

Two related issues are affecting real-time updates and queue accuracy in v1.7.31 (release/1.7.31, merged 2026-05-28):

  1. Webhooks are not reliably triggering dashboard/SSE updates ("not working again").
  2. Not all downloads are being picked up — ~140 episode-level items appear in the Sonarr queue, but far fewer appear in the download client views (qBittorrent, SABnzbd, etc.) because many are multi-episode season packs. This mismatch has already caused at least one app crash.

These issues surfaced or regressed after the frontend remediation and poller/SSE test expansions in v1.7.31.

Investigation Findings (Code Analysis on release/1.7.31)

1. Webhook Issues (server/routes/webhook.js + related)

  • Replay protection is overly aggressive (isReplay function):

    • Uses key ${eventType}:${instanceName}:${eventDate} (or ${requestId}-${eventDate} for Ombi).
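    • Sonarr/Radarr can send identical timestamps for events fired in rapid succession, so distinct, legitimate webhooks collide on the same key, return 200 { duplicate: true }, and skip processWebhookEvent() + SSE broadcast. Slightly varying date values for the same event have the opposite effect: the key changes, so a true replay is not detected.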
    • This explains "not working again" after previous fixes.
  • Instance resolution fallback is brittle:

    • sonarrInstances.find(i => i.name === instanceName || i.id === instanceName) || sonarrInstances[0]
    • Mismatch between *arr instanceName payload and configured instances causes wrong cache/metrics updates.
  • Ombi path still fragile (despite query-param secret fallback in commit 7b9c895):

    • Does not call validatePayload().
    • Complex 3-retry + delay + extractRequestedUser() logic can fail silently for some payloads/Ombi versions, leaving poll:ombi-requests incomplete.
  • Fire-and-forget + SSE dependency:

    • processWebhookEvent(...).catch(...) only logs errors.
    • Final await pollAllServices() (for SSE) can fail silently if poller/cache has issues (see below).
    • No per-webhook success confirmation in UI.
  • v1.7.31 made no changes to webhook logic (only frontend + tests), so the earlier partial fixes were left as they were and their remaining weaknesses resurface under load.
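
The replay-key collision described above can be avoided by keying on what the event is about rather than when it was stamped. The following is a minimal sketch, not the code in server/routes/webhook.js: `replayKey` and `createReplayGuard` are hypothetical helpers, and the payload fields used (`downloadId`, `episodes[].id`) are assumptions about what the *arr payload carries.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: derive the replay key from identifying payload fields
// instead of the raw event date. Field names are assumptions.
function replayKey(eventType, instanceName, payload) {
  const identity = {
    eventType,
    instanceName,
    downloadId: payload.downloadId ?? null,
    // Sort so that episode order in the payload does not change the key.
    episodeIds: (payload.episodes ?? []).map((e) => e.id).sort((a, b) => a - b),
  };
  return createHash("sha256").update(JSON.stringify(identity)).digest("hex");
}

// Hypothetical in-memory replay guard with a time-to-live per key.
function createReplayGuard(ttlMs = 60_000, now = () => Date.now()) {
  const seen = new Map(); // key -> expiry timestamp in ms
  return function isReplay(key) {
    const t = now();
    for (const [k, expiry] of seen) {
      if (expiry <= t) seen.delete(k); // drop expired entries
    }
    if (seen.has(key)) return true;
    seen.set(key, t + ttlMs);
    return false;
  };
}

// Two distinct events with the same timestamp no longer collide.
const isReplay = createReplayGuard();
const grabA = replayKey("Grab", "sonarr-main", { downloadId: "ABC", episodes: [{ id: 1 }] });
const grabB = replayKey("Grab", "sonarr-main", { downloadId: "DEF", episodes: [{ id: 2 }] });
console.log(isReplay(grabA)); // false: first delivery
console.log(isReplay(grabB)); // false: different download, same moment
console.log(isReplay(grabA)); // true: genuine redelivery
```

Because the timestamp is no longer part of the key, the time-to-live alone decides how long a redelivery is suppressed; the 60-second default here is an arbitrary placeholder.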

2. Sonarr Queue / Season Pack Handling (server/clients/PollingSonarrRetriever.js + poller + cache)

  • getQueue() implementation (lines ~20-60):

    // Pagination with includeSeries + includeEpisode
    const response = await axios.get(..., { params: { includeSeries: true, includeEpisode: true, page, pageSize: 1000 } });
    allRecords = allRecords.concat(records);  // Simple concat, no transformation
    
    • Correctly paginates (up to 50 pages / 50k items) and enriches with series/episode data.
    • No handling whatsoever for season packs / multi-episode items:
      • Each season pack record (representing 10–24 episodes) is treated as one queue item.
      • No unpacking, no episodeCount aggregation, no deduplication against download client queues.
    • Result: Sonarr queue shows ~140 items; download clients (qBittorrent etc.) show far fewer actual torrents → UI mismatch.
  • Poller + Webhook refresh path (server/utils/poller.js + arrRetrieverRegistry.getQueuesByType()):

    • Flattens records with _instanceUrl / _instanceKey and caches under poll:sonarr-queue.
    • When webhooks fire, processWebhookEvent calls this and then pollAllServices() for SSE.
    • No special season-pack logic → dashboard downloads view under-reports.
    • Over-reliance on webhooks: polling is skipped when there has been recent webhook activity, so updates are missed when season packs complete without triggering the expected events.
  • Likely crash causes:

    • With 140+ complex items (season packs have nested series, episode, status, etc.), downstream code (dashboard rendering, SSE payload construction, cache operations, or UI lists in server/routes/dashboard.js) can hit:
      • Memory spikes from repeated deep object traversal.
      • Unhandled nulls/missing fields in episode data for packs.
      • React re-render loops or virtualisation failures on large un-deduplicated lists.
    • One crash is confirmed: a user reported it after the ~140-item queue appeared.
  • No deduplication or cross-client reconciliation anywhere in the queue/download client pipeline.
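
The missing reconciliation step can be pictured as a grouping pass over the queue records. This is a sketch under stated assumptions, not existing code: `groupByDownload` is a hypothetical helper, it assumes that records belonging to one download share a `downloadId`, and it assumes `size` / `sizeleft` on each record describe the whole download.

```javascript
// Hypothetical post-processing step: collapse queue records that share a
// downloadId into one entry, so the count lines up with the download client.
function groupByDownload(records) {
  const groups = new Map();
  for (const record of records) {
    // Records without a downloadId stay separate under a synthetic key.
    const key = record.downloadId ?? `record:${record.id}`;
    let group = groups.get(key);
    if (!group) {
      group = {
        downloadId: record.downloadId ?? null,
        title: record.title ?? "(untitled)",
        size: record.size ?? 0, // assumption: size of the whole download
        sizeleft: record.sizeleft ?? 0, // assumption: bytes still to fetch
        records: [],
      };
      groups.set(key, group);
    }
    group.records.push(record); // keep the original records for the UI
  }
  return [...groups.values()].map((g) => ({
    downloadId: g.downloadId,
    title: g.title,
    episodeCount: g.records.length,
    isSeasonPack: g.records.length > 1,
    // Overall progress in percent; 0 when the size is unknown.
    progress: g.size > 0 ? Math.round((1 - g.sizeleft / g.size) * 100) : 0,
    records: g.records,
  }));
}

// Example: two records from one pack plus one single-episode download.
const queue = [
  { id: 1, downloadId: "PACK1", title: "Show S01", size: 1000, sizeleft: 250 },
  { id: 2, downloadId: "PACK1", title: "Show S01", size: 1000, sizeleft: 250 },
  { id: 3, downloadId: "SINGLE", title: "Other S02E03", size: 500, sizeleft: 500 },
];
const grouped = groupByDownload(queue);
console.log(grouped.length); // 2
console.log(grouped[0].isSeasonPack, grouped[0].episodeCount, grouped[0].progress); // true 2 75
```

The same `downloadId` key could then be used to match each group against the corresponding qBittorrent or SABnzbd entry, which is the cross-client check the pipeline currently lacks.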

Impact

  • Real-time updates via webhooks are unreliable (replay protection + SSE gaps).
  • Download progress is incomplete/inaccurate for anyone using season packs (common in Sonarr).
  • Risk of crashes on moderate-to-large queues.
  • Breaks the "real-time dashboard" promise, especially when webhooks + poller interact.

Proposed Solution / Fix Plan:

  1. Webhook Reliability (High Priority)

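    • Soften replay protection: key on a content hash of identifying payload fields instead of the raw date. Normalising eventDate to minute precision is only safe in combination with such a hash, because coarser timestamps on their own make rapid, distinct events collide more often.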
    • Add optional forceRefresh query param or admin bypass for testing.
    • Improve instance matching (fuzzy or ID-only fallback with logging).
    • Add structured success/failure response + metrics for webhook deliveries.
    • Ensure pollAllServices() errors are properly surfaced.
  2. Sonarr Season Pack / Multi-Episode Handling (Critical)

    • In PollingSonarrRetriever.getQueue() (or post-processing in arrRetrieverRegistry):
      • Detect season packs ((seasonNumber && !episodeNumbers) || episodeCount > 1).
      • Either:
        • Unpack into individual episode records for display (preferred for accurate per-episode progress), or
        • Add isSeasonPack: true, episodeCount, and aggregated progress fields.
      • Store original Sonarr record + derived episode list.
    • Add deduplication logic when merging Sonarr queue with download client queues (match by downloadId / torrent hash / title).
    • Update dashboard UI to clearly show "Season Pack (X episodes)" with overall progress.
  3. Stability / Crash Prevention

    • Add size limits + pagination/virtualisation warnings in dashboard for queues > 100 items.
    • Add defensive null checks and try/catch around queue flattening and SSE payload construction.
    • Increase test coverage for season-pack payloads (currently expanded tests don't cover this).
  4. Implementation Notes

    • Keep backward-compatible cache keys (poll:sonarr-queue).
    • Update README.md and webhook docs to recommend proper *arr webhook settings (include full episode info).
    • Add a "Queue Diagnostics" panel in the UI showing raw vs processed counts.
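
The defensive flattening and the raw vs processed counts from the plan above could be combined in one pass. The following is a rough sketch only: `flattenQueues` is a hypothetical function, the input shape (`instanceKey`, `instanceUrl`, `records`) is an assumption, and the instance URL in the example is a placeholder.

```javascript
// Hypothetical flattening step: tags each record with its instance, tolerates
// missing nested fields, and reports counts for a diagnostics panel.
function flattenQueues(queuesByInstance) {
  const items = [];
  let raw = 0;
  let skipped = 0;
  for (const queue of queuesByInstance ?? []) {
    const records = Array.isArray(queue?.records) ? queue.records : [];
    for (const record of records) {
      raw += 1;
      try {
        if (record === null || typeof record !== "object") {
          throw new TypeError("record is not an object");
        }
        items.push({
          ...record,
          _instanceKey: queue.instanceKey ?? null,
          _instanceUrl: queue.instanceUrl ?? null,
          // Optional chaining guards against missing series/episode data.
          seriesTitle: record.series?.title ?? "Unknown series",
          episodeTitle: record.episode?.title ?? null,
        });
      } catch (err) {
        // One malformed record is counted and logged, not fatal.
        skipped += 1;
        console.warn(`Skipping malformed queue record: ${err.message}`);
      }
    }
  }
  return { items, diagnostics: { raw, processed: items.length, skipped } };
}

// Example with one well-formed record, one null, and one without series data.
const result = flattenQueues([
  {
    instanceKey: "sonarr-main",
    instanceUrl: "http://sonarr.example", // placeholder URL
    records: [{ id: 1, series: { title: "Show" } }, null, { id: 2 }],
  },
]);
console.log(result.diagnostics); // { raw: 3, processed: 2, skipped: 1 }
```

Returning the counts alongside the items keeps the existing poll:sonarr-queue payload shape untouched if only `items` is cached, while the diagnostics object can feed the proposed panel.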

Suggested Labels:
Kind/Bug, Priority: High, Area/Webhooks, Area/Queue, Compat/Non-Breaking (with migration note for season packs)

Affected Versions: v1.7.30 – v1.7.31 (regressed after the frontend remediation and poller/SSE test expansions)

This ticket combines two user-reported issues that share the webhook → queue refresh → display pipeline. Fixing both will restore reliable real-time operation and accurate multi-episode tracking.

Gandalf added the Area/Webhooks, Kind/Bug, and Priority/High labels 2026-05-28 11:53:31 +01:00
Reference: Gandalf/sofarr#61