sofarr/ARCHITECTURE.md
Gronod 1bef14d590
feat(webhooks): security hardening, tests, full documentation audit & polish (Phase 6)
2026-05-19 17:11:45 +01:00


sofarr — Architecture Reference

Concise top-level architecture guide. For the full deep-dive (API reference, matching pipeline, deployment) see docs/ARCHITECTURE.md.


1. Overview

sofarr is a Node.js/Express server with a single-page front end. It aggregates download activity from multiple media automation services, filters the results by Emby user identity, and presents a personalised real-time dashboard.

Three pluggable layers form the core:

| Layer | Name | Location |
|---|---|---|
| Download client abstraction | PDCA (Pluggable Download Client Architecture) | server/clients/ + server/utils/downloadClients.js |
| *arr data retrieval | PALDRA (Pluggable *Arr Library Data Retrieval Architecture) | server/utils/arrRetrievers.js |
| Real-time push | Webhook receiver | server/routes/webhook.js |

2. Request / Data Flow

Browser (SPA)
    │  POST /api/auth/login          → Auth routes → Emby verify → set httpOnly cookie
    │  GET  /api/dashboard/stream    → SSE stream → poller cache → matched downloads
    │  POST /api/webhook/*           ← Sonarr/Radarr push events
    │
    ▼
Express Server (:3001)
    ├── Helmet (CSP nonce, HSTS, X-Frame-Options, …)
    ├── express-rate-limit (300/15 min general; 60/1 min webhook)
    ├── cookie-parser (HMAC-signed session cookie)
    ├── verifyCsrf (double-submit cookie, all state-changing /api routes except auth + webhook)
    │
    ├── /api/auth          → login, logout, me, csrf
    ├── /api/webhook       → [rate-limit] → [secret validation] → [payload validation]
    │                        → [replay check] → updateWebhookMetrics → processWebhookEvent
    ├── /api/dashboard     → requireAuth → read cache → match downloads → SSE/JSON
    ├── /api/history       → requireAuth → historyFetcher (5 min cache) → filter + dedup
    ├── /api/sonarr|radarr → requireAuth → verifyCsrf → proxy to *arr API
    └── /api/sabnzbd|emby  → requireAuth → verifyCsrf → proxy

Background:
    Poller (setInterval POLL_INTERVAL ms)
        └── shouldSkipInstancePolling? ──yes──► extend cache TTL, increment pollsSkipped
                │ no (or fallback triggered)
                ▼
            PDCA Registry.getDownloadsByClientType()
            PALDRA Registry.getQueuesByType() / getHistoryByType() / getTagsByType()
                │
                ▼
            cache.set('poll:*', data, TTL)
                │
                ▼
            notify pollSubscribers → SSE push to all connected browsers
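
The last step of the background flow, notifying pollSubscribers and pushing over SSE, could look roughly like the sketch below. Only the name pollSubscribers comes from the diagram; subscribe, notifySubscribers, sseFrame, and the choice of a Set are assumptions made for illustration.

```javascript
// Hypothetical subscriber registry for the poller (not the project's actual code).
const pollSubscribers = new Set();

// Register a callback; the returned function unsubscribes it,
// e.g. when the browser closes its SSE connection.
function subscribe(cb) {
  pollSubscribers.add(cb);
  return () => pollSubscribers.delete(cb);
}

// Called after the cache has been refreshed. A subscriber that throws
// is logged and skipped so it cannot block the remaining subscribers.
function notifySubscribers(payload) {
  for (const cb of pollSubscribers) {
    try {
      cb(payload);
    } catch (err) {
      console.error('[Poller] subscriber failed:', err.message);
    }
  }
}

// Formats a payload as a single Server-Sent Events frame.
function sseFrame(payload) {
  return `data: ${JSON.stringify(payload)}\n\n`;
}
```

In an Express route, a subscriber would typically write `sseFrame(data)` to the response and call the unsubscribe function on the request's `close` event.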

3. Pluggable Download Client Architecture (PDCA)

All download clients extend DownloadClient (abstract base in server/clients/DownloadClient.js):

DownloadClient (abstract)
├── SABnzbdClient       — REST API, API key auth
├── QBittorrentClient   — Sync API (incremental deltas), cookie auth, fallback to /torrents/info
├── TransmissionClient  — JSON-RPC, session-ID management
└── RTorrentClient      — XML-RPC, HTTP Basic Auth

DownloadClientRegistry (server/utils/downloadClients.js) initialises all configured clients from *_INSTANCES env vars, fetches from all in parallel, and returns a { sabnzbd, qbittorrent, transmission, rtorrent } map. Individual client failures are isolated.
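
One way to get the parallel fetch with isolated failures described above is Promise.allSettled. The sketch below is an illustration under assumptions: fetchAllDownloads is a hypothetical name, and returning an empty list for a failed client is a guess at the behaviour, not something the registry is confirmed to do.

```javascript
// Hypothetical sketch: fetch from every configured client in parallel.
// `clients` maps a client type (e.g. 'sabnzbd') to an object that
// exposes an async getActiveDownloads() method.
async function fetchAllDownloads(clients) {
  const types = Object.keys(clients);

  // The async arrow turns synchronous throws into rejections as well,
  // so allSettled sees every failure.
  const settled = await Promise.allSettled(
    types.map(async (type) => clients[type].getActiveDownloads())
  );

  const result = {};
  settled.forEach((outcome, i) => {
    // A failed client contributes an empty list instead of failing the batch.
    result[types[i]] = outcome.status === 'fulfilled' ? outcome.value : [];
  });
  return result;
}
```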

Adding a new client: extend DownloadClient, implement getActiveDownloads() returning NormalizedDownload[], register in the registry factory.
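
As a rough sketch of those steps, a new client might look like the following. The base class here is a stand-in for server/clients/DownloadClient.js, and ExampleGetClient, fetchRawJobs, and the fields of the returned objects are all hypothetical; the real NormalizedDownload shape is defined in the project, not here.

```javascript
// Stand-in for the abstract base class (assumed shape).
class DownloadClient {
  constructor(config) {
    this.config = config;
  }

  async getActiveDownloads() {
    throw new Error('getActiveDownloads() must be implemented by subclasses');
  }
}

// Hypothetical client for an imaginary "ExampleGet" download service.
class ExampleGetClient extends DownloadClient {
  async getActiveDownloads() {
    const raw = await this.fetchRawJobs();
    // Map service-specific fields onto an assumed normalised shape.
    return raw.map((job) => ({
      id: String(job.jobId),
      name: job.title,
      progress: job.total > 0 ? job.done / job.total : 0,
      client: 'exampleget',
    }));
  }

  // Placeholder for the real HTTP call; returns canned data in this sketch.
  async fetchRawJobs() {
    return [{ jobId: 7, title: 'Some.Show.S01E01', done: 50, total: 200 }];
  }
}
```

The remaining step, registering the class in the registry factory, depends on the factory's actual structure and is not shown.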


4. Pluggable *Arr Library Data Retrieval Architecture (PALDRA)

server/utils/arrRetrievers.js provides arrRetrieverRegistry which:

  • Initialises one retriever per configured Sonarr/Radarr instance
  • Exposes getQueuesByType(), getHistoryByType(), getTagsByType() — returning results keyed by sonarr / radarr
  • Results carry { instance: instanceId, data: … } so callers can look up instance credentials

The poller and webhook processor both use the same registry, ensuring consistency.
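
To illustrate the result shape, the sketch below flattens results keyed by type into one record list per type and tags each record with its instance URL, similar to the _instanceUrl field mentioned for the cached queue records. The helper flattenByType and the { id, url } instance objects are assumptions for illustration.

```javascript
// Hypothetical helper: `resultsByType` looks like
//   { sonarr: [{ instance: 'id', data: [...] }], radarr: [...] }
// and `instances` is a list of { id, url } config entries.
function flattenByType(resultsByType, instances) {
  const urlById = new Map(instances.map((inst) => [inst.id, inst.url]));
  const flat = {};
  for (const [type, results] of Object.entries(resultsByType)) {
    flat[type] = results.flatMap(({ instance, data }) =>
      // Copy each record and attach the URL of the instance it came from.
      data.map((record) => ({ ...record, _instanceUrl: urlById.get(instance) }))
    );
  }
  return flat;
}
```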


5. Webhook Flow (Phase 15.1)

Sonarr/Radarr
    POST /api/webhook/sonarr  (X-Sofarr-Webhook-Secret: <secret>)
    {
      "eventType": "Grab",
      "instanceName": "Main Sonarr",
      "date": "2026-05-19T10:00:00.000Z",
      …
    }
        │
        ▼
    webhookLimiter (60 req/min/IP)
        │
        ▼
    validateWebhookSecret()  ──fail──► 401 Unauthorized
        │ ok
        ▼
    validatePayload()        ──fail──► 400 Bad Request
        │ ok
        ▼
    isReplay()               ──yes───► 200 { received: true, duplicate: true }
        │ no
        ▼
    cache.updateWebhookMetrics(instance.url)   ← activates smart polling skip
        │
        ▼
    processWebhookEvent('sonarr', 'Grab')  [fire-and-forget]
        ├── classify: Grab → QUEUE_EVENT
        ├── arrRetrieverRegistry.getQueuesByType()
        ├── cache.set('poll:sonarr-queue', …, CACHE_TTL)
        └── pollAllServices()  → pollSubscribers.forEach(cb) → SSE push
        │
        ▼
    200 { received: true }  (returned immediately, before fire-and-forget completes)
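
The secret and replay checks in this flow could be sketched as follows. The function names match the diagram, but the bodies are assumptions: the constant-time comparison is a common hardening choice rather than confirmed behaviour, and the in-memory Map is a stand-in for whatever cache the project uses. The 5-minute window and the (eventType, instanceName, date) key follow the security table in section 8.

```javascript
const { createHash, timingSafeEqual } = require('node:crypto');

// Compare the header value with the configured secret in constant time.
// Hashing both sides first gives timingSafeEqual equal-length buffers.
function validateWebhookSecret(headerValue, expectedSecret) {
  if (typeof headerValue !== 'string' || !expectedSecret) return false;
  const a = createHash('sha256').update(headerValue).digest();
  const b = createHash('sha256').update(expectedSecret).digest();
  return timingSafeEqual(a, b);
}

const REPLAY_WINDOW_MS = 5 * 60 * 1000;
const seen = new Map(); // replay key -> expiry timestamp (ms)

// Returns true when the same (eventType, instanceName, date) triple
// was already seen within the replay window.
function isReplay({ eventType, instanceName, date }, now = Date.now()) {
  // Drop expired entries so the map does not grow without bound.
  for (const [key, expiry] of seen) {
    if (expiry <= now) seen.delete(key);
  }
  const key = JSON.stringify([eventType, instanceName, date]);
  if (seen.has(key)) return true;
  seen.set(key, now + REPLAY_WINDOW_MS);
  return false;
}
```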

6. Smart Polling Optimization (Phase 5)

pollAllServices() called every POLL_INTERVAL ms:

  globalMetrics = cache.getGlobalWebhookMetrics()
  fallbackTriggered = (now - lastGlobalWebhookTimestamp) > WEBHOOK_FALLBACK_TIMEOUT
  
  for each service type (sonarr, radarr):
    shouldSkip = !fallbackTriggered
                 && all instances have metrics.eventsReceived > 0
                 && all instances have metrics.lastWebhookTimestamp within WEBHOOK_FALLBACK_TIMEOUT

    if shouldSkip:
      extend TTL of existing cached data        ← no API calls made
      increment metrics.pollsSkipped
      log "[Poller] Skipping sonarr polling for N instance(s) with active webhooks"
    else:
      fetch from *arr APIs → update cache

Result: when webhooks are active and recent, a poll cycle makes zero *arr API calls. If no webhook arrives for WEBHOOK_FALLBACK_TIMEOUT minutes (default: 10), normal polling resumes automatically.
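
The skip decision can be expressed as a small pure function. This is a sketch under assumptions: shouldSkipPolling and its parameters are hypothetical, timestamps are taken to be millisecond epochs, and returning false when no instances are configured is a guess.

```javascript
// Hypothetical sketch of the skip decision for one service type.
// `instanceMetrics` is an array of { eventsReceived, lastWebhookTimestamp }.
function shouldSkipPolling(
  instanceMetrics,
  lastGlobalWebhookTimestamp,
  fallbackTimeoutMs,
  now = Date.now()
) {
  const isRecent = (ts) =>
    typeof ts === 'number' && now - ts <= fallbackTimeoutMs;

  // Fallback: no webhook from any instance within the timeout window.
  const fallbackTriggered = !isRecent(lastGlobalWebhookTimestamp);
  if (fallbackTriggered || instanceMetrics.length === 0) return false;

  // Skip only if every instance has delivered a webhook, and recently.
  return instanceMetrics.every(
    (m) => m.eventsReceived > 0 && isRecent(m.lastWebhookTimestamp)
  );
}
```

Keeping the decision free of side effects makes it easy to unit-test separately from the cache and the *arr API calls.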


7. Cache Keys

| Key | Content | TTL |
|---|---|---|
| poll:sab-queue | SABnzbd queue slots + status | POLL_INTERVAL × 3 |
| poll:sab-history | SABnzbd history slots | POLL_INTERVAL × 3 |
| poll:sonarr-queue | Sonarr queue records (with _instanceUrl) | POLL_INTERVAL × 3 |
| poll:sonarr-history | Sonarr history records | POLL_INTERVAL × 3 |
| poll:sonarr-tags | Sonarr tag list per instance | POLL_INTERVAL × 3 |
| poll:radarr-queue | Radarr queue records (with _instanceUrl) | POLL_INTERVAL × 3 |
| poll:radarr-history | Radarr history records | POLL_INTERVAL × 3 |
| poll:radarr-tags | Radarr tag list | POLL_INTERVAL × 3 |
| poll:qbittorrent | qBittorrent torrent list | POLL_INTERVAL × 3 |
| history:sonarr | Sonarr history (on-demand, /api/history/recent) | 5 min |
| history:radarr | Radarr history (on-demand) | 5 min |
| emby:users | Emby user list | 60 s |

When polling is disabled (POLL_INTERVAL=0), all poll:* TTLs fall back to 30 s.
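
The TTL rule for poll:* keys reduces to one expression. The helper name pollCacheTtl is hypothetical, and POLL_INTERVAL is assumed to be in milliseconds, as the setInterval call in section 2 suggests.

```javascript
const FALLBACK_TTL_MS = 30 * 1000;

// Returns the TTL for poll:* cache entries in milliseconds.
// A poll interval of 0 means polling is disabled, so the 30 s fallback applies.
function pollCacheTtl(pollIntervalMs) {
  return pollIntervalMs > 0 ? pollIntervalMs * 3 : FALLBACK_TTL_MS;
}
```

For example, a 5000 ms poll interval gives a 15000 ms TTL, so an entry stays valid across up to two consecutive missed polls.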


8. Security Model

| Concern | Mechanism |
|---|---|
| User authentication | Emby credentials → httpOnly HMAC-signed cookie |
| Session validation | requireAuth middleware on all /api/dashboard, /api/history, proxy routes |
| CSRF | Double-submit cookie (X-CSRF-Token header) on all state-changing routes |
| Webhook auth | Shared secret on X-Sofarr-Webhook-Secret header (webhook routes are outside CSRF) |
| Webhook input | validatePayload() allowlists event types; rejects invalid shapes |
| Webhook replay | 5-minute nonce cache keyed on (eventType, instanceName, date) |
| Rate limiting | 300 req/15 min (general), 10 fails/15 min (login), 60 req/1 min (webhook) |
| Secret leakage | sanitizeError() redacts all secrets from error messages and logs |
| Headers | Helmet v7: CSP nonce, HSTS, X-Frame-Options DENY, noSniff, Referrer-Policy |
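
The double-submit CSRF check can be sketched as an Express-style middleware. This is an illustration, not the contents of server/middleware/verifyCsrf.js: the cookie name csrf_token, the 403 response body, and the constant-time comparison are assumptions. It relies on cookie-parser having populated req.cookies.

```javascript
const { timingSafeEqual } = require('node:crypto');

const SAFE_METHODS = new Set(['GET', 'HEAD', 'OPTIONS']);

// The token stored in the cookie must match the X-CSRF-Token request header.
function verifyCsrf(req, res, next) {
  // Requests that do not change state are not checked.
  if (SAFE_METHODS.has(req.method)) return next();

  const cookieToken = Buffer.from(String((req.cookies && req.cookies.csrf_token) || ''));
  const headerToken = Buffer.from(String(req.get('X-CSRF-Token') || ''));

  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  const ok =
    cookieToken.length > 0 &&
    cookieToken.length === headerToken.length &&
    timingSafeEqual(cookieToken, headerToken);

  if (!ok) return res.status(403).json({ error: 'Invalid CSRF token' });
  return next();
}
```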

9. Directory Structure (summary)

sofarr/
├── server/
│   ├── app.js                  Express factory (imported by tests + index.js)
│   ├── index.js                Entry point: logging, listen, start poller
│   ├── clients/                PDCA — one file per download client
│   ├── routes/
│   │   ├── auth.js             Login / logout / csrf / me
│   │   ├── dashboard.js        SSE stream, downloads, status, cover-art
│   │   ├── history.js          Recently completed downloads
│   │   ├── webhook.js          Webhook receiver (Phase 16)
│   │   ├── sonarr.js           Sonarr API proxy + webhook management
│   │   └── radarr.js           Radarr API proxy + webhook management
│   ├── middleware/
│   │   ├── requireAuth.js      Cookie auth enforcement
│   │   └── verifyCsrf.js       Double-submit CSRF check
│   └── utils/
│       ├── arrRetrievers.js    PALDRA — Sonarr/Radarr fetch registry
│       ├── cache.js            MemoryCache + webhook metrics helpers
│       ├── config.js           Multi-instance config parser
│       ├── downloadClients.js  PDCA registry + factory
│       ├── historyFetcher.js   History fetch + event classification
│       ├── poller.js           Smart background polling engine
│       ├── sanitizeError.js    Secret redaction from errors
│       └── tokenStore.js       Emby token store (JSON file, atomic writes)
├── public/                     Static SPA (HTML + CSS + vanilla JS)
├── tests/
│   ├── setup.js                Isolated DATA_DIR, SKIP_RATE_LIMIT
│   ├── unit/                   Pure unit tests
│   └── integration/            Supertest + nock integration tests
├── docs/ARCHITECTURE.md        Full deep-dive architecture documentation
├── ARCHITECTURE.md             This file — concise reference
├── SECURITY.md                 Threat model + hardening guide
├── CHANGELOG.md                Version history
└── .env.sample                 Annotated configuration template

For complete API reference, data-flow diagrams, download matching pipeline, qBittorrent Sync API details, and deployment guidance see docs/ARCHITECTURE.md.