Open
73 commits
96dac02
refactor: add auth service and refactor token endpoints (#115)
TimilsinaBimal Mar 1, 2026
d348e97
refactor: refactor translation module
TimilsinaBimal Mar 20, 2026
5b93b06
refactor catalog updater and manifest services for improved readabili…
TimilsinaBimal Mar 20, 2026
5c2c8ed
refactor multiple services to accomodate auth service
TimilsinaBimal Mar 20, 2026
7034be9
refactor: profile service merge with integration and changes
TimilsinaBimal Mar 20, 2026
74d9d6d
refactor: refactor profile sectino
TimilsinaBimal Mar 28, 2026
4df4d70
feat: implement user context service for managing user data and settings
TimilsinaBimal Mar 28, 2026
3544336
feat: introduce new profile and scoring models for improved data han…
TimilsinaBimal Mar 28, 2026
4f7cd43
refactor: replace SmartSampler class with standalone sample_items fun…
TimilsinaBimal Mar 28, 2026
c3fdd04
feat: add new row building functions for dynamic content generation i…
TimilsinaBimal Mar 28, 2026
05db872
refactor: remove unused daily rotation genre whitelist and interest s…
TimilsinaBimal Mar 28, 2026
62ba484
Merge branch 'main' of github.com:TimilsinaBimal/Watchly into dev
TimilsinaBimal Mar 28, 2026
de4f7cc
fix: add tests and make defaults pass from BE to fe
TimilsinaBimal Mar 29, 2026
e207530
refactor: break down front end components into separate files
TimilsinaBimal Mar 31, 2026
8346e66
feat: Refactor recommendation services and introduce diversity handling
TimilsinaBimal Mar 31, 2026
1f31616
feat: add language-aware images and improve catalog translation
TimilsinaBimal Mar 31, 2026
8a39f2e
feat: add trakt and simkl watch history integration
TimilsinaBimal Apr 2, 2026
0167295
remove .get from stremio item
TimilsinaBimal Apr 4, 2026
f3182ef
fix: security issues
TimilsinaBimal Apr 4, 2026
58cc782
Merge branch 'main' of github.com:TimilsinaBimal/Watchly into dev
TimilsinaBimal Apr 4, 2026
d0b9732
Merge branch 'main' of github.com:TimilsinaBimal/Watchly into dev
TimilsinaBimal Apr 4, 2026
7b16dc1
chore: bump versin
TimilsinaBimal Apr 4, 2026
41dae28
fix: remove ip blocking
TimilsinaBimal Apr 5, 2026
5e47203
fix: use watch history based on user profile
TimilsinaBimal Apr 21, 2026
6023f26
chore: bump version to 1.10.0-rc.3
TimilsinaBimal Apr 21, 2026
59d85aa
Merge branch 'main' of github.com:TimilsinaBimal/Watchly into dev
TimilsinaBimal Apr 30, 2026
7a6d86f
chore: remove obsolete migration scripts and add bug tracker
TimilsinaBimal Apr 30, 2026
c08eea1
fix(security): fail-fast on default TOKEN_SALT in production
TimilsinaBimal Apr 30, 2026
3b266b7
fix(security): add CSRF state validation to OAuth callbacks
TimilsinaBimal Apr 30, 2026
754777a
fix(recs): apply era range filter for both single-year and range axes
TimilsinaBimal Apr 30, 2026
2f25ca5
fix(recs): correct year_to_era decade buckets
TimilsinaBimal Apr 30, 2026
1c08fa7
fix(recs): use real target limit for diversity caps
TimilsinaBimal Apr 30, 2026
67cb4cc
fix(recs): use with_crew for both movie and TV director discover
TimilsinaBimal Apr 30, 2026
b555d1b
fix(async): swallow per-task failures with return_exceptions=True
TimilsinaBimal Apr 30, 2026
0ae8474
fix(security): mark catalog responses Cache-Control: private
TimilsinaBimal Apr 30, 2026
bff6790
fix(library): require IMDb prefix consistently at ingestion
TimilsinaBimal Apr 30, 2026
f49b635
fix(auth): reduce token cache TTL from 12h to 5min
TimilsinaBimal Apr 30, 2026
9055cdb
fix(http): guard response.json() against empty/non-JSON bodies
TimilsinaBimal Apr 30, 2026
8ccd0d2
fix(security): pin OAuth postMessage target to configured app origin
TimilsinaBimal Apr 30, 2026
322329e
fix(recs): don't drop all items when popularity preset has no mapping
TimilsinaBimal Apr 30, 2026
7bb055e
fix(async): retain background catalog update tasks and log failures
TimilsinaBimal Apr 30, 2026
6cb5f76
refactor(recs): clarify item-based candidate fetch dead local
TimilsinaBimal Apr 30, 2026
2c19a7c
fix(translation): don't cache translation fallbacks on API failure
TimilsinaBimal Apr 30, 2026
e0fd546
fix(simkl): narrow exception handling in get_trending/get_item_details
TimilsinaBimal Apr 30, 2026
64c65ea
fix(tmdb): differentiate 404 from 5xx in find_by_imdb_id logging
TimilsinaBimal Apr 30, 2026
8a732a0
fix(http): reuse AsyncClient instances across calls
TimilsinaBimal Apr 30, 2026
4dc46ae
fix(simkl): use real watch count from Simkl response
TimilsinaBimal Apr 30, 2026
619e238
fix(api): don't leak raw exception details in catalog 500 response
TimilsinaBimal Apr 30, 2026
725ae1a
fix(api): make poster-rating and simkl validation endpoints return 200
TimilsinaBimal Apr 30, 2026
1ae0cd5
fix(manifest): deepcopy catalogs to avoid cross-user mutation
TimilsinaBimal Apr 30, 2026
4f0ef1c
fix(security): HTML-escape OAuth provider/username/error in callback …
TimilsinaBimal Apr 30, 2026
24a51bd
fix(api): tokens DELETE returns JSON object instead of bare string
TimilsinaBimal Apr 30, 2026
770f9b1
fix(profile): guard runtime parse against malformed cinemeta values
TimilsinaBimal Apr 30, 2026
3431b4f
fix(rows): respect exclude set in _pick instead of falling back
TimilsinaBimal Apr 30, 2026
c0f1ac2
chore(cleanup): assorted Low-severity fixes (L1, L4, L5, L7)
TimilsinaBimal Apr 30, 2026
2f374d0
chore(cleanup): tighten token regex (L2), startup log (L8), close L3/L6
TimilsinaBimal Apr 30, 2026
34c61f5
refactor(http): migrate Trakt and Simkl services to BaseClient
TimilsinaBimal Apr 30, 2026
c28e473
fix(profile): tag cached profiles with watch-history source and rebui…
TimilsinaBimal May 1, 2026
cc6becf
chore(profile): drop dead integration.py module
TimilsinaBimal May 1, 2026
3cc7f0f
fix(http): add jitter to BaseClient retry backoff
TimilsinaBimal May 1, 2026
615f4b7
feat(cache): bound per-user Redis caches to 90 days, refresh on read
TimilsinaBimal May 1, 2026
12a4ea0
refactor(oauth): route Simkl callback through BaseClient
TimilsinaBimal May 1, 2026
19aea31
fix(history): clear revoked Trakt/Simkl tokens on 401/403
TimilsinaBimal May 1, 2026
5179822
feat(trakt): refresh expired access tokens automatically
TimilsinaBimal May 1, 2026
98c2529
Merge branch 'main' of github.com:TimilsinaBimal/Watchly into dev
TimilsinaBimal May 6, 2026
6c94303
fix(library): build LibraryCollection from Trakt/Simkl when configure…
TimilsinaBimal May 6, 2026
a919ae4
refactor(catalogs): merge watchly.loved + watchly.watched into watchl…
TimilsinaBimal May 6, 2026
5fc37ba
add local files
TimilsinaBimal May 6, 2026
2f4c48c
refactor(catalogs): make watchly.item display name visible and read-only
TimilsinaBimal May 6, 2026
ef0a020
docs: add CLAUDE.md with architecture, commands, and conventions
TimilsinaBimal May 6, 2026
3b93f65
fix(creators): filter to recurring directors/cast instead of top-by-s…
TimilsinaBimal May 6, 2026
97bbb63
refactor(accounts): redesign provider connections as stacked cards
TimilsinaBimal May 7, 2026
d5971d6
refactor(welcome): swap Smart Recommendations card for Trakt & Simkl
TimilsinaBimal May 7, 2026
2 changes: 2 additions & 0 deletions .gitignore
@@ -54,3 +54,5 @@ migration.py
 .github
 .pytest_cache
 .ruff_cache
+
+*.local.*
105 changes: 105 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,105 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project

Watchly is a Stremio catalog addon that generates personalized movie/series recommendations from a user's watch history. It is a FastAPI service that speaks the Stremio addon protocol (manifest + catalog endpoints). Recommendations come from a taste profile built off the user's history, then candidates are pulled from TMDB / Simkl, scored, capped for diversity, enriched, and returned as a Stremio catalog.

A user installs Watchly through its `/configure` web page: they paste a Stremio email/password (or auth_key), optionally connect Trakt and/or Simkl via OAuth, optionally provide their own TMDB / Gemini / Simkl / RPDB API keys, pick which catalogs they want, and get an addon manifest URL to paste into Stremio. From then on, every catalog row in their Stremio home — "Top Picks for You", "Because you loved …", "Genre & Keyword Catalogs", etc. — is served by this app. State per user is keyed on a short opaque token embedded in the manifest URL; credentials are encrypted at rest in Redis. The app must work for users who store their library in Stremio, in Trakt, or in Simkl, and for users with mixed signals (rated, watched, loved, rewatched). That source flexibility is the central architectural constraint.

## Commands

Dependencies are managed with [uv](https://github.com/astral-sh/uv); a `requirements.txt` is also kept in sync for non-uv environments. Python 3.12+.

```bash
# Install
uv sync

# Run dev server (auto-reload when APP_ENV=development)
uv run main.py --dev
# or directly
uvicorn app.core.app:app --reload

# Tests (pytest is not in requirements-dev.txt — install once into the venv)
pip install pytest pytest-asyncio
pytest tests/ # all tests
pytest tests/test_catalog_endpoint.py -v # single file
pytest tests/test_catalog_endpoint.py::test_name # single test

# Lint / format (also runs on commit via pre-commit)
pre-commit run --all-files
black . # line length 120, py312
isort . # black profile
flake8 . # max-line-length 120, config in setup.cfg

# Docker
docker-compose up -d # uses env_file .env
```

The configure UI is served at `/configure`. Required env vars: `TMDB_API_KEY`, `TOKEN_SALT`, `HOST_NAME`. Redis is required (`REDIS_URL`).

## Architecture

### Request flow

Every catalog request resolves through one path:

1. **`app/services/context.py:load_user_context`** is the entry point for every authenticated endpoint. It reads the encrypted token from Redis, decrypts credentials, parses `UserSettings`, resolves a Stremio `auth_key`, and builds the `LibraryCollection`. The library is sourced from `user_settings.watch_history_source` — `"stremio"`, `"trakt"`, or `"simkl"`. For external sources the WatchHistory is converted to a `LibraryCollection` (rating ≥ 9 → loved, 7–8.9 → liked, no-rating + rewatch → loved fallback, else watched) so downstream catalog code is source-agnostic. The `LibraryCollection.source` field drives cache invalidation when a user switches sources.
2. **`app/services/recommendation/catalog_service.py`** routes the catalog ID to one of the recommendation engines:
   - `watchly.rec` → `TopPicksService` (combines profile-driven Discover + library-seeded TMDB/Simkl recs)
   - `watchly.theme.*` → `ThemeBasedService` (genre/keyword/era driven)
   - `watchly.item.*` → `ItemBasedService` (seeded by a single library item — see "watchly.item" below)
   - `watchly.creators` → `CreatorsService` (directors/cast)
   - `watchly.all.loved`, `watchly.liked.all` → `AllBasedService`
3. The engine returns a list of items that are passed through metadata enrichment (`app/services/recommendation/metadata.py`), poster ratings overlay (`app/services/poster_ratings/`), translation, and serialization.
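
The rating-to-bucket rule from step 1 can be sketched as below. This is a minimal illustration under stated assumptions, not the repository's code: the function name `classify_history_item` and its parameters are hypothetical, and treating any rating from 7 up to (but not including) 9 as "liked" and any rated item below 7 as "watched" is one reading of the rule.

```python
from typing import Optional


def classify_history_item(rating: Optional[float], rewatched: bool) -> str:
    """Map an external watch-history entry to a library bucket (hypothetical helper)."""
    if rating is not None:
        if rating >= 9:
            return "loved"
        if rating >= 7:
            return "liked"
        # Assumption: a low explicit rating outranks the rewatch signal.
        return "watched"
    # No rating: a rewatch is the only positive signal left.
    return "loved" if rewatched else "watched"
```

Because every source ends up in the same three buckets, the catalog engines in step 2 do not need to know whether an item came from Stremio, Trakt, or Simkl.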

### Taste profile pipeline (`app/services/profile/`)

The `TasteProfile` is a numerical fingerprint of the user — top genres, keywords, directors, cast, eras, countries, runtime preference. It is built from the same source as the library: `ProfileService.build_and_cache_profile` checks the configured `watch_history_source` and feeds `WatchHistoryItem`s through the same vectorizer pipeline regardless of origin. Profiles are cached in Redis per-token-per-content-type and invalidated when the source field doesn't match. `_build_from_external_source` reuses the already-built `LibraryCollection` when its `source` matches the configured source, avoiding a duplicate Trakt/Simkl fetch.

### External API clients

All HTTP calls go through **`app/core/base_client.py:BaseClient`**, which provides retries (with jitter on 429/5xx), timeouts, structured error logging, and safe JSON parsing. `TraktService`, `SimklService`, and `TMDBService` are singletons that wrap `BaseClient`. The token-refresh + 401-revoke flow for Trakt/Simkl lives in `ProfileService.fetch_external_watch_history` and is shared between context loading and profile building.
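
A retry loop with jittered backoff on 429/5xx might look roughly like the sketch below. It is not `BaseClient`'s actual implementation: `backoff_delay`, `request_with_retries`, the `send` callable, and the default delays are all assumptions made for illustration, and the "full jitter" strategy is one common choice among several.

```python
import asyncio
import random
from typing import Awaitable, Callable


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Full jitter: a random delay between 0 and the capped exponential step."""
    return random.uniform(0.0, min(cap, base * (2**attempt)))


async def request_with_retries(
    send: Callable[[], Awaitable[int]],
    max_attempts: int = 3,
    base: float = 0.5,
    cap: float = 8.0,
) -> int:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status = await send()
        retryable = status == 429 or status >= 500
        if not retryable or attempt == max_attempts - 1:
            return status
        # Randomized sleep keeps many clients from retrying in lockstep.
        await asyncio.sleep(backoff_delay(attempt, base, cap))
    raise ValueError("max_attempts must be at least 1")
```

The jitter matters most when a shared upstream such as TMDB rate-limits many requests at once: without it, every caller would retry at the same instants.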

### Caching (`app/services/user_cache.py`, `app/services/redis_service.py`)

Redis is the source of truth for user state. Per-token cached: encrypted credentials (`token_store`), library collection, taste profile (per content type), watched-id sets, library hash for incremental rebuilds. Many caches are TTL-bound (90d default for user data) and refresh on read so active users stay warm. **Invalidate library + profile on source switch**, not just on settings change — `load_user_context` and `build_and_cache_profile` both check the cached `source` field.
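
"Refresh on read" means each successful read pushes the expiry out again. The in-memory sketch below shows only that idea; it is not how `user_cache.py` is written, and `SlidingTTLCache` and its injectable `clock` are hypothetical names. With Redis the same effect would come from resetting the key's expiry whenever it is read.

```python
import time
from typing import Any, Callable, Optional

USER_DATA_TTL_SECONDS = 90 * 24 * 60 * 60  # 90 days, matching the default above


class SlidingTTLCache:
    """In-memory stand-in for a TTL cache whose entries stay warm while read."""

    def __init__(self, ttl: float = USER_DATA_TTL_SECONDS, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: behave as a miss
            return None
        self.set(key, value)  # refresh on read: restart the TTL window
        return value
```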

### Catalog config IDs

User catalog config IDs (in `UserSettings.catalogs`) and the IDs Stremio actually requests are different. Configs use the bare ID (`watchly.theme`, `watchly.item`); served catalogs append the seed (`watchly.theme.action`, `watchly.item.tt0468569`). `get_config_id` in `app/services/catalog_definitions.py` strips the suffix to look up settings.

**Legacy IDs**: the previously separate `watchly.loved` and `watchly.watched` were merged into a single `watchly.item` catalog. Routing in `catalog_service.py` and `get_config_id` still accept `watchly.loved.*` / `watchly.watched.*` prefixes because installed Stremio clients keep requesting them until the manifest refreshes; `_resolve_catalog_configs` synthesizes a `watchly.item` config from any legacy entries left in saved settings.
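
The suffix stripping and legacy handling described above can be sketched as follows. This is an illustration, not the real `get_config_id`: the name `config_id_for` and the two prefix tuples are hypothetical, and mapping legacy prefixes directly to `watchly.item` is an assumption about how the lookup resolves.

```python
SEEDED_PREFIXES = ("watchly.theme", "watchly.item")
LEGACY_ITEM_PREFIXES = ("watchly.loved", "watchly.watched")


def _matches(catalog_id: str, prefix: str) -> bool:
    # Match the bare ID or the ID followed by a ".seed" suffix, never a longer word.
    return catalog_id == prefix or catalog_id.startswith(prefix + ".")


def config_id_for(catalog_id: str) -> str:
    """Return the settings key for a catalog ID requested by Stremio (hypothetical)."""
    if any(_matches(catalog_id, prefix) for prefix in LEGACY_ITEM_PREFIXES):
        return "watchly.item"
    for prefix in SEEDED_PREFIXES:
        if _matches(catalog_id, prefix):
            return prefix
    # Unseeded catalogs such as watchly.rec use the same ID in both places.
    return catalog_id
```

For example, `config_id_for("watchly.theme.action")` yields `watchly.theme`, while `watchly.all.loved` is returned unchanged because it does not begin with the `watchly.loved.` prefix.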

### Settings + catalog defaults

`app/core/settings.py:get_default_settings()` is the single source of truth for the default catalog list and shape. Frontend pulls these via `get_default_catalogs_for_frontend()` so the configure page and backend can't drift. When adding a new catalog: add the `CatalogConfig` to defaults, add a description to `CATALOG_DESCRIPTIONS`, register routing in `app/services/recommendation/catalog_service.py`, and emit it from `DynamicCatalogService.get_dynamic_catalogs` in `app/services/catalog_definitions.py`.

### Background work

`app/services/catalog_updater.py` runs on a schedule (`AUTO_UPDATE_CATALOGS=true` + `CATALOG_REFRESH_INTERVAL`) to refresh dynamic catalogs ahead of user requests. Background tasks created via `asyncio.create_task` must be retained (see `app/services/catalog_updater.py:125`) — bare creates are GC-eligible and silently swallow errors.

## Coding standards

The codebase aims for code that reads like prose: small functions, intention-revealing names, and as little ceremony as possible. Match that. New code that is denser, more abstract, or more defensive than the surrounding files is a regression.

- **Follow standard Python idioms.** PEP 8 spacing/naming, type hints on every public function and dataclass field, `pydantic` models for anything that crosses an API boundary, `loguru` for logging (don't import `logging`), `httpx` for HTTP (always through `BaseClient`), `async` end-to-end for I/O. No threads, no synchronous blocking calls inside async handlers.
- **Comments and docstrings: write them only when the WHY is non-obvious.** A function name and its signature should explain WHAT it does. Add a comment or docstring when there's a hidden constraint, a workaround, a subtle invariant, or behavior that would surprise the next reader (e.g. "Trakt list endpoints decode to a `list` despite the dict type hint" or "we drop the cached library on source switch because otherwise stale results are served"). Do not narrate happy-path code, do not write what-it-does docstrings, do not add `# added for X` rot.
- **Refactor when a function grows past ~40 lines or two responsibilities.** Examples already in the repo: `_build_from_external_source` was split out of `build_and_cache_profile` once dispatch logic appeared; `fetch_external_watch_history` was extracted once two call sites needed the same Trakt/Simkl flow. Don't pre-extract a helper that only has one caller.
- **No bloat.** Don't add error handling for cases that can't happen, don't validate input that's already typed, don't add backwards-compat shims unless an actual installed client depends on the old shape (Stremio manifest IDs are the main case — see legacy catalog ID handling). Three similar lines beat a premature abstraction. Delete dead code rather than leaving it with `# unused`.
- **Centralize, don't repeat.** TMDB / Trakt / Simkl calls go through their service classes, never raw `httpx`. Catalog defaults live in `get_default_settings`, not duplicated in templates. ID-prefix knowledge belongs in `get_config_id` and `_get_recommendations` routing, not scattered across modules.
- **Caches are part of the contract.** When you change the shape of something cached (LibraryCollection, TasteProfile, watched sets), think about cache invalidation. Adding a field is safe (Pydantic ignores unknowns or defaults them); changing semantics needs a versioned key or an explicit invalidate.
- **Line length 120** everywhere (black, isort, flake8 all aligned in `setup.cfg` and `pyproject.toml`). Pre-commit hooks enforce this on every commit; if black reformats a file, the commit fails and must be retried.

## Commit conventions

- **Never add a `Co-Authored-By` trailer.** Commits are authored by the human, not by the assistant. No `🤖 Generated with` lines either.
- **Stage only the files relevant to the commit** — `git add <paths>`, never `git add -A`/`git add .`. Unrelated working-tree changes (e.g. local `.gitignore` tweaks, scratch files) stay unstaged.
- **One fix per commit.** If a session produces two logically separate fixes, ship two commits so either can be reverted independently. Prefix with the area in the existing repo style: `fix(library): …`, `refactor(catalogs): …`, `feat(trakt): …`, `chore(profile): …`.

## Domain conventions

- **One source, one library**: never mix Stremio library items with Trakt/Simkl items in the same `LibraryCollection`. The whole collection is tagged with a single `source`.
- **Item exclusion uses both ID kinds**: `watched_imdb` (set of `tt…`) and `watched_tmdb` (set of TMDB ints). External sources only populate `watched_imdb` reliably; don't assume `watched_tmdb` is populated for Trakt/Simkl users.
- **`BaseClient.get/post` is type-hinted as returning `dict`**, but JSON list responses (Trakt) decode to `list`. Defensive `_safe_list` guards in the service layers handle this; preserve the pattern rather than tightening the type.
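
A guard of the kind the last bullet describes could look like the sketch below. The repository's `_safe_list` may differ; this version, named `safe_list` here to mark it as hypothetical, assumes the goal is to accept a decoded JSON list of objects and fall back to an empty list for anything else.

```python
from typing import Any


def safe_list(payload: Any) -> list[dict]:
    """Return the dict entries of a decoded JSON list, or [] for any other shape."""
    if isinstance(payload, list):
        return [item for item in payload if isinstance(item, dict)]
    # An error object, None, or a bare dict is treated as "no items".
    return []
```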
14 changes: 10 additions & 4 deletions app/api/endpoints/catalogs.py
@@ -1,3 +1,5 @@
+import re
+
 from fastapi import APIRouter, HTTPException, Response
 from loguru import logger

@@ -6,14 +8,18 @@

 router = APIRouter()

+# Stremio auth tokens are short (~24 char) hex/alphanumeric strings. Accept up
+# to 32 chars of [A-Za-z0-9] as a sanity check; anything else is malformed.
+_TOKEN_PATTERN = re.compile(r"^[A-Za-z0-9]{1,32}$")
+

 @router.get("/{token}/catalog/{type}/{id}.json")
 @router.get("/{token}/catalog/{type}/{id}/{extra}.json")
 async def get_catalog(response: Response, type: str, id: str, token: str, extra: str | None = None) -> dict:
     if type not in ("movie", "series"):
         raise HTTPException(status_code=400, detail="Invalid content type. Must be 'movie' or 'series'.")

-    if len(token) > 30: # normal stremio tokens are 24 length. But we are using this just to be safe.
+    if not _TOKEN_PATTERN.match(token):
         raise HTTPException(status_code=400, detail="Invalid token.")

     try:
@@ -24,8 +30,8 @@ async def get_catalog(response: Response, type: str, id: str, token: str, extra:
         for key, value in headers.items():
             response.headers[key] = value

-        # if recommendations are none or empty, then set cache header to no-cache
-        if recommendations and not recommendations.get("meta"):
+        # If recommendations are empty, avoid caching the empty payload aggressively.
+        if recommendations is not None and not recommendations.get("metas"):
             response.headers["Cache-Control"] = "no-cache"

         return recommendations
@@ -34,4 +40,4 @@ async def get_catalog(response: Response, type: str, id: str, token: str, extra:
         raise
     except Exception as e:
         logger.exception(f"[{redact_token(token)}] Error fetching catalog for {type}/{id}: {e}")
-        raise HTTPException(status_code=500, detail=f"Something went wrong. Please try again. Error: {e}")
+        raise HTTPException(status_code=500, detail="Something went wrong. Please try again.")
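
The cache-header change in the third hunk is easy to misread, so here is a small sketch comparing the two conditions. Stremio catalog responses carry their items under `metas`, which means the old lookup of `meta` always came back `None`. The function names `old_sets_no_cache` and `new_sets_no_cache` are hypothetical and only mirror the two expressions from the diff.

```python
from typing import Optional


def old_sets_no_cache(recommendations: Optional[dict]) -> bool:
    # Removed condition: looks up "meta", a key the payload never contains.
    return bool(recommendations and not recommendations.get("meta"))


def new_sets_no_cache(recommendations: Optional[dict]) -> bool:
    # Added condition: checks the real "metas" key and also covers an empty dict.
    return recommendations is not None and not recommendations.get("metas")


populated = {"metas": [{"id": "tt0468569"}]}
empty = {"metas": []}

# The old check marked even populated catalogs as no-cache.
print(old_sets_no_cache(populated), new_sets_no_cache(populated))
# Both mark an empty catalog as no-cache.
print(old_sets_no_cache(empty), new_sets_no_cache(empty))
```

Under the old condition every non-empty response dict was sent with `Cache-Control: no-cache`, including full catalogs; the new one limits that header to responses with no items.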
5 changes: 3 additions & 2 deletions app/api/endpoints/health.py
@@ -1,8 +1,9 @@
 from fastapi import APIRouter
+from fastapi.responses import JSONResponse

 router = APIRouter(tags=["health"])


 @router.get("/health", summary="Simple readiness probe")
-async def health_check() -> dict[str, str]:
-    return {"status": "ok"}
+async def health_check() -> JSONResponse:
+    return JSONResponse(status_code=200, content={"status": "healthy"})
33 changes: 33 additions & 0 deletions app/api/endpoints/languages.py
@@ -0,0 +1,33 @@
from fastapi import APIRouter, HTTPException, Query
from loguru import logger

from app.services.language_service import fetch_languages_list
from app.services.tmdb.service import get_tmdb_service

router = APIRouter()


@router.get("/api/languages")
async def get_languages():
    try:
        languages = await fetch_languages_list()
        return languages
    except Exception as e:
        logger.error(f"Failed to fetch languages: {e}")
        raise HTTPException(status_code=502, detail=f"Failed to fetch languages from TMDB: {e}")


@router.get("/api/meta/images")
async def get_meta_images(
    media_type: str = Query(..., description="movie or tv"),
    tmdb_id: int = Query(..., description="TMDB ID"),
    language: str = Query("en-US", description="Language preference (e.g. en-US, fr-FR)"),
):
    """Fetch language-aware poster, logo, and background images for a title."""
    try:
        tmdb_service = get_tmdb_service(language=language)
        images = await tmdb_service.get_images_for_title(media_type, tmdb_id, language=language)
        return images
    except Exception as e:
        logger.error(f"Failed to fetch images for {media_type}/{tmdb_id}: {e}")
        raise HTTPException(status_code=502, detail=f"Failed to fetch images from TMDB: {e}")
2 changes: 0 additions & 2 deletions app/api/endpoints/manifest.py
@@ -7,7 +7,6 @@

 @router.get("/manifest.json")
 async def manifest():
-    """Get base manifest for unauthenticated users."""
     manifest = manifest_service.get_base_manifest()
     # since user is not logged in, return empty catalogs
     manifest["catalogs"] = []
@@ -16,5 +15,4 @@ async def manifest():

 @router.get("/{token}/manifest.json")
 async def manifest_token(token: str):
-    """Get manifest for authenticated user."""
     return await manifest_service.get_manifest_for_token(token)
93 changes: 0 additions & 93 deletions app/api/endpoints/meta.py

This file was deleted.
