# Project-wide RLS lockdown: apply & verification plan (#231)

**Status: APPLIED to production 2026-06-13.** Anon is confirmed locked out:
direct REST calls to `users`/`oauth_tokens`/`user_roles`/etc. now return
`permission denied` (SQLSTATE 42501). This doc is the record of what was applied
and how it was verified. The SQL scripts (`rls_lockdown.sql` to apply,
`rls_lockdown_rollback.sql` for emergency revert) live in **PR #232** as the
applied record. That PR is intentionally NOT merged to `main`: the change went
straight to prod, and nothing re-runs the scripts from the repo.

## Why this is safe for the backend
The backend authenticates to Supabase with `SUPABASE_SERVICE_KEY` and so runs as
`service_role`, which has **`rolbypassrls = true`** (verified live: `SELECT
rolname, rolbypassrls FROM pg_roles` returned `service_role=t`, `anon=f`,
`authenticated=f`). RLS is not enforced for roles with `BYPASSRLS`, so **every
backend query keeps working unchanged**. RLS only constrains
`anon`/`authenticated`, which is exactly the public-anon-key path we're closing.

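The role check above could be automated along these lines. This is a minimal sketch, not part of the applied scripts: `bypass_mismatches` and `EXPECTED_BYPASS` are hypothetical names, and the function assumes the caller has already fetched `(rolname, rolbypassrls)` rows from `pg_roles` with whatever database client is in use.

```python
# Expected BYPASSRLS flags, taken from the live check quoted in this doc.
EXPECTED_BYPASS = {"service_role": True, "anon": False, "authenticated": False}


def bypass_mismatches(rows):
    """Return {role: (expected, actual)} for roles whose flag is wrong or missing.

    `rows` is an iterable of (rolname, rolbypassrls) tuples, i.e. the result of
    SELECT rolname, rolbypassrls FROM pg_roles. Roles not listed in
    EXPECTED_BYPASS are ignored; a listed role that is absent reports None.
    """
    actual = {name: bool(flag) for name, flag in rows}
    return {
        role: (expected, actual.get(role))
        for role, expected in EXPECTED_BYPASS.items()
        if actual.get(role) != expected
    }
```

An empty result means the three roles match the flags recorded above; anything else is worth resolving before relying on the "backend is unaffected" argument.
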
## Expected breakage (accepted)
Anon realtime on `room_messages` stops delivering once RLS is on and anon DML is
revoked. It stays broken until the **option (a)** JWT bridge lands
(`docs/security/realtime-jwt-bridge-design.md`). Per decision, closing the
full-DB exposure outranks live chat updates. The #230 display fix already
re-fetches via the (service-role) REST endpoint, so chat still works on
load/refresh; only the live push is paused.

## Test-first on a branch (if available)
Supabase branching wasn't reachable via the MCP for this project (`list_branches`
errored), so the project may be on a plan or permission level that doesn't
expose it. If you have branching:
1. Create a dev branch in the dashboard.
2. Run `rls_lockdown.sql` against the branch.
3. Run the verification below pointed at the branch.
4. Merge the branch (or apply the same SQL to prod) once green.

If branching is unavailable: apply to prod during a low-traffic window with
`rls_lockdown_rollback.sql` open and ready. The change is transactional
(`BEGIN`/`COMMIT`) and fast (DDL only, no table rewrites).

## Pre-apply snapshot (record for diffing)
```sql
SELECT count(*) FILTER (WHERE relrowsecurity)     AS rls_on,
       count(*) FILTER (WHERE NOT relrowsecurity) AS rls_off
FROM pg_class WHERE relnamespace='public'::regnamespace AND relkind='r';
-- expected before: rls_on=2, rls_off=38
```

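Since the snapshot is recorded for diffing, the comparison with the post-apply counts can be sketched as below. `RlsSnapshot` and `check_lockdown` are hypothetical helpers, not something in the repo; the only inputs are the two numbers the query above returns, taken once before and once after the apply.

```python
from typing import NamedTuple


class RlsSnapshot(NamedTuple):
    """The two counts returned by the snapshot query."""
    rls_on: int
    rls_off: int


def check_lockdown(before: RlsSnapshot, after: RlsSnapshot) -> list[str]:
    """Return a list of problems; an empty list means the diff looks right."""
    problems = []
    if after.rls_off != 0:
        problems.append(f"{after.rls_off} public table(s) still have RLS off")
    # The lockdown only flips flags, so the total number of tables should match.
    if sum(after) != sum(before):
        problems.append(f"table count changed: {sum(before)} -> {sum(after)}")
    return problems


# With the counts recorded in this doc: 2 on / 38 off before, 40 on / 0 off after.
assert check_lockdown(RlsSnapshot(2, 38), RlsSnapshot(40, 0)) == []
```

The table-count comparison is an extra sanity check added for illustration: a changed total would suggest a table was created or dropped between the two snapshots, which makes the diff harder to trust.
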
## Apply
Run `backend/db/security/rls_lockdown.sql`.

## Post-apply verification checklist
1. **RLS now on for all public tables:**
   ```sql
   SELECT count(*) FILTER (WHERE NOT relrowsecurity) AS still_off
   FROM pg_class WHERE relnamespace='public'::regnamespace AND relkind='r';
   -- expect: still_off = 0
   ```
2. **anon has no table DML left:**
   ```sql
   SELECT count(*) AS anon_grants
   FROM information_schema.role_table_grants
   WHERE table_schema='public' AND grantee='anon'
     AND privilege_type IN ('SELECT','INSERT','UPDATE','DELETE');
   -- expect: anon_grants = 0
   ```
3. **anon is blocked at the REST endpoint** (the actual exposure): with the
   public anon key,
   ```
   curl -s -o /dev/null -w "%{http_code}\n" \
     "https://jxqcmjqtjlpuxfrxmrdv.supabase.co/rest/v1/users?select=id&limit=1" \
     -H "apikey: <ANON_KEY>" -H "Authorization: Bearer <ANON_KEY>"
   ```
   Expect **401**. The command prints only the status code; drop
   `-o /dev/null` to see the `permission denied` body (SQLSTATE 42501). A `200`
   with `[]` would mean the request was filtered rather than rejected, which
   should not happen once check 2 passes, and a `200` with a row means the
   table is still exposed. Repeat for `user_roles`, `oauth_tokens`, `messages`.
4. **Backend still works (service_role):**
   - `cd backend && python -m pytest tests/ -q` (suite is hermetic; sanity only).
   - Hit live read + write endpoints against the target DB and confirm normal
     behavior, e.g. `GET /api/auth/me` (read), a calendar/gradebook create
     (write), a notes save. All should succeed exactly as before (service_role
     bypasses RLS).
5. **Realtime is paused (expected):** open a room. Messages still load and
   refresh via REST; live push is down until option (a). No errors are expected
   beyond the subscription delivering nothing.

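The REST probe in step 3 can be read mechanically with a small classifier like the one below. This is a sketch under assumptions: `classify_probe` is a hypothetical helper, it expects the caller to have captured both the HTTP status and the response body (the `curl` command above discards the body), and it assumes the error body is JSON carrying the SQLSTATE in a `code` field. The "filtered" reading of an empty `200` response is an interpretation, not something the applied scripts define.

```python
import json


def classify_probe(status: int, body: str) -> str:
    """Classify one anon REST probe as 'locked', 'filtered', 'exposed' or 'unknown'."""
    try:
        payload = json.loads(body) if body else None
    except json.JSONDecodeError:
        payload = None

    if status in (401, 403):
        # Only count it as locked when the body carries SQLSTATE 42501;
        # a 401 for another reason (e.g. a mistyped key) proves nothing.
        if isinstance(payload, dict) and payload.get("code") == "42501":
            return "locked"
        return "unknown"
    if status == 200 and isinstance(payload, list):
        # Rows came back: still exposed. Empty list: the query ran, so the
        # grant is still there (or the table happens to be empty).
        return "exposed" if payload else "filtered"
    return "unknown"
```

Running this over each probed table and requiring `"locked"` everywhere is stricter than checking the status code alone, because it separates a genuine permission denial from a `401` caused by a bad key.
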
## Rollback
If something critical breaks: run
`backend/db/security/rls_lockdown_rollback.sql` (re-grants anon, disables RLS on
the 38 tables that had it off before). ⚠️ This restores the insecure state, so
re-apply the lockdown + option (a) as soon as the issue is understood.

## Follow-ups (not in this script)
- `authenticated` keeps its grants (RLS with no policy denies it today); option
  (a) adds membership-scoped policies for it on `room_messages`.
- Storage hardening is a separate track (`docs/security/storage-hardening-plan.md`).
- The 2 tables that already had RLS (`achievement_cosmetics`,
  `achievement_triggers`) have RLS on but **no policies**. Confirm nothing
  legitimately reads them via anon (the backend uses service_role, so it's
  unaffected).