Feature: 001-prd-md Date: 2025-10-02 Storage: Appwrite Database (NoSQL document collections)
```mermaid
erDiagram
    USERS ||--o{ TRAITS : has
    USERS ||--o{ INTERESTS : has
    USERS ||--o{ VALUES : has
    USERS ||--o{ RESPONSES : submits
    USERS ||--o| LOCATIONS : has
    USERS ||--o{ NOTIFICATION_SCHEDULES : has
    QUESTIONS ||--o{ RESPONSES : answered_by
    USERS ||--o{ MATCHES : similar_to
```
Collection ID: users
Description: User profiles with anonymous identity and consent preferences
| Field | Type | Required | Indexed | Validation | Notes |
|---|---|---|---|---|---|
| $id | string | ✓ | PK | UUID v4 | Auto-generated by Appwrite Auth |
| handle | string | ✓ | unique | 3-20 chars, alphanumeric+underscore | User-chosen pseudonym |
| avatar_id | string | ✓ | - | Appwrite Storage file ID | Reference to avatar image |
| created_at | datetime | ✓ | ✓ | ISO 8601 | Account creation timestamp |
| country | string | ✓ | ✓ | ISO 3166-1 alpha-2 | User's country (for analytics, not discovery) |
| age_band | string | ✓ | - | enum: "18-24", "25-34", "35-44", "45+" | Age bracket (not exact age) |
| visibility | string | ✓ | ✓ | enum: "hidden", "heatmap", "coarse_pin" | Discovery visibility mode |
| radius_km | integer | ✓ | - | 5-200 | Discovery radius in km |
| consent | object | ✓ | - | See below | Granular consent flags |
| notification_windows | array[string] | ✓ | - | ["HH:MM-HH:MM"] | Time windows for notifications |
| onboarding_completed | boolean | ✓ | ✓ | - | Has user finished onboarding? |
| last_active_at | datetime | ✓ | ✓ | ISO 8601 | Last app interaction |

Consent object:

```json
{
  "profiling": {
    "granted": true,
    "timestamp": "2025-10-02T12:00:00Z"
  },
  "location": {
    "granted": false,
    "timestamp": "2025-10-02T12:00:00Z"
  },
  "notifications": {
    "granted": true,
    "timestamp": "2025-10-02T12:00:00Z"
  },
  "sensitive_questions": {
    "granted": false,
    "categories": [],
    "timestamp": "2025-10-02T12:00:00Z"
  }
}
```

Indexes:
- handle (unique)
- visibility + country (compound, for discovery queries)
- created_at (for cohort analysis)
- last_active_at (for inactive user cleanup)

Validation rules:
- handle: /^[a-zA-Z0-9_]{3,20}$/
- notification_windows: max 3 windows, each <12 hours in duration
- radius_km: 5 ≤ value ≤ 200
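
The validation rules for the users collection could be enforced in a server-side function before a document is written. The following is a minimal sketch, not the project's actual validator: the names `window_minutes` and `validate_user_fields` are hypothetical, and it assumes that a window whose end time is not later than its start time wraps past midnight.

```python
import re

HANDLE_RE = re.compile(r"[a-zA-Z0-9_]{3,20}")
WINDOW_RE = re.compile(r"([01]\d|2[0-3]):([0-5]\d)-([01]\d|2[0-3]):([0-5]\d)")


def window_minutes(window: str) -> int:
    """Return the length of an "HH:MM-HH:MM" window in minutes.

    Assumption: if the end is not after the start, the window wraps past
    midnight (so "22:00-02:00" lasts 4 hours and "09:00-09:00" lasts 24).
    """
    match = WINDOW_RE.fullmatch(window)
    if match is None:
        raise ValueError(f"malformed window: {window!r}")
    start_h, start_m, end_h, end_m = (int(g) for g in match.groups())
    start, end = start_h * 60 + start_m, end_h * 60 + end_m
    return (end - start) % (24 * 60) or 24 * 60


def validate_user_fields(handle: str, notification_windows: list, radius_km: int) -> list:
    """Return the names of the fields that break the validation rules."""
    errors = []
    if not HANDLE_RE.fullmatch(handle):
        errors.append("handle")
    try:
        # Max 3 windows, each strictly shorter than 12 hours.
        windows_ok = len(notification_windows) <= 3 and all(
            window_minutes(w) < 12 * 60 for w in notification_windows
        )
    except ValueError:
        windows_ok = False
    if not windows_ok:
        errors.append("notification_windows")
    if not 5 <= radius_km <= 200:
        errors.append("radius_km")
    return errors
```

An empty list means the three fields pass; otherwise the list names each offending field once.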
Collection ID: questions
Description: Profiling questions with metadata for adaptive selection
| Field | Type | Required | Indexed | Validation | Notes |
|---|---|---|---|---|---|
| $id | string | ✓ | PK | UUID v4 | Question ID (e.g., "q_47a") |
| topic | string | ✓ | ✓ | dot-notation | Category (e.g., "music.clubbing") |
| dimension | string | ✓ | ✓ | enum | Trait dimension measured |
| type | string | ✓ | - | enum: "likert_5", "choice", "multi_choice", "slider" | Answer format |
| text | string | ✓ | - | 10-200 chars | Question text |
| answers | array[string] | ✓ | - | Matches type | Answer options |
| info_gain | float | ✓ | ✓ | 0.0-1.0 | Expected information gain score |
| safety | object | ✓ | - | See below | Safety filter metadata |
| why | string | ✓ | - | 50-300 chars | Explanation for "Why this question?" |
| active | boolean | ✓ | ✓ | - | Is question in rotation? |
| created_at | datetime | ✓ | - | ISO 8601 | When question was added |
| source | string | ✓ | - | enum: "seed", "llm_generated", "manual" | Origin of question |

Dimension enum values: "Openness", "Extraversion", "Conscientiousness", "Agreeableness", "SensationSeeking", "RoutineVsNovelty", "SocialEnergy", "CreativeFocus"

Safety object:

```jsonc
{
  "sensitive": false,
  "categories": [], // If sensitive=true: ["health", "religion", "politics", "sexual_orientation"]
  "requires_opt_in": false
}
```

Indexes:
- dimension + active (compound, for question selection)
- info_gain desc (for prioritization)
- topic (for filtering)
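
To illustrate how the safety object and the user's consent object might interact during question selection, here is a hedged sketch. The functions `can_show_question` and `pick_next_question` are hypothetical, and two behaviours are assumptions rather than documented rules: every category on a sensitive question must appear in `consent.sensitive_questions.categories`, and the next question is simply the eligible one with the highest `info_gain`.

```python
from typing import Optional


def can_show_question(question: dict, consent: dict) -> bool:
    """Return True if an active question may be shown under this consent object."""
    if not question["active"]:
        return False
    safety = question["safety"]
    if not (safety["sensitive"] or safety["requires_opt_in"]):
        return True
    opt_in = consent["sensitive_questions"]
    if not opt_in["granted"]:
        return False
    # Assumption: each category on the question needs its own opt-in.
    return all(cat in opt_in["categories"] for cat in safety["categories"])


def pick_next_question(questions: list, consent: dict, answered_ids: set) -> Optional[dict]:
    """Pick the unanswered, eligible question with the highest info_gain."""
    candidates = [
        q for q in questions
        if q["$id"] not in answered_ids and can_show_question(q, consent)
    ]
    return max(candidates, key=lambda q: q["info_gain"], default=None)
```

A real adaptive selector would likely also balance dimensions and topics; this sketch only shows the consent gate and the `info_gain` ordering implied by the indexes.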
Collection ID: responses
Description: User answers to questions
| Field | Type | Required | Indexed | Validation | Notes |
|---|---|---|---|---|---|
| $id | string | ✓ | PK | UUID v4 | Auto-generated |
| user_id | string | ✓ | ✓ | FK users.$id | Respondent |
| question_id | string | ✓ | ✓ | FK questions.$id | Question answered |
| timestamp | datetime | ✓ | ✓ | ISO 8601 | When answered |
| answer | object | ✓ | - | See below | Selected answer |
| time_to_answer_ms | integer | ✓ | - | >0 | Time from shown to submitted |
| session_id | string | ✓ | - | UUID v4 | App session ID (for analytics) |

Answer object:

```jsonc
{
  "value": 4,        // For likert/slider: numeric 0-4 or 0-100
  "text": "Agree",   // For choice: selected option text
  "indices": [0, 2]  // For multi_choice: selected option indices
}
```

Indexes:
- user_id + timestamp desc (compound, for user history)
- question_id (for question analytics)
- user_id + question_id (compound unique, prevent duplicate answers)
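
Because the answer object carries a different key depending on the question type, a small shape check before insertion can catch malformed submissions. The sketch below is one possible reading of the comments in the answer object; `validate_answer` is a hypothetical helper, and treating likert values as integers while allowing fractional slider values is an assumption.

```python
def _is_number(value, allow_float: bool) -> bool:
    """True for ints (and optionally floats), excluding booleans."""
    if isinstance(value, bool):
        return False
    return isinstance(value, (int, float) if allow_float else int)


def validate_answer(question: dict, answer: dict) -> bool:
    """Check that `answer` has the shape expected for `question["type"]`."""
    qtype = question["type"]
    if qtype == "likert_5":
        value = answer.get("value")
        return _is_number(value, allow_float=False) and 0 <= value <= 4
    if qtype == "slider":
        value = answer.get("value")
        return _is_number(value, allow_float=True) and 0 <= value <= 100
    if qtype == "choice":
        # The selected text must be one of the question's answer options.
        return answer.get("text") in question["answers"]
    if qtype == "multi_choice":
        indices = answer.get("indices")
        if not isinstance(indices, list):
            return False
        in_range = all(
            _is_number(i, allow_float=False) and 0 <= i < len(question["answers"])
            for i in indices
        )
        # Each option index may be selected at most once.
        return in_range and len(set(indices)) == len(indices)
    return False
```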
Collection ID: traits
Description: User trait scores derived from responses
| Field | Type | Required | Indexed | Validation | Notes |
|---|---|---|---|---|---|
| $id | string | ✓ | PK | UUID v4 | Auto-generated |
| user_id | string | ✓ | ✓ | FK users.$id | User |
| dimension | string | ✓ | ✓ | enum (same as questions) | Trait dimension |
| score | float | ✓ | - | 0.0-1.0 | Normalized trait score |
| confidence | float | ✓ | - | 0.0-1.0 | Confidence level (based on # answers) |
| updated_at | datetime | ✓ | ✓ | ISO 8601 | Last recalculation |

Indexes:
- user_id + dimension (compound unique, one score per dimension per user)
- updated_at (for incremental updates)

Computation rules:
- Initial confidence = 0.0 (no answers)
- Confidence after N answers: min(1.0, N / 10) (full confidence at 10+ answers)
- Score: weighted average of answers, each mapped to the 0.0-1.0 scale
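
The confidence and score rules translate directly into a few lines of code. This is a sketch under stated assumptions: `trait_confidence` and `trait_score` are hypothetical names, each answer is assumed to arrive already normalized to 0.0-1.0, and the source of the per-answer weights is not specified in this document, so they are passed in by the caller.

```python
from typing import List, Optional, Tuple


def trait_confidence(n_answers: int) -> float:
    """Confidence after N answers: min(1.0, N / 10)."""
    return min(1.0, n_answers / 10)


def trait_score(answers: List[Tuple[float, float]]) -> Optional[float]:
    """Weighted average of (normalized_value, weight) pairs.

    Returns None when there is nothing to average (no answers or zero weight).
    """
    total_weight = sum(weight for _, weight in answers)
    if total_weight == 0:
        return None
    return sum(value * weight for value, weight in answers) / total_weight
```

For example, three answers on a dimension give a confidence of 0.3, and two equally weighted answers of 1.0 and 0.5 give a score of 0.75.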
Collection ID: interests
Description: User interest tags with weights
| Field | Type | Required | Indexed | Validation | Notes |
|---|---|---|---|---|---|
| $id | string | ✓ | PK | UUID v4 | Auto-generated |
| user_id | string | ✓ | ✓ | FK users.$id | User |
| tag | string | ✓ | ✓ | lowercase, no spaces | Interest tag (e.g., "hiking") |
| weight | float | ✓ | - | 0.0-1.0 | Strength of interest |
| source | string | ✓ | - | enum: "explicit", "inferred" | How tag was derived |

Indexes:
- user_id + tag (compound unique)
- tag (for tag popularity analytics)
Collection ID: values
Description: User value tags with weights
Same structure as Interests collection, different semantic meaning.
Example values: "sustainability", "ambition", "autonomy", "community", "creativity"
Collection ID: locations
Description: Coarse user locations (geohash level 5)
| Field | Type | Required | Indexed | Validation | Notes |
|---|---|---|---|---|---|
| $id | string | ✓ | PK | UUID v4 | Auto-generated |
| user_id | string | ✓ | unique | FK users.$id | User (one location per user) |
| coarse_cell | string | ✓ | ✓ | geohash length 5 | ~5km precision (e.g., "u33d8") |
| country | string | ✓ | ✓ | ISO 3166-1 alpha-2 | Country code |
| updated_at | datetime | ✓ | ✓ | ISO 8601 | Last location update |

Indexes:
- user_id (unique)
- coarse_cell + updated_at (compound, for proximity queries)
- country (for country-level analytics)

Privacy rules:
- Only created if the user's consent.location.granted = true
- Deleted immediately if consent is revoked
- Max update frequency: 1/hour (prevents tracking)
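
The `coarse_cell` value and the privacy rules can be sketched as follows. `geohash_encode` is a plain implementation of the standard geohash algorithm truncated to 5 characters; `may_store_location` is a hypothetical gate that combines the consent check with the one-update-per-hour limit. Neither is taken from the project's code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
MIN_UPDATE_INTERVAL = timedelta(hours=1)


def geohash_encode(lat: float, lon: float, precision: int = 5) -> str:
    """Encode a coordinate as a geohash; bits alternate longitude, latitude."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def may_store_location(consent: dict, last_updated_at: Optional[datetime], now: datetime) -> bool:
    """Allow a write only with location consent and at most once per hour."""
    if not consent["location"]["granted"]:
        return False
    return last_updated_at is None or now - last_updated_at >= MIN_UPDATE_INTERVAL
```

Only the 5-character cell would be persisted; the raw latitude and longitude never need to reach the database.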
Collection ID: matches
Description: Cached similarity scores between users
| Field | Type | Required | Indexed | Validation | Notes |
|---|---|---|---|---|---|
| $id | string | ✓ | PK | UUID v4 | Auto-generated |
| user_id | string | ✓ | ✓ | FK users.$id | User A |
| other_id | string | ✓ | ✓ | FK users.$id | User B |
| score | float | ✓ | ✓ | 0.0-1.0 | Similarity score |
| band | string | ✓ | ✓ | enum: "very_similar", "similar", "some_overlap" | Display category |
| shared_traits | array[object] | ✓ | - | Max 3 | Top shared dimensions |
| computed_at | datetime | ✓ | ✓ | ISO 8601 | Cache timestamp |

shared_traits structure:

```json
[
  {
    "dimension": "Openness",
    "score_a": 0.85,
    "score_b": 0.88,
    "delta": 0.03
  }
]
```

Indexes:
- user_id + score desc (compound, for "most similar to me" queries)
- other_id (for bidirectional lookup)
- user_id + other_id (compound unique, prevent duplicate pairs)
- computed_at (for cache expiration cleanup)

Cache rules:
- TTL: 24 hours (matches recomputed daily)
- Invalidate immediately if either user updates profile significantly
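
The `shared_traits` array and the 24-hour TTL could be computed along these lines. This is a sketch, not the similarity-matcher's actual logic: `shared_traits` and `is_stale` are hypothetical helpers, and "top shared dimensions" is assumed to mean the dimensions with the smallest absolute score difference, rounded to two decimals as in the example.

```python
from datetime import datetime, timedelta, timezone

MATCH_TTL = timedelta(hours=24)


def shared_traits(traits_a: dict, traits_b: dict, limit: int = 3) -> list:
    """Return up to `limit` dimensions both users have, closest scores first.

    `traits_a` and `traits_b` map dimension name -> score in 0.0-1.0.
    """
    rows = [
        {
            "dimension": dim,
            "score_a": traits_a[dim],
            "score_b": traits_b[dim],
            "delta": round(abs(traits_a[dim] - traits_b[dim]), 2),
        }
        for dim in traits_a.keys() & traits_b.keys()
    ]
    # Sort by delta, then by name so ties resolve deterministically.
    rows.sort(key=lambda row: (row["delta"], row["dimension"]))
    return rows[:limit]


def is_stale(computed_at: datetime, now: datetime) -> bool:
    """A cached match expires once it is 24 hours old."""
    return now - computed_at >= MATCH_TTL
```

The band thresholds and the definition of a "significant" profile update are not given in this document, so they are left out of the sketch.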
Collection ID: notification_schedules
Description: User notification preferences and engagement metrics
| Field | Type | Required | Indexed | Validation | Notes |
|---|---|---|---|---|---|
| $id | string | ✓ | PK | UUID v4 | Auto-generated |
| user_id | string | ✓ | unique | FK users.$id | User |
| last_sent_at | datetime | - | ✓ | ISO 8601 | Last notification sent |
| next_eligible_at | datetime | ✓ | ✓ | ISO 8601 | Earliest next send time (2hr cooldown) |
| bandit_state | object | ✓ | - | See below | Thompson sampling state |
| metrics | object | ✓ | - | See below | Engagement metrics |

bandit_state object:

```jsonc
{
  "arms": [
    {"hour": 9, "alpha": 2, "beta": 1},   // 9am: 2 opens, 1 ignore
    {"hour": 12, "alpha": 1, "beta": 2},  // 12pm: 1 open, 2 ignores
    {"hour": 18, "alpha": 5, "beta": 1}   // 6pm: 5 opens, 1 ignore
  ]
}
```

metrics object:

```json
{
  "notifications_sent": 42,
  "notifications_opened": 28,
  "open_rate": 0.67,
  "avg_time_to_open_ms": 120000,
  "last_7_days_opens": 5
}
```

Indexes:
- user_id (unique)
- next_eligible_at (for scheduler queries)
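
To show how the `bandit_state` arms might drive send-time selection, here is a small Thompson sampling sketch. It draws one Beta(alpha, beta) sample per arm with the standard library's `random.betavariate` and picks the hour with the largest draw. The function names are hypothetical, and the update rule (alpha += 1 on open, beta += 1 on ignore) is an assumption inferred from the comments in the example object.

```python
import random
from datetime import datetime, timedelta, timezone

COOLDOWN = timedelta(hours=2)


def choose_hour(bandit_state: dict, rng=random) -> int:
    """Sample each arm's Beta posterior and return the hour with the best draw."""
    draws = {
        arm["hour"]: rng.betavariate(arm["alpha"], arm["beta"])
        for arm in bandit_state["arms"]
    }
    return max(draws, key=draws.get)


def record_outcome(bandit_state: dict, hour: int, opened: bool) -> None:
    """Update the arm for `hour` in place after a notification outcome."""
    for arm in bandit_state["arms"]:
        if arm["hour"] == hour:
            arm["alpha" if opened else "beta"] += 1
            return
    raise KeyError(f"no arm for hour {hour}")


def next_eligible_at(last_sent_at: datetime) -> datetime:
    """Earliest next send time under the 2-hour cooldown."""
    return last_sent_at + COOLDOWN


def open_rate(metrics: dict) -> float:
    """Opened / sent, rounded to two decimals; 0.0 when nothing was sent."""
    sent = metrics["notifications_sent"]
    return round(metrics["notifications_opened"] / sent, 2) if sent else 0.0
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible in tests, while the default uses the module-level generator.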
[New User]
→ onboarding_completed=false
→ (complete onboarding)
→ onboarding_completed=true
→ (answer 10 questions)
→ traits.confidence ≥ 0.3
→ (opt into location)
→ locations record created
→ (discovery enabled)
→ matches computed
[Created]
→ source="seed|llm_generated|manual", active=false
→ (safety review)
→ active=true
→ (shown to users, analytics)
→ info_gain updated
→ (poor engagement)
→ active=false
[No match record]
→ (user profiles updated)
→ similarity-matcher Function runs
→ score computed, band assigned
→ match record created
→ (24 hours pass)
→ match deleted, recomputed on next run
One-to-Many:
- users → traits (1:N, max 8 traits per user)
- users → interests (1:N, ~5-20 per user)
- users → values (1:N, ~3-10 per user)
- users → responses (1:N, unbounded growth)
- questions → responses (1:N, unbounded growth)
One-to-One:
- users → locations (1:1, absent unless consent.location.granted=true)
- users → notification_schedules (1:1)
Many-to-Many:
- users ↔ users via matches (N:M, cached similarity graph)
| Collection | Retention Policy |
|---|---|
| users | Until account deletion |
| questions | Indefinite (seed questions), 90 days (LLM-generated if inactive) |
| responses | Until account deletion, or 2 years if user inactive |
| traits | Recomputed from responses, deleted with user |
| interests/values | Deleted with user |
| locations | Deleted immediately if consent revoked or user deleted |
| matches | TTL 24 hours (rolling cache) |
| notification_schedules | Deleted with user |
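
A scheduled cleanup job could express two of the time-based retention rules as simple predicates. This sketch rests on several assumptions that the table does not settle: "2 years" is approximated as 730 days, user inactivity is measured from `last_active_at`, and the 90-day clock for LLM-generated questions starts at a hypothetical `deactivated_at` timestamp that is not a field in the questions schema. Questions with source "manual" are not covered by the table and are kept here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

INACTIVE_USER_LIMIT = timedelta(days=730)      # assumption: "2 years" as 730 days
INACTIVE_QUESTION_LIMIT = timedelta(days=90)


def responses_expired(last_active_at: datetime, now: datetime) -> bool:
    """Responses of a user inactive for 2 years are eligible for deletion."""
    return now - last_active_at >= INACTIVE_USER_LIMIT


def question_expired(question: dict, deactivated_at: Optional[datetime], now: datetime) -> bool:
    """Only inactive LLM-generated questions expire, 90 days after deactivation.

    `deactivated_at` is a hypothetical value supplied by the caller.
    """
    if question["source"] != "llm_generated" or question["active"]:
        return False
    return deactivated_at is not None and now - deactivated_at >= INACTIVE_QUESTION_LIMIT
```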
All collections follow the Appwrite permission model:

```jsonc
// Example: responses collection
{
  "read": ["user:{user_id}"],   // Users can only read own responses
  "create": ["user:{user_id}"], // Users can only create own responses
  "update": [],                 // No updates (immutable after creation)
  "delete": ["user:{user_id}"]  // Users can delete own responses
}
```

Special cases:
- questions: read=["role:all"], write=["role:admin"]
- matches: read=["user:{user_id}", "user:{other_id}"], write=["role:function"]
- locations: read=["role:function"], write=["user:{user_id}"] (not publicly readable)
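
The permission objects above can be generated per document from the owning user IDs. The helpers below only mirror this document's own notation ("user:{id}", "role:function"); they are hypothetical and are not calls into the Appwrite SDK, whose permission helpers and role strings should be checked against the installed version.

```python
def response_permissions(user_id: str) -> dict:
    """Owner may read, create and delete; responses are immutable (no update)."""
    owner = f"user:{user_id}"
    return {"read": [owner], "create": [owner], "update": [], "delete": [owner]}


def match_permissions(user_id: str, other_id: str) -> dict:
    """Both users in a pair may read; only functions may write."""
    return {
        "read": [f"user:{user_id}", f"user:{other_id}"],
        "write": ["role:function"],
    }


def is_allowed(permissions: dict, action: str, principals: list) -> bool:
    """True if any of the caller's principals is listed for `action`."""
    return any(p in permissions.get(action, []) for p in principals)
```

For instance, `is_allowed(response_permissions("u1"), "update", ["user:u1"])` is false, which reflects the rule that responses cannot be edited after creation.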