Young Pharaohs

An interactive AR-powered mobile app that brings ancient Egyptian pharaohs to life. Users explore pharaohs, chat with them using AI-driven conversations, view their monuments in AR, and discover nearby places.

Project Structure

Young-Pharoahs/
├── app/                          # FastAPI backend (Python)
│   ├── routers/                  # API endpoints
│   ├── services/                 # Business logic (LLM, TTS, STT, RAG, etc.)
│   ├── models/                   # Pydantic request/response models
│   └── utils/                    # Helpers (prompts, image processing)
├── RamsesARProject/
│   ├── MyMobileApp/              # React Native mobile app (TypeScript)
│   │   ├── src/
│   │   │   ├── screens/          # App screens
│   │   │   ├── components/       # Reusable UI components
│   │   │   ├── services/         # API & audio services
│   │   │   ├── data/             # Static pharaoh data
│   │   │   └── types/            # TypeScript type definitions
│   │   ├── android/              # Android native project
│   │   └── ios/                  # iOS native project
│   └── unity/
│       └── RamsesAR/             # Unity AR project (C#)
│           ├── Assets/Scripts/   # AR logic (ARManager, CharacterPlacer, etc.)
│           ├── Assets/Scenes/    # Unity scenes
│           ├── Assets/Models/    # 3D pharaoh models
│           └── ProjectSettings/  # Unity config
├── main.py                       # Server entry point
├── ingest.py                     # PDF ingestion into vector store
└── requirements.txt              # Python dependencies

Tech Stack

Layer	Technology
Mobile App	React Native 0.83, TypeScript, React Navigation
AR Engine	Unity (C#), integrated via `react-native-unity`
Backend	FastAPI (Python), Uvicorn
LLM / Vision	Google Gemini (RAG + Vision + query rewriting)
Embeddings	BGE-M3 (1024-dim)
Vector Store	Pinecone
Database	MongoDB (pharaohs, users, conversations)
Speech	Deepgram (STT + TTS), ElevenLabs (TTS)
Image Gen	GPT4All (Mistral 7B) + Stable Diffusion
Nearby Places	Google Places API

Mobile App

Screens

Screen	Description
HomeScreen	Landing page with pharaoh discovery
KingsScreen	Browse all pharaohs with search
ChatScreen	AI-powered conversation with a pharaoh (text + voice)
ConversationsListScreen	View and manage past conversations
MonumentDetailsScreen	Monument info, location, nearby restaurants & hotels
LocationsScreen	Map view of monument locations
ARScreen	Launch Unity AR experience to view 3D pharaoh models

Key Features

Voice Chat — Record audio → STT → RAG → TTS → playback, all in one flow
Image Recognition — Upload a photo of a monument/statue → AI identifies the pharaoh
Multi-turn Conversations — Context-aware follow-ups with automatic pharaoh persona switching
AR Experience — Place 3D animated pharaoh characters in the real world via Unity
Nearby Places — Discover restaurants and hotels near each monument (Google Places)

Setup

cd RamsesARProject/MyMobileApp
npm install

# Android
npx react-native run-android

# iOS
cd ios && pod install && cd ..
npx react-native run-ios

Unity Integration: The Unity AR project must be exported to android/unity/ (Android) before building. See the Unity Export Guide for details.

Unity AR Project

Located in RamsesARProject/unity/RamsesAR/.

Scripts

Script	Purpose
`ARManager.cs`	Core AR session manager, handles plane detection and tracking
`CharacterPlacer.cs`	Places and positions 3D pharaoh models on detected surfaces
`AudioController.cs`	Manages audio playback for AR narration
`UnityMessageManager.cs`	Bridge for React Native ↔ Unity communication

Opening in Unity

Open Unity Hub → Add → select RamsesARProject/unity/RamsesAR/
Open with Unity 2022.3 LTS or later
Install required packages from Packages/manifest.json (auto-resolved)

Backend Server

Setup

pip install -r requirements.txt

# Copy and fill in your API keys
cp .env.example .env

Environment Variables

GEMINI_API_KEY=           # Google Gemini API key
PINECONE_API_KEY=         # Pinecone vector store
MONGODB_URI=              # MongoDB connection string
DEEPGRAM_API_KEY=         # Deepgram STT/TTS
ELEVENLABS_API_KEY=       # ElevenLabs TTS (optional)
GOOGLE_PLACES_API_KEY=    # Google Places API
CLOUDINARY_URL=           # Image hosting

Ingest PDFs

python ingest.py --folder ../  # or any folder with PDFs

Start the Server

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Docs available at: http://localhost:8000/docs

API Endpoints

Authentication

Method	Endpoint	Description
`POST`	`/auth/signup`	Register a new user
`POST`	`/auth/signin`	Login → `access_token`

Pharaohs

Method	Endpoint	Description
`GET`	`/pharaohs`	List all pharaohs
`GET`	`/pharaohs/search?name=ramses`	Search by name
`GET`	`/pharaohs/{king_name}`	Pharaoh details + monuments
`GET`	`/pharaohs/{king_name}/monuments/{monument}/nearby`	Nearby restaurants & hotels

RAG & Chat

Method	Endpoint	Description
`POST`	`/query`	Text query with RAG (supports `conversation_id` for multi-turn)
`POST`	`/voice-query`	Audio in → transcript + RAG answer + audio out
`POST`	`/describe-images`	Upload images → pharaoh identification
`POST`	`/tts`	Text-to-speech with auto gender detection
`POST`	`/generate-image`	Generate scene image from conversation context

Conversations

Method	Endpoint	Description
`GET`	`/conversations`	List conversations with last message preview
`GET`	`/conversations/{id}`	Full conversation history
`POST`	`/conversations`	Create new conversation
`DELETE`	`/conversations/{id}`	Delete a conversation

Architecture

 Mobile App (React Native)
         │
         ├── AR Screen ──── Unity Engine (3D models + plane detection)
         │
         ▼
 ┌──────────────────────────────────────────────────────────────┐
 │                      FastAPI Server                          │
 │                                                              │
 │  ┌───────────────────────────────────────────────────────┐   │
 │  │ AI / ML Services                                      │   │
 │  │  Gemini Vision ──── image → pharaoh name              │   │
 │  │  BGE-M3 ─────────── text → 1024-dim embedding         │   │
 │  │  Gemini LLM ──────── RAG + query rewrite + gender det │   │
 │  │  Reranker ────────── rerank search results             │   │
 │  │  GPT4All (Mistral) ─ conversation → image prompt      │   │
 │  │  SD-Turbo ────────── prompt → generated image          │   │
 │  └───────────────────────────────────────────────────────┘   │
 │  ┌───────────────────────────────────────────────────────┐   │
 │  │ Speech Services                                       │   │
 │  │  Deepgram STT ────── audio → text                     │   │
 │  │  Deepgram TTS ────── text → audio (auto gender)       │   │
 │  └───────────────────────────────────────────────────────┘   │
 │  ┌───────────────────────────────────────────────────────┐   │
 │  │ External Integrations                                 │   │
 │  │  Google Places ───── nearby restaurants & hotels       │   │
 │  │  Uber ────────────── deep links to monument locations  │   │
 │  │  Cloudinary ──────── image hosting                     │   │
 │  └───────────────────────────────────────────────────────┘   │
 └──────────────────────┬───────────────────────────────────────┘
                        │
           ┌────────────┴────────────┐
           ▼                         ▼
    ┌─────────────┐          ┌──────────────┐
    │  Pinecone   │          │   MongoDB     │
    │  (vectors)  │          │   (data)      │
    │  BGE-M3     │          │  pharaohs     │
    │  embeddings │          │  users        │
    │             │          │  conversations │
    └─────────────┘          └──────────────┘

Query Pipeline

User prompt + (optional images / conversation_id)
        │
        ├── images? ────────── Gemini Vision → pharaoh descriptions
        ├── conversation_id? ─ MongoDB → load history
        │
        ▼
   Query Rewriting (Gemini LLM)
   "his temples" + history → "Ramses II temples"
        │
        ▼
   BGE-M3 Embedding → Pinecone Search (top-k) → Reranker (top-8)
        │
        ▼
   Gemini LLM (RAG generation with conversation history)
   Pharaoh speaks in first-person, auto-switches persona
        │
        ▼
   Response (answer + conversation_id + optional audio)

Voice Query Pipeline

Audio file → Deepgram STT → transcript
        │
        ▼
   RAG Pipeline (same as above)
        │
        ▼
   Auto gender detection (Gemini LLM) → Deepgram TTS → audio response
        │
        ▼
   Response (transcript + answer + audio_base64)

Name		Name	Last commit message	Last commit date
Latest commit History 82 Commits
RamsesARProject		RamsesARProject
app		app
datasource		datasource
tests		tests
.gitignore		.gitignore
README.md		README.md
app.py		app.py
ingest.py		ingest.py
ingest_new.py		ingest_new.py
main.py		main.py
pharoahs.json		pharoahs.json
pharoahs_data.json		pharoahs_data.json
requirements.txt		requirements.txt
seed.py		seed.py
test_audio.py		test_audio.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Young Pharaohs

Project Structure

Tech Stack

Mobile App

Screens

Key Features

Setup

Unity AR Project

Scripts

Opening in Unity

Backend Server

Setup

Environment Variables

Ingest PDFs

Start the Server

API Endpoints

Authentication

Pharaohs

RAG & Chat

Conversations

Architecture

Query Pipeline

Voice Query Pipeline

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Young Pharaohs

Project Structure

Tech Stack

Mobile App

Screens

Key Features

Setup

Unity AR Project

Scripts

Opening in Unity

Backend Server

Setup

Environment Variables

Ingest PDFs

Start the Server

API Endpoints

Authentication

Pharaohs

RAG & Chat

Conversations

Architecture

Query Pipeline

Voice Query Pipeline

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages