KKU-NoteFlow
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
Lines changed: 41 additions & 0 deletions
diff --git a/.github/workflows/docker.yml b/.github/workflows/docker.yml
Lines changed: 35 additions & 0 deletions
diff --git a/Dockerfile b/Dockerfile
Lines changed: 20 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 55 additions & 0 deletions
diff --git a/requirements.txt b/requirements.txt
Lines changed: 16 additions & 6 deletions
@@ -0,0 +1,41 @@
+name: CI
+
+on:
+  pull_request:
+    branches: [ main ]
+  push:
+    branches: [ main ]
+    paths:
+      - '**/*.py'
+      - 'requirements.txt'
+      - '.github/workflows/ci.yml'
+
+jobs:
+  lint-test:
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+          cache: 'pip'
+
+      - name: Install deps
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+
+      - name: Syntax check
+        run: |
+          git ls-files -z '*.py' | xargs -0 -r python -m py_compile
+
+      - name: Import smoke
+        run: |
+          python - << 'PY'
+          from importlib import import_module
+          import_module('main')
+          print('Import OK')
+          PY
+
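Review note: the two CI steps above (syntax check, then import smoke) can also be expressed as one small Python helper, which avoids shell word splitting on unusual file names. This is a minimal sketch, not part of the PR; the function names `syntax_check`, `import_smoke`, and `run_checks` are hypothetical.

```python
import importlib
import py_compile


def syntax_check(paths):
    """Compile each file and return (path, message) pairs for files that fail."""
    failures = []
    for path in paths:
        try:
            # doraise=True raises PyCompileError instead of printing the error.
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures.append((str(path), exc.msg))
    return failures


def import_smoke(module_name):
    """Return True if the module imports without raising, False otherwise."""
    try:
        importlib.import_module(module_name)
    except Exception:
        return False
    return True


def run_checks(paths, module_name):
    """Return a process exit code: 0 if both checks pass, 1 otherwise."""
    failures = syntax_check(paths)
    imported = import_smoke(module_name)
    return 0 if not failures and imported else 1
```

In a workflow step this could be driven by something like `run_checks(tracked_python_files, "main")`, where `tracked_python_files` is whatever list of paths the job collects (for example from `git ls-files`).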
@@ -0,0 +1,35 @@
+name: Docker Build & Push (Backend)
+
+on:
+  push:
+    branches: [ main ]
+    paths:
+      - '**'
+      - '!README.md'
+
+jobs:
+  docker:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Log in to GHCR
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Build and push
+        uses: docker/build-push-action@v6
+        with:
+          context: .
+          file: ./Dockerfile
+          push: true
+          tags: |
+            ghcr.io/${{ github.repository }}:backend-latest
+            ghcr.io/${{ github.repository }}:backend-${{ github.sha }}
+
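Review note: Docker image references must use a lowercase repository path, so if `github.repository` expands to a mixed-case name the push may be rejected. The sketch below shows the tag scheme used by this workflow with the repository lowercased; the function name `ghcr_tags` and the example repository name are hypothetical.

```python
def ghcr_tags(repository: str, sha: str, prefix: str = "backend") -> list[str]:
    """Build the two GHCR tags the workflow pushes, with a lowercase image path."""
    # Lowercase only the repository part; the commit SHA is already lowercase hex.
    image = f"ghcr.io/{repository.lower()}"
    return [f"{image}:{prefix}-latest", f"{image}:{prefix}-{sha}"]


# Example with a hypothetical mixed-case repository name.
for tag in ghcr_tags("Example-Org/Example-Repo", "0123abc"):
    print(tag)
```

In the workflow itself the same normalization would have to happen in a step before the build, since the `tags:` input is passed through as written.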
@@ -0,0 +1,20 @@
+FROM python:3.11-slim
+
+ENV PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1
+
+WORKDIR /app
+
+# System dependencies for pdf2image
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    poppler-utils \
+    && rm -rf /var/lib/apt/lists/*
+
+COPY requirements.txt ./requirements.txt
+RUN pip install --no-cache-dir -r requirements.txt
+
+COPY . ./
+
+EXPOSE 8080
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
+
@@ -0,0 +1,55 @@
+# Noteflow Backend (FastAPI)
+
+## Overview
+- FastAPI backend for Noteflow
+- OCR pipeline supports images, PDF, DOC/DOCX, HWP (via utilities and system tools)
+
+## Run (local)
+```
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+uvicorn main:app --host 0.0.0.0 --port 8080 --reload
+```
+
+Environment variables (optional):
+- `SECRET_KEY`, `ACCESS_TOKEN_EXPIRE_MINUTES`
+- Database URLs if you connect a DB (current code uses provided models)
+
+## OCR system tools (optional but recommended)
+- PyMuPDF (Python) is used by default for PDF text extraction
+- Optional fallbacks/tools:
+  - Poppler (`pdftoppm`) for `pdf2image`
+  - LibreOffice (`soffice`) for .doc → .pdf
+  - `hwp5txt` for .hwp text extraction
+- If any of these tools are missing, the API still returns HTTP 200 and includes `warnings` that explain the limitations.
+
+## API Highlights
+- `POST /api/v1/files/ocr` — OCR and create note (accepts file + optional `folder_id`, `langs`, `max_pages`)
+- `POST /api/v1/files/upload` — Upload files to folder
+- `POST /api/v1/files/audio` — STT from audio, create/append to note
+
+## CI (GitHub Actions)
+- This folder includes `.github/workflows/ci.yml`, which runs a syntax check and an import smoke test on pushes and pull requests to `main`.
+- The job uses Python 3.11 and installs dependencies with `pip install -r requirements.txt`.
+
+## Docker (optional; for later)
+- Dockerfile included. Build & run locally:
+```
+docker build -t noteflow-backend .
+docker run --rm -p 8080:8080 noteflow-backend
+```
+- GitHub Actions container build:
+  - `.github/workflows/docker.yml` pushes to GHCR:
+    - `ghcr.io/<owner>/<repo>:backend-latest`
+    - `ghcr.io/<owner>/<repo>:backend-<sha>`
+- Deployment example (SSH) once you’re ready:
+```
+echo <TOKEN> | docker login ghcr.io -u <USER> --password-stdin
+docker pull ghcr.io/<owner>/<repo>:backend-latest
+docker run -d --name backend --restart=always -p 8080:8080 ghcr.io/<owner>/<repo>:backend-latest
+```
+
+## Notes
+- If you split this folder into its own repository root, the included `.github/workflows/*.yml` files will work as-is.
+- OCR uses a model-first path (EasyOCR + TrOCR) and falls back to Tesseract when it is available.
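Review note: the README says the API still responds with 200 and a `warnings` field when optional OCR tools are absent. One way that behavior could be implemented is sketched below. The executable names come from the README; the function names, response shape, and warning texts are assumptions for illustration, not the project's actual code.

```python
import shutil

# Executable names are taken from the README; the messages are illustrative.
OPTIONAL_TOOLS = {
    "pdftoppm": "Poppler not found; pdf2image-based PDF rendering is unavailable.",
    "soffice": "LibreOffice not found; .doc files cannot be converted to PDF.",
    "hwp5txt": "hwp5txt not found; .hwp text extraction is unavailable.",
}


def collect_tool_warnings(which=shutil.which):
    """Return one warning per optional tool that is not on PATH."""
    return [msg for tool, msg in OPTIONAL_TOOLS.items() if which(tool) is None]


def build_ocr_response(text, which=shutil.which):
    """Hypothetical response body: extracted text plus any tool warnings."""
    return {"text": text, "warnings": collect_tool_warnings(which)}
```

Passing `which` as a parameter keeps the check easy to test without installing or removing system packages.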
@@ -3,21 +3,31 @@ uvicorn
 pydantic
 sqlalchemy
 mysql-connector-python
-dotenv
+python-dotenv
 google-auth
 requests
 python-jose[cryptography]
 bcrypt
 
-torch==2.3.0+cu121
-torchaudio==2.3.0+cu121
-torchvision==0.18.0+cu121
---extra-index-url https://download.pytorch.org/whl/cu121
+# PyTorch (macOS: the CPU/MPS build is installed automatically)
+torch==2.3.0
+torchvision==0.18.0
+torchaudio==2.3.0
 
 transformers>=4.40.0
 accelerate
 sentencepiece
 protobuf
 python-multipart
 easyocr
-whisper
+whisper
+pytesseract
+pdf2image
+PyMuPDF
+python-docx
+
+langchain>=0.2.0
+langchain-community
+langchain-core
+langchain-openai
+langchain-ollama
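Review note: the `dotenv` to `python-dotenv` fix in this hunk is an instance of a PyPI distribution name differing from the import name. A small check like the one below could flag such names before they reach `main`. The lookup table is a hand-maintained assumption: to my understanding OpenAI's speech model is published as `openai-whisper` and the bare `whisper` name belongs to an unrelated project, but both entries should be verified against PyPI before relying on them.

```python
import re

# Assumption: names that are easy to confuse on PyPI, mapped to the likely intent.
SUSPECT_NAMES = {
    "dotenv": "python-dotenv",
    "whisper": "openai-whisper",
}


def requirement_names(text):
    """Extract lowercase package names, skipping comments and pip options."""
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # blank line, comment, or an option such as --extra-index-url
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0).lower())
    return names


def suspicious_requirements(text):
    """Map each suspect name found in the file to its suggested replacement."""
    return {n: SUSPECT_NAMES[n] for n in requirement_names(text) if n in SUSPECT_NAMES}
```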