20251217 #1 PostgreSQL connection and table creation #5
base: main
The head ref may contain hidden characters: "20251217_#1_PostgreSQL_연결_및_테이블_생성"
Conversation
@coderabbitai full review
✅ Actions performed: Full review triggered.
Walkthrough
An async SQLAlchemy engine and session, Pydantic-based settings, UUID primary key and timestamp mixins, a SourceType enum, the GithubCursor ORM model and repository, a DB initialization script and migration, a FastAPI lifespan hook, and logging configuration were added.
Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Items recommended for additional verification:
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 6
🧹 Nitpick comments (4)
app/models/github_cursor.py (2)
15-18: Recommend removing the duplicate index
UniqueConstraint automatically creates an index. The Index on line 17 is a duplicate index over the same columns, so it should be removed.
🔎 Suggested fix
  __table_args__ = (
      UniqueConstraint("repository_name", "source_type", name="uq_github_cursor"),
-     Index("idx_github_cursor_repo_type", "repository_name", "source_type")
  )
20-20: Consider making the UUID default generation more explicit
default=uuid.uuid4 works, but in SQLAlchemy 2.0 default_factory is more explicit. The current approach also works; reviewing it for consistency is recommended.
🔎 Optional improvement
The SQLAlchemy documentation supports passing a callable directly, so the current implementation is valid. If you want a more explicit form:
# The current approach is also valid
id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)

# Or, more explicitly (optional)
from sqlalchemy import DefaultClause
id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, server_default=DefaultClause("gen_random_uuid()"))

Using server-side UUID generation lets you take advantage of PostgreSQL's gen_random_uuid() function.

app/repositories/github_cursor_repository.py (1)
39-44: Consider improving the type safety of index_elements
Using column objects for index_elements instead of a list of strings improves type safety.
🔎 Optional improvement
  ).on_conflict_do_update(
-     index_elements=["repository_name", "source_type"],
+     index_elements=[GithubCursorEntity.repository_name, GithubCursorEntity.source_type],
      set_={
          "cursor_value": cursor_value,
          "updated_at": func.now(),
      },
  )

The current implementation also works, but with column objects refactoring tools can automatically pick up column renames.
sql/V20251222_141233__create_github_cursor_table.sql (1)
14-16: The index may be redundant.
The UNIQUE (repository_name, source_type) constraint on line 8 already creates a unique index on those columns automatically. The explicit CREATE INDEX creates a second index on the same columns and wastes storage.
🔎 Suggested fix
Remove the explicit index creation, since the UNIQUE constraint already provides the needed index:

- -- Create index
- CREATE INDEX IF NOT EXISTS idx_github_cursor_repo_type
-     ON github_cursor (repository_name, source_type);

The index created by the UNIQUE constraint is also used effectively by lookup queries.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- .coderabbit.yaml
- app/config/database.py
- app/config/settings.py
- app/models/base.py
- app/models/enums/__init__.py
- app/models/enums/source_type.py
- app/models/github_cursor.py
- app/models/github_issue.py
- app/repositories/github_cursor_repository.py
- requirements.txt
- sql/V20251222_141233__create_github_cursor_table.sql
💤 Files with no reviewable changes (1)
- app/models/github_issue.py
🧰 Additional context used
🧬 Code graph analysis (3)
app/config/database.py (1)
app/config/settings.py (1)
get_settings(45-46)
app/repositories/github_cursor_repository.py (2)
app/models/enums/source_type.py (1)
SourceType(4-17)
app/models/github_cursor.py (1)
GithubCursorEntity(13-29)
app/models/github_cursor.py (2)
app/models/base.py (1)
Base(4-9)
app/models/enums/source_type.py (1)
SourceType(4-17)
🔇 Additional comments (5)
.coderabbit.yaml (1)
1-18: Configuration is set up correctly
The CodeRabbit configuration matches the schema. Korean-language reviews and automatic reviews are enabled, and reviews are configured for the main and test branches.
app/models/enums/source_type.py (1)
4-17: Well-implemented Enum type
Inheriting from str makes JSON serialization and string comparison straightforward. This is a suitable pattern for use with FastAPI and SQLAlchemy.
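For reference, a str-based enum of this shape could look like the sketch below; the member names are taken from the file summary later in this review, while the exact string values are an assumption:

from enum import Enum


class SourceType(str, Enum):
    """GitHub data source types tracked by the cursor table (string values assumed)."""

    REPOSITORY = "REPOSITORY"
    ISSUE = "ISSUE"
    PULL_REQUEST = "PULL_REQUEST"
    COMMIT = "COMMIT"
    RELEASE = "RELEASE"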
app/models/base.py (1)
4-9: Standard SQLAlchemy 2.0 pattern
A standard base-class definition using DeclarativeBase.
app/models/github_cursor.py (1)
29-29: Verify server-side execution of onupdate
In SQLAlchemy, onupdate=func.now() is executed on the Python side. If you want automatic updates on the server side (PostgreSQL), you need to use server_onupdate=func.now(). Confirm whether the current implementation is the intended behavior. If you want server-side timestamp management:
🔎 Suggested fix
-updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False, server_default=func.now(), onupdate=func.now())
+updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False, server_default=func.now(), server_onupdate=func.now())
13-24: The lookup method is implemented correctly
It uses scalar_one_or_none() to safely return a single result.
class Settings(BaseSettings):
    """
    Load environment variables from .env
    """

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore"
    )

    # GitHub API settings
    github_api_base_url: str

    # Ollama settings
    ollama_base_url: str
    ollama_api_key: str
    ollama_model: str
    ollama_timeout_seconds: int

    # Qdrant settings
    qdrant_base_url: str
    qdrant_collection: str
    qdrant_api_key: str

    # Text chunking settings
    text_chunk_max_chars: int
    text_chunk_overlap_chars: int
    text_chunk_hard_max_chars: int

    # Concurrency settings
    concurrency_embedding_max_concurrency: int

    # PostgreSQL settings
    postgres_url: str
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Environment variable field and .env example file check
echo "=== Settings 클래스의 필수 필드 확인 ==="
rg -A 1 "^\s+[a-z_]+:" app/config/settings.py
echo -e "\n=== .env 또는 .env.example 파일 존재 확인 ==="
fd -t f "^\.env" -H
fd -t f "\.env\.example$" -H
Repository: Team-Romi/romi-chatbot
Length of output: 506
🏁 Script executed:
# Check the full settings.py file to understand validation setup
cat -n app/config/settings.py
Repository: Team-Romi/romi-chatbot
Length of output: 1205
🏁 Script executed:
# Check for any Field imports or validation patterns
rg "Field|validator|field_validator" app/config/settings.pyRepository: Team-Romi/romi-chatbot
Length of output: 48
🏁 Script executed:
# Search for get_settings usage and error handling
rg "get_settings" --type py -B 2 -A 2
# Check main.py or app initialization
fd -t f "main\.py$|app\.py$|__main__\.py$"Repository: Team-Romi/romi-chatbot
Length of output: 603
🏁 Script executed:
# Check for .env documentation or example in README or docs
fd -t f "README|\.md$" | head -20
rg "\.env|환경변수|environment" --type md -B 2 -A 2 | head -50Repository: Team-Romi/romi-chatbot
Length of output: 75
🏁 Script executed:
# Check main.py for settings initialization
cat -n app/main.py
# Check claude.md for documentation
cat -n claude.md | head -100
Repository: Team-Romi/romi-chatbot
Length of output: 1177
🏁 Script executed:
# Check database.py to see how _settings is used
cat -n app/config/database.py
Repository: Team-Romi/romi-chatbot
Length of output: 885
.env.example file needs to be added and field validation made explicit
All settings fields are required with no defaults, but there is no .env.example file, so developers cannot tell which environment variables are needed. There are also no explicit descriptions via Field() validation. If the .env file is missing or a required environment variable is absent, a Pydantic validation error is raised the moment the database.py module is imported.
Create a .env.example file documenting every required environment variable with example values, and add Field(description="...") to each field to make the purpose of the validation clear.
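For illustration, a minimal sketch of what this could look like, assuming pydantic-settings 2.x as pinned in requirements.txt (only a couple of fields are shown; the descriptions and the example DSN are placeholders, not the project's actual values):

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Load environment variables from .env, with descriptions surfaced in validation errors."""

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",
    )

    # Required fields: a missing variable now produces an error message that names the field
    # and explains what it is for.
    github_api_base_url: str = Field(description="Base URL of the GitHub REST API")
    postgres_url: str = Field(description="Async PostgreSQL DSN, e.g. postgresql+asyncpg://user:pass@host:5432/db")

A matching .env.example would list the same variable names with dummy values so new developers can copy it to .env and fill in real credentials.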
🤖 Prompt for AI Agents
app/config/settings.py lines 6-41: The Settings class exposes many required env
fields but lacks Field(...) descriptions and there's no .env.example to show
required variables; create a .env.example at the repo root listing every
required variable with example values and add Field(description="...") to each
Settings attribute (github_api_base_url, ollama_base_url, ollama_api_key,
ollama_model, ollama_timeout_seconds, qdrant_base_url, qdrant_collection,
qdrant_api_key, text_chunk_max_chars, text_chunk_overlap_chars,
text_chunk_hard_max_chars, concurrency_embedding_max_concurrency, postgres_url)
so Pydantic validation messages and generated docs clearly state what each env
var means and expected types; keep fields required (no defaults) or add sensible
defaults where appropriate, and update documentation/README to reference
.env.example.
async def upsert(
    self,
    session: AsyncSession,
    repository_name: str,
    source_type: SourceType,
    cursor_value: str,
) -> None:
    query = insert(GithubCursorEntity).values(
        id=uuid.uuid4(),
        repository_name=repository_name,
        source_type=source_type,
        cursor_value=cursor_value,
    ).on_conflict_do_update(
        index_elements=["repository_name", "source_type"],
        set_={
            "cursor_value": cursor_value,
            "updated_at": func.now(),
        },
    )
    await session.execute(query)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find the GithubCursorRepository file
find . -name "*github_cursor_repository*" -type f
# Look for GithubCursorRepository usage
echo -e "\n=== GithubCursorRepository usages ==="
rg -n "GithubCursorRepository" --type py -B 2 -A 10 2>/dev/null | head -100
# Look for session management patterns in service/handler layers
echo -e "\n=== Session commit patterns ==="
rg -n "\.commit\(\)|\.execute\(" --type py -B 2 -A 2 | head -80Repository: Team-Romi/romi-chatbot
Length of output: 1764
🏁 Script executed:
#!/bin/bash
# List all repository files to understand the pattern
find . -path "*/repositories/*.py" -type f | head -20
# Check the github_cursor_repository file specifically
echo -e "\n=== GithubCursorRepository file ==="
rg -n "class GithubCursorRepository|def upsert" --type py -B 2 -A 15 2>/dev/nullRepository: Team-Romi/romi-chatbot
Length of output: 2676
🏁 Script executed:
#!/bin/bash
# Look for session setup patterns
echo "=== Session dependency setup ==="
rg -n "get_session|get_db|AsyncSession" --type py -B 1 -A 3 | grep -E "(def|class|yield|commit)" | head -50
# Check if there's a Unit of Work or similar pattern
echo -e "\n=== Unit of Work patterns ==="
rg -n "UnitOfWork|async with.*session|SessionLocal" --type py | head -30Repository: Team-Romi/romi-chatbot
Length of output: 439
🏁 Script executed:
#!/bin/bash
# Get the complete database.py file
echo "=== Complete database.py ==="
cat app/config/database.py
# Find all places where upsert is called
echo -e "\n=== Upsert method calls ==="
rg -n "\.upsert\(" --type py -B 3 -A 5 2>/dev/nullRepository: Team-Romi/romi-chatbot
Length of output: 754
🏁 Script executed:
#!/bin/bash
# Look for service layer that might use the repository
echo "=== Service/handler files ==="
find . -path "*/services/*.py" -o -path "*/handlers/*.py" -o -path "*/api/routes/*.py" | head -20
# Search for GithubCursor usage in services/handlers
echo -e "\n=== GithubCursor in services/handlers ==="
rg -n "GithubCursor" --type py -B 2 -A 8 | grep -v "^Binary" | head -100Repository: Team-Romi/romi-chatbot
Length of output: 3703
🏁 Script executed:
#!/bin/bash
# Search for where upsert is actually called
echo "=== Upsert calls in github_service ==="
cat app/services/github_service.py | grep -A 10 -B 5 "upsert" || echo "Not in github_service"
# Look at github routes
echo -e "\n=== Github route handlers ==="
cat app/api/routes/github.py 2>/dev/null || echo "File not found"
# Search more broadly for upsert usage
echo -e "\n=== All upsert usages ==="
rg -n "upsert" --type py -B 5 -A 10 2>/dev/nullRepository: Team-Romi/romi-chatbot
Length of output: 1409
🏁 Script executed:
#!/bin/bash
# Check if there's middleware or any auto-commit pattern
echo "=== Middleware/app setup ==="
find . -name "main.py" -o -name "app.py" | xargs cat 2>/dev/null | head -100
# Look for any @app or FastAPI setup
echo -e "\n=== FastAPI app initialization ==="
rg -n "FastAPI\(\|@app\.|Depends" --type py -B 1 -A 3 | head -80
# Check if there's any transaction/commit handling in middleware or lifespan
echo -e "\n=== Middleware or transaction handling ==="
rg -n "middleware|lifespan|Depends.*session" --type py -B 2 -A 5 | head -100Repository: Team-Romi/romi-chatbot
Length of output: 338
🏁 Script executed:
#!/bin/bash
# Check the complete github_cursor_repository.py file to see full context
echo "=== Complete github_cursor_repository.py ==="
cat app/repositories/github_cursor_repository.py
# Check all repository files to see the pattern
echo -e "\n=== Other repository methods (sample) ==="
rg -n "async def" app/repositories/ --type py -A 10 | head -150Repository: Team-Romi/romi-chatbot
Length of output: 3176
🏁 Script executed:
#!/bin/bash
# Final verification: check if any caller handles session commit
echo "=== Check for session.commit or session.rollback patterns ==="
rg -n "session\.commit|session\.rollback|async with.*session" --type py | head -50
# Verify database.py doesn't have commit in finally/except
echo -e "\n=== Database.py full context (with surrounding lines) ==="
cat -n app/config/database.py
Repository: Team-Romi/romi-chatbot
Length of output: 1082
Session transaction commits must be handled
The get_async_session() dependency yields the session without ever committing. With the current code, the changes made by the upsert() method are never persisted to the database.
Modify get_async_session() as follows:
async def get_async_session() -> AsyncGenerator[AsyncSession, None]:
    async with _async_session_factory() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise

This is the standard FastAPI + SQLAlchemy pattern: all database changes within the request scope are committed automatically, and a rollback is guaranteed if an error occurs.
🤖 Prompt for AI Agents
In app/repositories/github_cursor_repository.py around lines 26 to 45 the
upsert() call modifies the database but the session provider is not committing,
so changes are never persisted; update the get_async_session dependency (where
AsyncSession is yielded) to follow the FastAPI+SQLAlchemy pattern: use async
with _async_session_factory() as session: yield session; after the yield call
await session.commit(); and in an except block await session.rollback() then
re-raise, ensuring every request scope auto-commits on success and rolls back on
error so upsert() persists its changes.
@coderabbitai If get_async_session() runs commit() every time, doesn't that make it hard to define transaction boundaries?
In Spring Boot you can manage transactions through an annotation; how are transactions usually managed in Python FastAPI?
🧩 Analysis chain
🌐 Web query:
FastAPI SQLAlchemy async transaction management patterns best practices
💡 Result:
Short practical patterns and best practices for async transaction management with FastAPI + SQLAlchemy:
- Use short-lived AsyncSession per request (dependency). Create async_sessionmaker once, yield a session from a dependency, and ensure rollback on exceptions and close after use. This avoids shared/global sessions. [3][5]
- Use explicit transaction context managers. Wrap multi-step DB work in AsyncSession.begin() (async with session.begin(): ...) so commit/rollback is automatic and safe in async code. For single-statement reads you can still use the session without begin(). [1][2]
- Savepoints / nested transactions for isolation in libraries/tests. Use begin_nested() (SAVEPOINT) when you need rollback inner units without aborting outer transaction (also useful in tests). For tests prefer binding sessions to a connection + top-level transaction and use savepoints to rollback between tests. [1][4]
- Configure session options for async usage:
  - use async engine (create_async_engine) with asyncpg for Postgres,
  - create async_sessionmaker(..., class_=AsyncSession, expire_on_commit=False) to avoid surprises with expired objects after commit,
  - in tests use join_transaction_mode="create_savepoint" if you want commits inside tests but still roll back outer transaction. [3][4]
- Error handling: always catch exceptions in the dependency or transactional scope, await session.rollback() before re-raising, and avoid swallowing DB errors. Rely on context managers for most cases. [3][5]
- Avoid long-lived transactions and large transactional work in request handlers. Keep transactions short (do heavy work outside transaction or in background jobs) to reduce lock contention and connection exhaustion. [1][6]
- Connection-pool and driver considerations:
  - tune pool size for async workers (uvicorn/gunicorn) and pgbouncer mode (transaction vs session); prefer transaction pooling with short sessions,
  - handle transient connection errors with retries/backoff at the engine/connection level or via retry wrappers. Monitor reconnect behavior on graceful shutdown. [5][3]
- Layering & testing:
  - put DB logic in repository/service layers that accept a session (dependency), so transactional boundaries are explicit and testable,
  - for integration tests create an AsyncConnection, begin a top-level transaction, bind sessions to it and rollback at test end. [4][6]
- When mixing sync and async code, avoid running sync DB calls in the async event loop: use async drivers and AsyncSession throughout, or run sync code in threadpool deliberately. [1]
Short checklist to implement in FastAPI:
- create_async_engine(DATABASE_URL) with asyncpg
- async_sessionmaker(..., class_=AsyncSession, expire_on_commit=False)
- dependency:
  async def get_session():
      async with async_sessionmaker() as session:
          try:
              yield session
          except:
              await session.rollback()
              raise
- use async with session.begin(): for transactional endpoints
- use begin_nested() for savepoints where needed
- tune pool and join_transaction_mode for tests
Sources: SQLAlchemy transaction docs and contextual sessions; FastAPI community patterns; test fixtures / savepoint examples; common pitfalls discussion. [1][2][3][4][5]
References
[1] SQLAlchemy — Transactions and Connection Management (Async-aware).
[2] SQLAlchemy — Contextual / session transaction docs.
[3] FastAPI community discussion: async_sessionmaker per-request dependency example.
[4] Test fixture example: AsyncConnection + savepoint pattern (gist).
[5] StackOverflow / GH issues on middleware/global-session pitfalls and pool/commit behavior.
@Chuseok22 Great question! You're right: auto-committing on every request in get_async_session() makes it harder to manage complex transaction boundaries.
Transaction management patterns in FastAPI
Unlike Spring Boot's @Transactional, FastAPI/Python has no standard annotation-based transaction management. Instead, several patterns are commonly used:
1. Explicit transaction blocks (recommended)
The session dependency only provides a session without committing, and the service/router layer manages transactions explicitly:
# app/config/database.py
async def get_async_session() -> AsyncGenerator[AsyncSession, None]:
    async with _async_session_factory() as session:
        yield session
        # No commit here - the caller decides

# Explicit transaction management in the service layer
async def update_github_data(
    repository_name: str,
    session: AsyncSession = Depends(get_async_session)
):
    async with session.begin():  # start the transaction
        # Group several repository operations into a single transaction
        await github_cursor_repo.upsert(session, repository_name, "ISSUE", "cursor1")
        await github_cursor_repo.upsert(session, repository_name, "PR", "cursor2")
        # Auto-commit when the block exits, auto-rollback on exception

Advantages:
- Explicit control over transaction boundaries
- Easy to group multiple operations into a single transaction
- A level of control similar to Spring's @Transactional
2. Dependency-level auto-commit
Suitable for simple CRUD applications:
async def get_async_session() -> AsyncGenerator[AsyncSession, None]:
    async with _async_session_factory() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise

Pros: less boilerplate in simple cases
Cons: harder to manage complex transaction boundaries
3. Unit of Work pattern
A pattern for more complex domain logic:
class UnitOfWork:
    def __init__(self, session: AsyncSession):
        self.session = session
        self.github_cursor_repo = GithubCursorRepository()

    async def __aenter__(self):
        await self.session.begin()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        if exc_type:
            await self.session.rollback()
        else:
            await self.session.commit()

# Usage
async with UnitOfWork(session) as uow:
    await uow.github_cursor_repo.upsert(...)

Recommendation for the current code
Given the current structure, **Pattern 1 (explicit transaction blocks)** is recommended:
- get_async_session() provides only the session (no commit)
- Simple reads just use the session as-is
- Wrap write operations in an async with session.begin(): block

This way:
- Simple cases stay simple
- Complex transactions can still be controlled explicitly
- It is easy to group multiple repository operations later

For the current upsert() method to persist its changes, the caller must either use a session.begin() block or call await session.commit() explicitly.

As a side note, SQLAlchemy 2.0+'s async with session.begin() provides declarative transaction management that is very similar to Spring's @Transactional.
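As a concrete illustration of pattern 1, a hypothetical route could look like this; the endpoint path, function name, and module-level repository instance are illustrative and not part of this PR:

from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession

from app.config.database import get_async_session
from app.models.enums.source_type import SourceType
from app.repositories.github_cursor_repository import GithubCursorRepository

router = APIRouter()
cursor_repo = GithubCursorRepository()


@router.post("/cursors/{repository_name}")  # hypothetical endpoint
async def save_cursor(
    repository_name: str,
    cursor_value: str,
    session: AsyncSession = Depends(get_async_session),
) -> dict:
    # One explicit transaction around the write: commits when the block exits,
    # rolls back if an exception is raised inside it.
    async with session.begin():
        await cursor_repo.upsert(session, repository_name, SourceType.ISSUE, cursor_value)
    return {"status": "ok"}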
alembic==1.17.2
annotated-doc==0.0.4
annotated-types==0.7.0
anyio==4.12.0
asyncpg==0.31.0
certifi==2025.11.12
cffi==2.0.0
charset-normalizer==3.4.4
click==8.3.1
fastapi==0.124.4
cryptography==46.0.3
Deprecated==1.3.1
fastapi==0.127.0
greenlet==3.3.0
grpcio==1.76.0
grpcio-tools==1.76.0
h11==0.16.0
h2==4.3.0
hpack==4.1.0
httpcore==1.0.9
httpx==0.28.1
hyperframe==6.1.0
idna==3.11
Mako==1.3.10
MarkupSafe==3.0.3
numpy==2.4.0
portalocker==3.2.0
protobuf==6.33.2
pycparser==2.23
pydantic==2.12.5
pydantic-settings==2.12.0
pydantic_core==2.41.5
PyGithub==2.8.1
PyJWT==2.10.1
PyNaCl==1.6.1
python-dotenv==1.2.1
qdrant-client==1.16.2
requests==2.32.5
setuptools==80.9.0
SQLAlchemy==2.0.45
starlette==0.50.0
typing-inspection==0.4.2
typing_extensions==4.15.0
uvicorn==0.38.0
urllib3==2.6.2
uvicorn==0.40.0
wrapt==2.0.1
Dependency version compatibility needs verification and testing
The combination of FastAPI 0.127.0, SQLAlchemy 2.0.45, and asyncpg 0.31.0 needs compatibility verification. In particular, asyncpg 0.31.0 is beyond the version range recommended for SQLAlchemy 2.0 (0.29.0 and up), so thoroughly test database queries and async session behavior before deploying to production.
🤖 Prompt for AI Agents
In requirements.txt lines 1-45, the combination FastAPI 0.127.0, SQLAlchemy
2.0.45 and asyncpg 0.31.0 may have compatibility risks; run compatibility and
integration tests (unit + DB async integration) against asyncpg 0.31.0 and
SQLAlchemy 2.0.45 to confirm async session/query behavior; if tests fail, change
the asyncpg version to a tested range (e.g., pin to a supported version such as
>=0.29.0,<0.31.0 or the specific version that passes tests) and update
requirements.txt accordingly, then re-run CI matrix including migrations and
real DB connections to validate the fix.
@@ -0,0 +1,16 @@
CREATE TABLE IF NOT EXISTS github_cursor (
    id UUID PRIMARY KEY,
The UUID PRIMARY KEY is missing a DEFAULT value.
The id column has no default, so every INSERT must supply a UUID explicitly. In PostgreSQL it is common to add a DEFAULT for automatic generation.
🔎 Suggested fix
-    id UUID PRIMARY KEY,
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),

Note: gen_random_uuid() is built in as of PostgreSQL 13. On earlier versions, use uuid_generate_v4() from the uuid-ossp extension.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-    id UUID PRIMARY KEY,
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
🤖 Prompt for AI Agents
In sql/V20251222_141233__create_github_cursor_table.sql around line 2, the id
UUID PRIMARY KEY column lacks a DEFAULT value so inserts must always supply a
UUID; update the column definition to set a DEFAULT that generates a UUID (e.g.,
DEFAULT gen_random_uuid() for PG 13+, or DEFAULT uuid_generate_v4() if using the
uuid-ossp extension) and, if using uuid_generate_v4(), ensure the uuid-ossp
extension is created in the migration prior to table creation.
    source_type VARCHAR(50) NOT NULL,
    cursor_value VARCHAR(500) NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
The automatic update mechanism for updated_at is missing.
The updated_at column is only set at creation time via DEFAULT NOW() and is not refreshed automatically on UPDATE. Either add a trigger or handle it at the application level.
🔎 Suggested fix (add a trigger)
Add the following trigger after creating the table:
-- Function that auto-updates updated_at
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ language 'plpgsql';

-- Create the trigger
CREATE TRIGGER update_github_cursor_updated_at
BEFORE UPDATE ON github_cursor
FOR EACH ROW
EXECUTE FUNCTION update_updated_at_column();

🤖 Prompt for AI Agents
In sql/V20251222_141233__create_github_cursor_table.sql around line 7, the
updated_at column only has DEFAULT NOW() and won’t auto-refresh on UPDATE; add
an automatic update mechanism by creating a plpgsql trigger function that sets
NEW.updated_at = NOW() and then create a BEFORE UPDATE trigger on github_cursor
to call that function (alternatively, document that the application must
explicitly set updated_at on updates if you prefer app-level handling).
Actionable comments posted: 1
♻️ Duplicate comments (2)
app/config/settings.py (1)
6-42: .env.example file and explicit field validation needed
As pointed out in the previous review, a .env.example file for the required environment variables and explicit descriptions via Field(description=...) are still missing.
app/config/database.py (1)
8-20: Module-level initialization and settings-validation error handling
As pointed out in the previous review, the settings and the engine are initialized at module import time, which makes it hard to provide a clear error message when the .env file is missing.
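One way to address this, sketched below, is lazy initialization; the function names follow this PR, while the lru_cache-based caching and the engine arguments are assumptions:

from functools import lru_cache

from sqlalchemy.ext.asyncio import AsyncEngine, async_sessionmaker, create_async_engine

from app.config.settings import get_settings


@lru_cache
def get_async_engine() -> AsyncEngine:
    # Settings are read (and validated by Pydantic) only when the engine is first requested,
    # so a missing .env surfaces as a clear error at startup rather than at module import time.
    settings = get_settings()
    return create_async_engine(settings.postgres_url, echo=False)


@lru_cache
def get_session_factory() -> async_sessionmaker:
    return async_sessionmaker(get_async_engine(), expire_on_commit=False)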
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- .gitignore
- app/config/database.py
- app/config/settings.py
- app/db/__init__.py
- app/db/init_db.py
- app/main.py
- app/models/base.py
- app/models/github_cursor.py
- app/repositories/github_cursor_repository.py
- app/utils/logger.py
🚧 Files skipped from review as they are similar to previous changes (1)
- app/models/github_cursor.py
🧰 Additional context used
🧬 Code graph analysis (4)
app/main.py (2)
app/config/database.py (1)
get_async_engine(23-24)
app/db/init_db.py (1)
create_tables_if_not_exists(11-17)
app/db/init_db.py (2)
app/models/base.py (1)
Base(9-14)
app/models/github_cursor.py (1)
GithubCursorEntity(9-20)
app/repositories/github_cursor_repository.py (2)
app/models/enums/source_type.py (1)
SourceType(4-17)
app/models/github_cursor.py (1)
GithubCursorEntity(9-20)
app/config/database.py (1)
app/config/settings.py (1)
get_settings(46-47)
🪛 Ruff (0.14.8)
app/main.py
12-12: Unused function argument: app
(ARG001)
🔇 Additional comments (9)
.gitignore (1)
88-88: Environment file pattern addition verified
Adding the .env.* pattern safely excludes local/production environment-specific configuration files.
app/utils/logger.py (1)
12-65: Logger configuration verified
The logger initialization flow has been restructured clearly. Duplicate-handler prevention, log-level configuration, and UTF-8 encoding support are all implemented appropriately.
app/config/database.py (1)
31-40: Transaction management pattern verified
As agreed in the earlier discussion, the explicit transaction block pattern (async with session.begin()) is now reflected. The dependency only provides a session, and the caller manages transaction boundaries explicitly.
app/models/base.py (1)
35-42: UUID primary key mixin verified
Using a UUID primary key is an appropriate design, and generating the UUID at the Python level with default=uuid.uuid4 is implemented correctly.
app/db/init_db.py (1)
7-17: DB table initialization logic verified
After importing the models, tables are created via Base.metadata.create_all in a concise and clear structure. Using run_sync inside the async context is also appropriate.
app/main.py (1)
11-16: FastAPI lifespan hook implementation verified
The lifespan context manager that creates DB tables and performs logging on application startup is implemented correctly. Logging at startup and shutdown makes it possible to track the application's state in operation.
The static analysis tool's warning about the unused app parameter is a false positive: FastAPI's lifespan signature requires the app parameter.
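For reference, the typical shape of such a lifespan hook is sketched below; the function names come from this PR, but the exact signature of create_tables_if_not_exists is assumed:

from contextlib import asynccontextmanager

from fastapi import FastAPI

from app.config.database import get_async_engine
from app.db.init_db import create_tables_if_not_exists


@asynccontextmanager
async def lifespan(app: FastAPI):  # `app` is required by FastAPI's lifespan signature even if unused
    # Startup: create tables before the application starts serving requests.
    await create_tables_if_not_exists(get_async_engine())
    yield
    # Shutdown work (e.g. disposing the engine) would go here.


app = FastAPI(lifespan=lifespan)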
app/repositories/github_cursor_repository.py (2)
13-27: Cursor lookup logic verified
The lookup using the composite key (repository_name, source_type) is implemented correctly.
Handling the no-result case by returning None via scalar_one_or_none() is also appropriate.
29-53: Upsert implementation and explicit updated_at refresh verified
The upsert logic using PostgreSQL's on_conflict_do_update is implemented correctly. In particular, explicitly refreshing updated_at with func.now() on conflict at line 48 is a nice touch. Note, however, that this only applies to the upsert path; for ordinary ORM updates (session.add + commit), the TimestampMixin in app/models/base.py has no onupdate configured, so updated_at is not refreshed automatically. That issue is flagged separately on base.py.
app/config/settings.py (1)
19-20: Confirm whether optional authentication via github_api_token is intentional
The GitHub API can be used without authentication for public resources only, and unauthenticated requests are limited to 60 per hour. github_api_token is currently declared Optional; confirm whether this is an intentional design:
- If a token is required for the GitHub API integration, remove the Optional
- If only public repositories are accessed, add logic that handles the token being None appropriately
- If optional authentication is supported, document its limitations (lower rate limit, public resources only, etc.)
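If optional authentication is the intended design, a hypothetical client factory could make the None case explicit; PyGithub is already pinned in requirements.txt, but the helper name and log message below are illustrative:

import logging

from github import Auth, Github

from app.config.settings import get_settings

logger = logging.getLogger(__name__)


def build_github_client() -> Github:
    settings = get_settings()
    if settings.github_api_token:
        return Github(base_url=settings.github_api_base_url, auth=Auth.Token(settings.github_api_token))
    # Unauthenticated access only works for public resources and is limited to about 60 requests/hour.
    logger.warning("github_api_token is not set; falling back to unauthenticated GitHub API access")
    return Github(base_url=settings.github_api_base_url)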
Pull request overview
This PR establishes PostgreSQL database connectivity and implements a cursor tracking system for GitHub data sources. The implementation includes SQLAlchemy-based async database access, environment-based configuration management, and automatic table creation on application startup.
Key Changes:
- Added PostgreSQL connection infrastructure with async SQLAlchemy support
- Implemented a github_cursor table for tracking GitHub data ingestion cursors across repositories, issues, pull requests, commits, and releases
- Introduced environment-based configuration with pydantic-settings for managing database credentials and API keys
Reviewed changes
Copilot reviewed 12 out of 16 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| sql/V20251222_141233__create_github_cursor_table.sql | SQL migration defining the github_cursor table schema with constraints and indexes |
| app/models/base.py | SQLAlchemy base classes including declarative base, timestamp mixin, and UUID primary key mixin |
| app/models/github_cursor.py | GithubCursorEntity model definition matching the SQL schema |
| app/models/enums/source_type.py | SourceType enum defining GitHub data source types (REPOSITORY, ISSUE, PULL_REQUEST, COMMIT, RELEASE) |
| app/repositories/github_cursor_repository.py | Repository layer implementing cursor upsert and query operations |
| app/config/settings.py | Environment-based settings configuration for APIs, database, and application parameters |
| app/config/database.py | Async database engine and session management with connection pooling |
| app/db/init_db.py | Database initialization logic to auto-create tables on startup |
| app/main.py | FastAPI lifespan integration for database initialization |
| app/utils/logger.py | Indentation reformatting (spaces to 2-space indents) |
| requirements.txt | Updated dependencies including SQLAlchemy, asyncpg, PyGithub, qdrant-client, and related packages |
| .gitignore | Added pattern to ignore .env.* files |
| .coderabbit.yaml | CodeRabbit configuration for Korean language reviews and auto-reply chat |
finally:
    await session.close()
Copilot AI · Dec 22, 2025
Explicitly closing the session in the finally block is unnecessary when using the async context manager. The context manager automatically handles session closure. Remove the explicit await session.close() call to avoid potential double-close issues and rely on the context manager's cleanup.
- finally:
-     await session.close()
    nullable=False,
    server_default=func.now(),  # DB level - set automatically on INSERT
)
Copilot AI · Dec 22, 2025
Missing blank line before the class definition. According to PEP 8, there should be two blank lines before top-level class definitions for better readability.
# GitHub API settings
github_api_base_url: str
github_api_token: str | None = None

# Ollama settings
ollama_base_url: str
ollama_api_key: str
ollama_model: str
ollama_timeout_seconds: int

# Qdrant settings
qdrant_base_url: str
qdrant_collection: str
qdrant_api_key: str

# Text chunking settings
text_chunk_max_chars: int
text_chunk_overlap_chars: int
text_chunk_hard_max_chars: int

# Concurrency settings
concurrency_embedding_max_concurrency: int

# PostgreSQL settings
postgres_url: str
Copilot AI · Dec 22, 2025
The settings contain sensitive credentials (API keys, tokens, database URLs) but there's no validation to ensure these required security-sensitive fields are actually provided. Consider adding validation to ensure that critical security credentials like github_api_token, ollama_api_key, qdrant_api_key, and postgres_url are not empty strings or contain placeholder values when the application starts.
created_at, updated_at auto-managed mixin
- created_at: set automatically at the DB level
- updated_at: managed explicitly at the application level
Copilot AI · Dec 22, 2025
The comment claims that updated_at is managed at the application level, but the implementation only sets server_default=func.now(), which means it will only be set automatically on INSERT, not on UPDATE. The field will not automatically update when a record is modified. Either the comment should be corrected to reflect the actual behavior, or an onupdate parameter should be added to automatically update this field on modifications.
- created_at, updated_at auto-managed mixin
- - created_at: set automatically at the DB level
- - updated_at: managed explicitly at the application level
+ created_at, updated_at timestamp management mixin
+ - created_at: set automatically at the DB level on INSERT
+ - updated_at: set automatically at the DB level on INSERT; on UPDATE it must be refreshed explicitly by separate application/DB logic
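If the goal is to have the ORM refresh updated_at rather than a DB trigger, a mixin along these lines could pair the server default with an onupdate callable; the field names follow this PR, but combining both is a suggestion rather than the current implementation:

from datetime import datetime

from sqlalchemy import DateTime, func
from sqlalchemy.orm import Mapped, mapped_column


class TimestampMixin:
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True),
        nullable=False,
        server_default=func.now(),  # set by the DB on INSERT
    )
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True),
        nullable=False,
        server_default=func.now(),  # set by the DB on INSERT
        onupdate=func.now(),        # re-evaluated by SQLAlchemy whenever the ORM issues an UPDATE
    )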
    source_type VARCHAR(50) NOT NULL,
    cursor_value VARCHAR(500) NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
Copilot AI · Dec 22, 2025
The SQL schema lacks an automatic update trigger for the updated_at column. Currently, updated_at only has a DEFAULT NOW() which sets it on INSERT but won't automatically update on modifications. Consider adding a trigger to automatically update this timestamp on UPDATE operations, or ensure it's explicitly updated in application code.
updated_at: Mapped[datetime] = mapped_column(
    DateTime(timezone=True),
    nullable=False,
    server_default=func.now(),  # DB level - set automatically on INSERT
Copilot AI · Dec 22, 2025
The updated_at field is missing an onupdate trigger to automatically update the timestamp when records are modified. Currently, with only server_default=func.now(), the field will be set on INSERT but not automatically updated on UPDATE operations. Add onupdate=func.now() to ensure the timestamp is updated on modifications, or explicitly update it in application code when records change.
- server_default=func.now(),  # DB level - set automatically on INSERT
+ server_default=func.now(),  # DB level - set automatically on INSERT
+ onupdate=func.now(),        # set automatically on UPDATE
try:
    yield session
except Exception as e:
    logger.error(f"Postgres DB 에러: {e}")
Copilot AI · Dec 22, 2025
The error message uses "Postgres" but the official name of the database is "PostgreSQL". For consistency and accuracy, use "PostgreSQL" in error messages and logging.
| logger.error(f"Postgres DB 에러: {e}") | |
| logger.error(f"PostgreSQL DB 에러: {e}") |
);

-- Create index
CREATE INDEX IF NOT EXISTS idx_github_cursor_repo_type
    ON github_cursor (repository_name, source_type);
Copilot AI · Dec 22, 2025
The index idx_github_cursor_repo_type is redundant because it's created on the same columns as the unique constraint uq_github_cursor. PostgreSQL automatically creates an index for unique constraints, so this separate index is unnecessary and wastes storage space. Remove this index definition or, if you need it for performance reasons in specific query patterns, add a comment explaining why.
- );
- -- Create index
- CREATE INDEX IF NOT EXISTS idx_github_cursor_repo_type
-     ON github_cursor (repository_name, source_type);
+ );
async with engine.begin() as conn:
    await conn.run_sync(Base.metadata.create_all)

logger.info("DB 테이블 생성완료")
Copilot AI · Dec 22, 2025
Missing space after "생성" in the Korean log message. The message should be "DB 테이블 생성 완료" with a space between "생성" and "완료" for proper Korean grammar and readability.
| logger.info("DB 테이블 생성완료") | |
| logger.info("DB 테이블 생성 완료") |
Cassiiopeia left a comment
Should we make this process embeddings for issues only?
For each issue, store the body and all of its comments together in one pass.
✨ Changes
✅ Tests
Summary by CodeRabbit
Release notes
New features
Infrastructure improvements
Miscellaneous