🎙️ Podcast Agent

Python 3.12+ · License: MIT · Code style: ruff

An intelligent podcast creation tool based on large language models that automatically generates podcast timelines and dialogue content, with built-in self-reflection mechanisms to ensure high-quality content output.

✨ Features

  • 🤖 Intelligent Timeline Generation: Automatically designs podcast structure based on theme, keywords, and duration
  • 💬 Natural Dialogue Creation: Generates dialogue content that matches the podcast's style
  • 🌍 Multi-language Support: Supports both Chinese and English podcast creation
  • 🎯 Flexible Configuration: Supports custom hosts, guests, style, and other parameters
  • 📊 Quality Scoring: Provides quality scores for each generated content segment
  • 🔄 Self-reflection Optimization: Built-in quality assessment mechanism for automatic iterative content optimization
  • 💰 Cost Tracking: Real-time monitoring of LLM usage costs with detailed cost statistics and analysis

🚀 Quick Start

Requirements

  • Python 3.12+
  • LLM API Key (via OpenRouter)

Installation

  1. Install the package

pip install podcast-agent

  2. Configure environment variables

Create a .env file with the following variables:

OPENAI_API_KEY=your_api_key_here
OPENAI_BASE_URL=https://openrouter.ai/api/v1
MODEL_FOR_PODCAST=openai/gpt-5
MODEL_FOR_REFLECTION=google/gemini-2.5-pro
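
How the package loads these variables is not shown here; assuming it reads them from the process environment (for example via python-dotenv), a quick sanity check before the first run might look like this:

# Sanity check for the configuration (assumption: the variables are read from
# the environment; `pip install python-dotenv` is needed for load_dotenv).
import os

from dotenv import load_dotenv

load_dotenv()  # copy the .env entries in the current directory into os.environ

required = ("OPENAI_API_KEY", "OPENAI_BASE_URL", "MODEL_FOR_PODCAST", "MODEL_FOR_REFLECTION")
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
print("Configuration looks complete.")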

Usage Examples

Chinese Podcast Example

import podcast_agent as pa

pa.create_podcast(
    theme="人工智能在医疗领域的应用",  # "Applications of AI in healthcare"
    keywords_str="AI诊断, 医疗影像, 个性化治疗, 伦理问题",  # AI diagnosis, medical imaging, personalized treatment, ethical issues
    host="潘乱(乱翻书主理人)",  # Pan Luan, host of the show "乱翻书"
    guests_str="Adam Zhou(坏小子AI CEO), 王建硕(某国内顶尖医院主任医师)",  # Adam Zhou (CEO of 坏小子AI), Wang Jianshuo (chief physician at a leading domestic hospital)
    duration=20,
    style="专业且易懂, 口语化, 轻松幽默",  # professional yet accessible, conversational, light-hearted
    language="中文"  # Chinese
)

English Podcast Example

import podcast_agent as pa

pa.create_podcast(
    theme="The Future of Artificial Intelligence in Healthcare",
    keywords_str="AI diagnosis, medical imaging, personalized treatment, ethical concerns, machine learning",
    host="Dr. Sarah Chen (AI Research Director)",
    guests_str="Dr. Michael Rodriguez (Chief Medical Officer), Prof. Lisa Thompson (AI Ethics Expert)",
    duration=25,
    style="Professional and engaging, conversational, accessible to general audience",
    language="English"
)

📖 Workflow

  1. Timeline Generation: Generate podcast structure based on input parameters
  2. Quality Assessment: Use reflection mechanism to evaluate timeline quality
  3. Content Creation: Generate dialogue content for each timeline segment
  4. Iterative Optimization: Automatically optimize content based on quality scores
  5. File Output: Save timeline and content to files
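
The generate-evaluate-revise loop behind steps 2-4 can be illustrated with the OpenAI SDK pointed at OpenRouter. This is only a sketch of the idea, not the podcast_agent implementation: the helper names, prompts, and 0-10 scoring convention below are assumptions for illustration; the real logic lives in agent.py and reflection.py.

# Illustrative generate-evaluate-revise loop (NOT the podcast_agent code;
# prompts, names, and the scoring convention are assumptions).
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL", "https://openrouter.ai/api/v1"),
)

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(model=model, messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content

def create_segment(brief: str, max_rounds: int = 3, threshold: float = 8.0) -> str:
    draft = ask(os.environ["MODEL_FOR_PODCAST"], f"Write podcast dialogue for this segment: {brief}")
    for _ in range(max_rounds):
        critique = ask(os.environ["MODEL_FOR_REFLECTION"],
                       f"Rate this dialogue from 0 to 10, then give feedback. Reply as 'score: feedback'.\n\n{draft}")
        head = critique.split(":", 1)[0].strip()
        score = float(head) if head.replace(".", "", 1).isdigit() else 0.0
        if score >= threshold:
            break  # quality is good enough, stop iterating
        draft = ask(os.environ["MODEL_FOR_PODCAST"],
                    f"Revise the dialogue using this feedback.\n\nFeedback: {critique}\n\nDraft:\n{draft}")
    return draft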

📁 Output Files

  • timeline_YYYYMMDD_HHMMSS.json: Timeline structure file
  • podcast_YYYYMMDD_HHMMSS.txt: Podcast content file
  • cost_stats_YYYYMMDD_HHMMSS.json: LLM usage cost statistics file
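
Since the timeline and cost files are plain JSON, they can be inspected with the standard library. The snippet below only assumes valid JSON files in the output/ directory; it makes no assumptions about the exact schema.

# Preview the most recent timeline file (schema left untouched; we only
# pretty-print whatever JSON the tool wrote to output/).
import glob
import json

latest_timeline = sorted(glob.glob("output/timeline_*.json"))[-1]
with open(latest_timeline, encoding="utf-8") as f:
    timeline = json.load(f)

print(json.dumps(timeline, ensure_ascii=False, indent=2)[:500])  # first 500 characters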

💰 Cost Tracking

The system automatically tracks all LLM API calls and provides detailed cost statistics:

  • Real-time Cost Monitoring: Every API call records token usage and estimated cost
  • Multi-dimensional Analysis: Statistics broken down by operation type (timeline generation, content generation, reflection evaluation, etc.) and model
  • Cost Optimization Suggestions: Helps users understand which operations consume the most resources
  • Detailed Reports: Generates JSON report files containing all usage records
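
Because the cost report is itself a JSON file, it can be re-aggregated in a few lines. The field names below (a list of per-call records with "operation" and "cost" keys) are an assumption about the report schema; adapt them to what an actual cost_stats_*.json contains.

# Re-aggregate the cost report by operation (field names are assumptions
# about the cost_stats_*.json schema, not a documented contract).
import glob
import json
from collections import defaultdict

latest_report = sorted(glob.glob("output/cost_stats_*.json"))[-1]
with open(latest_report, encoding="utf-8") as f:
    records = json.load(f)

totals = defaultdict(float)
for record in records:  # assuming a list of per-call usage records
    totals[record["operation"]] += record["cost"]

for operation, cost in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{operation}: ${cost:.5f}")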

Supported Model Pricing

The system includes built-in pricing information for mainstream LLM models, including:

  • OpenAI GPT series
  • Google Gemini series
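
Cost estimation of this kind typically multiplies prompt and completion token counts by per-million-token prices looked up per model. The sketch below shows that shape with placeholder prices; it is not the pricing data shipped with the package.

# Illustrative per-model pricing table and cost estimate (placeholder prices;
# check your provider or the package source for the real rates).
PRICING_PER_MILLION_TOKENS = {
    # model name: (input USD, output USD) per one million tokens
    "openai/gpt-5": (1.25, 10.00),
    "google/gemini-2.5-pro": (1.25, 10.00),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    input_price, output_price = PRICING_PER_MILLION_TOKENS[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000

print(f"${estimate_cost('openai/gpt-5', 2_000, 850):.5f}")  # prints $0.01100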

Cost Statistics Example

======================================================================
                     💰 LLM Usage Cost Statistics                      
======================================================================

📊 Overall Statistics
  Total Tokens: 39,390
  Total Cost: $0.30458
  Average Cost per Token: $0.00000773
  Total Calls: 14

🔧 Breakdown by Operation
  Timeline Generation:
    Calls: 1
    Tokens: 2,851
    Cost: $0.02529
    Average per Call: $0.02529
  Reflection:
    Calls: 7
    Tokens: 20,083
    Cost: $0.13189
    Average per Call: $0.01884
  Content Generation:
    Calls: 6
    Tokens: 16,456
    Cost: $0.14740
    Average per Call: $0.02457

🤖 Breakdown by Model
  google/gemini-2.5-pro:
    Calls: 7
    Tokens: 19,307
    Cost: $0.17269
    Average per Call: $0.02467
  openai/gpt-5:
    Calls: 7
    Tokens: 20,083
    Cost: $0.13189
    Average per Call: $0.01884
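
The per-operation and per-model figures each sum to the overall totals: 2,851 + 20,083 + 16,456 = 39,390 tokens and $0.02529 + $0.13189 + $0.14740 = $0.30458, which gives the reported average of roughly $0.00000773 per token.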

🛠️ Development

Project Structure

PodcastAgent/
├── src/
│   └── podcast_agent/
│       ├── __init__.py
│       ├── agent.py          # Podcast creation Agent
│       ├── models.py         # Data models
│       ├── openai_client.py  # OpenAI client wrapper
│       ├── reflection.py     # Reflection Agent for evaluating and improving generated content
│       └── utils.py          # Utility functions
├── examples/
│   ├── example_zh.py         # Chinese podcast example
│   └── example_en.py         # English podcast example
├── output/                   # Output files directory
├── pyproject.toml            # Project configuration
└── README.md

Development Environment Setup

  1. Install development dependencies

uv sync

  2. Install pre-commit hooks

pre-commit install

Local Development Test Run

make local_run

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👨‍💻 Author

Adam Zhou - [email protected]


⭐ If this project helps you, please give it a star!
