A personalized New Zealand news aggregator with intelligent content curation.
Forcible is a command-line tool that collects news from New Zealand sources, stores the articles in a local SQLite database, and uses LLMs to extract key facts, detect PR content, and personalize the news feed.
- Multi-source aggregation: Fetches news from Radio New Zealand RSS feeds
- LLM-powered analysis: Uses OpenAI's structured outputs to analyze articles
  - Extracts key facts and statistics with importance scoring
  - Scores article relevance for NZ news interests
  - Detects potential PR-planted stories
  - Classifies content as "headline-only" or "clickthrough"
  - Generates concise summaries
- Flexible configuration: Supports both INI and JSON config formats
- SQLite storage: Local database for all articles and analysis
- Command-line interface: Easy-to-use commands for fetching, listing, processing, and viewing articles
- Install Python dependencies:
pip install -r requirements.txt
- Initialize the configuration:
python forcible.py init
This creates a config.ini file from the example template.
Alternatively, you can use JSON configuration:
python forcible.py --config config.json init
- Edit config.ini (or config.json) and add your OpenAI API key:
For INI format:
[openai]
api_key = your-actual-api-key-here
For JSON format:
{
"openai": {
"api_key": "your-actual-api-key-here"
}
}
Fetch articles from all configured sources:
python forcible.py fetch
Fetch only from Radio New Zealand:
python forcible.py fetch --source rnz
List recent articles:
python forcible.py list
List articles from a specific source:
python forcible.py list --source rnz_national
Limit the number of articles shown:
python forcible.py list --limit 10
Show database statistics:
python forcible.py stats
Process unprocessed articles with LLM analysis:
python forcible.py process
Process a limited number of articles:
python forcible.py process --limit 10
Process a specific article by ID:
python forcible.py process --article-id 42
Show detailed analysis results:
python forcible.py process --verbose
View a specific article with its LLM analysis:
python forcible.py view 42
This displays the article headline, content, and structured LLM analysis including:
- Key facts and statistics
- Relevance score
- PR probability assessment
- Content classification
- Summary
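As a rough illustration of how such a view could be rendered, the sketch below formats an analysis record as plain text. It assumes the JSON shape this README documents for the data field; the function name `format_analysis` and the output layout are hypothetical and are not taken from forcible.py.

```python
import json


def format_analysis(data_json):
    """Render a stored analysis JSON string as plain text (hypothetical layout)."""
    data = json.loads(data_json)
    # Show the most important facts first.
    facts = sorted(
        data.get("key_facts", []),
        key=lambda f: f.get("importance", 0),
        reverse=True,
    )
    lines = ["Key facts:"]
    for fact in facts:
        lines.append(f"  [{fact.get('importance', '?')}] {fact.get('fact', '')}")
    lines.append(f"Relevance score: {data.get('relevance_score', 'n/a')}")
    lines.append(f"PR probability: {data.get('pr_probability', 'n/a')}")
    lines.append(f"Classification: {data.get('content_classification', 'n/a')}")
    lines.append(f"Summary: {data.get('summary', '')}")
    return "\n".join(lines)


# Made-up sample record, for illustration only.
sample = json.dumps({
    "key_facts": [
        {"fact": "A", "importance": 3},
        {"fact": "B", "importance": 8},
    ],
    "relevance_score": 7,
    "pr_probability": 25,
    "content_classification": "headline-only",
    "summary": "Example summary.",
})
print(format_analysis(sample))
```

Missing fields fall back to placeholder values, so a partially processed article would still render.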
- config.py: Configuration management (supports both INI and JSON formats)
- database.py: SQLite database interface for storing articles
- rnz_ingester.py: Radio New Zealand RSS feed ingester
- llm_processor.py: LLM-based article analysis with structured outputs
- forcible.py: Command-line interface
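The commands documented in this README could be wired up with argparse roughly as follows. This is a sketch, not the actual contents of forcible.py: the `build_parser` name, the help strings, and the `config.ini` default for `--config` are assumptions, while the subcommands and flags are the ones shown in the usage examples.

```python
import argparse


def build_parser():
    """Build a parser mirroring the commands shown in this README (sketch)."""
    parser = argparse.ArgumentParser(prog="forcible.py")
    # Assumption: config.ini is the default when --config is not given.
    parser.add_argument("--config", default="config.ini",
                        help="path to an INI or JSON config file")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("init", help="create a config file from the template")

    fetch = sub.add_parser("fetch", help="fetch articles from sources")
    fetch.add_argument("--source", help="fetch from one source only, e.g. rnz")

    list_cmd = sub.add_parser("list", help="list recent articles")
    list_cmd.add_argument("--source", help="filter by source, e.g. rnz_national")
    list_cmd.add_argument("--limit", type=int, help="maximum articles to show")

    sub.add_parser("stats", help="show database statistics")

    process = sub.add_parser("process", help="run LLM analysis on articles")
    process.add_argument("--limit", type=int, help="maximum articles to process")
    process.add_argument("--article-id", type=int, help="process one article")
    process.add_argument("--verbose", action="store_true",
                         help="show detailed analysis results")

    view = sub.add_parser("view", help="show one article with its analysis")
    view.add_argument("article_id", type=int)
    return parser


args = build_parser().parse_args(["process", "--limit", "10", "--verbose"])
print(args.command, args.limit, args.verbose)
```

Placing `--config` on the top-level parser matches the documented invocation order, where the option comes before the subcommand.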
articles table:
- id: Primary key
- url: Unique article URL
- source: Source identifier (e.g., 'rnz_national')
- headline: Article headline
- published_date: Publication date (ISO format)
- fetched_date: Date fetched from source
- content: Article content/summary
- data: JSON field for LLM analysis results (facts, relevance, PR probability, etc.)
- created_at: Record creation timestamp
- updated_at: Last update timestamp
The data field contains structured LLM analysis:
{
"key_facts": [
{"fact": "...", "importance": 8}
],
"relevance_score": 7,
"pr_probability": 25,
"content_classification": "headline-only",
"summary": "...",
"reasoning": "...",
"processed_at": "2024-01-01T12:00:00"
}
source_tracking table:
- source_name: Source identifier (primary key)
- last_scraped: Last scrape timestamp
- last_article_date: Most recent article date seen
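The two tables described above could be created with Python's built-in sqlite3 module along these lines. The column names come from this README, but the column types, the timestamp defaults, the helper names (`open_db`, `add_article`, `unprocessed_ids`), and the idea that an unprocessed article has a NULL data field are all assumptions made for illustration, not the contents of database.py.

```python
import sqlite3

# Column names follow the README; types and defaults are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    id INTEGER PRIMARY KEY,
    url TEXT UNIQUE NOT NULL,
    source TEXT,
    headline TEXT,
    published_date TEXT,
    fetched_date TEXT,
    content TEXT,
    data TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS source_tracking (
    source_name TEXT PRIMARY KEY,
    last_scraped TEXT,
    last_article_date TEXT
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_article(conn, url, source, headline):
    """Insert an article; return False if the URL is already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO articles (url, source, headline) VALUES (?, ?, ?)",
        (url, source, headline),
    )
    conn.commit()
    return cur.rowcount == 1


def unprocessed_ids(conn):
    """Assumption: an article with no analysis yet has a NULL data field."""
    rows = conn.execute("SELECT id FROM articles WHERE data IS NULL ORDER BY id")
    return [row[0] for row in rows]


conn = open_db()
print(add_article(conn, "https://example.com/a", "rnz_national", "Example headline"))
print(add_article(conn, "https://example.com/a", "rnz_national", "Example headline"))
print(unprocessed_ids(conn))
```

The UNIQUE constraint on url is what lets a repeated fetch skip articles that are already stored.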
The configuration file (INI or JSON format) supports:
- OpenAI API key: For LLM processing
- LLM model: Model to use (default: gpt-4o-mini)
- Prompts: Configurable prompts for different analysis tasks (legacy, not used with structured outputs)
- Sources: RSS/Atom feed URLs for different news sources
- Database: Database file path
You can use either config.ini (INI format) or config.json (JSON format). The JSON format provides better structure for complex configurations.
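One way to support both formats is to pick the parser from the file extension and normalise the result to a nested dictionary. The sketch below is an assumption about how that could work, not the code in config.py; `load_config` is a hypothetical name, and only the `[openai]` / `api_key` setting shown earlier in this README is used.

```python
import configparser
import json
import os
import tempfile


def load_config(path):
    """Load an INI or JSON config file into a dict keyed by section (sketch)."""
    if path.lower().endswith(".json"):
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    # Interpolation is disabled so values containing '%' are read verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    with open(path, encoding="utf-8") as fh:
        parser.read_file(fh)
    return {section: dict(parser[section]) for section in parser.sections()}


# Write the two example configs from this README to a temp dir and load both.
with tempfile.TemporaryDirectory() as tmp:
    ini_path = os.path.join(tmp, "config.ini")
    json_path = os.path.join(tmp, "config.json")
    with open(ini_path, "w", encoding="utf-8") as fh:
        fh.write("[openai]\napi_key = your-actual-api-key-here\n")
    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump({"openai": {"api_key": "your-actual-api-key-here"}}, fh)
    ini_cfg = load_config(ini_path)
    json_cfg = load_config(json_path)

print(ini_cfg["openai"]["api_key"] == json_cfg["openai"]["api_key"])
```

Note that INI values are always strings, so any numeric or list-valued settings would need explicit conversion that JSON provides for free.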
Currently supported:
- National news
- World news
- Business news
- Political news
All via RSS feeds.
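To make the RSS ingestion step concrete, here is a minimal sketch that turns RSS 2.0 items into article dictionaries using only the standard library. The sample feed, the `parse_rss` name, and the mapping of RSS elements to article fields are assumptions; the actual rnz_ingester.py may use a different parser and field mapping.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up feed for illustration; not real RNZ content.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/a</link>
      <pubDate>Mon, 01 Jan 2024 12:00:00 +1300</pubDate>
      <description>Short summary.</description>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/b</link>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text, source):
    """Convert RSS 2.0 <item> elements into article dicts (sketch)."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        articles.append({
            "url": item.findtext("link", default=""),
            "source": source,
            "headline": item.findtext("title", default=""),
            # RSS dates are RFC 822 style; store them in ISO format.
            "published_date": parsedate_to_datetime(pub).isoformat() if pub else None,
            "content": item.findtext("description", default=""),
        })
    return articles


for article in parse_rss(SAMPLE_RSS, "rnz_national"):
    print(article["published_date"], article["headline"], article["url"])
```

Items without a pubDate keep a published_date of None rather than a guessed value.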
Planned enhancements:
- Additional sources (Stuff, NZ Herald)
- Web scraping for sources without RSS feeds
- archive.is integration for paywall bypass
- User preference management
- Web interface for viewing personalized feed
- Batch processing optimizations
- Caching and rate limiting for LLM API calls