
RAIL Score Python SDK


Evaluate and generate responsible AI content with the official Python client for the RAIL Score API.

Documentation • API Reference • Examples • Report Issues


🌟 Features at a Glance

| Feature | Description |
| --- | --- |
| 🎯 8 RAIL Dimensions | Evaluate content across Reliability, Accountability, Interpretability, Legal Compliance, Safety, Privacy, Transparency, and Fairness |
| ⚡ Multiple Evaluation Tiers | Choose from basic, dimension-specific, custom, weighted, detailed, advanced, and batch evaluation |
| 🤖 AI Generation | Generate RAG-grounded responses, reprompt suggestions, and protected content |
| ✅ Compliance Checks | Built-in support for GDPR, HIPAA, CCPA, and EU AI Act compliance |
| 📊 Batch Processing | Evaluate up to 100 items per request |
| 🔒 Type-Safe | Full typing support with structured dataclasses for a better IDE experience |
| 🔄 Auto-Retry | Built-in error handling and automatic retries |
| 📈 Usage Tracking | Monitor credits, usage history, and API health |

🚀 Quick Start

Installation

pip install rail-score

Basic Usage

from rail_score import RailScore

# Initialize client
client = RailScore(api_key="your-rail-api-key")

# Evaluate content
result = client.evaluation.basic("Our AI system ensures user privacy and data security.")

# Access scores
print(f"Overall RAIL Score: {result.rail_score.score}")
print(f"Confidence: {result.rail_score.confidence}")
print(f"Privacy Score: {result.scores['privacy'].score}")
print(f"Credits Used: {result.metadata.credits_consumed}")

📖 Table of Contents

  • Configuration
  • Evaluation API
  • Generation API
  • Compliance API
  • Utilities
  • Error Handling
  • Response Structure
  • Use Cases
  • Development
  • Contributing
  • License


🔧 Configuration

from rail_score import RailScore

client = RailScore(
    api_key="your-rail-api-key",
    base_url="https://api.responsibleailabs.ai",  # Optional
    timeout=60  # Request timeout in seconds
)

Getting an API Key: Visit responsibleailabs.ai to sign up and get your API key.
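
API keys are usually kept out of source control. A minimal sketch of reading the key from the environment; RAIL_API_KEY is an illustrative variable name, not one the SDK reads on its own:

import os

from rail_score import RailScore

# The key still has to be passed to the constructor explicitly.
api_key = os.environ.get("RAIL_API_KEY")
if not api_key:
    raise RuntimeError("Set RAIL_API_KEY before creating the client")

client = RailScore(api_key=api_key)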


📊 Evaluation API

Basic Evaluation

Evaluate content across all 8 RAIL dimensions:

result = client.evaluation.basic(
    content="Your AI-generated content here",
    weights=None  # Optional custom weights
)

# Access results
print(result.rail_score.score)  # Overall score (0-10)
print(result.rail_score.confidence)  # Confidence (0-1)

# Individual dimensions
for dim_name, dim_score in result.scores.items():
    print(f"{dim_name}: {dim_score.score} (confidence: {dim_score.confidence})")
    print(f"  Explanation: {dim_score.explanation}")
    if dim_score.issues:
        print(f"  Issues: {', '.join(dim_score.issues)}")

# Metadata
print(f"Request ID: {result.metadata.req_id}")
print(f"Credits Used: {result.metadata.credits_consumed}")
print(f"Processing Time: {result.metadata.processing_time_ms}ms")

Dimension-Specific Evaluation

Evaluate content on a single dimension:

result = client.evaluation.dimension(
    content="We collect user data with consent",
    dimension="privacy"  # One of: reliability, accountability, interpretability,
                         # legal_compliance, safety, privacy, transparency, fairness
)

print(result['result']['score'])
print(result['result']['explanation'])

Custom Evaluation

Evaluate only specific dimensions:

result = client.evaluation.custom(
    content="Healthcare AI system",
    dimensions=["safety", "privacy", "reliability"],
    weights={"safety": 40, "privacy": 35, "reliability": 25}
)

print(result.rail_score.score)
print(result.scores.keys())  # Only evaluated dimensions

Weighted Evaluation

Apply custom weights across all eight dimensions:

weights = {
    "safety": 30,
    "privacy": 25,
    "reliability": 20,
    "accountability": 15,
    "transparency": 5,
    "fairness": 3,
    "inclusivity": 1,
    "user_impact": 1
}

result = client.evaluation.weighted("Content here", weights=weights)

Detailed Evaluation

Get detailed breakdown with strengths and weaknesses:

result = client.evaluation.detailed("AI model description")

summary = result['result']['summary']
print(f"Strengths: {summary['strengths']}")
print(f"Weaknesses: {summary['weaknesses']}")
print(f"Improvements needed: {summary['improvements_needed']}")

Advanced Evaluation

Ensemble evaluation with higher confidence:

result = client.evaluation.advanced(
    content="Critical AI system",
    context="Healthcare decision support system"  # Optional
)

print(result.rail_score.confidence)  # Typically 0.90+
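
The two tiers compose naturally: run the cheaper basic evaluation first and escalate only when confidence is low. A minimal sketch, with 0.85 as an arbitrary threshold rather than an SDK default:

content = "Critical AI system"
result = client.evaluation.basic(content)

# Escalate to the ensemble evaluator only when the basic tier is unsure.
if result.rail_score.confidence < 0.85:
    result = client.evaluation.advanced(
        content=content,
        context="Healthcare decision support system"
    )

print(result.rail_score.score, result.rail_score.confidence)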

Batch Evaluation

Evaluate multiple items in one request:

items = [
    {"content": "First AI-generated text"},
    {"content": "Second AI-generated text"},
    {"content": "Third AI-generated text"}
]

result = client.evaluation.batch(
    items=items,
    dimensions=["safety", "privacy", "fairness"],
    tier="balanced"  # "fast", "balanced", or "advanced"
)

print(f"Processed: {result.successful}/{result.total_items}")

for item_result in result.results:
    print(f"Score: {item_result.rail_score.score}")

RAG Evaluation

Evaluate RAG responses for hallucinations:

result = client.evaluation.rag_evaluate(
    query="What is the capital of France?",
    response="The capital of France is Paris.",
    context_chunks=[
        {"content": "Paris is the capital city of France."},
        {"content": "France is a country in Western Europe."}
    ]
)

metrics = result['result']['rag_metrics']
print(f"Hallucination Score: {metrics['hallucination_score']}")  # Lower is better
print(f"Grounding Score: {result['result']['grounding_score']}")  # Higher is better
print(f"Overall Quality: {metrics['overall_quality']}")

🤖 Generation API

RAG Chat

Generate context-grounded responses:

result = client.generation.rag_chat(
    query="What are the benefits of GDPR compliance?",
    context="GDPR provides data protection and privacy rights to EU citizens...",
    max_tokens=300,
    model="gpt-4o-mini"
)

print(result.generated_text)
print(f"Tokens used: {result.usage['total_tokens']}")
print(f"Credits: {result.metadata.credits_consumed}")

Reprompting

Get improvement suggestions:

current_scores = {
    "transparency": {"score": 4.5},
    "accountability": {"score": 5.0}
}

result = client.generation.reprompt(
    content="AI makes decisions automatically",
    current_scores=current_scores,
    target_score=8.0,
    focus_dimensions=["transparency", "accountability"]
)

suggestions = result['result']['improvement_suggestions']
print(suggestions['text_replacements'])
print(suggestions['expected_improvements'])

Protected Generation

Generate content with safety filters:

result = client.generation.protected_generate(
    prompt="Write a description for an AI hiring tool",
    max_tokens=200,
    min_rail_score=8.0
)

print(result.generated_text)
print(f"RAIL Score: {result.rail_score}")
print(f"Safety Passed: {result.safety_passed}")

✅ Compliance API

GDPR Compliance

result = client.compliance.gdpr(
    content="We collect user emails for marketing purposes",
    context={"data_type": "personal", "region": "EU"},
    strict_mode=True  # Use 7.5 threshold instead of 7.0
)

print(f"Compliance Score: {result.compliance_score}")
print(f"Passed: {result.passed}/{result.requirements_checked}")

for req in result.requirements:
    print(f"{req.requirement} ({req.article}): {req.status}")
    if req.status == "FAIL":
        print(f"  Issue: {req.issue}")

Other Compliance Checks

# CCPA
result = client.compliance.ccpa("Content here")

# HIPAA
result = client.compliance.hipaa("Healthcare AI system")

# EU AI Act
result = client.compliance.ai_act("AI system description")
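
To audit the same content against every framework in one pass, the four checks can be looped over. A sketch that assumes ccpa, hipaa, and ai_act return the same response object as gdpr (compliance_score, passed, requirements_checked):

content = "AI system description"
checks = {
    "GDPR": client.compliance.gdpr,
    "CCPA": client.compliance.ccpa,
    "HIPAA": client.compliance.hipaa,
    "EU AI Act": client.compliance.ai_act,
}

# Run each framework check and summarize pass rates.
for name, check in checks.items():
    result = check(content)
    print(f"{name}: {result.compliance_score} ({result.passed}/{result.requirements_checked} passed)")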

🛠️ Utilities

Check Credits

credits = client.get_credits()

print(f"Plan: {credits['plan']}")
print(f"Monthly Limit: {credits['credits']['monthly_limit']}")
print(f"Used This Month: {credits['credits']['used_this_month']}")
print(f"Remaining: {credits['credits']['remaining']}")

Get Usage History

usage = client.get_usage(limit=50, from_date="2025-01-01T00:00:00Z")

print(f"Total records: {usage['total_records']}")
print(f"Total credits used: {usage['total_credits_used']}")

for entry in usage['history']:
    print(f"{entry['timestamp']}: {entry['endpoint']} - {entry['credits_used']} credits")

Health Check

health = client.health_check()

print(f"Status: {health['ok']}")
print(f"Version: {health['version']}")

⚠️ Error Handling

from rail_score import (
    RailScore,
    AuthenticationError,
    InsufficientCreditsError,
    ValidationError,
    RateLimitError,
    PlanUpgradeRequired
)

client = RailScore(api_key="your-api-key")

try:
    result = client.evaluation.basic("Your content")
except AuthenticationError:
    print("Invalid API key")
except InsufficientCreditsError as e:
    print(f"Not enough credits. Balance: {e.balance}, Required: {e.required}")
except ValidationError as e:
    print(f"Invalid parameters: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after: {e.retry_after} seconds")
except PlanUpgradeRequired:
    print("This endpoint requires a Pro or higher plan")

📦 Response Structure

All endpoints return responses with this structure:

{
  "result": {
    "rail_score": {"score": 8.7, "confidence": 0.90},
    "scores": {
      "privacy": {"score": 9.1, "confidence": 0.94, "explanation": "..."},
      ...
    },
    "processing_time": 2.5
  },
  "metadata": {
    "req_id": "abc-123",
    "tier": "pro",
    "queue_wait_time_ms": 1200.0,
    "processing_time_ms": 2500.0,
    "credits_consumed": 2.0,
    "timestamp": "2025-11-03T10:30:00Z"
  }
}

💡 Use Cases

Content Moderation

from rail_score import RailScore

client = RailScore(api_key="your-key")

# Check user-generated content for safety
result = client.evaluation.dimension(
    content="User comment here",
    dimension="safety"
)

if result['result']['score'] < 7.0:
    print("Content flagged for review")
    print(f"Issues: {result['result']['issues']}")

Batch Content Evaluation

# Evaluate multiple pieces of content
items = [{"content": text} for text in content_list]

result = client.evaluation.batch(
    items=items[:100],  # Max 100 items
    dimensions=["safety", "fairness", "privacy"]
)

# Filter by score
safe_content = [
    items[i]
    for i, res in enumerate(result.results)
    if res.rail_score.score >= 7.5
]
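
When content_list holds more than 100 entries, chunk it instead of truncating (a sketch reusing the items list from above):

# Process an arbitrarily long list in API-sized chunks of 100 items.
all_results = []
for start in range(0, len(items), 100):
    batch_result = client.evaluation.batch(
        items=items[start:start + 100],
        dimensions=["safety", "fairness", "privacy"]
    )
    all_results.extend(batch_result.results)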

Compliance Checking

# Check GDPR compliance
result = client.compliance.gdpr(
    content="AI system for user profiling",
    context={"purpose": "marketing", "data_type": "personal"}
)

if result.failed > 0:
    print("GDPR compliance issues found:")
    for req in result.requirements:
        if req.status == "FAIL":
            print(f"- {req.requirement}: {req.issue}")

🔨 Development

Requirements

  • Python 3.8+
  • requests >= 2.28.0

Setup

# Clone repository
git clone https://github.com/Responsible-AI-Labs/rail-score.git
cd rail-score

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black rail_score/

# Type checking
mypy rail_score/

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


⭐ Star History

If you find RAIL Score useful, please consider giving it a star! ⭐



Made with ❤️ by Responsible AI Labs

Website • Documentation • GitHub • Twitter
