Exception handling is a critical part of modern software systems, especially as applications scale and grow more complex. Traditional exception handling is often rigid and reactive, lacking the intelligence to adapt or escalate issues in real time. By integrating AI-driven exception handling with automated escalation and human-in-the-loop (HITL) feedback, organizations can build more resilient, adaptive, and auditable systems.
In this playbook, you'll learn how to build a real-time AI-powered exception handling workflow. This workflow will automatically classify exceptions, escalate critical issues, and route ambiguous cases for human review, leveraging feedback to improve over time.
## Prerequisites
- Python 3.10+ installed
- Familiarity with Python exception handling
- Basic knowledge of REST APIs
- Docker (for running supporting services, e.g., message queues)
- PostgreSQL (or another relational database) for logging and feedback
- OpenAI GPT-3.5/4 API access or HuggingFace Transformers
- Experience with `fastapi` and `sqlalchemy` is helpful
## Step 1: Set Up the Project Structure
1. Initialize a new project directory:

   ```bash
   mkdir ai-exception-handler && cd ai-exception-handler
   ```

2. Create a virtual environment and activate it:

   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```

3. Install required dependencies:

   ```bash
   pip install fastapi uvicorn sqlalchemy psycopg2-binary openai pydantic
   ```

4. Project layout:

   - `main.py` — FastAPI app and exception handler
   - `ai_classifier.py` — AI exception classification logic
   - `models.py` — SQLAlchemy models
   - `database.py` — DB connection utilities
   - `escalation.py` — Automated escalation logic
   - `feedback.py` — Human-in-the-loop feedback endpoints
## Step 2: Implement AI-Based Exception Classification
1. Set up your OpenAI API key:

   ```bash
   export OPENAI_API_KEY=sk-...
   ```

2. Create `ai_classifier.py`. This module sends exception details to the AI model and receives a classification and escalation recommendation:

   ```python
   import openai

   def classify_exception(exception_message: str) -> dict:
       prompt = (
           f"Exception: {exception_message}\n"
           "Classify the exception as 'critical', 'warning', or 'info'. "
           "Should this be escalated automatically? (yes/no). "
           "If unsure, set 'escalate': 'human'."
       )
       response = openai.ChatCompletion.create(
           model="gpt-3.5-turbo",
           messages=[{"role": "user", "content": prompt}],
           max_tokens=50,
           temperature=0,
       )
       content = response.choices[0].message["content"].strip()
       # Example expected output: {"level": "critical", "escalate": "yes"}
       try:
           return eval(content)
       except Exception:
           # Fallback in case of parsing issues
           return {"level": "warning", "escalate": "human"}
   ```

   Note: In production, use `json.loads()` and prompt the model for valid JSON only — `eval()` on model output is unsafe.
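To make that note concrete, here is a minimal sketch of a safe parser for the model's reply using `json.loads()` instead of `eval()`. The function name and the schema checks are illustrative, not part of the files above:

```python
import json

VALID_LEVELS = {"critical", "warning", "info"}
VALID_ESCALATE = {"yes", "no", "human"}

def parse_classification(content: str) -> dict:
    """Parse the model's reply; route anything malformed to a human."""
    try:
        result = json.loads(content)
        if (isinstance(result, dict)
                and result.get("level") in VALID_LEVELS
                and result.get("escalate") in VALID_ESCALATE):
            return result
    except json.JSONDecodeError:
        pass
    # Unparseable or out-of-schema output falls back to human review
    return {"level": "warning", "escalate": "human"}
```

Anything the model returns that is not valid JSON, or that uses labels outside the expected sets, is treated as ambiguous and routed to a human rather than trusted.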
## Step 3: Integrate AI Exception Handling into FastAPI
1. Create `main.py` with a custom exception handler:

   ```python
   from fastapi import FastAPI, Request
   from fastapi.responses import JSONResponse

   from ai_classifier import classify_exception
   from escalation import escalate_issue
   from feedback import route_to_human

   app = FastAPI()

   @app.exception_handler(Exception)
   async def ai_exception_handler(request: Request, exc: Exception):
       exception_message = str(exc)
       ai_result = classify_exception(exception_message)
       if ai_result["escalate"] == "yes":
           escalate_issue(exception_message, ai_result["level"])
           return JSONResponse(
               status_code=500,
               content={"detail": "Critical exception escalated automatically."},
           )
       elif ai_result["escalate"] == "human":
           route_to_human(exception_message)
           return JSONResponse(
               status_code=500,
               content={"detail": "Exception routed for human review."},
           )
       else:
           return JSONResponse(
               status_code=400,
               content={"detail": f"Exception classified as {ai_result['level']}."},
           )

   @app.get("/")
   async def read_root():
       raise ValueError("Simulated exception for testing.")
   ```

2. Test it:

   ```bash
   uvicorn main:app --reload
   ```

   Visit `http://localhost:8000/` in your browser. You should see a JSON response indicating how the exception was handled.
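The handler's branching can be factored into a small pure function, which makes the routing rules unit-testable without running FastAPI. This is a hypothetical refactor; `response_for` is not one of the files above:

```python
def response_for(ai_result: dict) -> tuple[int, str]:
    """Map an AI classification to (status_code, detail), mirroring the handler."""
    escalate = ai_result.get("escalate")
    if escalate == "yes":
        return 500, "Critical exception escalated automatically."
    if escalate == "human":
        return 500, "Exception routed for human review."
    return 400, f"Exception classified as {ai_result.get('level', 'info')}."
```

Keeping the decision logic separate from the FastAPI plumbing lets you assert on every branch in plain unit tests before wiring it into the exception handler.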
## Step 4: Automated Escalation and Logging
1. Set up SQLAlchemy models in `models.py`:

   ```python
   import datetime

   from sqlalchemy import Column, DateTime, Integer, String
   from sqlalchemy.ext.declarative import declarative_base

   Base = declarative_base()

   class ExceptionLog(Base):
       __tablename__ = "exception_logs"

       id = Column(Integer, primary_key=True)
       message = Column(String)
       level = Column(String)
       escalation_status = Column(String)
       timestamp = Column(DateTime, default=datetime.datetime.utcnow)
   ```

2. Database connection utility (`database.py`):

   ```python
   from sqlalchemy import create_engine
   from sqlalchemy.orm import sessionmaker

   DATABASE_URL = "postgresql://user:password@localhost/ai_exception_db"

   engine = create_engine(DATABASE_URL)
   SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
   ```

   Replace `user`, `password`, and `ai_exception_db` with your actual credentials. Before the first request, create the tables once with `Base.metadata.create_all(bind=engine)`.

3. Implement `escalation.py` for logging and escalation:

   ```python
   from database import SessionLocal
   from models import ExceptionLog

   def escalate_issue(message: str, level: str):
       db = SessionLocal()
       log = ExceptionLog(
           message=message,
           level=level,
           escalation_status="escalated",
       )
       db.add(log)
       db.commit()
       db.close()
       # Here, you could add code to send alerts (email, Slack, PagerDuty, etc.)
   ```
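The alerting comment at the end of `escalate_issue` can be made concrete with a Slack incoming-webhook notification. This is a sketch: `SLACK_WEBHOOK_URL` is a hypothetical placeholder, and the payload builder is kept separate from the network call so it can be tested offline:

```python
import json
import urllib.request

# Hypothetical placeholder -- substitute your real incoming-webhook URL
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"

def build_alert(message: str, level: str) -> dict:
    """Build the Slack webhook payload for an escalated exception."""
    return {"text": f"[{level.upper()}] Escalated exception: {message}"}

def send_alert(message: str, level: str) -> None:
    """POST the alert to Slack; call this from escalate_issue()."""
    payload = json.dumps(build_alert(message, level)).encode("utf-8")
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=5)
```

For email or PagerDuty the shape is the same: build a payload from the log row, then hand it to the channel-specific sender.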
## Step 5: Implement Human-in-the-Loop Feedback
1. Route ambiguous exceptions to human review (`feedback.py`):

   ```python
   from database import SessionLocal
   from models import ExceptionLog

   def route_to_human(message: str):
       db = SessionLocal()
       log = ExceptionLog(
           message=message,
           level="unknown",
           escalation_status="pending_human",
       )
       db.add(log)
       db.commit()
       db.close()
       # Optionally, send a notification to a dashboard or queue for review
   ```

2. Expose a FastAPI endpoint for humans to provide feedback and close the loop:

   ```python
   from fastapi import APIRouter, Body

   router = APIRouter()

   @router.post("/feedback")
   def feedback(exception_id: int = Body(...), resolution: str = Body(...)):
       db = SessionLocal()
       log = db.query(ExceptionLog).filter(ExceptionLog.id == exception_id).first()
       if log:
           log.escalation_status = resolution
           db.commit()
       db.close()
       return {"status": "updated"}
   ```

   Include this router in your `main.py`:

   ```python
   from feedback import router as feedback_router

   app.include_router(feedback_router)
   ```

   Now, human operators can POST feedback, e.g.:

   ```bash
   curl -X POST "http://localhost:8000/feedback" \
     -H "Content-Type: application/json" \
     -d '{"exception_id":1,"resolution":"resolved"}'
   ```

   For more on HITL design, see Best Practices for Human-in-the-Loop AI Workflow Automation.
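As written, the `/feedback` endpoint writes whatever string the operator sends straight into `escalation_status`. A small guard keeps the column's values consistent; the status names here are hypothetical, so adjust them to your own taxonomy:

```python
# Hypothetical set of allowed statuses -- adjust to your own taxonomy
ALLOWED_RESOLUTIONS = {"resolved", "false_positive", "escalated", "wont_fix"}

def validate_resolution(resolution: str) -> str:
    """Normalize and check an operator-supplied resolution status."""
    value = resolution.strip().lower()
    if value not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"Unknown resolution: {resolution!r}")
    return value
```

Calling this before assigning `log.escalation_status` turns typos into a 4xx-style error instead of silently polluting the feedback data you later train on.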
## Step 6: Learning from Feedback (Continuous Improvement)
1. Periodically retrain or fine-tune your AI model using labeled data from human feedback logs:

   - Export resolved exceptions and feedback from your database.
   - Use this data to improve prompt engineering or train a custom model.

   ```python
   from database import SessionLocal
   from models import ExceptionLog

   def export_feedback():
       db = SessionLocal()
       logs = (
           db.query(ExceptionLog)
           .filter(ExceptionLog.escalation_status != "pending_human")
           .all()
       )
       for log in logs:
           print(f"{log.message}\t{log.level}\t{log.escalation_status}")
       db.close()
   ```

   For more on HITL value, read Human-in-the-Loop AI in Workflow Automation: When Does It Actually Add Value?.
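The exported rows can be reshaped into training examples for fine-tuning or few-shot prompting. A sketch follows; the JSONL schema and the prompt text are illustrative, so match whatever format your provider actually expects:

```python
import json

def to_training_example(message: str, level: str, status: str) -> str:
    """Turn one reviewed log row into a JSONL line pairing the original
    exception with its human-confirmed label."""
    escalate = "yes" if status == "escalated" else "no"
    record = {
        "prompt": f"Exception: {message}\nClassify the exception.",
        "completion": json.dumps({"level": level, "escalate": escalate}),
    }
    return json.dumps(record)
```

Writing one such line per resolved `ExceptionLog` row gives you a growing labeled dataset whose labels come from human review rather than the model's own guesses.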
## Common Issues & Troubleshooting
- OpenAI API errors: Ensure your API key is valid and you have network access. Check your usage limits.
- Database connection errors: Double-check your `DATABASE_URL` and that PostgreSQL is running.
- AI model returns invalid format: Prompt the model explicitly for JSON output and use `json.loads()` for parsing.
- Exceptions not being routed correctly: Add debug logs to your exception handler and verify the AI output.
- Feedback endpoint not updating status: Ensure IDs are correct and that the DB session commits changes.
## Next Steps
- Expand escalation logic to integrate with real alerting systems (Slack, PagerDuty, etc.).
- Build a dashboard for monitoring exceptions and human feedback.
- Experiment with different AI models and prompt engineering to improve classification accuracy.
- Automate feedback loop for continuous learning and model improvement.
- Review Human-in-the-Loop AI in Workflow Automation: When Does It Actually Add Value? and Best Practices for Human-in-the-Loop AI Workflow Automation for deeper insights.
By following this playbook, you can deploy an AI-powered, real-time exception handling system that combines automated escalation with human-in-the-loop feedback. This hybrid approach ensures resilience, adaptability, and continuous improvement in your operational workflows.
