Document AI is rapidly transforming how organizations handle unstructured information, automating everything from invoice approvals to contract data extraction. As we covered in our complete guide to automating complex document workflows with AI, prompt engineering is the linchpin for unlocking real-world value from these systems. This tutorial provides a practical, hands-on deep dive into prompt engineering for Document AI—focusing on approval and extraction use cases, with reusable templates, code examples, and troubleshooting tips.
Whether you’re integrating LLMs into your document workflow, building custom extraction pipelines, or seeking robust approval automation, this guide delivers actionable steps. For a broader view of available platforms and compliance considerations, see our sibling articles: AI Document Workflow Tools: A 2026 Buyer’s Guide and Automating Document Workflows in Regulated Industries: AI Compliance Techniques That Work.
Prerequisites
- Python 3.10+ (tested with Python 3.11)
- OpenAI API access (or Azure OpenAI, or Google Vertex AI with Gemini)
- openai Python package (v1.2+)
- Basic knowledge of prompt engineering concepts (see: Prompt Engineering for Approval Workflows: Patterns, Anti-Patterns, and Real-World Templates)
- Familiarity with the requests and json libraries
- Sample documents for testing (PDF or plain text)
1. Setting Up Your Environment
- Install Python and required packages:
    python3 -m venv .venv
    source .venv/bin/activate
    pip install openai==1.2.3 python-dotenv
  If you plan to parse PDFs, also install pypdf:
    pip install pypdf
- Configure your API key:
  - Create a .env file in your project directory:
      OPENAI_API_KEY=sk-...
  - Load the key in your Python scripts using python-dotenv:
      from dotenv import load_dotenv
      import os

      load_dotenv()
      api_key = os.getenv("OPENAI_API_KEY")
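A missing key usually surfaces later as a confusing authentication error, so it can help to fail fast at startup. The following is a minimal sketch of that idea; require_env is a hypothetical helper name, not part of python-dotenv or the openai package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper: call it after load_dotenv() has populated os.environ.
    """
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it "
            "before running the script."
        )
    return value
```

In a script this would replace the plain os.getenv call, for example api_key = require_env("OPENAI_API_KEY").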
2. Extracting Text from Documents
- Extract text from a PDF (optional):
    from pypdf import PdfReader

    def extract_pdf_text(pdf_path):
        reader = PdfReader(pdf_path)
        text = ""
        for page in reader.pages:
            text += page.extract_text() + "\n"
        return text

    doc_text = extract_pdf_text("sample_invoice.pdf")
  For plain text files, simply read with open("file.txt").read().
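If your test set mixes PDFs and plain text files, a single loader keeps the rest of the pipeline uniform. This is a sketch under assumptions: load_document_text is a hypothetical name, only .pdf and .txt are handled, and text files are assumed to be UTF-8.

```python
from pathlib import Path


def load_document_text(path: str) -> str:
    """Return the text of a .txt or .pdf file as a single string."""
    file_path = Path(path)
    suffix = file_path.suffix.lower()

    if suffix == ".pdf":
        # Imported lazily so plain-text workflows do not need pypdf installed.
        from pypdf import PdfReader

        reader = PdfReader(str(file_path))
        # Image-only (scanned) pages yield no text; treat them as empty.
        pages = [page.extract_text() or "" for page in reader.pages]
        return "\n".join(pages)

    if suffix == ".txt":
        # read_text opens and closes the file for us.
        return file_path.read_text(encoding="utf-8")

    raise ValueError(f"Unsupported document type: {suffix or 'no extension'}")
```

An empty result for a PDF usually means the file is a scan without a text layer, in which case an OCR step would be needed before prompting.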
3. Designing Effective Prompts for Document Extraction
Prompt engineering is all about clarity and structure. For document extraction, we want to guide the LLM to return structured outputs (ideally JSON) and to ignore irrelevant content. Below are real-world prompt templates for extracting key fields from invoices and contracts.
- Invoice Extraction Prompt Template:
    extraction_prompt = f"""
    You are an expert in document data extraction. Extract the following fields from the document below:
    - Invoice Number
    - Invoice Date
    - Vendor Name
    - Total Amount
    Return your answer as a valid JSON object with keys: invoice_number, invoice_date, vendor_name, total_amount.
    Document:
    \"\"\"
    {doc_text}
    \"\"\"
    """
- Contract Extraction Prompt Template:
    contract_prompt = f"""
    Extract the following key terms from the contract below:
    - Effective Date
    - Termination Clause (copy the full clause)
    - Governing Law
    Return as JSON with keys: effective_date, termination_clause, governing_law.
    Contract:
    \"\"\"
    {doc_text}
    \"\"\"
    """
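Because an f-string is filled in once, at the moment it is defined, templates like the ones above are easier to reuse when wrapped in a function. The sketch below is one possible way to do that; build_extraction_prompt and DEFAULT_INVOICE_FIELDS are hypothetical names, and the wording simply mirrors the invoice template.

```python
from typing import Optional

# Maps JSON keys to the human-readable labels used in the prompt.
DEFAULT_INVOICE_FIELDS = {
    "invoice_number": "Invoice Number",
    "invoice_date": "Invoice Date",
    "vendor_name": "Vendor Name",
    "total_amount": "Total Amount",
}


def build_extraction_prompt(doc_text: str, fields: Optional[dict] = None) -> str:
    """Build an extraction prompt for one document.

    The document text is concatenated rather than passed through
    str.format(), so braces inside the document cannot break the template.
    """
    fields = fields or DEFAULT_INVOICE_FIELDS
    field_lines = "\n".join(f"- {label}" for label in fields.values())
    keys = ", ".join(fields)
    return (
        "You are an expert in document data extraction. "
        "Extract the following fields from the document below:\n"
        f"{field_lines}\n"
        f"Return your answer as a valid JSON object with keys: {keys}.\n"
        "If a field is missing, set its value to null.\n"
        'Document:\n"""\n'
        f"{doc_text}\n"
        '"""'
    )
```

Passing a different fields dictionary, such as the three contract terms, reuses the same function for other document types.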
4. Running Extraction with OpenAI GPT
We'll use the OpenAI Chat Completions API for structured extraction, which supports models like gpt-4-turbo or gpt-3.5-turbo. In version 1.x of the openai package it is exposed as client.chat.completions.create; the older openai.ChatCompletion.create call was removed in 1.0 and raises an error there.
- Send the prompt to the API:
    from openai import OpenAI

    client = OpenAI(api_key=api_key)  # api_key was loaded from .env in step 1
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant for document data extraction."},
            {"role": "user", "content": extraction_prompt}
        ],
        temperature=0.0,
        max_tokens=512
    )
    extracted_json = response.choices[0].message.content
    print(extracted_json)
  Tip: Use temperature=0.0 for more deterministic outputs. Identical inputs can still occasionally produce different results.
- Parse and validate the JSON output:
    import json

    try:
        data = json.loads(extracted_json)
        print("Extracted fields:", data)
    except json.JSONDecodeError:
        print("Output is not valid JSON. Raw output:", extracted_json)
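In practice, models sometimes wrap their JSON in a Markdown code fence or omit a requested key, and a bare json.loads() call does not cover either case. The sketch below shows one way to handle both; parse_model_json is a hypothetical helper, and the list of required keys is whatever your prompt asked for.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker some models wrap JSON in


def parse_model_json(raw: str, required_keys) -> dict:
    """Parse a model reply into a dict and check that expected keys exist."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, including an optional language tag.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rstrip()
        if text.endswith(FENCE):
            text = text[: -len(FENCE)]

    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")

    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"Missing keys in model output: {missing}")
    return data
```

Since json.JSONDecodeError is a subclass of ValueError, callers can catch ValueError to handle malformed JSON and missing keys in one place.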
5. Prompt Templates for Approval Automation
Approval workflows often require the LLM to decide if a document meets certain criteria and to provide justification. Here are robust prompt templates for such scenarios, inspired by patterns from our deep dive on generative AI prompt engineering for approval workflow automation.
- Approval Decision Prompt Template:
    from datetime import date

    approval_prompt = f"""
    You are an automated approval assistant. Review the following document and determine if it should be approved based on these criteria:
    - The invoice amount is less than $10,000
    - The invoice date is within the last 90 days
    Today's date is {date.today().isoformat()}.
    Return your answer as JSON:
    {{
    "approved": true/false,
    "justification": "Explain your decision"
    }}
    Document:
    \"\"\"
    {doc_text}
    \"\"\"
    """
  The prompt states today's date because the model has no reliable way to know it, and the 90-day criterion depends on it.
- Send the approval prompt and parse results:
    import json
    from openai import OpenAI

    client = OpenAI(api_key=api_key)  # api_key was loaded from .env in step 1
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": "You are an approval automation assistant."},
            {"role": "user", "content": approval_prompt}
        ],
        temperature=0.0,
        max_tokens=300
    )
    approval_result = response.choices[0].message.content
    try:
        result = json.loads(approval_result)
        print(f"Approved: {result['approved']}\nJustification: {result['justification']}")
    except json.JSONDecodeError:
        print("Approval output is not valid JSON. Raw output:", approval_result)
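Both approval criteria are simple arithmetic, so it may be worth re-checking the model's decision in code and flagging disagreements for human review. The following sketch assumes the extracted fields are available, that the date is in YYYY-MM-DD form (your prompt would need to request that), and uses check_invoice_rules as a hypothetical function name.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def check_invoice_rules(total_amount, invoice_date, today,
                        max_amount=Decimal("10000"), max_age_days=90):
    """Apply the two approval criteria in code; return (approved, reasons)."""
    reasons = []

    # Accept values such as "$9,500.00" or 9500; anything else is a failure.
    try:
        cleaned = str(total_amount).replace("$", "").replace(",", "").strip()
        amount = Decimal(cleaned)
        if not amount.is_finite():
            raise InvalidOperation
    except InvalidOperation:
        amount = None
        reasons.append(f"Could not parse amount: {total_amount!r}")

    try:
        issued = date.fromisoformat(str(invoice_date))
    except ValueError:
        issued = None
        reasons.append(f"Could not parse date (expected YYYY-MM-DD): {invoice_date!r}")

    if amount is not None and amount >= max_amount:
        reasons.append(f"Amount {amount} is not less than {max_amount}")

    if issued is not None:
        age_days = (today - issued).days
        if age_days < 0:
            reasons.append("Invoice date is in the future")
        elif age_days > max_age_days:
            reasons.append(f"Invoice is {age_days} days old (limit {max_age_days})")

    return (not reasons, reasons)
```

Comparing this result with the "approved" value returned by the model gives a cheap consistency check: when the two disagree, route the document to a person instead of auto-approving it.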
6. Advanced Prompt Engineering Patterns
For more complex workflows, consider these enhancements:
- Few-shot prompting: Provide 1-2 example Q&A pairs to improve accuracy.
- Chain-of-Thought reasoning: Ask the model to explain its reasoning step by step before outputting a final answer.
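- Schema enforcement: Use response_format={"type": "json_object"} (if your LLM supports it) to force valid JSON output. With OpenAI models, this JSON mode guarantees syntactically valid JSON, not that the keys match your schema, and the prompt itself must still mention JSON.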
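Two of the patterns in the list above need a little plumbing in code. The sketch below is one possible approach, not a prescribed implementation: build_few_shot_messages (a hypothetical helper) places example inputs and outputs as user and assistant turns ahead of the real document, and extract_final_json (also hypothetical) pulls the last JSON object out of a chain-of-thought reply that contains reasoning text first. The sample invoice is made up for illustration.

```python
import json


def build_few_shot_messages(system_prompt, examples, document_prompt):
    """Return a chat message list with worked examples before the real input.

    examples is a list of (example_input, example_output) string pairs.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": document_prompt})
    return messages


def extract_final_json(response_text):
    """Return the last top-level JSON object found in a free-text reply."""
    decoder = json.JSONDecoder()
    last_object = None
    index = response_text.find("{")
    while index != -1:
        try:
            candidate, end = decoder.raw_decode(response_text, index)
        except json.JSONDecodeError:
            # Not valid JSON at this brace; try the next one.
            index = response_text.find("{", index + 1)
            continue
        last_object = candidate
        # Skip past the decoded object so nested objects are not revisited.
        index = response_text.find("{", end)
    if last_object is None:
        raise ValueError("No JSON object found in the model response")
    return last_object


# Made-up example pair for illustration only.
examples = [
    ("Invoice INV-001 from Acme Ltd, dated 2024-01-15, total $120.00",
     '{"invoice_number": "INV-001", "invoice_date": "2024-01-15", '
     '"vendor_name": "Acme Ltd", "total_amount": "120.00"}'),
]
messages = build_few_shot_messages(
    "You are a helpful assistant for document data extraction.",
    examples,
    "Invoice INV-002 from Globex, dated 2024-02-01, total $75.50",
)
```

The resulting messages list can be passed as the messages argument of a chat completion call. Keeping examples short matters, since every example is sent with every request.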
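A chain-of-thought version of the approval prompt looks like this. Because the model writes its reasoning before the JSON object, the reply is no longer pure JSON: parse only the trailing JSON object rather than passing the whole reply to json.loads(), and do not combine this prompt with response_format={"type": "json_object"}.
    from datetime import date

    cot_approval_prompt = f"""
    You are an approval automation assistant. Review the following document and determine if it should be approved based on these criteria:
    - The invoice amount is less than $10,000
    - The invoice date is within the last 90 days
    Today's date is {date.today().isoformat()}.
    First, explain your reasoning step by step. Then, return your answer as JSON:
    {{
    "approved": true/false,
    "justification": "Your reasoning here"
    }}
    Document:
    \"\"\"
    {doc_text}
    \"\"\"
    """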
7. Testing and Evaluating Your Prompts
- Prepare a test corpus: Gather a set of real-world sample documents (invoices, contracts, etc.).
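- Automate prompt testing:
    test_docs = ["sample_invoice1.pdf", "sample_invoice2.pdf"]
    for doc_path in test_docs:
        doc_text = extract_pdf_text(doc_path)
        # Rebuild the prompt for every document. The f-string from step 3 was
        # filled in when it was defined, so calling .format() on it would not
        # pick up the new doc_text.
        prompt = build_extraction_prompt(doc_text)  # hypothetical function wrapping the step 3 template
        # ...send to API and evaluate results...
- Evaluate accuracy: Compare extracted fields or approval decisions to ground truth data.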
- Iterate on prompts: Adjust instructions, add examples, or clarify schema as needed.
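The evaluation step above can be reduced to a per-field accuracy score, which makes it easier to see whether a prompt change helped. This is a minimal sketch that assumes exact matching after trimming whitespace and lowercasing; field_accuracy is a hypothetical name and the sample data is invented.

```python
def _normalize(value):
    """Compare values case-insensitively and ignore surrounding whitespace."""
    if value is None:
        return None
    return str(value).strip().lower()


def field_accuracy(predictions, ground_truth):
    """Return {field: fraction of documents where the prediction matched}.

    Both arguments map a document id to a dict of field values. A document
    missing from predictions counts as wrong for every expected field.
    """
    totals = {}
    correct = {}
    for doc_id, expected in ground_truth.items():
        predicted = predictions.get(doc_id, {})
        for field, expected_value in expected.items():
            totals[field] = totals.get(field, 0) + 1
            if _normalize(predicted.get(field)) == _normalize(expected_value):
                correct[field] = correct.get(field, 0) + 1
    return {field: correct.get(field, 0) / totals[field] for field in totals}


# Invented sample data for illustration.
ground_truth = {
    "inv1": {"invoice_number": "A-1", "total_amount": "100.00"},
    "inv2": {"invoice_number": "B-2", "total_amount": "250.00"},
}
predictions = {
    "inv1": {"invoice_number": "a-1 ", "total_amount": "100.00"},
    "inv2": {"invoice_number": "B-2", "total_amount": "205.00"},
}
scores = field_accuracy(predictions, ground_truth)
```

Exact matching is strict for dates and amounts written in different formats, so a real harness would likely normalize those fields before comparing.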
Common Issues & Troubleshooting
- Model returns unstructured or partial output:
  - Use explicit instructions: “Return your answer as a valid JSON object with these keys...”
  - Set temperature=0.0 for more deterministic results.
- Invalid JSON returned:
  - Post-process output with json.loads() and handle exceptions.
  - Try schema enforcement or few-shot examples.
- Hallucinated data (fields invented):
  - Instruct the model: “If a field is missing, set its value to null.”
- API rate limits or timeouts:
  - Implement retry logic and exponential backoff.
- Extraction accuracy varies by document type:
  - Segment your prompts by document type, or use model fine-tuning for high-volume use cases.
- Security and compliance:
  - Never send sensitive documents to public APIs without proper encryption and compliance checks. See our guide to AI compliance in regulated industries.
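For the rate-limit and timeout item above, retry logic with exponential backoff can be sketched as a small wrapper. This is an illustrative helper, not part of the openai package: call_with_retries and its parameters are assumptions, and the delays double on each attempt with a little random jitter added.

```python
import random
import time


def call_with_retries(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying with exponential backoff when it raises.

    retry_on limits which exceptions trigger a retry; sleep is injectable
    so the wait can be skipped in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # 1x, 2x, 4x ... the base delay, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

To use it with the API call from step 4, wrap the request in a zero-argument function or lambda and narrow retry_on to transient errors, for example openai.RateLimitError and openai.APITimeoutError in version 1.x of the openai package, so that genuine bugs are not retried.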
Next Steps
You’ve now built a foundation for prompt engineering in Document AI—covering both extraction and approval automation, with practical templates and troubleshooting strategies. For more advanced workflows, explore chaining multiple prompts, integrating with RAG (Retrieval-Augmented Generation), and leveraging specialized document AI platforms (see: AI Document Workflow Tools: A 2026 Buyer’s Guide). For a broader strategic perspective, revisit our pillar article on automating complex document workflows with AI.
To deepen your expertise, check out related guides on prompt engineering for real-time incident response workflows and our explainer on extracting data from unstructured documents with AI-powered workflow solutions.
Prompt engineering is an iterative process. Test, refine, and adapt your templates as your document workflows evolve—and stay tuned for more playbooks from Tech Daily Shot.