
AWS Bedrock Agent: Intelligent PDF to Markdown Converter🔗︎

A production-ready serverless application that leverages AWS Bedrock Agents with Claude AI to convert PDF documents into structured Markdown with intelligent image positioning and bulk processing capabilities.

Project Overview🔗︎

This project demonstrates how to build a sophisticated document processing pipeline using AWS Bedrock Agents, Lambda functions, and Claude AI. The system automatically extracts text and images from PDF files, analyses their spatial relationships, and generates high-quality Markdown output with contextually positioned images.

Using AI to build this AI solution

This project showcases an interesting meta-application of AI: using Claude Code in VS Code to build an AI-powered document processing system. The development process itself demonstrates semi-agentic AI assistance in action.

The Development Partnership: As an Azure-focused developer with limited AWS experience, I relied heavily on Claude Code's autonomous capabilities to navigate the AWS ecosystem. Claude Code acted as both a knowledgeable AWS consultant and a hands-on development partner, executing commands and analysing results in real-time.

Semi-Agentic Development in Action:

  • AWS CLI Operations: Claude Code autonomously executed AWS CLI commands to discover available Bedrock models, check account permissions, and configure service access
  • Model Discovery: When the initial Claude model IDs failed, Claude Code systematically explored available inference profiles, eventually discovering the correct model path: us.anthropic.claude-sonnet-4-20250514-v1:0
  • Configuration Troubleshooting: Claude Code automatically ran diagnostic commands to resolve authentication issues, check IAM permissions, and validate service configurations
  • Code Generation: Real-time creation of Lambda functions, Docker configurations, and deployment scripts based on iterative testing and feedback
  • Error Resolution: When faced with runtime errors, Claude Code autonomously analysed CloudWatch logs, identified issues, and proposed code fixes
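
The inference profile ID surfaced by that discovery process has a predictable structure: a cross-region prefix, a vendor, a model name, and a version. The parsing helper below is a hypothetical illustration (not part of the project, which used the AWS CLI for discovery), just to make the ID's anatomy explicit:

```python
def parse_inference_profile_id(profile_id: str) -> dict:
    """Split a Bedrock cross-region inference profile ID into its parts.

    Hypothetical helper for illustration only; the project discovered
    the ID via AWS CLI rather than parsing it in code.
    """
    model_part, _, version = profile_id.rpartition(":")
    region_prefix, vendor, model = model_part.split(".", 2)
    return {
        "region_prefix": region_prefix,  # e.g. "us" for US cross-region routing
        "vendor": vendor,                # e.g. "anthropic"
        "model": model,                  # e.g. "claude-sonnet-4-20250514-v1"
        "version": version,              # e.g. "0"
    }

parts = parse_inference_profile_id("us.anthropic.claude-sonnet-4-20250514-v1:0")
```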

Platform Choice: AWS vs Azure: The decision to use AWS instead of my familiar Azure platform was driven by AI model availability. At the time of development, Azure's AI services were heavily based on OpenAI's ChatGPT models, whilst AWS Bedrock offered direct access to Anthropic's Claude family - specifically Claude Sonnet 4, which provided superior document analysis capabilities.

Interestingly, Microsoft and OpenAI are evolving their partnership away from the exclusive relationship model, with Microsoft diversifying to include Anthropic's Claude models in Office 365 applications. This shift may significantly change the Azure AI landscape in the future.

The Human-AI Development Flow:

  1. Problem Definition: I described the goal (PDF to Markdown conversion)
  2. Architecture Exploration: Claude Code suggested AWS Bedrock Agents and researched the implementation approach
  3. Hands-on Implementation: Claude Code wrote code, executed AWS commands, and debugged issues autonomously
  4. Iterative Refinement: Together we refined the solution through multiple deployment and testing cycles
  5. Production Optimisation: Claude Code implemented bulk processing, error handling, and performance improvements

Version Control Strategy: To maintain clarity during the rapid iterative development process, I implemented a simple but effective version control approach: each Lambda function deployment included a hardcoded version number that Claude Code would automatically increment with every code change. This seemingly simple practice proved invaluable:

  • Deployment Verification: Instant confirmation that the correct code version was running in AWS
  • Communication Clarity: Easy reference to specific iterations during troubleshooting ("version 0.2.8 had the timeout issue")
  • Development Tracking: Clear progression through the evolution from basic PDF processing to full bulk operations
  • Rollback Identification: Quick identification of which version to revert to when issues arose

This human-directed, AI-executed versioning approach exemplified the collaborative nature of the development process - strategic oversight combined with autonomous implementation.
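
A minimal sketch of that versioning convention (the names here are illustrative, not the project's actual code): a hardcoded module-level constant logged on every invocation, plus the patch-bump rule applied on each change:

```python
VERSION = "0.3.2"  # hardcoded; bumped on every deployment

def bump_patch(version: str) -> str:
    """Increment the patch component, e.g. "0.2.8" -> "0.2.9"."""
    major, minor, patch = version.split(".")
    return f"{major}.{minor}.{int(patch) + 1}"

def lambda_handler(event, context):
    # Logging the version first makes deployment verification instant:
    # the CloudWatch log line confirms which build is actually running.
    print(f"PDF processor version {VERSION}")
    ...
```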

The Crucial Role of Context7 MCP: A critical component of this development success was the Context7 MCP (Model Context Protocol) integration. Context7 served as Claude Code's "fact-checker" and up-to-date documentation source, preventing the common AI pitfall of working with outdated information.

How Context7 Kept Development on Track:

  • Real-time Documentation: Provided current AWS Bedrock API specifications and model availability
  • Version Verification: Ensured PyMuPDF installation commands matched the latest best practices
  • Parameter Validation: Confirmed correct AWS CLI syntax and Lambda configuration options
  • Dependency Management: Verified compatible Python package versions for Lambda runtime

MCP Configuration: Context7 MCP is configured in %userprofile%\.claude.json (not the .claude directory):

%userprofile%\.claude.json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"],
      "env": {
        "CONTEXT7_API_KEY": "optional-for-basic-usage"
      }
    }
  }
}

Key Learning: This project demonstrates how AI assistants can serve as domain experts in unfamiliar technology stacks, effectively acting as both consultant and implementation partner. The semi-agentic nature of Claude Code - autonomously executing commands whilst maintaining human oversight - combined with Context7's real-time documentation access, proved invaluable for cross-platform development.

Production Status

✅ PRODUCTION READY - Version 0.3.2 deployed and operational

  • Real PDF text and image extraction using PyMuPDF
  • Intelligent image positioning with Claude AI analysis
  • Bulk processing with smart skip logic
  • Conversational interface with progress indicators
  • 15-minute timeout, 1GB memory, comprehensive error handling

Architecture Overview🔗︎

System Components🔗︎

![AWS Architecture Diagram Placeholder] Screenshot needed: AWS Console showing Bedrock Agent with action groups, model configuration, and Lambda function integration

The solution consists of several integrated AWS services:

  • Amazon Bedrock Agent: Orchestrates the conversion workflow with conversational interface
  • AWS Lambda: Serverless compute for PDF processing (Python 3.12)
  • Amazon S3: Storage for input PDFs and output Markdown files
  • Amazon Bedrock: Claude AI model access for intelligent content analysis
  • Amazon CloudWatch: Logging and monitoring

Data Flow🔗︎

graph TD
    A[User Request] --> B[Bedrock Agent]
    B --> C[Lambda Function]
    C --> D[S3 Input Bucket]
    D --> E[PyMuPDF Processing]
    E --> F[Claude AI Analysis]
    F --> G[Markdown Generation]
    G --> H[S3 Output Bucket]
    H --> I[Optional Zip Creation]
    I --> J[User Response]

Key Features🔗︎

🔄 Intelligent PDF Processing🔗︎

The system uses PyMuPDF for real PDF text and image extraction, going beyond simple text parsing:

import fitz  # PyMuPDF

def extract_pdf_content_with_pymupdf(pdf_bytes):
    """Extract text and images from PDF with positional data"""
    doc = fitz.open(stream=pdf_bytes, filetype="pdf")

    all_text = ""
    images = []

    for page_num in range(len(doc)):
        page = doc.load_page(page_num)

        # Extract text with positioning
        text = page.get_text()
        all_text += f"\n--- Page {page_num + 1} ---\n{text}\n"

        # Extract images with context (full implementation in the
        # Image Processing Pipeline section below)
        image_list = page.get_images()
        for img_index, img in enumerate(image_list):
            images.append((page_num, img_index, img))

    doc.close()
    return all_text, images

🧠 Claude AI Integration🔗︎

The extracted content is sent to Claude Sonnet 4 for intelligent analysis:

Model Configuration

  • Model: Claude Sonnet 4 (inference profile)
  • API Version: bedrock-2023-05-31
  • Processing Time: ~2-5 minutes per PDF
import json
import boto3

def process_with_claude(content, images_info, output_filename):
    """Send content to Claude for intelligent Markdown conversion"""
    bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')

    prompt = f"""
    Convert this PDF content to professional Markdown format.

    Content: {content}
    Images: {images_info}

    Requirements:
    - Analyze spatial relationships between text and images
    - Position image references contextually in the Markdown
    - Create descriptive alt text based on surrounding content
    - Maintain logical document flow
    """

    response = bedrock_runtime.invoke_model(
        modelId="us.anthropic.claude-sonnet-4-20250514-v1:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 8192,
            "messages": [{"role": "user", "content": prompt}]
        })
    )
    return json.loads(response["body"].read())["content"][0]["text"]

📦 Bulk Processing Capabilities🔗︎

Smart Processing Features

  • Auto-discovery: Finds all PDFs in specified S3 paths
  • Skip logic: Avoids reprocessing existing files
  • Batch limits: Max 10 files, 100MB total, 200 pages
  • Error resilience: Continues processing if individual files fail
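
The skip logic and batch limits above can be sketched roughly as follows (function and field names are assumptions for illustration; the deployed code may differ):

```python
MAX_FILES = 10
MAX_TOTAL_BYTES = 100 * 1024 * 1024  # 100MB batch limit

def select_pdfs_for_processing(candidates, existing_outputs):
    """Apply skip logic and batch limits to a list of discovered PDFs.

    candidates: list of dicts like {"key": "a.pdf", "size": 12345}
    existing_outputs: set of PDF keys that already have Markdown output
    """
    selected, total_bytes = [], 0
    for pdf in candidates:
        if pdf["key"] in existing_outputs:
            continue  # skip logic: don't reprocess existing files
        if len(selected) >= MAX_FILES:
            break     # batch limit: max 10 files
        if total_bytes + pdf["size"] > MAX_TOTAL_BYTES:
            break     # batch limit: 100MB total
        selected.append(pdf)
        total_bytes += pdf["size"]
    return selected
```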

💬 Conversational Interface🔗︎

The Bedrock Agent provides an intuitive chat interface:

Example Conversation:

User: Process all PDFs in s3://pdf-input-bucket/
Agent: Found 3 PDF(s) to process. Would you like the output zipped?
       1. Yes - Create zip file
       2. No - Keep individual folders
User: Yes
Agent: ✅ Processed 3 PDFs. Zip created: s3://pdf-output-bucket/PDF2MD-bulk-20250121-143022.zip

Implementation Details🔗︎

Lambda Function Architecture🔗︎

![Lambda Configuration Screenshot Placeholder] Screenshot needed: AWS Lambda console showing function configuration with Runtime (Python 3.12), Memory (1024 MB), Timeout (15 min), and attached layers

Function Configuration:

  • Runtime: Python 3.12
  • Memory: 1GB
  • Timeout: 15 minutes (900 seconds)
  • Layer: Custom PyMuPDF layer (50MB)
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    """Main Lambda handler for Bedrock Agent events"""
    try:
        # Extract parameters from Bedrock Agent event
        agent_id = event.get('agent', {}).get('agentId')
        session_id = event.get('sessionId')
        input_text = event.get('inputText', '')

        # Process based on request type
        if 'bulk' in input_text.lower() or 's3://' in input_text:
            return process_pdf_batch(event, context)
        else:
            return process_single_pdf(event, context)

    except Exception as e:
        logger.error(f"❌ Error in lambda_handler: {str(e)}")
        return create_bedrock_response(f"Error processing request: {str(e)}")

PyMuPDF Integration🔗︎

The system uses a custom Lambda layer for PyMuPDF integration:

Lambda Layer Requirements

  • Size: 50MB PyMuPDF 1.26.4 compiled for AWS Lambda
  • Paths: /opt/python/ and /opt/python/lib/python3.12/site-packages/
  • Build: Docker-based compilation using AWS Lambda Python 3.12 base image
Creating the PyMuPDF Lambda Layer

PyMuPDF requires native libraries that must be compiled specifically for the AWS Lambda runtime environment. The path to the final Docker-based solution involved several failed attempts and important lessons about Lambda's constraints.

Initial Attempts and Why They Failed:

Before settling on the Docker approach, Claude Code first attempted to find pre-built PyMuPDF packages compatible with AWS Lambda:

  1. PyPI Wheel Search: Searched for existing wheels (1) compiled for Linux x86_64 that might work in Lambda's Amazon Linux environment
  2. AWS Lambda Layers: Looked for community-contributed layers containing PyMuPDF
  3. Conda Packages: Investigated conda-forge packages as an alternative source

The Size Problem: AWS Lambda has strict size limits that made PyMuPDF particularly challenging:

  • Layer Limit: 250MB uncompressed (50MB compressed)
  • PyMuPDF Dependencies: Includes large graphics libraries (MuPDF, FreeType, OpenJPEG)
  • Architecture Mismatch: Windows/Mac wheels wouldn't work on Lambda's Linux environment
  • Wheel Bloat: Standard PyMuPDF wheels often include unnecessary components for our PDF-only use case

Why Docker Became Necessary: After the initial approaches failed, Claude Code determined that building from source in Lambda's exact runtime environment was the only reliable solution:

  • Environment Matching: Docker uses the official AWS Lambda Python 3.12 base image
  • Dependency Control: Can exclude unnecessary components during compilation
  • Size Optimisation: Targeted installation to specific paths reduces bloat
  • Reproducible Builds: Ensures consistent results across different development machines

Here's the complete build process with the actual scripts used:

Build Files Overview:

  • requirements.txt - Python dependencies
  • build-layer.dockerfile - Docker configuration for Lambda environment
  • build-layer.bat - Windows batch script for automated building
  • create-layer.py - Python script alternative for cross-platform building

requirements.txt

requirements.txt
PyMuPDF==1.26.4

build-layer.dockerfile

build-layer.dockerfile
FROM public.ecr.aws/lambda/python:3.12

# Create the proper layer directory structure
RUN mkdir -p /opt/python/lib/python3.12/site-packages

# Install PyMuPDF with all dependencies to the correct location
RUN pip install --upgrade pip
RUN pip install PyMuPDF==1.26.4 -t /opt/python/lib/python3.12/site-packages/ --no-cache-dir

# Also install to /opt/python for compatibility
RUN pip install PyMuPDF==1.26.4 -t /opt/python/ --no-cache-dir

# Verify installation
RUN echo "=== /opt/python contents ===" && ls -la /opt/python/
RUN echo "=== /opt/python/lib/python3.12/site-packages contents ===" && ls -la /opt/python/lib/python3.12/site-packages/

# Create layer structure
CMD ["cp", "-r", "/opt/python", "/output/"]

build-layer.bat (Windows)

build-layer.bat
@echo off
echo Building PyMuPDF Lambda Layer using Docker...

REM Build the Docker image
docker build -t pymupdf-layer-builder -f build-layer.dockerfile .

REM Create container and copy layer files
docker create --name temp-container pymupdf-layer-builder
docker cp temp-container:/opt/python ./
docker rm temp-container

echo Creating layer ZIP file...
powershell -Command "Compress-Archive -Path python -DestinationPath pymupdf-layer.zip -Force"

echo PyMuPDF layer created: pymupdf-layer.zip
echo Layer size:
powershell -Command "(Get-Item pymupdf-layer.zip).Length / 1MB"
echo MB

echo Ready to deploy to AWS Lambda!

create-layer.py (Cross-platform Python)

create-layer.py
#!/usr/bin/env python3
"""
Create PyMuPDF Lambda Layer using a temporary container
"""
import os
import subprocess
import tempfile
import zipfile

def create_pymupdf_layer():
    print("Creating PyMuPDF Lambda layer...")

    # Create temporary directory
    with tempfile.TemporaryDirectory() as temp_dir:
        python_dir = os.path.join(temp_dir, "python")
        os.makedirs(python_dir)

        # Use Docker to install PyMuPDF for Linux
        dockerfile_content = """
FROM public.ecr.aws/lambda/python:3.12
RUN pip install PyMuPDF==1.26.4 -t /opt/python/
"""

        dockerfile_path = os.path.join(temp_dir, "Dockerfile")
        with open(dockerfile_path, "w") as f:
            f.write(dockerfile_content)

        # Build Docker image
        print("Building Docker image...")
        subprocess.run([
            "docker", "build", "-t", "pymupdf-builder", "-f", dockerfile_path, temp_dir
        ], check=True)

        # Extract layer files from container
        print("Extracting layer files...")
        subprocess.run([
            "docker", "create", "--name", "temp-pymupdf", "pymupdf-builder"
        ], check=True)

        subprocess.run([
            "docker", "cp", "temp-pymupdf:/opt/python/", python_dir
        ], check=True)

        subprocess.run([
            "docker", "rm", "temp-pymupdf"
        ], check=True)

        # Create ZIP file
        print("Creating ZIP file...")
        zip_path = os.path.join(os.getcwd(), "pymupdf-layer.zip")

        with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
            for root, dirs, files in os.walk(python_dir):
                for file in files:
                    file_path = os.path.join(root, file)
                    arc_path = os.path.relpath(file_path, temp_dir)
                    zipf.write(file_path, arc_path)

        # Check file size
        size_mb = os.path.getsize(zip_path) / (1024 * 1024)
        print(f"Layer created: {zip_path}")
        print(f"Size: {size_mb:.2f} MB")

        if size_mb > 250:
            print("WARNING: Layer exceeds 250MB limit!")
        else:
            print("Layer size is within AWS Lambda limits.")

        return zip_path

if __name__ == "__main__":
    create_pymupdf_layer()

Build Process:

  1. Using Windows Batch Script:

    # Simply run the batch file
    ./build-layer.bat
    

  2. Using Python Script (Cross-platform):

    # Run the Python script
    python create-layer.py
    

  3. Manual Docker Commands:

    # Build the Docker image
    docker build -t pymupdf-layer-builder -f build-layer.dockerfile .

    # Extract layer files
    docker create --name temp-container pymupdf-layer-builder
    docker cp temp-container:/opt/python ./
    docker rm temp-container

    # Create ZIP file (Linux/Mac)
    zip -r pymupdf-layer.zip python/

    # Create ZIP file (Windows PowerShell)
    Compress-Archive -Path python -DestinationPath pymupdf-layer.zip -Force

Upload to AWS Lambda: The resulting pymupdf-layer.zip file (approximately 50MB) is uploaded to AWS Lambda:

  1. Navigate to AWS Lambda → Layers → Create layer
  2. Upload the pymupdf-layer.zip file
  3. Set compatible runtimes to Python 3.12
  4. Attach the layer to the PDF processing Lambda function

Why This Approach?

  • Native Compatibility: Uses AWS Lambda's exact runtime environment
  • Automated Process: Scripts handle the complex Docker operations
  • Size Optimisation: Dual-path installation ensures compatibility while staying under limits
  • Reproducible: Version-locked dependencies ensure consistent builds

(1) Wheels are pre-compiled Python packages that include binary dependencies. They're faster to install than source distributions because they don't require compilation, but they must match the target platform's architecture and operating system exactly.

Image Processing Pipeline🔗︎

import base64
import logging

import fitz  # PyMuPDF

logger = logging.getLogger()

def extract_and_process_images(page, page_num, base_filename):
    """Extract images with contextual positioning"""
    images = []
    image_list = page.get_images()

    for img_index, img in enumerate(image_list):
        try:
            # Get image data (the document is reached via the page's parent)
            xref = img[0]
            pix = fitz.Pixmap(page.parent, xref)

            # Convert CMYK to RGB if needed
            if pix.n - pix.alpha < 4:
                img_data = pix.tobytes("png")
            else:
                pix1 = fitz.Pixmap(fitz.csRGB, pix)
                img_data = pix1.tobytes("png")
                pix1 = None

            # Generate descriptive filename
            img_filename = f"{base_filename}_image_{img_index+1:03d}_page{page_num+1}.png"

            images.append({
                'filename': img_filename,
                'data': base64.b64encode(img_data).decode(),
                'page': page_num + 1,
                'position': img_index
            })

        except Exception as e:
            logger.warning(f"⚠️ Could not extract image {img_index}: {str(e)}")

    return images

Setup and Configuration🔗︎

Prerequisites🔗︎

AWS Requirements

  • AWS Account with appropriate permissions
  • Bedrock service access in us-east-1 region
  • Claude model access (requires separate request)
  • S3 buckets for input and output

Why us-east-1?

The us-east-1 region is chosen because it has the best availability of newer Claude models and Bedrock features compared to other regions. While this may introduce slightly higher latency for users outside North America, the access to latest AI capabilities outweighs the marginal performance difference for this use case.

1. S3 Bucket Configuration🔗︎

![S3 Bucket Configuration Screenshot Placeholder] Screenshot needed: S3 console showing both input and output buckets with bucket policies, permissions tab, and access control lists

Create two S3 buckets:

  • Input bucket: pdf-input-bucket
  • Output bucket: pdf-output-bucket
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::YOUR-ACCOUNT-ID:role/lambda-execution-role"
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR-INPUT-BUCKET/*",
                "arn:aws:s3:::YOUR-OUTPUT-BUCKET/*"
            ]
        }
    ]
}

2. Lambda Function Deployment🔗︎

# Package and deploy Lambda function
cd aws/lambda/pdf-processing-function/
zip -r function.zip .
aws lambda update-function-code \
    --function-name pdf-processing-function \
    --zip-file fileb://function.zip

3. Bedrock Agent Configuration🔗︎

![Bedrock Agent Configuration Screenshot Placeholder] Screenshot needed: Bedrock console showing Agent details page with model selection (Claude Sonnet 4), action groups configured, and instructions panel

Agent Configuration:

  • Model: Claude Sonnet 4
  • Instructions: Custom instructions for PDF processing
  • Action Groups: Lambda function integration
  • Knowledge Base: Optional for enhanced context
Agent Instructions: |
  You are a PDF to Markdown conversion specialist. Your role is to:
  1. Accept PDF file paths from users
  2. Process PDFs through the Lambda function
  3. Provide conversational feedback on processing status
  4. Offer zip options for bulk processing
  5. Handle errors gracefully with helpful messages

4. IAM Permissions🔗︎

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR-INPUT-BUCKET",
                "arn:aws:s3:::YOUR-INPUT-BUCKET/*",
                "arn:aws:s3:::YOUR-OUTPUT-BUCKET",
                "arn:aws:s3:::YOUR-OUTPUT-BUCKET/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        }
    ]
}

Usage Examples🔗︎

Single PDF Processing🔗︎

# Upload PDF to input bucket
aws s3 cp document.pdf s3://pdf-input-bucket/

# Process via Bedrock Agent
"Process PDF at s3://pdf-input-bucket/document.pdf and save to pdf-output-bucket"

Bulk Processing🔗︎

# Upload multiple PDFs
aws s3 sync ./documents/ s3://pdf-input-bucket/

# Process all PDFs
"Process all PDFs in s3://pdf-input-bucket/ and save to pdf-output-bucket"

Output Structure🔗︎

The system generates organised output:

pdf-output-bucket/
├── document-name/
│   ├── document-name.md          # Main Markdown file
│   └── images/                   # Extracted images
│       ├── document-name_image_001_page1.png
│       ├── document-name_image_002_page2.png
│       └── ...
└── PDF2MD-bulk-20250121-143022.zip  # Optional bulk zip
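
A small sketch of how those S3 keys could be derived from a source PDF name (a hypothetical helper, written to mirror the layout shown above; not the project's actual code):

```python
import os

def output_keys_for(pdf_key: str, image_pages: list) -> dict:
    """Build the output key layout for one converted PDF.

    image_pages: page number for each extracted image, in order.
    """
    name = os.path.splitext(os.path.basename(pdf_key))[0]
    return {
        "markdown": f"{name}/{name}.md",
        "images": [
            f"{name}/images/{name}_image_{i + 1:03d}_page{page}.png"
            for i, page in enumerate(image_pages)
        ],
    }
```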

Performance and Limitations🔗︎

Performance Metrics🔗︎

Production Performance

  • Processing Time: 2-5 minutes per PDF
  • Throughput: Up to 10 PDFs per batch
  • Size Limits: 100MB total batch size
  • Page Limits: 200 pages estimated per batch
  • Memory: 1GB Lambda allocation
  • Timeout: 15 minutes maximum

Current Limitations🔗︎

Known Limitations

  • Complex Layouts: Multi-column layouts simplified to linear flow
  • Table Processing: Tables converted to simple Markdown format
  • Font Information: Font styles not preserved in output
  • Vector Graphics: Only raster images extracted
  • File Size: Large PDFs may timeout (>50MB individual files)

Monitoring and Troubleshooting🔗︎

CloudWatch Logging🔗︎

![CloudWatch Logs Screenshot Placeholder] Screenshot needed: CloudWatch Logs console showing log stream with emoji progress indicators (🔍 Step 1/5, 📥 Step 2/5, etc.) and processing details

The system provides detailed logging with emoji indicators:

logger.info("🔍 Step 1/5: Discovering PDFs to process...")
logger.info("📥 Step 2/5: Downloading and extracting PDF content...")
logger.info("🧠 Step 3/5: Processing content with Claude AI...")
logger.info("💾 Step 4/5: Saving Markdown and images to S3...")
logger.info("📦 Step 5/5: Creating zip file (if requested)...")

Common Issues and Solutions🔗︎

Troubleshooting Guide

Issue: Model access denied
Solution: Ensure Claude model access is granted in Bedrock console

Issue: Lambda timeout
Solution: Reduce batch size or increase timeout limit

Issue: Image extraction fails
Solution: Check PDF format compatibility with PyMuPDF

Issue: Memory errors
Solution: Process smaller batches or increase Lambda memory

Future Enhancements🔗︎

Planned Web Interface🔗︎

Roadmap: Web Interface Development

The next major enhancement involves creating a modern web interface to replace the Bedrock Agent chat interface:

  • Modern UI: Drag & drop upload with progress indicators
  • Real-time Progress: WebSocket updates for processing status
  • Batch Management: Visual controls for bulk operations
  • Authentication: Entra ID OIDC integration
  • Mobile Support: Responsive design for all devices

Technical Architecture for Web Interface🔗︎

graph TD
    A[React/Vue.js Frontend] --> B[API Gateway]
    B --> C[Lambda Functions]
    C --> D[Existing PDF Processor]
    C --> E[WebSocket API]
    E --> F[Real-time Progress]
    A --> G[Entra ID OIDC]
    G --> H[JWT Validation]

Cost Analysis🔗︎

AWS Service Costs🔗︎

Estimated Monthly Costs (100 PDFs/month)

  • Lambda: ~$5-10 (execution time and memory)
  • Bedrock: ~$20-30 (Claude model usage)
  • S3: ~$1-2 (storage and requests)
  • CloudWatch: ~$1-2 (logging)
  • Total: ~$27-44/month
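
As a rough back-of-envelope check on those figures (all unit rates below are placeholder assumptions for illustration, not quoted AWS pricing):

```python
def estimate_monthly_cost(pdfs_per_month: int,
                          lambda_cost_per_pdf: float = 0.07,
                          bedrock_cost_per_pdf: float = 0.25,
                          storage_and_logs: float = 3.0) -> float:
    """Toy cost model: per-PDF compute and model costs plus flat overhead.

    All rates are illustrative placeholders; check the AWS pricing
    pages for real numbers before budgeting.
    """
    return pdfs_per_month * (lambda_cost_per_pdf + bedrock_cost_per_pdf) + storage_and_logs

# With these placeholder rates, 100 PDFs/month lands inside the estimated band
monthly = estimate_monthly_cost(100)
```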

Cost Optimisation Tips🔗︎

  • Use S3 lifecycle policies for old outputs
  • Implement CloudWatch log retention policies
  • Monitor Bedrock token usage
  • Consider Reserved Capacity for high volume

Security Considerations🔗︎

Data Protection🔗︎

Security Best Practices

  • No Hardcoded Credentials: Uses IAM roles exclusively
  • Encrypted Storage: S3 buckets use server-side encryption
  • Access Logging: All API calls logged to CloudTrail
  • Network Security: Lambda runs in VPC if needed
  • Content Validation: PDF malware scanning recommended

Access Control🔗︎

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::YOUR-INPUT-BUCKET/*",
            "Condition": {
                "StringNotEquals": {
                    "aws:PrincipalServiceName": "lambda.amazonaws.com"
                }
            }
        }
    ]
}

Conclusion🔗︎

This AWS Bedrock Agent project demonstrates the power of combining serverless computing with AI services to create intelligent document processing solutions. The system successfully processes PDFs with real text extraction, intelligent image positioning, and bulk processing capabilities, all wrapped in a conversational interface.

The production-ready implementation provides a solid foundation for enterprise document processing workflows, with clear paths for enhancement through web interfaces and additional AI capabilities.

Next Steps

  1. Deploy the web interface for improved user experience
  2. Add OCR capabilities for scanned documents
  3. Implement table recognition for better data extraction
  4. Add support for additional formats (Word, PowerPoint)
  5. Integrate with enterprise systems via APIs

This project showcases the integration of AWS Bedrock Agents, Lambda functions, and Claude AI to create a sophisticated document processing pipeline that balances automation with intelligent content analysis.

Meta-Documentation Note

In a delightful case of AI recursion, this entire article was written by Claude Code using the project's own CLAUDE.md context file as source material. With just a little human guidance (and the occasional "fix the spelling mistakes" reminder), Claude Code transformed a technical development log into comprehensive project documentation.

It's rather fitting that an AI assistant wrote the documentation for an AI-powered document processing system - we've essentially created an AI that helps write about AI that processes documents! 🤖📄✨

