# Legal Infrastructure Organization

## Directory Structure

```
~/Legal/
├── bin/                           # All scripts and tools
│   ├── email/                     # Email processing tools
│   │   ├── o365_downloader.py     # O365 Graph API downloader
│   │   ├── process_emails.py      # Email processing pipeline
│   │   └── verify_pipeline.py     # Pipeline verification
│   └── utils/                     # Utility scripts
│       └── verify_tracking.py     # Tracking verification
├── config/                        # Configuration files
│   ├── o365_config.json           # O365 API configuration
│   └── pipeline_config.json       # Pipeline configuration
├── data/                          # Working data directory
│   ├── current/                   # Current processing
│   └── archive/                   # Archived data
└── evidence/                      # Evidence structure (existing)
    ├── email_evidence/            # Email evidence (existing)
    ├── pdf_extractions/           # PDF extractions (existing)
    └── unified_evidence/          # Unified evidence (existing)
```

## Key Files

1. Infrastructure:
- INFRASTRUCTURE.md (this file) - Central documentation
- STATUS.md - Current processing status
- TRACKING.md - Tracking system documentation

2. Scripts:
- bin/email/o365_downloader.py - Email download (from sessions)
- bin/email/process_emails.py - Email processing
- bin/utils/verify_tracking.py - Tracking verification

3. Configuration:
- config/o365_config.json - API configuration
- config/pipeline_config.json - Pipeline settings

## Key Commands

1. Email Processing:
```bash
# Download new emails
./bin/email/o365_downloader.py

# Process downloaded emails
./bin/email/process_emails.py

# Verify pipeline status
./bin/email/verify_pipeline.py
```

2. Tracking Verification:
```bash
# Verify tracking status
./bin/utils/verify_tracking.py

# Check specific email
./bin/utils/verify_tracking.py --email <email_id>
```
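The internals of `verify_tracking.py` are not documented here. As a rough illustration only, the `--email <email_id>` option shown above could be parsed with `argparse` along these lines. The function names `build_parser` and `describe_scope` are hypothetical and do not come from the actual script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the two invocations shown above (sketch)."""
    parser = argparse.ArgumentParser(
        prog="verify_tracking.py",
        description="Verify tracking status (illustrative sketch).",
    )
    parser.add_argument(
        "--email",
        metavar="EMAIL_ID",
        default=None,
        help="check a single email instead of the whole tracking set",
    )
    return parser


def describe_scope(argv: list[str]) -> str:
    """Return a short description of what would be verified."""
    args = build_parser().parse_args(argv)
    if args.email is None:
        return "all emails"
    return f"email {args.email}"
```

Called with no arguments, this sketch covers every tracked email; with `--email`, it narrows the check to one ID.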

## Status Tracking

The system tracks status at three levels:

1. Email Level:
- Download status
- Processing status
- Attachment extraction
- Markdown conversion
- Frontmatter completion

2. Attachment Level:
- Extraction status
- OCR completion
- PDF processing
- Quality metrics

3. Overall Status:
- Total emails processed
- Processing quality
- Coverage metrics
- System health
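The levels above could be modelled as nested records. The following is a minimal sketch, assuming one boolean flag per tracked step. The class and field names are hypothetical, and quality metrics are left out because their definition is not given here.

```python
from dataclasses import dataclass, field


@dataclass
class AttachmentStatus:
    """Attachment-level flags (hypothetical field names)."""
    extracted: bool = False
    ocr_complete: bool = False
    pdf_processed: bool = False

    def is_complete(self) -> bool:
        return self.extracted and self.ocr_complete and self.pdf_processed


@dataclass
class EmailStatus:
    """Email-level flags plus the attachments that belong to the email."""
    email_id: str
    downloaded: bool = False
    processed: bool = False
    attachments_extracted: bool = False
    markdown_converted: bool = False
    frontmatter_complete: bool = False
    attachments: list[AttachmentStatus] = field(default_factory=list)

    def is_complete(self) -> bool:
        email_done = all([
            self.downloaded,
            self.processed,
            self.attachments_extracted,
            self.markdown_converted,
            self.frontmatter_complete,
        ])
        return email_done and all(a.is_complete() for a in self.attachments)


def coverage(emails: list[EmailStatus]) -> float:
    """Overall level: fraction of emails whose every step has finished."""
    if not emails:
        return 0.0
    return sum(e.is_complete() for e in emails) / len(emails)
```

Keeping the overall figures as a function of the per-email records, rather than as separate counters, means the summary cannot drift out of sync with the detail.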

## Continuous Integration

Key processes:

1. Email Download:
- Daily check for new emails
- Deduplication against existing
- Backup creation

2. Processing Pipeline:
- Email body extraction
- Attachment processing
- Metadata enhancement
- Legal document creation

3. Quality Control:
- Automated verification
- Quality metrics
- Coverage checking
- Error detection
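The deduplication step in the download process can be sketched as a set-membership check. This assumes each downloaded message is a dict with a stable `"id"` key and that the IDs of existing emails are available as a collection. Both are assumptions for illustration, not a description of what `o365_downloader.py` actually does.

```python
from typing import Iterable


def deduplicate(new_messages: Iterable[dict], existing_ids: Iterable[str]) -> list[dict]:
    """Return only messages whose ID has not been seen, preserving order."""
    seen = set(existing_ids)
    fresh = []
    for message in new_messages:
        message_id = message["id"]  # hypothetical key name
        if message_id in seen:
            continue  # already stored, or repeated within this batch
        seen.add(message_id)
        fresh.append(message)
    return fresh
```

Adding each accepted ID to `seen` also filters duplicates within a single batch, not just against emails that were already stored.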

## Current Status

The current status is always available in STATUS.md, which the scripts update.
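How the scripts write STATUS.md is not specified. One plausible approach, sketched here with hypothetical function names and an assumed layout for the file, is to render the summary as text and then replace the file in one step so readers never see a half-written status.

```python
from pathlib import Path


def render_status(total: int, complete: int, errors: int, updated: str) -> str:
    """Render a small Markdown status summary (layout is an assumption)."""
    coverage = (complete / total * 100) if total else 0.0
    lines = [
        "# Status",
        "",
        f"Last updated: {updated}",
        "",
        f"- Total emails: {total}",
        f"- Fully processed: {complete} ({coverage:.1f}%)",
        f"- Errors: {errors}",
    ]
    return "\n".join(lines) + "\n"


def write_status(path: Path, text: str) -> None:
    """Write to a temporary sibling file, then swap it into place."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(text, encoding="utf-8")
    tmp.replace(path)
```

For example, `write_status(Path("STATUS.md"), render_status(200, 150, 3, "2024-01-01"))` would produce a file reporting 75.0% of emails fully processed.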