Architecture

Pillow’s architecture is designed for scalability, performance, and reliability. This page provides an overview of the system design and component interactions.

Pillow follows a microservices architecture with clear separation of concerns:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Pillow App    │    │    Mill API     │    │  Load Balancer  │
│   (Next.js)     │────│ (Go + Postgres) │────│                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Data Layer    │    │   Cache Layer   │    │    Messaging    │
│  (PostgreSQL)   │    │     (Redis)     │    │   (Redpanda)    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Connectors    │    │    External     │    │   Monitoring    │
│ (Data Sources)  │    │      APIs       │    │    & Logging    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

Mill API

  • Language: Go 1.24+
  • Database: PostgreSQL with PostGIS
  • Purpose: Core API server and business logic
  • Features:
    • RESTful API endpoints
    • Authentication & authorization (JWT/service tokens)
    • Data validation and processing
    • Rate limiting and security
    • Health monitoring
    • OpenAPI 3.0 documentation

Pillow App

  • Language: TypeScript/Next.js 14
  • Purpose: Modern web application for property search and exploration
  • Features:
    • Property search and filtering
    • Interactive maps and visualizations
    • Market analytics and insights
    • Responsive, mobile-optimized UI
    • Real-time data updates

Connectors

  • Language: Go
  • Purpose: Automated data collection from property websites worldwide
  • Features:
    • Dual-phase runs: discovery of new listings + enrichment of stale records
    • Event-driven enrichment via Kafka (Redpanda)
    • Property data normalisation and validation
    • Batch and single property submission to Mill
    • JWT-based authentication with Mill API
    • Adaptive rate limiting with exponential backoff

Documentation Site

  • Framework: Starlight/Astro
  • Purpose: API documentation and guides
  • Features:
    • Interactive API explorer
    • Code examples
    • Developer guides
    • Search functionality

Mill’s operational datastore is PostgreSQL (with PostGIS). At a high level:

| Table | What it contains | Notes |
| --- | --- | --- |
| `properties` | The canonical, deduplicated property record (wide table) | For full field documentation, see Data Schema |
| `property_images` | Image URLs keyed to a property | `property_images.property_id` references `properties.id` |
| `users` | User accounts for authenticated access | Used to associate token ownership |
| `api_tokens` | Hashed API tokens (service/user tokens) | `api_tokens.user_id` references `users.id` |
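
The relations can be sketched as Go structs; the foreign keys match the table above, but every other field name is hypothetical (see Data Schema for the real columns):

```go
package main

import "fmt"

// Property sketches the canonical, deduplicated record. The real table is
// wide; only an illustrative field is shown here.
type Property struct {
	ID      int64
	Address string
}

// PropertyImage ties an image URL to its property.
type PropertyImage struct {
	ID         int64
	PropertyID int64 // references properties.id
	URL        string
}

// APIToken stores tokens hashed, never in plaintext.
type APIToken struct {
	ID     int64
	UserID int64 // references users.id
	Hash   string
}

func main() {
	p := Property{ID: 1, Address: "1 Example Street"}
	img := PropertyImage{ID: 10, PropertyID: p.ID, URL: "https://example.com/a.jpg"}
	fmt.Println(img.PropertyID == p.ID) // the FK ties images to their property
}
```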

Data Ingestion (Connectors → Mill → PostgreSQL)

```
Property Websites → Connectors  → Mill API    → Database
        ↓               ↓             ↓             ↓
  Scrape listings   Normalise     Validate     PostgreSQL
  Extract data      Rate limit    Geocode      + PostGIS
  Batch submit      Auth (JWT)    Deduplicate    indexes
```
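
The normalise and validate stages might look like this in Go; the `Listing` type and the scraped price format are illustrative assumptions, not the connectors' real types:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Listing is a hypothetical scraped record, standing in for the richer
// types the real connectors use.
type Listing struct {
	Address  string
	PriceRaw string // e.g. "€1,250,000" as scraped
	Price    int64  // normalised integer price
}

// normalise strips currency symbols and separators from the scraped price.
func normalise(l *Listing) error {
	s := strings.NewReplacer("€", "", "$", "", ",", "", " ", "").Replace(l.PriceRaw)
	p, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return fmt.Errorf("bad price %q: %w", l.PriceRaw, err)
	}
	l.Price = p
	return nil
}

// validate rejects records that should never be submitted to Mill.
func validate(l *Listing) error {
	if strings.TrimSpace(l.Address) == "" {
		return fmt.Errorf("missing address")
	}
	if l.Price <= 0 {
		return fmt.Errorf("non-positive price")
	}
	return nil
}

func main() {
	l := Listing{Address: "1 Example Street", PriceRaw: "€1,250,000"}
	if err := normalise(&l); err != nil {
		panic(err)
	}
	if err := validate(&l); err != nil {
		panic(err)
	}
	fmt.Println(l.Price) // 1250000
}
```

Validation also runs server-side in Mill; doing it in the connector first keeps bad records out of the batch submission entirely.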

A background feedback loop keeps existing records fresh. A single EnrichmentDispatcher reads all enrichment requests from Kafka and routes each to the correct enricher by source domain, then optionally gap-fills missing fields using other connectors in the same country:

```
Mill EnrichmentScheduler (hourly)
├─ Query properties with missing data or stale records
├─ Publish EnrichmentRequest to Kafka topic (includes address + country)
└─ Mark properties as queued (24h cool-off)

Connectors — EnrichmentDispatcher (single Kafka consumer)
├─ Read up to 200 messages from "property-enrichment"
├─ Route each message to the correct enricher by source domain
├─ Phase 1: Re-scrape the original listing URL (primary enrichment)
├─ Phase 2: Gap-fill missing fields via AddressSearcher connectors (same country)
└─ Submit merged result back to Mill
```
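
The routing step of the dispatcher can be sketched like so; the `Enricher` interface and source names are hypothetical, and the Kafka consumption and country-level gap-fill phase are omitted:

```go
package main

import (
	"fmt"
	"net/url"
)

// Enricher re-scrapes a listing on one source site.
type Enricher interface {
	Enrich(listingURL string) error
}

// logEnricher is a stand-in implementation for the sketch.
type logEnricher struct{ name string }

func (e logEnricher) Enrich(u string) error {
	fmt.Println(e.name, "enriching", u)
	return nil
}

// Dispatcher routes each enrichment request to the enricher registered for
// the listing URL's host, mirroring the single-consumer design above.
type Dispatcher struct {
	bySource map[string]Enricher
}

func (d *Dispatcher) Route(listingURL string) (Enricher, error) {
	u, err := url.Parse(listingURL)
	if err != nil {
		return nil, err
	}
	e, ok := d.bySource[u.Host]
	if !ok {
		return nil, fmt.Errorf("no enricher for source %q", u.Host)
	}
	return e, nil
}

func main() {
	d := &Dispatcher{bySource: map[string]Enricher{
		"example-listings.com": logEnricher{name: "example"},
	}}
	e, err := d.Route("https://example-listings.com/property/42")
	if err != nil {
		panic(err)
	}
	e.Enrich("https://example-listings.com/property/42")
}
```

Keying the map on the URL host is what lets one consumer serve every source site without per-site consumers.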

See Discovery & Enrichment for full details.

Request Flow

```
Client Request → Authentication → Rate Limiting → Business Logic → Response
      ↓                ↓                ↓                ↓              ↓
  HTTP/REST       JWT tokens      Redis counter    PostgreSQL    JSON response
                  Service keys    Per user/IP      + Redis cache
```

Scalability

  • API Layer: Stateless design allows multiple Mill instances
  • Database: PostgreSQL supports read replicas and partitioning
  • Frontend: CDN deployment with edge caching
  • Connectors: Distributed across multiple workers/containers

Performance

  • Caching Strategy: Multi-layer caching (Redis, CDN, browser)
  • Database Optimization: Proper indexing and query optimization
  • Connection Pooling: Efficient database connection management
  • Rate Limiting: Prevents abuse and ensures fair resource usage

Reliability

  • Load Balancing: Traffic distribution across multiple instances
  • Health Checks: Automatic service discovery and failover
  • Data Replication: Cross-region database backups
  • Monitoring: Real-time alerting and performance tracking

Authentication & Authorization

  • JWT Tokens: Stateless authentication for API access
  • API Keys: Service-to-service authentication
  • OAuth Integration: Third-party authentication support
  • Role-Based Access: Granular permission system

Data Protection

  • Encryption: Data encrypted at rest and in transit
  • Input Validation: All user inputs validated and sanitized
  • SQL Injection Prevention: Parameterized queries and ORM usage
  • Rate Limiting: Protection against brute force attacks

Network Security

  • HTTPS Only: All traffic encrypted with TLS
  • CORS Configuration: Proper cross-origin resource sharing
  • Security Headers: HSTS, CSP, and other security headers
  • API Gateway: Centralized security policy enforcement

Backend

  • API Server: Go with Gin framework
  • Database: PostgreSQL with PostGIS for spatial queries
  • Caching: Redis for session and data caching
  • Message Queue: Redpanda (Kafka-compatible) for event streaming
  • Monitoring: Prometheus + Grafana (planned)

Frontend

  • Framework: Next.js with React
  • Styling: Tailwind CSS
  • State Management: Zustand
  • Maps: Mapbox GL JS
  • Charts: Chart.js / D3.js

Infrastructure

  • Containerization: Docker + Docker Compose
  • Orchestration: Kubernetes (production)
  • CI/CD: GitHub Actions
  • Cloud Provider: AWS/GCP/Azure compatible
  • CDN: CloudFlare or AWS CloudFront

Local Development

  1. Docker Compose: Single command setup
  2. Hot Reload: Automatic code reloading
  3. Local Databases: Containerized services
  4. Testing Environment: Isolated test data

CI/CD Pipeline

  1. Code Push: Trigger automated tests
  2. Testing: Unit, integration, and E2E tests
  3. Build: Create container images
  4. Deploy: Automated deployment to staging/production

Observability

  • Logs: Structured logging with correlation IDs
  • Metrics: Performance and business metrics
  • Tracing: Distributed request tracing
  • Alerts: Automated incident response

To dive deeper into specific components: