Geocoding System
The Mill API includes a geocoding system that normalizes addresses, extracts structured components, and enriches property data with coordinates and location metadata.
Overview
The geocoding system provides:
- Address Parsing: Extracts structured components (street, city, state, postal code, etc.) from free-form address strings
- Coordinate Lookup: Retrieves latitude and longitude coordinates for addresses
- Address Normalization: Standardizes address formats across different countries and regions
- Multi-Provider Support: Uses Nominatim (OpenStreetMap) and optionally libpostal for enhanced parsing
Architecture
The geocoding system consists of:
- GeocodingService: Core service that orchestrates address parsing and geocoding
- Nominatim Integration: Primary geocoding provider using OpenStreetMap data
- Libpostal Integration: Optional advanced address parsing library (when configured)
- Fallback Mechanisms: Graceful degradation when external services are unavailable
API Endpoint
POST /api/v1/geocode
Parse and geocode an address string.
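The same request can be made from JavaScript. The following is a minimal sketch, assuming Node.js 18+ or a browser (where `fetch` is global) and the local base URL `http://localhost:4000` used in this page's examples. The names `MILL_BASE_URL`, `buildGeocodeRequest`, and `geocodeAddress` are illustrative and not part of the Mill API.

```javascript
// Hypothetical helper names; only the endpoint path and the JSON body shape
// come from this page. The base URL is an assumption for local development.
const MILL_BASE_URL = "http://localhost:4000";

// Build the fetch arguments separately so they can be inspected or tested
// without touching the network.
function buildGeocodeRequest(address, baseUrl = MILL_BASE_URL) {
  return {
    url: `${baseUrl}/api/v1/geocode`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ address }),
    },
  };
}

// Send the request and return the parsed JSON body
// ({ status, components, raw_input }). The HTTP status code is not inspected
// here because this page only documents the `status` field in the body.
async function geocodeAddress(address, baseUrl = MILL_BASE_URL) {
  const { url, options } = buildGeocodeRequest(address, baseUrl);
  const response = await fetch(url, options);
  return response.json();
}
```

Keeping request construction separate from the network call makes the request shape easy to check in unit tests.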
Request
    curl -X POST http://localhost:4000/api/v1/geocode \
      -H "Content-Type: application/json" \
      -d '{
        "address": "123 Main Street, San Francisco, CA 94102"
      }'

Response (Success)

    {
      "status": "OK",
      "components": {
        "street_number": "123",
        "street_name": "Main Street",
        "city": "San Francisco",
        "state": "California",
        "postal_code": "94102",
        "country": "United States"
      },
      "raw_input": "123 Main Street, San Francisco, CA 94102"
    }

Response (Invalid Request)

    {
      "status": "INVALID_REQUEST",
      "components": {},
      "raw_input": ""
    }

Response (Parse Error)

    {
      "status": "PARSE_ERROR",
      "components": {},
      "raw_input": "invalid address"
    }

Geocoding Providers
Nominatim (OpenStreetMap)
Default Provider - Always available
- Uses OpenStreetMap’s Nominatim service
- Provides both address parsing and coordinate lookup
- Free and open-source
- Rate limit: at most 1 request per second on the public Nominatim service, which also requires an identifying User-Agent header
- Global coverage with high accuracy
Configuration: No configuration required - works out of the box.
Libpostal (Optional)
Advanced Address Parsing - Requires separate service
Libpostal provides advanced address parsing capabilities:
- Handles international address formats
- Better parsing accuracy for complex addresses
- Supports 60+ languages
- Requires a separate libpostal service instance
When Libpostal is Available:
- Libpostal handles address parsing (component extraction)
- Nominatim handles geocoding (coordinate lookup)
- Best of both worlds: accurate parsing + reliable coordinates
When Libpostal is Not Available:
- Falls back to Nominatim for both parsing and geocoding
- Still provides full functionality
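The division of labour described above can be sketched as a small fallback function. This is an illustration only, not the Mill's actual implementation: `parseWithFallback` is a hypothetical name, and the two parser functions are stand-ins passed in by the caller. Whether the service also falls back when libpostal fails at runtime is an assumption here.

```javascript
// Illustrative only. parseWithLibpostal and parseWithNominatim stand in for
// whatever clients the service uses; both are async functions that take an
// address string and return a components object.
async function parseWithFallback(address, { parseWithLibpostal, parseWithNominatim }) {
  // Prefer libpostal when a parser for it has been configured.
  if (typeof parseWithLibpostal === "function") {
    try {
      const components = await parseWithLibpostal(address);
      return { parser: "libpostal", components };
    } catch (error) {
      // libpostal is configured but unreachable or failing:
      // fall through to Nominatim instead of failing the request.
    }
  }
  const components = await parseWithNominatim(address);
  return { parser: "nominatim", components };
}
```

Returning the name of the parser that produced the components makes it easy to log how often the fallback path is taken.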
Enabling Libpostal
Set environment variables:

    export MILL_LIBPOSTAL_HOST=localhost
    export MILL_LIBPOSTAL_PORT=4400

Or configure via the libpostal config section:

    libpostal:
      enabled: true
      host: localhost
      port: 4400

Address Components
The geocoding system extracts the following structured components:
| Component | Description | Example |
|---|---|---|
| street_number | House/building number | "123" |
| street_name | Street name | "Main Street" |
| route | Full street address | "123 Main Street" |
| locality | City or town | "San Francisco" |
| administrative_area_level_1 | State or province | "California" |
| administrative_area_level_2 | County | "San Francisco County" |
| postal_code | ZIP/postal code | "94102" |
| country | Country name | "United States" |
| formatted_address | Complete formatted address | "123 Main St, San Francisco, CA 94102, USA" |
Integration with Property Submission
When properties are submitted to the Mill API, geocoding happens automatically:
- Address Validation: The submitted address is parsed and validated
- Component Extraction: Structured components are extracted
- Coordinate Lookup: Latitude and longitude are retrieved
- Normalization: Address is normalized to a standard format
- Storage: Normalized address and coordinates are stored with the property
Example: Property Submission with Geocoding
    curl -X POST http://localhost:4000/harvesters/properties/single \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_TOKEN" \
      -H "X-Harvester-Source: my-source" \
      -d '{
        "title": "Beautiful Home",
        "address": {
          "street": "123 Main St",
          "city": "San Francisco",
          "state": "CA",
          "country": "USA"
        }
      }'

The Mill will automatically:
- Parse the address components
- Geocode to get coordinates
- Normalize the address format
- Store everything with the property
Response Status Codes
| Status | Description |
|---|---|
| OK | Address successfully parsed and geocoded |
| INVALID_REQUEST | Empty or invalid input address |
| PARSE_ERROR | Address could not be parsed |
Best Practices
1. Provide Complete Addresses
Good:

    "123 Main Street, San Francisco, CA 94102, United States"

Better:

    "123 Main Street, San Francisco, California 94102, USA"

2. Handle Multiple Formats
The geocoding system handles various address formats:
- US format: "123 Main St, San Francisco, CA 94102"
- International format: "123 Main Street, Auckland, New Zealand"
- Partial addresses: "San Francisco, CA" (will geocode to city center)
3. Error Handling
Always check the status field in the response:

    // geocodeAddress is assumed to be a client helper that POSTs to
    // /api/v1/geocode and resolves to the parsed JSON response body.
    async function handleGeocode(address) {
      const response = await geocodeAddress(address);

      if (response.status === "OK") {
        // Use response.components
      } else if (response.status === "INVALID_REQUEST") {
        // Handle invalid input
      } else {
        // Handle parse error
      }
    }

4. Rate Limiting
When making multiple geocoding requests:
- Respect rate limits (1 request/second for Nominatim)
- Use batch processing for multiple addresses
- Cache results when possible
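One way to combine these three points on the client side is to geocode addresses one at a time, pause between upstream calls, and reuse results for repeated addresses. The sketch below is an assumption-laden illustration: `geocodeSequentially` is a hypothetical name, `geocodeFn` is any async function that geocodes a single address, and the cache key (trimmed, lowercased input) is a simplification.

```javascript
// Default delay helper; sleepFn is injectable so the spacing logic can be
// exercised in tests without real waiting.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Illustrative sketch. geocodeFn(address) must return a promise for one result.
async function geocodeSequentially(
  addresses,
  geocodeFn,
  { minIntervalMs = 1000, sleepFn = defaultSleep } = {}
) {
  const cache = new Map(); // normalized address -> result
  const results = [];
  let upstreamCalls = 0;

  for (const address of addresses) {
    const key = address.trim().toLowerCase();
    if (!cache.has(key)) {
      // Wait before every upstream call except the first one, so calls are
      // at least minIntervalMs apart.
      if (upstreamCalls > 0) await sleepFn(minIntervalMs);
      cache.set(key, await geocodeFn(address));
      upstreamCalls += 1;
    }
    results.push(cache.get(key));
  }
  return results;
}
```

Results are returned in input order, and duplicate addresses cost no extra upstream calls. The cache here lives only for the duration of one batch; a longer-lived cache would need an eviction policy.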
Performance Considerations
- Caching: Geocoded addresses are cached to avoid redundant API calls
- Timeout: 10-second timeout per geocoding request
- Fallback: System gracefully handles service unavailability
- Async Processing: Large batches can be processed asynchronously
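A client that wants to apply a comparable time limit to its own calls could wrap each request in a timeout. This is a generic sketch rather than the Mill's internal mechanism: `withTimeout` is a hypothetical helper, and reusing the 10-second figure on the client side is an assumption.

```javascript
// Illustrative sketch: reject if the wrapped promise has not settled within
// `ms` milliseconds. This does not cancel the underlying work; it only stops
// waiting for it.
function withTimeout(promise, ms = 10000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared in either case so it
  // does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

With `fetch`, passing `signal: AbortSignal.timeout(10000)` in the request options is an alternative that also aborts the underlying request.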
Troubleshooting
Address Not Found
If an address cannot be geocoded:
- Verify the address format is correct
- Try a more complete address (include city, state, country)
- Check if the address exists in OpenStreetMap
- Consider using libpostal for better parsing
Rate Limiting
If you encounter rate limits:
- Reduce request frequency
- Implement request queuing
- Use batch endpoints when submitting multiple properties
- Contact support for higher rate limits
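Reducing request frequency is often implemented as retry with exponential backoff. The sketch below is generic: this page does not document how a rate-limited response is signalled, so the caller supplies an `isRateLimited` predicate. The name `retryWithBackoff` and its defaults are assumptions for illustration.

```javascript
// Default delay helper; sleepFn is injectable for testing.
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Illustrative sketch. task is an async function to attempt; isRateLimited
// decides whether a thrown error should trigger another attempt.
async function retryWithBackoff(
  task,
  { isRateLimited, maxAttempts = 4, baseDelayMs = 1000, sleepFn = pause } = {}
) {
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      const retryable = typeof isRateLimited === "function" && isRateLimited(error);
      // Give up on non-rate-limit errors or once attempts are exhausted.
      if (!retryable || attempt >= maxAttempts) throw error;
      // Delays grow as 1x, 2x, 4x, ... of baseDelayMs.
      await sleepFn(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

Errors that the predicate does not recognise are rethrown immediately, so genuine failures are not hidden behind retries.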
Libpostal Not Working
If libpostal is configured but not working:
- Verify the libpostal service is running
- Check network connectivity to libpostal service
- Verify environment variables or config are correct
- Check logs for connection errors
Related Documentation
- Property Submission - Learn how geocoding integrates with property submission
- API Reference - Complete API documentation
- Configuration - Configure geocoding services