Geocoding System

The Mill API includes a geocoding system that normalizes addresses, extracts structured components, and enriches property data with coordinates and location metadata.

The geocoding system provides:

  • Address Parsing: Extracts structured components (street, city, state, postal code, etc.) from free-form address strings
  • Coordinate Lookup: Retrieves latitude and longitude coordinates for addresses
  • Address Normalization: Standardizes address formats across different countries and regions
  • Multi-Provider Support: Uses Nominatim (OpenStreetMap) and optionally libpostal for enhanced parsing

The geocoding system consists of:

  1. GeocodingService: Core service that orchestrates address parsing and geocoding
  2. Nominatim Integration: Primary geocoding provider using OpenStreetMap data
  3. Libpostal Integration: Optional advanced address parsing library (when configured)
  4. Fallback Mechanisms: Graceful degradation when external services are unavailable

POST /api/v1/geocode parses and geocodes an address string.

    curl -X POST http://localhost:4000/api/v1/geocode \
      -H "Content-Type: application/json" \
      -d '{
        "address": "123 Main Street, San Francisco, CA 94102"
      }'

Success response:

    {
      "status": "OK",
      "components": {
        "street_number": "123",
        "street_name": "Main Street",
        "city": "San Francisco",
        "state": "California",
        "postal_code": "94102",
        "country": "United States"
      },
      "raw_input": "123 Main Street, San Francisco, CA 94102"
    }

Invalid request response:

    {
      "status": "INVALID_REQUEST",
      "components": {},
      "raw_input": ""
    }

Parse error response:

    {
      "status": "PARSE_ERROR",
      "components": {},
      "raw_input": "invalid address"
    }
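
For callers working in JavaScript, the same request can be sketched as follows. This is a minimal illustration, assuming a runtime with a global fetch (Node 18+ or a browser) and the local base URL from the curl example. The geocodeAddress name and the baseUrl and fetchImpl options are hypothetical and not part of the Mill API.

```javascript
// Hypothetical helper: POSTs a free-form address to the geocode endpoint
// shown in the curl example and returns the parsed JSON body.
// baseUrl and fetchImpl are illustrative options, not Mill API settings.
async function geocodeAddress(
  address,
  { baseUrl = "http://localhost:4000", fetchImpl = fetch } = {}
) {
  const res = await fetchImpl(`${baseUrl}/api/v1/geocode`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ address }),
  });
  // This sketch reads only the JSON body. Which HTTP status code accompanies
  // each "status" value is not documented here, so it is not inspected.
  return res.json();
}
```

Passing fetchImpl makes the helper easy to exercise with a stub in tests, without a running server.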

Nominatim: Default Provider - Always available

  • Uses OpenStreetMap’s Nominatim service
  • Provides both address parsing and coordinate lookup
  • Free and open-source
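  • Rate limits: the public Nominatim service allows at most 1 request per second and requires an identifying User-Agent
  • Global coverage; accuracy depends on the quality of OpenStreetMap data for the region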

Configuration: No configuration required - works out of the box.

Libpostal: Advanced Address Parsing - Requires separate service

Libpostal provides advanced address parsing capabilities:

  • Handles international address formats
  • Better parsing accuracy for complex addresses
  • Supports 60+ languages
  • Requires a separate libpostal service instance

When Libpostal is Available:

  • Libpostal handles address parsing (component extraction)
  • Nominatim handles geocoding (coordinate lookup)
  • Best of both worlds: accurate parsing + reliable coordinates

When Libpostal is Not Available:

  • Falls back to Nominatim for both parsing and geocoding
  • Still provides full functionality
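
The fallback behavior described above can be sketched as a small function. This is an illustration only, not the GeocodingService implementation: providers.libpostal and providers.nominatim are hypothetical functions supplied by the caller, and treating any thrown error as "libpostal unavailable" is an assumption.

```javascript
// Sketch of provider selection with graceful degradation.
// providers.libpostal and providers.nominatim are hypothetical async
// functions that take an address string and resolve to a components object.
async function parseAddress(address, providers) {
  if (providers.libpostal) {
    try {
      const components = await providers.libpostal(address);
      return { components, parsedBy: "libpostal" };
    } catch (err) {
      // Assumption: any failure means libpostal is unavailable,
      // so fall through to Nominatim instead of failing the request.
    }
  }
  const components = await providers.nominatim(address);
  return { components, parsedBy: "nominatim" };
}
```

The parsedBy field is included only so a caller can see which path was taken; it does not appear in the documented response.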

To use libpostal, set environment variables:

    export MILL_LIBPOSTAL_HOST=localhost
    export MILL_LIBPOSTAL_PORT=4400

Or configure it via the libpostal config section:

    libpostal:
      enabled: true
      host: localhost
      port: 4400
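
How the two configuration sources combine is not stated here. As a rough sketch, assuming environment variables take precedence over the config section, that 4400 is the default port, and that libpostal counts as enabled once a host is known, the resolution might look like this. The resolveLibpostalConfig name is hypothetical.

```javascript
// Hypothetical resolution of libpostal settings.
// Assumptions (not confirmed by the Mill docs): env vars override the
// config section, the port defaults to 4400, and "enabled: false" in the
// config section disables libpostal even when a host is set.
function resolveLibpostalConfig(env, fileConfig = {}) {
  const section = fileConfig.libpostal || {};
  const host = env.MILL_LIBPOSTAL_HOST || section.host || null;
  const rawPort = env.MILL_LIBPOSTAL_PORT || section.port || 4400;
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid libpostal port: ${rawPort}`);
  }
  const enabled = host !== null && section.enabled !== false;
  return { enabled, host, port };
}
```

Calling resolveLibpostalConfig(process.env) with neither source set yields enabled: false, which matches the Nominatim-only mode described above.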

The geocoding system extracts the following structured components:

Component                      Description                  Example
street_number                  House/building number        "123"
street_name                    Street name                  "Main Street"
route                          Full street address          "123 Main Street"
locality                       City or town                 "San Francisco"
administrative_area_level_1    State or province            "California"
administrative_area_level_2    County                       "San Francisco County"
postal_code                    ZIP/postal code              "94102"
country                        Country name                 "United States"
formatted_address              Complete formatted address   "123 Main St, San Francisco, CA 94102, USA"

When properties are submitted to the Mill API, geocoding happens automatically:

  1. Address Validation: The submitted address is parsed and validated
  2. Component Extraction: Structured components are extracted
  3. Coordinate Lookup: Latitude and longitude are retrieved
  4. Normalization: Address is normalized to a standard format
  5. Storage: Normalized address and coordinates are stored with the property

Example: Property Submission with Geocoding

    curl -X POST http://localhost:4000/harvesters/properties/single \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_TOKEN" \
      -H "X-Harvester-Source: my-source" \
      -d '{
        "title": "Beautiful Home",
        "address": {
          "street": "123 Main St",
          "city": "San Francisco",
          "state": "CA",
          "country": "USA"
        }
      }'

The Mill will automatically:

  • Parse the address components
  • Geocode to get coordinates
  • Normalize the address format
  • Store everything with the property

Response status values:

Status             Description
OK                 Address successfully parsed and geocoded
INVALID_REQUEST    Empty or invalid input address
PARSE_ERROR        Address could not be parsed

Good:

"123 Main Street, San Francisco, CA 94102, United States"

Better:

"123 Main Street, San Francisco, California 94102, USA"

The geocoding system handles various address formats:

  • US format: "123 Main St, San Francisco, CA 94102"
  • International format: "123 Main Street, Auckland, New Zealand"
  • Partial addresses: "San Francisco, CA" (will geocode to city center)
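
When address data arrives as separate fields, a complete string in the formats above can be assembled before geocoding. The following is a minimal sketch. The buildAddressString name and the field names (in particular postalCode) are hypothetical and chosen only for illustration.

```javascript
// Joins whichever address fields are present into a single
// comma-separated string, skipping empty or missing parts.
// State and postal code share one segment, as in "CA 94102".
function buildAddressString(parts) {
  const { street, city, state, postalCode, country } = parts;
  const region = [state, postalCode].filter(Boolean).join(" ");
  return [street, city, region, country]
    .map((p) => (p || "").trim())
    .filter((p) => p.length > 0)
    .join(", ");
}

// Full address:
// buildAddressString({ street: "123 Main Street", city: "San Francisco",
//   state: "CA", postalCode: "94102", country: "United States" })
// returns "123 Main Street, San Francisco, CA 94102, United States"
//
// Partial address:
// buildAddressString({ city: "San Francisco", state: "CA" })
// returns "San Francisco, CA"
```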

Always check the status field in the response:

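    // geocodeAddress is assumed to be a helper that POSTs the address to
    // /api/v1/geocode and returns the parsed JSON response.
    async function handleGeocode(address) {
      const response = await geocodeAddress(address);
      if (response.status === "OK") {
        // Use response.components
      } else if (response.status === "INVALID_REQUEST") {
        // Handle invalid input (empty or invalid address)
      } else {
        // Handle PARSE_ERROR: the address could not be parsed
      }
    }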

When making multiple geocoding requests:

  • Respect rate limits (1 request/second for Nominatim)
  • Use batch processing for multiple addresses
  • Cache results when possible

The geocoding system itself provides:

  • Caching: Geocoded addresses are cached to avoid redundant API calls
  • Timeout: 10-second timeout per geocoding request
  • Fallback: System gracefully handles service unavailability
  • Async Processing: Large batches can be processed asynchronously
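
On the client side, spacing out requests and caching results can be combined in one wrapper. The sketch below is one possible approach, not the Mill's internal caching. createThrottledGeocoder and its options are hypothetical names, and geocodeFn stands for any async function that takes an address and resolves to a geocoding response.

```javascript
// Wraps an async geocode function with an in-memory cache and a minimum
// spacing between upstream calls (default 1000 ms, matching the
// 1 request/second Nominatim limit). All names here are illustrative.
function createThrottledGeocoder(geocodeFn, options = {}) {
  const minIntervalMs = options.minIntervalMs ?? 1000;
  const sleep = options.sleep ?? ((ms) => new Promise((r) => setTimeout(r, ms)));
  const now = options.now ?? Date.now;
  const cache = new Map();
  let queue = Promise.resolve(); // serializes upstream calls
  let lastCallAt = -Infinity;

  return function geocode(address) {
    const key = address.trim().toLowerCase();
    if (cache.has(key)) return cache.get(key);

    const result = queue.then(async () => {
      const wait = lastCallAt + minIntervalMs - now();
      if (wait > 0) await sleep(wait);
      lastCallAt = now();
      return geocodeFn(address);
    });
    // Keep the queue usable even if this call rejects.
    queue = result.catch(() => {});
    // Cache the promise so concurrent duplicates share one request;
    // drop it on failure so a later call can retry.
    cache.set(key, result);
    result.catch(() => cache.delete(key));
    return result;
  };
}
```

The cache key is a trimmed, lowercased copy of the input, which is a simplification; addresses that differ only in abbreviation (for example "St" and "Street") are still treated as distinct.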

If an address cannot be geocoded:

  1. Verify the address format is correct
  2. Try a more complete address (include city, state, country)
  3. Check if the address exists in OpenStreetMap
  4. Consider using libpostal for better parsing

If you encounter rate limits:

  1. Reduce request frequency
  2. Implement request queuing
  3. Use batch endpoints when submitting multiple properties
  4. Contact support for higher rate limits

If libpostal is configured but not working:

  1. Verify the libpostal service is running
  2. Check network connectivity to libpostal service
  3. Verify environment variables or config are correct
  4. Check logs for connection errors
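
The connectivity check in step 2 can be scripted. The following Node.js sketch only tests whether a TCP connection to the configured host and port succeeds; it does not confirm that the listener is actually libpostal. The canReach name is hypothetical.

```javascript
const net = require("net");

// Resolves to true if a TCP connection to host:port succeeds within
// timeoutMs, and to false on a connection error or timeout.
function canReach(host, port, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}
```

For example, canReach(process.env.MILL_LIBPOSTAL_HOST, Number(process.env.MILL_LIBPOSTAL_PORT)) resolves to false when nothing is listening at the address from the environment variables shown earlier.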