New-Homie/new-homie

Analytics platform for finding a place to live in Australia.

Turning messy property data into answers to questions like:

  • Should I buy or rent?
  • Which property best fits my constraints?
  • How do options compare under different buying or renting strategies?

Features

  • Fast interactive dashboards for exploring property data.
  • Cross-filter and highlight listings across multiple property attributes.
  • Compare buy and rent options in one place.
  • Save and share dashboard configurations with a link.
  • Customize dashboards and SQL for deeper analysis.

Note: running in production since October 2025.

Demo video: 1775803312038943.mp4

Design

Architecture

This is the core architecture of the app:

Data sources
  AusPost / Domain / ACARA
          │
          ▼
Scrape pipeline
  preprocess → batch workers → postprocess
          │
          ├─ retries / validation / quality filters
          ├─ logs / metrics / traces
          ▼
Primary storage
  Supabase Postgres + PostGIS
          │
          ▼
API delivery
  Hono + ORPC + OpenAPI
          │
          ├─ CloudFront caching
          ▼
Frontend
  React + TanStack + DuckDB WASM + Arrow
          │
          ▼
Interactive dashboards
  filters / cross-highlighting / shareable URL state

1. Data pipeline

Important characteristics:

  1. Data quality:
    • Quality data leads to quality insights.
    • Reduces data poisoning for downstream services.
  2. Observability:
    • Ambiguous data and errors are recorded.
    • CPU and RAM usage are tracked to improve cost efficiency.
  3. Regional resilience - Multi-AZ, 99.9% SLA:
    • Scrape timing is flexible, so a lower end-to-end SLA can be tolerated.
    • Recovery is handled by resilient pipelines built on Step Functions and AWS Batch on Fargate, rated at a 99.99% SLA.
    • The blast radius is kept small through separate workers: a preprocessor, a 15-minute batch scraper, and a postprocessor.
    • The weakest link is database insertion, since Supabase carries a lower 99.9% SLA.
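
The recovery behaviour above can be sketched as a small retry helper with exponential backoff. This is an illustrative sketch only: in production the retries live in Step Functions, and the function name, attempt count, and delays here are assumptions.

```typescript
// Illustrative retry helper with exponential backoff. The real pipeline
// configures retries in Step Functions; this only mirrors the idea.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Back off exponentially between attempts: base, 2x base, 4x base, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```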

These data sources are filtered for quality control:

  • AusPost - for locality data:
    • Some localities don't exist and need to be detected in production.
  • Domain - for sale, rent and locality data:
    • Missing data and inconsistent formats for price.
    • Data integrity issues such as duplicate addresses, autogenerated prices, and changed addresses.
    • Junk data cleanup, such as car parks and garages listed for sale or rent.
    • Pagination and termination handling.
    • Retries on scrape failures.
  • ACARA - for school data:
    • Mostly clean data, with filtering applied for relevance.
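
The price cleanup described above can be illustrated with a small normaliser. Everything here - the patterns, the junk threshold, the function name - is an assumption for illustration; the production rules are more involved.

```typescript
// Hypothetical cleaner for Domain-style price strings. Returns a dollar
// figure, or null when the listing has no usable price.
function normalizePrice(raw: string): number | null {
  const text = raw.trim().toLowerCase();
  // Listings without a concrete figure ("Contact agent", "Auction") carry no price.
  if (!/\d/.test(text)) return null;
  // Strip commas and surrounding words: "$1,200,000", "$450 per week".
  const match = text.replace(/,/g, "").match(/\$?\s*(\d+(?:\.\d+)?)/);
  if (!match) return null;
  const value = Number(match[1]);
  // Quality filter: an implausibly small figure is usually autogenerated
  // junk. The threshold here is illustrative only.
  if (value < 100) return null;
  return value;
}
```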

Limitations:

  • No anomaly detection yet.
  • Some valid data still missed.

2. Data format and local-first database

Important characteristics:

  1. Accessibility:
    • Query with SQL.
  2. OLAP performance preferred over OLTP, as complex analytical queries are required.
  3. Spatial query support.
  4. Zero-copy data, so data does not have to move.

Uses Arrow IPC to insert data into DuckDB with zero-copy transfer where possible. This improves ingestion speed. The main downside is the initial WASM size, which increases CDN cost and startup latency, so precaching is used to improve subsequent starts.
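
Zero-copy here means the Arrow buffers received over IPC are interpreted in place rather than duplicated. The snippet below is a toy illustration of that idea using plain typed arrays - not the actual duckdb-wasm or Arrow API.

```typescript
// Toy illustration of zero-copy reads: a typed-array view interprets bytes
// in place, while a copy allocates a second buffer. Arrow IPC ingestion
// into DuckDB relies on the first pattern where possible.
const buffer = new ArrayBuffer(4 * Float64Array.BYTES_PER_ELEMENT);
const writer = new Float64Array(buffer);
writer.set([1.5, 2.5, 3.5, 4.5]);

// Zero-copy: a new view over the SAME memory; no bytes are moved.
const view = new Float64Array(buffer);

// Copying: slice() allocates fresh memory and duplicates the bytes.
const copy = writer.slice();

// A mutation through one view is visible through the other (shared memory),
// but not through the copy, which owns its own buffer.
view[0] = 9.0;
```

The same distinction is why Arrow ingestion is fast: the engine reads the columnar buffers where they already sit instead of materialising a second copy.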

3. Data delivery

Important characteristics:

  1. Cacheability:
    • Reduce backend load as much as possible for cost and latency reasons.
    • Improves availability and scalability.
    • Use CloudFront caching at 99.9% SLA.
  2. Type safety: an OpenAPI schema is generated from the API definitions.
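
The cacheability goal can be made concrete with a conditional-request sketch. The helper below is hypothetical - not the app's actual Hono handlers - and only shows the ETag / Cache-Control mechanics that let CloudFront and browsers answer repeat requests without touching the backend.

```typescript
// Hypothetical conditional-request check: the mechanism shared caches like
// CloudFront use to serve responses without hitting the origin.
function cacheDecision(
  requestEtag: string | undefined,
  currentEtag: string,
  maxAgeSeconds: number,
): { status: number; headers: Record<string, string> } {
  const headers = {
    // Let shared caches keep the response for maxAgeSeconds.
    "Cache-Control": `public, max-age=${maxAgeSeconds}`,
    ETag: currentEtag,
  };
  // If the client's cached copy is still current, skip the body entirely.
  if (requestEtag === currentEtag) return { status: 304, headers };
  return { status: 200, headers };
}
```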

4. Local-first dashboard as code

  1. Accessible:
    • Colorblind friendly.
  2. Interactive:
    • Filtering and cross-highlighting UX.
    • Prefer interactions like hover and drag to clicks and forms.
  3. Shareable:
    • Dashboards are configured as code through URL search params.

SQL queries are executed against the local database and cached.
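
The dashboard-as-code idea can be sketched as a round trip through URL search params. The config shape and field names below are illustrative assumptions, not the app's real schema (which is typed end to end by TanStack Router).

```typescript
// Hypothetical dashboard state round-tripped through URL search params:
// the whole dashboard configuration lives in the link itself.
interface DashboardConfig {
  suburb: string;
  maxPrice: number;
  mode: "buy" | "rent";
}

function toSearchParams(config: DashboardConfig): string {
  const params = new URLSearchParams({
    suburb: config.suburb,
    maxPrice: String(config.maxPrice),
    mode: config.mode,
  });
  return params.toString();
}

function fromSearchParams(search: string): DashboardConfig {
  const params = new URLSearchParams(search);
  return {
    suburb: params.get("suburb") ?? "",
    maxPrice: Number(params.get("maxPrice") ?? 0),
    mode: (params.get("mode") as DashboardConfig["mode"]) ?? "buy",
  };
}
```

Because the entire configuration lives in the URL, saving or sharing a dashboard is just saving or sharing a link.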

Developer notes

Tech stack - overview

  • Monorepo:
    • Nx - to manage local development with caching.
    • Pnpm - to manage monorepo scripts.
  • Observability:
    • Grafana LGTM - for flexibility and ability to test locally.
  • Infra:
    • AWS + Supabase Postgres.
    • IaC managed by SST (built on top of Pulumi) - chosen for fast serverless iteration.
  • API:
    • Hono, ORPC for OpenAPI generation.
  • Data:
    • Postgres + PostGIS - for spatial data and local testing.
    • DuckDB WASM + Apache Arrow for local-first database.
  • Frontend:
    • React - for ecosystem support.
    • TanStack Query - for cache control.
    • TanStack Router - for search param and link type safety.

Monorepo structure

This monorepo is organised as follows:

  • .github - CI/CD and repo management.
    • actions - Composite actions to be used in workflows.
    • workflows - Define jobs for CI/CD.
  • apps - Architectural quanta.
    • observability - Grafana LGTM dashboard management and OpenTelemetry library.
    • service-scrape - House data webscrape and access.
    • service-auth - Authentication lambda authoriser.
    • service-user - User management.
    • web - Web app.
  • infra - Manage infrastructure on AWS using CDK.

Deployment

CI checks ensure the infrastructure is deployable and the code meets standards. Preview branches are used to review changes live in public.

Scripts

Install dependencies using pnpm only - git hooks should auto-configure:

pnpm i

Watch all builds and tests as you develop:

pnpm watch

Spin up Docker for development and generation scripts:

pnpm docker:up

Or spin down Docker:

pnpm docker:down

Detect stale code:

pnpm knip

Upgrade all dependencies:

pnpm bump

Visualise local package dependencies:

pnpm graph

Feedback

If you have experience with:

  • data pipelines
  • observability
  • analytical frontends

I’d love to hear your thoughts or suggestions.
