Open Architecture Decisions

This document tracks architectural decisions that require resolution. Use this as a central reference when planning sprints or making technology choices.

Document Status: Active
Last Updated: 2025-12-29


Decision Status Legend

| Status | Meaning |
|--------|---------|
| 🔴 Blocked | Cannot proceed; waiting on external input |
| 🟡 Pending | Needs decision; work can continue |
| 🟢 Decided | Decision made; implementation pending |
| ✅ Resolved | Decision implemented; move to decisions.md |

Active Open Decisions

OD-001: Log Aggregation Platform

| Field | Value |
|-------|-------|
| Status | 🟡 Pending |
| Priority | Medium |
| Blocking | Phase 8 (Observability implementation) |
| Source | observability.md |
| Created | 2025-12-29 |

Context: We need to select a log aggregation platform for centralized logging across the central server and the edge collectors.

Options:

| Option | Pros | Cons | Cost |
|--------|------|------|------|
| Grafana Loki | Lightweight, pairs with Prometheus, low resource usage | Less powerful querying than ELK | Free (self-hosted) |
| ELK Stack | Powerful search, widely used, mature | Resource-heavy, complex setup | Free (self-hosted) |
| Datadog | SaaS, APM integration, minimal ops | Expensive at scale, data leaves premises | $$-$$$ |
| AWS CloudWatch | Native if on AWS, managed | AWS lock-in, cost at scale | $-$$ |

Recommendation: Grafana Loki. It pairs well with Prometheus, which is already under consideration, and is lightweight enough for edge deployment.

Next Steps:
1. Evaluate Loki resource requirements on the Pi
2. Prototype a Loki + Grafana setup
3. Test log ingestion from an edge collector

Decision Owner: TBD
Target Date: Before Phase 8
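
As a starting point for the ingestion test in the next steps above, the following is a minimal sketch of pushing log lines to Loki over its HTTP push endpoint (`/loki/api/v1/push`). The host, port, and label values are placeholders assumed for illustration; nothing here reflects an agreed collector design.

```python
import json
import time
import urllib.request

# Assumed local Loki endpoint; host and port are placeholders for this sketch.
LOKI_PUSH_URL = "http://localhost:3100/loki/api/v1/push"


def build_push_payload(labels, lines, timestamp_ns=None):
    """Build a Loki push body: one stream with one [timestamp, line] pair per log line."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    # Loki expects each timestamp as a string of nanoseconds since the Unix epoch.
    # Adding the index keeps the timestamps of a batch strictly increasing.
    values = [[str(timestamp_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


def push_logs(payload, url=LOKI_PUSH_URL, timeout=5.0):
    """POST the payload to Loki and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    payload = build_push_payload(
        {"job": "edge-collector", "host": "edge-01"},  # hypothetical label values
        ["collector started", "uploaded 12 images"],
    )
    print(json.dumps(payload, indent=2))
    # Uncomment once a Loki instance is reachable at LOKI_PUSH_URL:
    # print(push_logs(payload))
```

Building the payload separately from sending it lets the payload shape be checked on a development machine before a Loki instance exists.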


OD-002: Backup Target Host

| Field | Value |
|-------|-------|
| Status | 🟡 Pending |
| Priority | High |
| Blocking | Phase 7 (Backup implementation) |
| Source | backup-disaster-recovery.md |
| Created | 2025-12-29 |

Context: Need to determine where backups will be stored. This affects hardware procurement and security posture.

Options:

| Option | Pros | Cons | Cost |
|--------|------|------|------|
| Dedicated backup server | Full control, fast restore | Hardware cost, maintenance | $500-2000 |
| NAS device (Synology/QNAP) | Easy setup, RAID built-in | Limited performance | $500-1500 |
| Cloud storage | No hardware, off-site | PII concerns, egress costs | $50-200/month |
| Off-site co-location | Disaster recovery, off-site | Complexity, ongoing cost | $100-500/month |

Recommendation: NAS device initially (simple, cost-effective), with cloud backup for critical data pending security approval (OD-003).

Dependencies:
- OD-003 (Cloud backup decision) affects this decision
- Budget approval required

Next Steps:
1. Get budget approval for NAS hardware
2. Evaluate Synology vs QNAP options
3. Size storage for 1 year of images + 30 days of DB backups

Decision Owner: TBD
Target Date: Before Phase 7
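
The sizing step above (1 year of images plus 30 days of DB backups) comes down to simple arithmetic, sketched below. Every input value in the example is hypothetical, and the sketch assumes one full DB backup per day and a flat headroom factor; real figures should come from measured image volume.

```python
from dataclasses import dataclass


@dataclass
class SizingInputs:
    images_per_day: int              # hypothetical: images captured per day
    avg_image_mb: float              # hypothetical: average image size in MB
    db_backup_gb: float              # hypothetical: size of one full DB backup in GB
    image_retention_days: int = 365  # 1 year of images
    db_retention_days: int = 30      # 30 days of DB backups
    headroom: float = 0.25           # assumed spare capacity for growth


def required_storage_tb(inputs: SizingInputs) -> float:
    """Estimate usable capacity in TB (decimal units: 1 TB = 1000 GB)."""
    image_gb = inputs.images_per_day * inputs.avg_image_mb * inputs.image_retention_days / 1000
    # Assumes one full backup per day, each kept for the whole retention window.
    db_gb = inputs.db_backup_gb * inputs.db_retention_days
    return (image_gb + db_gb) * (1 + inputs.headroom) / 1000


if __name__ == "__main__":
    example = SizingInputs(images_per_day=10_000, avg_image_mb=0.5, db_backup_gb=2.0)
    print(f"Estimated usable capacity: {required_storage_tb(example):.2f} TB")
```

The result is usable capacity; RAID overhead on the NAS would come on top of it and depends on the RAID level chosen.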


OD-003: Cloud Backup for Images (S3/Glacier)

| Field | Value |
|-------|-------|
| Status | 🔴 Blocked |
| Priority | Medium |
| Blocking | Off-site disaster recovery |
| Source | backup-disaster-recovery.md |
| Created | 2025-12-29 |

Context: Cloud backup (AWS S3/Glacier) would provide off-site disaster recovery at low cost, but requires security approval due to PII in images (license plates, vehicle photos).

Options:

| Option | Pros | Cons | PII Risk |
|--------|------|------|----------|
| AWS S3 | Reliable, scalable | Cost, data leaves premises | High |
| AWS Glacier | Very cheap long-term | Slow retrieval | High |
| No cloud backup | No PII concerns | No off-site DR | None |
| Encrypted cloud | Off-site + protected | Key management complexity | Medium |

Blocker: Security team review is required before storing PII in the cloud:
- [ ] Security team review of PII implications
- [ ] Legal review of data residency requirements
- [ ] Compliance check (if any regulations apply)
- [ ] Encryption key management plan
- [ ] Access audit procedures
- [ ] Incident response plan update

Next Steps:
1. Schedule security review meeting
2. Prepare PII data flow documentation
3. Draft encryption key management proposal

Decision Owner: Security Team
Target Date: TBD (blocked on security review)


OD-004: MinIO Replication Strategy

| Field | Value |
|-------|-------|
| Status | 🟡 Pending |
| Priority | Low |
| Blocking | None (can use mc mirror initially) |
| Source | backup-disaster-recovery.md |
| Created | 2025-12-29 |

Context: Need to decide between simple backup (mc mirror) vs real-time replication for MinIO image storage.

Options:

| Option | Pros | Cons | Use Case |
|--------|------|------|----------|
| mc mirror (daily) | Simple, low resource | Up to 24h data loss | Basic backup |
| MinIO Replication | Real-time, automatic failover | Requires second MinIO | High availability |
| Hybrid | Best of both | More complex | Production recommendation |

Recommendation: Start with mc mirror for POC; evaluate MinIO replication post-POC if HA requirements emerge.

Next Steps:
1. Implement mc mirror for POC
2. Monitor image volume and backup times
3. Revisit after POC based on actual requirements

Decision Owner: TBD
Target Date: Post-POC
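
For the POC, the daily `mc mirror` option could be wrapped in a small script along the lines of the sketch below. The alias and bucket names (`primary/images`, `backup/images`) are hypothetical, and running the script once a day from a scheduler such as cron is an assumption, not something this document specifies.

```python
import subprocess


def build_mirror_command(source, target, overwrite=True):
    """Build the argument list for a one-shot `mc mirror` run."""
    command = ["mc", "mirror"]
    if overwrite:
        # --overwrite replaces objects on the target that differ from the source.
        command.append("--overwrite")
    command += [source, target]
    return command


def run_mirror(source, target):
    """Run the mirror once; returns True when mc exits with status 0."""
    result = subprocess.run(
        build_mirror_command(source, target),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface mc's own error output so a scheduler can log it.
        print(result.stderr.strip())
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical aliases; both would need to be configured in mc beforehand.
    print(" ".join(build_mirror_command("primary/images", "backup/images")))
```

Recording the duration of each `run_mirror` call would also feed the second next step above, monitoring backup times.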


Recently Resolved Decisions

Move decisions here when resolved, with date and outcome.

| ID | Decision | Outcome | Date |
|----|----------|---------|------|
| - | - | - | - |

Decision Record Template

When adding a new open decision, use this template:

### OD-XXX: [Decision Title]

| Field | Value |
|-------|-------|
| **Status** | 🟡 Pending |
| **Priority** | High/Medium/Low |
| **Blocking** | [What is blocked by this decision] |
| **Source** | [Link to source document] |
| **Created** | [YYYY-MM-DD] |

**Context:**
[Why this decision is needed]

**Options:**
[Table of options with pros/cons]

**Recommendation:** [If applicable]

**Dependencies:** [Other decisions this depends on]

**Next Steps:**
1. [Action item]
2. [Action item]

**Decision Owner:** [Person/team responsible]
**Target Date:** [When decision is needed by]

Cross-Reference

| Document | Open Decision Section |
|----------|-----------------------|
| ARCHITECTURE.md | Section 8.3: Open Decisions |
| observability.md | Log Aggregation Platform |
| backup-disaster-recovery.md | Open Decisions table |

Document Maintainer: Architecture Team
Review Cycle: Weekly during active development