Open Architecture Decisions
This document tracks architectural decisions that require resolution. Use this as a central reference when planning sprints or making technology choices.
Document Status: Active
Last Updated: 2025-12-29
Decision Status Legend
| Status | Meaning |
|---|---|
| 🔴 Blocked | Cannot proceed; waiting on external input |
| 🟡 Pending | Needs decision; work can continue |
| 🟢 Decided | Decision made; implementation pending |
| ✅ Resolved | Decision implemented; move to decisions.md |
Active Open Decisions
OD-001: Log Aggregation Platform
| Field | Value |
|---|---|
| Status | 🟡 Pending |
| Priority | Medium |
| Blocking | Phase 8 (Observability implementation) |
| Source | observability.md |
| Created | 2025-12-29 |
Context: We need to select a log aggregation platform for centralized logging across the central server and the edge collectors.
Options:
| Option | Pros | Cons | Cost |
|---|---|---|---|
| Grafana Loki | Lightweight, pairs with Prometheus, low resource usage | Less powerful querying than ELK | Free (self-hosted) |
| ELK Stack | Powerful search, widely used, mature | Resource-heavy, complex setup | Free (self-hosted) |
| Datadog | SaaS, APM integration, minimal ops | Expensive at scale, data leaves premises | $$-$$$ |
| AWS CloudWatch | Native if on AWS, managed | AWS lock-in, cost at scale | $-$$ |
Recommendation: Grafana Loki. It pairs well with Prometheus, which is already under consideration, and is lightweight enough for edge deployment.
Next Steps:
1. Evaluate Loki resource requirements on Pi
2. Prototype Loki + Grafana setup
3. Test log ingestion from edge collector
Decision Owner: TBD
Target Date: Before Phase 8
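For the ingestion test in the next steps above, a minimal sketch of pushing log lines to Loki's HTTP push endpoint (`/loki/api/v1/push`) might look like the following. The URL, label names, and label values are assumptions for illustration; nothing here reflects a deployed configuration.

```python
import json
import time
import urllib.request

# Assumption: a prototype Loki instance on its default port; adjust as needed.
LOKI_PUSH_URL = "http://localhost:3100/loki/api/v1/push"


def build_push_payload(labels, lines, now_ns=None):
    """Build a Loki push body: one stream, one entry per log line."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [<unix epoch in nanoseconds, as a string>, <log line>].
    # Timestamps are offset by 1 ns per line so entries keep their order.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


def push_logs(labels, lines, url=LOKI_PUSH_URL, timeout=5):
    """POST log lines to Loki and return the HTTP status code."""
    body = json.dumps(build_push_payload(labels, lines)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical labels for an edge collector; no network call is made here.
payload = build_push_payload(
    {"job": "edge-collector", "host": "edge-01"},
    ["collector started", "uploaded 12 images"],
    now_ns=1_700_000_000_000_000_000,
)
```

Calling `push_logs(...)` with the same arguments against the prototype would exercise the full path; keeping the label set small is generally advisable with Loki, since each unique label combination creates a separate stream.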
OD-002: Backup Target Host
| Field | Value |
|---|---|
| Status | 🟡 Pending |
| Priority | High |
| Blocking | Phase 7 (Backup implementation) |
| Source | backup-disaster-recovery.md |
| Created | 2025-12-29 |
Context: We need to determine where backups will be stored. This choice affects hardware procurement and security posture.
Options:
| Option | Pros | Cons | Cost |
|---|---|---|---|
| Dedicated backup server | Full control, fast restore | Hardware cost, maintenance | $500-2000 |
| NAS device (Synology/QNAP) | Easy setup, RAID built-in | Limited performance | $500-1500 |
| Cloud storage | No hardware, off-site | PII concerns, egress costs | $50-200/month |
| Off-site co-location | Disaster recovery, off-site | Complexity, ongoing cost | $100-500/month |
Recommendation: NAS device initially (simple, cost-effective), with cloud backup for critical data pending security approval (OD-003).
Dependencies:
- OD-003 (Cloud backup decision) affects this decision
- Budget approval required
Next Steps:
1. Get budget approval for NAS hardware
2. Evaluate Synology vs QNAP options
3. Size storage for 1 year of images + 30 days of DB backups
Decision Owner: TBD
Target Date: Before Phase 7
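The sizing step above (1 year of images plus 30 days of DB backups) reduces to simple arithmetic. The sketch below shows one way to estimate it; the image volume, average image size, DB backup size, one-full-backup-per-day schedule, and 25% headroom are all hypothetical inputs, not measured values from this project.

```python
def backup_storage_gb(images_per_day, avg_image_mb, db_backup_gb,
                      image_days=365, db_backup_days=30, headroom=0.25):
    """Estimate usable capacity in GB for retained images plus DB backups."""
    # Image retention: daily volume times retention window, MB -> GB (decimal).
    image_gb = images_per_day * avg_image_mb * image_days / 1000
    # Assumption: one full DB backup is kept per day of the retention window.
    db_gb = db_backup_gb * db_backup_days
    # Headroom covers growth and filesystem/RAID overhead (assumed fraction).
    return (image_gb + db_gb) * (1 + headroom)


# Hypothetical inputs: 5,000 images/day at 0.5 MB each, 2 GB per DB backup.
estimate = backup_storage_gb(images_per_day=5000, avg_image_mb=0.5, db_backup_gb=2)
print(f"Estimated capacity: {estimate:.1f} GB")
```

With these placeholder numbers the estimate comes to roughly 1.2 TB of usable space; replacing them with observed POC figures would give a figure suitable for comparing NAS models.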
OD-003: Cloud Backup for Images (S3/Glacier)
| Field | Value |
|---|---|
| Status | 🔴 Blocked |
| Priority | Medium |
| Blocking | Off-site disaster recovery |
| Source | backup-disaster-recovery.md |
| Created | 2025-12-29 |
Context: Cloud backup (AWS S3/Glacier) would provide off-site disaster recovery at low cost, but requires security approval due to PII in images (license plates, vehicle photos).
Options:
| Option | Pros | Cons | PII Risk |
|---|---|---|---|
| AWS S3 | Reliable, scalable | Cost, data leaves premises | High |
| AWS Glacier | Very cheap long-term | Slow retrieval | High |
| No cloud backup | No PII concerns | No off-site DR | None |
| Encrypted cloud | Off-site + protected | Key management complexity | Medium |
Blocker: Security team review is required before storing PII in the cloud:
- [ ] Security team review of PII implications
- [ ] Legal review of data residency requirements
- [ ] Compliance check (if regulations apply)
- [ ] Encryption key management plan
- [ ] Access audit procedures
- [ ] Incident response plan update
Next Steps:
1. Schedule security review meeting
2. Prepare PII data flow documentation
3. Draft encryption key management proposal
Decision Owner: Security Team
Target Date: TBD (blocked on security review)
OD-004: MinIO Replication Strategy
| Field | Value |
|---|---|
| Status | 🟡 Pending |
| Priority | Low |
| Blocking | None (can use mc mirror initially) |
| Source | backup-disaster-recovery.md |
| Created | 2025-12-29 |
Context: We need to decide between simple backup (mc mirror) and real-time replication for MinIO image storage.
Options:
| Option | Pros | Cons | Use Case |
|---|---|---|---|
| mc mirror (daily) | Simple, low resource | Up to 24h data loss | Basic backup |
| MinIO Replication | Real-time, automatic failover | Requires second MinIO | High availability |
| Hybrid | Best of both | More complex | Production recommendation |
Recommendation: Start with mc mirror for POC; evaluate MinIO replication post-POC if HA requirements emerge.
Next Steps:
1. Implement mc mirror for POC
2. Monitor image volume and backup times
3. Revisit after POC based on actual requirements
Decision Owner: TBD
Target Date: Post-POC
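As a starting point for the POC step above, a daily `mc mirror` run could be wrapped in a small script and triggered by a scheduler such as cron. This is a sketch only: the `primary` and `backup` aliases and the `images` bucket name are hypothetical and would need to match aliases configured with `mc alias set`.

```python
import subprocess


def mirror_command(source, target, overwrite=True):
    """Build the `mc mirror` argument list for a one-way bucket copy."""
    command = ["mc", "mirror"]
    if overwrite:
        # Replace objects on the target that differ from the source.
        command.append("--overwrite")
    command += [source, target]
    return command


def run_daily_mirror(source, target):
    """Run the mirror once; raises CalledProcessError if mc exits non-zero."""
    subprocess.run(mirror_command(source, target), check=True)


# Hypothetical mc aliases and bucket names; nothing is executed here.
command = mirror_command("primary/images", "backup/images")
print(" ".join(command))
```

Because the mirror runs once a day, anything written since the last successful run is unprotected, which is the "up to 24h data loss" trade-off noted in the options table. Timing each run would also feed the "monitor backup times" step.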
Recently Resolved Decisions
Move decisions here when resolved, with date and outcome.
| ID | Decision | Outcome | Date |
|---|---|---|---|
| - | - | - | - |
Decision Record Template
When adding a new open decision, use this template:
### OD-XXX: [Decision Title]
| Field | Value |
|-------|-------|
| **Status** | 🟡 Pending |
| **Priority** | High/Medium/Low |
| **Blocking** | [What is blocked by this decision] |
| **Source** | [Link to source document] |
| **Created** | YYYY-MM-DD |
**Context:**
[Why this decision is needed]
**Options:**
[Table of options with pros/cons]
**Recommendation:** [If applicable]
**Dependencies:** [Other decisions this depends on]
**Next Steps:**
1. [Action item]
2. [Action item]
**Decision Owner:** [Person/team responsible]
**Target Date:** [When decision is needed by]
Cross-Reference
| Document | Open Decision Section |
|---|---|
| ARCHITECTURE.md | Section 8.3: Open Decisions |
| observability.md | Log Aggregation Platform |
| backup-disaster-recovery.md | Open Decisions table |
Document Maintainer: Architecture Team
Review Cycle: Weekly during active development