Gateway (Edge) Roadmap
This document outlines the roadmap for the Voltimax Edge Gateway software.
Reference: src/Voltimax.Edge.Gateway.Service/, src/Voltimax.Edge.Modbus/, src/Voltimax.Edge.Ocpp/
Phase G1: Strategy Expansion
Reference Implementation: NetZeroStrategy in src/Voltimax.Edge.Gateway.Service/Strategies/
Architecture:
- Interface: IStrategy with ApplyStrategy(StrategyContext, CancellationToken)
- Base class: AbstractStrategy (provides IMediator, GetBatteryName helper)
- Factory: StrategyFactory with keyed DI registration
- Enum: Strategy (None, NetZero, Discharge, Charge, PeakShaving, Manual)
| Strategy | Status | Priority | Description |
|---|---|---|---|
| ChargeStrategy | ✅ Complete | - | Charges to MaximumStateOfCharge at MaxChargeRateWatts |
| DischargeStrategy | ✅ Complete | - | Discharges down to MinimumStateOfCharge at MaxDischargeRateWatts |
| NetZeroStrategy | ✅ Complete | - | PID-based net-zero optimization with rate limiting, quantization, integral decay |
| NullStrategy | ✅ Complete | - | No-op strategy for idle mode |
| PeakShavingStrategy | ❌ Not Started | High | Reduce grid peak demand by discharging during high-usage periods |
| TimeOfUseStrategy | ❌ Not Started | High | Optimize based on electricity tariff schedules (charge cheap, discharge expensive) |
| SelfConsumptionStrategy | ❌ Not Started | High | Maximize use of solar generation, minimize grid export |
| GridServicesStrategy | ❌ Not Started | Medium | Participate in grid frequency regulation and demand response |
| ArbitrageStrategy | ❌ Not Started | Medium | Buy low/sell high based on spot market prices (EPEX integration) |
| BackupPowerStrategy | ❌ Not Started | Medium | Maintain reserve SoC for outage protection |
| EVChargingStrategy | ❌ Not Started | Low | Coordinate battery with EV charging schedules |
Implementation Status:
Completed:
- ✅ Core strategy framework with IStrategy, AbstractStrategy, StrategyFactory
- ✅ NetZeroStrategy (258 lines) with:
  - PID controller (P, I, D gains configurable)
  - Deadband detection, rate limiting, register quantization
  - Integral decay to prevent windup
  - DecisionMetrics for observability
  - Cache-based last setpoint tracking (Redis, 5-min TTL)
- ✅ Basic Charge/Discharge/Null strategies
- ✅ Strategy execution via ApplyStrategyJob (NCronJob scheduled)
- ✅ Command handlers: ApplyStrategyCommand, SetBatterySetpointCommand
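
The control elements listed for NetZeroStrategy (deadband, integral decay, rate limiting, register quantization) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the gateway's C# implementation. The names `PidState` and `pid_step`, the gain values, the sign convention (positive error means grid import, positive setpoint means discharge), and the incremental update form are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PidState:
    """Hypothetical controller state carried between strategy runs."""
    integral: float = 0.0
    last_error: float = 0.0
    last_setpoint_w: float = 0.0


def pid_step(error_w, state, dt_s, *, kp=0.5, ki=0.05, kd=0.0,
             deadband_w=50.0, max_step_w=500.0, quantum_w=100.0, decay=0.9):
    """Return the next battery setpoint in watts (assumed: positive = discharge).

    error_w is the measured grid power (assumed: positive = import), so the
    net-zero target corresponds to error_w == 0.
    """
    # Deadband: ignore small errors, but still let the integral decay.
    if abs(error_w) < deadband_w:
        state.integral *= decay
        state.last_error = error_w
        return state.last_setpoint_w

    # Integral decay: old error contributions fade, which limits windup.
    state.integral = state.integral * decay + error_w * dt_s
    derivative = (error_w - state.last_error) / dt_s
    state.last_error = error_w

    correction = kp * error_w + ki * state.integral + kd * derivative

    # Rate limiting: cap how far the setpoint may move in one step.
    step = max(-max_step_w, min(max_step_w, correction))
    target = state.last_setpoint_w + step

    # Quantization: snap to the register resolution of the device.
    state.last_setpoint_w = round(target / quantum_w) * quantum_w
    return state.last_setpoint_w
```

In this sketch the last setpoint lives in `PidState`; the roadmap describes the real strategy as tracking it in a Redis cache with a 5-minute TTL.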
Not Started:
- ❌ 7 advanced strategies (PeakShaving, TimeOfUse, SelfConsumption, GridServices, Arbitrage, BackupPower, EVCharging)
Each strategy should follow NetZeroStrategy patterns:
- Configuration class with validation (e.g., PeakShavingStrategyConfiguration)
- Keyed DI registration in ServiceCollectionExtensions
- DecisionMetrics for observability
- Threshold-based or PID control logic (deterministic, no ML)
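
As an illustration of the pattern above (a validated configuration object plus deterministic threshold logic), here is a minimal peak-shaving sketch in Python. It is not taken from the codebase: `PeakShavingConfig`, its fields, and `peak_shaving_setpoint` are hypothetical names, and the input is assumed to be the site's grid import before any battery contribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeakShavingConfig:
    """Hypothetical configuration; field names are illustrative only."""
    peak_threshold_w: float   # grid import above this level gets shaved
    max_discharge_w: float    # upper bound on battery discharge power
    min_soc_percent: float    # do not discharge at or below this SoC

    def validate(self):
        """Return a list of human-readable validation errors (empty if valid)."""
        errors = []
        if self.peak_threshold_w <= 0:
            errors.append("peak_threshold_w must be positive")
        if self.max_discharge_w <= 0:
            errors.append("max_discharge_w must be positive")
        if not 0 <= self.min_soc_percent <= 100:
            errors.append("min_soc_percent must be between 0 and 100")
        return errors


def peak_shaving_setpoint(grid_import_w, soc_percent, cfg):
    """Return the discharge power in watts needed to hold import at the threshold."""
    if soc_percent <= cfg.min_soc_percent:
        return 0.0  # battery reserve exhausted, stop shaving
    excess_w = grid_import_w - cfg.peak_threshold_w
    if excess_w <= 0:
        return 0.0  # below the peak threshold, nothing to shave
    return min(excess_w, cfg.max_discharge_w)
```

Keeping the decision a pure function of its inputs makes it straightforward to record the inputs and the result as decision metrics.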
Success Criteria:
- Each strategy reduces energy costs by a measurable percentage versus a baseline
- Strategies honor battery safety limits (SoC, temperature, cycle life)
- DecisionMetrics telemetry shows strategy reasoning in real-time
Dependencies:
- TimeOfUseStrategy requires tariff configuration API (blocks F1)
- ArbitrageStrategy requires EPEX pricing (completed)
- EVChargingStrategy requires OCPP charging station integration (completed)
Known Risks:
- PID tuning for different battery chemistries may require field testing
- Grid frequency regulation requires low-latency command path (<500ms)
- Strategy conflicts possible if multiple enabled simultaneously
Phase G2: Protocol & Control Expansion
Reference: src/Voltimax.Edge.Modbus/, src/Voltimax.Edge.Ocpp/, src/Voltimax.Edge.Gateway.Service/Commands/
| Task | Status | Priority | Description |
|---|---|---|---|
| Modbus TCP/RTU protocol | ✅ Complete | - | Full Modbus implementation with connection pooling, resilience, OpenTelemetry |
| HTTP protocol support | ✅ Complete | - | HTTP client for devices like Peblar charge station |
| Device auto-discovery | ❌ Not Started | High | mDNS/DNS-SD and MQTT discovery for network devices (Homey, HA, IoT devices) |
| MQTT client support | ⚠️ Partial | High | MQTT 3.1.1/5.0 client for lightweight device integration |
| Modbus TCP discovery | ❌ Not Started | Medium | Network scan + SunSpec interrogation for inverters/batteries |
| Additional SunSpec models | ⚠️ Partial | Medium | Expand coverage (currently 60+ models, add remaining) |
| Local automation rules | ❌ Not Started | Medium | If-then rules evaluated locally (e.g., if SoC < 10%, stop discharge) |
| Offline operation mode | ⚠️ Partial | Medium | Continue strategy execution when cloud disconnected |
| Control prioritization | ❌ Not Started | Medium | Priority queue (safety > manual > scheduled > optimization) |
| Safety interlocks | ⚠️ Partial | High | Hard limits enforced regardless of commands (temperature, SoC, current) |
| BACnet protocol | ❌ Not Started | Low | BACnet/IP for building automation integration |
| Custom protocol SDK | ❌ Not Started | Low | Plugin system for user-defined protocol handlers |
Success Criteria:
- MQTT client maintains stable connection with <1% packet loss
- Safety interlocks prevent 100% of out-of-bounds commands in testing
- Offline mode continues operation for 24+ hours without cloud
Dependencies:
- Safety interlocks require complete battery specification database
Implementation Status:
Completed:
- ✅ Modbus TCP/RTU (VoltimaxModbusClient) with:
  - Connection pooling with auto-reconnect (3 attempts, exponential backoff)
  - Resilience pipeline (retry, circuit breaker, 5s timeout)
  - OpenTelemetry instrumentation
  - Supported devices: CNTE Battery, Growatt Inverter, Peblar EV Charger, Autel MaxiCharger, Siemens PAC 2200, Eastron SDM 630, ABB B23
- ✅ HTTP protocol for web-API devices (Peblar charge station)
- ✅ Command execution engine (ApplyStrategyCommand, SetBatterySetpointCommand, SendControlSignalsCommand, RefreshConfigurationCommand)
- ✅ Simulation support for all asset types (Battery, Solar, Grid, Charger, Building)
- ✅ CLI tooling for asset management, testing, monitoring
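
The auto-reconnect behaviour noted for the Modbus client (3 attempts, exponential backoff) follows a common pattern that can be sketched as below. This is an illustrative Python sketch, not the VoltimaxModbusClient code: `connect_with_backoff`, the base delay, and the injectable `sleep` parameter are assumptions.

```python
import time


def connect_with_backoff(connect, attempts=3, base_delay_s=0.5, sleep=time.sleep):
    """Call connect() up to `attempts` times, doubling the wait after each failure.

    `connect` is any zero-argument callable that raises ConnectionError on failure.
    `sleep` is injectable so the delays can be observed in tests.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            # No wait after the final attempt; the error is re-raised instead.
            if attempt < attempts - 1:
                sleep(base_delay_s * 2 ** attempt)
    raise last_error
```

With the defaults, the waits between attempts are 0.5 s and 1.0 s. A production resilience pipeline would typically add a circuit breaker and a per-call timeout on top of this.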
In Progress:
- ⚠️ MQTT options (MqttOptions.cs) configured but no active client implementation
- ⚠️ Offline operation: Caching layer exists (Redis), config persisted locally, but no explicit offline fallback logic
- ⚠️ Safety interlocks: Battery min/max SoC checks in strategies, emergency stop interface exists, but missing comprehensive hard limits at protocol layer
- ⚠️ SunSpec models: Model classes generated, full coverage not verified
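
To make the missing piece of the safety interlocks concrete, the sketch below shows one possible shape for a hard-limit check applied to every setpoint just before it is written to a device, independent of which strategy or command produced it. It is a Python illustration under assumptions: `BatteryLimits`, its fields, the sign convention (positive = charge, negative = discharge), and the choice to force 0 W on a violation are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatteryLimits:
    """Hypothetical per-battery hard limits (illustrative field names)."""
    min_soc_percent: float
    max_soc_percent: float
    max_charge_w: float
    max_discharge_w: float
    max_temp_c: float


def enforce_limits(requested_w, soc_percent, temp_c, limits):
    """Return a setpoint in watts that never violates the hard limits.

    Assumed convention: positive = charge, negative = discharge.
    """
    if temp_c >= limits.max_temp_c:
        return 0.0  # over temperature: neither charge nor discharge
    if requested_w > 0 and soc_percent >= limits.max_soc_percent:
        return 0.0  # already full: refuse further charging
    if requested_w < 0 and soc_percent <= limits.min_soc_percent:
        return 0.0  # at reserve: refuse further discharging
    # Otherwise clamp the power into the allowed charge/discharge window.
    return max(-limits.max_discharge_w, min(limits.max_charge_w, requested_w))
```

Placing a check like this at the protocol layer means a bug in a strategy cannot bypass it.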
Not Started:
- ❌ Device auto-discovery (mDNS/DNS-SD, MQTT discovery)
- ❌ MQTT client integration (publish/subscribe handlers)
- ❌ Modbus TCP discovery (network scan, SunSpec interrogation)
- ❌ Local automation rules engine
- ❌ Control prioritization queue
- ❌ BACnet protocol
- ❌ Custom protocol SDK/plugin system
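
The planned control prioritization (safety > manual > scheduled > optimization) maps naturally onto a priority queue. The following Python sketch shows one way it could work; `Priority` and `CommandQueue` are hypothetical names, and first-in-first-out ordering within a priority level is an assumption rather than a stated requirement.

```python
import heapq
from enum import IntEnum
from itertools import count


class Priority(IntEnum):
    """Lower value = served first, matching safety > manual > scheduled > optimization."""
    SAFETY = 0
    MANUAL = 1
    SCHEDULED = 2
    OPTIMIZATION = 3


class CommandQueue:
    """Priority queue that keeps insertion order within the same priority."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker so equal priorities stay FIFO

    def push(self, priority, command):
        heapq.heappush(self._heap, (priority, next(self._seq), command))

    def pop(self):
        priority, _, command = heapq.heappop(self._heap)
        return priority, command

    def __len__(self):
        return len(self._heap)
```

A usage example: pushing an optimization setpoint, then a manual override, then a safety stop yields the safety stop first when popping.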
Known Risks:
- MQTT at scale (1000+ gateways) may require dedicated broker infrastructure
- BACnet/IP requires Windows compatibility testing (currently Linux-focused)
- Custom protocol SDK increases support burden
Phase G3: Fleet Management & Operations
Reference: src/Voltimax.Edge.Gateway.Service/, infra/bicep/, tools/Voltimax.Iot.Tools/
| Task | Status | Priority | Description |
|---|---|---|---|
| Gateway>Cloud telemetry | ✅ Complete | - | Azure Service Bus telemetry publishing operational |
| Cloud>Gateway commands | ✅ Complete | - | Polymorphic session queue with request/reply (PingGateway, SetBatterySetpoint, RefreshConfiguration) |
| Gateway deployment automation | ⚠️ Partial | High | Shell scripts for build, SSH transfer, systemd installation, and restart |
| Configuration sync (push-based) | ✅ Complete | - | RefreshConfigurationRequest via Service Bus triggers immediate config reload |
Success Criteria:
- Cloud commands reach gateway within 2 seconds (Service Bus queue delivery)
- Gateway deployments complete in <5 minutes via automated script
- Configuration changes propagate to gateways within 10 seconds (push-based)
Dependencies:
- Bidirectional communication requires Platform API command endpoints (implemented; see MessagingEndpoints.cs)
- Deployment automation leverages the existing SshConnectionManager.cs in tools
Implementation Status:
Completed:
- ✅ Gateway>Cloud telemetry via Azure Service Bus:
  - TelemetryFrameServiceBusPublisherHandler publishes snapshots to Service Bus topic
  - HTTP fallback via TelemetryFrameHttpPublisherHandler
  - Telemetry includes gatewayId, assetName, timestamp, metrics
- ✅ Cloud>Gateway bidirectional commands via polymorphic session queue:
  - Single gateway-commands queue with PolymorphicHandlerDescriptor routing
  - PingGatewayRequest/PingGatewayResponse - health check and latency measurement
  - SetBatterySetpointRequest/SetBatterySetpointResponse - remote battery control (mode + watts)
  - RefreshConfigurationRequest/RefreshConfigurationResponse - trigger immediate config reload
  - Message contracts shared in Voltimax.Iot.Contracts.Messaging
  - Platform API endpoints: POST /api/messaging/gateways/{gatewayId}/ping, .../batteries/{name}/setpoint, .../refresh-configuration
  - Edge handlers dispatch to Mediator commands; results returned via request/reply
- ✅ Gateway build system:
  - Platform targets: linux-arm64, win-x64
  - Self-contained builds supported
  - Systemd service support (Program.cs:AddSystemd())
  - Windows Service support (Program.cs:AddWindowsService())
- ✅ CLI service management (ServiceInstallCommand, ServiceStartCommand, ServiceStopCommand, ServiceUninstallCommand)
- ✅ SSH tooling (SshConnectionManager.cs in tools/Voltimax.Iot.Tools)
In Progress:
- ⚠️ Configuration polling (UpdateConfigurationJob) every 5 minutes:
  - Polls IGatewayConfigurationApi.GetLocationConfigurationByIdAsync()
  - SHA256 hash comparison detects changes
  - Persists to assets.json locally
  - Publishes ConfigurationUpdatedNotification with RequiresRestart flag
  - Missing: Push-based mechanism
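
The hash-based change detection used by configuration polling can be sketched as follows. This Python illustration assumes the configuration is available as a JSON-serializable object and that it is canonicalized (sorted keys, compact separators) before hashing; the canonicalization step and the names `config_hash` and `has_changed` are assumptions, not details confirmed by the roadmap.

```python
import hashlib
import json


def config_hash(config):
    """Return the SHA-256 hex digest of a canonical JSON form of `config`."""
    # Sorting keys makes the hash independent of key order in the payload.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous_hash, new_config):
    """Return (changed, new_hash) so the caller can persist the new hash."""
    new_hash = config_hash(new_config)
    return new_hash != previous_hash, new_hash
```

A poller would persist the configuration and publish a notification only when `changed` is true, storing `new_hash` for the next comparison.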
Not Started:
- ❌ Deployment automation scripts: No shell scripts in tools/deployment/ for:
  - Build (dotnet publish -r linux-arm64 -c Release --self-contained)
  - Transfer (SCP binary to gateway)
  - Install (systemd service setup)
  - Restart (systemctl restart voltimax-gateway)
Current State:
- Gateway>Cloud: Azure Service Bus telemetry publishing operational
- Cloud>Gateway: Polymorphic session queue (gateway-commands) with request/reply pattern
  - Available commands: PingGatewayRequest, SetBatterySetpointRequest, RefreshConfigurationRequest
  - Message contracts in Voltimax.Iot.Contracts.Messaging (shared between cloud and edge)
  - Platform API endpoints in MessagingEndpoints.cs (POST /api/messaging/gateways/{gatewayId}/...)
  - Edge handlers dispatch to Mediator commands (SetBatterySetpointCommand, RefreshConfigurationCommand)
- Gateway polls Platform API every 5 minutes for config updates (UpdateConfigurationJob)
- Push-based config refresh available via RefreshConfigurationRequest Service Bus command
- Gateway builds for linux-arm64, win-x64 with systemd/Windows Service support
- SshConnectionManager.cs exists in tools/Voltimax.Iot.Tools for SSH operations
Bidirectional Communication Approach:
- Phase 1 (implemented): Single polymorphic session queue (gateway-commands) for all cloud>gateway commands
  - Gateway accepts its session (gatewayId) on the shared queue
  - PolymorphicHandlerDescriptor routes by MessageType application property to per-type handlers
  - Request/reply via session-based correlation through cloud-replies queue
  - Supports: ping, battery setpoints, configuration refresh (extensible via MapMessage<T>)
  - Cost: ~$0.05/million operations, acceptable for <100 gateways
- Phase 2 (future): Migrate to MQTT when gateway count > 50-100
  - Lower latency (<500ms), better scalability
  - Requires MQTT broker (Mosquitto self-hosted or Azure Event Grid MQTT)
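
The Phase 1 routing idea, one shared queue whose messages are dispatched to per-type handlers based on a MessageType application property, can be sketched in a few lines. This Python illustration is not the PolymorphicHandlerDescriptor implementation: `PolymorphicRouter`, `map_message`, and `dispatch` are hypothetical names, and Service Bus sessions and reply queues are deliberately left out.

```python
class PolymorphicRouter:
    """Dispatch messages to handlers keyed by their MessageType property."""

    def __init__(self):
        self._handlers = {}

    def map_message(self, message_type, handler):
        """Register a handler for one message type (one registration per type)."""
        self._handlers[message_type] = handler

    def dispatch(self, properties, body):
        """Look up the handler from the application properties and invoke it."""
        message_type = properties.get("MessageType")
        handler = self._handlers.get(message_type)
        if handler is None:
            raise LookupError(f"no handler registered for {message_type!r}")
        # The handler's return value stands in for the reply message.
        return handler(body)
```

Adding a new command then amounts to one more `map_message` call, which mirrors the extensibility point the roadmap attributes to MapMessage<T>.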
Deployment Automation:
- Build script: dotnet publish -r linux-arm64 -c Release --self-contained
- Transfer: SCP binary to gateway via existing SshConnectionManager
- Install: Systemd service setup and configuration
- Restart: systemctl restart voltimax-gateway
- Deployment scripts location: tools/deployment/
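
The four deployment steps can be expressed as an ordered command sequence. The Python sketch below only assembles the commands and does not run them. It is an illustration under assumptions: the function name, the host, the directories, and the use of the plain `scp` and `ssh` command-line tools (rather than SshConnectionManager) are hypothetical, and the install step assumes a systemd unit file is already present on the gateway.

```python
def deployment_commands(host, publish_dir, remote_dir, service="voltimax-gateway"):
    """Return the build, transfer, install, and restart commands as argument lists."""
    return [
        # Build: self-contained publish for the gateway's target runtime.
        ["dotnet", "publish", "-r", "linux-arm64", "-c", "Release",
         "--self-contained", "-o", publish_dir],
        # Transfer: copy the publish output to the gateway.
        ["scp", "-r", publish_dir, f"{host}:{remote_dir}"],
        # Install: enable the service (assumes the unit file already exists).
        ["ssh", host, "sudo", "systemctl", "enable", service],
        # Restart: pick up the new binaries.
        ["ssh", host, "sudo", "systemctl", "restart", service],
    ]
```

Each argument list could be passed to `subprocess.run(cmd, check=True)` so that a failed step aborts the deployment.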
Architecture Evolution:
- OTA updates, batch provisioning, HA mode, and edge analytics deferred for scale
Related ADRs:
- ADR-001: MQTT/HTTP over Azure IoT Hub for vendor independence
- ADR-005: Service Bus queues for command delivery (short-term approach)
Known Risks:
- Service Bus queue per gateway becomes expensive at scale (>100 gateways)
- SSH-based deployment requires gateway network accessibility
- Configuration sync at scale may require rate limiting on Platform API