Gateway (Edge) Roadmap

This document outlines the roadmap for the Voltimax Edge Gateway software.

Reference: src/Voltimax.Edge.Gateway.Service/, src/Voltimax.Edge.Modbus/, src/Voltimax.Edge.Ocpp/


Phase G1: Strategy Expansion

Reference Implementation: NetZeroStrategy in src/Voltimax.Edge.Gateway.Service/Strategies/

Architecture:

  • Interface: IStrategy with ApplyStrategy(StrategyContext, CancellationToken)
  • Base class: AbstractStrategy (provides IMediator, GetBatteryName helper)
  • Factory: StrategyFactory with keyed DI registration
  • Enum: Strategy (None, NetZero, Discharge, Charge, PeakShaving, Manual)
| Strategy | Status | Priority | Description |
| --- | --- | --- | --- |
| ChargeStrategy | ✅ Complete | - | Charges to MaximumStateOfCharge at MaxChargeRateWatts |
| DischargeStrategy | ✅ Complete | - | Discharges to MinimumStateOfCharge at MaxDischargeRateWatts |
| NetZeroStrategy | ✅ Complete | - | PID-based net-zero optimization with rate limiting, quantization, integral decay |
| NullStrategy | ✅ Complete | - | No-op strategy for idle mode |
| PeakShavingStrategy | ❌ Not Started | High | Reduce grid peak demand by discharging during high-usage periods |
| TimeOfUseStrategy | ❌ Not Started | High | Optimize based on electricity tariff schedules (charge cheap, discharge expensive) |
| SelfConsumptionStrategy | ❌ Not Started | High | Maximize use of solar generation, minimize grid export |
| GridServicesStrategy | ❌ Not Started | Medium | Participate in grid frequency regulation and demand response |
| ArbitrageStrategy | ❌ Not Started | Medium | Buy low/sell high based on spot market prices (EPEX integration) |
| BackupPowerStrategy | ❌ Not Started | Medium | Maintain reserve SoC for outage protection |
| EVChargingStrategy | ❌ Not Started | Low | Coordinate battery with EV charging schedules |
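
The architecture above (one interface, a factory keyed by the Strategy enum, a no-op fallback) can be illustrated with a minimal sketch. It is written in Python purely for illustration; the gateway itself is C#, and the StrategyContext fields, the positive-means-charge sign convention, and the fallback to NullStrategy for unregistered keys are assumptions, not the actual implementation. The CancellationToken parameter is omitted.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Strategy(Enum):
    NONE = "None"
    NET_ZERO = "NetZero"
    DISCHARGE = "Discharge"
    CHARGE = "Charge"
    PEAK_SHAVING = "PeakShaving"
    MANUAL = "Manual"


@dataclass
class StrategyContext:
    # Hypothetical fields; the real StrategyContext is not described here.
    battery_name: str
    state_of_charge: float            # percent
    max_charge_rate_watts: float
    maximum_state_of_charge: float    # percent


class IStrategy(ABC):
    @abstractmethod
    def apply_strategy(self, context: StrategyContext) -> float:
        """Return a battery setpoint in watts (positive = charge, assumed)."""


class NullStrategy(IStrategy):
    def apply_strategy(self, context: StrategyContext) -> float:
        return 0.0  # idle


class ChargeStrategy(IStrategy):
    def apply_strategy(self, context: StrategyContext) -> float:
        # Stop charging once the configured maximum SoC is reached.
        if context.state_of_charge >= context.maximum_state_of_charge:
            return 0.0
        return context.max_charge_rate_watts


class StrategyFactory:
    """Keyed registration: one constructor per Strategy enum value."""

    def __init__(self) -> None:
        self._registrations: Dict[Strategy, Callable[[], IStrategy]] = {}

    def register(self, key: Strategy, constructor: Callable[[], IStrategy]) -> None:
        self._registrations[key] = constructor

    def create(self, key: Strategy) -> IStrategy:
        # Assumption: unregistered keys fall back to the no-op strategy.
        return self._registrations.get(key, NullStrategy)()
```

With this shape, adding a strategy from the table means writing one class and one `register` call; nothing that calls `create` has to change.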

Implementation Status:

Completed:

  • ✅ Core strategy framework with IStrategy, AbstractStrategy, StrategyFactory
  • ✅ NetZeroStrategy (258 lines) with:
    • PID controller (P, I, D gains configurable)
    • Deadband detection, rate limiting, register quantization
    • Integral decay to prevent windup
    • DecisionMetrics for observability
    • Cache-based last setpoint tracking (Redis, 5-min TTL)
  • ✅ Basic Charge/Discharge/Null strategies
  • ✅ Strategy execution via ApplyStrategyJob (NCronJob scheduled)
  • ✅ Command handlers: ApplyStrategyCommand, SetBatterySetpointCommand
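
To make the NetZeroStrategy control steps concrete, here is a minimal Python sketch of one control tick that combines the listed pieces: deadband, integral decay, PID correction, rate limiting, and quantization. It is not the 258-line C# implementation. The gains, thresholds, sign convention (grid import positive, charge setpoint positive), and the choice of a leaky integrator for the decay are all assumptions, and `PidState.last_setpoint` stands in for the cached last setpoint.

```python
from dataclasses import dataclass


@dataclass
class PidState:
    integral: float = 0.0
    previous_error: float = 0.0
    last_setpoint: float = 0.0  # stand-in for the cached last setpoint


def pid_step(state, grid_power_watts, dt_seconds, *, kp=0.5, ki=0.1, kd=0.0,
             deadband_watts=50.0, integral_decay=0.9,
             max_step_watts=500.0, quantum_watts=100.0):
    """Return the next battery setpoint in watts (positive = charge, assumed)."""
    # Target is zero net grid power; grid import is treated as positive.
    error = -grid_power_watts

    # Integral decay (leaky integrator): old error fades instead of winding up.
    state.integral *= integral_decay

    # Deadband: small errors leave the setpoint untouched.
    if abs(error) < deadband_watts:
        state.previous_error = error
        return state.last_setpoint

    state.integral += error * dt_seconds
    derivative = (error - state.previous_error) / dt_seconds
    state.previous_error = error
    correction = kp * error + ki * state.integral + kd * derivative

    # Rate limiting: cap how far the setpoint may move in one tick.
    step = max(-max_step_watts, min(correction, max_step_watts))

    # Quantization: snap to the register resolution of the device.
    setpoint = round((state.last_setpoint + step) / quantum_watts) * quantum_watts
    state.last_setpoint = setpoint
    return setpoint
```

With these example values, a sustained 1000 W import moves the setpoint by at most 500 W per tick, so the battery ramps toward discharge instead of jumping.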

Not Started:

  • ❌ 7 advanced strategies (PeakShaving, TimeOfUse, SelfConsumption, GridServices, Arbitrage, BackupPower, EVCharging)

Each strategy should follow NetZeroStrategy patterns:

  • Configuration class with validation (e.g., PeakShavingStrategyConfiguration)
  • Keyed DI registration in ServiceCollectionExtensions
  • DecisionMetrics for observability
  • Threshold-based or PID control logic (deterministic, no ML)
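
As an illustration of that pattern, the sketch below pairs a validated configuration class with deterministic threshold logic for a possible PeakShavingStrategy. It is Python for illustration only (the gateway is C#). The field names, the validation rules, and the returned reason string (a stand-in for DecisionMetrics) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PeakShavingStrategyConfiguration:
    # Hypothetical fields for illustration.
    peak_threshold_watts: float       # grid import above this is shaved
    max_discharge_rate_watts: float
    minimum_state_of_charge: float    # percent

    def validate(self) -> List[str]:
        """Return a list of validation errors (empty when valid)."""
        errors = []
        if self.peak_threshold_watts <= 0:
            errors.append("peak_threshold_watts must be positive")
        if self.max_discharge_rate_watts <= 0:
            errors.append("max_discharge_rate_watts must be positive")
        if not 0 <= self.minimum_state_of_charge <= 100:
            errors.append("minimum_state_of_charge must be within 0-100")
        return errors


def peak_shaving_decision(config: PeakShavingStrategyConfiguration,
                          grid_import_watts: float,
                          state_of_charge: float) -> Tuple[float, str]:
    """Return (discharge_watts, reason); the reason feeds observability."""
    if state_of_charge <= config.minimum_state_of_charge:
        return 0.0, "soc_at_minimum"
    excess = grid_import_watts - config.peak_threshold_watts
    if excess <= 0:
        return 0.0, "below_threshold"
    if excess > config.max_discharge_rate_watts:
        return config.max_discharge_rate_watts, "limited_by_max_rate"
    return excess, "shaving_peak"
```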

Success Criteria:

  • Each strategy reduces energy costs by measurable % vs baseline
  • Strategies honor battery safety limits (SoC, temperature, cycle life)
  • DecisionMetrics telemetry shows strategy reasoning in real-time

Dependencies:

  • TimeOfUseStrategy requires tariff configuration API (blocked by F1)
  • ArbitrageStrategy requires EPEX pricing (completed)
  • EVChargingStrategy requires OCPP charging station integration (completed)

Known Risks:

  • PID tuning for different battery chemistries may require field testing
  • Grid frequency regulation requires low-latency command path (<500ms)
  • Strategy conflicts possible if multiple enabled simultaneously

Phase G2: Protocol & Control Expansion

Reference: src/Voltimax.Edge.Modbus/, src/Voltimax.Edge.Ocpp/, src/Voltimax.Edge.Gateway.Service/Commands/

| Task | Status | Priority | Description |
| --- | --- | --- | --- |
| Modbus TCP/RTU protocol | ✅ Complete | - | Full Modbus implementation with connection pooling, resilience, OpenTelemetry |
| HTTP protocol support | ✅ Complete | - | HTTP client for devices like Peblar charge station |
| Device auto-discovery | ❌ Not Started | High | mDNS/DNS-SD and MQTT discovery for network devices (Homey, HA, IoT devices) |
| MQTT client support | ⚠️ Partial | High | MQTT 3.1.1/5.0 client for lightweight device integration |
| Modbus TCP discovery | ❌ Not Started | Medium | Network scan + SunSpec interrogation for inverters/batteries |
| Additional SunSpec models | ⚠️ Partial | Medium | Expand coverage (currently 60+ models, add remaining) |
| Local automation rules | ❌ Not Started | Medium | If-then rules evaluated locally (e.g., if SoC < 10%, stop discharge) |
| Offline operation mode | ⚠️ Partial | Medium | Continue strategy execution when cloud disconnected |
| Control prioritization | ❌ Not Started | Medium | Priority queue (safety > manual > scheduled > optimization) |
| Safety interlocks | ⚠️ Partial | High | Hard limits enforced regardless of commands (temperature, SoC, current) |
| BACnet protocol | ❌ Not Started | Low | BACnet/IP for building automation integration |
| Custom protocol SDK | ❌ Not Started | Low | Plugin system for user-defined protocol handlers |
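
The control prioritization task (safety > manual > scheduled > optimization) could be prototyped as a small priority queue. The Python sketch below is illustrative only. `CommandPriority` and `ControlQueue` are hypothetical names, and keeping first-in-first-out order within a priority level is an assumption.

```python
import heapq
import itertools
from enum import IntEnum


class CommandPriority(IntEnum):
    # Lower value = served first.
    SAFETY = 0
    MANUAL = 1
    SCHEDULED = 2
    OPTIMIZATION = 3


class ControlQueue:
    def __init__(self):
        self._heap = []
        # Monotonic counter keeps FIFO order within one priority level
        # and prevents heapq from comparing the command payloads.
        self._sequence = itertools.count()

    def push(self, priority, command):
        heapq.heappush(self._heap, (int(priority), next(self._sequence), command))

    def pop(self):
        priority, _, command = heapq.heappop(self._heap)
        return CommandPriority(priority), command

    def __len__(self):
        return len(self._heap)
```

A safety command pushed last is still popped first, which is the ordering guarantee the task asks for.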

Success Criteria:

  • MQTT client maintains stable connection with <1% packet loss
  • Safety interlocks prevent 100% of out-of-bounds commands in testing
  • Offline mode continues operation for 24+ hours without cloud

Dependencies:

  • Safety interlocks require complete battery specification database

Implementation Status:

Completed:

  • Modbus TCP/RTU (VoltimaxModbusClient) with:
    • Connection pooling with auto-reconnect (3 attempts, exponential backoff)
    • Resilience pipeline (retry, circuit breaker, 5s timeout)
    • OpenTelemetry instrumentation
    • Supported devices: CNTE Battery, Growatt Inverter, Peblar EV Charger, Autel MaxiCharger, Siemens PAC 2200, Eastron SDM 630, ABB B23
  • HTTP protocol for web-API devices (Peblar charge station)
  • Command execution engine (ApplyStrategyCommand, SetBatterySetpointCommand, SendControlSignalsCommand, RefreshConfigurationCommand)
  • Simulation support for all asset types (Battery, Solar, Grid, Charger, Building)
  • CLI tooling for asset management, testing, monitoring
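
The auto-reconnect behaviour listed for the Modbus client (3 attempts, exponential backoff) follows a common pattern, sketched below in Python for illustration. This is not the VoltimaxModbusClient code. The 1-second base delay, the doubling factor, and the injectable `sleep` parameter are assumptions.

```python
import time


def connect_with_backoff(connect, *, attempts=3, base_delay_seconds=1.0,
                         sleep=time.sleep):
    """Call connect() up to `attempts` times, doubling the wait each time."""
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as error:
            last_error = error
            # No wait after the final attempt; the error is re-raised below.
            if attempt < attempts - 1:
                sleep(base_delay_seconds * (2 ** attempt))
    raise last_error
```

Passing `sleep` as a parameter keeps the function testable without real delays.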

In Progress:

  • ⚠️ MQTT options (MqttOptions.cs) configured but no active client implementation
  • ⚠️ Offline operation: Caching layer exists (Redis), config persisted locally, but no explicit offline fallback logic
  • ⚠️ Safety interlocks: Battery min/max SoC checks in strategies, emergency stop interface exists, but missing comprehensive hard limits at protocol layer
  • ⚠️ SunSpec models: Model classes generated, full coverage not verified
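
The safety-interlock gap noted above (no comprehensive hard limits at the protocol layer) amounts to a final check applied to every setpoint just before it is written to the device. A minimal Python sketch, for illustration only: `BatteryLimits` and its fields are hypothetical, power limits in watts stand in for the current limits mentioned in the task table, and positive watts are assumed to mean charging.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatteryLimits:
    # Hypothetical per-battery specification values.
    min_soc_percent: float
    max_soc_percent: float
    max_temperature_celsius: float
    max_charge_watts: float
    max_discharge_watts: float


def enforce_interlocks(limits, requested_watts, soc_percent, temperature_celsius):
    """Return the setpoint that is safe to send (positive = charge, assumed)."""
    # Over-temperature blocks all power flow, whatever was requested.
    if temperature_celsius >= limits.max_temperature_celsius:
        return 0.0
    # A full battery may not charge; an empty one may not discharge.
    if requested_watts > 0 and soc_percent >= limits.max_soc_percent:
        return 0.0
    if requested_watts < 0 and soc_percent <= limits.min_soc_percent:
        return 0.0
    # Otherwise clamp the request into the allowed power window.
    return max(-limits.max_discharge_watts,
               min(requested_watts, limits.max_charge_watts))
```

Because the check sits below the strategies and command handlers, it applies equally to manual, scheduled, and optimization commands.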

Not Started:

  • ❌ Device auto-discovery (mDNS/DNS-SD, MQTT discovery)
  • ❌ MQTT client integration (publish/subscribe handlers)
  • ❌ Modbus TCP discovery (network scan, SunSpec interrogation)
  • ❌ Local automation rules engine
  • ❌ Control prioritization queue
  • ❌ BACnet protocol
  • ❌ Custom protocol SDK/plugin system
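
For the local automation rules engine, the if-then example from the task table ("if SoC < 10%, stop discharge") suggests a data-driven rule format. The Python sketch below is one possible shape, not a design decision. The `Rule` fields, metric names, and action strings are hypothetical.

```python
import operator
from dataclasses import dataclass

# Supported comparison operators, keyed by their textual form.
_OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


@dataclass(frozen=True)
class Rule:
    metric: str       # telemetry key, e.g. "soc_percent"
    op: str           # one of the keys in _OPERATORS
    threshold: float
    action: str       # name of the action to trigger


def evaluate_rules(rules, telemetry):
    """Return the actions of all rules whose condition currently holds."""
    actions = []
    for rule in rules:
        value = telemetry.get(rule.metric)
        # Rules on metrics that are missing from this snapshot are skipped.
        if value is not None and _OPERATORS[rule.op](value, rule.threshold):
            actions.append(rule.action)
    return actions
```

Since evaluation needs only the latest telemetry snapshot, rules of this form can run on the gateway without a cloud connection.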

Known Risks:

  • MQTT at scale (1000+ gateways) may require dedicated broker infrastructure
  • BACnet/IP requires Windows compatibility testing (currently Linux-focused)
  • Custom protocol SDK increases support burden

Phase G3: Fleet Management & Operations

Reference: src/Voltimax.Edge.Gateway.Service/, infra/bicep/, tools/Voltimax.Iot.Tools/

| Task | Status | Priority | Description |
| --- | --- | --- | --- |
| Gateway>Cloud telemetry | ✅ Complete | - | Azure Service Bus telemetry publishing operational |
| Cloud>Gateway commands | ✅ Complete | - | Polymorphic session queue with request/reply (PingGateway, SetBatterySetpoint, RefreshConfiguration) |
| Gateway deployment automation | ⚠️ Partial | High | Shell scripts for build, SSH transfer, systemd installation, and restart |
| Configuration sync (push-based) | ✅ Complete | - | RefreshConfigurationRequest via Service Bus triggers immediate config reload |

Success Criteria:

  • Cloud commands reach gateway within 2 seconds (Service Bus queue delivery)
  • Gateway deployments complete in <5 minutes via automated script
  • Configuration changes propagate to gateways within 10 seconds (push-based)

Dependencies:

  • Bidirectional communication requires Platform API command endpoints (implemented - see MessagingEndpoints.cs)
  • Deployment automation leverages existing SshConnectionManager.cs in tools

Implementation Status:

Completed:

  • Gateway>Cloud telemetry via Azure Service Bus:
    • TelemetryFrameServiceBusPublisherHandler publishes snapshots to Service Bus topic
    • HTTP fallback via TelemetryFrameHttpPublisherHandler
    • Telemetry includes gatewayId, assetName, timestamp, metrics
  • Cloud>Gateway bidirectional commands via polymorphic session queue:
    • Single gateway-commands queue with PolymorphicHandlerDescriptor routing
    • PingGatewayRequest / PingGatewayResponse - health check and latency measurement
    • SetBatterySetpointRequest / SetBatterySetpointResponse - remote battery control (mode + watts)
    • RefreshConfigurationRequest / RefreshConfigurationResponse - trigger immediate config reload
    • Message contracts shared in Voltimax.Iot.Contracts.Messaging
    • Platform API endpoints: POST /api/messaging/gateways/{gatewayId}/ping, .../batteries/{name}/setpoint, .../refresh-configuration
    • Edge handlers dispatch to Mediator commands; results returned via request/reply
  • Gateway build system:
    • Platform targets: linux-arm64, win-x64
    • Self-contained builds supported
    • Systemd service support (Program.cs: AddSystemd())
    • Windows Service support (Program.cs: AddWindowsService())
  • CLI service management (ServiceInstallCommand, ServiceStartCommand, ServiceStopCommand, ServiceUninstallCommand)
  • SSH tooling (SshConnectionManager.cs in tools/Voltimax.Iot.Tools)

In Progress:

  • ⚠️ Configuration polling (UpdateConfigurationJob) every 5 minutes:
    • Polls IGatewayConfigurationApi.GetLocationConfigurationByIdAsync()
    • SHA256 hash comparison detects changes
    • Persists to assets.json locally
    • Publishes ConfigurationUpdatedNotification with RequiresRestart flag
    • Push-based reload is covered separately by RefreshConfigurationRequest (see Completed)
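
The change-detection step in the polling job (hash the polled configuration, compare with the last hash, persist only on change) can be sketched as follows. This is illustrative Python, not UpdateConfigurationJob itself. Canonical JSON serialization before hashing and the `persist` callback are assumptions.

```python
import hashlib
import json


def configuration_hash(configuration):
    """SHA-256 over a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(configuration, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def apply_polled_configuration(current_hash, polled_configuration, persist):
    """Persist only when the hash changed; return (new_hash, changed)."""
    new_hash = configuration_hash(polled_configuration)
    if new_hash == current_hash:
        return current_hash, False
    persist(polled_configuration)  # e.g. write assets.json
    return new_hash, True
```

Hashing a canonical form avoids false positives when the API returns the same configuration with a different key order.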

Not Started:

  • ❌ Deployment automation scripts: tools/deployment/ does not yet contain shell scripts for:
    • Build (dotnet publish -r linux-arm64 -c Release --self-contained)
    • Transfer (SCP binary to gateway)
    • Install (systemd service setup)
    • Restart (systemctl restart voltimax-gateway)

Current State:

  • Gateway>Cloud: Azure Service Bus telemetry publishing operational
  • Cloud>Gateway: Polymorphic session queue (gateway-commands) with request/reply pattern
    • Available commands: PingGatewayRequest, SetBatterySetpointRequest, RefreshConfigurationRequest
    • Message contracts in Voltimax.Iot.Contracts.Messaging (shared between cloud and edge)
    • Platform API endpoints in MessagingEndpoints.cs (POST /api/messaging/gateways/{gatewayId}/...)
    • Edge handlers dispatch to Mediator commands (SetBatterySetpointCommand, RefreshConfigurationCommand)
  • Gateway polls Platform API every 5 minutes for config updates (UpdateConfigurationJob)
  • Push-based config refresh available via RefreshConfigurationRequest Service Bus command
  • Gateway builds for linux-arm64, win-x64 with systemd/Windows Service support
  • SshConnectionManager.cs exists in tools/Voltimax.Iot.Tools for SSH operations

Bidirectional Communication Approach:

  • Phase 1 (implemented): Single polymorphic session queue (gateway-commands) for all cloud>gateway commands
    • Gateway accepts its session (gatewayId) on the shared queue
    • PolymorphicHandlerDescriptor routes by MessageType application property to per-type handlers
    • Request/reply via session-based correlation through cloud-replies queue
    • Supports: ping, battery setpoints, configuration refresh (extensible via MapMessage<T>)
    • Cost: ~$0.05/million operations, acceptable for <100 gateways
  • Phase 2 (future): Migrate to MQTT when gateway count > 50-100
    • Lower latency (<500ms), better scalability
    • Requires MQTT broker (Mosquitto self-hosted or Azure Event Grid MQTT)
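
The Phase 1 routing idea (one shared queue, dispatch on the MessageType application property, reply correlated by session) is sketched below in Python for illustration. It does not reproduce PolymorphicHandlerDescriptor or the Service Bus SDK. `MessageRouter`, the reply dictionary shape, and the "unsupported" status are hypothetical.

```python
class MessageRouter:
    """Routes messages to per-type handlers based on a MessageType property."""

    def __init__(self):
        self._handlers = {}

    def map_message(self, message_type, handler):
        # One handler per message type; later registrations replace earlier ones.
        self._handlers[message_type] = handler

    def dispatch(self, session_id, application_properties, body):
        message_type = application_properties.get("MessageType")
        handler = self._handlers.get(message_type)
        if handler is None:
            # Unknown types get an explicit reply instead of being dropped.
            return {"session_id": session_id, "status": "unsupported",
                    "message_type": message_type}
        # The session id travels with the reply so the caller can correlate it.
        return {"session_id": session_id, "status": "ok", "body": handler(body)}
```

Supporting a new command then means adding one `map_message` call, while the queue and the dispatch logic stay unchanged.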

Deployment Automation:

  • Build script: dotnet publish -r linux-arm64 -c Release --self-contained
  • Transfer: SCP binary to gateway via existing SshConnectionManager
  • Install: Systemd service setup and configuration
  • Restart: systemctl restart voltimax-gateway
  • Deployment scripts location: tools/deployment/
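
As a starting point for those scripts, the build, transfer, and restart steps can be expressed as a command plan that is reviewed before anything runs. The Python sketch below only builds the command lists. The user name, directories, and the use of plain `scp`/`ssh` rather than SshConnectionManager are assumptions, and the systemd install step is left out because the unit file is not described here.

```python
import shlex


def deployment_commands(host, *, user="voltimax", publish_dir="publish",
                        remote_dir="/opt/voltimax-gateway",
                        service="voltimax-gateway"):
    """Return the deployment steps as argument lists (nothing is executed)."""
    target = f"{user}@{host}"
    return [
        # Build a self-contained binary for the gateway hardware.
        ["dotnet", "publish", "-r", "linux-arm64", "-c", "Release",
         "--self-contained", "-o", publish_dir],
        # Copy the publish output to the gateway.
        ["scp", "-r", publish_dir, f"{target}:{remote_dir}"],
        # Restart the service so the new binary is picked up.
        ["ssh", target, f"sudo systemctl restart {service}"],
    ]


def render_plan(commands):
    """Render the plan as shell-quoted lines for a dry-run review."""
    return [shlex.join(command) for command in commands]
```

Each list can be passed to `subprocess.run(command, check=True)` once the plan has been reviewed.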

Architecture Evolution:

  • OTA updates, batch provisioning, HA mode, and edge analytics are deferred until fleet scale requires them

Related ADRs:

  • ADR-001: MQTT/HTTP over Azure IoT Hub for vendor independence
  • ADR-005: Service Bus queues for command delivery (short-term approach)

Known Risks:

  • Service Bus session-based command delivery becomes expensive at scale (>100 gateways)
  • SSH-based deployment requires gateway network accessibility
  • Configuration sync at scale may require rate limiting on Platform API