Introduction
Accessing hardware in embedded systems is notoriously tricky. Beyond the raw registers and memory-mapped I/O, developers often face additional complexities: compression, encoding, DMA orchestration, concurrency, power management, and vendor-specific quirks. Without discipline, code quickly devolves into spaghetti tied tightly to silicon details.
Enter the Hardware Proxy Pattern: a structured design approach that creates a dedicated software proxy layer between the rest of the system and the hardware device. This proxy encapsulates access, transformation, and resource management in a way that makes code more portable, testable, and reliable.
When working with sustainable, low-power embedded systems, following established design patterns is especially valuable. These patterns simplify hardware/software interactions, reduce power consumption, and make the system more maintainable. The Hardware Proxy is just one of several key patterns. Others include:
- Hardware Adapter Pattern – Adapt between mismatched interfaces.
- Mediator Pattern – Coordinate multiple modules without tangled dependencies.
- Observer Pattern – Distribute sensor data efficiently to subscribers.
- Debouncing Pattern – Filter noisy signals like button presses.
- Interrupt Pattern – Respond quickly and efficiently to urgent events.
- Polling Pattern – Periodically check sensors when interrupts aren’t practical.
This post focuses on the Hardware Proxy Pattern. Future posts in this series will cover the other patterns with detailed explanations, C/C++ code examples, power optimization tips, and real-world use cases.
What is the Hardware Proxy Pattern?
A Hardware Proxy is a software element responsible for all access to a hardware block. It provides a stable, high-level API to clients, while internally handling:
- Register access and low-level protocols
- Data transformations (compression, encoding, framing)
- Power and clock management
- Concurrency control and request serialization
- Error handling, retries, and recovery
- Capability discovery for multiple hardware variants
Instead of exposing hardware registers or DMA buffers directly, the proxy mediates all interactions, ensuring the rest of the system only deals with clean, abstracted operations.
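To make the idea concrete, here is a minimal sketch of what a proxy's public API might look like in C. All names (hw_proxy_t, hw_proxy_write(), hw_caps_t, and so on) are hypothetical; they only illustrate the shape of the interface, not any particular vendor's driver:

```c
/* Sketch of a proxy's public API. All names are illustrative. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    HW_OK = 0,
    HW_TIMEOUT,
    HW_ERR,
    HW_BAD_PARAM
} hw_status_t;

typedef struct hw_proxy hw_proxy_t;   /* opaque: registers, DMA, locks live inside */

/* Lifecycle: the proxy owns init, power-up/down, and shutdown. */
hw_status_t hw_proxy_init(hw_proxy_t *proxy);
hw_status_t hw_proxy_shutdown(hw_proxy_t *proxy);

/* High-level operations: clients never see registers or DMA buffers. */
hw_status_t hw_proxy_write(hw_proxy_t *proxy, const uint8_t *data, size_t len,
                           uint32_t timeout_ms);
hw_status_t hw_proxy_read(hw_proxy_t *proxy, uint8_t *buf, size_t len,
                          uint32_t timeout_ms);

/* Capability discovery for multiple hardware variants. */
typedef struct {
    uint32_t max_transfer_size;
    uint16_t hw_revision;
    bool     supports_compression;
} hw_caps_t;

hw_status_t hw_proxy_get_caps(const hw_proxy_t *proxy, hw_caps_t *caps);
```

Clients link only against this header-level surface; everything below it (registers, DMA descriptors, clock gating) stays private to the proxy's implementation file.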
Responsibilities
- Encapsulation – hide registers, interrupts, and vendor quirks.
- Transformation – perform pre/post-processing (e.g., compression, encryption, packetization).
- Lifecycle – own initialization, power-up/down, and shutdown.
- Concurrency – serialize access and enforce safe multi-client sharing.
- Error Recovery – retries, timeouts, and watchdog integration.
- Testing Hooks – provide mocks and simulations for unit testing.
When to Use It
The Hardware Proxy Pattern shines in cases where:
- Complex hardware pipelines exist (e.g., codecs, encryption engines).
- Multiple software clients compete for the same device.
- Transformations (compression, encoding, framing) are mandatory.
- Platform portability is required — one driver must run across multiple SoCs.
- Safety-critical or testable design is essential.
It is less necessary in trivial cases (e.g., a simple GPIO toggle), where a thin Hardware Abstraction Layer (HAL) may suffice.
Benefits
- Encapsulation of complexity – one point of truth for hardware access.
- Improved testability – mock backends replace real hardware.
- Centralized power/resource management – no duplicate clock toggles scattered in the code.
- Cleaner client code – clients issue high-level operations without worrying about low-level details.
- Portability – proxy hides vendor-specific details.
Drawbacks
- Extra indirection can add latency.
- Larger code size in very constrained environments.
- If poorly designed, the proxy can become a bottleneck or a “god object.”
Design Variants
1. Synchronous Proxy (Blocking API)
In this variant, the proxy exposes simple blocking functions. When a client calls the API, the proxy immediately performs any required data transformation, configures the hardware, and waits until the operation completes or times out. This is easiest to implement and use but ties up the calling thread until completion.
- Use Case: Simple transactions like reading a register or writing a short buffer.
- Pros: Straightforward logic, predictable flow.
- Cons: Can waste CPU cycles and increase energy use if waiting on slow hardware.
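A minimal sketch of the synchronous variant, assuming hypothetical platform hooks (hw_regs_start_write(), hw_regs_busy(), board_millis()) for starting a transfer, checking busy status, and reading a millisecond tick:

```c
/* Synchronous variant: a blocking write that polls until done or timed out. */
#include <stddef.h>
#include <stdint.h>

extern void     hw_regs_start_write(const uint8_t *d, size_t n);  /* hypothetical */
extern int      hw_regs_busy(void);            /* 1 while the block is busy */
extern uint32_t board_millis(void);            /* monotonic millisecond tick */

typedef enum { HW_OK = 0, HW_TIMEOUT, HW_BAD_PARAM } hw_status_t;

hw_status_t hw_proxy_write_blocking(const uint8_t *data, size_t len, uint32_t timeout_ms)
{
    if (data == NULL || len == 0) {
        return HW_BAD_PARAM;
    }

    hw_regs_start_write(data, len);            /* configure registers, kick transfer */

    uint32_t start = board_millis();
    while (hw_regs_busy()) {                   /* caller is tied up until completion */
        if (board_millis() - start > timeout_ms) {
            return HW_TIMEOUT;
        }
    }
    return HW_OK;
}
```

The busy-wait loop is exactly the energy cost noted above: the calling thread burns cycles until the hardware finishes or the timeout expires.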
2. Asynchronous Proxy (Callback / Event Driven)
Here, the proxy initiates the hardware transaction but returns immediately. Completion is reported via callbacks, events, or futures/promises. This avoids blocking the caller and allows the system to sleep or perform other work while the hardware is busy.
- Use Case: Long data transfers over UART, SPI, or radio.
- Pros: Efficient CPU and power usage, scales well to concurrent clients.
- Cons: More complex state management; requires careful concurrency handling.
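A hedged sketch of the asynchronous variant with a completion callback. A single outstanding request is assumed to keep it short, and hw_start_dma_tx() is a stand-in for whatever actually starts the transfer:

```c
/* Asynchronous variant: submit returns immediately; completion arrives via callback. */
#include <stddef.h>
#include <stdint.h>

typedef enum { HW_OK = 0, HW_TIMEOUT, HW_ERR, HW_BUSY } hw_status_t;

typedef void (*hw_done_cb)(hw_status_t result, void *user_ctx);

typedef struct {
    const uint8_t *data;
    size_t         len;
    hw_done_cb     on_done;     /* invoked from task context when the transfer ends */
    void          *user_ctx;
} hw_request_t;

extern int hw_start_dma_tx(const uint8_t *d, size_t n);   /* hypothetical HW hook */

static hw_request_t g_active;        /* single outstanding request, for brevity */
static int          g_in_flight;

hw_status_t hw_proxy_submit(const hw_request_t *req)
{
    if (g_in_flight) {
        return HW_BUSY;              /* a real design would queue instead */
    }
    g_active    = *req;
    g_in_flight = 1;
    return (hw_start_dma_tx(req->data, req->len) == 0) ? HW_OK : HW_ERR;
}

/* Called by the worker after the DMA-complete interrupt has been signalled. */
void hw_proxy_on_transfer_done(hw_status_t result)
{
    g_in_flight = 0;
    if (g_active.on_done != NULL) {
        g_active.on_done(result, g_active.user_ctx);
    }
}
```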
3. Command Queue + Worker Task
In this design, the proxy maintains a queue of pending requests. A dedicated worker task or thread dequeues items, applies transformations, and services the hardware. Interrupts only signal task wake-ups or enqueue events. This design is common in RTOS-based systems.
- Use Case: Multiple clients sharing the same peripheral.
- Pros: Centralized arbitration, avoids conflicts, keeps ISRs short.
- Cons: Extra RAM for queue; adds scheduling latency.
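The sketch below assumes a FreeRTOS target and shows a bounded queue serviced by one worker task; queue depth, task priority, and all hw_* names are illustrative:

```c
/* Command-queue variant sketched on FreeRTOS primitives (assumption: FreeRTOS target). */
#include "FreeRTOS.h"
#include "queue.h"
#include "task.h"
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t         len;
    void         (*on_done)(int result, void *ctx);
    void          *ctx;
} hw_cmd_t;

#define HW_CMD_QUEUE_DEPTH 8                 /* bounded: back-pressure instead of OOM */
static QueueHandle_t s_cmd_queue;

extern size_t hw_encode(const uint8_t *in, size_t len,
                        uint8_t *out, size_t cap);        /* hypothetical transform */
extern int    hw_do_transfer(const uint8_t *d, size_t n); /* hypothetical HW access  */

/* Clients (task context) enqueue; they never touch the hardware directly. */
int hw_proxy_enqueue(const hw_cmd_t *cmd)
{
    return (xQueueSend(s_cmd_queue, cmd, 0) == pdPASS) ? 0 : -1;   /* non-blocking */
}

/* One worker owns the peripheral; arbitration falls out of queue order. */
static void hw_worker_task(void *arg)
{
    (void)arg;
    static uint8_t scratch[256];             /* transform output buffer */
    hw_cmd_t cmd;
    for (;;) {
        if (xQueueReceive(s_cmd_queue, &cmd, portMAX_DELAY) == pdPASS) {
            size_t n = hw_encode(cmd.data, cmd.len, scratch, sizeof scratch);
            int result = hw_do_transfer(scratch, n);
            if (cmd.on_done != NULL) {
                cmd.on_done(result, cmd.ctx); /* callback in task context, not ISR */
            }
        }
    }
}

void hw_proxy_start(void)
{
    s_cmd_queue = xQueueCreate(HW_CMD_QUEUE_DEPTH, sizeof(hw_cmd_t));
    xTaskCreate(hw_worker_task, "hw_worker", 512, NULL, 3, NULL);
}
```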
4. Transform Strategy Plug-in
This variant separates hardware access from data transformation by allowing a pluggable strategy object or function table. The proxy can use different compression, encoding, or security algorithms without changing its own logic.
- Use Case: A sensor hub that may need different encoding schemes for different networks.
- Pros: High flexibility, easy testing of alternative algorithms.
- Cons: Slight overhead of indirection; requires clear interface contracts.
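One way to sketch this is a small struct of function pointers that the worker calls before touching the hardware. The names are illustrative, and the identity transform shown is simply a test-friendly placeholder:

```c
/* Transform-strategy plug-in: swap encodings without touching hardware-access code. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *name;
    /* Returns bytes written to out, or a negative error code. */
    int (*encode)(const uint8_t *in, size_t in_len, uint8_t *out, size_t out_cap);
} hw_transform_t;

static const hw_transform_t *s_active_transform;

void hw_proxy_set_transform(const hw_transform_t *t)
{
    s_active_transform = t;
}

/* Called by the worker before starting the hardware transfer. */
int hw_proxy_apply_transform(const uint8_t *in, size_t len, uint8_t *out, size_t cap)
{
    if (s_active_transform == NULL || s_active_transform->encode == NULL) {
        if (len > cap) return -1;
        memcpy(out, in, len);                /* no strategy registered: pass through */
        return (int)len;
    }
    return s_active_transform->encode(in, len, out, cap);
}

/* Example strategy: trivial pass-through, useful as a unit-test stand-in. */
static int identity_encode(const uint8_t *in, size_t in_len, uint8_t *out, size_t cap)
{
    if (in_len > cap) return -1;
    memcpy(out, in, in_len);
    return (int)in_len;
}

const hw_transform_t hw_transform_identity = { "identity", identity_encode };
```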
5. Capability Discovery
The proxy exposes a standardized way to query supported features (e.g., max transfer size, supported encodings, hardware version). This allows higher layers to adapt dynamically.
- Use Case: Supporting multiple revisions of hardware where features may differ.
- Pros: Improves portability and forward compatibility; reduces hardcoding.
- Cons: Slightly more code; must ensure capability queries remain accurate and efficient.
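A minimal sketch of a capability query, assuming a hypothetical ID register whose bit layout encodes revision and feature flags:

```c
/* Capability discovery: higher layers query limits instead of hardcoding them. */
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t hw_revision;          /* e.g., 1, 2, 3 for successive silicon spins */
    uint32_t max_transfer_size;    /* bytes per DMA transaction */
    bool     supports_compression;
    bool     supports_aes;
} hw_caps_t;

extern uint32_t hw_read_id_register(void);       /* hypothetical register read */

/* Filled once at init; afterwards the struct is read-only and safe to share. */
void hw_proxy_query_caps(hw_caps_t *caps)
{
    uint32_t id = hw_read_id_register();
    caps->hw_revision          = (uint16_t)(id & 0xFFu);
    caps->max_transfer_size    = (caps->hw_revision >= 2u) ? 4096u : 1024u;
    caps->supports_compression = (id & (1u << 8)) != 0u;
    caps->supports_aes         = (id & (1u << 9)) != 0u;
}
```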
Concurrency & ISR Rules
- Keep ISRs short: acknowledge interrupt, enqueue event, exit.
- Proxy APIs must document context: ISR-safe vs task-only.
- Use lock-free queues: SPSC ring buffers for ISR → task communication (see the sketch after this list).
- Avoid blocking in ISR context.
- Worker tasks handle transformations and callbacks.
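As referenced above, a minimal single-producer/single-consumer ring buffer might look like this on a single-core MCU; multicore or out-of-order targets would need atomics or memory barriers, and all names are illustrative:

```c
/* SPSC ring buffer: the ISR is the only producer, one worker task the only consumer. */
#include <stdbool.h>
#include <stdint.h>

#define EVT_RING_SIZE 16u                       /* must be a power of two */

typedef struct {
    uint32_t events[EVT_RING_SIZE];
    volatile uint32_t head;                     /* written only by the ISR  */
    volatile uint32_t tail;                     /* written only by the task */
} evt_ring_t;

static evt_ring_t s_ring;

/* ISR side: acknowledge, enqueue, exit; no processing here. */
bool evt_ring_push_from_isr(uint32_t evt)
{
    uint32_t head = s_ring.head;
    if (head - s_ring.tail >= EVT_RING_SIZE) {
        return false;                           /* full: drop and count elsewhere */
    }
    s_ring.events[head & (EVT_RING_SIZE - 1u)] = evt;
    s_ring.head = head + 1u;
    return true;
}

/* Task side: drain events and do the heavy lifting outside interrupt context. */
bool evt_ring_pop(uint32_t *evt)
{
    uint32_t tail = s_ring.tail;
    if (tail == s_ring.head) {
        return false;                           /* empty */
    }
    *evt = s_ring.events[tail & (EVT_RING_SIZE - 1u)];
    s_ring.tail = tail + 1u;
    return true;
}
```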
Error Handling
- Explicit error codes (TIMEOUT, HW_ERR, BAD_PARAM).
- Retries with backoff for transient failures (e.g., I²C NAKs), sketched after this list.
- Timeouts for blocking APIs.
- Integration with watchdog timers.
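Putting the first two bullets together, here is a hedged sketch of explicit status codes plus a bounded retry-with-backoff loop for transient I²C NAKs; i2c_write_once() and delay_ms() are hypothetical helpers:

```c
/* Explicit error codes plus bounded retry with exponential backoff. */
#include <stddef.h>
#include <stdint.h>

typedef enum {
    HW_OK = 0,
    HW_TIMEOUT,
    HW_ERR,          /* persistent hardware fault */
    HW_BAD_PARAM,
    HW_NAK           /* transient: worth retrying */
} hw_status_t;

extern hw_status_t i2c_write_once(uint8_t addr, const uint8_t *data, size_t len);
extern void        delay_ms(uint32_t ms);

hw_status_t i2c_write_with_retry(uint8_t addr, const uint8_t *data, size_t len)
{
    uint32_t backoff_ms = 1;

    for (int attempt = 0; attempt < 5; ++attempt) {
        hw_status_t st = i2c_write_once(addr, data, len);
        if (st != HW_NAK) {
            return st;               /* success, or an error retries will not fix */
        }
        delay_ms(backoff_ms);        /* give the bus/peripheral time to recover */
        backoff_ms *= 2;             /* exponential backoff: 1, 2, 4, 8 ms */
    }
    return HW_ERR;                   /* transient failure persisted too long */
}
```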
Testing & Mockability
- Backend interface: swap the hardware backend for a mock in unit tests (see the sketch after this list).
- Simulated backends: loopback or host-simulated devices.
- Fault injection: simulate timeouts, corrupted frames, bus errors.
- CI integration: run driver logic on host without hardware.
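Swapping backends is easiest when the proxy only talks to the hardware through a small function-pointer struct. The loopback mock below is a typical host-side stand-in; all names are illustrative:

```c
/* Backend interface sketch: real hardware and mocks implement the same struct. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    int (*transfer)(const uint8_t *tx, uint8_t *rx, size_t len);
    int (*power_on)(void);
    int (*power_off)(void);
} hw_backend_t;

static const hw_backend_t *s_backend;

void hw_proxy_set_backend(const hw_backend_t *backend) { s_backend = backend; }

int hw_proxy_transfer(const uint8_t *tx, uint8_t *rx, size_t len)
{
    if (s_backend == NULL || s_backend->transfer == NULL) {
        return -1;                               /* no backend installed */
    }
    return s_backend->transfer(tx, rx, len);
}

/* Host-side mock: loopback transfer, no real hardware involved. */
static int mock_transfer(const uint8_t *tx, uint8_t *rx, size_t len)
{
    if (rx != NULL) memcpy(rx, tx, len);         /* echo back what was sent */
    return 0;
}
static int mock_power_on(void)  { return 0; }
static int mock_power_off(void) { return 0; }

const hw_backend_t hw_backend_mock = { mock_transfer, mock_power_on, mock_power_off };

/* In a unit test: hw_proxy_set_backend(&hw_backend_mock); then exercise the proxy. */
```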
Anti-Patterns
- Heavy computation in ISR context.
- Leaking hardware details beyond the proxy.
- Unbounded queues with potential memory exhaustion.
- Priority inversion (high-priority task waiting on low-priority worker).
Example Workflow (Asynchronous Proxy)
- Caller submits a request via hw_proxy_submit().
- Proxy validates and applies transformation.
- Request is enqueued.
- Worker task dequeues and starts DMA transfer.
- DMA ISR signals completion.
- Worker finalizes and invokes callback in task context.
Caller -> Proxy.submit() -> Enqueue -> Worker Task
Worker -> Transform.encode() -> Start HW DMA
DMA ISR -> Signal Worker
Worker -> Callback to User
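From the caller's side, the same workflow might look like the following, reusing the hypothetical hw_request_t / hw_proxy_submit() API from the asynchronous-variant sketch earlier:

```c
/* Caller-side view of the workflow above (hypothetical API from the async sketch). */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static void on_frame_sent(hw_status_t result, void *ctx)
{
    (void)ctx;
    /* Step 6: the worker invokes this in task context after the DMA ISR fired. */
    printf("frame done, status=%d\n", (int)result);
}

void send_sensor_frame(const uint8_t *frame, size_t len)
{
    hw_request_t req = {
        .data     = frame,
        .len      = len,
        .on_done  = on_frame_sent,
        .user_ctx = NULL,
    };

    /* Steps 1-3: validation, transformation, and enqueue happen inside the proxy. */
    if (hw_proxy_submit(&req) != HW_OK) {
        /* Queue full or hardware busy: back off or report upstream. */
    }
}
```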
Checklist for Implementation
- Public API with clear ISR/task context rules.
- Backend interface for real + mock hardware.
- Transform strategy plug-in system.
- Bounded command queue.
- Worker task for heavy lifting.
- Error handling with retries and timeouts.
- Power/resource management.
- Capability query API.
- Unit tests with mocks and simulations.
- Watchdog and health-check integration.
Scope of These Patterns
All of the design variants and approaches discussed above — synchronous proxy, asynchronous proxy, command queue with worker, transform plug-ins, and capability discovery — are specifically focused on how software should access and interact with hardware. They do not describe higher-level business logic or application state management, but instead concentrate solely on the safe, efficient, and portable use of physical peripherals. By treating these patterns as hardware-access mechanisms, you ensure that the rest of your system remains decoupled from low-level details while still taking advantage of optimized, sustainable interactions with the device.
Hardware Proxy: Design Variants
Variant | Purpose | When to use | API type | ISR-safe? | Concurrency model | Memory impact | Power impact | Pros |
---|---|---|---|---|---|---|---|---|
Synchronous Proxy (Blocking API) | Simple blocking access: caller waits for completion | Short/fast ops (register reads, short writes), simple systems | blocking call | No (not for ISR) | caller blocked; may use mutex if multi-client | Low | Less efficient (caller busy-waits) | Easy to implement & reason about |
Asynchronous Proxy (Callback / Event Driven) | Non-blocking ops; notify on completion | Long transfers (DMA, radio), multi-client systems | submit + callback/future/event | submit() can be ISR-safe if lock-free | worker + callbacks or futures | Medium | Efficient — allows sleep while waiting | Scales well; energy efficient |
Command Queue + Worker Task | Serialize multi-client access via a bounded queue & dedicated worker | RTOS systems, shared peripherals, throughput batching | enqueue command (non-blocking if queue bounded) | enqueue may be made ISR-safe (lock-free) | SPSC/SPMC queues, worker thread services HW | Medium → High (queue buffers) | Good (batching reduces wakeups) | Central arbitration; keeps ISRs tiny |
Transform Strategy Plug-in | Separate and swap data transforms (compression/encryption) | When multiple encoding strategies or runtime flexibility needed | register/plug transform object; worker calls transform | Transform usually not ISR-safe (heavy CPU) | worker executes transform before HW submit | Varies (depends on transform alg) | Transform may increase CPU energy but reduce transfer power | Highly flexible & testable |
Capability Discovery | Query hardware features & limits at runtime | Boards with multiple HW revisions; feature negotiation | get_capabilities() returning structured caps | Typically task context (read-only) | read-only; thread-safe access | Low | Neutral — enables power-aware decisions | Allows dynamic adaptation & forward compatibility |
Comparison Table: When to Use Each Hardware-Access Pattern
Pattern | Purpose | When to Use | Pros | Cons | Power Impact | Example |
---|---|---|---|---|---|---|
Hardware Proxy | Encapsulate all access & transforms for a hardware block | Complex peripherals (codecs, compressors), shared devices, centralizing power | Strong encapsulation, testable, unified resource mgmt | Extra indirection, latency, code size overhead | Saves power by batching operations and managing clocks centrally | Proxy around a hardware video codec that compresses frames before sending over radio |
Hardware Adapter | Translate between incompatible interfaces | Integrating mismatched modules or 3rd-party drivers | Enables reuse, decouples interfaces | Adds glue code, can hide inefficiency | Neutral, may enable more efficient interoperability | Adapter mapping a generic sensor API to a vendor-specific I²C driver |
Mediator | Coordinate interactions between multiple modules | When multiple subsystems (e.g., sensors, actuators) need orchestration | Reduces tangled dependencies, clear coordination | Single point of complexity, potential bottleneck | Can reduce power by aligning sensor reads and batching | Mediator task that schedules sensor sampling to align with radio wake-ups |
Observer | Publish/subscribe distribution of events | Multiple consumers need same data (e.g., sensor + logger + comms) | Efficient fan-out, decoupling | Must manage subscriber lifetimes & concurrency | Very power-efficient if event-driven instead of polling | Notifying multiple tasks of new accelerometer data without extra reads |
Debouncing | Filter noisy inputs before reporting stable events | Buttons, switches, mechanical contacts | Prevents false triggers, reduces wasted CPU work | Risk of tuning errors (too slow or too sensitive) | Saves power by avoiding spurious wakeups and repeated processing | Debouncing a GPIO pin connected to a push button |
Interrupt Pattern | Immediate response to urgent hardware events | Time-critical events, wake-on-event cases | Low latency, energy efficient while idle | ISRs must be short; complex concurrency handling | Extremely power-efficient (CPU sleeps until interrupt) | Waking MCU from deep sleep on motion sensor interrupt |
Polling Pattern | Periodically sample hardware state | When interrupts unavailable or too costly | Simple, predictable timing, easy to test | Wastes CPU if too frequent, high latency if too sparse | Power-hungry if poll rate too high; adjustable duty cycle mitigates | Polling a temperature sensor once every second |
Reference:
Design Patterns for Embedded Systems in C by Bruce Powel Douglass
Conclusion
The Hardware Proxy Pattern offers a disciplined way to manage hardware access in embedded systems. By centralizing access, transformations, and concurrency handling, it makes systems more robust, portable, and testable. While it introduces some indirection, the trade-off is well worth it for non-trivial hardware blocks.
In practice, you’ll want to choose between synchronous or asynchronous proxies, decide whether to use pluggable transforms, and carefully define context rules. Combined with strong testing and error-handling practices, the Hardware Proxy Pattern can significantly improve the reliability of your embedded systems.
This article is part of a series on Common Embedded Design Patterns for Sustainable Systems. In the upcoming posts, we’ll dive deeper into the Adapter, Mediator, Observer, Debouncing, Interrupt, and Polling patterns — complete with code examples, power optimization tips, and real-world use cases.