Embedded engineers who understand both real-time firmware architecture and scalable IoT system design will define the next generation of connected products. STM32 + FreeRTOS + MQTT is a professional skill stack that makes this possible.
Industrial controllers, smart energy meters, predictive maintenance nodes, remote telemetry gateways, and factory automation systems all share one requirement: reliable, low-overhead cloud connectivity. MQTT has become the standard protocol for embedded IoT communication, and for good reason. Its publish/subscribe architecture, minimal overhead, and support for Quality-of-Service guarantees make it uniquely suited to the constraints of embedded firmware.
This guide builds a complete, production-oriented IoT architecture using STM32 as the real-time controller, FreeRTOS for deterministic task scheduling, ESP32 or ESP8266 as a dedicated Wi-Fi coprocessor, and MQTT as the cloud communication protocol. This is not a beginner tutorial about blinking LEDs over Wi-Fi. Every design decision here reflects what real embedded products require: task isolation, queue-driven data flow, DMA-based UART communication, robust reconnection handling, memory discipline, and proper security foundations.
Why MQTT Dominates Embedded IoT in 2026
MQTT (Message Queuing Telemetry Transport) was created in 1999 by Andy Stanford-Clark of IBM and Arlen Nipper of Arcom to enable minimal battery consumption and bandwidth usage when transmitting data from oil pipelines via satellite. The protocol was designed around five core requirements: simple implementation, Quality of Service data delivery, lightweight and bandwidth-efficient operation, data agnosticism, and continuous session awareness. These requirements still define why MQTT is the right choice for constrained embedded systems today.
MQTT uses a binary message format over a persistent TCP connection, in contrast to HTTP, which is text-based and stateless. A minimal MQTT 3.1.1 CONNECT packet (empty client ID) is 14 bytes. A QoS 0 PUBLISH packet carrying 20 bytes of sensor data adds a 2-byte fixed header, a 2-byte topic length, and the topic name itself, so overhead stays in the tens of bytes even with a descriptive topic. HTTP typically requires hundreds of bytes of headers for the same transmission. On a battery-powered device sending readings every 10 seconds over a cellular link, that difference in radio-on time can account for a large share of the energy budget.
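The size arithmetic above can be checked with a few lines of C. This is a minimal sketch, assuming MQTT 3.1.1 framing (1 byte of packet type and flags, a 1 to 4 byte Remaining Length, a 2-byte topic length, the topic, an optional 2-byte packet identifier for QoS 1 and 2, then the payload). The function names are illustrative and not part of any MQTT library.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode the MQTT "Remaining Length" field
 * (variable-length integer, 7 bits per byte, 1 to 4 bytes). */
static size_t mqtt_remaining_length_bytes(uint32_t remaining)
{
    size_t n = 1;
    while (remaining > 127u) {
        remaining >>= 7;
        n++;
    }
    return n;
}

/* Total on-wire size of an MQTT 3.1.1 PUBLISH packet.
 * QoS 1 and QoS 2 carry an extra 2-byte packet identifier. */
size_t mqtt_publish_size(size_t topic_len, size_t payload_len, int qos)
{
    uint32_t remaining = (uint32_t)(2u + topic_len +
                                    (qos > 0 ? 2u : 0u) + payload_len);
    return 1u + mqtt_remaining_length_bytes(remaining) + remaining;
}
```

For the 29-character topic factory/line1/node1/telemetry and a 20-byte payload at QoS 0, this gives 53 bytes on the wire, of which 33 are overhead and 29 of those are the topic name. Shorter topics are the cheapest way to shrink a packet.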
MQTT 5, ratified by OASIS in March 2019, added features critical for production IoT: reason codes on all acknowledgments, session and message expiry intervals, user properties for metadata, shared subscriptions for load balancing, and request/response patterns. For embedded systems connecting to modern cloud platforms like AWS IoT Core and Azure IoT Hub, MQTT 5 support is increasingly important.
MQTT vs HTTP: Why the Difference Matters for Firmware
| Feature | HTTP/REST | MQTT |
|---|---|---|
| Communication Model | Request/Response | Publish/Subscribe |
| Persistent Connection | No (per-request) | Yes (long-lived TCP session with MQTT keep-alive pings) |
| Header Overhead | 200-800 bytes | 2-5 bytes |
| Real-Time Messaging | Requires polling | Push-based, instant |
| QoS Guarantees | None built-in | QoS 0, 1, 2 |
| Power Efficiency | Low | High |
| Bi-Directional Commands | Complex | Native |
| Embedded Suitability | Moderate | Excellent |
MQTT Quality of Service Levels
MQTT defines three Quality-of-Service levels that control delivery guarantees between clients and the broker. Understanding which level to use for each message type is a critical production firmware decision.
| QoS Level | Guarantee | Mechanism | Best Use Case |
|---|---|---|---|
| QoS 0 | At most once (fire and forget) | No acknowledgment | High-rate telemetry, sensor streams |
| QoS 1 | At least once | PUBACK required, retransmit on timeout | Commands, critical events, alarms |
| QoS 2 | Exactly once | 4-step PUBLISH/PUBREC/PUBREL/PUBCOMP exchange | Billing data, safety-critical messages |
In production embedded systems, use QoS 0 for high-rate telemetry where occasional message loss is acceptable and bandwidth conservation is critical. Use QoS 1 for commands, configuration updates, and alarm events where at-least-once delivery is required; because QoS 1 can deliver duplicates, handlers for these messages should be idempotent. QoS 2 is rarely needed in embedded work: its four-packet exchange adds significant latency and resource usage that is justified only for financial transactions or safety-critical commands where exactly-once semantics are mandated.
System Architecture: Separating Real-Time Control from Networking
The most important architectural decision in an STM32 + MQTT system is where to run the network stack. Many prototypes run LwIP or an MQTT client library directly on the STM32, tightly coupling real-time control with network behavior. This works in demos but fails in production systems because Wi-Fi stacks are inherently non-deterministic: TCP retransmissions, DHCP renewals, TLS handshakes, and Wi-Fi association events all introduce variable latency that can interfere with control loops, sensor acquisition timing, and safety-critical processing.
The professional solution is to use the ESP32 or ESP8266 as a dedicated network coprocessor, communicating with the STM32 exclusively over UART using a structured packet protocol. The STM32 handles all real-time work — sensor reading, control algorithms, motor drive, safety monitoring — and offloads all networking to the ESP32. This separation gives you three major production advantages.
First, real-time determinism is preserved. The STM32’s FreeRTOS scheduler never blocks waiting for TCP acknowledgments or Wi-Fi reconnection. Second, fault isolation improves dramatically — a TCP stack crash or Wi-Fi association failure on the ESP32 does not corrupt STM32 heap memory or starve FreeRTOS tasks. Third, the architecture becomes modular and upgradeable: replacing Wi-Fi with LTE, Ethernet, or LoRa requires only changes to the coprocessor firmware and UART packet handlers, not a redesign of the control firmware.
Recommended Hardware
| Component | Purpose | Recommended Parts |
|---|---|---|
| Main Controller | Real-time processing, FreeRTOS | STM32F407, STM32F746, STM32H743 |
| Wi-Fi Coprocessor | MQTT, TCP/IP, Wi-Fi stack | ESP32-WROOM-32, ESP8266 (ESP-12F) |
| UART Interface | STM32 to ESP32 communication | 115200 to 921600 baud, DMA-driven |
| Sensors | Data acquisition | I2C/SPI sensors (BME280, MPU6050, etc.) |
| MQTT Broker | Cloud message routing | AWS IoT Core, Azure IoT Hub, EMQX, HiveMQ |
FreeRTOS Task Architecture for Production MQTT Systems
The most common firmware architecture mistake in MQTT-connected embedded systems is putting everything into a single task or, worse, running network code from interrupt service routines. Professional systems use task isolation: each functional concern has its own dedicated FreeRTOS task with an appropriate priority, stack size, and communication channel. This produces firmware that is debuggable, testable, and fault-tolerant.
| Task Name | Priority | Stack (words) | Purpose |
|---|---|---|---|
| SensorTask | High (4) | 256 | Reads sensors, posts to sensor queue |
| ControlTask | High (4) | 512 | PID loops, actuator control, safety |
| MQTTPublishTask | Medium (3) | 512 | Consumes sensor queue, formats and sends MQTT publish |
| MQTTReceiveTask | Medium (3) | 512 | Receives MQTT messages from ESP32, routes commands |
| WiFiManagerTask | Medium (2) | 384 | Manages Wi-Fi and MQTT broker connection lifecycle |
| UARTDriverTask | High (4) | 256 | UART DMA ring buffer processing, packet framing |
| LoggerTask | Low (1) | 256 | Debug output, status logging over debug UART |
| WatchdogTask | Critical (5) | 128 | Task heartbeat monitoring, IWDG refresh |
Task priorities must be assigned with care. In FreeRTOS, a higher priority task always preempts a lower priority task when it becomes ready to run. Sensor and control tasks run at high priority because they are time-critical. The MQTT publish task runs at medium priority because network latency is acceptable — a sensor reading that waits 100 ms to be published is fine. The Watchdog task runs at the highest priority because it must always run even under system stress. The Logger task runs at the lowest priority so that debug output never delays real work.
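One way to keep the task table reviewable is to hold it as data and compute the RAM budget from it. The sketch below mirrors the table above; the struct and function names are illustrative, and it assumes a 32-bit port where one stack word is 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint8_t     priority;     /* higher number = higher priority */
    uint16_t    stack_words;  /* FreeRTOS stack depth in 32-bit words */
} task_cfg_t;

/* Same values as the task table in this guide */
static const task_cfg_t k_tasks[] = {
    { "SensorTask",      4, 256 },
    { "ControlTask",     4, 512 },
    { "MQTTPublishTask", 3, 512 },
    { "MQTTReceiveTask", 3, 512 },
    { "WiFiManagerTask", 2, 384 },
    { "UARTDriverTask",  4, 256 },
    { "LoggerTask",      1, 256 },
    { "WatchdogTask",    5, 128 },
};
#define TASK_CFG_COUNT (sizeof(k_tasks) / sizeof(k_tasks[0]))

/* Total stack RAM in bytes, assuming 4-byte stack words */
size_t total_stack_bytes(void)
{
    size_t words = 0;
    for (size_t i = 0; i < TASK_CFG_COUNT; i++) {
        words += k_tasks[i].stack_words;
    }
    return words * 4u;
}

/* Highest priority in use; configMAX_PRIORITIES must be greater than this */
uint8_t highest_priority(void)
{
    uint8_t max = 0;
    for (size_t i = 0; i < TASK_CFG_COUNT; i++) {
        if (k_tasks[i].priority > max) {
            max = k_tasks[i].priority;
        }
    }
    return max;
}
```

With these values the stacks alone take 11,264 bytes, before task control blocks and queue storage. Because the highest priority used is 5, configMAX_PRIORITIES has to be at least 6.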
Queue-Driven Data Flow Architecture
FreeRTOS queues are the correct mechanism for passing data between tasks. Queues operate by value copy: the sending task copies data into the queue, and the receiving task copies data out. Neither task holds a shared pointer to the data, which eliminates entire classes of race conditions and memory corruption bugs. The queue API performs its own internal synchronization using short critical sections, and the FromISR variants (such as xQueueSendFromISR) make the same queues safe to use from interrupt handlers.
The data flow in a production MQTT sensor system follows this pattern: sensors produce data into the sensor queue. The MQTT publish task consumes from the sensor queue and places formatted packets into the UART transmit queue. The UART driver sends packets to the ESP32. Incoming data from the ESP32 goes into the UART receive queue. The MQTT receive task consumes incoming messages and posts commands to the control task via a command queue. Time-critical producers never block on a full queue: they send with a zero or explicitly bounded timeout, and every queue has a defined depth that acts as a natural backpressure mechanism.
Sensor Data Structures and Queue Implementation
Define your sensor data structure as a fixed-size struct that can be safely copied by value through FreeRTOS queues. Avoid dynamically allocated fields — pointers in queue messages create ownership ambiguity and memory leak risks that are extremely difficult to debug in production.
/* sensor_types.h */
#include <stdint.h>
#include "FreeRTOS.h"
#include "queue.h" /* QueueHandle_t */
#define SENSOR_NODE_ID_LEN 16
#define MAX_TOPIC_LEN 64
typedef enum {
SENSOR_TYPE_TEMPERATURE = 0,
SENSOR_TYPE_HUMIDITY,
SENSOR_TYPE_PRESSURE,
SENSOR_TYPE_VIBRATION,
SENSOR_TYPE_CURRENT
} sensor_type_t;
typedef struct {
float temperature; /* degrees Celsius */
float humidity; /* percent RH */
float pressure; /* hPa */
float vibration_rms; /* m/s^2 RMS */
uint32_t timestamp_ms; /* HAL_GetTick() at acquisition */
uint8_t node_id[SENSOR_NODE_ID_LEN];
sensor_type_t type;
uint8_t sequence; /* rolling counter for loss detection */
} sensor_data_t;
/* Queue handles - defined in main.c, declared extern here */
extern QueueHandle_t xSensorQueue;
extern QueueHandle_t xMQTTPublishQueue;
extern QueueHandle_t xCommandQueue;
/* main.c - Queue creation at startup */
#define SENSOR_QUEUE_DEPTH 10
#define MQTT_PUB_QUEUE_DEPTH 8
#define COMMAND_QUEUE_DEPTH 4
QueueHandle_t xSensorQueue;
QueueHandle_t xMQTTPublishQueue;
QueueHandle_t xCommandQueue;
/* Create all queues before starting the scheduler */
void System_CreateQueues(void)
{
xSensorQueue = xQueueCreate(SENSOR_QUEUE_DEPTH,
sizeof(sensor_data_t));
configASSERT(xSensorQueue != NULL);
xMQTTPublishQueue = xQueueCreate(MQTT_PUB_QUEUE_DEPTH,
sizeof(sensor_data_t));
configASSERT(xMQTTPublishQueue != NULL);
xCommandQueue = xQueueCreate(COMMAND_QUEUE_DEPTH,
sizeof(uint32_t));
configASSERT(xCommandQueue != NULL);
}
Sensor Producer Task
/* sensor_task.c */
#include <string.h> /* memcpy */
#include "FreeRTOS.h"
#include "task.h"
#include "queue.h"
#include "sensor_types.h"
static uint8_t s_sequence = 0;
void SensorTask(void *argument)
{
sensor_data_t data;
TickType_t xLastWakeTime = xTaskGetTickCount();
const TickType_t xPeriod = pdMS_TO_TICKS(1000); /* 1 Hz */
/* Node identifier - set from flash/EEPROM in production */
const uint8_t node_id[SENSOR_NODE_ID_LEN] = "FACTORY-LINE1-01";
for (;;)
{
/* Read sensors - these are synchronous HAL calls */
data.temperature = BME280_ReadTemperature();
data.humidity = BME280_ReadHumidity();
data.pressure = BME280_ReadPressure();
data.vibration_rms = ADXL355_ReadVibrationRMS();
data.timestamp_ms = HAL_GetTick();
data.type = SENSOR_TYPE_TEMPERATURE;
data.sequence = s_sequence++;
memcpy(data.node_id, node_id, SENSOR_NODE_ID_LEN);
/* Post to queue - non-blocking (drop if queue full) */
if (xQueueSend(xSensorQueue, &data, 0) != pdTRUE)
{
/* Queue full - log overflow event */
Logger_LogWarning("SENSOR: Queue full, sample dropped");
}
/* Precise periodic timing using vTaskDelayUntil */
vTaskDelayUntil(&xLastWakeTime, xPeriod);
}
}
Using vTaskDelayUntil instead of vTaskDelay is important for periodic sensor tasks. vTaskDelay creates a delay from the current time, which means execution drift accumulates over time as task body execution consumes variable amounts of time. vTaskDelayUntil delays until an absolute tick count, producing precisely periodic behavior regardless of how long the task body takes to execute.
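The drift described above is easy to quantify with a host-side model. This sketch does not call FreeRTOS; it only reproduces the timing arithmetic of the two delay styles, and the function name and parameters are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tick count after `iterations` loop passes.
 * absolute == false models vTaskDelay(period): sleep a fixed time after the body.
 * absolute == true  models vTaskDelayUntil(&last_wake, period): sleep until
 * the next multiple of the period, independent of body duration. */
uint32_t elapsed_ticks(uint32_t iterations, uint32_t period,
                       uint32_t body, bool absolute)
{
    uint32_t now = 0;
    uint32_t last_wake = 0;

    for (uint32_t i = 0; i < iterations; i++) {
        now += body;                 /* task body executes */
        if (absolute) {
            last_wake += period;     /* next absolute wake time */
            if (now < last_wake) {
                now = last_wake;     /* sleep until that tick */
            }                        /* else: deadline already passed, no sleep */
        } else {
            now += period;           /* relative sleep adds to the body time */
        }
    }
    return now;
}
```

With a 1 ms tick, a 1000-tick period, and a 30-tick body, one hour of nominal samples (3600 iterations) takes 3,708,000 ticks with the relative delay and exactly 3,600,000 ticks with the absolute delay, so the relative version falls 108 seconds behind per hour.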
MQTT Publish Task
/* mqtt_publish_task.c */
#include <stdio.h> /* snprintf */
#include "FreeRTOS.h"
#include "task.h"
#include "queue.h"
#include "sensor_types.h"
#include "esp_uart.h"
#define MQTT_TOPIC_BASE "factory/line1/node1"
#define PAYLOAD_BUFFER_LEN 192
void MQTTPublishTask(void *argument)
{
sensor_data_t data;
char payload[PAYLOAD_BUFFER_LEN];
int payload_len;
for (;;)
{
/* Block indefinitely waiting for sensor data */
if (xQueueReceive(xSensorQueue, &data, portMAX_DELAY) == pdTRUE)
{
/* Format JSON payload - avoid dynamic allocation */
/* Note: %f needs float printf support (e.g. -u _printf_float with newlib-nano) */
payload_len = snprintf(
payload,
sizeof(payload),
"{\"seq\":%u,\"ts\":%lu,\"temp\":%.2f,"
"\"hum\":%.2f,\"pres\":%.1f,\"vib\":%.3f}",
(unsigned)data.sequence,
(unsigned long)data.timestamp_ms,
data.temperature,
data.humidity,
data.pressure,
data.vibration_rms
);
if (payload_len > 0 && payload_len < PAYLOAD_BUFFER_LEN)
{
/* Send to ESP32 via UART */
ESP_MQTT_Publish(
MQTT_TOPIC_BASE "/telemetry",
payload,
payload_len,
0 /* QoS 0 for high-rate telemetry */
);
}
}
}
}
/* Publish critical alarms with QoS 1 */
void MQTT_PublishAlarm(const char *alarm_type, uint32_t value)
{
char payload[96];
int len;
len = snprintf(payload, sizeof(payload),
"{\"alarm\":\"%s\",\"value\":%lu,\"ts\":%lu}",
alarm_type, (unsigned long)value,
(unsigned long)HAL_GetTick());
if (len > 0 && len < (int)sizeof(payload)) /* reject truncated output */
{
ESP_MQTT_Publish(
MQTT_TOPIC_BASE "/alarm",
payload,
len,
1 /* QoS 1 for guaranteed alarm delivery */
);
}
}
UART DMA Driver: The STM32-to-ESP32 Communication Layer
UART communication between STM32 and ESP32 must be implemented using DMA with idle-line detection. Polling UART byte-by-byte in a loop burns CPU cycles that belong to higher-priority tasks. Blocking UART transmit functions inside FreeRTOS tasks introduce unpredictable latency. The professional approach uses STM32’s UART idle-line interrupt combined with circular DMA reception to detect packet boundaries without polling, and DMA transmit to move data without blocking the CPU.
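The UART idle-line interrupt fires when the RX line has stayed idle for about one character time after receiving data, which normally coincides with the end of a packet. This removes the need for software timeouts to detect the end of a burst. Because a sender that pauses mid-packet also triggers the idle event, the receiver should still validate each frame by its length field and CRC rather than treating every idle event as a packet boundary. Combined with DMA circular mode on the receive buffer, the CPU only needs to process data when a burst of bytes has arrived.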
UART Packet Frame Structure
Design a fixed packet structure for STM32-to-ESP32 communication. A well-designed frame provides packet synchronization, length framing, command routing, and error detection. CRC-16/CCITT (polynomial 0x1021) is an appropriate checksum for UART frames of this size: it detects all single-bit and double-bit errors, all errors affecting an odd number of bits, and all burst errors of 16 bits or fewer, at a cost of two bytes per frame.
/* uart_protocol.h */
#include <stdint.h>
#define UART_FRAME_HEADER 0xAA55
#define UART_MAX_PAYLOAD_LEN 200
#define UART_RX_BUFFER_SIZE 512
#define UART_TX_BUFFER_SIZE 512
/* Command IDs */
typedef enum {
CMD_MQTT_PUBLISH = 0x01,
CMD_MQTT_SUBSCRIBE = 0x02,
CMD_WIFI_CONNECT = 0x10,
CMD_WIFI_STATUS = 0x11,
CMD_MQTT_STATUS = 0x20,
CMD_HEARTBEAT = 0x30,
CMD_INCOMING_MSG = 0x40, /* ESP32 -> STM32 */
CMD_ACK = 0xFF
} uart_cmd_t;
/* Packed frame structure */
#pragma pack(push, 1)
typedef struct {
uint16_t header; /* 0xAA55 sync word */
uint16_t length; /* payload length in bytes */
uint8_t cmd; /* command ID */
uint8_t sequence; /* rolling frame counter */
uint8_t payload[UART_MAX_PAYLOAD_LEN];
uint16_t crc16; /* CRC16/CCITT over header+length+cmd+seq+payload */
} uart_frame_t;
#pragma pack(pop)
/* Ring buffer for DMA receive */
typedef struct {
uint8_t buffer[UART_RX_BUFFER_SIZE];
uint16_t head;
uint16_t tail;
} uart_ring_buffer_t;
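The header above describes the frame layout but does not show the CRC or the serialization. The following is a minimal sketch under three assumptions that the source does not state: the CRC variant is CRC-16/CCITT-FALSE (initial value 0xFFFF, no reflection), multi-byte fields go out little-endian as a packed struct on Cortex-M would lay them out, and only the used payload bytes are transmitted, with the CRC immediately after them. The names crc16_ccitt and frame_encode are illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_MAX_PAYLOAD 200u
#define FRAME_OVERHEAD    8u  /* sync(2) + length(2) + cmd(1) + seq(1) + crc(2) */

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF, MSB first */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Serialize one frame into out[]. Returns the number of bytes written,
 * or 0 if the payload is too long or the output buffer is too small. */
size_t frame_encode(uint8_t *out, size_t out_size, uint8_t cmd, uint8_t seq,
                    const uint8_t *payload, uint16_t len)
{
    if (len > FRAME_MAX_PAYLOAD || out_size < (size_t)len + FRAME_OVERHEAD) {
        return 0;
    }
    out[0] = 0x55u;                    /* 0xAA55 sync word, low byte first */
    out[1] = 0xAAu;
    out[2] = (uint8_t)(len & 0xFFu);   /* payload length, little-endian */
    out[3] = (uint8_t)(len >> 8);
    out[4] = cmd;
    out[5] = seq;
    if (len > 0u) {
        memcpy(&out[6], payload, len);
    }
    /* CRC covers sync + length + cmd + seq + payload */
    uint16_t crc = crc16_ccitt(out, 6u + (size_t)len);
    out[6u + len] = (uint8_t)(crc & 0xFFu);
    out[7u + len] = (uint8_t)(crc >> 8);
    return (size_t)len + FRAME_OVERHEAD;
}
```

The receiver runs the same crc16_ccitt over the first 6 + length bytes and compares the result with the two trailing bytes. Both ends must agree on the CRC variant and byte order, so it is worth pinning them down in the protocol header.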
/* uart_driver.c - DMA + idle-line interrupt setup */
#include "uart_protocol.h"
static uint8_t s_dma_rx_buffer[UART_RX_BUFFER_SIZE];
static uart_ring_buffer_t s_rx_ring;
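static UART_HandleTypeDef *s_huart;
/* Drains newly received bytes into s_rx_ring; implementation not shown here */
void UART_Driver_ProcessDMA(uint16_t dma_pos);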
/* Initialize UART with DMA circular mode and idle-line interrupt */
void UART_Driver_Init(UART_HandleTypeDef *huart)
{
s_huart = huart;
/* Enable UART idle-line interrupt */
__HAL_UART_ENABLE_IT(huart, UART_IT_IDLE);
/* Start DMA reception in circular mode (continuous, no stop) */
HAL_UART_Receive_DMA(huart,
s_dma_rx_buffer,
UART_RX_BUFFER_SIZE);
}
/* Called from UART IRQ handler - reads DMA position */
void UART_Driver_IdleCallback(UART_HandleTypeDef *huart)
{
if (__HAL_UART_GET_FLAG(huart, UART_FLAG_IDLE))
{
__HAL_UART_CLEAR_IDLEFLAG(huart);
/* Calculate how much data DMA has received */
uint16_t dma_pos = UART_RX_BUFFER_SIZE -
(uint16_t)__HAL_DMA_GET_COUNTER(huart->hdmarx);
/* Process new data in ring buffer */
UART_Driver_ProcessDMA(dma_pos);
}
}
/* IRQ handler in stm32fxxx_it.c */
void USART2_IRQHandler(void)
{
HAL_UART_IRQHandler(&huart2);
UART_Driver_IdleCallback(&huart2);
}
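The driver above passes the DMA write position to UART_Driver_ProcessDMA without showing what happens next. A hedged sketch of the core step, draining the bytes that arrived since the previous call and handling wrap-around in the circular buffer, could look like this. The tracker struct and the function name are illustrative and are not the author's implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define DMA_BUF_SIZE 512u

typedef struct {
    const uint8_t *dma_buf;   /* circular buffer the DMA writes into */
    uint16_t       last_pos;  /* write position seen at the previous call */
} dma_rx_tracker_t;

/* Copy the bytes received since the last call into out[].
 * dma_pos is the current DMA write index (buffer size minus the remaining
 * transfer count). Returns the number of bytes copied. */
size_t dma_rx_drain(dma_rx_tracker_t *t, uint16_t dma_pos,
                    uint8_t *out, size_t out_size)
{
    size_t   n   = 0;
    uint16_t pos = t->last_pos;

    dma_pos = (uint16_t)(dma_pos % DMA_BUF_SIZE);  /* guard against index == size */

    while (pos != dma_pos && n < out_size) {
        out[n++] = t->dma_buf[pos];
        pos = (uint16_t)((pos + 1u) % DMA_BUF_SIZE);  /* wrap at buffer end */
    }
    t->last_pos = pos;  /* resume here next time, even if out[] filled up */
    return n;
}
```

This approach cannot detect an overrun: if more than DMA_BUF_SIZE bytes arrive between two calls, the oldest data is overwritten silently. Handling the DMA half-transfer and transfer-complete interrupts in addition to the idle event keeps the drain interval short enough at high baud rates.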
ESP32 AT Command Interface for MQTT
When using the ESP32 as a coprocessor running Espressif’s official AT firmware, the STM32 communicates using the ESP-AT command set over UART. The ESP-AT firmware supports full MQTT operation including TLS (scheme options 1 through 10 covering TCP, TLS with various certificate configurations, and WebSocket variants), keep-alive configuration, Last Will and Testament, and clean session control.
The initialization sequence for MQTT over TLS on ESP32 using AT commands is:
/* ESP32 AT command sequence for MQTT over TLS */
/* Step 1: Connect to Wi-Fi */
AT+CWMODE=1 // Station mode
AT+CWJAP="YourSSID","YourPassword" // Connect to AP
// Response: WIFI CONNECTED, WIFI GOT IP, OK
/* Step 2: Sync time via SNTP (required for TLS certificate validation) */
AT+CIPSNTPCFG=1,5,"pool.ntp.org","time.google.com"
AT+CIPSNTPTIME?
// Response: +CIPSNTPTIME:Tue May 26 08:00:00 2026
/* Step 3: Configure MQTT user settings */
/* Scheme 3 = MQTT over TLS, verify server certificate */
AT+MQTTUSERCFG=0,3,"stm32-node-01","your_username","your_password",0,0,""
/* Step 4: Configure MQTT connection parameters */
/* keepalive=120s, disable_clean_session=0 (clean session enabled), LWT topic and message, LWT QoS 0, LWT retain 1 */
AT+MQTTCONNCFG=0,120,0,"factory/nodes/stm32-node-01/status","offline",0,1
/* Step 5: Connect to MQTT broker */
AT+MQTTCONN=0,"your-broker.iot.region.amazonaws.com",8883,1
// Response: +MQTTCONNECTED:0,3,"your-broker.iot.region.amazonaws.com",8883,"",1
/* Step 6: Subscribe to command topic */
AT+MQTTSUB=0,"factory/nodes/stm32-node-01/cmd",1
// Response: OK
/* Step 7: Publish telemetry */
// Quotes and commas inside a string parameter must be backslash-escaped
AT+MQTTPUB=0,"factory/line1/node1/telemetry","{\"temp\":25.3\,\"hum\":62.1}",0,0
// Response: OK
The STM32 builds these AT command strings and sends them via UART DMA. The ESP32 responds with OK, ERROR, or status notifications. Incoming MQTT messages are pushed to the STM32 as unsolicited responses in the format +MQTTSUBRECV:<LinkID>,"topic",<length>,<data>, which the UART driver parses and routes to the MQTT receive task.
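Parsing the unsolicited +MQTTSUBRECV notification is a matter of splitting on the fixed field order. The sketch below assumes a NUL-terminated line containing a text payload and a topic without embedded quotes. The mqtt_rx_t type and the function name are illustrative; binary payloads would need an explicit buffer length instead of strlen.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int         link_id;
    char        topic[64];
    size_t      data_len;
    const char *data;      /* points into the input line */
} mqtt_rx_t;

/* Parse: +MQTTSUBRECV:<LinkID>,"<topic>",<length>,<data> */
bool parse_mqttsubrecv(const char *line, mqtt_rx_t *out)
{
    static const char prefix[] = "+MQTTSUBRECV:";
    const size_t plen = sizeof(prefix) - 1u;
    char *next;

    if (strncmp(line, prefix, plen) != 0) {
        return false;
    }
    const char *p = line + plen;

    long id = strtol(p, &next, 10);                 /* link ID */
    if (next == p || next[0] != ',' || next[1] != '"') {
        return false;
    }
    p = next + 2;                                   /* first topic character */

    const char *q = strchr(p, '"');                 /* closing quote of topic */
    if (q == NULL || q[1] != ',' ||
        (size_t)(q - p) >= sizeof(out->topic)) {
        return false;
    }
    memcpy(out->topic, p, (size_t)(q - p));
    out->topic[q - p] = '\0';
    p = q + 2;                                      /* first length digit */

    long len = strtol(p, &next, 10);                /* announced data length */
    if (next == p || *next != ',' || len < 0) {
        return false;
    }
    p = next + 1;
    if (strlen(p) < (size_t)len) {
        return false;                               /* fewer bytes than announced */
    }
    out->link_id  = (int)id;
    out->data_len = (size_t)len;
    out->data     = p;
    return true;
}
```

The announced length, not a terminator, decides where the payload ends, which matters because a JSON payload can itself contain commas and quotes.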
ESP32 AT Command Driver on STM32
/* esp_uart.c - AT command abstraction layer */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include "uart_protocol.h"
#include "FreeRTOS.h"
#include "semphr.h"
#define AT_CMD_TIMEOUT_MS 5000
#define AT_TX_BUFFER_LEN 256
static SemaphoreHandle_t s_at_mutex;
static char s_at_tx_buf[AT_TX_BUFFER_LEN];
/* Mutex protects AT command bus - only one sender at a time */
void ESP_Driver_Init(void)
{
s_at_mutex = xSemaphoreCreateMutex();
configASSERT(s_at_mutex != NULL);
}
/* Send AT command and wait for OK response */
bool ESP_SendATCommand(const char *cmd, uint32_t timeout_ms)
{
if (xSemaphoreTake(s_at_mutex,
pdMS_TO_TICKS(AT_CMD_TIMEOUT_MS)) != pdTRUE)
{
return false; /* AT bus busy */
}
/* Send command with CRLF terminator */
snprintf(s_at_tx_buf, sizeof(s_at_tx_buf), "%s\r\n", cmd);
HAL_UART_Transmit_DMA(&huart2,
(uint8_t *)s_at_tx_buf,
strlen(s_at_tx_buf));
/* Wait for OK or ERROR response (via semaphore from UART RX task) */
bool result = UART_WaitForResponse("OK", timeout_ms);
xSemaphoreGive(s_at_mutex);
return result;
}
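/* Publish MQTT message via AT command.
 * ESP-AT treats '"', ',' and '\' inside a quoted parameter as special, so a
 * JSON payload must be backslash-escaped before it is passed in here (or sent
 * with AT+MQTTPUBRAW, which takes the payload as raw bytes after a length). */
bool ESP_MQTT_Publish(const char *topic, const char *payload,
int payload_len, uint8_t qos)
{
char cmd[AT_TX_BUFFER_LEN - 2]; /* leave room for the CRLF added on send */
int n;
(void)payload_len; /* length is implied by the NUL-terminated string */
n = snprintf(cmd, sizeof(cmd),
"AT+MQTTPUB=0,\"%s\",\"%s\",%u,0",
topic, payload, (unsigned)qos);
if (n <= 0 || n >= (int)sizeof(cmd))
{
return false; /* command would be truncated - do not send it */
}
return ESP_SendATCommand(cmd, AT_CMD_TIMEOUT_MS);
}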
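Because the publish function embeds the payload in a quoted AT parameter, a JSON string needs its special characters escaped first. The helper below is a sketch that assumes the escaping rules described in the ESP-AT user guide (a backslash before a double quote, a comma, or a backslash). The name at_escape is illustrative and not part of any Espressif API.

```c
#include <stddef.h>

/* Copy src into dst, prefixing '"', ',' and '\' with a backslash.
 * Returns the escaped length, or -1 if dst is too small
 * (dst is left as an empty string in that case). */
int at_escape(char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;

    if (dst_size == 0u) {
        return -1;
    }
    for (; *src != '\0'; src++) {
        size_t need = (*src == '"' || *src == ',' || *src == '\\') ? 2u : 1u;
        if (n + need >= dst_size) {   /* must keep one byte for the NUL */
            dst[0] = '\0';
            return -1;
        }
        if (need == 2u) {
            dst[n++] = '\\';
        }
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return (int)n;
}
```

Escaping grows a typical JSON payload by roughly 20%, which counts against the AT command length limit. For larger or binary payloads, AT+MQTTPUBRAW avoids escaping altogether because the data is sent separately from the command line.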
Wi-Fi and MQTT Connection Manager
One of the most common production failures in deployed IoT systems is poor reconnection handling. Networks go down. Brokers restart. DHCP leases expire. A system that cannot autonomously recover from these events requires manual field intervention — which is unacceptable for any deployed product. The solution is a dedicated connection manager task running a deterministic state machine. Never scatter reconnection logic across multiple tasks or call it from a callback.
/* Connection state machine with exponential backoff */
typedef enum {
CONN_STATE_INIT, CONN_STATE_WIFI_CONNECTING, CONN_STATE_WIFI_CONNECTED,
CONN_STATE_MQTT_CONNECTING, CONN_STATE_RUNNING, CONN_STATE_RECONNECTING
} conn_state_t;
EventGroupHandle_t xConnectionEvents; /* requires #include "event_groups.h" */
#define WIFI_CONNECTED_BIT (1 << 0)
#define MQTT_CONNECTED_BIT (1 << 1)
void WiFiManagerTask(void *argument)
{
conn_state_t state = CONN_STATE_INIT;
uint32_t retry_count = 0;
xConnectionEvents = xEventGroupCreate();
for (;;)
{
switch (state)
{
case CONN_STATE_INIT:
ESP_SendATCommand("AT+RST", 3000);
vTaskDelay(pdMS_TO_TICKS(2000));
state = CONN_STATE_WIFI_CONNECTING; break;
case CONN_STATE_WIFI_CONNECTING:
if (ESP_ConnectWiFi(WIFI_SSID, WIFI_PASSWORD)) {
xEventGroupSetBits(xConnectionEvents, WIFI_CONNECTED_BIT);
state = CONN_STATE_WIFI_CONNECTED;
} else {
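/* Clamp the exponent: an unbounded shift overflows on long outages */
uint32_t delay = 2000u << (retry_count < 5u ? retry_count : 5u);
retry_count++;
if (delay > 60000u) delay = 60000u;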
vTaskDelay(pdMS_TO_TICKS(delay));
}
break;
case CONN_STATE_WIFI_CONNECTED:
ESP_SendATCommand("AT+CIPSNTPCFG=1,5,\"pool.ntp.org\"", 1000);
vTaskDelay(pdMS_TO_TICKS(2000));
state = CONN_STATE_MQTT_CONNECTING; break;
case CONN_STATE_MQTT_CONNECTING:
if (ESP_MQTT_Connect()) {
retry_count = 0;
xEventGroupSetBits(xConnectionEvents, MQTT_CONNECTED_BIT);
state = CONN_STATE_RUNNING;
} else {
vTaskDelay(pdMS_TO_TICKS(5000));
}
break;
case CONN_STATE_RUNNING:
vTaskDelay(pdMS_TO_TICKS(5000));
if (!ESP_MQTT_IsConnected()) {
xEventGroupClearBits(xConnectionEvents,
WIFI_CONNECTED_BIT | MQTT_CONNECTED_BIT);
state = CONN_STATE_RECONNECTING;
}
break;
case CONN_STATE_RECONNECTING:
state = ESP_WiFi_IsConnected() ?
CONN_STATE_MQTT_CONNECTING : CONN_STATE_WIFI_CONNECTING;
break;
}
}
}
bool MQTT_IsReadyToPublish(void) {
EventBits_t bits = xEventGroupGetBits(xConnectionEvents);
return (bits & (WIFI_CONNECTED_BIT | MQTT_CONNECTED_BIT)) ==
(WIFI_CONNECTED_BIT | MQTT_CONNECTED_BIT);
}
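The backoff calculation in the state machine is worth isolating into a pure function so it can be unit-tested on a host. This sketch uses the same 2-second base and 60-second cap as the listing above; the function name is illustrative.

```c
#include <stdint.h>

#define BACKOFF_BASE_MS 2000u
#define BACKOFF_MAX_MS  60000u

/* Exponential backoff: 2 s, 4 s, 8 s, ... capped at 60 s.
 * The shift is clamped because 2000 << 5 already exceeds the cap, and an
 * unclamped shift is undefined behaviour once retry_count reaches 32. */
uint32_t backoff_delay_ms(uint32_t retry_count)
{
    uint32_t shift = (retry_count < 5u) ? retry_count : 5u;
    uint32_t delay = BACKOFF_BASE_MS << shift;

    return (delay > BACKOFF_MAX_MS) ? BACKOFF_MAX_MS : delay;
}
```

Without the clamp, an expression such as 2000u * (1u << retry_count) wraps around after a couple of dozen failed attempts and then produces short delays again, which is the opposite of what a long outage calls for.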
Watchdog and Per-Task Health Monitoring
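The STM32 Independent Watchdog (IWDG) provides hardware-level protection, but it is not enough on its own. A single watchdog-refresh task can keep running while other critical tasks deadlock. The professional solution is per-task heartbeat monitoring: each task reports a heartbeat every cycle, and the watchdog task verifies that all heartbeats fall within their expected intervals before refreshing the IWDG. If any task misses its heartbeat, the watchdog refuses to refresh, triggering a hardware reset of the entire system. The heartbeat timeout must exceed each task's longest legitimate blocking period: a task that waits on a queue with portMAX_DELAY, or sleeps through a 60-second reconnection backoff, needs either a longer timeout or a bounded wait that reports a heartbeat in between.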
/* Per-task heartbeat watchdog */
typedef enum { TASK_SENSOR=0, TASK_MQTT_PUB=1, TASK_WIFI_MGR=2, TASK_COUNT=3 } task_id_t;
static volatile uint32_t s_last_heartbeat_ms[TASK_COUNT];
#define TASK_HEARTBEAT_TIMEOUT_MS 5000
/* Called by each task once per cycle */
void Watchdog_TaskHeartbeat(task_id_t id) {
s_last_heartbeat_ms[id] = HAL_GetTick();
}
void WatchdogTask(void *argument)
{
/* Start IWDG via HAL */
MX_IWDG_Init(); /* Configured in CubeMX: prescaler 256, reload 2499, which is ~20 s at a nominal 32 kHz LSI (use reload 1249 for ~10 s) */
vTaskDelay(pdMS_TO_TICKS(5000)); /* Wait for all tasks to start */
for (;;)
{
bool healthy = true;
uint32_t now = HAL_GetTick();
for (int i = 0; i < TASK_COUNT; i++) {
if ((now - s_last_heartbeat_ms[i]) > TASK_HEARTBEAT_TIMEOUT_MS) {
Logger_LogCritical("WDG: Task %d missed heartbeat", i);
healthy = false;
}
}
if (healthy) IWDG->KR = 0xAAAA; /* Refresh only if all tasks healthy */
vTaskDelay(pdMS_TO_TICKS(1000));
}
}
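Two pieces of the watchdog logic can be checked on a host: the IWDG timeout arithmetic and the heartbeat comparison. The sketch below assumes the usual IWDG relation, timeout = (reload + 1) x prescaler / f_LSI, and gives each task its own timeout, which is an extension of the single shared timeout in the listing above. All names are illustrative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IWDG timeout in milliseconds for a given prescaler, reload value,
 * and LSI frequency in Hz. */
uint32_t iwdg_timeout_ms(uint32_t prescaler, uint32_t reload, uint32_t lsi_hz)
{
    return (uint32_t)(((uint64_t)(reload + 1u) * prescaler * 1000u) / lsi_hz);
}

/* True if every task reported a heartbeat within its own timeout. */
bool all_tasks_healthy(uint32_t now_ms, const uint32_t *last_beat_ms,
                       const uint32_t *timeout_ms, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* Unsigned subtraction stays correct across the 32-bit tick wrap */
        if ((uint32_t)(now_ms - last_beat_ms[i]) > timeout_ms[i]) {
            return false;
        }
    }
    return true;
}
```

At a nominal 32 kHz LSI, prescaler 256 with reload 2499 gives 20 seconds and reload 1249 gives 10 seconds. The LSI is an uncalibrated RC oscillator, so the real timeout can differ noticeably from the nominal value and the refresh interval should leave generous margin.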
MQTT Topic Design: Structure That Scales
MQTT topic design is an architectural decision that affects every layer of your IoT system. Poor naming creates access control problems, wildcard subscription ambiguity, and data routing headaches that are extremely expensive to fix across deployed devices. Use a hierarchical structure with site, area, and device granularity from day one.
| Pattern | Example Topic | QoS | Purpose |
|---|---|---|---|
| Telemetry (device to cloud) | factory/line1/machine2/telemetry | 0 | High-rate sensor streams |
| Commands (cloud to device) | factory/line1/machine2/cmd | 1 | Remote control, setpoints |
| Alarms (device to cloud) | factory/line1/machine2/alarm | 1 | Critical events, guaranteed delivery |
| Status (device to cloud) | factory/line1/machine2/status | 0 | Connection state, health reports |
| Config (cloud to device) | factory/line1/machine2/config | 1 | Runtime parameter updates |
| OTA (cloud to device) | factory/line1/machine2/ota | 1 | Firmware update commands |
Subscribing to “factory/line1/#” captures all traffic from a production line. Subscribing to “factory/+/+/alarm” captures alarms from all machines on all lines. This wildcard flexibility is only possible with a well-structured hierarchy. Never use flat topics like “temp1”, “sensor_data”, or “machine” — they cannot scale, cannot be secured with per-topic ACL policies, and cannot be routed to the correct cloud processing pipeline.
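The wildcard behavior described above can be expressed as a short matcher, which is also useful on the device for routing incoming messages to handlers. This is a simplified sketch: it assumes a well-formed filter ('+' occupying a whole level, '#' only at the end) and omits the special handling of topics that begin with '$'. The function name is illustrative.

```c
#include <stdbool.h>

/* Return true if `topic` matches the MQTT subscription `filter`.
 * '+' matches exactly one level, '#' matches all remaining levels. */
bool mqtt_topic_matches(const char *filter, const char *topic)
{
    while (*filter != '\0') {
        if (*filter == '#') {
            /* '#' must be last and matches everything that remains */
            return filter[1] == '\0';
        }
        if (*filter == '+') {
            /* Skip one topic level, stopping at the next separator */
            while (*topic != '\0' && *topic != '/') {
                topic++;
            }
            filter++;
            continue;
        }
        if (*topic == '\0' &&
            filter[0] == '/' && filter[1] == '#' && filter[2] == '\0') {
            /* "a/b/#" also matches the parent topic "a/b" */
            return true;
        }
        if (*filter != *topic) {
            return false;   /* literal mismatch, or topic ended early */
        }
        filter++;
        topic++;
    }
    return *topic == '\0';  /* filter exhausted: topic must be too */
}
```

With this matcher, factory/+/+/alarm accepts factory/line1/machine2/alarm but rejects factory/line1/machine2/telemetry, and factory/line1/# accepts every topic under that line.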
Security: TLS, Authentication, and Provisioning
Every production MQTT deployment connecting over the public internet requires TLS encryption, device authentication, and unique per-device credentials. Deploying without TLS exposes all sensor data, commands, and device identity to anyone who can observe network traffic — this is unacceptable for industrial, medical, or commercial products.
AWS IoT Core uses X.509 mutual TLS: each device has a unique certificate and private key, validated against a registered CA. Azure IoT Hub supports SAS tokens and X.509. EMQX and HiveMQ support username/password with TLS plus optional client certificate authentication. For ESP32 AT firmware, use scheme 5 in AT+MQTTUSERCFG for mutual TLS (verify server certificate and provide client certificate). Store device certificates in ESP32 flash with AT+SYSMFG during manufacturing. Enable STM32 RDP Level 1 read protection on production devices to prevent credential extraction via debug interfaces.
| Platform | Authentication | MQTT Version | Best For |
|---|---|---|---|
| AWS IoT Core | X.509 mTLS, SigV4 | 3.1.1, 5 | Large-scale industrial IoT |
| Azure IoT Hub | SAS tokens, X.509 | 3.1.1 | Enterprise, Azure ecosystem |
| HiveMQ Cloud | Username/password + TLS, X.509 | 3.1.1, 5 | MQTT 5 features, flexible |
| EMQX | Username/password, JWT, X.509 | 3.1.1, 5 | Self-hosted, open source |
| Mosquitto | Username/password, X.509 | 3.1.1, 5 | Local dev, private networks |
Memory Discipline and Stack Safety
MQTT-connected FreeRTOS applications frequently fail in the field due to heap fragmentation and stack overflows. These failures are insidious because they can take days or weeks to manifest. Use static FreeRTOS object allocation (xTaskCreateStatic, xQueueCreateStatic) wherever possible — this moves all RTOS memory to the BSS segment at link time, eliminating runtime heap allocation for scheduler objects entirely.
Monitor stack high watermarks during development. Call uxTaskGetStackHighWaterMark() from a logger task and log results every few minutes. The value returned is the minimum free stack space, in words, that has ever existed in that task. Size each stack so that this value never falls below roughly 30-40 words under worst-case load. Avoid dynamic JSON construction with malloc on every publish cycle; use snprintf() into a fixed-size buffer (static or on the task stack) instead. For high-frequency or bandwidth-constrained deployments, consider CBOR as a binary replacement for JSON. It can use 30-50% less bandwidth, and because MQTT brokers treat payloads as opaque bytes it works with any broker, although the cloud-side consumer must be able to decode it.
Common Production Mistakes
- HAL_Delay() inside FreeRTOS tasks: it busy-waits, wasting the task's CPU time and starving every lower-priority task. Always use vTaskDelay() or vTaskDelayUntil().
- MQTT or AT command logic inside ISRs: ISRs must be minimal. Read the hardware, post with xQueueSendFromISR() or xTaskNotifyFromISR(), and exit. Do all processing in task context.
- Unprotected AT command bus: multiple tasks sending AT commands without a mutex interleave commands and corrupt the ESP32 command stream. Protect the bus with a mutex.
- No reconnection state machine: ad-hoc reconnection from random tasks creates race conditions and leaves the system in undefined states after complex failure sequences.
- Publishing without checking connection state: always check MQTT_IsReadyToPublish() before calling publish functions. Handle queue overflow during disconnection deliberately (for example, drop the oldest samples and count the drops) rather than losing data silently or crashing.
- Flat MQTT topic names: “temp1” or “sensor_data” cannot be secured, cannot be routed in the cloud, and cannot support wildcard subscriptions. Design the hierarchy before writing a single line of firmware.
Recommended Hardware for This Project
- STM32 Nucleo-F401RE — Cortex-M4 development board, ideal for FreeRTOS + MQTT development
- ESP32 DevKit C (WROOM-32) — Wi-Fi + BLE coprocessor with official Espressif AT firmware support
- ESP8266 NodeMCU v3 — Lower cost Wi-Fi option for simpler MQTT connectivity
- STM32 Nucleo-L476RG — Low-power variant for battery-operated IoT sensor nodes
- Nordic PPK2 Power Profiler — Validate MQTT sleep/wake power consumption at uA resolution
Final Thoughts
STM32 combined with FreeRTOS and MQTT is one of the most powerful and widely deployed technology stacks in modern industrial embedded systems. The architectural pattern of separating real-time control from cloud networking — STM32 for determinism, ESP32 for connectivity — appears across factory automation controllers, smart energy meters, industrial condition monitoring platforms, and agricultural telemetry networks.
The gap between a demo that works on a desk and a product that runs reliably for years in the field comes down to the engineering disciplines covered in this guide: task isolation with defined priorities, queue-driven data flow, DMA-based UART communication, CRC-validated packet framing, connection management with exponential backoff, per-task watchdog monitoring, and TLS-secured MQTT. None of these are advanced in isolation. Applying them all consistently in a single codebase is what separates engineers who build prototypes from engineers who build products.
A reliable embedded IoT system is not built by adding Wi-Fi to a microcontroller. It is built through deterministic task architecture, queue-driven firmware design, robust fault recovery, memory discipline, and production-grade security. That is what separates a demo from a deployed product.