Wednesday 1 November 2017

Don't Delay!

The following code looks like it will wait until pin 5 goes LOW before allowing the loop function to run:

void setup(){
  Serial.begin(74880);
  pinMode(5,INPUT_PULLUP);
  Serial.printf("T=%d Waiting\n",millis());
  while(digitalRead(5)==HIGH);
  Serial.printf("T=%d Ready\n",millis());
}

A busy-wait while loop like this is a common technique on other systems, and it probably works there, but here's what happens on an ESP8266:

T=5204 Waiting

Soft WDT reset

ctx: cont
sp: 3ffef240 end: 3ffef420 offset: 01b0

>>>stack>>>
3ffef3f0:  3fffdad0 00000000 3ffee3c8 40201c28
3ffef400:  feefeffe feefeffe 3ffee3ec 4020237c
3ffef410:  feefeffe feefeffe 3ffee400 40100718
<<<stack<<<

It just crashed. What happened is that your code is in a very tight loop, and while it is there, the WiFi code cannot run. The "watchdog timer" (WDT) thinks the processor has "hung up", so it restarts the system. This alone is one good reason why programming the ESP8266 is different from programming e.g. an AVR with the Arduino IDE. If you replace the while statement with:

  while(digitalRead(5)==HIGH) delay(1); 

Then it works as you would have expected. The system sits doing nothing until pin 5 goes LOW. So it looks like "delay" is the solution. Before we added it your code was "doing nothing" - and it crashed. It doesn't take a rocket scientist to deduce then that delay cannot also be "doing nothing" therefore it must be doing something!

That something is allowing the WiFi code to run in the background, hence the WDT isn't worried.

Oddly, delay(0) would also have worked. As would the special function yield() which does pretty much the same as delay(0). delay is specifically designed to "yield" the CPU, i.e. "let go" of it for a short while, and in that short while, the WiFi code can use it.
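If you want the same wait but with a way out, here is a minimal sketch of a yielding wait with a timeout. The name waitForLow, its parameters and the 10 second limit are my own inventions for illustration, not part of the ESP8266 core. The hardware calls are passed in as plain function pointers only so that the waiting logic can also be tried on a PC; the part inside #ifdef ARDUINO is what runs on the chip.

```cpp
#include <stdint.h>

// Spin until isHigh() returns false, calling idle() on every pass so the
// background code gets the CPU. Gives up after timeoutMs milliseconds.
// Returns true if the pin went LOW, false if we timed out.
bool waitForLow(bool (*isHigh)(), uint32_t (*now)(), void (*idle)(), uint32_t timeoutMs) {
  const uint32_t start = now();
  while (isHigh()) {
    // Unsigned subtraction stays correct even when the counter wraps around
    if ((uint32_t)(now() - start) >= timeoutMs) return false;
    idle();
  }
  return true;
}

#ifdef ARDUINO
void setup() {
  Serial.begin(74880);
  pinMode(5, INPUT_PULLUP);
  Serial.printf("T=%lu Waiting\n", millis());
  bool ok = waitForLow([] { return digitalRead(5) == HIGH; },  // the pin test
                       [] { return (uint32_t)millis(); },      // the clock
                       [] { yield(); },                        // let the WiFi code run
                       10000);                                 // assumed limit: 10 s
  Serial.printf("T=%lu %s\n", millis(), ok ? "Ready" : "Timed out");
}

void loop() {}
#endif
```

The important line is the idle() call inside the while loop: take it out and you are back to the tight loop that upset the watchdog.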

"But Wait!" I hear you cry: "I haven't done anything with WiFi! I haven't even tried to connect to it!" - and that's true. But - look at the crash information: "T=5204" The system had been running for 5.2 seconds before it even got to the while loop. Did Serial.begin and pinMode really take 5 seconds?

If not, then what was the CPU doing for 5 seconds?

To answer that, I'd like you to actually try this experiment: don't just read it and take my word, actually do it. First - and this is vital: if you use a brand new chip or one you have used for "messing about with WiFi" then the following code might not work. You have to use a chip that you have already successfully connected to your WiFi at least once before. When you have that, load the following sketch:

#include<ESP8266WiFi.h>

#define CSTR(x) x.c_str()
#define TXTIP(x) CSTR(x.toString())

WiFiEventHandler    gotIpEventHandler,disconnectedEventHandler;
              
void wifiEvent(WiFiEvent_t event) {
    switch(event) {
        case WIFI_EVENT_STAMODE_CONNECTED:
            Serial.printf("T=%d WiFi Connected SSID=%s\n",millis(),CSTR(WiFi.SSID()));
            break;
        case WIFI_EVENT_STAMODE_GOT_IP:
            Serial.printf("T=%d WiFi got IP %s\n",millis(),CSTR(WiFi.localIP().toString()));
            break;
        case WIFI_EVENT_STAMODE_DISCONNECTED:
            Serial.printf("T=%d WiFi lost connection\n",millis());
            break;        
        default:
            break;
    }
}

void wifiDisconnectHandler(const WiFiEventStationModeDisconnected& event){
  Serial.printf("T=%d Disconnected (reason=%d)\n",millis(),event.reason);
}

void wifiGotIPHandler(const WiFiEventStationModeGotIP& event){
  Serial.printf("T=%d Connected to %s (%s) as %s (ch: %d) hostname=%s\n",millis(),CSTR(WiFi.SSID()),TXTIP(WiFi.gatewayIP()),TXTIP(WiFi.localIP()),WiFi.channel(),CSTR(WiFi.hostname()));
}

void setup(){
  Serial.begin(74880);
  delay(1000);
  Serial.printf("T=%d Setup\n",millis());
  WiFi.onEvent(wifiEvent);
  gotIpEventHandler = WiFi.onStationModeGotIP(wifiGotIPHandler);
  disconnectedEventHandler = WiFi.onStationModeDisconnected(wifiDisconnectHandler);
  }

void loop(){
  Serial.printf("T=%d LOOP: Do something\n",millis());
  delay(30000);
}

Now then, this is where it starts to look like a magic trick: Your WiFi SSID and password are not in that sketch and there is, of course, nothing up my sleeve - I do not and could not possibly know them. Even if I did, there is no WiFi.begin anywhere in the sketch that would tell the ESP8266 to connect to your WiFi. But it will connect. Go on, if you don't believe me, try it. Surprised? I hope so. Note also that the connection occurred after the loop had started running - certainly on my chip it did:

⸮T=6303 Setup
T=6303 LOOP: Do something
T=8126 WiFi got IP 192.168.1.113
T=8126 Connected to LaPique (192.168.1.1) as 192.168.1.113 (ch: 6) hostname=ESP_836EDC
T=36303 LOOP: Do something

What have we learned?
  1. The ESP8266 remembers the last successful WiFi connection and automatically re-connects to it - without you even asking!
  2. It can take quite a few seconds to connect.
  3. The ESP8266 is doing a lot of things "in the background" even when you think that nothing else is happening except your code.
That last one is the one you need to think very hard about. When "your" code crashes, it might not actually be your code that crashed: often it is your code causing some other code to crash, which can make it hard to pin down the true cause. The whole purpose of this article is to predict that - in a large percentage of cases - there will be a delay call just before the crash, so, for my next trick:

Insert a delay(1) before the Serial.printf in wifiGotIPHandler like so:

void wifiGotIPHandler(const WiFiEventStationModeGotIP& event){
  delay(1);
  Serial.printf("T=%d Connected to %s (%s) as %s (ch: %d) hostname=%s\n",millis(),CSTR(WiFi.SSID()),TXTIP(WiFi.gatewayIP()),TXTIP(WiFi.localIP()),WiFi.channel(),CSTR(WiFi.hostname()));
}

Here's what happens on mine:

T=6302 Setup
T=6302 LOOP: Do something
T=8124 WiFi got IP 192.168.1.113
T=8124 Connected to LaPique (192.168.1.1) as 192.168.1.113 (ch: 6) hostname=ESP_836EDC

Exception (9):
epc1=0x401050b9 epc2=0x00000000 epc3=0x00000000 excvaddr=0xffffffff depc=0x00000000

ctx: sys 
sp: 3ffffdb0 end: 3fffffb0 offset: 01a0

So now what have we learned?
  1. That time travel is apparently possible: the delay(1) before the Serial.printf caused the crash (trust me, it did) yet the Serial.printf still worked! Welcome to the wonderful world of asynchronous programming...
  2. That as little as a 1ms delay can cause a crash? No, we have learned that...
  3. delay is not - after all - such a wonderful solution: when used in the "wrong" place it is a nightmare.
I see a lot of code in example sketches and in questions from "newbies" that is littered with delay calls, often with values that you just know have been made up on the spot, because there is often no need for delay to be there at all. A lot of people think it's the answer to a variety of problems, but I'm here to tell them - and you - that it's the indiscriminate use of it that is the cause of many.

delay can only be called - without problems - from the "main loop thread". If you call it from the "background thread", you've seen what happens. Some callbacks (especially timers) and all interrupt service routines (ISRs) do not run - by definition - on the main loop thread, so calling delay inside them will cause problems.
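One common way round this (a sketch only, and not the only way) is to do nothing in the callback except make a note that the event happened, then do the real work from loop, where delay is safe. The names gotIpPending, markGotIp and takeGotIp are mine, made up for this example; the handler registration is the same one used in the sketch above.

```cpp
#ifdef ARDUINO
#include <ESP8266WiFi.h>   // only needed when building for the ESP8266
#endif

// Set in the callback, cleared in loop(). volatile because it is shared
// between the callback and the main loop.
volatile bool gotIpPending = false;

// Called from the callback: no delay, no printing, just leave a note.
void markGotIp() { gotIpPending = true; }

// Called from loop(): returns true once for each note left by markGotIp().
bool takeGotIp() {
  if (!gotIpPending) return false;
  gotIpPending = false;
  return true;
}

#ifdef ARDUINO
WiFiEventHandler gotIpEventHandler;

void wifiGotIPHandler(const WiFiEventStationModeGotIP& event) {
  markGotIp();   // deliberately nothing else in here
}

void setup() {
  Serial.begin(74880);
  gotIpEventHandler = WiFi.onStationModeGotIP(wifiGotIPHandler);
}

void loop() {
  if (takeGotIp()) {
    delay(1);   // we are back on the "main loop thread", so this is harmless
    Serial.printf("T=%lu Connected as %s\n", millis(), WiFi.localIP().toString().c_str());
  }
}
#endif
```

The delay(1) that brought the house down inside wifiGotIPHandler is now in loop, which is the one place it is allowed to be.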

We are now in a "catch-22": The best way to write code for the ESP8266 is the event-driven style, but the same method makes other code break. The solution, however, is easy: don't use that "other code". Don't use delay. If you need something to happen at a later time, use a Ticker, or the author's H4 library (which adds a lot of functionality to the Ticker class): github.com/philbowles/h4
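As a rough illustration, here is the "Do something" loop from the earlier sketch with its delay(30000) replaced by a Ticker. The names heartbeat, ticks and onTick are made up for this example. Bear in mind that a Ticker callback is itself one of those timer callbacks, so it must stay short and must not call delay: all it does here is count, and loop notices the count has changed.

```cpp
#include <stdint.h>

// Number of times the timer has fired. volatile because it is written in
// the timer callback and read in loop().
volatile uint32_t ticks = 0;

// Timer callback: keep it tiny, no delay(), no blocking calls.
void onTick() { ticks = ticks + 1; }

#ifdef ARDUINO
#include <Ticker.h>

Ticker heartbeat;

void setup() {
  Serial.begin(74880);
  heartbeat.attach(30, onTick);   // call onTick every 30 seconds
}

void loop() {
  static uint32_t seen = 0;       // last tick count we acted on
  uint32_t current = ticks;
  if (current != seen) {
    seen = current;
    Serial.printf("T=%lu LOOP: Do something (tick %lu)\n", millis(), (unsigned long)seen);
  }
  // loop() returns straight away, so the background code gets the CPU
  // on every pass without any delay call at all
}
#endif
```

If you only need a single action some time later rather than a repeating one, Ticker also has once and once_ms, which fire the callback a single time.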

Many of the ESP8266 libraries use callbacks. If you know - in every case - whether or not it's safe to call delay in the callback, feel free to ignore everything here. But if you don't, don't. Since you can't really do anything much more productive without those libraries other than flashing LEDs, the best starting point is:

D O   N O T   C A L L   D E L A Y

Unless, of course, you understand all of this already and know exactly what you are doing.

