ESP32S3 Camera Mastery (Free)

The ESP32S3 Camera Mastery Course, authored by Simone from Italy, is an updated resource for utilizing the ESP32 camera, addressing a gap in high-quality online tutorials. The course covers both basic and advanced topics, including real-time video streaming, motion detection, and machine learning applications like face detection. It emphasizes the use of the EloquentEsp32Cam library for efficient coding practices and provides recommendations for hardware and software requirements to enhance the learning experience.


ESP32S3 Camera Mastery Course

Hi,
I'm Simone from Italy, author of the ESP32S3 Camera Mastery Course and owner of the
eloquentarduino.com blog.
This course is the updated version of my original book, Mastering the ESP32 Camera,
which received much more interest than I could have anticipated. That's why I spent
months working on this new revision: you, the reader, and the whole Arduino/ESP32
community deserve better tools to work with the fantastic piece of hardware that is the
ESP32 camera.
If top-quality, advanced content on this topic had existed on the internet, I wouldn't
have sold a single copy of the book. Instead, I sold more than 100 copies (not that many,
but more than I anticipated). That tells me there's a lack of resources online: the intent
of this course is to fill that gap.

What's inside?
This book covers a few topics. Some are very basic:
1. take your first picture in a sane way
2. real-time video streaming
3. save pictures to the SD card with structure
4. motion detection without a PIR sensor
5. Telegram notifications

Others can be considered more advanced:


1. face detection
2. image classification using Edge Impulse
3. object detection using Edge Impulse

The first chapters work on the old generation of ESP32 chips (e.g. the AiThinker board),
while the later ones that require machine learning (face/object detection) only run (or
run better) on the new ESP32S3 chip.
I strongly recommend you upgrade if you can, because the S3 chip is a lot faster and has
a lot more RAM than its predecessor. A couple of boards I can recommend because I own
them:
1. Seeed XIAO Sense: tiny and pretty cheap, though it gets hot really quickly
2. Freenove Cam Board S3: good price point, may be available even on Amazon in
your country. Make sure you get the S3 version (the one with 2 USB ports!),
since they also sell a non-S3 version

A note about the coding style


This course is articulated in short chapters, one for each topic.
Even though they don't follow any particular order (apart from the first chapter, which
shows how to take your first picture), I recommend reading them in order. The first
chapters cover the most frequent use cases and show the basics, which you can later
integrate into more advanced projects to get a complete working application.
The code examples make heavy use of a few C++ features:
1. object orientation: everything from the library we'll be using is scoped under an
object; there are no global functions. Many methods even return objects instead
of primitive values, so you will often find chains of method calls. Don't be scared,
they're still easy to use
2. namespaces: everything from the library is scoped under a C++ namespace so as
not to pollute the global scope. Most objects are scoped under the eloq
namespace, but sometimes you will encounter constructs like
namespace::object : don't be scared! Namespaces are helpful in grouping
related objects and functions under a common name
3. lambda functions: when a function returns multiple objects, we will iterate over
them using a forEach construct that accepts a function to run on every result.
Declaring a global function would not be as idiomatic as using a locally-defined
function, so we'll be using lambdas.

Here's an example snippet showing the above points.

using namespace eloq;

// object orientation with method chaining
if (!camera.capture().isOk()) {
    Serial.println(camera.exception.toString());
    return;
}

// namespacing: 'face' is the namespace, 'detection' is the object
if (!face::detection.run().isOk()) {
    Serial.println(face::detection.exception.toString());
    return;
}

// lambda functions
fomo.forEach([](int i, bbox_t bbox) {
    Serial.printf("Detected object #%d\n", i + 1);
});
Hardware requirements
Only an ESP32 camera board is required. We will not use any external sensor nor
hardware. Supported boards are:
AiThinker
M5 cameras (normal, fisheye, timer)
TTGO camera with LCD display
Seeed XIAO Sense (see above)
Freenove Camera S3 (see above)
Espressif Eye
Software requirements
To follow the sketches in the book you will need:
Arduino IDE (1.8.13 recommended)
ESP32 core 2.x.x (2.0.14 recommended)
EloquentEsp32Cam library 2.x.x (latest version recommended)

If you cannot find version 2.x.x of EloquentEsp32Cam from the Arduino Library Manager,
you can install it from Github.
1. visit https://github.com/eloquentarduino/EloquentEsp32cam/tree/2
2. download library as zip
3. extract the zip inside your Arduino libraries folder
4. if the folder is named EloquentEsp32Cam-master , delete the -master part

Free sample limitations


If this is the free sample of the course, you will have access to the first handful of
chapters only. To unlock all the chapters, consider buying the full course at
https://eloquentarduino.lemonsqueezy.com/checkout/buy/e7995ef6-f001-4208-8080-
d5365a35a14e
Take picture

The ESP32 camera is a nice piece of hardware. At only 5 USD on Aliexpress, it is by far
the easiest and cheapest way to get your hands on embedded vision.
Sadly, the (many) tutorials you find online are of really poor quality...
They are lengthy, intricate, and hard to customize for your specific needs. And since
they're almost copy-pastes of each other, you run the risk of getting used to that style
of programming.
But it doesn't have to be like that. There's a better, cleaner, more efficient way to use the
ESP32 camera.
Enter the EloquentEsp32Cam Arduino library.
What's inside EloquentEsp32Cam?
A lot of things, actually, but here I'll list some of them:
camera abstraction
sensor configuration
jpeg decoding
motion detection
face detection
Edge Impulse image classification
Edge Impulse object detection
telegram notifications
MQTT notifications
SD photo storage
multithreading

But, believe it or not, quantity is not the main selling point of this library. Quality is!
In the following sections, I'll show you a Quickstart of the EloquentEsp32Cam library to let
you experience by yourself a better way of interacting with your little ESP32 camera
board.

Hardware requirements
An ESP32 board with a camera.

Software requirements
EloquentEsp32Cam >= 2.2 (install from Arduino Library Manager)

Arduino IDE Tools configuration for ESP32S3

Board: ESP32S3 Dev Module
Upload Speed: 921600
USB Mode: Hardware CDC and JTAG
USB CDC On Boot: Disabled
USB Firmware MSC On Boot: Disabled
USB DFU On Boot: Disabled
Upload Mode: UART0 / Hardware CDC
CPU Frequency: 240MHz (WiFi)
Flash Mode: QIO 80MHz
Flash Size: 4MB (32Mb)
Partition Scheme: Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level: Info
PSRAM: OPI PSRAM
Arduino Runs On: Core 1
Events Run On: Core 1
Erase All Flash Before Sketch Upload: Disabled
JTAG Adapter: Disabled
Take a picture
Before showing you how to capture a picture with the EloquentEsp32Cam library, I want to
remind you of the style you may be accustomed to. This code is taken from one of the many
examples you can easily find online.

How it used to be

#include "esp_camera.h"
#include "soc/soc.h"           // disable brownout problems
#include "soc/rtc_cntl_reg.h"  // disable brownout problems

// OV2640 camera module pins (CAMERA_MODEL_AI_THINKER)
#define PWDN_GPIO_NUM 32
#define RESET_GPIO_NUM -1
#define XCLK_GPIO_NUM 0
#define SIOD_GPIO_NUM 26
#define SIOC_GPIO_NUM 27
#define Y9_GPIO_NUM 35
#define Y8_GPIO_NUM 34
#define Y7_GPIO_NUM 39
#define Y6_GPIO_NUM 36
#define Y5_GPIO_NUM 21
#define Y4_GPIO_NUM 19
#define Y3_GPIO_NUM 18
#define Y2_GPIO_NUM 5
#define VSYNC_GPIO_NUM 25
#define HREF_GPIO_NUM 23
#define PCLK_GPIO_NUM 22
#define FLASH_GPIO_NUM 4


void initCamera()
{
    // turn off the 'brownout detector'
    WRITE_PERI_REG(RTC_CNTL_BROWN_OUT_REG, 0);

    // OV2640 camera module
    camera_config_t config;
    config.ledc_channel = LEDC_CHANNEL_0;
    config.ledc_timer = LEDC_TIMER_0;
    config.pin_d0 = Y2_GPIO_NUM;
    config.pin_d1 = Y3_GPIO_NUM;
    config.pin_d2 = Y4_GPIO_NUM;
    config.pin_d3 = Y5_GPIO_NUM;
    config.pin_d4 = Y6_GPIO_NUM;
    config.pin_d5 = Y7_GPIO_NUM;
    config.pin_d6 = Y8_GPIO_NUM;
    config.pin_d7 = Y9_GPIO_NUM;
    config.pin_xclk = XCLK_GPIO_NUM;
    config.pin_pclk = PCLK_GPIO_NUM;
    config.pin_vsync = VSYNC_GPIO_NUM;
    config.pin_href = HREF_GPIO_NUM;
    config.pin_sscb_sda = SIOD_GPIO_NUM;
    config.pin_sscb_scl = SIOC_GPIO_NUM;
    config.pin_pwdn = PWDN_GPIO_NUM;
    config.pin_reset = RESET_GPIO_NUM;
    config.xclk_freq_hz = 20000000;
    config.pixel_format = PIXFORMAT_JPEG;

    if (psramFound())
    {
        config.frame_size = FRAMESIZE_UXGA;
        config.jpeg_quality = 10;
        config.fb_count = 2;
    }
    else
    {
        config.frame_size = FRAMESIZE_SVGA;
        config.jpeg_quality = 12;
        config.fb_count = 1;
    }

    // camera init
    esp_err_t err = esp_camera_init(&config);
    if (err != ESP_OK)
    {
        Serial.printf("Camera init failed with error 0x%x", err);
        ESP.restart();
    }
}

// capture photo
void capturePhoto(void)
{
    camera_fb_t *fb = NULL; // pointer
    fb = esp_camera_fb_get();
    if (!fb)
    {
        Serial.println("Camera capture failed");
        return;
    }

    Serial.println("Capture OK");

    // Oops, memory leak: fb is never released with esp_camera_fb_return()!
}

void setup()
{
    Serial.begin(9600);
    initCamera();
}

void loop() {
    capturePhoto();
}

You should recognize the usual structure:

1. define pin constants
2. assign pins
3. configure the sensor
4. get a frame

Pretty lengthy, right?

The code by itself is not difficult to understand, yet it is a whole bunch of lines that you
have to copy-paste from project to project just to get started. The worst part is when you
need to change the camera model: you have to search the internet for the correct pin
definitions and copy-paste one more piece of code.
There must be a better way, right?

How it is now
The following is what using the EloquentEsp32Cam library looks like.
See source
Filename: Take_Picture.ino
/**
 * Get your first picture with ESP32
 *
 * Open the Serial Monitor and enter 'capture' (without quotes)
 * to capture a new image
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */
#include <eloquent_esp32cam.h>

// all global objects (e.g. `camera`)
// are scoped under the `eloq` namespace
using eloq::camera;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___GET YOUR FIRST PICTURE___");

    // camera settings
    // replace with your own model!
    // supported models:
    // - aithinker
    // - m5
    // - m5_wide
    // - m5_timer
    // - eye
    // - wrover
    // - wroom_s3
    // - freenove_s3
    // - xiao
    // - ttgo_lcd
    // - simcam
    camera.pinout.aithinker();
    camera.brownout.disable();
    // supported resolutions
    // - yolo (96x96)
    // - qqvga
    // - qcif
    // - face (240x240)
    // - qvga
    // - cif
    // - hvga
    // - vga
    // - svga
    // - xga
    // - hd
    // - sxga
    // - uxga
    // - fhd
    // - qxga
    // ...
    camera.resolution.vga();
    // supported qualities:
    // - low
    // - high
    // - best
    camera.quality.high();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Enter 'capture' (without quotes) to shoot");
}


void loop() {
    // await a Serial command
    if (!Serial.available())
        return;

    if (Serial.readStringUntil('\n') != "capture") {
        Serial.println("I only understand 'capture'");
        return;
    }

    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // print image info
    Serial.printf(
        "JPEG size in bytes: %d. Width: %dpx. Height: %dpx.\n",
        camera.getSizeInBytes(),
        camera.resolution.getWidth(),
        camera.resolution.getHeight()
    );

    Serial.println("Enter 'capture' (without quotes) to shoot");
}

I'll strip all the comments and Serial code to show you the lines that really matter, so
you can appreciate the compactness of the code.

// configure the camera
camera.pinout.aithinker();
camera.brownout.disable();
camera.resolution.vga();
camera.quality.high();

// init and capture
camera.begin().isOk();
camera.capture().isOk();

After you upload the sketch, open the Serial Monitor and follow the instructions. You
should get something similar to the image below.

A note about the style: objects everywhere!


Most of the methods from the EloquentEsp32Cam library that can succeed or fail (like
begin() and capture() ) don't return a boolean. They return an exception object
instead.
Why, you may ask?
Because it is a shame when something fails and you don't know why. The C style of
signaling an error is an integer status code, but then you have to search online for what
each status code means. Wouldn't it be much better if you got a human-readable error
message? That's what the exception object does for you.
And this is why you test whether something went wrong with the .isOk()
construct.

// bad, don't do this!
if (camera.begin()) {}

// good, do this!
if (camera.begin().isOk()) {}

// in case of error, print a human readable description
Serial.println(camera.exception.toString());
Sensor configuration
The ESP32 camera sensor is pretty powerful. It has a lot of configurations available that
you can tweak to get perfect pictures in a wide range of scenarios. I recommend you take
some time to experiment with its settings and choose the ones that work best for you
before you start developing a new project.
Load the default CameraWebServer example and find your optimal configuration. Take
note of the values: we will convert them to code in the next section.

Change sensor settings


To configure the sensor you access the camera.sensor object. The API is really intuitive
and follows a standardized syntax.

// values that range from -2 to +2 (brightness and saturation)
// follow this syntax
camera.sensor.lowestSaturation();
camera.sensor.lowSaturation();
camera.sensor.defaultSaturation();
camera.sensor.highSaturation();
camera.sensor.highestSaturation();
camera.sensor.setSaturation(int value); // from -2 to +2

camera.sensor.lowestBrightness();
camera.sensor.lowBrightness();
camera.sensor.defaultBrightness();
camera.sensor.highBrightness();
camera.sensor.highestBrightness();
camera.sensor.setBrightness(int value); // from -2 to +2

Boolean settings follow the enable/disable API.

camera.sensor.enableAutomaticWhiteBalance();
camera.sensor.disableAutomaticWhiteBalance();
camera.sensor.setAutomaticWhiteBalance(true|false);

camera.sensor.enableGainControl();
camera.sensor.disableGainControl();
camera.sensor.setGainControl(true|false);

Finally, there are specific settings which don't follow any scheme.
camera.sensor.noSpecialEffect();
camera.sensor.negative();
camera.sensor.grayscale();
camera.sensor.sepia();
camera.sensor.redTint();
camera.sensor.greenTint();
camera.sensor.blueTint();
camera.sensor.hmirror();
camera.sensor.vmirror();

Take a look at the source code of the sensor object to see the full list of available
methods.
MJPEG streaming
When you're first starting with the ESP32 camera, you probably want to see its live video
streaming.
The following sketch starts an HTTP server that only streams the video from the camera.
Consider it a leaner clone of the default CameraWebServer example.
See source
Filename: MJPEG_Stream.ino
/**
 * View camera MJPEG stream
 *
 * Start an HTTP server to access the live video feed
 * of the camera from the browser.
 *
 * Endpoints are:
 * - /     -> displays the raw MJPEG stream
 * - /jpeg -> captures a still image
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */

// if you define WIFI_SSID and WIFI_PASS before importing the library,
// you can call connect() instead of connect(ssid, pass)
//
// If you set HOSTNAME and your router supports mDNS, you can access
// the camera at http://{HOSTNAME}.local

#define WIFI_SSID "SSID"
#define WIFI_PASS "PASSWORD"
#define HOSTNAME "esp32cam"

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/viz/mjpeg.h>

using namespace eloq;
using namespace eloq::viz;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___MJPEG STREAM SERVER___");

    // camera settings
    // replace with your own model!
    camera.pinout.aithinker();
    camera.brownout.disable();
    camera.resolution.vga();
    camera.quality.high();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // connect to WiFi
    while (!wifi.connect().isOk())
        Serial.println(wifi.exception.toString());

    // start mjpeg http server
    while (!mjpeg.begin().isOk())
        Serial.println(mjpeg.exception.toString());

    Serial.println("Camera OK");
    Serial.println("WiFi OK");
    Serial.println("MjpegStream OK");
    Serial.println(mjpeg.address());
}


void loop() {
    // HTTP server runs in a task, no need to do anything here
}

From a high-level perspective, there's not much to tell about this sketch:
1. it configures the camera
2. it connects to the WiFi network
3. it starts the MJPEG HTTP server

The few lines of code that we added are listed below.

// connect to WiFi
while (!wifi.connect().isOk())
    Serial.println(wifi.exception.toString());

// start mjpeg http server
while (!mjpeg.begin().isOk())
    Serial.println(mjpeg.exception.toString());

// get IP address of the board
Serial.println(mjpeg.address());

Open the Serial Monitor and you will read something similar to the screenshot below.
To access the stream, open a web browser and visit

http://esp32cam.local:81

Be sure you are connected to the same network as the ESP32!

If you get a blank page, try replacing the above address with the IP address that gets
printed in the Serial Monitor. It will look like

http://192.X.Y.Z:81

If you set a resolution higher than QVGA (320 x 240), be sure your WiFi signal is
strong, otherwise the feed will look laggy.
Still image
As an added feature, this sketch will allow you to get still image captures from the camera
at the following endpoint:

http://esp32cam.local:81/jpeg

Stream controls
You can play/pause/stop the MJPEG stream at will. While paused, the stream will
freeze and automatically resume when you call play() .
If you stop it, instead, the connection is dropped altogether and a full page reload will
be necessary to restart the stream (after you call play() ).
See source
Filename: MJPEG_Controls.ino
/**
 * Play/Pause/Stop MJPEG stream
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */

// if you define WIFI_SSID and WIFI_PASS before importing the library,
// you can call connect() instead of connect(ssid, pass)
//
// If you set HOSTNAME and your router supports mDNS, you can access
// the camera at http://{HOSTNAME}.local

#define WIFI_SSID "SSID"
#define WIFI_PASS "PASSWORD"
#define HOSTNAME "esp32cam"

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/viz/mjpeg.h>

using namespace eloq;
using namespace eloq::viz;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___MJPEG STREAM SERVER CONTROLS___");

    // camera settings
    // replace with your own model!
    camera.pinout.aithinker();
    camera.brownout.disable();
    camera.resolution.qvga();
    camera.quality.high();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // connect to WiFi
    while (!wifi.connect().isOk())
        Serial.println(wifi.exception.toString());

    // start mjpeg http server
    while (!mjpeg.begin().isOk())
        Serial.println(mjpeg.exception.toString());

    Serial.println("Camera OK");
    Serial.println("WiFi OK");
    Serial.println("MjpegStream OK");
    Serial.println(mjpeg.address());
    Serial.println("Send play/pause/stop to control the server");
}


void loop() {
    if (!Serial.available())
        return;

    String command = Serial.readStringUntil('\n');

    if (command.startsWith("play"))
        mjpeg.play();
    else if (command.startsWith("pause"))
        mjpeg.pause();
    else if (command.startsWith("stop"))
        mjpeg.stop();
    else
        Serial.println("Unknown command");
}
Save pictures to SD card

Now that you know how to capture a picture with your ESP32 camera, you may want to
store those pictures on a permanent medium.
The ESP32 camera ships with internal storage (up to 16MB on some boards) that you
can partly fill with files. Most of the time, though, you will insert an external SD card for
both larger storage space and easier accessibility.
This page will show you how to easily interact with the SD card of your ESP32 camera to
save images on it in 3 different ways:
1. manually setting the filename
2. using an always-incrementing counter that survives reboots
3. using NTP (Network Time Protocol) to use the current timestamp as the filename

As an added bonus, when using NTP timestamping, I will show you how easy it is to
create a nested folder structure that saves pictures under the current date's folder (e.g.
/20231001/20231001T100000.jpg , /20231001/20231001T110000.jpg , ...) to keep your files
organized.

Hardware requirements
This project works with S3 and non-S3 boards.
If you don't have a board with SD card slot, you can still run the below sketches by
replacing each occurrence of sdmmc with spiffs .

#include <eloquent_esp32cam/extra/esp32/fs/sdmmc.h>
// becomes
#include <eloquent_esp32cam/extra/esp32/fs/spiffs.h>

if (sdmmc.save(camera.frame).to(filename).isOk()) {}
// becomes
if (spiffs.save(camera.frame).to(filename).isOk()) {}

Software requirements
This project is tested on EloquentEsp32Cam version 2.0.5

Arduino IDE Tools configuration for ESP32S3

Board: ESP32S3 Dev Module
Upload Speed: 921600
USB Mode: Hardware CDC and JTAG
USB CDC On Boot: Disabled
USB Firmware MSC On Boot: Disabled
USB DFU On Boot: Disabled
Upload Mode: UART0 / Hardware CDC
CPU Frequency: 240MHz (WiFi)
Flash Mode: QIO 80MHz
Flash Size: 4MB (32Mb)
Partition Scheme: Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level: Info
PSRAM: OPI PSRAM
Arduino Runs On: Core 1
Events Run On: Core 1
Erase All Flash Before Sketch Upload: Disabled
JTAG Adapter: Disabled
Save frame with manual filename
If you're manually interacting with the ESP32 camera or you want to use your own
naming scheme, you can easily choose the name of your file with the following code.
See source
Filename: Manual_Name.ino

/**
 * SDMMC file storage: manual filename
 *
 * This sketch shows how to save a picture on the SD Card
 * filesystem by specifying the filename manually
 *
 * Open the Serial Monitor and enter 'capture' (without quotes)
 * to capture a new image and save it to SD
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/extra/esp32/fs/sdmmc.h>

using namespace eloq;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___SAVE PIC TO SD CARD___");

    // camera settings
    // replace with your own model!
    camera.pinout.freenove_s3();
    camera.brownout.disable();
    camera.resolution.vga();
    camera.quality.high();

    // you can configure each pin of SDMMC (if needed)
    // (delete these lines if you are not sure)
    sdmmc.pinout.clk(39);
    sdmmc.pinout.cmd(38);
    sdmmc.pinout.d0(40);
    // or shorter
    sdmmc.pinout.freenove_s3();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // init SD
    while (!sdmmc.begin().isOk())
        Serial.println(sdmmc.exception.toString());

    Serial.println("Camera OK");
    Serial.println("SD card OK");
    Serial.println("Enter the filename where to save the picture");
}


void loop() {
    // await a filename from the Serial Monitor
    if (!Serial.available())
        return;

    String filename = Serial.readStringUntil('\n');
    filename.trim();

    if (!filename.endsWith(".jpg") && !filename.endsWith(".jpeg"))
        filename = filename + ".jpg";

    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // save to SD
    if (sdmmc.save(camera.frame).to(filename).isOk()) {
        Serial.print("File written to ");
        Serial.println(sdmmc.session.lastFilename);
    }
    else Serial.println(sdmmc.session.exception.toString());

    // you can also save under a nested folder
    if (sdmmc.save(camera.frame).inside("myfolder").to(filename).isOk()) {
        Serial.print("File written to ");
        Serial.println(sdmmc.session.lastFilename);
    }
    else Serial.println(sdmmc.session.exception.toString());

    // restart the loop
    Serial.println("Enter another filename");
}

The code may seem lengthy, but it is for the most part comments and configuration lines.
Here is the breakdown.

Configure SD MMC
EloquentEsp32Cam has a driver for the SD MMC library. I chose this library because it
requires no configuration in most cases; it just works out of the box.
In the cases where it does not (e.g. the Freenove S3 camera), you can configure the pins
with the following lines.

// replace with your board's actual pins
sdmmc.pinout.clk(39);
sdmmc.pinout.cmd(38);
sdmmc.pinout.d0(40);
// or shorter
sdmmc.pinout.freenove_s3();

When using the AiThinker camera, for example, you can just delete these lines.

Save frame under root directory


After you input your desired filename in the Serial monitor, this is the line that saves the
current frame under the root directory of the SD card.

if (sdmmc.save(camera.frame).to(filename).isOk()) {}

Save frame under nested directory


If you want to keep your files organized under a tree structure, you can nest a file under
your desired folder by adding inside(folder-name) .

if (sdmmc.save(camera.frame).inside("myfolder").to(filename).isOk()) {}

Here is the Serial Monitor log for this sketch.


Save frame with incremental filename
Many times you will use the ESP32 camera in a autonomous deployment (e.g. a timelapse
setup). In this case you may not want to bother with manually choosing a filename for
each frame. An incremental filename may be all you need.
By incremental naming I mean that your pictures will be saved as 0000001.jpg ,

0000002.jpg and so on.

Getting this result is as easy as re-using most of the above sketch and replacing

if (sdmmc.save(camera.frame).to(filename).isOk()) {}

with

if (sdmmc.save(camera.frame).to("", "jpg").isOk()) {}

In the first case, the to function accepted the complete name of the file to write.
In the second case, it accepts a file name and a file extension. If the filename is empty, it
will be filled with the incremental counter. The best part is that the counter survives
reboots, so you will never risk overwriting existing images.
As before, you can nest this file under a specific folder.

if (sdmmc.save(camera.frame).inside("myfolder").to("", "jpg").isOk()) {}

This same approach works with any kind of file, really, not just images.

// save text to a .txt file with an incremental name
if (sdmmc.save("Hello world!").to("", "txt").isOk()) {}

Here is the Serial Monitor.


And here's the full sketch.
See source
Filename: Incremental_Name.ino

/**
 * SDMMC file storage: incremental filename
 *
 * This sketch shows how to save a picture on the SD Card
 * filesystem by using an incremental filename that
 * persists across reboots
 *
 * Open the Serial Monitor and enter 'capture' (without quotes)
 * to capture a new image and save it to SD
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/extra/esp32/fs/sdmmc.h>

using namespace eloq;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___SAVE PIC TO SD CARD___");

    // camera settings
    // replace with your own model!
    camera.pinout.freenove_s3();
    camera.brownout.disable();
    camera.resolution.vga();
    camera.quality.high();

    // you can configure each pin of SDMMC (if needed)
    // (delete these lines if you're not sure)
    sdmmc.pinout.clk(39);
    sdmmc.pinout.cmd(38);
    sdmmc.pinout.d0(40);

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // init SD
    while (!sdmmc.begin().isOk())
        Serial.println(sdmmc.exception.toString());

    Serial.println("Camera OK");
    Serial.println("SD card OK");
    Serial.println("Enter 'capture' to capture a new picture");
}


void loop() {
    // await "capture" from the Serial Monitor
    if (!Serial.available())
        return;

    if (Serial.readStringUntil('\n') != "capture") {
        Serial.println("I only understand 'capture'");
        return;
    }

    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // save under root folder
    if (sdmmc.save(camera.frame).to("", "jpg").isOk()) {
        Serial.print("File written to ");
        Serial.println(sdmmc.session.lastFilename);
    }
    else Serial.println(sdmmc.session.exception.toString());

    // save under nested folder
    if (sdmmc.save(camera.frame).inside("myfolder").to("", "jpg").isOk()) {
        Serial.print("File written to ");
        Serial.println(sdmmc.session.lastFilename);
    }
    else Serial.println(sdmmc.session.exception.toString());

    // restart the loop
    Serial.println("Enter 'capture' to capture a new picture");
}
Save frame with NTP timestamping
In the most demanding scenarios, you may want to keep track of when each picture was
taken.
If you have access to WiFi, you can leverage NTP (Network Time Protocol) to keep track
of time and name your files based on the current date and time. This requires a bit of
configuration before you can use it. The full sketch is listed below.
See source
Filename: NTP_Name.ino
/**
 * SDMMC file storage: NTP filename
 *
 * This sketch shows how to save a picture on the SD Card
 * filesystem by generating a filename using current time
 *
 * Open the Serial Monitor and enter 'capture' (without quotes)
 * to capture a new image and save it to SD
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */

// if you define WIFI_SSID and WIFI_PASS before importing the library
// you can call wifi.connect() instead of wifi.connect(ssid, password)
#define WIFI_SSID "SSID"
#define WIFI_PASS "PASSWORD"

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/extra/esp32/ntp.h>
#include <eloquent_esp32cam/extra/esp32/fs/sdmmc.h>

using namespace eloq;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___SAVE PIC TO SD CARD___");

    // camera settings
    // replace with your own model!
    camera.pinout.freenove_s3();
    camera.brownout.disable();
    camera.resolution.vga();
    camera.quality.high();

    // if connected to the internet, try to get time from NTP
    // you can set your timezone offset from Greenwich
    ntp.offsetFromGreenwhich(0);
    // or any of
    ntp.cst();
    ntp.ist();
    ntp.eest();
    ntp.cest();
    ntp.bst();
    ntp.west();
    ntp.cet();
    ntp.gmt();
    ntp.edt();
    ntp.pdt();

    // enable/disable daylight saving
    ntp.isntDaylight();
    ntp.isDaylight();

    ntp.server("pool.ntp.org");

    // you can configure each pin of SDMMC (if needed)
    // (delete these lines if not sure)
    sdmmc.pinout.clk(39);
    sdmmc.pinout.cmd(38);
    sdmmc.pinout.d0(40);

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // init SD
    while (!sdmmc.begin().isOk())
        Serial.println(sdmmc.exception.toString());

    // connect to WiFi to sync NTP
    while (!wifi.connect().isOk())
        Serial.println(wifi.exception.toString());

    // get NTP time
    while (!ntp.begin().isOk())
        Serial.println(ntp.exception.toString());

    Serial.println("Camera OK");
    Serial.println("SD card OK");
    Serial.println("NTP OK");
    Serial.print("Current time is ");
    Serial.println(ntp.datetime());
    Serial.println("Enter 'capture' to capture a new picture");
}


void loop() {
    // await "capture" from the Serial Monitor
    if (!Serial.available())
        return;

    if (Serial.readStringUntil('\n') != "capture") {
        Serial.println("I only understand 'capture'");
        return;
    }

    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // save under root directory
    if (sdmmc.save(camera.frame).to(ntp.datetime(), "jpg").isOk()) {
        Serial.print("File written to ");
        Serial.println(sdmmc.session.lastFilename);
    }
    else Serial.println(sdmmc.session.exception.toString());

    // save under nested directory
    String date = ntp.date();
    String datetime = ntp.datetime();

    if (sdmmc.save(camera.frame).inside(date).to(datetime, "jpg").isOk()) {
        Serial.print("File written to ");
        Serial.println(sdmmc.session.lastFilename);
    }
    else Serial.println(sdmmc.session.exception.toString());

    // restart the loop
    Serial.println("Enter 'capture' to capture a new picture");
}
In this case, we added a new section to configure the NTP server.

// if connected to the internet, try to get time from NTP
// you can set your timezone offset from Greenwich
ntp.offsetFromGreenwhich(0);
// or any of
ntp.cst();
ntp.ist();
ntp.eest();
...

// enable/disable daylight saving
ntp.isntDaylight();
ntp.isDaylight();

ntp.server("pool.ntp.org");
NTP needs to know where you're located to give the correct time. This is achieved by
setting your offset from Greenwich time (in hours). A few helper methods are
provided for easier setup.
Then you need to configure if you're under daylight saving time or not.
The line

ntp.server("pool.ntp.org");
lets you configure a specific NTP server. If you don't have any specific reason to change
it, you can delete this line since pool.ntp.org is the default server.
Timestamped filename
To use the current timestamp as filename, it is as easy as

if (sdmmc.save(camera.frame).to(ntp.datetime(), "jpg").isOk()) {}
If you remember from above, to() accepts a filename and an extension. In this case we're
using the current datetime as filename and jpg as extension. The filename will look
something like 20231003T102005.jpg, where 20231003 is the date in YYYYMMDD format,
T is the time separator, and 102005 is the time in HHMMSS format.
If you prefer to categorize the images under a folder structure that tracks the date, it
couldn't be easier.
if (sdmmc.save(camera.frame).inside(ntp.date()).to(ntp.datetime(), "jpg").isOk()) {}
// or, if you want to name the file with the time only
if (sdmmc.save(camera.frame).inside(ntp.date()).to(ntp.time(), "jpg").isOk()) {}

This will generate a folder structure like below:

|- 20231001
   |- 20231001T100000.jpg
   |- 20231001T110000.jpg
   |- 20231001T120000.jpg
|- 20231002
   |- 20231002T100000.jpg
   |- 20231002T110000.jpg
   |- 20231002T120000.jpg
...
It feels like magic, doesn't it?
Conclusion
In this page you learned 3 ways to save the current frame of your ESP32 camera to an
external SD card:
1. manually setting the filename
2. using an always incrementing counter that survives reboots
3. using NTP (Network Time Protocol) to use the current timestamp as filename
After configuring the SD and (optionally) the NTP server, it only takes 1 line of code to
actually save the frame.
This gives you complete freedom over the file naming scheme and folder structure. One
less thing to worry about in your next project!
Telegram notifications
When working with a remote control or surveillance system, you need a way to get
notified by your ESP32 camera when something of interest happens.
Something of interest may be:
motion is detected
an object is recognized
a person is detected...
Regardless of the source of the event, you need a way to get a notification out of the board
directed to you. Among the many possible channels, Telegram is one of the most used in
the Arduino ecosystem, because it:
is easily accessible via API
runs on your phone, so you always have it with you
is free
works in realtime
There already exist a few libraries that cover most of the available options of the API. If
you want to stay lean, though, and leverage the syntactic style of the EloquentEsp32Cam
library, you're in luck: there's a minimal implementation of Telegram for you.
Hardware requirements
This project works with S3 and non-S3 boards.
Software requirements
This project is tested on EloquentEsp32Cam version 2.0.5
You will need a Telegram Bot Token and a Chat ID. See appendix.
Arduino IDE Tools configuration for ESP32S3
Board ESP32S3 Dev Module
Upload Speed 921600
USB Mode Hardware CDC and JTAG
USB CDC On Boot Disabled
USB Firmware MSC On Boot Disabled
USB DFU On Boot Disabled
Upload Mode UART0 / Hardware CDC
CPU Frequency 240MHz (WiFi)
Flash Mode QIO 80MHz
Flash Size 4MB (32Mb)
Partition Scheme Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level Info
PSRAM OPI PSRAM
Arduino Runs On Core 1
Events Run On Core 1
Erase All Flash Before Sketch Upload Disabled
JTAG Adapter Disabled
Send text message to Telegram
Sending a text message requires one line for configuration and one line to actually send
the text. Here's a sketch that sends a message with the content you type in the Serial
Monitor.
See source
Filename: Send_Text.ino
/**
 * Send text to Telegram
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */
// WiFi credentials
#define WIFI_SSID "SSID"
#define WIFI_PASS "PASSWORD"

// replace with your bot token and chat id
#define TELEGRAM_TOKEN "1234567890:AABBCCDDEEFFGGHHIILLMMN-NOOPPQQRRSS"
#define TELEGRAM_CHAT "0123456789"

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/extra/esp32/telegram.h>

using eloq::wifi;
using eloq::telegram;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___TELEGRAM TEXT MESSAGE___");

    // connect to WiFi
    while (!wifi.connect().isOk())
        Serial.println(wifi.exception.toString());

    // connect to Telegram API
    while (!telegram.begin().isOk())
        Serial.println(telegram.exception.toString());

    Serial.println("Telegram OK");
    Serial.println("Enter the text you want to send to Telegram chat");
}


void loop() {
    // await text
    if (!Serial.available())
        return;

    String text = Serial.readStringUntil('\n');

    Serial.print("Sending text: ");
    Serial.println(text);

    // send
    if (telegram.to(TELEGRAM_CHAT).send(text).isOk())
        Serial.println("Message sent to Telegram!");
    else
        Serial.println(telegram.exception.toString());

    Serial.println("Enter the text you want to send to Telegram chat");
}
The relevant lines are

while (!telegram.begin().isOk()) {}

to connect to the Telegram API and

if (telegram.to(TELEGRAM_CHAT).send(text).isOk()) {}

to send the text message.
Send image to Telegram
Sending a picture to Telegram is as easy as sending a text: you only need to capture a
frame first!
See source
Filename: Send_Image.ino
/**
 * Send image from camera to Telegram
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */
// WiFi credentials
#define WIFI_SSID "SSID"
#define WIFI_PASS "PASSWORD"

// replace with your bot token and chat id
#define TELEGRAM_TOKEN "1234567890:AABBCCDDEEFFGGHHIILLMMN-NOOPPQQRRSS"
#define TELEGRAM_CHAT "0123456789"

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/extra/esp32/telegram.h>

using eloq::camera;
using eloq::wifi;
using eloq::telegram;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___TELEGRAM IMAGE___");

    // camera settings
    // replace with your own model!
    camera.pinout.aithinker();
    camera.brownout.disable();
    camera.resolution.vga();
    camera.quality.high();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // connect to WiFi
    while (!wifi.connect().isOk())
        Serial.println(wifi.exception.toString());

    // connect to Telegram API
    while (!telegram.begin().isOk())
        Serial.println(telegram.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Telegram OK");
    Serial.println("Enter 'image' (without quotes) to send image to Telegram chat");
}


void loop() {
    // await command
    if (!Serial.available())
        return;

    if (Serial.readStringUntil('\n') != "image") {
        Serial.println("I only understand 'image' (without quotes)");
        return;
    }

    Serial.println("Sending photo...");

    // capture new frame
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // send
    if (telegram.to(TELEGRAM_CHAT).send(camera.frame).isOk())
        Serial.println("Photo sent to Telegram");
    else
        Serial.println(telegram.exception.toString());

    Serial.println("Enter 'image' (without quotes) to send image to Telegram chat");
}
As you can see, the only difference is that we replaced

if (telegram.to(TELEGRAM_CHAT).send(text).isOk())

with

if (telegram.to(TELEGRAM_CHAT).send(camera.frame).isOk())
Appendix A. Create a Telegram Bot
If you want to use your own Telegram Bot to get motion notifications, you need to create
one.
Creating a Telegram bot is a straightforward process: there's a bot to do it!
Search for BotFather in your Telegram account and send him a /newbot message, then
follow the instructions. You will have something similar to the screenshot below.
Replace this string in the TELEGRAM_TOKEN define.
Appendix B. Get your Chat ID
You can get your very own chat id by messaging the myidbot bot from your Telegram
account.
Replace this number in the CHAT_ID define.
Motion detection

Motion detection is the task of detecting when the scene in the ESP32 camera field of
view changes all of a sudden.
This change may be caused by many factors (an object moving, the camera itself
moving, a light change...) and you may be interested in getting notified when it happens.
For example, you may point your ESP32 camera to the door of your room and take a
picture when the door opens.
In a scenario like this, you are not interested in localizing the motion (knowing where it
happened in the frame), only that it happened.
Motion detection with PIR (infrared sensor)
Most tutorials on the web focus on human motion detection, so they equip the ESP32
with an external infrared sensor (a.k.a. PIR, the one you find in home alarm systems) and
take a photo when the PIR detects something.
If this setup works fine for you, go with it. It's easy, fast, low power and pretty accurate.
But the PIR approach has a few drawbacks:
1. you can only detect living beings: since it is based on infrared sensing, it can
only detect when something "hot" is in its field of view (humans and animals,
basically). If you want to detect a car passing, it won't work
2. it has a limited range: PIR sensors reach at most 10-15 meters. If you need to
detect people walking on the street in front of your house at 30 meters, it won't
work
3. it needs a clear line-of-sight: to detect infrared light, the PIR sensor needs no
obstacles in-between itself and the moving object. If you put it behind a window
to detect people outside your home, it won't work
4. it triggers even when no motion happened: the PIR technique is actually a proxy
for motion detection. The PIR sensor doesn't actually detect motion: it detects
the presence of warm objects. For example, if a person comes into a room and
lies down on the sofa, the PIR sensor will keep triggering for as long as the
person stays in the room
Motion detection without PIR (image based)
On the other hand, image motion detection can fulfill all the above cases because it
performs motion detection on the camera frames, comparing each one with the previous
looking for differences.
If a large portion of the image changed, it triggers.
Video motion detection has its drawbacks, nonetheless:
1. power-hungry: comparing each frame with the previous frame means the
camera must be always on. While with the PIR sensor you can put the camera to
sleep, now you have to continuously check each frame
2. insensitive to slow changes: to avoid false triggers, you will set a threshold
on the portion of the image that needs to change to detect motion (e.g. 10% of the
frame). If something moves slowly in your field of view and changes
less than 10% of the frame, the algorithm will not pick it up.
Take some time to review the pros and cons of video motion detection now that you have
a few more details.
Hardware requirements
This project works with ESP32S3 and non-S3 boards.
Software requirements
EloquentEsp32Cam >= 2.2
Arduino IDE Tools configuration for ESP32S3
Board ESP32S3 Dev Module
Upload Speed 921600
USB Mode Hardware CDC and JTAG
USB CDC On Boot Disabled
USB Firmware MSC On Boot Disabled
USB DFU On Boot Disabled
Upload Mode UART0 / Hardware CDC
CPU Frequency 240MHz (WiFi)
Flash Mode QIO 80MHz
Flash Size 4MB (32Mb)
Partition Scheme Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level Info
PSRAM OPI PSRAM
Arduino Runs On Core 1
Events Run On Core 1
Erase All Flash Before Sketch Upload Disabled
JTAG Adapter Disabled
Motion detection configuration
In the sketch below, you can configure a few parameters for the detector algorithm:
1. skip(n): don't run detection on the first n frames. Many times, the camera
needs a little time to settle, and this prevents false positives at startup
2. stride(n): the image is divided into a grid where each cell size equals the
stride. Motion is analyzed only at the corners of each cell. The larger the
stride, the faster the execution, but the coarser the detection. Keep in mind
that motion detection already happens on a 1/8th version of the original image!
So, if you start with a VGA image (640x480), motion detection happens on an
80x60 grid. If you set stride(2), motion detection happens on a 40x30
grid instead. You can leave this value at 1 if you're not sure: it is still pretty fast.
3. threshold(n): if the value of the pixel at a corner changed from the
previous frame by more than this value, that point is marked as a moving point.
The higher the value, the less sensitive the detection is to noise and small
changes.
4. ratio(percent): the ratio of moving points over total points above
which motion is triggered. The higher the value (from 0 to 1), the more points
need to be marked as moving to trigger a detection.
5. rate limiting: when an object is moving in front of the camera, it will trigger
the detection for the whole time it is passing by. Most often, you only want a
single trigger per event (for example to turn on an actuator or send a
notification) and not get spammed with consecutive activations. You can
configure how many seconds must elapse between one trigger and the next.
Motion detection quickstart
See source
Filename: Motion_Detection.ino
/**
 * Motion detection
 * Detect when the frame changes by a reasonable amount
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = DEBUG"
 * to turn on debug messages
 */
#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/motion/detection.h>

using eloq::camera;
using eloq::motion::detection;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___MOTION DETECTION___");

    // camera settings
    // replace with your own model!
    camera.pinout.freenove_s3();
    camera.brownout.disable();
    camera.resolution.vga();
    camera.quality.high();

    // configure motion detection
    // the higher the stride, the faster the detection
    // the higher the stride, the lesser granularity
    detection.stride(1);
    // the higher the threshold, the lesser sensitivity
    // (at pixel level)
    detection.threshold(5);
    // the higher the threshold, the lesser sensitivity
    // (at image level, from 0 to 1)
    detection.ratio(0.2);
    // optionally, you can enable rate limiting (aka debounce)
    // motion won't trigger more often than the specified frequency
    detection.rate.atMostOnceEvery(5).seconds();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Awaiting for motion...");
}


void loop() {
    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // run motion detection
    if (!detection.run().isOk()) {
        Serial.println(detection.exception.toString());
        return;
    }

    // on motion, perform action
    if (detection.triggered())
        Serial.println("Motion detected!");
}
Pretty easy, no?
This basic sketch will cover most of your use cases for smart projects, like:
intruder alert
vision-based triggers (very inefficient - not recommended)
starting object detection
In the image above, you can see motion detection and the rate limiter in action: even if
the moving points ratio exceeded the threshold multiple times, the message
Motion detected! gets printed only the first time.
Switch resolution on the fly
To get a fast detection, you want to run the motion algorithm on a frame that is as small
as possible (VGA or even QVGA).
But what if you want to capture a picture of what caused the motion and save it on your
SD card for later inspection? In this case, you want a frame resolution as large as
possible, instead.
This would be a dilemma... if the EloquentEsp32cam library didn't address this specific
(but common) use case!
The resolution object has a function that allows you to change frame size on
the fly, do your thing (e.g. take a picture), and switch back to the original resolution
automatically.
See source
Filename: Motion_Detection_Higher_Resolution.ino
/**
 * Run motion detection at low resolution.
 * On motion, capture frame at higher resolution
 * for SD storage.
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */
#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/motion/detection.h>

using eloq::camera;
using eloq::motion::detection;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___MOTION DETECTION + SWITCH RESOLUTION___");

    // camera settings
    // replace with your own model!
    camera.pinout.freenove_s3();
    camera.brownout.disable();
    camera.resolution.vga();
    camera.quality.high();

    // see example of motion detection for config values
    detection.skip(5);
    detection.stride(1);
    detection.threshold(5);
    detection.ratio(0.2);
    detection.rate.atMostOnceEvery(5).seconds();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Awaiting for motion...");
}


void loop() {
    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // run motion detection
    if (!detection.run().isOk()) {
        Serial.println(detection.exception.toString());
        return;
    }

    // on motion, perform action
    if (detection.triggered()) {
        Serial.printf(
            "Motion detected on frame of size %dx%d (%d bytes)\n",
            camera.resolution.getWidth(),
            camera.resolution.getHeight(),
            camera.getSizeInBytes()
        );

        Serial.println("Taking photo of motion at higher resolution");

        camera.resolution.at(FRAMESIZE_UXGA, []() {
            Serial.printf(
                "Switched to higher resolution: %dx%d. It took %d ms to switch\n",
                camera.resolution.getWidth(),
                camera.resolution.getHeight(),
                camera.resolution.benchmark.millis()
            );

            camera.capture();

            Serial.printf(
                "Frame size is now %d bytes\n",
                camera.getSizeInBytes()
            );

            // save to SD...
        });

        Serial.println("Resolution switched back to VGA");
    }
}
How long does it take to make the switch?
On my Freenove S3 board, it takes 44 milliseconds. You have to test this time on your
own board and see if it fits your project. If you think 44 ms will make you miss the subject
that triggered motion (a fast moving object), then your only option is to increase the
resolution by default and accept a slower detection rate.
CAUTION: ESP32-cam boards from AiThinker (non-S3) may not have enough memory to
capture pictures at maximum resolution (UXGA, 1600x1200). In that case you can still
switch to higher resolutions (e.g. SVGA), though not full resolution.
Event-Driven motion detection
In the Quickstart sketch, we saw how easy and linear it is to run motion detection; it
only requires a few lines of code. Nevertheless, the loop() function is pretty lengthy
now because it has to continuously check if motion happened in the frame.
In this section, I'm going to show you how to move all the logic into the setup() function
instead. We will exploit a style of programming called event-driven (or reactive). Event-driven
programming consists in registering a listener function that will run when an event
of interest happens. In our case, the event of interest is motion being detected.
Why is this useful?
As I said, because it allows for leaner loop() code, where you can focus on running
other tasks that need to occur in parallel to motion detection. Often, event listeners also
help to isolate specific functionalities (motion detection handling) into their own routines,
visually de-cluttering other tasks' code.
Here's the updated sketch.
This source code is only available to paying users.
The configuration part is exactly the same as before. The new entry is the daemon object
which does 2 things:
1. accepts event listeners to be run on events of interest
2. runs the motion detection code in background
To register the callback to run when motion is detected, we use

detection.daemon.onMotion(listener_function);
Now you could even leave the loop() empty and you will still see the Motion detected!
message printed on the Serial Monitor when you move your camera around.
Motion detection to MQTT
In your specific project, detecting motion may only be the first step in a larger system.
Maybe you want to log the times motion events were detected in a database, or get a
notification on your phone, or an email... whatever. There are many ways to accomplish
this goal. One of the most popular in the maker community is using the MQTT protocol as
a means of communication between systems.
The EloquentEsp32Cam library has a first-party integration for MQTT.
In the following sketch, you will have to replace the test.mosquitto.org broker with your
own and (if required) add proper authentication. Besides that, the sketch will work out of
the box.
Software requirements
EloquentEsp32Cam >= 2.2
PubSubClient >= 2.8
This source code is only available to paying users.
What is the payload that will be uploaded? It is a JSON object with the following
structure.
{"motion": true}
Motion detection to Telegram
In the tutorial about sending text and pictures to Telegram we saw how easy this can be.
Now we will see how easy it is to integrate Telegram into motion detection by leveraging
the event-driven approach.
This source code is only available to paying users.
The motion detection configuration setup is the same as in previous examples. Telegram
configuration is also the same. The only addition is the definition of an actual event
listener for the motion event.
detection.daemon.onMotion([]() {
    Serial.println("Motion detected");

    if (!telegram.to(TELEGRAM_CHAT).send(camera.frame).isOk())
        Serial.println(telegram.exception.toString());
});
Object Detection with Edge Impulse FOMO

Object detection is the task of detecting an object of interest inside an image. Until a
couple of years ago, this task was exclusive to full-fledged computers due to the complexity
of the models and the prohibitive number of math operations to perform.
Thanks to platforms like Edge Impulse, however, the entry barrier for beginners has
become much lower and it is now possible to:
1. easily train an object detection model in the cloud
2. (not so easily) deploy this model to the Esp32 camera
Sadly, the Esp32 camera seems not to be a first-class citizen on the Edge Impulse
platform, and the scarce documentation and tutorials available online miss a lot of
fundamental pieces.
The purpose of this post is to help you develop and deploy your very own object
detection model to your Esp32 camera with detailed, easy to follow steps, even if
you're a beginner with Edge Impulse or Arduino programming.
Let's start!
Hardware requirements
This project works with S3 and non-S3 boards. An S3 board is highly recommended
though.
Software requirements
ESP32 Arduino core version 2.x. Not 1.x. Not 3.x.
EloquentEsp32Cam >= 2.2
You will need a free account on Edge Impulse.
If you've never used Edge Impulse, I suggest you watch a couple of video tutorials online
before we start, otherwise you may get lost later on (there's an official tutorial
playlist on YouTube).
Arduino IDE Tools configuration for ESP32S3
Board ESP32S3 Dev Module
Upload Speed 921600
USB Mode Hardware CDC and JTAG
USB CDC On Boot Disabled
USB Firmware MSC On Boot Disabled
USB DFU On Boot Disabled
Upload Mode UART0 / Hardware CDC
CPU Frequency 240MHz (WiFi)
Flash Mode QIO 80MHz
Flash Size 4MB (32Mb)
Partition Scheme Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level Info
PSRAM OPI PSRAM
Arduino Runs On Core 1
Events Run On Core 1
Erase All Flash Before Sketch Upload Disabled
JTAG Adapter Disabled
Let's start with the end result.
Watch video online at https://eloquentarduino.com/video/FOMO-LIVE-excfg.mp4
Object Detection using ESP32(S3) Camera Quickstart
To perform object detection on our ESP32 camera board, we will follow these steps:
1. Collect images from the camera
2. Use Edge Impulse to label the images
3. Use Edge Impulse to train the model
4. Use Edge Impulse to export the model into an Arduino library
5. Use EloquentEsp32Cam to run the model
Before this post existed, steps 1. and 5. were not as easy as they should've been.
I invite you to stop reading this post for a minute and search Google for "Esp32-cam
object detection": read the first 4-5 tutorials in the list and tell me if you're able to deploy
such a project.
I was not.
And I bet you won't be either.
Enough words, it's time to start tinkering.
CAUTION: there are 8 steps to follow from start to finish in this tutorial. Some of them
only take 1 minute to execute, so don't get scared. But remember that it is crucial for you
to follow them one by one, in the exact order they're written!
Collect images from the ESP32 camera
This is the first part where things get weird in other tutorials. Many of them suggest
that you use images from Google/the internet to train your model. Others suggest
collecting data with your smartphone. How random images from the web, or from a
40 MP camera, can help train a good model that will run on $2 camera hardware is
beyond me.
That said, I believe that to get good results, we need to collect images directly from our
own Esp32 camera board.
This used to be hard.
But now, thanks to the tools I created for you, you will complete this task in a matter of
minutes.
[1/8] Upload the "Collect_Images_for_EdgeImpulse" Sketch to your Board
After you installed the EloquentEsp32Cam library, navigate to
File > Examples > EloquentEsp32Cam > Collect_Images_for_EdgeImpulse and upload
the sketch to your board.

In case you want to know what's inside, here's the code.
See source
Filename: Collect_Images_for_EdgeImpulse.ino
/**
 * Collect images for Edge Impulse image
 * classification / object detection
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */

// if you define WIFI_SSID and WIFI_PASS before importing the library,
// you can call connect() instead of connect(ssid, pass)
//
// If you set HOSTNAME and your router supports mDNS, you can access
// the camera at http://{HOSTNAME}.local
#define WIFI_SSID "SSID"
#define WIFI_PASS "PASSWORD"
#define HOSTNAME "esp32cam"

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/extra/esp32/wifi/sta.h>
#include <eloquent_esp32cam/viz/image_collection.h>

using eloq::camera;
using eloq::wifi;
using eloq::viz::collectionServer;


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___IMAGE COLLECTION SERVER___");

    // camera settings
    // replace with your own model!
    camera.pinout.wroom_s3();
    camera.brownout.disable();
    // Edge Impulse models work on square images
    // face resolution is 240x240
    camera.resolution.face();
    camera.quality.high();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // connect to WiFi
    while (!wifi.connect().isOk())
        Serial.println(wifi.exception.toString());

    // init image collection http server
    while (!collectionServer.begin().isOk())
        Serial.println(collectionServer.exception.toString());

    Serial.println("Camera OK");
    Serial.println("WiFi OK");
    Serial.println("Image Collection Server OK");
    Serial.println(collectionServer.address());
}


void loop() {
    // server runs in a separate thread, no need to do anything here
}

Don't forget to replace SSID and PASSWORD with your WiFi credentials!
Don't forget to set your own camera model pinout, if different from the one in the sketch.
Once the upload is done, open the Serial Monitor and take note of the IP address of the
camera.
Open a browser and navigate either to http://esp32cam.local or the IP address of the
camera.

[2/8] Prepare your environment


The OV2640 sensor that comes with your ESP32 camera is pretty good for its price,
really. Nevertheless, it still remains a $1, 2MP camera, so you can't expect great results in
every scenario.
Do yourself a favor and put some effort into creating a proper shooting environment:
1. use proper illumination (either bright sun or artificial light)
2. de-clutter your environment (use a desk and a flat, monochrome background -
e.g. a wall)

[3/8] Collect images in a web browser


Back to the webpage at http://esp32cam.local.
To collect the images, you simply need to click Start collecting . To stop, click Stop .

Please follow these EXACT steps to collect a good quality dataset (this guarantees you
won't run into trouble later on):
1. put nothing in front of the camera. Collect 15-20 images. Pause, download and
clear
2. put the first object in front of the camera. Collect 20-30 images while moving the
camera a bit around. Try to capture different angles and positions of the object.
Pause, download and clear
3. repeat 2. for each object you have

After you finish, you will have one zip of images for each object class + one for "no
object / background".
Extract the zips and move to the next step.
[4/8] Use Edge Impulse to Label the Images
For object detection to work, we need to label the objects we want to recognize.
There are a few tools online, but Edge Impulse has an integrated one that works well
enough for our project. Register a free account on edgeimpulse.com if you don't have
one already.
Create a new project, name it something like esp32-cam-object-detection , then choose
Images > Classify multiple objects > Import existing data .

Now follow these EXACT steps to speed up the labelling:


1. click Select files and select all the images in the "no object / background"
folder; check Automatically split between training and testing ; then hit
Begin upload
2. close the modal and click Labelling queue in the bar on top
3. always click Save labels without doing anything! (since there's no object in
these images, we'll use them for background modelling)
4. once done for all the images, go back to Upload data in the bar on top
5. click Select files and select all the images in the first object folder. Only
upload the images of a single folder at a time!!
6. Check Automatically split between training and testing ; then hit
Begin upload

7. go to Labelling queue in the top bar and draw the box around the object you
want to recognize. On the right, make sure
Label suggestions: Track objects between frames is selected

8. label all the images. Make sure to fix the bounding box to fit the object while
leaving a few pixels of padding
9. repeat steps 5-8 for each object

If you upload all the images at once, the labelling queue will mix the different objects and
you will waste a lot more time drawing the bounding boxes.

Be smart!
At this point, you will have all the data you need to train the model.

[5/8] Use Edge Impulse to Train the Model


If you've ever used Edge Impulse, you know this part is going to be pretty easy.
1. Navigate to Impulse design on the left menu
2. enter 48 as both image width and image height
3. select Squash as resize mode
4. Add the Image processing block
5. Add the Object detection learning block
6. save the impulse

48 is a value that I found works pretty well with non-S3 cameras: it generates a model
small enough to fit in memory yet produces usable results. If you have an S3 board, you
can raise this value to 60 or even 96 .
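To see why the input size matters so much, here's a back-of-the-envelope calculation of the input tensor size for a square grayscale image (1 byte per pixel). This is an illustration of the memory arithmetic, not the library's actual accounting:

```cpp
// Bytes needed to hold a side x side grayscale input (1 byte per pixel).
// 48x48 -> 2304 bytes, 60x60 -> 3600 bytes, 96x96 -> 9216 bytes.
// Doubling the side quadruples the input (and the model activations
// grow with it), which is why larger inputs call for an S3 with PSRAM.
constexpr int grayscaleBytes(int side) {
    return side * side;
}
```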

1. navigate to Impulse design > Image on the left


2. select Grayscale as color depth
3. click on Save parameters
4. click on Generate features

It should take less than a minute to complete, depending on the number of images.
If you only have a single class of object (as I do in the image), the plot on the right will
have little meaning; just ignore it.
What about RGB images?
Edge Impulse also works with RGB images. The problem is that RGB models are slower
than grayscale ones, so I suggest you start with the grayscale version, test it on your
board, and "upgrade" to the RGB one only if you find it is not performing well.

1. navigate to Impulse design > Object detection


2. set the number of training cycles to 35
3. set learning rate to 0.005
4. click on Choose a different model right below the FOMO block
5. select FOMO (Faster Objects, More Objects) MobileNetV2 0.1

This is the smallest model we can train and the only one that fits on non-ESP32S3
camera boards.
Hit Start training and wait until it completes. It can take 4-5 minutes depending on the
number of images.

To get accurate estimates of inferencing time and memory usage, as the image above
shows, be sure to select "Espressif ESP-EYE" as the target board in the top-right corner
of the Edge Impulse page.
If you're satisfied with the results, move to the next step. If you're not, you have to:
1. collect more / better images. Good input data and labelling are critical in
Machine Learning
2. increase the number of training cycles (don't go over 50, it's almost useless)
3. decrease the learning rate to 0.001. It may help, or not

Now that all looks good, it's time to export the model.

[6/8] Export the Model to an Arduino Library


This is the shortest step.
1. Navigate to Deployment on the left menu
2. select Arduino Library
3. scroll down and hit Build .

A zip containing the model library will download.

[7/8] Sketch deployment


Before I worked on this myself, as a beginner with Edge Impulse, I couldn't find a single
good tutorial on the entire web on how to deploy an object detection model to the ESP32
camera!
That's why I'm keeping this tutorial free instead of making it a premium chapter of my
paid eBook.
1. open the Arduino IDE and create a new sketch
2. select Esp32 Dev Module or ESP32S3 Dev Module (depending on your board) as
the board
3. enable external PSRAM (for S3 boards, select OPI PSRAM )
4. navigate to Sketch > Include library > Add .zip library and select the zip
downloaded from Edge Impulse.
5. copy the sketch below inside your project

Don't forget to change the name of the Edge Impulse library if your filename is
different!
See source
Filename: EdgeImpulse_FOMO_NO_PSRAM.ino
/**
 * Run Edge Impulse FOMO model.
 * It works on both PSRAM and non-PSRAM boards.
 *
 * The difference from the PSRAM version
 * is that this sketch only runs on 96x96 frames,
 * while PSRAM version runs on higher resolutions too.
 *
 * The PSRAM version can be found in my
 * "ESP32S3 Camera Mastery" course
 * at https://dub.sh/ufsDj93
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */
#include <your-fomo-project_inferencing.h>
#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/edgeimpulse/fomo.h>

using eloq::camera;
using eloq::ei::fomo;

void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("__EDGE IMPULSE FOMO (NO-PSRAM)__");

    // camera settings
    // replace with your own model!
    camera.pinout.aithinker();
    camera.brownout.disable();
    // NON-PSRAM FOMO only works on 96x96 (yolo) RGB565 images
    camera.resolution.yolo();
    camera.pixformat.rgb565();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Put object in front of camera");
}

void loop() {
    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // run FOMO
    if (!fomo.run().isOk()) {
        Serial.println(fomo.exception.toString());
        return;
    }

    // how many objects were found?
    Serial.printf(
        "Found %d object(s) in %dms\n",
        fomo.count(),
        fomo.benchmark.millis()
    );

    // if no object is detected, return
    if (!fomo.foundAnyObject())
        return;

    // if you expect to find a single object, use fomo.first
    Serial.printf(
        "Found %s at (x = %d, y = %d) (size %d x %d). "
        "Proba is %.2f\n",
        fomo.first.label,
        fomo.first.x,
        fomo.first.y,
        fomo.first.width,
        fomo.first.height,
        fomo.first.proba
    );

    // if you expect to find many objects, use fomo.forEach
    if (fomo.count() > 1) {
        fomo.forEach([](int i, bbox_t bbox) {
            Serial.printf(
                "#%d) Found %s at (x = %d, y = %d) (size %d x %d). "
                "Proba is %.2f\n",
                i + 1,
                bbox.label,
                bbox.x,
                bbox.y,
                bbox.width,
                bbox.height,
                bbox.proba
            );
        });
    }
}

[8/8] Run
This is going to be the most rewarding step of the whole tutorial.
Save the sketch, hit upload, open the Serial Monitor and watch the prediction text
magically appear while you put your objects in front of the camera.
Watch video online at https://eloquentarduino.com/video/FOMO-LIVE-excfg.mp4
Object Detection at higher resolutions
So far, we've been collecting images at YOLO resolution (96x96) to save RAM and
perform little-to-no rescaling on the source image. This allowed us to fit the model in the
little space available on non-S3 boards.
But what if you own an S3 board with plenty of RAM?
Why shouldn't you leverage that much more memory?
In fact, you can capture images at the resolution you prefer (up to the RAM limit, of
course) and still have space to run the FOMO model.
The following sketch is very similar to the previous one, but with a huge difference:
instead of setting yolo resolution with RGB565 encoding, it uses JPEG encoding at
VGA resolution (640 x 480).

This source code is only available to paying users.

You may ask what difference it makes.


Let's say you're implementing a wildlife camera to capture bears in the forest. You may
want to store those frames on an SD card when you recognize such an animal.
With the non-PSRAM version, you won't be able to do so, since the image is captured at
96x96 resolution and would be too small to be of any use. Now, instead, you have
access to the full resolution image.
Event-Driven object detection
In the Quickstart sketch, we saw how easy and linear it is to run object detection; it only
requires a few lines of code. Nevertheless, the loop() function is pretty lengthy now
because it has to continuously check if one or more objects are present in the frame.
In this section, I'm going to show you how to move all the logic into the setup() function
instead. We will exploit a style of programming called event-driven (or reactive). Event-driven
programming consists of registering a listener function that will run when an event
of interest happens. In our case, the event of interest is one or more objects being
detected.
Why is this useful?
To begin with, it allows for leaner loop() code, where you can focus on running
other tasks that need to occur in parallel to object detection. Moreover, event listeners often
help to isolate specific functionalities (object detection handling) into their own routines,
visually de-cluttering other tasks' code.
Here's the updated sketch.

This source code is only available to paying users.

The configuration part is exactly the same as before. The new entry is the daemon object,
which does 2 things:
1. accepts event listeners to be run on events of interest
2. runs the object detection code in the background

There are 3 types of listeners you can register:


1. whenYouDontSeeAnything : this runs when no object is detected
2. whenYouSeeAny : this runs whenever an object (any object) is detected
3. whenYouSee(label) : this runs only when an object with the given label is
detected
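To make the pattern concrete, here is a minimal, library-independent sketch of how such a listener registry can work. The member names mirror the three listeners above, but this dispatcher is illustrative standard C++ running synchronously, not the EloquentEsp32Cam implementation (which runs in a background thread):

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// a bare-bones stand-in for a detection result
struct Detection {
    std::string label;
};

class DetectionDaemon {
public:
    // runs when a frame contains no objects
    std::function<void()> whenYouDontSeeAnything;
    // runs once per detected object, whatever its label
    std::function<void(const Detection&)> whenYouSeeAny;
    // runs only for objects carrying a specific label
    std::map<std::string, std::function<void(const Detection&)>> whenYouSee;

    // dispatch the detections of a single frame to the registered listeners
    void dispatch(const std::vector<Detection>& found) {
        if (found.empty()) {
            if (whenYouDontSeeAnything) whenYouDontSeeAnything();
            return;
        }
        for (const Detection& d : found) {
            if (whenYouSeeAny) whenYouSeeAny(d);
            auto it = whenYouSee.find(d.label);
            if (it != whenYouSee.end()) it->second(d);
        }
    }
};
```

With this pattern, the frame loop only calls dispatch(), while all the project-specific logic lives where the listeners are registered.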
Object detection to MQTT
In your specific project, detecting an object may only be the first step in a larger system.
Maybe you want to log how many objects were detected in a database, or get a
notification on your phone, or an email... whatever. There are many ways to accomplish
this goal. One of the most popular in the maker community is using the MQTT protocol as
a means of communication between systems.
The EloquentEsp32Cam library has first-party integration for MQTT.
In the following sketch, you will have to replace the test.mosquitto.org broker with your
own and (if required) add proper authentication. Besides that, the sketch will work out of
the box.

Software requirements
EloquentEsp32Cam >= 2.2

PubSubClient >= 2.8

This source code is only available to paying users.

What is the payload that will be published? It is a JSON description of all the objects
found in the frame.

[{"label": "penguin", "x": 8, "y": 16, "w": 32, "h": 32, "proba": 0.8}, ...]
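If another device in your pipeline needs to produce (or check) this flat payload, here is a minimal sketch of how such a JSON array can be assembled from plain structs. This is illustrative standard C++; the BBox struct below is a hypothetical mirror of the bounding-box fields, not the library's internal serializer:

```cpp
#include <sstream>
#include <string>
#include <vector>

// hypothetical bounding box mirroring the fields in the payload above
struct BBox {
    std::string label;
    int x, y, width, height;
    float proba;
};

// build a JSON array like the one shown above
std::string toJson(const std::vector<BBox>& boxes) {
    std::ostringstream json;
    json << "[";
    for (size_t i = 0; i < boxes.size(); i++) {
        const BBox& b = boxes[i];
        if (i > 0) json << ", ";
        json << "{\"label\": \"" << b.label << "\", "
             << "\"x\": " << b.x << ", \"y\": " << b.y << ", "
             << "\"w\": " << b.width << ", \"h\": " << b.height << ", "
             << "\"proba\": " << b.proba << "}";
    }
    json << "]";
    return json.str();
}
```

For anything more elaborate than this flat schema, a proper JSON library (e.g. ArduinoJson on the ESP32 side) is a safer choice than manual string building.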
Object detection streaming
So far, we were only able to debug object detection in the Serial Monitor. It would be
much better if we could debug it visually, watching the realtime streaming video from the
camera at the same time.
As far as I know (at least as of Dec 2023), this feature does not exist anywhere
else on the web. It took a lot of work to make this happen, but it was worth it, since the
result is stunning.
Watch video online at https://eloquentarduino.com/video/FOMO-stream-lorwe.mp4

When the object that you trained your FOMO model on is recognized, you will see a dot
on its centroid.
The streaming may be laggy because of poor WiFi. Consider that this sketch is mainly for
debugging purposes, not for real-time execution.

Here's the code.

This source code is only available to paying users.


Person Detection

Person detection running on our cheap ESP32 cam has become a common task
nowadays.
Only a few years ago this would have been simply impossible because of the lack of both
hardware (Arduino boards used to feature a mediocre 16 KB of RAM) and software (neural
network support for embedded systems was simply non-existent).
As of today, lots of things have changed and person detection has even become one of
the getting-started projects for TinyML. Nonetheless, many tutorials on the subject are a bit
convoluted and pretty hard to follow, let alone modify and integrate into your very own
project.
But it doesn't have to be that way.
I'm going to show you how you can add person detection on your very own ESP32
camera project in 5 minutes!

Software requirements
You will need to install 3 libraries:
1. tflm_esp32 : the TensorFlow Lite Micro runtime for the ESP32
2. EloquentTinyML : a wrapper around TensorFlow
3. EloquentEsp32Cam : to interact with the camera easily

You can install them all from the Arduino Library Manager.
Be sure you have the latest version of each!

Arduino IDE Tools configuration for ESP32S3


Board ESP32S3 Dev Module
Upload Speed 921600
USB Mode Hardware CDC and JTAG
USB CDC On Boot Disabled
USB Firmware MSC On Boot Disabled
USB DFU On Boot Disabled
Upload Mode UART0 / Hardware CDC
CPU Frequency 240MHz (WiFi)
Flash Mode QIO 80MHz
Flash Size 4MB (32Mb)
Partition Scheme Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level Info
PSRAM OPI PSRAM
Arduino Runs On Core 1
Events Run On Core 1
Erase All Flash Before Sketch Upload Disabled
JTAG Adapter Disabled
Arduino sketch
This part is easy: you don't have to write much code or configure many options.
See source
Filename: PersonDetectionExample.ino
/**
 * Run person detection on ESP32 camera
 * - Requires tflm_esp32 library
 * - Requires EloquentEsp32Cam library
 *
 * Detection takes about 4-5 seconds per frame
 */
#include <Arduino.h>
#include <tflm_esp32.h>
#include <eloquent_tinyml.h>
#include <eloquent_tinyml/zoo/person_detection.h>
#include <eloquent_esp32cam.h>

using eloq::camera;
using eloq::tinyml::zoo::personDetection;

void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("__PERSON DETECTION__");

    // camera settings
    // replace with your own model!
    camera.pinout.freenove_s3();
    camera.brownout.disable();
    // only works on 96x96 (yolo) grayscale images
    camera.resolution.yolo();
    camera.pixformat.gray();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // init tf model
    while (!personDetection.begin().isOk())
        Serial.println(personDetection.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Point the camera to yourself");
}

void loop() {
    Serial.println("Loop");

    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // run person detection
    if (!personDetection.run(camera).isOk()) {
        Serial.println(personDetection.exception.toString());
        return;
    }

    // a person has been detected!
    if (personDetection) {
        Serial.print("Person detected in ");
        Serial.print(personDetection.tf.benchmark.millis());
        Serial.println("ms");
    }
}

The sketch can be broken into the following parts.

Configure
// configure camera
camera.pinout.freenove_s3();
camera.brownout.disable();
// only works on 96x96 (yolo) grayscale images
camera.resolution.yolo();
camera.pixformat.gray();

// init camera
while (!camera.begin().isOk())
    Serial.println(camera.exception.toString());

// init tf model
while (!personDetection.begin().isOk())
    Serial.println(personDetection.exception.toString());

Run
// capture picture
if (!camera.capture().isOk()) {
    Serial.println(camera.exception.toString());
    return;
}

// run person detection
if (!personDetection.run(camera).isOk()) {
    Serial.println(personDetection.exception.toString());
    return;
}

Test
// test if a person is detected
if (personDetection) {
    // do whatever you want here...
}
Customize
Customizing the sketch is easy. You can add your own logic, to be performed when a
person is detected, inside the if block above.
A few use cases you can implement:
a smart door bell that activates when it recognizes a person in front of it
a smart relay that turns the lights on when you're back at home
count the number of people passing in front of your camera
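The last use case needs a tiny bit of state: if you increment a counter on every frame where a person is visible, one person standing still gets counted dozens of times. A minimal edge-triggered counter (a plain C++ illustration, not part of the library) counts each appearance only once:

```cpp
// Counts "no person -> person" transitions instead of visible frames,
// so one person standing in front of the camera is counted once.
class PersonCounter {
    bool wasVisible = false;
    int total = 0;
public:
    // call once per frame with the boolean detection result
    void update(bool personVisible) {
        if (personVisible && !wasVisible)
            total++;
        wasVisible = personVisible;
    }

    int count() const { return total; }
};
```

In the sketch above you would call update() once per loop() iteration (e.g. with the truthiness of personDetection) and read count() whenever you need the total.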

Feel free to let me know what you implemented with the form below.
If you're stuck, don't hesitate to ask for help.
Appendix. Camera config
To run the above code, set these configurations in your Tools menu (left is for ESP32,
right for ESP32S3).
Face detection

The new ESP32S3 chip has a lot more processing power than the old ESP32. This makes
it suitable for many tasks that, in the past, would have taken too much time to run in
a realtime application.
One of these tasks is face detection. With the ESP32S3 chip, you can run accurate face
detection in 80ms!
The official ESP-IDF API is not too difficult to use, but I made it even easier with the
EloquentEsp32Cam library. On this page, I'll show you how easy it can be to run fast and
accurate face detection on your ESP32S3 camera board.

Hardware requirements
This project only works with ESP32S3 boards.

Software requirements
EloquentEsp32Cam >= 2.2

Arduino IDE Tools configuration for ESP32S3


Board ESP32S3 Dev Module
Upload Speed 921600
USB Mode Hardware CDC and JTAG
USB CDC On Boot Disabled
USB Firmware MSC On Boot Disabled
USB DFU On Boot Disabled
Upload Mode UART0 / Hardware CDC
CPU Frequency 240MHz (WiFi)
Flash Mode QIO 80MHz
Flash Size 4MB (32Mb)
Partition Scheme Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level Info
PSRAM OPI PSRAM
Arduino Runs On Core 1
Events Run On Core 1
Erase All Flash Before Sketch Upload Disabled
JTAG Adapter Disabled

This is a demo of what we will achieve by the end of this tutorial.


Watch video online at https://eloquentarduino.com/video/face-detection-stream-gtRHD.mp4
Face detection quickstart
See source
Filename: Face_Detection.ino
/**
 * Face detection
 * ONLY WORKS ON ESP32S3
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */
#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/face/detection.h>

using eloq::camera;
using eloq::face_t;
using eloq::face::detection;

void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___FACE DETECTION___");

    // camera settings
    // !!!!REPLACE WITH YOUR OWN MODEL!!!!
    camera.pinout.freenove_s3(); // e.g. xiao(), lilygo_tcamera_s3(), ...
    camera.brownout.disable();
    // face detection only works at 240x240
    camera.resolution.face();
    camera.quality.high();

    // you can choose fast detection (50ms)
    detection.fast();
    // or accurate detection (80ms)
    detection.accurate();

    // you can set a custom confidence score
    // to consider a face valid
    // (in range 0 - 1, default is 0.5)
    detection.confidence(0.7);

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Awaiting for face...");
}

void loop() {
    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // run detection
    if (!detection.run().isOk()) {
        Serial.println(detection.exception.toString());
        return;
    }

    // if face is not found, abort
    if (detection.notFound())
        return;

    Serial.printf(
        "Face(s) detected in %dms!\n",
        detection.benchmark.millis()
    );

    // you can access the first detected face
    // at detection.first
    Serial.printf(
        " > face #1 located at (%d, %d)\n"
        "   proba is %.2f\n",
        detection.first.x,
        detection.first.y,
        detection.first.score
    );

    // if you expect multiple faces, you use forEach
    if (detection.count() > 1) {
        detection.forEach([](int i, face_t face) {
            Serial.printf(
                " > face #%d located at (%d, %d)\n",
                i + 1,
                face.x,
                face.y
            );

            // if you enabled accurate detection
            // you have access to the keypoints
            if (face.hasKeypoints()) {
                Serial.printf(
                    " > left eye at (%d, %d)\n"
                    " > right eye at (%d, %d)\n"
                    " > nose at (%d, %d)\n"
                    " > left mouth at (%d, %d)\n"
                    " > right mouth at (%d, %d)\n",
                    face.leftEye.x,
                    face.leftEye.y,
                    face.rightEye.x,
                    face.rightEye.y,
                    face.nose.x,
                    face.nose.y,
                    face.leftMouth.x,
                    face.leftMouth.y,
                    face.rightMouth.x,
                    face.rightMouth.y
                );
            }
        });
    }
}

This entire code is ~100 lines, most of which are comments and prints.
The most important parts are listed below and explained in more detail.

Frame resolution
At the current state, face detection only works on frames of 240x240!

camera.resolution.face();

Detection accuracy vs speed


// you can choose fast detection (50ms)
detection.fast();
// or accurate detection (80ms)
detection.accurate();

There are two methods of detection:


1. one pass: there's a single classifier that detects the faces. It is fast, but may get
some faces wrong or miss actual faces
2. two passes: the output of the one pass classifier is fed as input to a second
classifier that refines the results

Of course, the two passes process is more accurate but takes some more time. It is up to
you to decide which one best suits your project.

Run face detection


Running the face detection algorithm only takes a single line.

if (!detection.run().isOk()) {}

The detection in itself will never actually fail. You can get a failure under 2 circumstances:
1. you forgot to set resolution to 240x240
2. the JPEG frame cannot be converted to RGB888 format (memory issues
probably)

In both cases, you will see a clear error message printed in the Serial monitor.

Check face existence


Even if the above call succeeds, it doesn't mean that there is a face in the frame! You can
check whether at least one face has been detected by calling

if (detection.found()) {}
// or, on the contrary
if (detection.notFound()) {}

Face coordinates and keypoints


Once a face (or more) is detected, you can get access to its coordinates ( x , y , width ,
height ). If you turned on accurate detection, you also have access to the coordinates of

leftEye , rightEye , nose , leftMouth , rightMouth .

If you only expect a single face to be detected, you can access it at detection.first .

Serial.printf(
    " > face #1 located at (%d, %d)\n"
    "   proba is %.2f\n",
    detection.first.x,
    detection.first.y,
    // score is the probability of detection
    detection.first.score
);

In case multiple faces are detected, you can iterate over them with the forEach function.

detection.forEach([](int i, face_t face) {
    Serial.printf(
        " > face #%d located at (%d, %d)\n",
        i + 1,
        face.x,
        face.y
    );
});
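As an example of what the keypoint coordinates enable, the distance between the eyes is a handy proxy for how close a face is to the camera. Here's a small illustrative helper; the Point struct below is a stand-in I define for this sketch, not the library's face_t:

```cpp
#include <cmath>

// stand-in for a keypoint coordinate pair (e.g. leftEye, rightEye)
struct Point {
    int x, y;
};

// Euclidean distance between two keypoints. A larger inter-eye
// distance means the face occupies more of the 240x240 frame,
// i.e. it is closer to the camera.
float keypointDistance(Point a, Point b) {
    float dx = a.x - b.x;
    float dy = a.y - b.y;
    return std::sqrt(dx * dx + dy * dy);
}
```

You could, for instance, only trigger an action when the inter-eye distance exceeds a threshold, so faraway faces in the background are ignored.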
Event-Driven face detection
In the Quickstart sketch, we saw how easy and linear it is to run face detection; it only
requires a few lines of code. Nevertheless, the loop() function is pretty lengthy now
because it has to continuously check if one or more faces are present in the frame.
In this section, I'm going to show you how to move all the logic into the setup() function
instead. We will exploit a style of programming called event-driven (or reactive). Event-driven
programming consists of registering a listener function that will run when an event
of interest happens. In our case, the event of interest is one or more faces being
detected.
Why is this useful?
As I said, it allows for leaner loop() code, where you can focus on running
other tasks that need to occur in parallel to face detection. Often, event listeners also
help to isolate specific functionalities (face detection handling) into their own routines,
visually de-cluttering other tasks' code.
Here's the updated sketch.

This source code is only available to paying users.

The configuration part is exactly the same as before. The new entry is the daemon object,
which does 2 things:
1. accepts event listeners to be run on events of interest
2. runs the face detection code in the background

To register the callback to run when a single face is detected, we use

detection.daemon.onFace([](face_t face) {});

To register the callback to run when multiple faces are detected, we use
detection.daemon.onMultipleFaces([](int face_index, face_t face) {});

As easy as that!
Now you could even leave the loop() empty and you would still see the Face detected
message printed on the Serial Monitor when a face enters the camera viewport.
Face detection to MQTT
In your specific project, detecting a face may only be the first step in a larger system.
Maybe you want to log how many faces were detected in a database, or get a notification
on your phone, or an email... whatever. There are many ways to accomplish this goal.
One of the most popular in the maker community is using the MQTT protocol as a means
of communication between systems.
The EloquentEsp32Cam library has first-party integration for MQTT.
In the following sketch, you will have to replace the test.mosquitto.org broker with your
own and (if required) add proper authentication. Besides that, the sketch will work out of
the box.

Software requirements
EloquentEsp32Cam >= 2.2

PubSubClient >= 2.8

This source code is only available to paying users.

What is the payload that will be published? It is a JSON description of all the faces found
in the frame.

[{"x": 0, "y": 0, "w": 100, "h": 100, "score": 0.8}, ...]
Face detection streaming
So far, we were only able to debug face detection in the Serial Monitor. It would be much
better if we could debug it visually, watching the realtime streaming video from the
camera at the same time.
Face detection streaming is implemented in the default CameraWebServer example from
the Arduino IDE, but it also has a lot more options that may distract you. If you prefer a
cleaner interface, you can run the sketch below.

This source code is only available to paying users.

To view the camera stream in a web browser, we need to connect to a WiFi network.
Replace WIFI_SSID , WIFI_PASS and HOSTNAME with your own values and be sure your
PC/smartphone is connected to the same network!
Open the Serial Monitor and take note of the IP address of your board. If your router
supports mDNS (most do), you can open the stream at http://esp32cam.local, otherwise
you will need to use the IP address.
This is a short demo of what the result will look like.
Watch video online at https://eloquentarduino.com/video/face-detection-stream-gtRHD.mp4
As opposed to a few other tutorials that you can find online, the landmark detection
happens on the ESP32 itself, not in the browser!
ESP32 Cam Face Recognition

If you search on Google "esp32 cam face recognition arduino", you won't find a single
result that shows you how to create a standalone sketch that does it (out of the ~191,000
results reported by Google).

Not. A. Single. Result.

The few relevant ones that appear (even the one ranked #1) only tell you to "Load the
default WebServer example and it does face recognition".
But what if you don't want the default WebServer? How do you create your very own
project that does face recognition without the web GUI and all the other stuff?
I've spent hours bringing this feature into my EloquentEsp32Cam Arduino library and now
I'll show you how easy it can be.

Requirements
This project only works with ESP32S3 boards. If you have the cheap AiThinker ESP32
cam, it won't work: non-S3 chips cannot handle the heavy computations required to
perform face recognition.
You will also need to install EloquentEsp32Cam >= 2.3.8 from the Arduino Library
Manager.

Arduino IDE Tools configuration for ESP32S3


Board ESP32S3 Dev Module
Upload Speed 921600
USB Mode Hardware CDC and JTAG
USB CDC On Boot Disabled
USB Firmware MSC On Boot Disabled
USB DFU On Boot Disabled
Upload Mode UART0 / Hardware CDC
CPU Frequency 240MHz (WiFi)
Flash Mode QIO 80MHz
Flash Size 4MB (32Mb)
Partition Scheme Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level Info
PSRAM OPI PSRAM
Arduino Runs On Core 1
Events Run On Core 1
Erase All Flash Before Sketch Upload Disabled
JTAG Adapter Disabled
Face recognition quickstart
The following sketch is a standalone project that allows you to
1. enroll new faces for later recognition
2. recognize the current detected face

It has ~150 lines of code, but there are many blank lines and comments, so don't be
scared. I'll first detail the most important parts of the code, then show the complete
sketch at the end of the post.

Includes
To perform face recognition (tell which person a given face belongs to), you first need to
do face detection (tell if and where in the image there's a face). So you need to import
the files for these 2 tasks.

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/face/detection.h>
#include <eloquent_esp32cam/face/recognition.h>

// create short names for namespaced global variables
using eloq::camera;
using eloq::face::detection;
using eloq::face::recognition;

Setup
Now you have to configure the camera, the face detector and the face recognizer.
If you read other posts from this blog, the following code will look familiar. If this is your
first time here, I suggest you read the ESP32 Camera Quickstart tutorial to get familiar
with the style of the EloquentEsp32Cam library.

// !!!!REPLACE WITH YOUR OWN MODEL!!!!
camera.pinout.freenove_s3(); // e.g. xiao(), lilygo_tcamera_s3(), ...
camera.brownout.disable();
// face recognition only works at 240x240
camera.resolution.face();
camera.quality.high();

// face recognition only works with accurate detection
detection.accurate();
detection.confidence(0.7);

// face recognition confidence
recognition.confidence(0.85);

// init camera
while (!camera.begin().isOk())
    Serial.println(camera.exception.toString());

// init recognizer
while (!recognition.begin().isOk())
    Serial.println(recognition.exception.toString());

Enroll new face


To enroll a new face you need one line of code.

// to enroll a face, there must be a face in the frame!
if (!recognition.detect().isOk())
    return;

if (recognition.enroll(name).isOk())
    Serial.println("Success!");
else
    Serial.println(recognition.exception.toString());

Recognize face
Guess how many lines of code you need to recognize a face...

// to recognize a face, there must be a face in the frame!
if (!recognition.detect().isOk())
    return;

if (!recognition.recognize().isOk()) {
    Serial.println(recognition.exception.toString());
    return;
}

// you have access to the recognition result
// via recognition.match
Serial.print("Recognized face as ");
Serial.print(recognition.match.name.c_str());
Serial.print(" with confidence ");
Serial.print(recognition.match.similarity);
Serial.print(" (");
Serial.print(recognition.benchmark.millis());
Serial.println("ms)");

Perform action on face recognition


Let's say you're making a smart door lock project and you want to unlock the door when
you are recognized. How easy is it?

void loop() {
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    if (!recognition.detect().isOk())
        return;

    if (!recognition.recognize().isOk()) {
        Serial.println(recognition.exception.toString());
        return;
    }

    if (recognition.match.name != "simone") {
        Serial.println("You are not authorized");
        return;
    }

    // open lock
    digitalWrite(RELAY_PIN, HIGH);
}

Complete code
Filename: Face_Recognition.ino
/**
 * ESP32S3 Face Recognition
 * (not detection!)
 *
 * Enroll a couple of faces 2-3 times each,
 * then watch the ESP32 camera recognize between the two!
 *
 * Only works on ESP32 S3 chip.
 */
#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/face/detection.h>
#include <eloquent_esp32cam/face/recognition.h>

using eloq::camera;
using eloq::face::detection;
using eloq::face::recognition;

String prompt(String message);


void setup() {
    delay(4000);
    Serial.begin(115200);
    Serial.println("Begin");

    // !!!!REPLACE WITH YOUR OWN MODEL!!!!
    camera.pinout.freenove_s3(); // e.g. xiao(), lilygo_tcamera_s3(), ...
    camera.brownout.disable();
    // face recognition only works at 240x240
    camera.resolution.face();
    camera.quality.high();

    // face recognition only works with accurate detection
    detection.accurate();
    detection.confidence(0.7);

    // face recognition confidence
    recognition.confidence(0.85);

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // init recognizer
    while (!recognition.begin().isOk())
        Serial.println(recognition.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Face recognizer OK");

    // delete stored data, if user confirms
    if (prompt("Do you want to delete all existing faces? [yes|no]").startsWith("y")) {
        Serial.println("Deleting all existing faces...");
        recognition.deleteAll();
    }

    // dump stored faces, if user confirms
    if (prompt("Do you want to dump existing faces? [yes|no]").startsWith("y")) {
        recognition.dump();
    }

    Serial.println("Awaiting face...");
}


void loop() {
    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // run face detection (not recognition!)
    if (!recognition.detect().isOk())
        return;

    // if a face is found, ask user to enroll or recognize
    String answer = prompt("Do you want to enroll or recognize? [e|r]");

    if (answer.startsWith("e"))
        enroll();
    else if (answer.startsWith("r"))
        recognize();

    Serial.println("Awaiting face...");
}


/**
 * Ask user for input
 */
String prompt(String message) {
    String answer;

    do {
        Serial.print(message);

        while (!Serial.available())
            delay(1);

        answer = Serial.readStringUntil('\n');
        answer.trim();
    } while (!answer.length()); // repeat until the user enters a non-empty answer

    Serial.print(" ");
    Serial.println(answer);
    return answer;
}


/**
 * Enroll new person
 */
void enroll() {
    String name = prompt("Enter person name:");

    if (recognition.enroll(name).isOk())
        Serial.println("Success!");
    else
        Serial.println(recognition.exception.toString());
}


/**
 * Recognize current face
 */
void recognize() {
    if (!recognition.recognize().isOk()) {
        Serial.println(recognition.exception.toString());
        return;
    }

    Serial.print("Recognized face as ");
    Serial.print(recognition.match.name.c_str());
    Serial.print(" with confidence ");
    Serial.print(recognition.match.similarity);
    Serial.print(" (");
    Serial.print(recognition.benchmark.millis());
    Serial.println("ms)");
}
Want more content?
I put hours of work into the code and posts I write. As much as I love writing content that helps people upgrade their Arduino programming skills, I also need to pay my bills.
If you want to make sure I can continue creating awesome content like this, consider supporting me by buying my "Mastering the ESP32 Camera" ebook.
FOMO-driven car

In my article ESP32 cam object detection, I showed you how to train and deploy an Edge
Impulse object detection model to your ESP32 camera.
In this tutorial, I'll go a step further: we'll use the object detection results to drive an
autonomous car!
Let's start with the end result.
Watch video online at https://eloquentarduino.com/video/esp32-cam-autonomous-car-
demo.mp4

Pretty awesome, right?


Do you want to replicate it right now?
If you already have a toy car, it will take ~20 minutes to train the model and program the
board.

Software requirements
EloquentEsp32Cam >= 2.6

Arduino IDE Tools configuration for ESP32S3


Board: ESP32S3 Dev Module
Upload Speed: 921600
USB Mode: Hardware CDC and JTAG
USB CDC On Boot: Disabled
USB Firmware MSC On Boot: Disabled
USB DFU On Boot: Disabled
Upload Mode: UART0 / Hardware CDC
CPU Frequency: 240MHz (WiFi)
Flash Mode: QIO 80MHz
Flash Size: 4MB (32Mb)
Partition Scheme: Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level: Info
PSRAM: OPI PSRAM
Arduino Runs On: Core 1
Events Run On: Core 1
Erase All Flash Before Sketch Upload: Disabled
JTAG Adapter: Disabled
Get a car
This tutorial is not a step-by-step build guide. I won't teach you how to build an Arduino-controlled
car, because there are plenty of tutorials on the web dedicated to this topic. Here are some of
the ones I followed to build mine:
Car chassis from Botland or from Amazon
L298N motor driver on Amazon with tutorial
ESP32S3 camera

*no affiliate links (sadly)


For the building process, search on YouTube if you need help. You only have to connect
the motors and power supply to the L298N module, then the L298N module to the ESP32.
Nothing else.
Test the car
Before moving on, be sure your car is working.
Load the following sketch, set the correct motor pin numbers and test that the car
behaves like expected when you send commands over Serial.

If you swap the pins, the car will swap directions! Double check.

Filename: Car_Test.ino
/**
 * Test car wiring
 */
#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/car/two_wheels_car.h>

using eloq::camera;
using eloq::car::Motor;
using eloq::car::TwoWheelsCar;

// replace with your motor pins
Motor left(39, 40);
Motor right(42, 41);
TwoWheelsCar testCar(left, right);


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___CAR TEST___");

    // how many millis motors will run
    testCar.defaultDuration(200);
    testCar.stop();

    Serial.println("Enter one of f (forward), b (backward), l (left), r (right)");
}


void loop() {
    if (!Serial.available())
        return;

    String cmd = Serial.readStringUntil('\n');

    if (cmd.startsWith("f")) testCar.forward();
    else if (cmd.startsWith("b")) testCar.backward();
    else if (cmd.startsWith("l")) testCar.left();
    else if (cmd.startsWith("r")) testCar.right();
}
Train an object detection model
Please refer to the aforementioned ESP32 cam object detection for detailed instructions.
Program the car
Now that you have all the pieces in place, it's time to program the ESP32 camera to drive
the car based on the object detection results.
The following sketch assumes that:
1. the camera is placed vertically
2. left and right motor pins are set correctly

Filename: Autonomous_Car.ino
/**
 * Autonomous car driven by Edge Impulse FOMO
 * From: https://eloquentarduino.com/esp32-cam-autonomous-car
 *
 * Tested on ESP32S3 camera
 */
#include <your_edge_impulse_inferencing.h>
#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/car.h>
// needed for the global `fomo` object used in loop()
#include <eloquent_esp32cam/edgeimpulse/fomo.h>

using eloq::camera;
using eloq::car::Motor;
using eloq::car::Car;
using eloq::ei::fomo;

/**
 * Replace with your motor pins
 */
Motor left(39, 40);
Motor right(42, 41);
Car fomoCar(left, right);


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("___AUTONOMOUS CAR___");

    // replace with your board
    camera.pinout.freenove_s3();
    camera.brownout.disable();
    camera.resolution.yolo();
    camera.pixformat.rgb565();

    // how many millis motors will run
    // to follow given object
    fomoCar.defaultDuration(100);
    fomoCar.stop();

    // if you mounted the camera "backward"
    // (see video), you have to reverse the motors
    // left.reverse();
    // right.reverse();

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    Serial.println("Camera OK");
    Serial.println("Put object in front of camera");
}


void loop() {
    // capture picture
    if (!camera.capture().isOk()) {
        Serial.println(camera.exception.toString());
        return;
    }

    // run FOMO
    if (!fomo.run().isOk()) {
        Serial.println(fomo.exception.toString());
        return;
    }

    // let the car follow the object
    fomoCar.follow(fomo);
}

Wait...
Is it that simple? How???
Well, that's the power of the EloquentEsp32cam library.
JPEG encoding on the fly

Many makers start their journey with the ESP32 camera by flashing the CameraWebServer
example, a sketch that lets you access the camera feed in your browser.
Then, in the ESP32 cam Quickstart tutorial, you learned that you can strip away all the
controls GUI and just stream the video.
This tutorial is an advanced modification of that project: I will show you how to process the
raw pixels from the camera before streaming them over HTTP to your browser.

Hardware/Software requirements
To follow this tutorial you will need an ESP32 cam with lots of RAM (8 or 16 Mbit) and
EloquentEsp32cam library version >= 2.7.7.

Arduino IDE Tools configuration for ESP32S3


Board: ESP32S3 Dev Module
Upload Speed: 921600
USB Mode: Hardware CDC and JTAG
USB CDC On Boot: Disabled
USB Firmware MSC On Boot: Disabled
USB DFU On Boot: Disabled
Upload Mode: UART0 / Hardware CDC
CPU Frequency: 240MHz (WiFi)
Flash Mode: QIO 80MHz
Flash Size: 4MB (32Mb)
Partition Scheme: Huge APP (3MB No OTA/1MB SPIFFS)
Core Debug Level: Info
PSRAM: OPI PSRAM
Arduino Runs On: Core 1
Events Run On: Core 1
Erase All Flash Before Sketch Upload: Disabled
JTAG Adapter: Disabled

End result
I'll show you what we're going to implement. The video below shows the camera video
feed where the image is "negative". This is achieved entirely via pixel manipulation, not
by applying the built-in filter of the camera sensor!

Arduino sketch
This is the sketch that implements the end result.
Filename: Encode_Frame_on_the_Fly.ino
/**
 * Alter camera pixels before sending them via MJPEG stream
 * (requires enough RAM to run)
 * (expect 0.5 - 2 FPS)
 *
 * BE SURE TO SET "TOOLS > CORE DEBUG LEVEL = INFO"
 * to turn on debug messages
 */
#define WIFI_SSID "SSID"
#define WIFI_PASS "PASSWORD"
#define HOSTNAME "esp32cam"

#include <eloquent_esp32cam.h>
#include <eloquent_esp32cam/viz/mjpeg.h>

using namespace eloq;
using namespace eloq::viz;

uint16_t jpeg_length = 0;
size_t tick = 0;


// prototype of the function that will
// re-encode the frame on-the-fly
void reencode_frame(WiFiClient *client, camera_fb_t* frame);

// prototype of the function that will
// put JPEG-encoded data back into the frame
size_t buffer_jpeg(void *arg, size_t index, const void* data, size_t len);


void setup() {
    delay(3000);
    Serial.begin(115200);
    Serial.println("__RE-ENCODE MJPEG STREAM__");

    // camera settings
    // replace with your own model!
    camera.pinout.aithinker();
    camera.brownout.disable();
    // higher resolution cannot be handled
    camera.resolution.qvga();
    camera.quality.best();

    // since we want to access the raw pixels,
    // capture in RGB565 format.
    // keep in mind that you need a lot of RAM to store
    // all this data at high resolutions
    // (e.g. QVGA = 320 x 240 x 2 bytes = 150 kB)
    camera.pixformat.rgb565();

    // MJPEG settings
    mjpeg.onFrame(&reencode_frame);

    // init camera
    while (!camera.begin().isOk())
        Serial.println(camera.exception.toString());

    // connect to WiFi
    while (!wifi.connect().isOk())
        Serial.println(wifi.exception.toString());

    // start mjpeg http server
    while (!mjpeg.begin().isOk())
        Serial.println(mjpeg.exception.toString());

    // assert camera can capture frames
    while (!camera.capture().isOk())
        Serial.println(camera.exception.toString());

    Serial.println("Camera OK");
    Serial.println("WiFi OK");
    Serial.println("MjpegStream OK");
    Serial.println(mjpeg.address());
}


void loop() {
    // nothing to do here, MJPEG server runs in background
}


/**
 * Apply your custom processing to pixels
 * then encode to JPEG.
 * You will need to modify this
 */
void reencode_frame(WiFiClient *client, camera_fb_t* frame) {
    // log how much time elapsed from last frame
    const size_t now = millis();
    const uint16_t height = camera.resolution.getHeight();
    const uint16_t width = camera.resolution.getWidth();

    ESP_LOGI("benchmark", "%d ms elapsed from last frame", now - tick);
    tick = now;

    // frame->buf contains RGB565 data,
    // that is, 2 bytes per pixel
    //
    // in this test, we're going to do a "negative" effect
    // feel free to replace this with your own code
    for (uint16_t y = 0; y < height; y++) {
        uint16_t *row = (uint16_t*) (frame->buf + width * 2 * y);

        for (uint16_t x = 0; x < width; x++) {
            // read pixel and parse into R, G, B components
            const uint16_t pixel = row[x];
            uint16_t r = (pixel >> 11) & 0b11111;
            uint16_t g = (pixel >> 5) & 0b111111;
            uint16_t b = pixel & 0b11111;

            // actual work: make negative
            r = 31 - r;
            g = 63 - g;
            b = 31 - b;

            // re-pack to RGB565
            row[x] = (r << 11) | (g << 5) | b;
        }
    }

    // encode to jpeg
    uint8_t quality = 90;

    frame2jpg_cb(frame, quality, &buffer_jpeg, NULL);
    ESP_LOGI("var_dump", "JPEG size=%d", jpeg_length);
}


/**
 * Put JPEG-encoded data back into the original frame
 * (you don't have to modify this)
 */
size_t buffer_jpeg(void *arg, size_t index, const void* data, size_t len) {
    if (index == 0) {
        // first MCU block => reset jpeg length
        jpeg_length = 0;
    }

    if (len == 0) {
        // encoding is done
        camera.frame->len = jpeg_length;
        return 0;
    }

    jpeg_length += len;

    // override input data
    memcpy(camera.frame->buf + index, (uint8_t*) data, len);

    return len;
}

I'm going to explain each block in detail.

setup()
The setup function configures all the components of the sketch: the camera, the mjpeg
HTTP server and the wifi. Refer to the ESP32 cam Quickstart tutorial for more details.
loop()
This is empty, since all the streaming logic is handled in a background task.

reencode_frame()
This function gets called each time a new frame is ready to be sent, once a client has
connected to the stream. You can hook into this function to alter what will be sent to the
user and replace the original frame with your own.
In our demo, we decode the RGB565 pixels, negate each component and re-pack them
back into RGB565. The line

frame2jpg_cb(frame, quality, &buffer_jpeg, NULL);

encodes the RGB565 data into JPEG.


buffer_jpeg()
This function is called by the frame2jpg_cb encoding routine with chunks of JPEG
encoded data. We're simply copying the produced data back into the camera frame
buffer to override what will be sent to the user.

Speed
Speed is low. On my Freenove S3 camera it takes 300-400 ms to modify the pixels and
encode them to JPEG. Add the WiFi lag on top and 1-2 FPS is a realistic estimate.
If you can, stream the original JPEG data as-is and spare the CPU a lot of strain. Use this
code only if strictly necessary.
