
ESP8266 walkie talkie

Assembled device

I've been looking for a good two-way radio for bicycle trips, to be able to talk easily without needing to ride too close together. After trying out some smartphone apps and a Leixen VV-108 mobile radio, I decided to build my own based on the ESP8266. However, it didn't go quite as well as I hoped.


The main annoyances of the solutions I've tried so far are:

  • Too difficult to connect
  • Short battery life
  • Long latency
  • Lack of full duplex communication

Smartphone apps like FLUX Intercom, which use a local radio link such as Bluetooth or WiFi, have good audio quality and low latency. However, they seem to drain the phone battery in just a few hours, and if the connection drops, you have to go through a lot of menus to reconnect the WiFi and then reconnect the application.

Applications that go through the internet, like Google Hangouts or Skype, seem to hold the connection better but suffer from a lot of latency. Because some of the sound is heard directly and some through the radio, latency should be at most ~100 ms or it starts to sound annoying.

I also bought a pair of Leixen VV-108 PMR radios for some 20 USD. They had good audio quality and very low latency, but were completely unsuitable for me because they lack voice-activated transmit. I knew beforehand that it is not a true full-duplex radio, i.e. it cannot transmit and receive at the same time. However, the configuration software has a "VOX" checkbox, which led me to believe that I could set it up to transmit automatically whenever one starts speaking. It turns out that this doesn't actually work on this model. Leixen was nice enough to reply to my inquiry, though: "The software is copied from our other model radio. But we engineer is too lazy to change the software to close the VOX function."

There are quite a few systems like this sold as "motorcycle intercoms". However, they usually have separate speakers and microphones that you are supposed to glue permanently to your motorcycle helmet, and they do not appear to be well suited to bicycle helmets. They are also quite expensive, usually more than $30 each.

Ideally, I'd want something that has just a power switch and a volume control, and that automatically connects and reconnects whenever it is in range. It would also be nice for it to be self-contained like a typical Bluetooth handsfree, so I could just put it on my ear or clip it to the strap of my bike helmet and go.

Custom design

The Leixen VV-108 had a small enclosure with a clip. I thought: why don't I just build my own radio in there, with the ever-so-popular ESP8266?

A sensible approach would be to use an external ADC/DAC chip (like the WM8960) that communicates with the ESP8266 over I2S. One could put it on a separate power supply to reduce noise, and it would have integrated microphone and speaker amplifiers. However, that seemed like overkill for the kind of audio quality I was after. Surely the ESP8266 internal ADC could be pushed to sample audio at, say, 12 kHz? And PDM could be used to output audio from a digital IO pin.

Some initial experimentation showed promise in this approach. I found the reverse-engineered ADC driver by pvvx. A bit of hacking later, I had it sampling audio at 12 kHz. When I fed it a sine wave from a signal generator, the results seemed good enough to move forward.

I designed a simple circuit that uses an LMV324 op-amp as both the microphone preamp and the speaker amplifier. With an output current of ~50 mA, it has plenty of power to drive a small 30 ohm headphone speaker. The rest of the circuit is pretty standard, with a voltage regulator and a li-ion charger. I also used one of the volume buttons as a power button by connecting it to the regulator enable pin.
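
As a rough check of the "plenty of power" claim, the figures above can be plugged into Ohm's law. This is only a back-of-the-envelope sketch: it treats the speaker as a pure 30 ohm resistor and takes the ~50 mA figure as the peak current, both of which are simplifying assumptions, and `speaker_drive` is a made-up helper name.

```python
def speaker_drive(current_a, load_ohm):
    """Return (peak voltage in V, peak power in W) for a purely resistive load."""
    voltage_v = current_a * load_ohm       # Ohm's law: V = I * R
    power_w = current_a ** 2 * load_ohm    # P = I^2 * R
    return voltage_v, power_w


# Figures from the text: ~50 mA into a 30 ohm speaker.
volts, watts = speaker_drive(0.050, 30.0)
print(f"{volts:.2f} V peak, {watts * 1000:.0f} mW peak")
```

That works out to about 1.5 V and 75 mW at the peak, which is a lot for a speaker sitting right next to the ear.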

The unnamed parts in the schematic correspond to later hacks that I did to reduce noise problems.


PCB

Making everything fit

The ESP-12E radio module seemed quite small, until I tried to make it fit in the premade enclosure. In the end I had to install it quite close to the speaker and the battery, which probably degrades antenna performance. Then again, I don't expect these modules to have a particularly well-tuned antenna to begin with.

Most of the parts are on the bottom side, with only the volume buttons and the programming connector on the upper side. The original VV-108 board was only 0.8 mm thick and had its parts on the top side. Mine ended up a bit thicker, so I had to enlarge the opening for the micro-USB port a bit.


I started with esp-open-rtos because I had heard good things about its realtime performance. And indeed, it is good! It was simple to get an interrupt sampling the ADC at 12 kHz and storing the data in memory buffers, while a thread sent the filled buffers to the network over UDP.
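
The interrupt-plus-thread split can be illustrated with a small double-buffer model. This is a desktop Python sketch of the idea, not the firmware itself: the `DoubleBuffer` class, the 120-sample packet size, and the helper names are all assumptions made for illustration.

```python
SAMPLE_RATE_HZ = 12_000       # sampling rate mentioned in the text
SAMPLES_PER_PACKET = 120      # assumption: one UDP packet per 120 samples


class DoubleBuffer:
    """Two fixed-size buffers: one being filled, the other ready to send."""

    def __init__(self, size):
        self.size = size
        self.buffers = [[0] * size, [0] * size]
        self.active = 0       # index of the buffer the "interrupt" writes to
        self.pos = 0          # next write position in the active buffer

    def push(self, sample):
        """Store one sample (the interrupt's job).

        Returns the buffer that just filled up, or None if still filling.
        """
        buf = self.buffers[self.active]
        buf[self.pos] = sample
        self.pos += 1
        if self.pos < self.size:
            return None
        # Buffer is full: swap so sampling continues while this one is sent.
        self.active ^= 1
        self.pos = 0
        return buf


def packet_latency_ms(samples, rate_hz):
    """Audio duration held in one packet, i.e. the minimum buffering delay."""
    return 1000.0 * samples / rate_hz
```

With these assumed numbers each packet carries 10 ms of audio, so buffering alone uses up a tenth of the ~100 ms latency budget mentioned earlier. The sender has to finish with a buffer before the other one fills, otherwise samples get overwritten.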

I initially planned to use broadcast UDP for transferring the audio, but it turns out that broadcast packets on WiFi are quite slow and cumbersome. In addition, the access point I was testing with rate-limits broadcast packets to about 10 per second. Thus I switched to a slightly more complex solution, where broadcast packets are only used to discover peers on the local network and the actual audio is transferred using unicast packets.
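
A minimal sketch of this discover-by-broadcast, talk-by-unicast scheme is shown below, written against the standard Python `socket` module rather than the ESP8266 network stack. The packet layout, the `WTLK` tag, the port numbers, and the 5 second timeout are all invented for the example and are not taken from the actual firmware.

```python
import socket
import struct

MAGIC = b"WTLK"           # hypothetical 4-byte tag marking a discovery packet
DISCOVERY_PORT = 5005     # hypothetical port for broadcast "hello" packets
AUDIO_PORT = 5006         # hypothetical port for unicast audio packets
PEER_TIMEOUT_S = 5.0      # forget peers that have been silent this long


def make_hello(device_id):
    """Build a discovery packet: tag followed by a 32-bit device id."""
    return MAGIC + struct.pack("!I", device_id)


def parse_hello(data):
    """Return the device id from a discovery packet, or None if invalid."""
    if len(data) != 8 or data[:4] != MAGIC:
        return None
    return struct.unpack("!I", data[4:])[0]


class PeerTable:
    """Tracks which peers have recently announced themselves."""

    def __init__(self, own_id):
        self.own_id = own_id
        self.peers = {}   # device_id -> (ip address, time last seen)

    def on_hello(self, data, sender_ip, now):
        device_id = parse_hello(data)
        if device_id is None or device_id == self.own_id:
            return        # ignore junk and our own broadcasts
        self.peers[device_id] = (sender_ip, now)

    def active_ips(self, now):
        return sorted(ip for ip, seen in self.peers.values()
                      if now - seen <= PEER_TIMEOUT_S)


def open_discovery_socket():
    """UDP socket that may send and receive broadcast hello packets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("", DISCOVERY_PORT))
    return sock


def send_audio(sock, table, payload, now):
    """Send one audio packet to every known peer using unicast."""
    for ip in table.active_ips(now):
        sock.sendto(payload, (ip, AUDIO_PORT))
```

Each device would broadcast `make_hello(...)` once a second or so, which stays well below the access point's limit of roughly 10 broadcast packets per second, while the much more frequent audio packets go directly to the addresses in the peer table.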

Because the microphone and speaker are so close in the enclosure, some of the playback sound will travel back to the recording side. To reduce the echo, I used a 12-tap LMS filter, which is an adaptive linear filter. Basically, the playback data is filtered to try to make it correspond to recorded data as closely as possible, and then it is subtracted from the recorded data. Whatever difference remains is due to external voice, which is what we want to record.
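
The idea can be sketched in a few lines of Python. This is a textbook floating-point LMS filter with a fixed step size, not the code running on the device: the step size `mu`, the sample scaling to the range -1..1, and the simulated echo path (half amplitude, two samples of delay) are assumptions chosen for the demonstration.

```python
import random

TAPS = 12   # filter length mentioned in the text


class LmsEchoCanceller:
    """Adaptive FIR filter that predicts the echo from the playback signal."""

    def __init__(self, taps=TAPS, mu=0.05):
        self.weights = [0.0] * taps   # current estimate of the echo path
        self.history = [0.0] * taps   # recent playback samples, newest first
        self.mu = mu                  # step size (assumed value)

    def process(self, playback, recorded):
        """Return the recorded sample with the estimated echo removed."""
        self.history.insert(0, playback)
        self.history.pop()
        estimate = sum(w * x for w, x in zip(self.weights, self.history))
        error = recorded - estimate
        # LMS update: nudge each weight in the direction that shrinks the error.
        for i, x in enumerate(self.history):
            self.weights[i] += self.mu * error * x
        return error


def simulate_echo(n_samples=2000, echo_gain=0.5, echo_delay=2, seed=0):
    """Feed white noise through a made-up echo path and cancel it."""
    rng = random.Random(seed)
    canceller = LmsEchoCanceller()
    delay_line = [0.0] * echo_delay
    residuals = []
    for _ in range(n_samples):
        playback = rng.uniform(-1.0, 1.0)
        delay_line.append(playback)
        echo = echo_gain * delay_line.pop(0)   # microphone hears only the echo
        residuals.append(canceller.process(playback, echo))
    return canceller, residuals
```

In this idealised simulation the third weight settles near the echo gain and the residual shrinks towards zero. A real enclosure is much less forgiving: the echo path is not a clean delay, the signals are noisy, and any speech at the microphone disturbs the adaptation.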

The recording side uses interrupts to sample the ESP8266 ADC at a 12.5 kHz sample rate. The playback side uses the I2S peripheral to output a digital bitstream at 1.6 MHz. A simple sigma-delta modulation algorithm is used to generate the bitstream from the audio samples received from the network.
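
With these rates, every audio sample turns into 1.6 MHz / 12.5 kHz = 128 output bits. The sketch below shows one common way to generate them, a first-order sigma-delta modulator built from an overflowing accumulator. It is an illustration only: the unsigned 16-bit sample format, the MSB-first packing into 32-bit words, and all names are assumptions, and the actual firmware may use a different variant.

```python
SAMPLE_RATE_HZ = 12_500
BITSTREAM_RATE_HZ = 1_600_000
OVERSAMPLING = BITSTREAM_RATE_HZ // SAMPLE_RATE_HZ   # 128 bits per sample
FULL_SCALE = 1 << 16                                 # assumed 16-bit samples


class SigmaDelta:
    """First-order sigma-delta modulator using an overflowing accumulator."""

    def __init__(self):
        self.accumulator = 0   # carries the quantisation error between bits

    def modulate(self, sample):
        """Turn one unsigned sample (0..65535) into OVERSAMPLING bits."""
        bits = []
        for _ in range(OVERSAMPLING):
            self.accumulator += sample
            if self.accumulator >= FULL_SCALE:
                self.accumulator -= FULL_SCALE   # overflow: emit a 1
                bits.append(1)
            else:
                bits.append(0)
        return bits


def pack_words(bits):
    """Pack bits MSB-first into 32-bit words (assumed I2S word format)."""
    words = []
    for i in range(0, len(bits), 32):
        word = 0
        for bit in bits[i:i + 32]:
            word = (word << 1) | bit
        words.append(word)
    return words
```

The fraction of 1 bits follows the sample value, so a mid-scale sample produces an alternating 0101... pattern, and an RC low-pass filter on the pin averages the stream back into an analog voltage.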


Already in my first tests, it was evident that the ESP8266 was causing some noise in the audio. It didn't sound too bad, however, when fed from a signal generator. A WiFi chip like the ESP8266 typically draws current in very sharp spikes, such as 1 ampere for just 1 millisecond. This causes large variations in the supply voltage and also electromagnetic noise that gets induced into other parts of the device, causing a 'click click click' sound in the audio.
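
To get a feeling for how hard such a spike is to smooth out, the capacitor equation I = C * dV/dt can be rearranged. The calculation below is a rough sketch that takes the example figures above at face value and pretends the capacitor has to supply the whole spike on its own, ignoring the battery, the regulator, and capacitor ESR. The 0.1 V allowed droop and the 100 uF example value are assumptions.

```python
def droop_volts(current_a, duration_s, capacitance_f):
    """Voltage drop on a capacitor supplying a constant current: dV = I*t/C."""
    return current_a * duration_s / capacitance_f


def required_capacitance_farads(current_a, duration_s, max_droop_v):
    """Capacitance needed to keep the drop below max_droop_v: C = I*t/dV."""
    return current_a * duration_s / max_droop_v


# Example spike from the text: 1 A for 1 ms.
needed = required_capacitance_farads(1.0, 1e-3, 0.1)   # assumed 0.1 V budget
print(f"needed: {needed * 1e6:.0f} uF")
print(f"droop with 100 uF: {droop_volts(1.0, 1e-3, 100e-6):.1f} V")
```

Under these assumptions it would take on the order of 10 000 uF to hold the rail within 0.1 V, which is why a few extra capacitors can only soften the spikes and not remove them.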

Whenever the ESP8266 transmitted a packet, there was a very large spike in the signal:

With the finished PCB using an electret microphone, the noise was quite bad. I did what I could to combat it by changing the circuit and soldering on extra capacitors close to the radio module. I also added some software filtering, after which the audio didn't sound too bad in either direction when tested with a PC.

However, when I put two devices together, configured one as an access point and the other as a client, and tried the connection, it was completely unusable. Below is an audio clip of what it sounds like:

The sudden degradation of quality seems to be a sum of multiple factors:

  • Access point mode causes the ESP8266 to transmit more packets than client mode, thus more noise.
  • My echo cancellation is not perfect, so when one end receives a 'click' and plays it back, some of it is heard by the microphone and transmitted back to the other end, ad infinitum.
  • The JFET amplifier inside a typical electret microphone seems quite sensitive to radio noise, whereas a signal generator is not.


For now, I've had enough of trying to make this circuit work. However, the files are available in case someone finds a use for them:

– Petteri Aimonen on 16.7.2018
