Execution tracing on Cortex-M microcontrollers
The ARM CoreSight trace macrocells are no secret: they are publicly documented and found in pretty much every Cortex-M3 and Cortex-M4 based microcontroller on the market. However, they seem to be very rarely used. Is that for lack of purpose or for lack of tools?
If you do a Google image search for ARM ETM Trace, you get a few photos of specialized capture hardware and some screenshots of textual trace results. OK, so it seems that tracing is expensive and mainly saves you the trouble of stepping through code manually in a debugger.
However, it is possible to capture the trace output with a cheap (about 10 EUR) FX2-based logic analyzer. Many of these are sold either as FX2 breakout boards or as Saleae clones. Of course it would be immoral to use the Saleae software with these clones, but fortunately the free software package sigrok supports most of them.
I then realized what had to be done: add sigrok decoders for the ARM trace signals. Totally new possibilities now open up: the single-wire trace output takes up only one input, so the analyzer's 7 other lines are left free for observing other signals. It is possible to see, for example, the code execution in parallel with the stepper signals generated by the chip.
Read on for a guide and an example of using this to track down a real bug in Smoothieware.
Tracing on STM32 Discovery
There are two separate trace blocks in Cortex-M3: the ITM and the ETM, that is, the Instrumentation Trace Macrocell and the Embedded Trace Macrocell. Yeah, the names tell you nothing about the difference between the two. The ITM produces higher-level, lower-bandwidth, less intrusive traces such as watchpoints, interrupts and periodic program counter samples. The ETM, on the other hand, traces every single instruction executed by the CPU. In practice it is usually necessary to slow down the CPU to use the ETM, while the ITM can be used without any slowdown.
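To give a feel for what the ITM data looks like on the wire, here is a minimal sketch of classifying ITM source packets in Python. It is not the sigrok decoder itself; the function names `classify_itm_header` and `walk_source_packets` are made up for this illustration. It assumes the ARMv7-M packet layout, where the two lowest bits of the header byte give the payload size, bit 2 tells software (stimulus port) packets from hardware packets, and the top five bits give the port or source number. Sync, overflow and timestamp packets are only recognized, not parsed.

```python
from typing import Iterator, Optional, Tuple

# Size code in header bits [1:0] -> payload length in bytes.
PAYLOAD_SIZES = {0b01: 1, 0b10: 2, 0b11: 4}


def classify_itm_header(header: int) -> Tuple[str, Optional[int], Optional[int]]:
    """Return (kind, source, payload_bytes) for one ITM header byte.

    kind is "software" for stimulus port writes, "hardware" for packets
    such as PC samples, and "protocol" for everything else (sync,
    overflow, timestamps), which this sketch does not parse further.
    """
    size_code = header & 0b11
    if size_code == 0:
        return ("protocol", None, None)
    kind = "hardware" if header & 0b100 else "software"
    return (kind, header >> 3, PAYLOAD_SIZES[size_code])


def walk_source_packets(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (kind, source, payload) for consecutive source packets.

    Simplification for this sketch: stops at the first protocol packet
    or at a truncated packet instead of trying to resynchronize.
    """
    pos = 0
    while pos < len(data):
        kind, source, size = classify_itm_header(data[pos])
        if kind == "protocol" or pos + 1 + size > len(data):
            return
        # Payload bytes are sent least significant byte first.
        payload = int.from_bytes(data[pos + 1:pos + 1 + size], "little")
        yield (kind, source, payload)
        pos += 1 + size


if __name__ == "__main__":
    # Two 1-byte writes to stimulus port 0, then a 4-byte hardware packet.
    stream = bytes([0x01, 0x48, 0x01, 0x69, 0x17, 0x00, 0x01, 0x00, 0x08])
    for packet in walk_source_packets(stream):
        print(packet)
```

In the example stream, the header byte 0x17 is a 4-byte hardware packet from source 2, which is how a periodic program counter sample shows up. A real decoder also has to handle sync and timestamp packets and recover from lost bytes, which this sketch leaves out.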
Debugging a Smoothieware bug
I actually had a specific bug in mind when I started looking into the ARM Cortex-M tracing features. I use Smoothieware to drive my CNC3020, for which I'm making a 3D printer option. When I started increasing the printing speed, a few problems appeared.