Category Archives: Software

Programming interface progress: SWD JTAG command sequences

by Stijn Kuipers

Since the last post I have spent quite some time with my new tool for dumping sequences from my commercial debugger. After checking these dumps against various codebases I found online (libswd, CMSIS-DAP, mchck) and against the manuals on the subject, I managed to extract the command sets for a set of essential functions:

1) Peek/Poke (universal to all ARM cores)
2) Halt/Continue (universal to all ARM cores)
3) Reset (Freescale specific)
4) Mass erase (Freescale specific)

With these command sets figured out, I was able to recreate them on a standard Arduino and trigger them with a simple serial remote control application. I also managed to figure out how, when and why the various types of registers are available to read/write over the debug lines. In the case of the Freescale chips I’m targeting, some features are only available after setting certain bits somewhere else. Others only work when the system reset pin is activated. Joy.

Example: reading the identification register to see if the chip has powered up correctly. The magic value here is the 0x04770031 at the end.

bits

Figuring all this out has brought me much further along the path. I am now almost ready to start executing flash commands on the target device. Once the system can flash the MKL26 and MKL02 chips, I am planning to move the codebase from the Arduino it lives on now to a set of dedicated small, cheap boards. One of these will be another USB-stick type PCB (it plugs straight into a USB port) with a simple row of programming pins meant to program microgameboys, possibly with a pogo-pin holder.

Still to be continued…

Reverse engineering SWD JTAG debugging/flashing protocol for Freescale chips

by Stijn Kuipers

One of the great breakthroughs of the Arduino has been that to get started, you only need the device itself and the software. The try/fail/try again cycle of development got reduced to altering your code and pressing run (again).

Most other platforms completely fail in this respect. Many chip-manufacturers have excellent and cheap try-out boards that are even pin compatible with the Arduino. However, while Arduino gets many things right – hardware choices and software choices are not two of them. The Atmel AVR series is easy but outdated. The Arduino “IDE” barely beats notepad.exe in functionality and project management.

Chip manufacturers almost invariably fail to recognize that having to struggle through ANYTHING between the compile button and the thing actually running your code is losing them enthusiasm, mindshare and ultimately customers. Because the manufacturers have no idea how to approach this part of the developer experience themselves, and apparently no inclination to get involved, they attempt to extend the Eclipse IDE (with endless pages of settings tabs) or try to become hardware compatible with Arduino.

So when I was recently asked if I would like to host a workshop programming games for the latest incarnation of the microgameboy, I had a problem! There is no real foolproof way to even talk to these chips! The official toolchain only supports a handful of programmer devices you have to attach separately using a tiny halfpitch connector that is not easy to find. This will not do for novices.

So part of my “this really should NOT be Rocket Science” quest has become this:

Make developing for a platform like the microgameboy EASY.

Stage one: creating a cheap tool to connect the microgameboy to your pc to update the program memory.

The chip I am using in the microgameboy is made by Freescale (the MKL02Z32 to be precise). This chip belongs to the ARM Cortex M0 family of devices. ARM is a big company that creates standard designs for chips. If you keep to the standard, software written for that standard will run on your device. Luckily, the Cortex standard includes a chapter on the debugging interface. The debugging interface allows anyone to inspect the inner state of the chip and poke around in its system memory. With some clever mangling (and the manual from Freescale – since this bit is outside the Cortex standard) you can trick the chip into updating/rewriting its own flash memory. This is what I’m going to do.

The first step of such an undertaking is (as usual): Homework.

Serial Wire Debug manual by ARM

“DAP Lite” manual by ARM

KL26 manual by Freescale
There are endless stacks of documents to be found that all refer to the debug interface in some way or other, which always leads me to:

The second step of such an undertaking is (as usual): screw this shit..

I am not going to sit here reading 5000 pages of dense text. I’ll have a look at the actual data instead.

Last year I ordered a whole bunch of tools to deal with inspection of electrical signals. Amongst this set is an Open Workbench Logic Analyzer by GadgetFactory. It allows you to record a whole bunch of signals at the same time, at very high speeds. Using this device, I got this:

2015-02-02-0219 - Logic capture of the init sequence made by jlink

Saving this to a file and parsing the file for clock/data state transitions got me this huge array of bits:

data dump from jlink capture:

As you can see, the file contains some obvious patterns, creating diagonal lines in the data dump like 60’s wallpaper. By pressing enter at the diagonals, I separated the data dump into a more logical grouping of bits.
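For reference, the clock/data parsing step that produced this dump can be sketched roughly as below. This is a minimal sketch, not the parser I actually used: the `Sample` struct and `extract_bits` are made-up names, the capture is assumed to be already loaded into memory as one clock/data pair per sample, and the data line is assumed to be valid at every rising clock edge.

```c
#include <stddef.h>
#include <stdint.h>

/* One logic-analyzer sample: the state of both SWD lines at one instant.
 * (Hypothetical in-memory layout, not the analyzer's file format.) */
typedef struct {
    uint8_t swclk;  /* 0 or 1 */
    uint8_t swdio;  /* 0 or 1 */
} Sample;

/*
 * Walk the capture and emit one character ('0' or '1') for every rising
 * clock edge. Returns the number of bits written (excluding the NUL).
 */
size_t extract_bits(const Sample *samples, size_t count,
                    char *out, size_t out_size)
{
    size_t n = 0;
    if (out_size == 0) {
        return 0;
    }
    for (size_t i = 1; i < count && n + 1 < out_size; i++) {
        int rising = !samples[i - 1].swclk && samples[i].swclk;
        if (rising) {
            out[n++] = samples[i].swdio ? '1' : '0';
        }
    }
    out[n] = '\0';
    return n;
}
```

The output is one long string of ones and zeroes, which is exactly the kind of thing you can paste into a text editor and start hitting enter in.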

Spacing it better

111111111111111111111111111111111
11111111111111111111111011110011110011111111111
11111111111111111111111111111111111111111111111
10110110110110111111111111111111111111111111111
11111111111111111111111111000000000000000010100
10110011101110001010001000001111010000000001010
01011001110111000101000100000111101000001010000
00110011011110000000000000000000000000000000000
01001010110011000000000000000000000000000010100
00000001011000110000000010000000000000000000001
11111010001101100110000111100000000000000000000
00000000000011111001100000000000000000000000000
00000000010111011011001000110000000000111011100
01000000101011110110011000000000001000000000000
00111111010001101100110000000000000000000000000
00000000000000011000101100110100100000000000000
00000110001001000000011010001100110000000000000
10000000000000011111000000011111001100000000000
00000000000000000000000010111101011001100000000
00011111111111111111111101101000110011000011111
11100000000000000001111000000001111100110000000
00000000000000000000000000001011111001100101100
00000000000000000000000000110111110011000000100
10000000000000000000000000101111100110010100000
00000000000000000000000001011110101100100011010
00000000000000000000000010110100011001100100000
00000100000000000000111100000000111110011000000
00000000000000000000000000000101111010110011000
00000001111111111111111111101011010001100110000
11111111100000000000000011111000000011111001100
00000000000000000000000000000000010111110011001
01100000000000000000000000000001101111100110000
00100100000000000000000000000001011111001100101
00000000000000000000000000000010111101011001000
11010000000000000000000000000101101000110011000
10000000001000000000000001111000000001111100110
00000000000000000000000000000000001011110101100
11000000000010111111000000001111110101001011001
11011100010100010000011110100000101000000110011
01111000000000000000000000000000000000001001010
11001100000000000000000000000000001010000000001
01100011000000001000000000000000000000111111010
00110110011000000000000000000000000000000000000
00001100010110011010000000000000000000000110001
00000000001101000110011000000001011011100000000
00000111100000001111100110000000000000000000000
00000000000001010111101100000000000110001100110
00010000010010110100011001100110000101101110000
00000000011110000000111110011000000000000000000
00000000000000000101011110110000000000000000001
01000000101111101010000001100110111100000000000
00000000000000000001000110110011000000000000000
00000000000000000011010001100110000000000000100
00000000000001110110001011001101001000000000000
00000001100010011101110110011010000000000000000
00000000000000100000000101111011000000000000000
00000000000000000000101000000110011011110000000
00000000000000000000000100011011001100000000000
00000000000000000000001101000110011000011111011
01110000000000000111111000101100110100100000000
00000000000110001001111110011000000000000000000
00000000000000000101011110110010000000000000000
00000001000000001010000001100110111100000000000
00000000000000000001000110110011000000000000000
00000000000000000011010001100110000000000000100
00000000000001110110001011001101001000000000000
00000001100010011111100110000000000000000000000
00000000000001010111101100000001000000000000000
00000000000110100000011001101111000000000000000
00000000000000010001101100110000000000000000000
00000000000000110100011001100111111101101110000
00000000011111100010110011010010000000000000000
00011000100111111001100000000000000000000000000
00000000010101111011000000000000000000000000001
00000001101000000110011011110000000000000000000
00000000000100011011001100000000000000000000000
00000000001101000110011000000000000100000000000
00000111011000101100110100100000000000000000001
10001001111110011000000000000000000000000000000
00000101011110110000000000000000000000000000000
10011010000001100110111100000000000000000000000
00000001000110110011000000000000000000000000000
00000011010001100110000000000001000000000000000
01110110001011001101001000000000000000000011000
10011101110110011100000000000000000000000000001
00000000000101111011000000000000000000000000000
00000000101000000110011011110000000000000000000
00000000000100011011001100000000000000000000000
00000000001101000110011000110000000111111110000
00000111111000101100110100100000000000000000001
10001001111110011000000000000000000000000000000
00000101011110110000000000000000000000000000000
00001010100101100111011100010100010000011110100
00010100000011001101111000000000000000000000000
0

Almost repetitive: I did not manage to create a straight line, but again a pattern showed up in the data. I remembered reading about the packet layout somewhere on the first pages of the manuals. Each packet should be 8 bits (the request), a pause, 3 bits (the ACK), an optional pause, 33 bits (32 data bits plus a parity bit) and another optional pause. Let’s see how that fits the data:
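A minimal sketch of that split, assuming the bits are available as a string of '0' and '1' characters. The names `SwdFrame`, `take_bits` and `swd_split_frame` are made up for this example, and the length of each pause is passed in by hand rather than detected.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One captured transaction, split into its three fields. */
typedef struct {
    uint8_t  request;  /* 8 request bits; first bit on the wire is bit 0 */
    uint8_t  ack;      /* 3 ACK bits; first bit on the wire is bit 0 */
    uint32_t data;     /* 32 data bits, transmitted LSB first */
    uint8_t  parity;   /* the 33rd bit of the data phase */
} SwdFrame;

/* Pack n characters of '0'/'1' into an integer, first character = bit 0. */
static uint32_t take_bits(const char *bits, size_t n)
{
    uint32_t value = 0;
    for (size_t i = 0; i < n; i++) {
        if (bits[i] == '1') {
            value |= (uint32_t)1 << i;
        }
    }
    return value;
}

/*
 * Split one transaction into request, ACK and data. The two pause
 * arguments say how many bits to skip at each "pause" position.
 * Returns 0 on success, -1 if the string is too short.
 */
int swd_split_frame(const char *bits, size_t pause_after_request,
                    size_t pause_after_ack, SwdFrame *out)
{
    size_t needed = 8 + pause_after_request + 3 + pause_after_ack + 33;
    if (strlen(bits) < needed) {
        return -1;
    }
    out->request = (uint8_t)take_bits(bits, 8);
    bits += 8 + pause_after_request;
    out->ack = (uint8_t)take_bits(bits, 3);
    bits += 3 + pause_after_ack;
    out->data = take_bits(bits, 32);
    out->parity = (uint8_t)take_bits(bits + 32, 1);
    return 0;
}
```

Feeding it the ID read from the capture (request `10100101`, one pause bit, ACK `100`, then the 33 data bits) with `pause_after_request = 1` and `pause_after_ack = 0` gives a data word of 0x0BC11477.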

Syncing it up

10100101 100 11101110001010001000001111010000 0
10100101 100 11101110001010001000001111010000 0
10000001 100 11011110000000000000000000000000 0
10010101 100 11000000000000000000000000000010 1
10110001 100 00000010000000000000000000001111 1
10001101 100 11000011110000000000000000000000 0
11111001 100 00000000000000000000000000000000 0
11101101 100 10001100000000001110111000100000 0
10111101 100 11000000000001000000000000001111 1
10001101 100 11000000000000000000000000000000 0
11000101 100 11010010000000000000000000110001 0
11010001 100 11000000000000010000000000000011 1
11111001 100 00000000000000000000000000000000 0
11110101 100 11000000000001111111111111111111 1
11010001 100 11000011111111000000000000000011 1
11111001 100 00000000000000000000000000000000 0
11111001 100 10110000000000000000000000000000 1
11111001 100 00001001000000000000000000000000 0
11111001 100 10100000000000000000000000000000 0
11110101 100 10001101000000000000000000000000 0
11010001 100 11001000000000010000000000000011 1
11111001 100 00000000000000000000000000000000 0
11110101 100 11000000000011111111111111111111 0
11010001 100 11000011111111100000000000000011 1
11111001 100 00000000000000000000000000000000 0
11111001 100 10110000000000000000000000000000 1
11111001 100 00001001000000000000000000000000 0
11111001 100 10100000000000000000000000000000 0
11110101 100 10001101000000000000000000000000 0
11010001 100 11000100000000010000000000000011 1
11111001 100 00000000000000000000000000000000 0
11110101 100 11000000000010111111000000001111 1

Way better again. Especially the centre column of 100s is looking good! This is the ACK-message of the SWD-standard.
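The `100` in the middle column is the OK response: the ACK is the 3-bit value 0b001, sent least significant bit first. The other two patterns below (WAIT and FAULT) do not appear in this capture; they come from the ARM debug interface documentation. A small sketch, with made-up names, that turns the three wire bits into something readable:

```c
#include <string.h>

typedef enum {
    SWD_ACK_OK,
    SWD_ACK_WAIT,
    SWD_ACK_FAULT,
    SWD_ACK_INVALID
} SwdAck;

/* Classify the three ACK bits, given in the order they appear on the wire. */
SwdAck swd_parse_ack(const char *bits)
{
    if (strncmp(bits, "100", 3) == 0) return SWD_ACK_OK;     /* 0b001 */
    if (strncmp(bits, "010", 3) == 0) return SWD_ACK_WAIT;   /* 0b010 */
    if (strncmp(bits, "001", 3) == 0) return SWD_ACK_FAULT;  /* 0b100 */
    return SWD_ACK_INVALID;  /* e.g. "111": nobody is driving the line */
}

/* Human-readable name for log output. */
const char *swd_ack_name(SwdAck ack)
{
    switch (ack) {
    case SWD_ACK_OK:    return "OK";
    case SWD_ACK_WAIT:  return "WAIT";
    case SWD_ACK_FAULT: return "FAULT";
    default:            return "INVALID";
    }
}
```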

Parsing it

To parse it I had to look at the manual again *sigh*. There are some bits in the protocol that allow the chip to do sanity-checking on the signal. Start bit, stop bit, park bit, parity bit – checking for these bits allowed me to sync the signal even better. More manual readings gave me the names of the various registers and bits. Read bit, write bit, debug bit, ACK, error and all that jazz.
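As an illustration of those sanity checks, here is a minimal sketch that validates and decodes the 8-bit request. The field order (start, APnDP, RnW, A2, A3, parity, stop, park) is the one from the SWD documentation; the names `SwdRequest` and `swd_decode_request` are invented for this example and are not taken from my parser.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool    is_ap;    /* APnDP: true = access port, false = debug port */
    bool    is_read;  /* RnW: true = read, false = write */
    uint8_t addr;     /* register address: 0x0, 0x4, 0x8 or 0xC */
} SwdRequest;

/*
 * Decode 8 request bits given as '0'/'1' characters in wire order.
 * Returns false if the framing bits or the parity bit are wrong,
 * which usually means the parser is out of sync with the stream.
 */
bool swd_decode_request(const char *bits, SwdRequest *out)
{
    int b[8];
    for (int i = 0; i < 8; i++) {
        if (bits[i] != '0' && bits[i] != '1') {
            return false;  /* also catches strings shorter than 8 bits */
        }
        b[i] = bits[i] - '0';
    }

    int start = b[0], apndp = b[1], rnw = b[2], a2 = b[3], a3 = b[4];
    int parity = b[5], stop = b[6], park = b[7];

    /* Start and park must be 1, stop must be 0. */
    if (!start || stop || !park) {
        return false;
    }
    /* Parity is 1 when an odd number of the four payload bits is set. */
    if (((apndp + rnw + a2 + a3) & 1) != parity) {
        return false;
    }

    out->is_ap = apndp != 0;
    out->is_read = rnw != 0;
    out->addr = (uint8_t)((a2 << 2) | (a3 << 3));
    return true;
}
```

For example, `10100101` decodes as a debug port read of address 0x0 (the ID register), and `11010001` as an access port write to address 0x4.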

Behold the interpreted bit stream of a commercial programming box talking to a Freescale-chip:

needle (0x0BC11477) in LSB first format: 11101110001010001000001111010000
found needle!

skipped: 11111111111111111111111111111111111111111111111111111111011110011110011111111111111111111111111111111111111111111111111111111
AP R ?????? skipped: 11111111111111111111111111111
DP R ID DP R ID 0x0000001E 10000001 100 01111000000000000000000000000000 0
DP W CTRLSTAT -> 0x50000000 10010101 100 00000000000000000000000000001010 1
DP R CTRLSTAT 0x000000F0 10001101 100 00001111000000000000000000000000 0
AP R DATAREAD AP R ?????? DP R RDBUFF 0x00000000 10001101 100 00000000000000000000000000000000 0
AP W CSW -> 0x23000012 11000101 100 01001000000000000000000011000100 0
AP W TARGET -> 0xF0002000 11010001 100 00000000000001000000000000001111 1
AP R DATAREAD AP R TARGET 0xF0000FF0 11010001 100 00001111111100000000000000001111 1
AP R DATAREAD AP R DATAREAD AP R DATAREAD AP R DATAREAD AP R TARGET
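The "needle" trick at the top of that output is simple: SWD sends data least significant bit first, so a known value such as the ID 0x0BC11477 has to be bit-reversed into a string before you can search the dump for it. A minimal sketch (the function names are made up for this example):

```c
#include <stdint.h>
#include <string.h>

/* Write value as 32 characters, least significant bit first, plus a NUL. */
void to_lsb_first(uint32_t value, char out[33])
{
    for (int i = 0; i < 32; i++) {
        out[i] = ((value >> i) & 1u) ? '1' : '0';
    }
    out[32] = '\0';
}

/* Return the offset of value inside a '0'/'1' bit string, or -1. */
long find_needle(const char *haystack, uint32_t value)
{
    char needle[33];
    to_lsb_first(value, needle);
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1L;
}
```

Once the needle is found, its offset tells you where the data phase of a packet starts, and everything else can be aligned from there.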

Download the code/program/dumps HERE.

Coding it

Now that I had captured a real bit of setup-code that actually performed the thing I needed to do on the chip, I could rewrite it! With some tweaking, head-banging and by using the logic analyser again to check if my own output matched the official output – I managed to get something working. Now I have a way to dump the contents of the memory of any Cortex M0 device with an enabled SWD subsystem:

uint32_t Peek2(uint32_t address)
{
    Write(false, DP_W_ABORT, 0x1e);    // clear the sticky error flags
    Write(false, DP_W_SELECT, 0);      // select AP 0, register bank 0
    Write(true, AP_TAR, address);      // set the address to read from
    // 0x23000012 = SIZE_32 | AUTOINC_SINGLE | (1 << 24) | (1 << 25) | (1 << 29)
    Write(true, AP_CSW, 0x23000012);
    Read(true, AP_DRW);                // start the read; the result arrives one transfer later
    return Read(false, DP_R_RDBUFF);   // fetch the value that was read
}

This now gives the following output on my MKL02Z32 test device:

SWD-DP id: 0x0BC11477
Cortex M0 identified!
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFF7E
D0342900 22002301 4288B410 2401D32C
42A10724 4281D204 0109D202 E7F8011B
42A100E4 4281D204 0049D202 E7F8005B
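A dump in this layout can be produced with a small loop around the peek function. The sketch below is not the code from the download: `PeekFn`, `dump_words` and `fake_peek` are made-up names, and `fake_peek` only stands in for a real `Peek2` so the formatting can be tried on a PC.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Anything that reads one 32-bit word from the target, e.g. Peek2. */
typedef uint32_t (*PeekFn)(uint32_t address);

/* Stand-in for a real target: every word reads back as its own address. */
uint32_t fake_peek(uint32_t address)
{
    return address;
}

/*
 * Format `words` consecutive 32-bit words starting at `start` into `out`,
 * four words per line. Returns the number of characters written,
 * or -1 if the buffer is too small.
 */
int dump_words(PeekFn peek, uint32_t start, unsigned words,
               char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    for (unsigned i = 0; i < words; i++) {
        uint32_t value = peek(start + 4u * i);
        /* End the line after every fourth word and after the last one. */
        char sep = ((i % 4u) == 3u || i + 1u == words) ? '\n' : ' ';
        int n = snprintf(out + used, out_size - used,
                         "%08" PRIX32 "%c", value, sep);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

On the Arduino, the buffer would simply be sent out over the serial port after each call.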

Up next: writing hex-files to the built-in flash memory. To be continued..

STM32F030F4P6 “AVR programmer” board

by Krzysztof Foltman

After the initial success with LPC810, I was ready to work on something with more memory and more I/O. Searching the Farnell catalogue for simple, low-cost and hobbyist-friendly MCUs, I found the tiny STM32F030F4P6 microcontroller made by ST Microelectronics.

The microcontroller

The chip is based on an ARM Cortex M0 core, has a 48 MHz clock, 16 KB of flash and 4 KB of RAM, and is available in a 20-pin 0.65mm TSSOP package.
Unlike the LPC chip, it provides plenty of I/O pins and a wider selection of peripherals, including SPI and timers. It is still an entry level chip. It has none of the advanced interfaces like USB or I2S present in its larger counterparts, but that gets reflected in price – it costs about 1.50 euro in single units.
The low price is particularly important for people without experience with soldering of relatively fine-pitch SMD packages – with some basic magnification they are not prohibitively difficult to solder, but getting a number of spares makes the whole learning process less stressful.

Several options are available to make sure that I would be able to use the new microcontroller. One of them involves using an SMD breakout board and a breadboard. However, I was skeptical of the ability to get the Serial Wire Debug programmer/debugger working with the breadboard – it usually requires fairly short wires and it is sensitive to signal quality issues. There is also the “dead bug” approach of gluing the IC upside down and soldering wires to it. This I rejected because of the soldering skills required. My easiest option involved using the first version of a 10x10cm matrix protoboard that arrived from China some time earlier.

Getting it working

After soldering the chip to the PCB, I needed a way to verify that it was working. The easiest way to achieve this was adding a decoupling capacitor and the SWD port, and using an STLink v2 programmer I got from Stijn to communicate with the chip. These programmers are easily available from eBay or Aliexpress, and work fine most of the time; any issues are usually resolved by holding the chip in reset manually. An example eBay listing for one of these programmers is here.

It took some trial-and-error to get adequate signal quality: long cables or having signal wires too close makes the communication unreliable at speeds that STLink is using. I found the correct configuration file to use with OpenOCD (32f0308discovery.cfg – it is included with OpenOCD).

Some time later, I had a working SWD set-up. On my system, with OpenOCD installed in /usr/local/, the example command to erase the flash and program an .elf file is:

sudo openocd -f /usr/local/share/openocd/scripts/board/32f0308discovery.cfg -c 'init' -c "program file.elf verify reset"

The next step involved writing a small program that blinks a LED, in order to make sure that the basics work. For this, I decided to use the code and the linker script for the LPC as a starting point. The memory addresses and the entire initialization code had to be changed to reflect the differences between platforms. And no more I2C or port expanders – this chip had enough I/O to do something that I could use in practice.

The project

At the time, I was trying to make a camera rail controller for a friend. It was meant to be a simple radio-operated stepper motor controller. That in itself could be a good opportunity to try a new microcontroller, if it wasn’t for the size constraints on the device. The 10x10cm prototyping board was too large to fit in the carriage of the device, and the smaller prototyping board was through-hole only. Also, the project didn’t really require a lot of processing power, an ATmega chip would work equally well.

A project based on ATmega is probably easiest to prototype using Arduino environment. But it’s not like it’s easy to buy an ATmega with bootloader burned in, at least in Ireland – it might involve a few days of waiting or paying a good chunk of money for an original Arduino from Maplin just to source the chip. So, what if I built a device to do exactly that – to take a blank ATmega and burn the Arduino bootloader into it. Sounds like a nice learning project to use the STM32 for!

The STM32 definitely had enough I/O pins and memory, and an SPI peripheral that could be used for communicating with the AVR In-Circuit Serial Programming port. The board had enough space to host a 28-pin ZIF socket, so I added that and some user interface elements: two LEDs (green and red) with suitable resistors and a reset button. The SPI-based ICSP code was easy enough to write based on Atmel’s documentation: the datasheet and the AVR910 application note about using the programming interfaces. Finding the correct version of the Arduino bootloader was slightly harder than I expected, but I had it all working after a single afternoon of coding. The slightly cleaned up version of the code and very rough documentation is available here.

The remaining parts are:

  • the clock crystal and the load capacitors for the ATmega – but not for the STM32 as it runs off its own internal RC clock
  • the beeper – not strictly required but a nice addition. I didn’t dare to power the beeper directly from the microcontroller, so I used a 2N3904 transistor in common emitter mode as a switch.

The programmer code checks the device ID and uploads the bootloader, sets the correct fuses and then uploads a blinky example.
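The device ID check is the first step of that sequence. Below is a minimal sketch of it, not the code from the repository. The instruction bytes (0xAC 0x53 for "programming enable", 0x30 for "read signature byte") are the standard AVR serial programming instructions. The `SpiTransferFn` hook, the function names and the fake target are made up for this example, and the ATmega328P signature is an assumption about which chip sits in the socket. The sketch also assumes RESET is already held low with SCK idle, which the real code has to arrange first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook: clock one byte out over SPI, return the byte clocked in. */
typedef uint8_t (*SpiTransferFn)(uint8_t out);

/* Send one 4-byte instruction. Returns the reply to the 4th byte and,
 * optionally, the reply to the 3rd byte through reply3. */
static uint8_t isp_command(SpiTransferFn xfer, uint8_t b0, uint8_t b1,
                           uint8_t b2, uint8_t b3, uint8_t *reply3)
{
    xfer(b0);
    xfer(b1);
    uint8_t r3 = xfer(b2);
    uint8_t r4 = xfer(b3);
    if (reply3 != NULL) {
        *reply3 = r3;
    }
    return r4;
}

/* "Programming enable": a synchronized chip echoes 0x53 in the third byte. */
bool isp_enter_programming(SpiTransferFn xfer)
{
    uint8_t echo = 0;
    isp_command(xfer, 0xAC, 0x53, 0x00, 0x00, &echo);
    return echo == 0x53;
}

/* Read the three signature bytes. */
void isp_read_signature(SpiTransferFn xfer, uint8_t sig[3])
{
    for (uint8_t i = 0; i < 3; i++) {
        sig[i] = isp_command(xfer, 0x30, 0x00, i, 0x00, NULL);
    }
}

/* Assumption: the chip being programmed is an ATmega328P (1E 95 0F). */
bool isp_is_atmega328p(const uint8_t sig[3])
{
    return sig[0] == 0x1E && sig[1] == 0x95 && sig[2] == 0x0F;
}

/* Tiny fake target so the logic can be exercised without hardware. */
static uint8_t fake_cmd[4];
static int fake_pos;

uint8_t fake_avr_transfer(uint8_t out)
{
    static const uint8_t signature[3] = {0x1E, 0x95, 0x0F};
    uint8_t in = 0x00;
    fake_cmd[fake_pos] = out;
    if (fake_pos == 2 && fake_cmd[0] == 0xAC) {
        in = fake_cmd[1];                      /* echo the second byte */
    }
    if (fake_pos == 3 && fake_cmd[0] == 0x30) {
        in = signature[fake_cmd[2] % 3];       /* signature byte 0..2 */
    }
    fake_pos = (fake_pos + 1) % 4;
    return in;
}
```

If the echo or the signature does not match, that is the point to stop and flag the chip as bad (red light, series of beeps) before touching the flash or the fuses.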

Now, any time I want to make a simple device based on an ATmega microcontroller, I don’t need to buy a pre-programmed chip; I can get a blank one and pre-program it by simply inserting the chip into the board’s ZIF socket and pressing a button. After a while, I get a long “beep” and a green light, announcing the fact that the chip is functional and ready to use. Even better, this board provides a simple way to check an ATmega chip that I suspect was damaged during one of my failed experiments: a series of beeps and a red light will remove any doubt.

Summary

This Is Not Rocket Science at all.

The tricky parts are soldering and SWD signal quality, and the initialization and the linker script. One needs to remember to enable all the clocks during start-up and to take care of the interrupts. Otherwise, there may be unexpected crashes when SysTick gets triggered and there is no ISR to call. Overall, this chip is easy to obtain (Farnell), inexpensive, relatively easy to use and less limited than the 8-pin LPC chip.

There are some comparable options from other manufacturers in the same market segment: LPC811 and LPC812 from NXP (the latter is also available in a 20-pin SOIC package with 1.27mm pin pitch, making it easier to solder). My preferred option was an STMicro part because of my prior experience with their Discovery boards and the fact that most of the skills learned when working with a specific microcontroller are usable across their entire chip lines. Most of the time they use the same peripherals, so parts of the code can often be ported, with minimal changes, to their 180 MHz parts. NXP parts have several other advantages, the most important being the clear documentation they provide and their very flexible pin assignment matrix.

Panel merging tool progress

It has been quite a while since I had a chance to work on the tool – I have been too busy validating the output and playing with the microgameboy, but finally I’ve had the time to incorporate some bugfixes.

The biggest one on the list has been the import of Gerber files generated by DipTrace.

See here:

UPDATE – Whoops, forgot that some boards still use holes! Not everything is SMT yet 😉

Fixed the diptrace drill exports too:

ld bug 6

 

Creating breakable PCB panels for DirtyPCBs.com

DirtyPCBs seems to be the first small prototyping service that allows you to build panels with break tabs. I had to try this! Soon after this thought I stumbled upon the great question of “how”: here I had all these folders full of Gerber files for boards, but the tools to panelize them were all very primitive, expensive, unwieldy, unartistic and so on. Time to fix this!

Down here you can see the progress I’ve made from initial concept to usable tool.

The tool will be released in full (binary + source) after I’ve gotten the second round of panels back from DirtyPCBs. I am not going to give away a tool that creates unproducible Gerbers, so I need to double-check everything first.

Dealing with PCB artwork: procedural vector art tooling for Eagle

I like decorating PCBs. After I’ve spent ages designing and routing a schematic, I want it to look nice.

Sadly, this is very badly supported by most CAD software dealing with circuit boards. Some vector art import tools exist, but they are all very far removed from an ideal art workflow. The artist can’t really get direct visual feedback on the end result.

To fix this, I’ve started writing some tools.
The first tool is a C# module that mimics the HTML5 Canvas2D vector graphics API but outputs scripts for Eagle CAD. Eagle can only deal with cutout polygons on copper layers! This means that polygons with holes (like the letter O) will not have their insides emptied out if they are on the silk layer. Therefore, all polygons are converted to holeless triangle meshes. This allows the tool to import complex shapes. The first usage of this system is the text function. Any font file you have can safely be converted to a holeless polygon soup.

The second tool is a full Eagle board file importer, rasterizer and interpreter so you can extract relevant curves and use them to dynamically create graphics.

The third tool will integrate the first and the second tool with a code-editor window and a Lua backend. This should allow very fast “live coded” procedural board-art graphics. I am still working on this.

Screenshots of the progress in chronological order:

The Eagle script generator starts to work:
Generating Vector Art for Eagle

Holes correctly converted to Eagle polygons (triangle soup):
Generating Vector Art for Eagle - 2

Importing Eagle board XML files starts to work (this is the PCB for the first LED badge):
Eagle Board Rendering In C#

More complex boards start to work (Goldfish R3 and Protoboard v2):
Eagle Board Rendering In C# - 2

01 - SMD Pad Rotation Correct!

Added the Eagle “ArcTo” curves -> automatically converted into separate line segments. (See the round corners; this is the Goldfish R3 connector board.)
02 - Better layer abstractions and linewidths

And finally:  first glimpse of the combined board-reader-and-script-executing tool:
03 - Some random vectorart test

I am currently still perfecting the board-import/rendering function. After the integration of all the tools has been completed I will upload the code to our Github page.