Channel: Nerd Ralph

Low-power wireless communication options

I've written a number of blog posts about wireless communications, primarily 315/433MHz ASK/OOK modules and 2.4GHz GFSK modules.  In this article I'll analyze the pros and cons of each.

UHF ASK/OOK modules

ASK modules are the simplest type of communication module, in both their interface and their radio coding.  Single-pin operation allows them to be used with a single GPIO pin, or even with a UART.  The 315 and 433MHz bands can be received by inexpensive RTL-SDR dongles, making antenna testing relatively easy.  At 78c for a pair of modules, they are a very inexpensive option for wireless communications in embedded systems.  These bands are popular for low-bandwidth keyless entry systems, so fob transmitters are widely available for 315 and 433MHz.

For a given power output budget, 315MHz communications has the best range, since free-space loss increases with frequency.  Using 433MHz means about 3dB more attenuation than 315MHz, and 2.4GHz has about 15dB more attenuation than 433MHz.  The 315 and 433MHz bands can be used for low-power control and intermittent data in the US and Canada, but in the UK it looks like only the 433MHz band is available.  Because the radio coding is simply the presence or absence of a carrier, sensitivity is better than -100dBm.  Superheterodyne receivers have a sensitivity in the -107 to -111dBm range, while super-regenerative receivers are in the -103 to -106dBm range.

The inexpensive super-regenerative receivers may need tuning, though this can be avoided by using the more expensive superheterodyne receivers, which use a crystal as a frequency reference.  Higher-level protocol requirements such as CRC or addressing must be handled in software, although libraries such as VirtualWire can greatly simplify the task.

Although the specs on most transmitters indicate an input power requirement between 3.5 and 12V, the modules I have tested work at lower voltages.  I tested a 433MHz transmitter module powered by 2xAA NiMH cells providing 2.6V.  At that voltage, the transmit power was about 9dB lower than when powered at 5V.

As David Cook explains in his LoFi project, it is feasible to use CR2032 coin cells to power projects using these modules.  Depending on the MCU power consumption and the frequency of data transmission, even 5 years of operation on a single coin cell is possible.

2.4GHz modules

I've written a few blog posts before about 2.4GHz modules, covering both Nordic's nRF24l01 and Semitek's SE8R01.  The nRF chip has lower power consumption, at 13.5mA in receive mode versus 19.5mA for the Semitek part.  Either module can still be powered from a CR2032 coin cell.  When using a 50-100uF capacitor as recommended by TI, capacitor leakage can add to battery drain, so use a higher-voltage capacitor, such as 16V or even 50V, as these will have lower leakage.

Although Nordic's chip has lower power consumption, you might not get a genuine nRF chip when you purchase a module.  Besides higher power consumption, some clones are not fully compatible with the genuine nRF; this can range from being completely unable to communicate with nRF modules, as with the SE8R01, to simply not working with the dynamic payload length feature.  With SE8R01 modules selling for only 41c in small quantity, if you don't require compatibility with the genuine nRF protocol and aren't planning to try bit-banging Bluetooth Low Energy, they seem to be the best choice.  One other reason to go with genuine nRF modules (if you can find them for a reasonable price) is range.  At 250kbps, the nRF has a receiver sensitivity of -94dBm.  The SE8R01 doesn't have a 250kbps mode, and at both 500kbps and 1mbps its receiver sensitivity is -86dBm.

Although the power consumption of 2.4GHz modules is similar to UHF ASK/OOK modules, the higher data rate permits lower average power.  Take for example a periodic transmission consisting of a total of 10 bytes: 1 sync, 3 address, 4 payload data, and 2 CRC bytes.  At 250kbps the radio would be transmitting for only 320us, and needs to be powered for another 130us of tx settling time, making the total less than half a millisecond.  An ASK/OOK module transmitting at 9600bps would have the radio powered for a little over 8ms.  For a periodic transmission every 10 seconds, and with a sleep power consumption of 4uA, the ASK module would consume an average of 21uA, compared to just 5uA for the nRF module.
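The average-current arithmetic above is simple duty-cycle math.  Here's a sketch; the transmit currents (20mA for the ASK transmitter, 11mA for the nRF) are assumptions for illustration, not figures from the modules' datasheets, but they land close to the averages quoted above:

```c
/* Average current (in uA) for a node that transmits for t_active seconds
 * out of every period seconds and sleeps the rest of the time. */
double avg_current_ua(double i_tx_ma, double t_active_s,
                      double i_sleep_ua, double period_s)
{
    double active_ua = i_tx_ma * 1000.0 * t_active_s / period_s;
    return active_ua + i_sleep_ua;  /* sleep current flows essentially all the time */
}
```

With an assumed 20mA transmit current, the ASK module (radio on ~8.4ms every 10s) averages avg_current_ua(20, 0.0084, 4, 10) ≈ 20.8uA, while the nRF (~450us at an assumed 11mA) averages about 4.5uA.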

Conclusion

For low-speed unidirectional data, a UHF ASK/OOK module works fine and is the simplest solution.  For reliable bidirectional communication, or for high bandwidth, 2.4GHz modules are the best choice.

Yet another esp8266 article

When I received my ESP-01 module, I didn't expect I'd end up writing a blog post about it.  Unless you've spent the last six months on a research mission in Antarctica, you've probably read about these cheap and relatively easy-to-program WiFi modules.  But since I learned a few things that I haven't seen written about, I decided to share my knowledge.

The first thing I learned is that the module worked fine getting its power from a PL2303HX USB-TTL module, despite many claims I've read that esp8266 modules draw too much power.  This is likely true for FTDI USB-TTL modules, as their internal 3.3V regulator is rated for only 50mA.  The regulator on the PL2303HX is rated for 150mA, and my testing revealed it will output more than the ~200mA maximum required by the esp8266 with minimal voltage drop.  I did find that the high current draw when connecting power to the ESP-01 sometimes caused the USB-TTL module to reset.  This was resolved by soldering a 22nF capacitor between the 3.3V output and Gnd.

As most guides will tell you, CH_PD/CHIP_EN needs to be high in order for the module to boot, so I soldered a short wire to Vcc.  Some people also say to pull RST high; however, like many MCUs, the esp8266 has an internal pull-up on reset, so no external pull-up is required.  Similarly, GPIO0 and GPIO2 need to be high to select boot from SPI flash.  GPIO0 has an internal pull-up on boot, and GPIO2 defaults to high, so nothing needs to be wired to these pins.

An easy way to tell if your ESP is working is to do a WiFi scan.  After I powered it up, I could see an open access point named ESP_XXXXXX, where XXXXXX is part of the MAC address.

To communicate with the ESP-01, I started by using Putty at 9600bps to try using the AT commands.  I wasn't getting any response when I typed, so I reset the module by touching a resistor between Gnd and RST.  This caused the blue LED to flash, and along with a bit of garbage, the following text appeared:
[Vendor:www.ai-thinker.com Version:0.9.2.4]

What I eventually figured out was that the esp8266 requires all AT commands to end with CRLF.  Putty seems to send just CR for [Enter], so I entered <CTRL-M><CTRL-J> in order to send a CRLF.
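The CRLF requirement is easy to get right in code.  A minimal sketch of framing a command before writing it to the UART; AT+GMR (a version query) is used here as the example command:

```c
#include <stdio.h>
#include <string.h>

/* Build an AT command line for the esp8266: the firmware ignores
 * commands that are not terminated with CRLF ("\r\n"). */
int build_at_cmd(char *buf, size_t len, const char *cmd)
{
    return snprintf(buf, len, "%s\r\n", cmd);
}
```

build_at_cmd(buf, sizeof buf, "AT+GMR") yields "AT+GMR\r\n", which can then be written to the serial port as-is, avoiding the terminal-dependent [Enter] behaviour.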

I decided to try the nodeMCU eLua firmware, so I jumpered GPIO0 low, and ran the nodemcu-flasher.  After the firmware was flashed, I was able to connect (still at 9600bps) with putty and enter elua code.  The next step was to try an IDE for elua development on the esp8266.

I found three different IDEs: LuaLoader, Esplorer, and Lua Uploader.  I tried LuaLoader first, but had problems with serial communications - I wasn't seeing any text coming from the esp8266.  Its handy DTR & RTS control buttons worked, though: I could get the module to reset by wiring RTS to reset on the module, then toggling RTS low and back to high.  I tried running Esplorer, but when I found out it needs Java SE installed first, I moved on to trying Lua Uploader rather than wait for the Java runtime to download and install.

As can be seen from the screen shot above, Lua Uploader is rather spartan, but it worked fine for me.  By default it loads with a blink program that toggles GPIO2.  Communication with the ESP is rather slow at the default 9600bps, so I saved the following init.lua file to the ESP for 115.2kbps communication:
uart.setup(0, 115200, 8, 0, 1, 1)

Conclusion

I'm impressed with the capability of these little modules.  Next I plan to get another module with more GPIO, and try out the C SDK.

Fastest AVR software SPI in the West

Most AVR MCUs like the ATmega328p have a hardware I/O shift register, but only on a fixed set of pins.  Arduino's shiftOut function is horribly slow, so a number of faster implementations have been written.  I'll look at how fast they are, and explain an implementation in AVR assembler that's faster than any C implementation, and I'll even claim that it is the fastest software SPI for the AVR.

Adafruit spiWrite

void spiWrite(uint8_t data)
{
 uint8_t bit;
 for(bit = 0x80; bit; bit >>= 1) {
  SPIPORT &= ~clkpinmask;
  if(data & bit) SPIPORT |= mosipinmask;
  else SPIPORT &= ~mosipinmask;
  SPIPORT |= clkpinmask;
 }
}

This code comes from the Adafruit Nokia 5110 LCD library.  It's a bit odd because instead of using a loop counting down from 8 for the bits to be shifted, it shifts a bit through the 8 bits of a byte.  While it is much faster than Arduino's shiftOut, thanks to direct port manipulation instead of the slow digitalWrite, it's far from an optimal C implementation.  I compiled the code using avr-gcc 4.8 with -Os, and disassembled it using avr-objdump -D:
00000000 <spiWrite>:
   0:   28 e0           ldi     r18, 0x08       ; 8
   2:   30 e0           ldi     r19, 0x00       ; 0
   4:   90 e8           ldi     r25, 0x80       ; 128
   6:   2d 98           cbi     0x05, 5 ; 5
   8:   49 2f           mov     r20, r25
   a:   48 23           and     r20, r24
   c:   11 f0           breq    .+4             ; 0x12
   e:   2c 9a           sbi     0x05, 4 ; 5
  10:   01 c0           rjmp    .+2             ; 0x14
  12:   2c 98           cbi     0x05, 4 ; 5
  14:   2d 9a           sbi     0x05, 5 ; 5
  16:   96 95           lsr     r25
  18:   21 50           subi    r18, 0x01       ; 1
  1a:   31 09           sbc     r19, r1
  1c:   21 15           cp      r18, r1
  1e:   31 05           cpc     r19, r1
  20:   91 f7           brne    .-28            ; 0x6
  22:   08 95           ret

The loop for each bit compiles to 14 instructions, and takes 17 clock cycles for a 0 and 18 for a 1.  Although most AVR instructions take a single cycle, the cbi and sbi instructions for clearing and setting a single bit take two cycles.  Branches, when taken, are also two-cycle instructions.

Generic spi_byte

void spi_byte(uint8_t byte){
    uint8_t i = 8;
    do{
        SPIPORT &= ~mosipinmask;
        if(byte & 0x80) SPIPORT |= mosipinmask;
        SPIPORT |= clkpinmask;  // clk hi
        byte <<= 1;
        SPIPORT &=~ clkpinmask; // clk lo

    }while(--i);
    return;
}

I've seen variants of this code used not just for AVR, but also for PIC MCUs.  It is faster than the Adafruit code in part because the loop counts down to zero; as experienced coders know, on almost every CPU comparing against zero is cheaper than comparing against an arbitrary end value.  The disassembly shows the code to be 40% faster than the Adafruit code, taking 12 cycles for a 0 and 13 for a 1.
00000024 <spi_byte>:
  24:   98 e0           ldi     r25, 0x08       ; 8
  26:   2c 98           cbi     0x05, 4 ; 5
  28:   87 fd           sbrc    r24, 7
  2a:   2c 9a           sbi     0x05, 4 ; 5
  2c:   2d 9a           sbi     0x05, 5 ; 5
  2e:   88 0f           add     r24, r24
  30:   2d 98           cbi     0x05, 5 ; 5
  32:   91 50           subi    r25, 0x01       ; 1
  34:   c1 f7           brne    .-16            ; 0x26
  36:   08 95           ret

AVR optimized in C

Looking at the assembler code, half of the loop time is taken by the two-cycle cbi and sbi instructions.  The key to further speed optimization is to write code that will compile to single-cycle out instructions instead.  The mosi and clk pins can be cleared by reading the port state before the loop, then writing all 8 bits of the port with mosi and clk cleared:
    uint8_t portbits = (SPIPORT & ~(mosipinmask | clkpinmask) );
    do{
        SPIPORT = portbits;      // clk and data low

This also saves having to clear the clk pin at the end of the loop, for a total savings of 3 cycles.  With this technique, the time per bit can be reduced to 9 cycles.  By using the AVR PIN register, another cycle can be shaved off the loop.  The datasheet does not describe the PIN register in detail, stating little more than, "Writing a logic one to PINxn toggles the value of PORTxn".  What this means, for example, is that writing 0x81 to PINB will toggle the state of bit 0 and bit 7, leaving the rest of the bits unchanged.  Here's the final code:
void spi_byteFast(uint8_t byte){
    uint8_t i = 8;
    uint8_t portbits = (SPIPORT & ~(mosipinmask | clkpinmask) );
    do{
        SPIPORT = portbits;      // clk and data low
        if(byte & 0x80) SPIPIN = mosipinmask;
        SPIPIN = clkpinmask;     // toggle clk
        byte <<= 1;
    }while(--i);
    return;
}

The disassembly shows that although the code size has increased, the loop for transmitting a bit takes only 8 cycles.  More speed can be obtained at the cost of code size by having the compiler unroll the loop (enabled with -O3 in gcc).  This would reduce the time per bit to 5 cycles.
00000050 <spi_byteFast>:
  50:   25 b1           in      r18, 0x05       ; 5
  52:   2f 7c           andi    r18, 0xCF       ; 207
  54:   98 e0           ldi     r25, 0x08       ; 8
  56:   40 e1           ldi     r20, 0x10       ; 16
  58:   30 e2           ldi     r19, 0x20       ; 32
  5a:   25 b9           out     0x05, r18       ; 5
  5c:   87 fd           sbrc    r24, 7
  5e:   43 b9           out     0x03, r20       ; 3
  60:   33 b9           out     0x03, r19       ; 3
  62:   88 0f           add     r24, r24
  64:   91 50           subi    r25, 0x01       ; 1
  66:   c9 f7           brne    .-14            ; 0x5a
  68:   08 95           ret
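The write-one-to-toggle behaviour of the PIN register used above is just an XOR of the written value into the port.  As a sanity check, it can be modelled in plain C (a host-side simulation for illustration, not AVR code):

```c
#include <stdint.h>

/* Model of the AVR PINx write behaviour: each 1 bit written toggles
 * the corresponding PORTx bit; 0 bits leave it unchanged. */
uint8_t pin_write(uint8_t port, uint8_t value)
{
    return port ^ value;
}
```

Starting from PORTB = 0x01, writing 0x81 to PINB leaves 0x80: bits 0 and 7 toggled, the rest untouched, exactly as described above.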

Assembler

I learned to code in assembler (6502) over thirty years ago, and started to learn C a few years after that.  When gcc was first released in 1987, it generated code that was much larger and slower than assembler.  Although it has improved significantly over the years, what surprises me is that it or any other C compiler still rarely matches hand-optimized assembler code.  You might think that there's nothing left to optimize out of the 7 instructions that make up the loop above, but by making use of the carry flag, I can eliminate the loop counter.  That saves a register and reduces the loop time from 8 cycles to 7:
spiByte:
    in r18, SPIPORT     ; save port state
    andi r18, ~(mosipinmask | clkpinmask)
    ldi r20, mosipinmask
    ldi r19, clkpinmask
    lsl r24
    ori r24, 0x01       ; 9th bit marks end of byte
spiBit:
    out SPIPORT, r18
    brcc zeroBit
    out SPIPORT-2, r20  ; PORT address -2 is PIN
    lsl r24
    out SPIPORT-2, r19  ; clk hi
    brne spiBit
    ret

When looking for fast software SPI code, the best I could find was 8 cycles per bit.  I read a couple posts on AVRfreaks claiming 7 cycles is possible, but no code was posted.  Unrolled, the above assembler code is still 5 cycles per bit, the same as the optimized C version.  So to back up my claim about the fastest code and hand-optimized assembler being better than the compiler, I need to reduce the timing to 4 cycles per bit.  I can do it using the AVR's T flag, with the bst and bld instructions that transfer a single bit between the T flag and a register.
spiFast:
    in r25, SPIPORT     ; save port state
    andi r25, ~clkpinmask
    ldi r19, clkpinmask
halfByte:
    bst r24, 7
    bld r25, MOSI
    out SPIPORT, r25    ; clk low + data
    out SPIPORT-2, r19  ; clk hi
    bst r24, 6
    bld r25, MOSI
    out SPIPORT, r25    ; clk low + data
    out SPIPORT-2, r19  ; clk hi
    bst r24, 5
    bld r25, MOSI
    out SPIPORT, r25    ; clk low + data
    out SPIPORT-2, r19  ; clk hi
    bst r24, 4
    bld r25, MOSI
    out SPIPORT, r25    ; clk low + data
    out SPIPORT-2, r19  ; clk hi
    swap r24
    eor r1, r19         ; r1 is zero reg
    brne halfByte
    ret

The loop is half unrolled, doing two loops of 4 bits, with the function using a total of 23 instructions.  Fully unrolled the function would use 35 instructions/cycles (plus return), saving 4 cycles over the half unrolled version.
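To put the progression in perspective, bit rate is just the CPU clock divided by cycles per bit.  A small sketch, assuming a 16MHz AVR clock (a common clock speed, not one specified in the text):

```c
/* Software SPI throughput in kbps for a given AVR clock (Hz)
 * and cycle count per bit. */
double spi_kbps(double f_cpu_hz, double cycles_per_bit)
{
    return f_cpu_hz / cycles_per_bit / 1000.0;
}
```

At 16MHz, the Adafruit loop's roughly 17.5 cycles/bit gives about 914kbps, the generic spi_byte's 12.5 cycles about 1280kbps, the optimized C loop's 8 cycles 2000kbps, and the fully unrolled 4-cycle assembler 4000kbps.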

Conclusion

Including overhead, the spiFast assembler code is just under half the speed of  hardware SPI running at full speed (2 cycles per bit).  With the assistance of a hardware timer to generate the clock, and a port dedicated to just the mosi line, it's theoretically possible to output one bit every two cycles using a sequence of lsl and out instructions.  But for a fully software implementation that doesn't modify anything other than the mosi and clk bits, you won't find anything faster than 4 cycles per bit.  Copies of the code are available on my github repo: spi.S and spiWrite.c.

ESP8266 SPI flash performance


Despite the popularity of the ESP8266, I have yet to see a detailed datasheet published.  Nava Whiteford, on his blog, has links to a summary datasheet and the Cadence Tensilica core that the chip is based on.  None of this provides any details on how the memory controller pages in data from the SPI flash, nor on the speed of the communications.  About all that is clear from the datasheet and chip markings is that it uses a quad-SPI serial flash chip.

I decided to find out the performance of the SPI flash, as well as get an idea of what the cache line fill size of the chip is.  By looking at the pin-out of the flash chip, I determined that pin 6 is the clock.  After some probing and playing with the settings on my scope, I captured the clock burst shown above.

Based on the 500ns horizontal scale, the clock burst lasts a little more than 2us.  Zooming in shows that the clock is exactly 40MHz: half of the ESP8266's 80MHz clock, and half of the maximum 80MHz speed rating of the SPI flash.  Given that the burst lasted a little more than 2us, the total number of clock pulses is in the range of 85-90.  Accounting for the overhead of the commands to enable quad SPI mode and address setup, it seems the burst corresponds to reading 32 bytes from the flash, and therefore the cache line size is likely 32 bytes.
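The clock counting can be sketched as back-of-envelope arithmetic.  On quad SPI, data moves 4 bits per clock, so 32 bytes take 64 clocks; the ~24 clocks of command/address/dummy overhead used here is an assumption (the exact figure depends on the flash chip's quad read command), not something measured from the trace:

```c
/* Rough clock count for reading n bytes over quad SPI: data moves
 * 4 bits per clock, plus command/address/dummy overhead clocks.
 * The overhead value is an assumption, not a measured figure. */
int quad_read_clocks(int n_bytes, int overhead_clocks)
{
    return n_bytes * 8 / 4 + overhead_clocks;
}
```

quad_read_clocks(32, 24) gives 88 clocks, which at 40MHz is a 2.2us burst - consistent with the observed "little more than 2us" and 85-90 pulses.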

Conclusion

The clock signal is clean, and with a rise + fall time of 11.1ns it could be increased to 90MHz without significant distortion or attenuation.  With documentation on the registers to change the clock speed to 160MHz, the ESP8266 could be run at double speed without overclocking the SPI flash.

A 4mbps shiftOut for esp8266/Arduino


Since I finished writing the fastest possible bit-banged SPI for AVR, I wanted to see how fast the ESP8266 is at bit-banging SPI.  The NodeMCU eLua interpreter I initially tested on my ESP-01 has little hope of high performance, since it is at best byte-code compiled.  For a simple way to develop C programs for the ESP8266, I decided to use ESP8266/Arduino, using Jeroen's installer for my existing Arduino 1.6.1 installation.  Starting with a basic shiftOut function that ran at around 640kbps, I was able to write an optimized version that is six times faster, at almost 4mbps.

I modified the spi_byte AVR C code to use digitalWrite(), and call it twice in loop():
void shiftOut(byte dataPin, byte clkPin, byte data)
{
  byte i = 8;
    do{
      digitalWrite(clkPin, LOW);
      digitalWrite(dataPin, LOW);
      if(data & 0x80) digitalWrite(dataPin, HIGH);
      digitalWrite(clkPin, HIGH);
      data <<= 1;
    }while(--i);
    return;
}

void loop() {
  shiftOut(DATA, CLOCK, 'h'); 
  shiftOut(DATA, CLOCK, 'i'); 
}

Since I don't have a datasheet for the esp8266 that provides instruction timing, and am just starting to learn the lx106 assembler code, I used my oscilloscope to measure the timing of the data line:

The time to shift out 8 bits of data is around 12.5us, for a speed of 640kbps.  Looking at the signal in more detail I could see that the time between digitalWrite(dataPin, LOW) and digitalWrite(dataPin, HIGH) was 425ns.  Rather than setting the data pin low, then setting it high if the bit to shift out was a 1, I changed the code to do a single digitalWrite based on the bit being a 0 or a 1:
void shiftOut(byte dataPin, byte clkPin, byte data)
{
  byte i = 8;
    do{
      digitalWrite(clkPin, LOW);
      if(data & 0x80) digitalWrite(dataPin, HIGH);
      else digitalWrite(dataPin, LOW);
      digitalWrite(clkPin, HIGH);
      data <<= 1;
    }while(--i);
    return;
}

This change increased the speed slightly, to 770kbps.  Suspecting that the overhead of calling digitalWrite was a large part of the performance limitation, I looked at the source for the digitalWrite function.  If I could get the compiler to inline digitalWrite, I figured it would provide a significant speedup.  From my previous investigation of the performance of digitalWrite, I knew gcc's link-time optimization could do this kind of global inlining, so I enabled lto by adding -flto to the compiler options in platform.txt.  Unfortunately, the xtensa-lx106-elf build of gcc 4.8.2 does not yet support lto.

After looking at the source for the digitalWrite function, I could see that I could replace the digitalWrite with a call to a esp8266 library function GPIO_REG_WRITE:
void shiftOutFast(byte data)
{
  byte i = 8;
    do{
      GPIO_REG_WRITE(GPIO_OUT_W1TC_ADDRESS, 1 << CLOCK);
      if(data & 0x80)
        GPIO_REG_WRITE(GPIO_OUT_W1TS_ADDRESS, 1 << DATA);
      else
        GPIO_REG_WRITE(GPIO_OUT_W1TC_ADDRESS, 1 << DATA);
      GPIO_REG_WRITE(GPIO_OUT_W1TS_ADDRESS, 1 << CLOCK);
      data <<= 1;
    }while(--i);
    return;
}

This modified version was much faster - the oscilloscope screen shot at the beginning of this article shows the performance of shiftOutFast.  One bit time is 262.5ns, for a speed of 3.81mbps.  This would be quite adequate for driving a Nokia 5110 black and white LCD which has a maximum speed of 4mbps.

Conclusion

While 4mbps is fast enough for a low-resolution LCD display or some LEDs controlled by a shift register like the 74595, it's quite slow compared to the 80MHz clock speed of the esp8266.  Each bit, at 262.5ns, is taking 21 clock cycles.  I doubt the esp8266 supports modifying an I/O register in a single cycle like the AVR does, but it should be able to do it in two or three cycles.  While I don't have a proper datasheet for the esp8266, the Xtensa LX data book is a good start.  Combined with disassembling the compiled C, I should be able to further optimize the code, and maybe even figure out how to write the code in lx106 assembler.
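The 21-cycle figure falls straight out of the measured bit time and the CPU clock; the arithmetic can be sketched as:

```c
/* CPU cycles consumed per bit, from a measured bit time (ns)
 * and the CPU clock frequency (Hz). */
double cycles_per_bit(double bit_time_ns, double f_cpu_hz)
{
    return bit_time_ns * 1e-9 * f_cpu_hz;
}
```

cycles_per_bit(262.5, 80e6) gives 21 cycles; equivalently, 1/262.5ns is about 3.81mbps, and the original 12.5us-per-byte measurement works out to 640kbps.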

Zero-wire auto-reset for esp8266/Arduino


A little over a year ago I developed a zero-wire auto-reset solution for Arduino.  After I started using Arduino for the esp8266, I realized I could do the same thing with the ESP-01.

Flashing the esp8266

In order to download code to the esp8266 after reset, GPIO0 and GPIO15 must be low, and GPIO2 must be high.  The ESP-01 has GPIO15 grounded, and GPIO2 is set high after reset.  GPIO0 is pulled up to Vcc after reset, so in order to download code to the flash, it must be pulled low.  Although esptool-ck supports using RTS and DTR for flashing the esp8266, many cheap USB-TTL modules don't break out those lines.  With USB-TTL modules that do break out DTR, the DTR line should be connected to GPIO0 in order to pull it low during reset.  Otherwise GPIO0 needs to be grounded with a jumper or a push-button switch to ground.

The circuit

The auto-reset circuit I used on the esp8266 is a simplified version of the circuit I used with the pro mini.  It consists of just a 7.5K resistor between Rx and RST, and a 4.7uF capacitor between RST and Vcc.  The values are not critical, as long as the RC constant is between 10ms and 100ms, so if what you have on hand is a 15K resistor and a 1uF capacitor, that should work fine.  A serial break signal is 250ms long, which is why I suggest an RC constant of less than 100ms, to allow the capacitor to discharge and trigger a reset before the break signal ends.  If the RC constant is less than 10ms, a sequence of zero bytes transmitted to the esp8266 could unintentionally trigger a reset.  At 9600bps, each bit is 104.2us long, so 8 zero bits plus the start bit would last 938us.  Several zero bytes in a row, even with the high voltage of the stop bit in between, could trigger a reset.  The esptool default upload speed is 115.2kbps, so unwanted resets are quite unlikely.
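Checking component values against the 10-100ms window is simple arithmetic; a small sketch:

```c
/* RC time constant in milliseconds from resistance (ohms)
 * and capacitance (farads). */
double rc_ms(double r_ohms, double c_farads)
{
    return r_ohms * c_farads * 1000.0;
}

/* Duration in microseconds of a start bit plus 8 zero data bits
 * at a given baud rate - the longest continuous low from one byte. */
double zero_byte_us(double baud)
{
    return 9.0 * 1e6 / baud;
}
```

rc_ms(7500, 4.7e-6) is about 35ms and rc_ms(15000, 1e-6) is 15ms, both comfortably inside the window; zero_byte_us(9600) is 937.5us, confirming why an RC constant well above 1ms is needed to ride through strings of zero bytes.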

The  auto-reset circuit has an added benefit of improving the stability of the esp8266 module.  The RST pin on the esp8266 is extremely sensitive.  Before I added the auto-reset circuit, simply touching a probe from my multimeter to the RST pin would usually reset the module, even when I tried adding a 15K pullup resistor to Vcc.  I would also get intermittent "espcomm_sync failed" messages when trying to upload code.  Since adding the auto-reset circuit, I can probe the RST pin without triggering a reset, and my uploads have been error-free.

Getting the updated esptool-ck

By the time you read this, Ivan may have already integrated my patch for esptool-ck.  If the issue is still open, you can download the updated esptool-ck.  Extract esptool.exe into hardware\tools\esp8266.  This version also includes support for 921.6kbps uploads, which can be enabled by putting esp01.upload.speed=921600 in hardware\esp8266com\esp8266\boards.txt.

Building avr-gcc from source

Although 8-bit AVR MCUs are widely used, it is hard to find recent builds of avr-gcc.  The latest releases of gcc are 4.9.2 and 4.8.4, yet the latest release of the Atmel AVR Toolchain only includes gcc 4.8.1.  For CentOS 6, the most recent RPM I could find is 4.7.1.  To take advantage of the latest improvements to compiler features like link-time optimization, it is often necessary to build gcc from source.  When I first attempted to build gcc for AVR targets, I quickly discovered it's not as simple as downloading the source and running "configure; make install".  Picking away at it over the course of several months, I've figured out how to do it.  This method should work for avr targets, and, with a small change to the build options, for other targets like Arm and Mips.

Although both the avr-libc and gcc sites have some information on building gcc, I found both fell short of being concise and unambiguous.  The biggest source of problems I encountered was other libraries that gcc requires for building.  GCC's prerequisites indicates, "Several support libraries are necessary to build GCC, some are required, others optional." I (mis)interpreted the list of libraries starting with GNU Multiple Precision Library as being required only if those features were enabled in the compiler.  In the end I was only able to build avr-gcc 4.9.2 when I included GMP, MPFR, and MPC.  The ISL library was not required.

Required source files

The first thing to download before GCC is gnu binutils, which includes utilities like objdump for disassembling files.  If you already have an earlier version of binutils, it is not necessary to build a new version.  For example, on my server I have Atmel AVR Toolchain 3.4.4, which includes avr-gcc 4.8.1 and binutils 2.24.  In order to build avr-gcc 4.9.2 I don't need to make a new build of binutils.  I did try building binutils 2.25 (the latest), but instead of debugging a compile error I decided to stick with 2.24.  For building binutils, the following configure options were sufficient (though perhaps not necessary):
-v --target=avr --with-gnu-ld --with-gnu-as --quiet --enable-install-libbfd --with-dwarf2 --disable-werror CFLAGS="-Wno-format-security"

The next thing to download is GCC, followed by GMP, MPFR, and MPC.  I used GMP 5.1.3, MPFR 3.1.2, and MPC 1.0.3.  After extracting all the packages, symlinks named gmp, mpfr, and mpc need to be created in the GCC source directory, pointing to the respective source trees.  Then gcc can be configured with the following options before running make:
-v --target=avr --disable-nls --with-gnu-ld --with-gnu-as --enable-languages="c,c++" --disable-libssp --with-dwarf2

Build script

Rather than downloading, extracting, and building gcc manually, I started with a build script made by Rod Moffitt and a couple other contributors.  To use it, first run getfiles.gcc.sh which will download the files, then buildavr-gcc.sh.  After a long build process, the binaries will be in /usr/local/avr/bin/, which you should then add to your shell PATH variable.

Continuous Integration: the wonderful world of free build servers


While contributing to the esp8266/Arduino project, I saw Ivan post a link to a test build using Appveyor.  After a bit of research, I learned that there is a whole slew of companies offering cloud-based build servers, in a space called continuous integration.  More impressive, most companies in this space offer the service free for open-source projects.

When I first started writing software, it was in BASIC and assembler on a Commodore 64.  When writing small programs on fixed-configuration systems like that, the development cycle was reasonably quick, with even "large" assembler programs taking seconds to build.  Deployment testing was simple as well; if it worked on my C64, it would work on everyone else's.  As computers got bigger and more complex, so did the development cycle.  While working on large projects at places like Nortel, full system builds could take several hours or even days.  Being able to get quick feedback on small code changes is very important to software development productivity.  The availability of low-cost, on-demand services like Amazon Web Services has enabled companies in the CI space to offer build services with minimal infrastructure investment.

Who's Who

The esp8266/Arduino project uses Appveyor for Windows builds and Travis for Linux builds.  Other CI companies offering Linux-based build services include Drone.io, Snap CI, and my favorite, Codeship.  All of these companies offer some level of service for free to open-source developers, so I decided to try all four of the Linux-based CI services.

For my work with embedded systems, I have been writing build scripts for avr-gcc, which I intend to extend to building a gcc cross-compiler for the xtensa lx106 CPU on the esp8266.  Full builds of binutils, avr-gcc, and avr-libc take a few hours on an Intel Core i5, so getting a working build was a slow process.  Having a large build like this also turned out to be helpful differentiating between the different CI services.

One thing all the CI services have in common is they make it easy to set up an account and try their service if you already have a github account.  With Google Code shutting down, everyone in open-source development should have a github account already.  While the CI services support a number of different languages, I was only concerned with C++ using the gcc compiler.  All the servers had gcc >= 4.4 and typical tools like autoconf, flex, and bison.

Travis

Travis was the first CI service I tried, and it turned out to be one of the more complicated.  In order to get Travis to set up for your build (set environment variables, download dependencies), you need to make a .travis.yml file in the root of your repository.  The format is similar to a shell script, so it wasn't too hard to figure out.  After a bit of experimenting I was able to get a build started.

From some of the posts I read online, I was concerned whether the build would complete in the allowed time of 50 minutes.  The problem I ended up having was not build time but build output.  After 4MB of log output, Travis terminates the build.  If my build failed I wanted to see where in the build process the problem occurred.  So I turned off minor log output from things like tar extracting dependency libraries, but I still hit the 4MB limit.

Another problem you might have with large amounts of log output relates to how your browser handles it.  Firefox started freezing on me when I tried to view a 4MB log file, but Chrome was OK.

Drone.io

Drone's service was easier to set up, allowing a shell script to be written in their web interface, which would be run to start your build.  Drone has a limit of only 15 minutes on free builds, which turned out to be the showstopper with their service.

Codeship

I almost missed the boat on Codeship since they don't even list C/C++ in their supported languages.  I guess a gcc installation is taken for granted for Linux-based CI.  Codeship, like Drone, allows you to write a build script in their web interface.  Unlike Travis and Drone, no sudo access is available on the build servers, so installing updated packages is not possible.  Since the servers have a reasonably recent version of gcc (4.8.2-19ubuntu1) and gnu tools, this was not a problem for me.  Their build servers (running on AWS) are nice and fast, with a full build, using make -j4, taking about 13 minutes.

Codeship doesn't seem to have any limit on build time, though they do have a 10MB log output limit.  Fortunately that is just a limit of what is displayed in the web interface, and the build does not stop.   The most impressive thing about Codeship's service is that they give you ssh access to a build server instance for debugging!  Clicking on Open SSH Debug Session gives you the IP and port to ssh into, assuming you've already updated your account with your ssh public key.

On the debug server, your code is already copied to the "clone" directory.  The servers are running Ubuntu 14.04.2 LTS, and seem to have un-throttled GigE ports, as a download of gcc 4.9.2 clocks in at 24MB/s, or 200mbps.  With the debug server I was able to manually run my build, review config.log files, and copy files using scp to my computer for later review.

Snap CI

One way Snap CI differs from the other services is that their servers run CentOS instead of Ubuntu.  I started using RedHat before Debian and Ubuntu existed, and never had a good reason to leave rpm-based distributions, so I like the CentOS support.  The version of gcc on their servers (4.4.7) is rather old, but they enable sudo so you can upgrade it with a newer RPM.  They also have a limited shell interface called snap-shell.  It's not full ssh access like Codeship, but it does make it easy to check the environment by running things like "gcc --version".

Snap CI also uses AWS servers, and build times were very similar to Codeship.  If your builds require downloading a lot of prerequisite files, Snap may take a bit longer than Codeship though, as gcc 4.9.2 took about twice as long to download on Snap.

Conclusion

CI services eliminate the time and cost of setting up and maintaining build servers.  They simplify software testing by having a clean server instantiated for each build.  No more broken or incompatible builds because someone installed a custom library version on the build server that normal users don't have on their machine.  I closed my Drone.io account, and will probably close my Travis account too.  I'll keep using both Codeship and Snap, to be sure the software I'm working on can build on both Ubuntu and CentOS.  If I continue to support programs like picoboot-avrdude, which builds under Linux and MinGW, I'll also try out Appveyor.


Adapting an ESP-01 module for breadboard use


While esp8266 ESP-01 modules can easily be programmed after hooking up some dupont jumpers to a USB-TTL module, using them on a breadboard without an adapter or modification is not possible.  The obvious method is to make an adapter with a 2x4-pin female header, some stripboard, and two 1x4 male headers.  I thought of an even simpler way of adapting the ESP-01 for a breadboard without using any extra parts.

Of the 8 pins on the ESP-01, the CH_PD pin should be permanently tied to Vcc, so only 3 of the four middle pins are needed.  If you use my zero-wire reset solution, only the GPIO0 and GPIO2 pins are needed.  To modify the ESP-01 for breadboard use, heat up the CH_PD pin with a soldering iron, then pull it out with a pair of needle-nose pliers after the solder is liquid.  Then solder a short wire from Vcc to the CH_PD pad.  Next heat up the remaining 3 middle pins, and push them until they stick up out of the PCB.  To do this I put my needle-nose pliers under the pins, then pushed down on the module.  If you go too fast and get lumps of solder on a pin, add some flux and reheat to level out the solder so jumper wires can smoothly plug into the pins.  If you're not using my reset solution, I still recommend a capacitor on RST as it will reduce or eliminate spurious resets.  The RST line on the esp8266 is very sensitive, at least compared to RST on 8-bit AVR MCUs.  When you are done, your module should look like the one below, and can easily plug into a breadboard.


nRF24l01 control with 2 MCU pins using time-division duplexed SPI


Doing more with pin-limited MCUs seems to be a popular challenge, as my post nrf24l01+ control with 3 ATtiny85 pins is by far the most popular on my blog.  A couple months ago I had an idea of how to multiplex the MOSI and MISO pins, and got around to working on it over the past couple weeks.  The result is that I was able to control a nRF24l01+ module using just two pins on an ATtiny13a.  I also simplified my design for multiplexing the SCK and CSN lines so it uses just a resistor and capacitor.  Here's the circuit:

Starting with the top of the circuit, MOMI represents the AVR pin used for both input and output.  The circuit is simply a direct connection to the slave's MOSI (data in) pin, and a resistor to the MISO.  Since this is not a standard SPI configuration, I've written some bit-bang SPI code that works with the multiplexing circuit.  To read the data, the MOMI pin is simply set to input.  Before bringing SCK high, MOMI is set to output and the pin is set high or low according to the current data bit.  The 4.7k resistor keeps the slave from shorting out the output from the AVR when the AVR outputs high and the slave drives low, and vice versa.

Looking at the SCK/CSN multiplexing part of the circuit, I've removed the diode that was in the original version.  The purpose of the diode was to discharge the capacitor during the low portion of the SCK clock cycles, so the voltage on the CSN pin wouldn't move up in accordance with the typical 50% duty cycle of the SPI clock.  My bit-bang duplex SPI code is written so the clock duty cycle is less than 25%, keeping CSN from going high while data is being transmitted.  The values for C1 and R1 are not critical and are just based on what was within reach when I built the circuit; in fact I'd recommend lower values.  470Ohms * .22uF gives an RC time constant of 103uS, meaning SCK needs to be held low for >103uS for C1 to discharge enough for CSN to go low.  Something like a 220Ohm resistor and .1uF capacitor would reduce the delay required for CSN to go low to around 25uS.

The value of R2 is far more important.  The first value I tried was 1.5K, and after fixing a couple minor software bugs, it seemed to be working OK.  When I looked at the signals on my scope, I saw a problem:

The yellow trace shows the voltage level detected on the MOMI pin at the AVR.  Each successive high bit was at a slightly lower voltage, so after more than a few bytes of data, all the bits would likely be read as zero.  I suspect this has something to do with the internal capacitance of the output drivers on the nRF module, as well as its somewhat weak drive strength, documented in the datasheet at table 13.  A 4.7K resistor seems to be optimal, though anything from 3.3K to 6.8K should work.

Software

Here is the AVR code for the time-division duplexed SPI:
uint8_t spi_byte(uint8_t dataout)
{
    uint8_t datain = 0, bits = 8;

    do{
        datain <<= 1;
        if(SPI_PIN & (1<<SPI_MOMI)) datain++;

        sbi (SPI_DDR, SPI_MOMI);        // output mode
        if (dataout & 0x80) sbi (SPI_PORT, SPI_MOMI);
        SPI_PIN = (1<<SPI_SCK);
        cbi (SPI_DDR, SPI_MOMI);        // input mode
        SPI_PIN = (1<<SPI_SCK);         // toggle SCK

        cbi (SPI_PORT, SPI_MOMI);
        dataout <<= 1;

    }while(--bits);

    return datain;
}

I also wrote unidirectional spi_in and spi_out functions that work with the multiplexed MOSI/MISO.  Besides being faster than spi_byte, these functions work with the SE8R01 modules that have inconsistent drive strength on their MISO line.

The functions are in halfduplexspi.h, and I also wrote spitest.c, which will print the value of registers 15 through 0.  Here's a screen capture of the output from spitest.c:

picobootSTK500 v1 release

I've just released v1.0 of my arduino-compatible picoboot bootloader.  It now includes support for EEPROM reads, and has been tested on an ATmega168p pro mini (the beta release was only tested on a 328p pro mini).  It also fixes a possible bug where the bootloader could hang while writing the non-read-while-write section of the flash.  Since it's been a few months since the beta release, which has been working well on a couple m328p modules, I've decided to bump the release to v1.

The bootloader only takes 224 bytes of flash space, so there's room left to add support for eeprom writes, and possibly auto-baud for the serial in the future.

Hex files for m168 and m328 are included in the github repo, and the Makefile includes a rule to use avrdude for flashing the bootloader.  If you are using the Arduino IDE with an ATmega328p, picoboot is drop-in compatible with the optiboot bootloader used on the Uno, so just select the Uno in the boards menu.  For the ATmega168, modify the boards.txt file to support the faster upload speed and extra flash space:
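The repo's actual boards.txt entry isn't reproduced here; as a rough sketch of the kind of entry involved (the name, upload speed, and size values are my assumptions — maximum_size assumes a 256-byte boot section, even though picoboot itself is only 224 bytes), an Uno-style definition for a 168 with picoboot might look like:

```
picoboot168.name=ATmega168 Pro Mini w/ picoboot
picoboot168.upload.protocol=arduino
picoboot168.upload.speed=115200
picoboot168.upload.maximum_size=16128
picoboot168.build.mcu=atmega168
picoboot168.build.f_cpu=16000000L
picoboot168.build.core=arduino
```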

Using picoboot increases the unused flash space by 12.5% compared to the stock bootloader on the ATmega168 boards such as the 168p Pro Mini.

AVR eeprom debug log

Using text output for debugging is a common technique in both embedded and hosted environments.  In embedded environments the overhead of printf() or Wiring's Serial.print() can be quite large - over 1KB.  A lightweight transmit-only soft UART like my BBUart with some code to convert binary to hex will take 64 bytes, but on an 8-pin AVR, dedicating a pin to a soft UART may be a problem.  For some old parts like the ATtiny13a, the accuracy of the internal RC oscillator can also make UART output problematic.  I recently purchased 5 ATtiny13a's, and running at 3.3V, the oscillator for one of them was closer to 9.2Mhz than the nominal 9.6Mhz specified in the datasheet.

My solution is to use the EEPROM for debug logging.  The code takes only 22 bytes of flash, and the data log can be read using avrdude.  The eelog function will use up to 256 bytes of EEPROM as a circular log buffer.  Just include eelog.h, then call eelog(), passing a byte to add to the log.  I wrote a test program which logs address 0x3F through 0x00 of the AVR I/O registers.  Then I used avrdude to save the EEPROM in hex form to a file:

Then the file can be viewed in a text editor.  Another option would be to save the EEPROM as a binary file and use a hex editor.  Here's the contents of the ee13.hex file:
:2000000000009D9D000000000000000700002400000000000000000000000000000000007B
:2000200000202002020202000009000000000000000000000000003000000000000000003F
:00000001FF

From the log file, the value of stack pointer low (SPL) is 0x9D, or 2 bytes less than the end of RAM (0x9F).  Considering the call to main() uses 2 bytes, it looks like the eelog function is working as expected.

Rigol DS1054Z frequency counter accuracy

I recently found out that in addition to a software frequency measurement (shown in the bottom right), the DS1000Z series has a hardware frequency counter (shown in the top right).  The hardware counter is enabled by pressing the "Measure" button, then selecting Counter and CH1.  The display shows 6 digits, or 1 ppm resolution, but I was unable to find a specified accuracy for the counter.  My testing suggests the accuracy at ~25C ambient temperature is 1-2ppm.

The first measurements I took were with a couple old metal can 4-pin oscillators I had salvaged.  One is a Kyocera 44.2368MHz that I measured at 44.2369MHz.  The second was a M-tron 40.000000MHz that I measured at 39.9999MHz.  The next thing I measured was a generic 12.000MHz crystal on a USB device which measured 12.0001MHz.   Together those measurements suggested an accuracy of <10ppm.  I don't have a high-precision clock source such as an oven-controlled crystal oscillator or GPS receiver with a timing output, so I needed another way to precisely measure the accuracy of the frequency counter.

My solution was to accurately measure the 1Khz test signal output from the scope since the frequency counter measured it at an exact 1.00000kHz.  I don't have access to a calibrated frequency such as a 5381A, but I do have Kasper Pedersen's nft software.  I connected the test posts for the 1kHz output to the Rx line on a USB-TTL module, and started up nft.

From the mode options I selected pulse at 1kHz.  I could tell the pulses were being detected because the "Events" count was going up by about 1000 per second.

I did a few 300s runs that gave an average error of -0.98ppm.  I then let it run for two 1000s tests which resulted in an average error of -1.68ppm.  I don't recall Dave's teardown identifying the timing source, but given the amount of error, I'd rule out an OCXO.   The accuracy is a bit better than the +-10ppm of a typical crystal oscillator, so maybe it uses a temperature-compensated crystal oscillator (TCXO).  If anyone knows for sure, drop a line in the comments.

In addition to testing accuracy, I tested the frequency range.  I probed the antenna output from a 433.92Mhz ASK/OOK transmitter.  The software frequency counter identified it as 435Mhz, but the hardware counter showed 66.1680Mhz.  The signal level was low (around 300mV), so that may have caused problems for the hardware counter.  I suspect it is good up to 100Mhz, which is more than I expect to need in the foreseeable future.  The accuracy and frequency range is sufficient for the things I want to do like checking oscillators on MCUs.  I found one of my $2 Chinese Pro Minis was oscillating at 15.9973Mhz.  The -169ppm error would be acceptable for a ceramic resonator, but this was with a HC-49S package crystal oscillator.

$3 USB gamepad teardown



I recently received a USB gamepad I ordered off Aliexpress for a little more than $3.  I got it for a RetroPie box I'm planning to build, so I don't need anything fancy.  A USB controller chip alone can easily cost $1, so I was curious to see what went into making these.  The photo shows it is pretty simple.

The PCB is single-sided bakelite, which is really cheap.  While double-sided FR4 PCBs cost around 5c/sq in, even in volume, a single-sided bakelite board is under 2c/sq in.  The USB controller chip is on the other side of the board covered in an epoxy blob, so I can't say what kind of controller chip it is.  Besides the controller chip, the only electronic components are the 6Mhz resonator and the ceramic capacitor.  The wires connecting the L/R buttons to the PCB are cheap - similar to the wires twist ties are made from.  The controller looks like it has good strain relief, with the cord winding around a few plastic posts.

The controller was detected (under Windows 7) as a HID-compliant game controller.  I haven't finished setting up my RetroPie box yet, so I tried it out with Doom.  The button feel wasn't the greatest, but all 12 of the buttons worked.  Overall, I'm satisfied with the controller considering the low price.

Externally clocking (and overclocking) AVR MCUs

People familiar with AVR boards such as Arduinos likely know most AVR MCUs can be clocked from an external crystal connected to 2 of the pins.  When the AVR does not need to run at a precise clock frequency, it is also common to clock them from the internal 8Mhz oscillator.  Before CPUs were made with internal oscillators or inverting amplifiers for external crystals, they were clocked by an external circuit.  Although you won't see many AVR projects doing this, every AVR I have used supports an external clock option.  One (extreme) example of a project using an external clock is Brad's Quark85 video game platform.  Some AVRs such as the tiny13a and the tiny88 do not support an external crystal, so the internal oscillator or an external clock circuit are the only options.  The 4-pin metal can pictured above is a clock circuit hermetically sealed for precision and stability.  They can be bought from Asian sources for under $2.

A common reason for needing an external clock for an AVR MCU is from accidentally setting the fuses for an external clock.  Once the fuses are set to external clock, they cannot be reprogrammed without providing an external clock signal.  Wiring the oscillator is simple; connect power and ground, then connect the output to the CLKI pin of the MCU.   On the ATtiny13a, this is pin 2 (PB3).  On the ATmega328-AU, this is pin 7 (PB6).


The output of the oscillators is very stable and accurate, around a few ppm, as measured by my Rigol scope.  The output is almost rail-to-rail (0-5V), and quite clean:


Although the connection is simple, it's not foolproof.  During my experimentation, I accidentally plugged my oscillator backwards (connecting 5V to Gnd and Gnd to 5V), which quickly fried it.  Now I'll be extra careful with the M-Tron 40Mhz oscillator so I don't kill that too!

AVRs are known for being easy to overclock, but I was uncertain whether an ATtiny13a rated for 20Mhz would work when overclocked to more than double its rated speed.  I experienced no problems flashing code with avrdude and running my bit-bang UART at either 40 or 44.3Mhz with a 5V supply.  At 3.3V it crashed most of the time, only running OK occasionally.

Another way to provide an external clock is to build a ring oscillator using a 7404 hex inverter or similar chip.  A 3-stage ring oscillator I built using a 7404 generated a clock close to 30Mhz:

Since the frequency is inversely proportional to the number of stages, a 5-stage oscillator using the same 7404 would generate a frequency of 18Mhz.  I tried to make a single-stage oscillator with the 7404 and also with a 74LS00, but was unsuccessful; they are just not fast enough to generate a 90Mhz clock.  Considering the 7404 I used is a Fairchild part with a 1984 date stamp, I'm pleased with how well this 30-year-old part works.

The last way of getting a clock source I'll describe is to tap off the XTAL pin of an AVR (or other MCU) that is using an external crystal.  Most AVRs can drive the external crystal in low-power or full-swing mode.  For the ATmega8a, the CKOPT fuse enables full-swing mode.  If the AVR is driving the crystal in low-power mode, the peak-to-peak voltage will not be enough to work as the external clock for another AVR.  By soldering a wire to one of the XTAL pins you can use it to clock another MCU.  I've labeled the XTAL pins in yellow on a chinese USBasp clone:

And here's a shot from my scope connected to the 12Mhz crystal on the USBasp:

Finally, if your external clock is slower than 8Mhz (like if you were to use a 555 timer to generate the clock) you'll probably need to use a slower SPI bit clock setting with avrdude.  I've found avrdude -B 4, specifying a 4 microsecond clock period will work with AVRs clocked as low as 1Mhz.

Ralph's rant: non-portable AVR code

One thing I like about AVR MCUs is that in addition to instruction-set compatibility, a number of them have some degree of I/O register-level compatibility.  For example, both the ATtiny85 and ATtiny84a have PORTB at I/O address 0x18.  Because of this, I was able to write my 64-byte picoboot bootloader, which uses a soft UART on PB1, so that a single binary works on both the tiny85 and tiny84.

I recently thought I could take advantage of the register-level compatibility between the ATmega328p and ATmega168p in my arduino compatible picoboot bootloader.  The source is already identical, and the only difference in the binary files is the flash start address and the signature bytes reported.  My idea was to build a version which returned the signature bytes of the 328p, but that loaded on the 168p.  When flashed to a 168p, it would look like a Uno to the Arduino IDE, so people could switch between a 328p board and a 168p board without having to modify the boards.txt file.  Obviously projects with a code size larger than 16KB wouldn't work, but for everything else, I thought it was a great idea.  But it didn't work.

The bootloader would initially work OK; clicking Upload in the Arduino IDE would seem to upload the code to the 168p board when Uno was selected as the target, but the uploaded code wouldn't work.  I double-checked the fuse settings for the 168p.  I flashed the board back to the regular 168p bootloader, selected my modified pro mini 168 target in the boards menu, and uploaded.  Everything worked fine, so there was nothing wrong with the board.  I compared the disassembly of the normal 168p bootloader and my 168p masquerade bootloader as I was calling it; the only difference was the signature bytes reported.  I even reviewed the 168p/328p datasheet in case I missed an important difference - and found nothing.

I then decided to verify that the bootloader was properly flashing the uploaded code and hadn't somehow corrupted the flash.  I uploaded a basic blink program using the 168p masquerade bootloader, then connected a USBasp to read back the full contents of the flash, including the bootloader:
avrdude -c usbasp -C /etc/avrdude.conf -p m168p -U flash:r:flash168masq.hex:i

Then I used avr-objcopy to convert the hex file to elf:
avr-objcopy -I ihex flash168masq.hex -O elf32-avr flash168.elf

Finally, I used avr-objdump to disassemble the elf file:
avr-objdump -D flash168.elf

The reset vector was a jump to 0x00ae:
       0:       0c 94 57 00     jmp     0xae    ;  0xae
...
      ae:       11 24           eor     r1, r1
      b0:       1f be           out     0x3f, r1        ; 63
      b2:       cf ef           ldi     r28, 0xFF       ; 255
      b4:       d8 e0           ldi     r29, 0x08       ; 8
      b6:       de bf           out     0x3e, r29       ; 62
      b8:       cd bf           out     0x3d, r28       ; 61

The code at 0x00ae first clears the zero register (r1), then clears SREG (0x3f).  Clearing SREG is redundant since section 7.3.1 of the datasheet shows that SREG is always cleared after reset.  Clearing it again wasn't going to cause any problems though.  The next four instructions initialize the stack (SPL and SPH).  I immediately recognized this as the problem.  I described how this was redundant in Trimming the fat from avr-gcc code.  In this case it wasn't redundant, it was wrong!  Since avr-gcc thought it was generating code for a m328p, it included the (normally just redundant) code to initialize the stack to 0x08FF.  But on the m168p, the end of RAM, and therefore the reset value of the stack pointer, is 0x04FF.  With an improperly initialized stack, it was obvious why programs uploaded to the 168p masquerading as a 328p weren't working.

So the superfluous code emitted by avr-gcc not only wastes space, it interferes with releasing binary code that runs on a number of different AVR MCUs.  I think it also demonstrates the dangers of developers writing code with an "it shouldn't hurt" attitude rather than an "is it necessary?" attitude.  I don't know who first said it, but it was a wise man who recognized that when building a project you should include everything necessary but nothing more.

Piggy-prog project ideas


I've started working on a new project I'm calling piggy-prog.  The hardware requirements are cheap and simple - a Pro Mini and a breadboard.  The pro mini boards are cheaper than a USBasp (around 150c on Aliexpress), and by piggy-backing over a DIP AVR (like the ATtiny pictured above), no jumper wires or custom programming cable will be required.  The plan is to support the 8-pin tinies like the tiny13a and the tinyx5, and the 14-pin tiny84.

The piggy-prog should be a lot safer than socket-based programmers like the stk500, especially for the 8-pin parts like the tiny85. With the 8-pin AVRs, putting the chip in the wrong way around (rotated 180 degrees) results in reversing the polarity of the power - Vcc to Gnd and Gnd to Vcc:

By selectively powering the pins of the target chip with the low-current I/O pullup power from the Pro Mini, it is possible to probe and detect the target chip without any risk of damage to either the target chip or the pro mini.  Given the lack of a clamping diode on the reset pin going to Vcc, it is possible to detect which pin is reset, and therefore detect when the chip is rotated 180 degrees.

I'll use the stk500 protocol since it is supported by avrdude, and is the protocol used when "AVR ISP" is selected in the Arduino IDE programmers menu.  And since my picobootSTK500 bootloader implements a stripped down version of the stk500 protocol, I'll be able to leverage some of the code I've already written.

Proof of concept

To test the idea, I wired up an ATtiny85 on a breadboard with connections to piggyback a pro mini running ArduinoISP.
I modified the code to change LED_ERR from 8 to 2, since pin 8 connects to Gnd on the tiny85, and I changed LED_PMODE to 3.  I first tested the ArduinoISP code without the connection to the tiny85, but was always getting a "programmer not responding" error:
$ avrdude -C /etc/avrdude.conf -c avrisp -p t85 -P com16 -b 19200
avrdude: stk500_recv(): programmer is not responding

After connecting an LED and resistor to pin 9, I could see the LED heartbeat, but whenever I ran avrdude the heartbeat would stop (and another LED connected to LED_ERR would not light up).  This seems to be a bug in the ArduinoISP code, since when I plugged the pro mini into the breadboard on top of the tiny85 it worked fine:

$ avrdude -C /etc/avrdude.conf -c avrisp -p t85 -P com16 -b 19200

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.06s

avrdude: Device signature = 0x1e930b

avrdude: safemode: Fuses OK (E:FE, H:DF, L:E1)

avrdude done.  Thank you.

If I can find the bug that is causing the ArduinoISP code to hang when there is no target, I'll probably build on that code rather than starting from scratch.

DC converter modules using fake LM2596 parts


Kerry Wong recently tested some cheap LM2596 DC buck converter modules, very similar to the ones I purchased off Aliexpress over a year ago for around 80c ea.  One of the comments indicated these actually use clones of the LM2576 re-labeled as 2596.  The switching frequency of the LM2576 is around 50KHz, vs 150KHz for the LM2596, making it easy to see the difference on a scope.

I pulled out my DS1054Z, one of the DC buck converter modules that I had previously adjusted for 3.3V output, and connected the input to 5V.  Here's the output on pin 2:

Measuring on the 5us scale, two cycles takes about 36us, or 18us per cycle, or 55.5kHz.  So the ones I received are fakes.  I did test the modules after I received them, and found they are good for about 2A @5V with 12V in, so they weren't a total waste of money.

The latest "LM2596" modules I've seen online clearly do not use a real or even a fake 2596.  The SOIC-8 part appears to be a MPS MP1584.  With a switching frequency of up to 1.5Mhz and a smaller form factor, the modules look like a reasonable value at 42c ea, even though they're falsely advertised as LM2596.  Strangely, some sellers correctly advertise the same modules as MP1584 converters, but at several times the price.

Cheap TL431 voltage references


Until a year ago I had never heard of the TL431.  Then I read Ken Shirriff's blog post, as well as other mentions of the TL431 on hackaday.com and eevblog.com.  I found out the 431 is useful not only as a voltage reference, but also as a constant current control, and even a voltage controlled oscillator.

I had started suspecting my cheap (~$20) auto-ranging multimeter was reporting voltages a bit on the high side, and when I found 100 TL431s selling for less than 150c, I ordered them.  While waiting for them to arrive I tried to find out more information about the manufacturer, Wing Shing Computer Components of Hong Kong.   I could not find an active web site (at least in English), and although I found an old datasheet for the WS-TL431, I could not find anything current.  I did find another Aliexpress seller that posted a photo of a box full of WS-TL431A showing a 0.3% accuracy rating, which, considering the low price, is quite good.  Even 1% rated genuine TI TL431 parts are difficult to find for less than 2c each.

Once I received the package, I checked out the chip markings, which were all the same:
WS
TL431A
155SD

I suspect the 155 is a date code for 2015, 5th week, indicating these are new parts.  The old datasheet from Wing Shing shows the TL431A part as only 1%, and a TL431AA as 0.5%, with nothing listed for a 0.3% part.  I don't think I'm perpetuating an unfair stereotype to say that the Chinese are notorious for bad or non-existent documentation.  I think that the parts I received are actually rated to within 0.3% at 25C, and the manufacturer has not undertaken to produce an updated datasheet (or English website, for that matter).  Other compatible parts such as Linear's LT1431 are rated at a 0.4% initial tolerance, and the price is in line with similar Chinese TL431 parts such as the ALJ TL431A and the CJ431.  After checking the WS TL431 chip markings, I set up a simple circuit on my breadboard with a 270 Ohm input resistor (which should give about 9.5mA) from a ~5V USB power supply to test the parts.

I tested a total of 25 parts at an ambient temperature of 24C.  The average voltage reading was 2.513V, and the range was from 2.506 to 2.517, or 2.5115V +-0.23%.  The measurements are consistent with the parts being 0.3% rated, as well as suggesting my meter is reading about 0.6-0.7% high.

The next thing I tried was to crack open the TO-92 package with a pair of pliers in an attempt to expose the die.  Like Ken, I was able to expose the copper anode (seen in the very first picture), but was not able to expose the die.  The die appears to be around 0.6mm x 1mm, so even if I was able to expose the die, with only a magnifying glass, I doubt I would have been able to see much.

My intention in trying to expose the die was to see if the Wing Shing parts are fuse trimmed like the TI part depicted by Ken.  Two fuses give four different combinations of trimming options, which should show up as more than one peak in the distribution of the voltages.  Without a die, I could still analyze my measurements and look for peaks.   A simple shell command was all I needed:
sort voltages.txt | uniq -c
      1 2.506
      1 2.508
      2 2.509
      3 2.510
      2 2.511
      3 2.512
      1 2.513
      3 2.514
      3 2.515
      3 2.516
      3 2.517

Even with only a quarter of the parts tested, it is evident the voltages are concentrated around 2.510V, 2.512V, and 2.515/2.516V.  While more data points would be helpful, the testing is consistent with fuse-trimmed 0.3% parts.

The first practical circuit I made with the TL431 uses it as a 2.5V zener for battery reconditioning.  I had been using a 270 Ohm resistor to discharge the batteries.  With the TL431 acting as a 2.5V zener, a high-current red LED and 160 Ohm resistor add another 2.5V of drop, for a total very close to the 4.8V needed when discharging a 12-cell battery to 0.4V/cell.

I'd like to have a discharge closer to 0.1C, which would be around 130mA, but the red LED is rated for 50mA maximum continuous current.  The TL431 datasheet has a simple constant current circuit, and by making a couple small modifications to that circuit I think I can make a 130mA constant current discharge circuit with a cut-off voltage just below 5V.

Calibrating a cheap crappy tire multimeter


Anyone living in Canada is likely familiar with Canadian Tire (CT) products carrying the Mastercraft branding.  They are significantly overpriced to allow for the heavy sale discounts of 50-60% that usually happen a few times a year.  About 10 years ago I bought a Mastercraft model 52-0052-2 auto-ranging multimeter when it was on sale for C$20 (about US$15 at current exchange rates).  I use it for low-voltage work, and have another multimeter with a current clamp that I use for household (mains) 120/240V testing.

I've been using it a lot over the last few years, and started getting a feeling that it was reading a bit high, based on readings from 3.3 and 5V regulators, and even from comparisons with battery voltage readings.  After testing a batch of TL431 voltage references, I was able to confirm that it is reading between 0.6% and 0.7% high.  I also ordered a couple 0.05% REF5050s, which will allow me to double-check my TL431 measurements.

After opening up the meter, I found there are 8 trimmer pots, and no indication on the board as to which of these adjusts the voltage.  I eventually found a discussion on EEVblog that indicates the meter is made by the Hong Kong company Colluck, and is spec'd for 0.8% accuracy on voltage readings.  I have what DIPlover calls the old model, which measures 9cm wide and 18.5cm long.  While I still couldn't find documentation on how to adjust the calibration, I did find a review of another cheap multimeter with several trimmer pots, where the pot labeled VR1 was the voltage calibration trimmer.  I figured I had little to lose by trying the same pot on my meter.

In case VR1 didn't adjust the voltage, I needed to be able to return it to its original position, so I used an ultra-fine Sharpie to mark a small line on the top and the base of the trim pot.  I used a TL431 that was reading 2.513V, which I thought should actually be around 2.496V, connecting it to my meter with hook probes.  With the first small adjustment of the pot, the voltage reading went up to over 2.53V, so I had the right pot.  Sensitivity was a bit of a problem, as even the tiniest adjustments I could make undershot or overshot.  After several tries, I got a reading of 2.496V, which I think is within 0.1%.  With a couple of REF5050s it should be possible to calibrate it to +-1mV, or 0.04%.  But given how sensitive the trim pot is, I won't touch it as long as it is within 2mV.
