Linux and the I2C and SPI interfaces

Intro

Most of the people reading this blog, apart from not having anything more interesting to do, are probably more familiar with the lower-level embedded stuff. I like the baremetal embedded stuff. Everything is simple and straightforward: you work with registers, close to the hardware, and you avoid all the bloatware between the hardware and the more complicated software layers, like an RTOS. Also, the code is faster, more efficient and you have full control of everything. When you write a firmware, you’re the God.

And then an RTOS comes and says, well, I’m the god and I may give you some resources and time to spend with my CPU to run your pitiful firmware or buggy code. Then you become a semi-god, at best. So Linux is one of those annoying operating systems that demotes you to a simple peasant and allows you to use only a part of the resources and only when it decides to. On the other hand, though, it gives you back a lot more benefits, like supporting multiple architectures and having an API for both the kernel and the user-space that you can re-use among those different architectures and hardware.

So, in this stupid project we’ll see how we can use a couple of hardware interfaces like I2C and SPI to establish communication between the kernel and external hardware. This will unveil the differences between those two worlds and you’ll see how those interfaces can be used in Linux in different ways and whether one of those ways is better than the others.

Just a note here: I’ve tested this project on two different boards, a raspberry pi 3 model B+ and a nano-pi neo. Using the rpi is easier for most people, but I prefer the nano-pi neo as it’s smaller and much cheaper and it has everything you need. Therefore, in this post I will explain (not very thoroughly) how to make it work on the rpi, but in the README.md file in the project repo you’ll find how to use Yocto to build a custom image for the nano-pi and do the same thing. In the case of the nano-pi you can also use a distro like armbian and build the modules by using the sources of the armbian build. There are so many ways to do this, so I’ll only focus on one way here.

Components

Nanopi-neo

I tried to keep everything simple and cheap. For the Linux OS I’ve chosen to use the nanopi-neo board. This board costs around $12 and it has an Allwinner H3 CPU @ 800 or 1200MHz and 512MB of RAM. It also has various other interfaces, but we only care about the I2C and the SPI. This is the board:

You can see the full specs and pinout description here. You will find the guide on how to use the nano-pi neo in the repo’s README.md file here:

https://bitbucket.org/dimtass/linux-arduino-spi-i2c/src/master/

Raspberry pi

I have several raspberry pis flying around here, but in this case I’ll use the latest Raspberry Pi 3 Model B+. That way I can justify to myself that I bought it for a reason and feel a bit better. In this guide I will explain how to make this stupid project with the rpi, as most people have access to one of these rather than to a nano-pi.

Arduino nano

The next piece of hardware is the arduino nano. Why arduino? Well, it’s fast and easy, that’s why. I think the arduino is both a blessing and a curse. If you have worked a lot with the baremetal stuff, arduino is like a miracle. You just write a few lines of code and everything works. On the other hand, it’s also a trap. If you write a lot of code in there, you end up losing touch with the baremetal reality and you become more and more lazy and forget about the real hardware. Anyway, because there’s not much time now, the Arduino API will do just fine! This is the Arduino nano:

Other stuff

You will also need a photoresistor, an LED and a couple of resistors. I’ll explain why in the next section. The photoresistor I’m using is a pre-historic component I’ve found in my inventory and the part name is VT33A603/2, but you can use whatever you have, it doesn’t really matter. Also, I’m using an orange LED with a forward voltage of around 2.4V @ 70mA.

Project

OK, that’s fine. But what’s the stupid project this time? I’ve thought of the most stupid thing you can build and it would be a nice addition to my series of stupid projects. Let’s take a photo-resistor and the Arduino nano, use an ADC to read the resistance and then use the I2C interface as a slave to send the raw resistance value. This value actually represents the amount of light energy that the resistor senses, and therefore if you have the datasheet you can convert this raw resistance to something more meaningful like lumens or whatever. But we don’t really care about this, we only care about the raw value, which will be something between 0 and 1023, as the avr mega328p (arduino nano) has a 10-bit ADC. Beautiful.

So, what do we do with this photo-resistor? Well, we also use a PWM channel of the Arduino nano to drive an LED! The duty cycle of the PWM will control the LED brightness and we feed the mega328p with that value by using the SPI bus, so the Arduino will also be an SPI slave. The SPI word length will be 16 bits and from those only 10 bits will be effective (the same length as the ADC).

Yes, you’ve guessed right. From the Linux OS, we will read the raw photo-resistor value using the I2C interface and then feed back this value to the PWM LED using the SPI interface. Therefore, the LED will be brighter when there is more light and less bright in dark conditions. It’s like the auto-brightness of your mobile phone’s screen. Stupid, right? Useless, but let’s see how to do that.

Connections

First you need to connect the photo-resistor and the LED to the Arduino nano as it’s shown in the next schematic.

As you can see, the D3 pin of the Arduino nano will be the PWM output that drives the LED and the A3 pin is the ADC input. In my case I’ve used a 75Ω resistor to drive the LED to increase the brightness range, but you might have to use a higher value. It’s better to use an LED that can handle a current near the maximum current output of the Arduino nano. The same goes for the resistor that creates a voltage divider with the photo-resistor; use a value that is correct for your components.

Next you need to connect the I2C and SPI pins between the Arduino and the rpi (or nanopi-neo). Keep in mind that the nano-pi neo has an rpi-compatible pinout, so it’s the same pins. These are the connections:

Signal Arduino RPI (and nano-pi neo)
/SS D10 24 (SPI0_CS)
MOSI D11 19 (SPI0_MOSI)
MISO D12 21 (SPI0_MISO)
SCK D13 23 (SPI0_CLK)
SDA A4 3 (I2C0_SDA)
SCL A5 5 (I2C0_SCL)

You will need to use two pull-up resistors for the SDA and SCL. I’ve used 10KΩ resistors, but you may have to check with an oscilloscope to choose the proper values.

Firmwares

You need to flash the Arduino with the proper firmware and also boot up the nanopi-neo with a proper Linux distribution. For the second thing you have two options that I’ll explain later. So, clone this repo from here:

https://bitbucket.org/dimtass/linux-arduino-spi-i2c/src/master/

There you will find the Arduino sketch in the arduino-i2c-spi folder. Use the Arduino IDE to build and to upload the firmware.

For the rpi, download the standard raspbian stretch lite image from here and flash an SD card with it. Don’t use the desktop image, because you won’t need any gui for this guide and also the lite image is faster. After you flash the image and boot the board there are a few debug tweaks you can do, like removing the root password and allowing passwordless root connections. Yeah, I know it’s against the security guidelines, don’t do this at your home and blah blah, but who cares? You’re not supposed to run your NAS with your dirty secrets on this thing, it’s only for testing this stupid project.

To remove the root password run this:

passwd -d root

Then just for fun also edit your /etc/passwd file and remove the x from the root:x:... line. Then edit your /etc/ssh/sshd_config so the following lines are like this:

PermitRootLogin yes
#PubkeyAuthentication yes
PasswordAuthentication yes
PermitEmptyPasswords yes
UsePAM no

Now you should be able to ssh easily to the board like this:

ssh root@192.168.0.33

Anyway, just flash the arduino firmware and the raspbian distro and then do all the proper connections and boot up everything.

Usage

Here’s the interesting part. How can you retrieve and send data from the nanopi-neo to the Arduino board? Of course, by using the I2C and the SPI, but how can you use those things inside Linux? Well, there are many ways and we’ll see a few of them.

Before you boot the raspbian image on the rpi3, you need to edit the /boot/config.txt and add this at the end of the file, in order to enable the uart console.

enable_uart=1

Raw access from user space using bash

Bash is cool (or any other shell). You can use it to read and write raw data from almost any hardware interface. From the bitbucket repo, have a look at bash-spidev-example.sh. That’s a simple bash script that is able to read data from the I2C and then send data to the SPI using the spidev module. To do that, the only things you need to take care of are loading the spidev overlay and installing the spi-tools. The problem with the debian stretch repos is that the spi-tools package is not available, so you need to build it yourself. To do this, just login as root and run the following commands on the board’s shell:

apt-get install -y git
apt-get install -y autotools-dev
apt-get install -y dh-autoreconf
apt-get install -y i2c-tools
git clone https://github.com/cpb-/spi-tools.git
cd spi-tools
autoreconf -fim
./configure
make
make install

Now that you have all the needed tools installed, you need to enable the I2C and SPI (spidev) interfaces on the raspberry pi. To do that, run the raspi-config tool, browse to the Interfacing Options, enable both I2C and SPI and then reboot. After that you will be able to see these devices:

/dev/spidev0.0
/dev/i2c-1

This means that you can use one of the bash scripts I’ve provided with the repo to read the photo-resistor value and also send a pwm value to the LED. First you need to copy the scripts from your workstation to the rpi, using scp (I assume that the IP is 192.168.0.36).

cd bash-scripts
scp *.sh root@192.168.0.36:/root/

Before you run the script you need to properly set up the SPI interface, using the spi-config tool that you installed earlier, otherwise the default speed is too high. To get the current SPI settings run this command:

spi-config -d /dev/spidev0.0 -q

If you don’t get the following then you need to configure the SPI.

/dev/spidev0.0: mode=0, lsb=0, bits=8, speed=10000000, spiready=0

To configure the SPI, run these commands:

spi-config -d /dev/spidev0.0 -m 0 -l 0 -b 8 -s 1000000
./bash-spidev-example.sh

With the first command you configure the SPI in order to be consistent with the Arduino SPI bus configuration. Then you run the script. If you look at the script, the sleep function sleeps for 0.05 secs, or 50ms. We do that for benchmarking. I’ve used the oscilloscope to measure the time difference between every SPI packet and the average is around 66ms (screenshots later on), instead of 50. Of course, that includes the time to read from the I2C and also send to the SPI. Also, I’ve seen a few I2C failures with the following error:

mPTHD: 0x036e
mError: Read failed
PTHD: 0x
n\PTHD: 0x036d
xPTHD: 0x036e

Anyway, this way we’ve seen that we are able to read from the I2C and write to the SPI without the need to write any custom drivers. Instead we used the spidev module, which is available in the mainline kernel, and also a couple of user-space tools. Cool!

Using a custom user-space utility in C

Now we’re going to write a small C utility that opens the /dev/i2c-1 and the /dev/spidev0.0 devices and reads/writes to them like they are files. To do that you need to compile a small tool. You can do that on the rpi, but we’ll need to build some kernel modules later, so let’s use a toolchain to do that.
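
Just to give you an idea of what such a tool looks like inside, here’s a minimal sketch of the main loop. This is not the actual linux-app.c from the repo; the I2C slave address (0x08), the byte order and the 2-byte message format are assumptions, so check the Arduino sketch and the repo code for the real values.

#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>
#include <linux/spi/spidev.h>

#define I2C_ADDR 0x08   /* hypothetical Arduino slave address */

int main(void)
{
    int i2c = open("/dev/i2c-1", O_RDWR);
    int spi = open("/dev/spidev0.0", O_RDWR);
    uint8_t mode = SPI_MODE_0, bits = 8;
    uint32_t speed = 1000000;
    uint8_t buf[2];

    if (i2c < 0 || spi < 0)
        return 1;

    ioctl(i2c, I2C_SLAVE, I2C_ADDR);              /* select the Arduino I2C slave */
    ioctl(spi, SPI_IOC_WR_MODE, &mode);           /* match the Arduino SPI settings */
    ioctl(spi, SPI_IOC_WR_BITS_PER_WORD, &bits);
    ioctl(spi, SPI_IOC_WR_MAX_SPEED_HZ, &speed);

    while (1) {
        /* read the 16-bit raw ADC value from the I2C slave */
        if (read(i2c, buf, 2) == 2) {
            uint16_t adc = (buf[0] << 8) | buf[1];  /* byte order is an assumption */
            struct spi_ioc_transfer tr = {
                .tx_buf = (unsigned long)buf,
                .len = 2,
            };
            /* forward the same 16-bit word to the SPI slave to set the PWM duty cycle */
            ioctl(spi, SPI_IOC_MESSAGE(1), &tr);
            printf("adc: %u\n", adc);
        }
        usleep(50000);   /* 50ms loop period, same as the benchmark */
    }
    return 0;
}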

The toolchain I’ve used is the `gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf` and you can download it from here. You may have noticed that this is a 32-bit toolchain. Yep, although the rpi has a 64-bit cpu, raspbian by default is 32-bit. Then just extract it to a folder (this folder in my case is /opt/toolchains) and then you can cross-build the tool on your desktop with these commands:

export CC=/opt/toolchains/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc
${CC} -o linux-app linux-app.c
scp linux-app root@192.168.0.36:/root

Then on the rpi terminal run this:

./linux-app /dev/i2c-1 /dev/spidev0.0

Again, in the code I’ve used a sleep of 50ms and this time the oscilloscope shows that the average time between SPI packets is ~50ms (screenshots later on), which is a lot faster compared to the bash script. In the picture you see the average is 83, but that’s because sometimes there are delays of 200+ms caused by the OS, which is quite expected in a non PREEMPT-RT kernel. Also, I’ve noticed that there were no missing I2C packets with the executable. Nice!

You won’t need the spidev anymore, so you can run the raspi-config and disable the SPI, but leave the I2C as it is. Also, in the /boot/config.txt make sure that you have those:

dtparam=i2c_arm=on
#dtparam=i2s=on
#dtparam=spi=off
enable_uart=1

Now reboot.

Use custom kernel driver modules

Now the more interesting stuff. Let’s build some drivers, use the device-tree to load the modules and see how the kernel really handles these types of devices using the IIO and the LED subsystems. First let’s build the IIO module. To do that you first need to set up the rpi kernel and the cross-toolchain on your workstation, so get the kernel from git and run some commands to prepare it.

git clone https://github.com/raspberrypi/linux.git
cd linux

Now you need to checkout the correct hash/tag. To do this, run this command on the rpi console.

uname -a

In my case I get this:

Linux raspberrypi 4.14.79-v7+ #1159 SMP Sun Nov 4 17:50:20 GMT 2018 armv7l GNU/Linux

That means that this kernel was built on 04.11.2018. Then, in the kernel repo on your workstation, run this:

git tag

And you will get a list of tags with dates. In my case the `raspberrypi-kernel_1.20181112-1` tag seems to be the correct one, so check out the one that is appropriate for your kernel, e.g.

git checkout raspberrypi-kernel_1.20181112-1

Then run these commands:

export KERNEL=kernel7
export ARCH=arm
export CROSS_COMPILE=/opt/toolchains/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-
export KERNEL_SRC=/rnd2/linux-dev/rpi-linux
make bcm2709_defconfig
make -j32 zImage modules dtbs

This will build the kernel, the modules and the device-tree files, and it will create the Module.symvers that is needed to build our custom modules. Now, on the same console (where the above variables are exported), run this from the repo’s top level:

cd kernel_iio_driver
make
dtc -I dts -O dtb -o rpi-i2c-ard101ph.dtbo rpi-i2c-ard101ph.dts
scp ard101ph.ko root@192.168.0.36:/root/
scp rpi-i2c-ard101ph.dtbo root@192.168.0.36:/boot/overlays/

Now, do the same thing for the led driver:

cd kernel_led_driver
make
dtc -I dts -O dtb -o rpi-spi-ardled.dtbo rpi-spi-ardled.dts
scp ardled.ko root@192.168.0.36:/root/
scp rpi-spi-ardled.dtbo root@192.168.0.36:/boot/overlays/

And then run these commands to the rpi terminal:

mv ard101ph.ko /lib/modules/$(uname -r)/kernel/drivers/iio/light/
mv ardled.ko /lib/modules/$(uname -r)/kernel/drivers/leds/
depmod -a

And finally, edit the /boot/config.txt file and make sure that those lines are in there:

dtparam=i2c_arm=on
#dtparam=i2s=on
dtparam=spi=off
enable_uart=1
dtoverlay=rpi-i2c-ard101ph
dtoverlay=rpi-spi-ardled

And now reboot with this command:

systemctl reboot

After the reboot (and if everything went alright) you should be able to see the two new devices and also be able to read and write data like this:

cat /sys/bus/iio/devices/iio\:device0/in_illuminance_raw
echo 520 > /sys/bus/spi/devices/spi0.0/leds/ardled-0/brightness
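
If you prefer C over bash, reading and writing those sysfs attributes is just plain file I/O. Here’s a minimal sketch of that idea; the paths are the ones shown above, but the actual device names depend on the drivers and the device-tree, so adjust accordingly.

#include <stdio.h>

/* Minimal sketch: read the raw illuminance from the iio device and
 * write it to the brightness attribute of the spi led device. */
int main(void)
{
    int raw = 0;
    FILE *in = fopen("/sys/bus/iio/devices/iio:device0/in_illuminance_raw", "r");
    FILE *out = fopen("/sys/bus/spi/devices/spi0.0/leds/ardled-0/brightness", "w");

    if (!in || !out)
        return 1;

    fscanf(in, "%d", &raw);      /* value is 0..1023, same as the Arduino ADC */
    fprintf(out, "%d\n", raw);   /* drive the led with the same value */

    fclose(in);
    fclose(out);
    return 0;
}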

Keep in mind that this is the 4.14.y kernel version and if you’re reading this with a newer version then a lot of things might be different.

So, now that we have our two awesome devices, we can re-write the bash script to use those kernel devices. Well, the script is already in the bash-scripts folder, so just scp the scripts and run this on the rpi:

./bash-modules-example.sh

The oscilloscope now shows an average period of 62.5ms, which is a bit faster compared to the raw read/write from bash in the first example, but the difference is too small to be significant.

Conclusions

Let’s see the various oscilloscope screenshots. The first one is from the bash script and the spidev module:

The second one is from the linux-app program that also uses the spidev and /dev/i2c devices.

And the last one is by using the kernel’s iio and led subsystems.

So let’s draw some conclusions about the speed in those different cases. It’s pretty clear that writing an app in user space using the spidev and /dev/i2c devices is definitely the way to go, as it’s the best option in terms of speed and robustness. Then, the difference between using a bash script to read/write to the bus with the different types of drivers [spidev & /dev/i2c] vs [leds & iio] is very small.

Then why write a driver for iio and leds in the first place, if there’s no difference in performance? Exactly. In most cases it’s much easier to write a user-space tool to control this kind of device instead of writing a driver.

Then are those subsystems useless? Well, not really. They are useful if you use them right.

Let’s see a few bullet points, why writing a user-space app using standard modules like the spidev is good:

  • No need to know the kernel internals
  • Independent from the kernel version
  • You just compile on another platform without having to deal with hardware specific stuff.
  • Portability (pretty much a generalization of the above)
  • If the user space app crashes then the kernel remains intact
  • Easier to update the app than updating a kernel module
  • Less complicated compared to kernel drivers

On the other hand, writing or having a subsystem driver also has some nice points:

  • There are already a lot of kernel modules for a variety of devices
  • You can write user space software which is independent from the hardware. For example, if the app accesses an iio device then it doesn’t have to know how to handle the device and you can just change the device at any time (as long as it’s compatible in iio terms)
  • You can hide the implementation if the driver is not a module but fixed in the kernel
  • It’s better for time critical operations with tight timings or interrupts
  • Might be a bit faster compared to a generic driver (e.g. spidev)
  • Keeps the hardware isolated from the user-space (that means that the user-space devs don’t have to know how to deal with the hardware)

These are pretty much the generic points that you may read about generally. As a personal preference, I would definitely go for a user-space implementation in most cases, even if a device requires interrupts, as long as it’s not time critical. I would choose to write a driver only for very time critical systems. I mean, OK, knowing how to write kernel drivers is a nice skill to have today and it’s not difficult. The bottom line is that, I believe, most of the time, even in the embedded domain, when it comes to I2C and SPI you don’t have to write a kernel driver, unless we’re talking about ultra-fast SPI at more than 50MHz with DMAs and stuff like that. But there are very few cases that really need that, like audio/video or a lot of data. In 95% of the remaining cases the spidev and the user-space are fine. The spidev is even used for driving framebuffers, so it’s proven already. If you work in the embedded industry, then probably you know how to do both and choose the proper solution every time; but most of the time on a mainstream product you may choose to go with a driver because it’s “proper” rather than “needed”.

Anyway, in this stupid project you’ve pretty much seen how the SPI and I2C devices are used, how to implement your own I2C and SPI device using an Arduino and then how to interface with it in the Linux kernel, either by using the standard available drivers (like spidev) or by writing your custom subsystem driver.

Finally, this is a video where both kernel modules are loaded and the bash script reads the photo-resistor value every 50ms via I2C and then writes the value to the brightness attribute of the led device.

Have fun!

STM32F373CxT6 development board

Intro

Probably most of you people browsing places like this already know the famous “blue pill” board that is based on the stm32f103c8t6 mcu. Actually, you can find a few projects here that are based on this mcu. It’s my favorite one: fast, nice peripherals, overclockable and dirt cheap. The whole PCB dev board costs around 1.7 EUR… I mean, look at this.

If you try to buy the components to build one and also order the PCBs from one of the known cheap PCB makers, that would cost more than 10 EUR per unit for a small quantity. But…

There’s a new player in town

Don’t get me wrong, the stm32f103 is still an excellent mcu and capable of doing most of the things I can imagine for small stupid projects. But there are also a few things it can’t do. And this is the gap that this new mcu comes to fill in. I’m talking about the stm32f373, of course. The STM32F373CxT6 (I’ll refer to it as f373 from now on) is pin-to-pin compatible with the STM32F103C8T6, although some pins are different, and it’s like its buffed brother. The new f373 also comes in a LQFP48 package, but there are significant differences, which I’ve summed up in the following table.

                                  STM32F373C8     STM32F103C8
Core                              Arm Cortex-M4   Arm Cortex-M3
RAM Size (kB)                     16              20
Timers (typ) (16-bit)             12              4
Timers (typ) (32-bit)             2               -
A/D Converters (12-bit channels)  9               10
A/D Converters (16-bit channels)  8               -
D/A Converters (typ) (12-bit)     3               -
Comparators                       2               -
SPI (typ)                         3               2
I2S (typ)                         3               -

So, you get a faster core with DSP, FPU and MPU, DACs, 16-bit ADCs, I2S and more SPI channels, in the same package… This MCU is mind-blowing. I love it. But, of course, it is more expensive. Mostly I like the DACs, because you can do a lot of stupid stuff with them, and then the 16-bit ADCs, because more bits mean less noise, especially if you use proper software filters. The frequency is the same, so I don’t expect much difference in terms of raw speed, but I’ll test it at some point.

Also, it’s great that ST has already released a standard peripheral library for the mcu, so you don’t have to use this HAL crap bloatware. There’s a link for that here. The StdPeriphLib support now makes this part my favorite one, except for the fact that… I haven’t tested it yet.

Where’s my pill?

For the f103 we have the blue pill. But where’s my f373 pill? There’s nothing out there yet. Only a very expensive dev board that costs around $250. See here. That’s almost 140 “blue pills”… Damn. Therefore, I had to design a board that is similar to the “blue pill” but uses the f373. Well, some of you have already thought: why not remove the f103 from the “blue pill”, solder the f373 instead and there you are, you have the board. Well, that’s true… You can do that, some pins might need some special handling of course, but where’s the fun in this? We need to build a board and do a stupid project!

Well, in case you want to replace the f103 on the “blue pill” with the f373, this is the list with the differences in the pinout.

PIN F103 F373
21 B10 E8
22 B11 E9
25 B12 VREFSD+
26 B13 B14
27 B14 B15
28 B15 D8
35 VSS_2 F7
36 VDD_2 F6

So the problematic pins are mainly 35 & 36, and that’s because on the “blue pill” board pin 35 is wired straight to ground and pin 36 to 3V3, which makes the f373’s F7 and F6 pins unusable. According to the manual the F7 pin has these functionalities: I2C2_SDA, USART2_CK, and the F6 these: SPI1_MOSI/I2S1_SD, USART3_RTS, TIM4_CH4, I2C2_SCL. That doesn’t mean that you can’t use the I2C2, as I2C2_SDA is also available on pin 31 (PA10), SPI1_MOSI is available on pin 41 (PB5) and I2C2_SCL on pin 30 (PA9). So not much harm, except for the fact that those alternative pins overlap other functionalities that you might then not be able to use.

Therefore, just replace the f103 on the “blue pill” with the f373, if you like; but don’t fool yourself, this won’t be stupid enough, so it’s better to build a board from scratch.

The board

I’ve copied the schematics of the “blue pill” and created a similar board, with a few differences though. First, the components like resistors and capacitors are 0805 with larger pads, instead of the 0603 on the original “blue pill”. The reason for that is to help older people like me with bad vision and shaky hands to solder these little things on the board. I’m planning to create a branch with 0603 parts for the younger people, though. Soon…

I’ve used my favorite KiCAD for that, as it’s free and open-source, and every open-hardware (and not only) project should use it, too. I think that after version 5, more and more people are finally joining the boat.

This is the repo with the KiCAD files:

https://bitbucket.org/dimtass/stm32f373cb_board/src/master/

And this is the 3D render of the board:

Conclusion

You can replace the f103 on the “blue pill” with the f373, but that’s lame. Don’t do that. Always go with the hardest, most time-consuming and most expensive solution and people will always love you. I don’t know why, it just works. So, instead of replacing the IC, build a much more expensive board from scratch, order the PCBs and the components, wait for the shipment and then solder everything by yourself. Then you have a great stupid project and I’ll be proud of you.

Have fun!

Electronic load using ESP8266

Intro

Long time, no see. Well, let me tell you a thing, parenting is the end of your free time for your stupid projects. It took me a few months just to complete a very small stupid project like this… Anyway, this time I wanted to build an electronic load. There are so many designs out there with various different user interfaces, but I wanted something more useless than a pot to control the output current. Therefore, I thought, why not build an electronic load with a web interface? And which is the best option for that? Of course, the esp8266. Well… not really, but we’ll see why later. Still, this is built on the esp8266 anyway.

Electronic load circuit

There are various ways to make an eload, but the easiest is to use an N-MOSFET and an opamp whose negative feedback drives the gate-source of the MOSFET. This is a screenshot of the main circuit:

You can find the kicad project for the eload circuit here:

https://bitbucket.org/dimtass/esp8266-web-eload/src/master/kicad/eload/

In the above circuit there’s a first-stage opamp that is driven from a DAC. This opamp amplifies the signal with a gain of 2.2 and then the second-stage opamp drives the MOSFET. The drain of the MOSFET is connected to the PSU under test and the source to an array of parallel 10Ω/1W resistors, which makes an effective 1Ω/10W resistor. The gate and source of the MOSFET are connected in the negative feedback loop of the opamp. This means the opamp will do what opamps do and will “mirror” the voltage on the (+) input to the (-) input. Which means that whatever voltage is applied on the (+) input, the opamp will drive the output in a way that both inputs have the same voltage. Because the gate and the source of the MOSFET are part of the feedback loop, in this circuit it means the voltage on the source will be the same as on the (+) input. Therefore, since the resistor array is 1Ω, the current will be I=V/R and because R=1Ω, that means I=V. So, if 5V is applied on the (+) input then 5A will flow through the resistors.

It’s generally better to have resistors in parallel, because that means that the current will be split among these resistors, which means less heat per resistor and no need for extra cooling on a single resistor.

There is also another opamp which buffers the voltage on the negative feedback loop and drives a voltage divider and an RC filter. We need a buffer there in order not to apply any load on the feedback, as the opamp will draw only a tiny current which doesn’t affect the performance of the circuit. The voltage divider is a 1/10 divider that limits the maximum 10V input to the max 1V allowed on the ADC input of the esp8266. Then there is a first-order RC filter with a pole at ~1Hz to filter out any noise, as we only care about the DC offset.

As the opamp is powered with +10V, it means that if it’s a rail-to-rail opamp we can have as much as 10A on the load. The circuit is designed for a bit less, though. But let’s see the components.

Components

For this project I’ve decided to build a custom PCB instead of using a prototype breadboard and connecting the different components with solder. The reason for that was “low noise” for the ADCs and DACs. Well, that didn’t work out eventually, but still it’s a nice board. The main components I’ve used are:

ESP-12E

This is the main component. This is the esp8266 module with 4MB flash and the esp8266 core which can run up to 160MHz. It has two SPI interfaces, one used for the onboard flash and one that is free to use. Also, it has a 10-bit ADC channel which is limited to max 1V input signals. This is a great limitation and we’ll see later why. You can find this on ebay sold for ~1.5 EUR, which is dirt cheap.

ESP-12E

As you see, the vendor is AI-THINKER and the model is ESP8266MOD, and most of the time this is what you find on ebay, but you may also find modules with no markings.

MCP6H04

This is an IC with 4x low-noise, single-supply, rail-to-rail output opamps and a voltage range up to 16V, which is more than enough for the purpose. You can find the specs here. This costs around 0.85 EUR for a single unit at Arrow or other electronic component distributors. I wouldn’t buy these ICs from ebay.

FDP6030BL

This is an N-channel logic-level power MOSFET with a very low Rds(on) (0.018Ω @ 10V). We need a logic-level MOSFET because the Vgs threshold needs to be as low as possible. You can find all the specs here. Of course, you can use any other MOSFET you like as long as it has the same footprint. This one costs approx 1 EUR, but you can use a cheaper one.

MCP4821

This is a low-noise 12-bit DAC with an internal VREF of 2.048V, a 2x gain capability and an SPI interface. This thing is a beast. I love it. Easy to use and excellent precision for this kind of project. This costs ~1.8 EUR. Well, yeeeeeah it’s a bit expensive, I know. We’ll get there soon.

FT232RL

This is a USB to serial IC that also supports the extra hardware signals like DTR and RTS. These are actually needed to simplify the flashing procedure of the esp8266, so you don’t have to use jumpers to put the chip into programming mode. You can see the trick used in the schematics, you just need two transistors and two resistors. This costs ~4 EUR… Another expensive component. Not good, not good. There are also cheaper ones like the CP2102, but they are rare to find because they are very popular with the Chinese vendors that build those cheap USB to UART boards.

PSU

For the power supply we need a couple of things. Because the board is USB powered, we get the 5V for free from the USB. But we also need 3.3V and 10V. The first one is easy, we just need a voltage regulator like the AMS1117-3.3 (or any other variation of the 1117 you can find). For the +10V that is needed to drive the opamps it’s a bit more complicated; for this purpose you need a charge pump IC like the SP6661. This IC can be connected in a way that doubles its input voltage; therefore, if we use the +5V from the USB we can get +10V. The cost of the SP6661 is ~1.5 EUR.

Project source code

The source code is really simple, because I’ve used the arduino libs for the esp8266, therefore I didn’t write much code. Although the arduino libs are nice for prototyping and for doing quick stuff, I don’t like using them and I prefer to write everything baremetal. Actually, this is how I started the project, but at some point I figured out that it needed quite a lot of time for a stupid project. Therefore, I decided to go with the arduino libs.

Originally, I started with cnlohr’s esp82xx, which, although it’s an excellent template to start with and has quite a few nice things in there, like a custom small file system and websockets, is at the same time a bit tied to its default web interface, and it’s quite an effort to strip out all the unused stuff and add your own things. Therefore, after already spending some time with this template and building on top of it, I decided to go with the arduino libs for the esp8266, because in that case I just had to write literally a few lines of code.

You can find the source code here:
https://bitbucket.org/dimtass/esp8266-web-eload/src/master/

Also the kicad files are in there, so you can edit/order your own board. Read the README.md file for more info how to build the binary and upload the web server files.

Web interface and functionality

Before I built the PCB I tested most of the functionality on a breadboard, just to be sure that the main stuff is working.

Noise, noise, noise

Well, although the esp8266 has an ADC, there are two problems. First, its maximum input voltage is 1V, which makes it prone to noise from external interference. Second, it’s crap. When the esp8266 is using its WiFi radio to transmit, there’s a lot of noise in the ADC, so much noise that it makes the ADC readings completely unusable, even with external analog filters or software low-pass or IIR filters. The error can be as much as 4 bits and that’s total crap, especially at low ADC values. Here we can see an image showing that when the DAC value is 0, the ADC input is this crap signal.

That’s more than 200mV of noise. But, if I disconnect the RC filter output from the ADC and just probe it, then I get this output.

The noise drops to 78mV, which is still high, but much better than before. Therefore, you see that there’s a lot of noise in the system, and especially a lot of noise when the RC filter output is connected to the ADC, which means that the esp8266 itself creates a lot of noise. That’s really bad.
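
For reference, the software filtering I’m talking about is nothing fancy; a single-pole IIR low-pass (an exponential moving average) like the sketch below is the idea, and even that can’t save the readings when the radio is transmitting. The shift value is just an assumption, not the one used in the repo.

#include <stdint.h>

/* Single-pole IIR low-pass (exponential moving average) in fixed point.
 * ALPHA_SHIFT = 3 means alpha = 1/8; a bigger shift means heavier filtering. */
#define ALPHA_SHIFT 3

static uint32_t filt_acc;    /* accumulator holds the filtered value << ALPHA_SHIFT */

uint16_t adc_filter(uint16_t sample)
{
    /* y += (x - y) * alpha, done in fixed point */
    filt_acc = filt_acc - (filt_acc >> ALPHA_SHIFT) + sample;
    return (uint16_t)(filt_acc >> ALPHA_SHIFT);
}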

Anyway, the components are already expensive and the price is getting out of scope for a stupid project, therefore I prefer having this crap ADC input instead of spending more on a low-noise SPI 12-bit ADC. At the end of the article I will tell you what the best option is, and it’s one of my next stupid projects.

Web interface

That was what the whole project was made for. The web interface… I mean, OK, all the rest of the things could be made much less expensive and with better results in terms of noise, but let’s proceed with this. This is the web interface that is stored in the esp8266’s upper 1MB flash area.

The upper half of the screen is the ADC read value (converted to volts) and the lower half is the output value of the DAC (also converted to volts). The DAC value is very accurate and I’ve verified it with my Fluke 87V over the whole range. I’ve also added a libreoffice calc sheet with the measured values. In real time the ADC value changes constantly on the web interface, even with the RC filter and a software low-pass filter.

The web interface is using websockets in the background. There are many benefits to using websockets and you can find a lot of references on the internet. In the days before websockets, for similar projects, I had to use ajax queries, which are more difficult to handle and also can’t achieve the high-speed comms of websockets. There are a couple of excellent videos about websockets and the esp8266, like this one here.

To control the current that will flow through the load resistors you just need to drag the slider and select the value that you want. You should actually only trust the slider value and just use the ADC reading as an approximation, to verify that the Vgs is close to the wanted value.

This is the setup on my workbench that I’ve used to build the circuit and run the above tests.

You may see that I’ve soldered a pin header on the ADC pin of the esp8266 module. That’s because the Lolin esp8266 board has a voltage divider on the ADC pin in order to protect the input from the 3V3 you may apply, as this is the reference voltage for the board.

PCB board

As I’ve mentioned, I’ve created a PCB for that and ordered it from seeedstudio, which at the time had the cheapest price. So 5 PCBs, including shipping, cost me 30 EUR. This is a 3D image from kicad, but the real one is black. I’ll post it when I get it and assemble it.

As you can see, I’ve exported the same pinout as the Lolin board to make it usable also for other tasks or tests. The heatsink will be much larger, as 9A is quite a lot, but I couldn’t find a larger 3D model (I think there’s a scale factor for the 3D model in kicad when you assign the 3D object to the footprint, but anyway).

The maximum current can be set up to 4.095 * 2.2 = ~9A. This is the 2.048V VREF of the DAC, multiplied by 2 with the DAC’s gain and multiplied by 2.2 from the gain of the non-inverting opamp amplifier stage. Beware that if you intend to use such a high current then you need a very good cooling solution. One way to limit the current, in order not to make a mistake, is to change the code to use the 1x gain and also change the values in the eload.js javascript file in the web interface. That change will limit the current to ~4.5A.

Conclusion

Well, this was a really stupid project. It’s actually a failure in terms of cost and noise. It would be much better to use a microcontroller and replace most of the parts in there. For example, instead of using a DAC, the USB-to-serial IC or an external ADC for less noise, you could use a microcontroller that has all those things integrated and also costs much less. For example, a nice mcu for this would be the STM32F373CBT6, which has 16-bit ADCs and 12-bit DACs, supports USB device connection and has other peripherals to use, for an SPI display for example. This controller costs only around 5 EUR and it could replace a few parts. Also, you can implement a DFU bootloader to update both firmwares (STM32 and esp8266). Therefore, this might be a future stupid project upgrade. Actually, I’m trying to find an excuse to use the STM32F373CBT6 and build something around it…

Finally, the prototype worked quite nicely for me and I’ll wait for the board to arrive and test it with my PSU and see how accurate it is. I can’t see any reason why anyone should build this thing, but that’s what stupid projects are all about, right?

Have fun!

Adding armbian supported boards to meta-sunxi Yocto (updated)

Intro

Yocto is the necessary evil. Well, this post is quite different from the others, because it’s more related to stuff I do for a living, rather than for fun. Still, sometimes it can be fun, too; especially if you do something other than supporting new BSPs. Therefore, this post is not much about Yocto. If you don’t know what Yocto is, then you need to find another resource as this is not a tutorial. On the other hand, if you’re here because the title made you smile, then go on.

You probably already know about the allwinner meta layer, meta-sunxi. Although sunxi is great and they’ve done a great job, the supported boards are quite limited. On the other hand, armbian supports so many boards! But if you’re a Yocto-man then you know that this doesn’t help much. Therefore, I thought, why not port the u-boot and kernel patches to the meta-sunxi layer and build images that support the same allwinner boards as armbian?

And the result was this repo that does exactly that. Though it’s still a work in progress.

https://bitbucket.org/dimtass/meta-allwinner-hx/

This repo is actually a mix of meta-sunxi and armbian and only supports H2, H3 and H5 boards from nanopi and orange-pi. The README.md is quite detailed, so you don’t really need to read the rest of the post to bring it up and build your images.

More details please?

Yes, sure. Let’s see some more details. Well, most of the hard work is already done in armbian and meta-sunxi. In the armbian build, they have implemented a script to automatically patch the u-boot and the kernel, and all the patches are compatible with their patch system. Generally, the trick with armbian is that it actually deletes most of the files that it touches and applies new ones, instead of patching each file separately. Therefore, the patch sizes are larger but on the other hand they are much easier to maintain. It’s a neat trick.

The script that is used in armbian to apply the patches is in lib/compilation.sh. There you’ll find two functions, advanced_patch() and process_patch_files(), and these are the ones that we would like to port to the meta-sunxi layer. Other than that, armbian uses the files in config/boards/*.conf to apply the proper patches (e.g. default, next, dev). Those refer to patch/kernel. There, for example, you’ll find that sunxi has the sunxi-dev, sunxi-next and sunxi-next-old folders and inside each folder there are some patches. If you build u-boot and the kernel for a sunxi-supported board, then you’ll find in output/debug/output.log and output/debug/patching.log which patches are used for each board.

Therefore, I’ve just taken the u-boot and kernel patches from armbian and implemented the patching system in the meta layer. To keep things simple I’ve added the patch functions in both the u-boot and the kernel recipes, instead of implementing a bbclass that could handle both. Yeah, I know, Yocto has a lot of nice automations, but sometimes it’s not worth the trouble… So in my branch you’ll find the patch script in both recipes-bsp/u-boot/do_patches.sh and recipes-kernel/linux/linux-stable/do_patch.sh. Both scripts share the same code, which is the code that is also used by armbian. The patches for u-boot and the kernel are in a folder called patches in the same path as the scripts.

Last but not least, I’ve also added the option to create .wic.bz2 and .bmap images. Please use them if you want lightning-fast speed when you flash images to an SD card or eMMC.

Conclusion

If you want to use Yocto to build custom distributions/images for allwinner H2, H3 and H5, then you can use this meta layer. This is just a mix of the meta-sunxi layer and the patch system from armbian, which offers much wider board support. For now I’ve ported most of the nano-pi boards that use H2, H3 and H5 cpus and soon I’ll do the same for the orange-pi boards (update: done). For the rest of the boards (A10 etc.) you can still use the same layer.

Also, support for wic images and bmap-tools is good to have, so use it wherever you can.

Have fun!

Driving an ILI9341 LCD with an overclocked stm32f103 (updated)

Intro

LCDs… I think LEDs and LCDs are probably the most common wet dream of people who like playing with embedded. I mean, who doesn’t like blinking LEDs and, furthermore, who doesn’t like to draw graphics with a microcontroller on an LCD? If you don’t, then press Alt+F4 now, please.

LEDs are easy. Toggling a pin or flashing an LED is the first breath of every project. There you know that your microcontroller’s heart is beating properly and it doesn’t have arrhythmia. Then, driving a character LCD is a bit harder, but still easy; driving a graphic LCD, though, is definitely harder, especially if you’re starting from scratch. Do you have to start from scratch, though? Nah… There are many projects out there and this is just another one.

OK, so what’s the motivation behind this, if there are so many projects out there? For me it was the fact that I don’t like to use all this arduino-like stuff with my STMs, I don’t like HAL and I couldn’t find a proper cmake project that builds out of the box, without special dependencies like specific IDEs, compilers etc. With this project you just download cmake and a gcc compiler, point the cmake toolchain to your gcc compiler and run the build. Then it works… (maybe)

Of course, if you are a regular customer here, there’s no need to say that this is a completely stupid project. It does nothing. No, really. You won’t see any fancy graphics like those other people post with their STMs on youtube. You’ll just see a yellow screen. Why? Because I just wanted to benchmark the display and have a template to use for any other project.

Note: I’ve updated the code and post, because I’ve added support for the xpt2046/ads7843 touch controller. I’ve used SPI2 with DMA to read the touch sensor and also I’m using the /PENIRQ interrupt pin and not polling.

Overclocking, SPI, DMA and other fancy buzzwords

If you found this by searching on the web, then you’re probably here because you know exactly what you want. SPI & DMA!!1! The reason that I like the bluepill stm32 boards is that they have a lot of DMA channels and they are dirt cheap. On top of that you can overclock them up to 128MHz!

So, why is DMA important? Well, I won’t bother you with the details here; if you need to know more than “it’s much, much faster”, then you need to do some web searching for the specific details, as there are people that already explain this stuff better than me. The fact is that by using DMA on the SPI’s tx/rx the transfer speed sky-rockets and you can achieve the maximum available bandwidth.
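
To give you an idea of what that looks like in code, here’s a rough sketch of a DMA transfer on SPI1’s TX using the standard peripheral library (on the stm32f103, SPI1 TX is served by DMA1 channel 3). This is not the actual code from the repo, just the general idea; it assumes SPI1 is already configured and enabled, and a real driver would use the transfer-complete interrupt instead of busy-waiting.

#include "stm32f10x.h"

/* Send a buffer over SPI1 using DMA1 channel 3 (SPI1_TX). */
static void spi1_dma_send(const uint8_t *buf, uint16_t len)
{
    DMA_InitTypeDef dma;

    RCC_AHBPeriphClockCmd(RCC_AHBPeriph_DMA1, ENABLE);

    DMA_DeInit(DMA1_Channel3);
    DMA_StructInit(&dma);
    dma.DMA_PeripheralBaseAddr = (uint32_t)&SPI1->DR;
    dma.DMA_MemoryBaseAddr = (uint32_t)buf;
    dma.DMA_DIR = DMA_DIR_PeripheralDST;              /* memory -> SPI */
    dma.DMA_BufferSize = len;
    dma.DMA_MemoryInc = DMA_MemoryInc_Enable;         /* walk through the buffer */
    dma.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Byte;
    dma.DMA_MemoryDataSize = DMA_MemoryDataSize_Byte;
    dma.DMA_Priority = DMA_Priority_High;
    DMA_Init(DMA1_Channel3, &dma);

    SPI_I2S_DMACmd(SPI1, SPI_I2S_DMAReq_Tx, ENABLE);  /* let the SPI request DMA transfers */
    DMA_Cmd(DMA1_Channel3, ENABLE);

    /* busy-wait for the transfer-complete flag (an interrupt is nicer in practice) */
    while (DMA_GetFlagStatus(DMA1_FLAG_TC3) == RESET);
    DMA_ClearFlag(DMA1_FLAG_TC3);
    DMA_Cmd(DMA1_Channel3, DISABLE);
}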

On the other hand, overclocking is more interesting. The stm32f103 can be easily overclocked by just changing the PLL value. Then you can increase the main clock from 72MHz to 128MHz. Not bad, right? Especially if you think that a project which drives a graphic LCD will benefit a lot from this speed increase. I assume that you’re not crazy enough to do that on a commercial project, but if you do then you’re my hero.
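
In the project this is handled by the overclock_stm32f103() function; I won’t copy it here, but the idea is roughly the following SPL sketch, assuming the usual 8MHz HSE crystal of the bluepill (8MHz x16 = 128MHz). Keep in mind this runs the flash and the APB buses out of spec, so it’s strictly stupid-project material.

#include "stm32f10x.h"

/* Rough sketch of overclocking the sysclk to 128MHz (not the repo's exact code). */
uint32_t overclock_to_128mhz(void)
{
    RCC_HSEConfig(RCC_HSE_ON);
    if (RCC_WaitForHSEStartUp() != SUCCESS)
        return 0;

    /* run temporarily from HSE so the PLL can be re-programmed */
    RCC_SYSCLKConfig(RCC_SYSCLKSource_HSE);
    RCC_PLLCmd(DISABLE);

    /* 2 flash wait-states are needed above 48MHz */
    FLASH_SetLatency(FLASH_Latency_2);
    FLASH_PrefetchBufferCmd(FLASH_PrefetchBuffer_Enable);

    RCC_HCLKConfig(RCC_SYSCLK_Div1);    /* AHB  = 128MHz */
    RCC_PCLK2Config(RCC_HCLK_Div1);     /* APB2 = 128MHz (SPI1 lives here) */
    RCC_PCLK1Config(RCC_HCLK_Div2);     /* APB1 = 64MHz, still above the 36MHz spec */

    /* 8MHz HSE x16 = 128MHz */
    RCC_PLLConfig(RCC_PLLSource_HSE_Div1, RCC_PLLMul_16);
    RCC_PLLCmd(ENABLE);
    while (RCC_GetFlagStatus(RCC_FLAG_PLLRDY) == RESET);

    RCC_SYSCLKConfig(RCC_SYSCLKSource_PLLCLK);
    while (RCC_GetSYSCLKSource() != 0x08);  /* wait until the PLL is the sysclk */

    return 128000000;
}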

In this project I’ve done some benchmarking with two different system clocks, @72MHz and @128MHz, and there’s a significant difference, as you’ll see later in the benchmarks.

Components

STM32

I’m using an stm32f103c8t6 board (bluepill). These modules cost less than €2 on ebay and you may have already seen me using them in other stupid projects.

ILI9341

This is a very standard LCD module that you can find on ebay and it costs around $7. It’s a 2.8″ TFT, supports 240×320 resolution, and it has a touch interface and an SD card holder. The part name is TJCTM24028-SPI and it’s the following one:

It’s a beauty, right?

USB-uart module

You need this to print the FPS count every second and also if you want to add your custom commands. You can find these on ebay for less than €1.50 and it looks like this:

ST-Link

Finally, you need an ST-Link programmer to upload the firmware, like this one:

Or whatever programmer you like to use.

Pinout connections

As the LCD is not wireless, you need to connect it somehow to your stm32. We’re lucky in this case, because both the LCD and the stm32 have conductive pins and if they’re connected with each other in the proper way then it may work. The connections you need to make are:

STM32 ILI9341 (LCD)
PA0 LED
PA2 RESET
PA3 D/C
PA4 CS
PA5 SCK
PA6 SDO(MISO)
PA7 SDI(MOSI)
3.3 VCC
GND GND
STM32 ILI9341 (touch controller)
PB8 T_IRQ
PB14 T_DO
PB15 T_DIN
PB12 T_CS
PB13 T_CLK
STM32 UART module
PA9 (TX) RX
PA10 (RX) TX
GND GND

You can power the stm32 from the USB connector.

Project source code

You can download the project source code from here:

https://bitbucket.org/dimtass/stm32f103-ili9341-dma/src/master/

All you need to do is install (or have already installed) a gcc toolchain for ARM. I’m using the gcc-arm-none-eabi-7-2017-q4-major, which you can find here. Just scroll down a bit, because there are newer toolchains; but from the tests I’ve done, it seems this one produces the most compact code. Then, depending on your OS, change the path of the toolchain in the TOOLCHAIN_DIR variable in the project’s cmake/TOOLCHAIN_arm_none_eabi_cortex_m3.cmake file. Last, run ./build.sh on Linux or build.cmd on Windows to build, and then flash the bin/hex to the stm32.

Results

The SPI of the stm32f103, by using DMA, can achieve an SPI clock up to 36MHz when the mcu is running at the default highest frequency, which is 72MHz. That’s really fast already compared to other Cortex-M3 mcus with the same frequency (even faster mcus). To use the default 72MHz clock you need to comment out line 47 in main.c here, otherwise the clock will be set to 128MHz.

This is a capture of CLK/MOSI when sending the byte 0xAA.

With the above clock settings, the stm32f103 achieves 29 FPS when drawing all the pixels of the screen.

By overclocking the mcu to 128MHz the SPI/DMA speed is much higher. To enable the overclocking you need to uncomment line 47 here (SystemCoreClock = overclock_stm32f103();), which by default is already enabled.

This is a capture of CLK/MOSI when sending the byte 0xAA.

Now you can see that the SPI clock frequency is ~63MHz, which is almost double the previous one. That means that updating all the pixels on the screen can be done at a rate of 52 fps, which is quite amazing for this €2 board.

Touch controller (UPDATE)

I lost my sleep knowing that I hadn’t implemented the touch control interface, therefore I’ve updated the code and added support for the touch controller and also a calibration routine.

So, now there are two operation modes. The first one (and the default) is the benchmark mode and the other is the calibration mode. During the calibration mode you can calibrate the sensor, and you need to do that if you want to retrieve pixel x/y values. Without calibration you’ll only get the adc sensor values, which may or may not be what you want.

To enter the calibration mode you need to send a command to the uart port. The supported commands are the following:

MODE=<MODE>
     BENCH : Benchmark mode
     CALIB : Calibration mode

FPS=<MODE>
    0 : Disable FPS display
    1 : Tx FPS on UART
    2 : Tx FPS on UART and on display

TOUCH=<MODE>
      0 : Do not Tx X/Y from touch to UART
      1 : Tx X/Y from touch to UART

The default values are MODE=BENCH, FPS=0, TOUCH=0. Therefore, to enter the calibration mode send this command to the UART: MODE=CALIB.

The calibration routine is very simple, so do not expect fancy handling in there. Even the switch statement in `ili9341_touch_calib_start()` is not needed, as it’s a completely serial process. I was about to implement a state machine, but it wasn’t worth it, so I left the switch in there.

So, when you enable the calibration mode, you’ll get this screen.

Then you need to press the center of the cross. Behind the scenes, this code is in the ili9341_touch_calib.c file, inside the ili9341_touch_calib_start() function, which at the state STATE_DRAW_P1 draws the screen and then waits in STATE_WAIT_FOR_P1 for the user to press the cross. I’ve added de-bouncing in the xpt2046_polling() function, but generally the xpt2046 library doesn’t have any. So the xpt2046_update() function, which updates the static X/Y variables of the lib, doesn’t have de-bouncing. The reason is that this provides a more generic solution and de-bouncing is not always wanted; if it is, it can be implemented easily.
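
For example, a de-bounce can be as trivial as a small counter around the raw pen-down state; the following sketch is just to illustrate the idea, it’s not the code from xpt2046_polling().

#include <stdint.h>

#define DEBOUNCE_COUNT 5

/* returns the de-bounced state of the raw /PENIRQ sample (1 = pressed) */
uint8_t touch_debounce(uint8_t raw_pressed)
{
    static uint8_t counter;
    static uint8_t state;

    if (raw_pressed == state) {
        counter = 0;             /* no change, reset the counter */
    } else if (++counter >= DEBOUNCE_COUNT) {
        state = raw_pressed;     /* stable for N samples, accept the new state */
        counter = 0;
    }
    return state;
}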

Anyway, after pressing the point on the screen, another 2 points will appear and after that the code will calculate the calibration data. The calibration is needed if you want to get the touch presses expressed in pixels that correspond to the screen pixels. Otherwise, the touch sensor only returns 12-bit ADC values, which are not very useful if you need to retrieve the pixel location. Therefore, by calibrating the touch sensor surface to the LCD, you can get the pixel coordinates that are just under the pressure point.

The algorithm for that is in the touch_calib.c file and it actually derives from a TI application note called “Calibration in touch-screen systems”, which you can probably find at this link. The only thing worth noting is that there are two main calibration methods, the 3-point and the 5-point. Most of the time you find the 5-point calibration method, but the 3-point also gives a good result. There’s a calc file in the sources (source/libs/xpt2046-touch/calculations.ods) that you can use to emulate the algorithm results. In this file, (X’1,Y’1), (X’2,Y’2), (X’3,Y’3) are the points that you read from the sensor (12-bit ADC values) and (X1,Y1), (X2,Y2), (X3,Y3) are the pixels in the center of each cross you draw on the screen. Therefore, in the ods file I’ve just implemented the algorithms of the pdf file. The same algorithms are in the code.
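
To give you the idea, the 3-point method just solves for the six coefficients of an affine transform X = A*x + B*y + C, Y = D*x + E*y + F from the three point pairs. A minimal sketch of that math (not the exact code from touch_calib.c) looks like this:

#include <stdio.h>

/* (x,y) are the three raw 12-bit ADC samples, (X,Y) the known LCD pixel
 * coordinates of the three crosses. Solves X = A*x + B*y + C, Y = D*x + E*y + F. */
typedef struct { float A, B, C, D, E, F; } calib_t;

int calib_3point(const float x[3], const float y[3],
                 const float X[3], const float Y[3], calib_t *c)
{
    float det = (x[0] - x[2]) * (y[1] - y[2]) - (x[1] - x[2]) * (y[0] - y[2]);
    if (det == 0.0f)
        return -1;    /* the three points are collinear, calibration failed */

    c->A = ((X[0] - X[2]) * (y[1] - y[2]) - (X[1] - X[2]) * (y[0] - y[2])) / det;
    c->B = ((x[0] - x[2]) * (X[1] - X[2]) - (x[1] - x[2]) * (X[0] - X[2])) / det;
    c->C = X[0] - c->A * x[0] - c->B * y[0];

    c->D = ((Y[0] - Y[2]) * (y[1] - y[2]) - (Y[1] - Y[2]) * (y[0] - y[2])) / det;
    c->E = ((x[0] - x[2]) * (Y[1] - Y[2]) - (x[1] - x[2]) * (Y[0] - Y[2])) / det;
    c->F = Y[0] - c->D * x[0] - c->E * y[0];
    return 0;
}

/* convert a raw touch sample to LCD pixel coordinates */
void calib_apply(const calib_t *c, float x, float y, int *px, int *py)
{
    *px = (int)(c->A * x + c->B * y + c->C);
    *py = (int)(c->D * x + c->E * y + c->F);
}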

After you finish with the calibration, the calibration data are printed out on the serial port and they are also stored in RAM, where they are used for the touch presses. After a reset those settings are gone. If you want to print the ADC and pixel values (after the calibration) on the serial port, then you can send the TOUCH=1 command on the UART. After that, every touch will display both values.

Finally, if you enable showing the FPS on the LCD you will see something weird.

This displays the FPS rate of drawing all the pixels on the screen and you see tearing in the number, which is 51. This happens because the number print overwrites that part of the screen when it is displayed and then it also gets overwritten by the full display draw. The reason for that is to simplify the process and also not to insert more delay while doing the maths.

Generally, I think the code is straightforward and it serves just as a template for someone to start writing their own code.

Conclusion

The stm32f103 rocks. That’s the conclusion. Of course, getting there is not as easy as using an arduino or a teensy board, but the result pays off for the difficulty of getting there. And also, once you get there, it’s not difficult anymore.

Again, this is a really stupid and completely useless project. You won’t see any fancy graphics with this code, but nevertheless it’s a very nice template for creating your own graphics, using optimized SPI/DMA code for the stm32f103, with the overclocking included. That’s a good deal.

Of course, I didn’t write the whole code from scratch, I’ve ported some code from here, which is a port of the Adafruit library for the arduino. So it’s a port of a port. FOSS is a beautiful thing, right?

Have fun!

Added macros to CuteCom

Intro

For the last few years I’ve been using only Linux at my workplace and, since I’ve also started using only Linux at home too, I’ve found myself missing some tools that I was using on Windows. That’s pretty much the case with everyone that at some point tries or tried the same thing. Gladly, because more and more people have been doing this over the last years, there are many alternatives for most of the tools. The alternatives can be either better or worse than the tool you were using, of course, but the best thing with FOSS is that you can download the code and implement all the functionality you’re missing yourself. And that’s great.

Anyway, one of the tools I’ve really missed is br@y’s terminal. I assume that every bare metal embedded firmware developer is aware of this amazing tool. It’s everything you need when you develop firmware for a micro-controller. For embedded Linux I prefer putty for the serial console, though. Anyway, this great tool is only for Windows and, although you can use Wine to run it on Linux, soon you’ll find out that when you develop USB CDC devices the whole wine/terminal thing doesn’t work well.

CuteCom

There are many alternative terminal apps for Linux (console and gui) and I’ve used most of those you can find in the first 7 pages of google results. But after using Bray’s terminal for so many years, only one of them seems to be close enough to it, and that’s CuteCom. The nice thing with CuteCom is that it’s written in Qt, so it’s a cross-platform app, and also Qt is easy and nice to write code in.

Tbh, I’ve been familiar with Qt since the Trolltech era and then the Nokia era. I’ve written a lot of code in Qt, especially for the Nokia n900 phone and the Maemo OS. But since Nokia abandoned Maemo and also MeeGo (which later became Tizen), I’ve started doing other stuff. I was really disappointed back then, because I believed the n900 and Maemo could be the future, until everything went wrong and Nokia abandoned everything and adopted Windows for their mobiles. I’ll moan another time about how much Microsoft loves Linux.

Anyway, Qt may have also affected my decision to go with CuteCom, but the problem was that the functionality I was using most in Bray’s terminal wasn’t there. And I mean the macros. Let me explain what macros are. Macros are just predefined data that you can send over the serial port by pressing the corresponding macro button. You can also attach a timer to every macro and send it automatically at a programmable interval in milliseconds. That’s pretty much all you need when you’re developing a firmware. But this functionality was not implemented in CuteCom yet.

Therefore, I had to implement it myself and also find an excuse to write some Qt again.

Result

I've forked CuteCom on github and added the macro functionality here:

https://github.com/dimtass/CuteCom

I've made a pull request, but I can't tell if it will get merged or not. But anyway, if you are a macro lover like myself, you can download it from the above fork.

Edit: Macros are now merged to the master git branch, thanks to Meinhard Ritscher.

https://github.com/neundorf/CuteCom

I'll add here a couple of notes on how to build it, because it's not very clear from the README file. You can either clone the repo and use Qt Creator to load the project and build it, or you can use cmake. In case you use cmake, you need the Qt libs and headers (version >= 5) on your system.

If you don’t have Qt installed then you need to do the following (tested on Ubuntu 18.04):

git clone https://github.com/neundorf/CuteCom
cd CuteCom
sudo apt install cmake qtbase5-dev libqt5serialport5-dev
cmake .
make
sudo make install

This builds and installs cutecom in /usr/local/bin/cutecom. Then you can create a desktop launcher:

gedit ~/.local/share/applications/CuteCom.desktop

And add the following:

#!/usr/bin/env xdg-open
[Desktop Entry]
Version=1.0
Type=Application
Terminal=false
Icon[en_US]=cutecom
Name[en_US]=CuteCom
Exec=/usr/local/bin/cutecom
Comment[en_US]=Terminal
Name=CuteCom
Comment=Terminal
Icon=cutecom

If you have installed another Qt SDK then you can just point cmake there and build like this:

cmake . -DCMAKE_PREFIX_PATH=/opt/Qt/5.x/gcc_64
make
sudo make install

This will be installed in `/usr/local/bin/cutecom` (try also `which` just to be sure…)

Finally, you’ll need a desktop icon or a launcher. For Ubuntu you can create a `CuteCom.desktop` file in your `~/.local/share/applications` path and paste the following:

#!/usr/bin/env xdg-open
[Desktop Entry]
Version=1.0
Type=Application
Terminal=false
Icon[en_US]=cutecom
Name[en_US]=CuteCom
Exec=env LD_LIBRARY_PATH=/opt/Qt/5.11.1/gcc_64/lib /usr/local/bin/cutecom
Comment[en_US]=Terminal
Name=CuteCom
Comment=Terminal
Icon=cutecom

The result should look like this:

Have fun!

Joystick gestures w/ STM32

Intro

Time for another stupid project, which adds no value to humanity! This time I got my hands on one of these dirt-cheap analog joysticks from ebay that cost less than €1.5. I guess you can make a lot of projects using them as normal joysticks, but for some reason I wanted to do something more pointless than that. And that was to make a joystick that outputs gestures via USB.

By gestures I mean that, instead of sending the ADC values in realtime, the device only supports the basic directions like up, down, left, right and button press, and then sends the gesture combinations through USB. If you're using mouse gestures in your web browser, then you know what I mean.

OK, let’s see the components and result.

Components

STM32

I'm using an stm32f103c8t6 board. These modules cost less than €2 on ebay and you may have already seen me using them in other stupid projects.

Joystick

You can find those joysticks on ebay if you search for a joystick breakout for arduino. Although they're cheap, the quality is really nice and the stick feel is nice, too. This is how it looks:

As you can see from the image there is a +5V pin, which you need to connect to your micro-controller's Vcc (which is 3V3 for the stm32 in this case, not necessarily +5V). The VRx pin is the x-axis variable resistor, the VRy is the y-axis variable resistor and the SW is the button. The switch output is activated when you press the joystick down. The orientation of the x,y axes is valid when you hold the joystick in your palm so that you can read the pin descriptions.

ST-Link

Finally, you need an ST-Link programmer to upload the firmware, like this one:

Or whatever programmer you like to use.

USB-uart module

You don't really need this for the project, but if you want to debug or add some debugging messages of your own, then you'll need it. You can find these on ebay for less than €1.50 and it looks like this:

Making the stupid project

I've built the project on a breadboard. You can use a prototyping board if you want to make this a permanent build. I've added support for both USB and UART in the project. The USB, of course, is the easiest and preferred way to connect the device to your computer, but the UART port can be used for debugging. So, let's start with a simple schematic of how everything is connected. This is a screenshot from KiCad.

Therefore, PA0 and PA1 are connected to VRx and VRy and they are set as ADC inputs. In the source code I'm using both the ADC1 and ADC2 channels at the same time. The ADC1 channel also uses DMA, which is not really necessary as the conversion rate doesn't need to be that fast, but I'm re-using code that I've already written for other projects. The setup of the ADCs is in the hw_config.c file in the source code. The ADCs continuously convert the VRx and VRy inputs in the background, as they are interrupt-driven, but the function joys_update() only updates the algorithm with the last valid values every JOYS_UPDATE_TMR_MS milliseconds. The default update rate is 10ms, but you can trim it down to 1ms if you like. You can also have a look at the joys_update() function in joystick.c and trim JOYS_DEBOUNCE_CNTR and JOYS_RECOGNITION_TIME_MS to your needs. The first one controls the debounce sensitivity and the second one the timeout of the gesture, i.e. the time in ms after which, once the joystick is released back to the center position, the recognition timer expires and the recorded gesture is sent.
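To give a rough idea of what such an update routine does, here's a minimal sketch in plain C. It is not the actual code from the repo; only the joys_update() name and the JOYS_* defines come from the project, while the thresholds, the decode() helper, the send_gesture() stub and the little host-side main() are made up for the example:

/* Simplified sketch of a joys_update()-style routine (not the repo code). */
#include <stdio.h>
#include <stdint.h>

#define JOYS_UPDATE_TMR_MS        10    /* how often joys_update() is called */
#define JOYS_DEBOUNCE_CNTR        3     /* identical samples needed to accept a direction */
#define JOYS_RECOGNITION_TIME_MS  1000  /* idle time in the center before the gesture is sent */

enum joys_dir { DIR_NONE, DIR_UP, DIR_DOWN, DIR_LEFT, DIR_RIGHT };

static char gesture[16];
static uint8_t gesture_len;
static uint16_t idle_ms;

/* stub: in the real firmware this would go out over USB CDC or UART */
static void send_gesture(const char *g) { printf("gesture: %s\n", g); }

static enum joys_dir decode(uint16_t x, uint16_t y)
{
    /* 12-bit ADC, stick centered around ~2048; thresholds are arbitrary */
    if (y > 3000) return DIR_UP;
    if (y < 1000) return DIR_DOWN;
    if (x < 1000) return DIR_LEFT;
    if (x > 3000) return DIR_RIGHT;
    return DIR_NONE;
}

/* called every JOYS_UPDATE_TMR_MS ms with the latest ADC values */
void joys_update(uint16_t adc_x, uint16_t adc_y)
{
    static enum joys_dir last = DIR_NONE;
    static uint8_t stable_cnt;
    enum joys_dir now = decode(adc_x, adc_y);

    if (now != DIR_NONE) {
        idle_ms = 0;
        /* debounce: accept a direction only after N identical samples */
        stable_cnt = (now == last) ? (uint8_t)(stable_cnt + 1) : 0;
        if (stable_cnt == JOYS_DEBOUNCE_CNTR && gesture_len < sizeof(gesture) - 1)
            gesture[gesture_len++] = "NUDLR"[now];
    } else if (gesture_len) {
        /* stick is back in the center: wait for the recognition timeout */
        idle_ms += JOYS_UPDATE_TMR_MS;
        if (idle_ms >= JOYS_RECOGNITION_TIME_MS) {
            gesture[gesture_len] = '\0';
            send_gesture(gesture);   /* e.g. "UDLR" for up-down-left-right */
            gesture_len = 0;
        }
    }
    last = now;
}

/* quick host-side test: push the stick up, then release it */
int main(void)
{
    for (int i = 0; i < 10; i++)  joys_update(2048, 3500);  /* up */
    for (int i = 0; i < 110; i++) joys_update(2048, 2048);  /* center, wait for timeout */
    return 0;
}

In the actual firmware the function is of course fed from the DMA'd ADC values by a timer, not from a main() loop like this.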

The source code can be found here:
https://bitbucket.org/dimtass/stm32f103-usb-joystick

To build the code you need cmake and to flash it you need an ST-Link. Have a look at the README.md file in the repo for details. You also need to point the build to your arm toolchain.

Because I'm using the -flto -O3 flags, you need to make sure that you use a GCC version newer than 4.9.

I’ve tested the code with this version:

arm-none-eabi-gcc (GNU Tools for Arm Embedded Processors 7-2017-q4-major) 7.2.1 20170904 (release) [ARM/embedded-7-branch revision 255204]

The code size is ~14.6KB.

Finally, this is an example video of how the joystick gesture performs.

Have fun!

GCC compiler size benchmarks

Intro

Compilers, compilers, compilers…

The black magic behind producing executable binaries for different kinds of processors. All programmers use them, but most don't care about the internals and their differences. Anyway, this post is not about compiler internals, but about how the different versions perform regarding the size of the produced binary.

I've made another benchmark a few months ago here, but that was using different compilers (GCC and clang) and different C libraries. Now I'm using only GCC, but in different versions.

Size doesn’t matter!

Well, don't get me wrong here, but sometimes it does. A typical scenario is when you have a small microcontroller with a small flash size and your firmware is getting bigger and bigger. Another scenario is when you need to sacrifice some flash space for a DFU bootloader and then you realize that 4-12K are gone without writing a single line of code for your actual app.

Therefore, size does matter.

Compiler Flags

Compilers come with different optimisation flags and the -Os flag instructs the compiler to optimise specifically for size.

OK, so the binary size matters only when you use -Os!

No, no, no. The binary size matters whatever optimisation flag you use. For example, your main need may be to optimise for performance; say you're toggling a gpio fast, e.g. implementing a custom bit-banging bus to program and interface an FPGA (like Xilinx's SelectMAP). In this case you may need the -O1/2/3 optimisations more than -Os, but the size still matters because you're limited in flash space. So, two different compiler versions may show even a 1KB difference for the same optimisation level, and that 1KB may someday be critical for one of your projects!

And don't forget about -flto! This is an important flag if you need size optimisation; therefore, all the benchmarks are done both with and without this flag.

Benchmarking

I’ve benchmarked the following 9 different GCC compiler versions:

  • gcc-arm-none-eabi-4_8-2013q4
  • gcc-arm-none-eabi-4_9-2014q4
  • gcc-arm-none-eabi-5_3-2016q1
  • gcc-arm-none-eabi-5_4-2016q2
  • gcc-arm-none-eabi-5_4-2016q3
  • gcc-arm-none-eabi-6_2-2016q4
  • gcc-arm-none-eabi-6-2017-q1-update
  • gcc-arm-none-eabi-6-2017-q2-update
  • gcc-arm-none-eabi-7-2017-q4-major

It turned out that all the GCC6 compilers performed exactly the same; therefore, without reading the release notes I assume that the changes have to do with fixes rather than optimisations.

The code I've used for the benchmarks is here:
https://bitbucket.org/dimtass/stm32f103-usb-periph-expander

This is my next stupid project and it's not completed yet, but it still compiles and without optimisations it creates a ~50KB binary. To use your own toolchain, just change the toolchain path in the `TOOLCHAIN_DIR` variable in the `cmake/TOOLCHAIN_arm_none_eabi_cortex_m3.cmake` file and run ./build.bash on Linux or build.cmd on Windows.

Results

These are the results from compiling the code with different compilers and optimisation flags.

gcc-arm-none-eabi-4_8-2013q4

flag    size in bytes    size in bytes (-flto)
-O0     51908            -
-O1     32656            -
-O2     31612            -
-O3     39360            -
-Os     27704            -

gcc-arm-none-eabi-4_9-2014q4

flag    size in bytes    size in bytes (-flto)
-O0     52216            56940
-O1     32692            23984
-O2     31496            22988
-O3     39672            31268
-Os     27563            19748

gcc-arm-none-eabi-5_3-2016q1

flag    size in bytes    size in bytes (-flto)
-O0     51696            55684
-O1     32656            24032
-O2     31124            23272
-O3     39732            30956
-Os     27260            19684

gcc-arm-none-eabi-5_4-2016q2

flag    size in bytes    size in bytes (-flto)
-O0     51736            55724
-O1     32672            24060
-O2     31144            23292
-O3     39744            30932
-Os     27292            19692

gcc-arm-none-eabi-5_4-2016q3

flag    size in bytes    size in bytes (-flto)
-O0     51920            55888
-O1     32684            24060
-O2     31144            23300
-O3     39740            30948
-Os     27292            19692

gcc-arm-none-eabi-6_2-2016q4, gcc-arm-none-eabi-6-2017-q1-update, gcc-arm-none-eabi-6-2017-q2-update

flag    size in bytes    size in bytes (-flto)
-O0     51632            55596
-O1     32712            24284
-O2     31056            22868
-O3     40140            30488
-Os     27128            19468

gcc-arm-none-eabi-7-2017-q4-major

flag    size in bytes    size in bytes (-flto)
-O0     51500            55420
-O1     32488            24016
-O2     30672            22080
-O3     40648            29544
-Os     26744            18920

Conclusion

From the results it’s pretty obvious that the -flto flag makes a huge difference in all versions except GCC4.8 where the code failed to compile at all with this flag enabled.

Also, it seems that when no optimisations are applied (-O0), then -flto, instead of optimising for size, actually creates a larger binary. I have no explanation for that, but anyway it doesn't really matter, because there's no point in using -flto at all in such cases.

OK, so now let's get to the point. Is there any difference between GCC versions? Yes, there is, but you need to look at it from different angles. For the -Os flag it seems that GCC7-2017-q4-major produces a binary which is ~380 bytes smaller without -flto and ~550 bytes smaller with -flto compared to the second best GCC version (GCC6). That means GCC7 will only save you from switching to a part with a bigger flash if your GCC6 firmware exceeds the available size by that much. But what are the chances, right? We're not talking about an 8051 here…

But wait… let's see what happens with -O3. In this case, using the -flto flag, GCC7 creates a binary which is about 1KB smaller compared to the GCC6 version. That's big enough and it may save you from switching to a larger part! Therefore, the size also matters for the other optimisation levels like -O3. This also means that if your code size is getting larger and you need the maximum performance optimisation, then the compiler version may be significant.

So, why not always use the latest GCC version?

That's a good question. Well, if you're writing your software from scratch now, then probably you should. But if you have an old project which compiles with an old GCC version, that doesn't mean it will also compile with -Wall on the newer version. That's because between those two versions there might be some new warnings and errors that don't allow the build. Hence, you need to edit your code and correct all the warnings and errors. If the code is not that big, the effort may not be much; but if the code is large, then you may need to spend a lot of time on it. It's even worse if you're porting code that is not yours.

Therefore, the compiler version does matter for the binary size with all the available optimisation flags, and depending on your code size and processor, you might need to choose between those versions according to your needs.

Have fun!

Why I moved away from github

I bet in every language in the world there's a phrase similar to this one.

Opinions are like butt holes. Everyone has one, but nobody wants to hear about yours.

Anyway, that's my blog, so here's my butt-hole opinion.

I don't really hate Microsoft. I mean, I may speak dirty about them sometimes; but that's my temperament, I don't have anything against them. I've been using Microsoft products since 1985. My first OS was MS-DOS 2.0, running on a Schneider PC with 640KB RAM and a 3.5″ 720K FDD. It was a beast! I was coding GW-Basic on my yellow-black CRT and enjoying my animations on the Hercules video card. Since then, I've seen and used almost every MS product. I enjoyed Windows 95, 98, NT, XP and Office 2003 a lot. Nowadays, I'm not using their products that much, except VS Code. Windows 10 also seems to be OK'ish, but there are many things that I don't like in it. To me it seems that, when you need to do more advanced things, it's like an NT relic with a 2018 makeup. So, generally I have a meh opinion of them, but that's all.

So, why did I decide to leave GitHub? Well, firstly it was how it felt when I read the headline. When I read the news I was like "WTF? Why?" It left a bitter taste on the tip of my tongue. Usually, that's enough for me to make my decisions. But this time I said to myself, OK, let's analyze this a bit.

Therefore, I asked myself:

Let's say that at this moment you decide to create a new repository and you have 3-4 options, one of them being Microsoft. Would you choose it?

Well, the answer for me is, No. No, I wouldn’t choose a Microsoft service. I don’t want to. I don’t like to add another Microsoft product/service to my life. Thanks.

Then, why remain on GitHub at all?

There are also other things that are important, like the kind of information Microsoft will gain access to. I don't want Microsoft to trace what I do in my free time and which projects I wish to develop or contribute to. I'm almost certain that they will do data-mining on every bit of data they have available. It's even worse for Microsoft employees, because now their employer will have access to that information.

Anyway, there are so many different butt holes around this story that there's not enough room for all of them. But for me personally, it was my first reaction when I heard the news that mostly affected my decision, and that was:

Oh, no! No more Microsoft in my life.

My repos are now located here:
https://bitbucket.org/dimtass

container_of()

Intro

A small post about one of the more beautiful and useful macros in C. I consider the container_of() macro to be the equivalent of Euler's identity for the C language. Just as Euler's identity is considered an example of mathematical beauty, container_of is considered an example of C programming beauty. It has everything in there and it's simpler than it looks. So, let's see how it works.

Explanation

Actually there are many explanations out there of how container_of works; some of them are good, some just waste internet space. I'll give my own explanation, hoping that it won't waste more internet space.

The container_of macro is defined in several places in the Linux kernel. This is the macro.

#define container_of(ptr, type, member) ({ \
    const typeof(((type *)0)->member) * __mptr = (ptr); \
    (type *)((char *)__mptr - offsetof(type, member)); })

That's a mess, right? So, let's start breaking things up. First of all, there are two other things in there that need explanation: typeof and offsetof.

typeof is a compiler extension. It's not a function and it's not a macro. All it does is, at compile time, evaluate to (i.e. get replaced by) the type of the variable given to typeof(). For example, consider this code:

int tmp_int = 20;

Then typeof(tmp_int) is int, so everywhere you use typeof(tmp_int) the compiler will replace it with the int keyword. Therefore, if you write this:

typeof(tmp_int) tmp_int2 = tmp_int;

then, the compiler will replace the typeof(tmp_int) with int, so the above will be the same as writing this:

int tmp_int2 = tmp_int;

offsetof is another beautiful macro that you will find in the Linux kernel, and it's also defined in several places. This is the macro:

#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

The purpose of this macro is to retrieve the offset of a member variable inside a structure. Let's make that simpler by considering the following image.

This is a struct that has a green and a blue member. We can write this as:

struct x {
     green a;
     blue  b;
};

Suppose that the tape measure is the RAM and each cm of the tape is a byte; then if I ask what's the offset of the blue member in the RAM, the answer is obvious and it's 120. But what's the offset of the blue member inside the structure? In that case you need to calculate it by subtracting 118 from 120, because 120 is the offset of the blue member on the tape (RAM) and 118 is the offset of the structure on the tape (RAM). So a subtraction is needed to calculate the relative offset, which is 2.

Now, let's do this.

What happened now? You see it's the same structure, but now I've slid the tape measure so that the offset of the struct starts from zero. Now if I ask what's the offset of the blue member, the answer is obvious and it's 2.

Now that the struct offset is "normalized", we don't even care about the size of the green member or the size of the structure, because now the absolute offset is the same as the relative offset. This is exactly what &((TYPE *)0)->MEMBER does: it pretends the struct is located at the zero offset of the memory and takes the address of the member there.

Dereferencing a null pointer is generally not a clever thing to do, but in this case the code is not executed or evaluated. It's just a trick like the one I've shown above with the tape measure. The offsetof() macro just returns the offset of the member relative to zero. It's just a number and no memory is accessed. Therefore, with this trick the only thing you need to know is the type of the structure.

Also, note that the 0 dereference doesn’t declare a variable.
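If you want to play with the macro on your host machine, here's a minimal sketch; the struct x with the int members stands in for the green/blue struct of the tape measure example (note that the standard C library also provides offsetof() in stddef.h):

#include <stdio.h>
#include <stddef.h>  /* the standard offsetof() lives here */

struct x {
    int a;   /* the "green" member */
    int b;   /* the "blue" member */
};

int main(void)
{
    /* no instance of struct x is needed; the offsets are just compile-time numbers */
    printf("offsetof(struct x, a) = %zu\n", offsetof(struct x, a));
    printf("offsetof(struct x, b) = %zu\n", offsetof(struct x, b));
    return 0;
}

On a typical machine this prints 0 and 4, which are the relative offsets of the two members inside the struct.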

OK, so now let's go back to the container_of() macro and have a look at this line:

const typeof(((type *)0)->member) * __mptr = (ptr);

Here ((type *)0)->member doesn't declare or point to a variable and it's not an instance. It's a compiler trick to refer to the member at its offset, as I've explained before. The compiler, by knowing the structure type and the offset of the member inside the structure, also knows the type of the member at that position. Therefore, using the tape measure example, typeof() of the member at offset 2, when the struct is dereferenced at 0, is blue.

So the code for the example with the tape measure becomes:

const typeof(((type *)0)->member) * __mptr = (ptr);
const blue * __mptr = (ptr);

and

(type *)((char *)__mptr - offsetof(type, member));

becomes

(struct x *)((char *)__mptr - 2);

The above means that the address of the blue member minus the relative offset of blue is cast to a pointer to struct x. If you subtract the relative offset of the blue member from the address of the blue member, you get the absolute address of the struct x.

So, let’s see the container_of() macro again.

#define container_of(ptr, type, member) ({ \
    const typeof(((type *)0)->member) * __mptr = (ptr); \
    (type *)((char *)__mptr - offsetof(type, member)); })

Think about the tape measure example and try to evaluate this:

container_of(120, x, blue)

This means that we want to get a pointer to the absolute address of struct x when we know that at position 120 we have a blue member. The container_of() macro will take the offset of the blue member (which is located at 120) and subtract the relative offset of blue inside the x struct. That evaluates to 120-2=118, so we get the offset of the x struct just by knowing the offset of its blue member.
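To tie it all together, here's a minimal compilable sketch (gcc/clang only, because of typeof and the statement expression); the struct and variable names are made up for the example and the macro is the one shown above:

#include <stdio.h>
#include <stddef.h>  /* offsetof() */

#define container_of(ptr, type, member) ({ \
    const typeof(((type *)0)->member) * __mptr = (ptr); \
    (type *)((char *)__mptr - offsetof(type, member)); })

struct x {
    int a;   /* "green" */
    int b;   /* "blue" */
};

int main(void)
{
    struct x instance = { .a = 1, .b = 2 };

    int *p_blue = &instance.b;                        /* we only know the member's address... */
    struct x *p = container_of(p_blue, struct x, b);  /* ...and recover the containing struct */

    printf("&instance = %p, container_of() = %p\n", (void *)&instance, (void *)p);
    printf("p->a = %d, p->b = %d\n", p->a, p->b);
    return 0;
}

The two printed addresses are identical, which is the whole point: from a pointer to a member you get back a pointer to its container.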

Issues

Well, there are a few issues with the container_of() macro. These issues have to do with some versions of the gcc compiler. For example, let's say that you have this structure:

struct person {
    int age;
    char* name;
};

If you try to do this:

struct person somebody;
somebody.name = (char*) malloc(25);
if (!somebody.name) {
    printf("Malloc failed!\n");
    return;
}
strcpy(somebody.name, "John Doe");
somebody.age = 38;
char* person_name = &somebody.name;
struct person * v = container_of(person_name, struct person, name);

If you have a GCC compiler with version 5.4.0-6, then you'll get this error:

error: cannot convert ‘char*’ to ‘char* const*’ in initialization
     const typeof(((type *)0)->member) * __mptr = (ptr);

Instead if you do this:

int * p_age = &somebody.age;
struct person * v = container_of(p_age, struct person, age);

Then the compiler will build the source code. Also, if you use a later compiler, both examples will build. Therefore, keep in mind that the type checking is mainly a compiler trick and needs the compiler to handle it right.

Conclusion

The container_of() and offsetof() macros are a beautiful piece of code. They are compact, simple and have everything in there, in 3 lines of beauty. Of course, you can use container_of() without knowing how it works, but where's the fun in that?