
Checksum: The Easiest Way to Find a Data Mismatch in Your IoT Project | Part 1

I frequently design IoT projects and write firmware for them. In these projects I have run into a few strange data-mismatch problems that I could have overcome easily if I had planned for them from the very beginning. Today I will discuss an interesting communication-protocol bug from one of these IoT projects.

The system I will focus on has a base station and hundreds of sensor nodes. The sensor nodes store data in flash memory at a fixed interval and send the data to the base station at a scheduled time. The base station stores the data when it receives it. Along the whole path, there are a few layers of communication.

https://i0.wp.com/embedschool.com/wp-content/uploads/2021/05/nRF24-IoT-Project.jpg?resize=700%2C394&ssl=1
  1. Each sensor node reads its sensors at a fixed interval and stores the data in flash memory over SPI.
  2. The sensor node reads the data back from flash memory over SPI and transfers it to the nRF24L01 radio module over SPI.
  3. The nRF24L01 radio module transmits the data to the base station wirelessly.
  4. The base station reads the received data from its nRF24L01 over SPI and stores it in flash memory over SPI.

Across the whole data path, there are four layers of SPI communication and one layer of RF communication. Because the nRF24L01 uses a CRC to detect errors, we can treat the RF link as error-free.

One day, while checking the data, I found that I was getting garbage on the base station side. I checked all the SPI communication code thoroughly, and it looked correct. The radio communication was fine too. It took a lot of time to figure out which part was the real culprit. Eventually I realized that the problem was in the PCB design. The data was fine up to the base station radio, but while the base station MCU was reading it from the radio over SPI, it was being corrupted because of the SPI traces on the PCB. Of course, that was bad design. But I could have found the problem much faster if I had checked the data after every communication step. That is where the checksum comes into play.

So I had to design the rest of my communication mechanism in such a way that a data mismatch could be detected in any of the SPI communication steps. Before getting started, let’s discuss the checksum.

What is a checksum?

Suppose you have a stream of data arranged as an array. Say you have the following data packet: data[] = [20, 67, 80, 45, 140]. You want to send this packet from the sensor node to the base station. Anywhere along the path, the data can be corrupted by noise on the communication lines or by a design flaw.

Think of an analogy. You have given a cheque to someone. How does the bank verify that the cheque was issued by you? The bank has a sample of your signature, so it compares the signature on the cheque with that sample. Here the cheque is the data array that you want to send to the base station, and you need to add a signature to the data array so that the base station can verify that the data is correct.

There are many ways to calculate a signature for a packet or file. One of the simplest that a microcontroller can handle is a checksum. A CRC is another option, but in this post our focus is the checksum.

To calculate the checksum:

  1. Add all the bytes in the packet and keep the sum in a 16-bit integer.
  2. Keep only the lowest 8 bits of the 16-bit sum.

Consider the data packet from earlier.

data[] = [20, 67, 80, 45, 140]

20 + 67 + 80 + 45 + 140 = 352

352 in binary = 0b0000 0001 0110 0000

Keeping only the lowest 8 bits: 0b0110 0000 = 96

So 96 is our checksum signature.
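The two steps above can be sketched in C as follows. This is only a minimal illustration, not the firmware from the project; the function name `checksum8` is a placeholder chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 8-bit additive checksum.
 * Step 1: add every byte into a 16-bit accumulator.
 * Step 2: keep only the lowest 8 bits of that sum. */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;

    for (size_t i = 0; i < len; i++) {
        /* If the 16-bit sum ever wraps around, the low byte is unaffected. */
        sum += data[i];
    }

    return (uint8_t)(sum & 0xFF);
}
```

For the packet [20, 67, 80, 45, 140], calling `checksum8(data, 5)` gives 96, which matches the hand calculation.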

Now we append this signature to the end of the existing data array.

data[] = [20, 67, 80, 45, 140, 96]

The checksum is now the signature of your data, and it travels with the packet as its last byte.
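Attaching the checksum could look roughly like this in C. It is a sketch under assumptions: `checksum8` and `packet_seal` are hypothetical names, and the buffer is assumed to have one spare byte after the payload for the checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 8-bit additive checksum (sum of bytes, low 8 bits). */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return (uint8_t)(sum & 0xFF);
}

/* Hypothetical helper: write the checksum right after the payload.
 * Returns the total packet length (payload + 1), or 0 if the buffer
 * has no room for the extra checksum byte. */
size_t packet_seal(uint8_t *buf, size_t payload_len, size_t buf_size)
{
    if (payload_len + 1 > buf_size) {
        return 0;
    }
    buf[payload_len] = checksum8(buf, payload_len);
    return payload_len + 1;
}
```

With a 6-byte buffer holding the payload [20, 67, 80, 45, 140], `packet_seal(buf, 5, 6)` returns 6 and leaves 96 in `buf[5]`.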

After any communication step, the receiving controller can recalculate the checksum and compare it with the checksum attached to the packet to detect a data mismatch. We have four layers of SPI communication, so after each one we recalculate the checksum and compare it with the attached value (96). Easy, right?
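The check after each transfer can be sketched like this. Again, the names `checksum8` and `packet_is_valid` are hypothetical, and the packet layout (payload followed by one checksum byte) is the one used in the example above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 8-bit additive checksum (sum of bytes, low 8 bits). */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return (uint8_t)(sum & 0xFF);
}

/* Hypothetical helper: recompute the checksum over the payload and
 * compare it with the last byte of the packet. */
bool packet_is_valid(const uint8_t *packet, size_t packet_len)
{
    if (packet_len < 2) {
        return false;   /* need at least one payload byte plus the checksum */
    }
    return checksum8(packet, packet_len - 1) == packet[packet_len - 1];
}
```

For [20, 67, 80, 45, 140, 96] the check passes. If a single byte is corrupted in transit, for example 80 becoming 81, the recomputed checksum is 97 and the check fails. Keep in mind that a simple sum cannot catch every error: two changes that cancel each other out, or bytes that swap places, produce the same checksum.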

This way we can detect a data mismatch, and we can also identify which part of the path is the real culprit.
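One way to picture the "find the culprit" idea is the sketch below. It assumes, purely for illustration, that a copy of the packet was captured after each hop in the data path (for example while debugging). The function `first_bad_hop` and the snapshot array are hypothetical and not part of the original project.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 8-bit additive checksum (sum of bytes, low 8 bits). */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return (uint8_t)(sum & 0xFF);
}

/* Hypothetical helper: snapshots[i] points to the packet as it looked
 * after hop i. Each packet is payload followed by one checksum byte.
 * Returns the index of the first hop whose packet fails the check,
 * or -1 if every hop passes (or the packet is too short to check). */
int first_bad_hop(const uint8_t *const snapshots[], size_t hop_count,
                  size_t packet_len)
{
    if (packet_len < 2) {
        return -1;
    }
    for (size_t hop = 0; hop < hop_count; hop++) {
        const uint8_t *p = snapshots[hop];
        if (checksum8(p, packet_len - 1) != p[packet_len - 1]) {
            return (int)hop;   /* corruption happened at or before this hop */
        }
    }
    return -1;
}
```

If the first two snapshots are intact and the third one is corrupted, the function returns 2, which points at the third hop as the place to start looking.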

In the next part, I will explain the whole scenario in detail with source code.