Microcontroller Embedded C Programming Lecture 150| Assembly code analysis of packed and non packed structure


 

Assembly code analysis of packed and non packed structure

 

Packed Structure: In a packed structure, the compiler does not insert any padding between structure members, so members are not necessarily placed at their naturally aligned addresses. This results in a compact memory layout but can cause performance penalties due to unaligned memory accesses.

Non-Packed Structure: In a non-packed structure, the compiler adds padding between structure members so that each member starts at its naturally aligned address. This ensures better memory access performance but results in a larger memory footprint.

 

In this article, let’s do an exercise: an assembly code analysis of packed and non-packed structure access.

Figure 1. Assembly code analysis of packed and non-packed structure access

 

 

Assembly code analysis of non-packed structure:

I have a project named 009packed_Vsnonpacked. In it, I created a non-packed structure, declared a global variable of that structure type, initialized every member of the variable, and then printed the members. I wrote this project for my target hardware, as shown in Figure 2.

Let’s analyze the disassembly of that code. The right-hand side shows the disassembly of those initializations. 

Figure 2. Non-packed structure code

Let’s start with line 26, where I copy the value 0xAA into data.data1. The instructions generated are ldr, movs, and strb. The strb (store byte) instruction stores a byte in memory; the compiler uses it here because data1 is a byte.

In the same way, check the instructions for the next statement (line 27). Here we are storing a word value into a variable, which is why the compiler uses the str instruction.

‘STRB’ is an instruction to ‘store a byte’ into memory

‘STR’ is an instruction to ‘store a word’ into memory

So, depending on the data type, the compiler uses different store instructions supported by the ARM instruction set architecture.

For a byte it uses ‘strb’, for a word it uses ‘str’, and for a short it uses ‘strh’.

‘strh’ means store a halfword; the ‘h’ stands for halfword.

 

You should also note the alignment rule for these instructions:

For the ‘STR’ instruction, the address operand should be word-aligned.

For the ‘STRH’ instruction, the address operand should be halfword-aligned.

On processors that do not support unaligned access (or that are configured to trap it), giving one of these instructions a misaligned address causes a fault instead of a store.

That is the reason why, by default, all your data gets stored in memory in an aligned fashion.

 

Assembly code analysis of packed structure:

Let’s analyze the same disassembly when we use a packed structure, whose members may be unaligned.

To make a structure packed, you use the attribute __attribute__((packed)), as shown in Figure 3.

You have to use it along with the structure definition. The structure then becomes a packed structure, which leads to unaligned data storage.

Figure 3. Packed structure code

Here is the disassembly of the packed structure access. It looks different from the previous one (Figure 2).

Let’s start with line 26. We are storing 0xAA into the memory location of data1. The instruction strb r2, [r3, #0] is an aligned access because a single byte is aligned at any address. So, strb is used.

But take a look at the code generated for the next statement (line 27). In the non-packed case there were only 3 instructions, but here there are many more.

Why is that? Storing a word in memory normally needs only a single STR instruction. But STR cannot be used here because, as I said previously, STR is meant for word-aligned addresses, and here you are trying to store a word value at an address that is not on its natural boundary.

So the transaction is divided into a number of strb instructions. That is why you see four strb instructions here (line 27). This clearly shows that accessing unaligned data increases your code size.

So, by using a packed structure, you may save some memory. But you increase the code size and also make the processor talk to memory more often, which decreases performance. In the non-packed case, the processor accessed memory only once, using STR, to store the word. Now it accesses memory 4 times, and you lose a lot of clock cycles. That is how performance degrades with unaligned data access.

You can also see (line 27 disassembly) that besides the 4 strb instructions, a lot of ‘orn’ instructions are used for data manipulation. All these extra instructions further increase the executable code size of your application and slow it down.

After that, take a look at the data3 statement (line 28). This is an aligned access, which is why you see only 3 instructions here, with strb used for the store.

Then check the data4 statement. Again, this is an unaligned access: you are trying to store a short value at an address that is not on its natural boundary. That is why two strb instructions are used here. In the non-packed case, the same store was done with a single strh, as I showed earlier in this article.

 

This shows that unaligned access degrades performance and increases the code size of the final executable. That’s why you should be careful before you pack your structure.

You can compare the code sizes (as shown in Figure 1): the non-packed version takes 5112 and the packed version takes 5144, so you can see the increase.

In the following article, I will cover typedef with structure.

 
