The Binary Transcendence

Thursday, November 25, 2010

Translating C Constructs to MSP430 Assembly Code

Function and its Parameters

The sample program is:

When a function is called, some housekeeping is normally done which appears as the function's prologue in the assembly code (this would not be true if the function is declared with the attribute "naked").

The procedure is:
1. The current value of r4 (used as frame pointer in MSP430 family) is pushed
   onto the stack. The stack pointer (r1) automatically gets decremented by 2.
2. r1 gets decremented again by an offset, thus allocating a stack frame. The
   offset by which r1 gets decremented depends on the number of local variables
   in the called function.
   Here, r1 gets decremented by 4, since fun() has two local variables, a
   and b, of 2 bytes each.
3. The current value of the stack pointer r1 is copied into r4 (the register r4 thus
   indicates the frame pointer for the currently executing function).
   All further manipulations of the local variables will be with reference to the
   frame pointer r4.
4. After the body of the called function is executed, the same offset as above is
   added back to the stack pointer r1, thus deallocating the stack frame.
5. The saved value on top of the stack is popped into r4, thus restoring the
   previous frame pointer. The stack pointer r1 gets auto-incremented by 2.

The assembler directives __FrameSize and __FrameOffset give the size and offset of the frame allocated for the function fun().
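The five steps can be mimicked on a host machine with a toy model. This is only a sketch: the array stack_mem, the variables r1 and r4, and the function model_fun() are hypothetical stand-ins for the real registers and for fun(), and the values 1 and 2 given to a and b are assumptions, not taken from the sample program.

```c
#include <stdint.h>

/* Toy model, not real MSP430 code: the stack is a small word array, and
   r1 (stack pointer) and r4 (frame pointer) hold byte offsets into it. */
enum { STACK_BYTES = 32 };
uint16_t stack_mem[STACK_BYTES / 2];
uint16_t r1 = STACK_BYTES;   /* empty stack; it grows towards lower offsets */
uint16_t r4 = 0;

static void push(uint16_t v) { r1 -= 2; stack_mem[r1 / 2] = v; }
static uint16_t pop(void)    { uint16_t v = stack_mem[r1 / 2]; r1 += 2; return v; }

/* Stand-in for fun(): two 2-byte locals, a and b (values are assumed). */
uint16_t model_fun(void)
{
    uint16_t sum;

    push(r4);                      /* 1. save the caller's frame pointer   */
    r1 -= 4;                       /* 2. allocate 4 bytes for a and b      */
    r4 = r1;                       /* 3. frame pointer = stack pointer     */

    stack_mem[(r4 + 0) / 2] = 1;   /* a = 1, stored at 0(r4)               */
    stack_mem[(r4 + 2) / 2] = 2;   /* b = 2, stored at 2(r4)               */
    sum = (uint16_t)(stack_mem[(r4 + 0) / 2] + stack_mem[(r4 + 2) / 2]);

    r1 += 4;                       /* 4. deallocate the frame              */
    r4 = pop();                    /* 5. restore the caller's frame pointer */
    return sum;
}
```

After model_fun() returns, r1 and r4 hold the same values as before the call, which is the whole point of the prologue and epilogue.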

Pointers

The sample program is:

On generating the corresponding assembly code:

The code can be traced as follows:
1. The number 10 is stored in the memory address which is at an offset of 2 bytes
   from the location pointed to by the frame pointer r4.
2. The memory address of 10, i.e. the value of (r4 + 2), is stored in the memory
   location pointed to by r4.
3. The data in the location pointed to by the register r4 is safely interpreted as
   another memory address, and the number 20 is stored in this particular address.
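The three steps traced above correspond to C source along these lines. This is a minimal sketch, not the original listing: the names i, p and pointer_demo() are hypothetical, and the frame offsets in the comments are taken from the description above.

```c
/* Sketch of a pointer program matching the trace; names are assumed. */
int pointer_demo(void)
{
    int i = 10;     /* step 1: 10 is stored at offset 2 from r4     */
    int *p = &i;    /* step 2: the address (r4 + 2) is stored at @r4 */
    *p = 20;        /* step 3: 20 is stored through that address     */
    return i;       /* i has been changed through the pointer        */
}
```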

And you thought "pointers" were magic !!!

Variables

The simplest case would be:

The prologue and epilogue are still present in the assembly code:

The number 100 is stored in a memory location addressed with reference to the frame pointer r4.
The variable i is local to the function main(), so it simply lives in main()'s stack frame and needs no extra work.

Static Variables

The demonstration looks like:

Wondering what the assembly code would look like:

The static integer 200 is stored in a similar way as above, but to a different address. This address (label i.1194) is located in the .data section, instead of on the stack where ordinary local variables live.

All global and static variables (whose lifetime is as long as the whole program) that have a non-zero initial value are stored in the .data section; zero-initialized and uninitialized ones go to the .bss section.
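The practical consequence of living outside the stack frame is that a static local keeps its value between calls. A minimal sketch, assuming a hypothetical function next_value() (only the initial value 200 comes from the text above):

```c
/* The static i is initialized once and survives across calls, because it
   is placed in .data rather than in the function's stack frame. */
int next_value(void)
{
    static int i = 200;
    return i++;     /* returns the current value, then increments it */
}
```

An ordinary local in the same position would be re-created at 200 on every call.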

Pointers to Functions

Considering the sample code:

A simple pointer to a function accepting void and returning void is created. It is assigned the memory address of the function fun(). Then this pointer to a function f is called.

Awaiting the assembly code:

The pointer f is local to the function main(). Hence, a stack frame of size 2 is allocated, as expected. The value of f, i.e. the memory address of function fun(), is stored in the location pointed to by the frame pointer r4. This address is then simply passed to the call instruction.
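A sketch of the kind of source being described, for readers without the listing at hand. The counter calls and the wrapper call_via_pointer() are hypothetical additions so that the effect of the call can be observed; in the post the pointer f is a local of main().

```c
int calls = 0;                 /* hypothetical: counts how often fun() ran */

void fun(void) { calls++; }

int call_via_pointer(void)
{
    void (*f)(void) = fun;     /* f holds the memory address of fun()       */
    f();                       /* indirect call through the stored address  */
    return calls;
}
```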

The Arsenal Of An Embedded System Programmer

When confronting any new microcontroller or microprocessor or, practically, a development board, there are some things to keep in mind before diving into the possibly arduous debugging session you are going to have with the system.

Bits and more bits ...

The Rules
1. Never fully trust anything written before you, anywhere you may see it.
2. If you are forced to behave otherwise, go back to Rule 1.

The Programmer's Model
   You can't possibly know everything about the interconnections, circuitry and other design specifications of the chip under concern and also the development board, when you work on it for the first time. But then, these factors aren't really much of a concern. What you primarily need is something else.

   Even if you don't know how the chip is built, you must know how you can control it, and also the features available at the higher level. You need to have a model of your own for the chip, called the Programmer's Model. Through this model, you must be aware of the following:

1. Homework
   Identify the manufacturer, family, type of device (value line, low/high density,
   etc ..), and architecture (von Neumann, Harvard, etc ..).
   Get the Family Manual, Device Specific Manual, and any other pdf that you may
   find useful.
   The manufacturer may also publish an errata sheet, which may become useful
   in some rare cases.

2. Instruction set
   Know whether the instructions are 16 or 32 bit. Be familiar with some common
   instructions.
   Understand the different addressing modes provided in the device.
   Are the instructions aligned by 2 or 4?
   Does it lean more towards a RISC or a CISC style?

3. The Memory Map
   Which are the memory areas for flash, RAM, interrupt vectors, peripheral
   registers and special function registers (SFRs)?
   Where is the initial value of the stack pointer stored?

4. Registers
   Which are the General Purpose Registers, Program Counter, Status Register and
   Special Function Registers?

5. The Vector Table
   Where is the interrupt vector table present? Which interrupts do the vectors
   represent in the table?
   Which is the reset vector?

6. The Modules
   Know the inbuilt modules in the package (ADC, Timer, USART, etc ..).
   All the Control Word Registers needed and how to manipulate them, will be
   usually specified in the Family Manual.

7. Modes of Operation
   In some systems, the processor by itself may operate in different modes.
   Also know about the various low power modes normally available for the system
   as a whole.

8. The Runtime Framework
   C is the normal choice for embedded programming.
   If interested, learn the device specific startup functions that are called before
   main().
   You may also write a simple linker script.

9. Potential Bugs
   Always keep an eye out for them. Unless it is something like a heisenbug for
   example, it can be traced down. The time taken depends.

The ARM Cortex-M3

The ARM Architecture

ARM is a 32-bit reduced instruction set computer (RISC) instruction set architecture (ISA) developed by ARM Holdings. It was known as the Advanced RISC Machine, and before that as the Acorn RISC Machine. The ARM architecture is the most widely used 32-bit ISA in terms of numbers produced. It was originally conceived as a processor for desktop personal computers by Acorn Computers, a market now dominated by the x86 family used by IBM PC compatible and Apple Macintosh computers.

The ARM Cortex-M3

The ARM Cortex family is a new generation of processors that has a standard CPU and system architecture. Unlike other ARM CPUs, the Cortex family is a complete processor core in itself.

It comes in three series:
A series: For high end applications, using complex OS and user
   applications. It supports ARM, Thumb and Thumb-2
   instruction sets.
R series: It follows more of a real-time (RT) system profile. It too
   supports ARM, Thumb and Thumb-2 instruction sets.
M series: For microcontroller applications, and other
   cost-sensitive projects. It supports only the Thumb-2
   instruction set.

There is a relative performance level for all these devices, ranging from 1-8. The highest level for M series is 3.

The ARM Cortex-M3 provides the entire heart of a microcontroller, including timer, memory map, interrupt system, etc.
It has a Harvard Architecture, with a 4 GB total address space.

Operating Modes

In privileged mode, the CPU has access to the full instruction set.
In unprivileged mode, xPSR related functions and access to most registers in the Cortex processor control space are disabled.

Fig 1. The Cortex-M3 operating modes

Handler mode always executes in privileged mode, while Thread mode can execute in either privileged or unprivileged mode.

Programmer's Model

The Cortex CPU RISC processor has a load/store architecture. To perform data processing operations, operands must be loaded into a central register file, and the data operations are performed on these registers, and the result stored back to memory.

Fig 2. The load/store architecture of Cortex-M3

Register File

There are sixteen 32-bit registers in the processor register file, with an extra 32-bit xPSR (Program Status Register).

Fig 3. The Cortex-M3 register file and xPSR

The Link Register (LR) stores the return address of each procedure call.

There are two stacks, main stack and process stack, to support the two operating modes. Register R15 is the Program Counter (PC).

Memory Map

The memory map for the code area, SRAM area, and the peripheral devices is shown below.

Fig 4. A portion of the Cortex-M3 memory map

Features

1. Unaligned memory access - The Cortex-M3 can make unaligned memory access, which ensures that SRAM is efficiently used.

2. Bit Banding - By this technique, direct bit manipulation can be performed on sections of peripheral and SRAM memory spaces, without the need for any special instructions (normal bit manipulations require READ, MODIFY, WRITE which is expensive in terms of number of cycles).

3. Nested Vectored Interrupt Controller (NVIC) - It is a standard unit within the Cortex core, thus making the process of porting the code to different microcontrollers easier. It is designed to support nested interrupts and there are 16 levels of priority.

By the interrupt preemption technique, high priority interrupts can preempt low priority ones. By the tail chaining technique, a pending interrupt is serviced immediately after the current handler finishes, without restoring and re-saving the registers in between, thus reducing the latency in handling those interrupts.
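For the bit banding feature listed above, each bit in a bit-band region is mapped to its own 32-bit word in an alias region, so writing that word sets or clears the single bit. The sketch below computes the alias address; the function name bitband_alias() is hypothetical, and it assumes the address lies in the first 1 MB of the SRAM region (0x20000000) or of the peripheral region (0x40000000).

```c
#include <stdint.h>

/* alias = alias_base + (byte_offset * 32) + (bit_number * 4)
   where alias_base is 0x22000000 for SRAM and 0x42000000 for peripherals.
   Assumes byte_addr is inside one of the two 1 MB bit-band regions. */
uint32_t bitband_alias(uint32_t byte_addr, unsigned bit)
{
    uint32_t region_base = byte_addr & 0xF0000000u;   /* 0x20000000 or 0x40000000 */
    uint32_t alias_base  = region_base + 0x02000000u;
    uint32_t byte_offset = byte_addr - region_base;

    return alias_base + byte_offset * 32u + bit * 4u;
}
```

On real hardware, a write of 1 or 0 to the word at the returned address would set or clear the bit in a single store, with no read-modify-write sequence in the program.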

Thursday, November 4, 2010

Analysing Jump Tables in MSP430 Assembly Code

Jump Table

A jump table is an array of pointers to functions or an array of assembly code jump instructions.

In assembly, jump tables are often the most efficient method to handle switch statements with a large number of cases. The jump table is created only once and the required entry in the table can be accessed simply by indexing.

Especially in embedded systems, where available memory is heavily constrained, jump tables can be efficient while consuming less memory too.

Sample Program

Fig 1. Sample Program

The switch has only four cases, hence there is no need for a jump table. The cases are implemented simply as:

Fig 2. The switch implementation without jump table

The behavior is almost as expected.

Now, I need a switch with enough cases, to get the attention of the gcc compiler heuristics.

Fig 3. More cases for the switch

I have to check the corresponding assembly code generated for the above program, to be sure.

Fig 4. Lookup table created - PartI

Fig 5. Lookup table created - PartII

It worked, the compiler decided that a jump table is really essential now.

The "mov #1,@r4" line stores the value of variable "i". There are 8 cases, numbered from 0 to 7. Hence "i", i.e., @r4, is first compared with 8: any value of 8 or more has no entry in the table and must skip the table lookup.

Analysing the 'jump table'

The jump table has been created, starting from the address denoted by the label ".L11".

The first entry in the table holds label ".L3" which is the starting address of the block of statements under "case 0:".
The next entry is ".L4", which is the starting address of the block of statements under "case 1:".
And so on ... Till "case 7:".
There are 8 ".word"s in the jump table too. Correct!

Line N in the jump table holds the starting address of the block of statements under the corresponding "case N:". Since each entry is one 2-byte word, entry N lies at an offset of 2N bytes from ".L11".

Decoding ...

"r15" holds the value to be switched.

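"rla r15" shifts the value inside r15 left by one bit (multiplication by 2; the mnemonic stands for "rotate left arithmetically").
Each table entry is a 2-byte word, and word accesses in the MSP430 family must be at even addresses, so the index is doubled to get a byte offset.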

"add #.L11,r15" adds the present value of "r15" (similar to offset), with the address of the label ".L11" (similar to base address).
"r15" now contains the address of the line that lies at the given offset from ".L11".

After the "mov @r15,r15" line, "r15" now contains the starting address of a block of statements under the selected "case".

"br r15" simply branches to the address pointed to by r15.

Clean.
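The decoded sequence behaves roughly like an array of function pointers indexed by the switch value. The sketch below is an analogy in C, not the compiler's output: the handler names h0 to h7, their return values and the function dispatch() are all hypothetical.

```c
typedef int (*handler_t)(void);

/* Hypothetical case bodies; each returns a distinct value. */
static int h0(void) { return 100; }
static int h1(void) { return 101; }
static int h2(void) { return 102; }
static int h3(void) { return 103; }
static int h4(void) { return 104; }
static int h5(void) { return 105; }
static int h6(void) { return 106; }
static int h7(void) { return 107; }

/* Plays the role of the eight ".word" entries starting at ".L11". */
static const handler_t table[8] = { h0, h1, h2, h3, h4, h5, h6, h7 };

int dispatch(unsigned i)
{
    if (i >= 8)            /* the comparison with 8: no table entry exists */
        return -1;         /* stands in for the default path               */
    return table[i]();     /* scale index, add base, load entry, branch    */
}
```

The array indexing hides the "rla" and "add" steps: the compiler scales i by the entry size and adds the table's base address, exactly as the assembly does by hand.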

Issues

How can you justify that jump tables are friendly to embedded systems?

It's true that a jump table has an overhead of its own.

Suppose there are a very large number of switch cases. Then, this "jump to index" overhead will be much lower than the cost to perform N case comparisons. That is why jump tables are usually preferred.

Jump tables work well only when the case identifiers are consecutive, or nearly so.

For example, case 1, case 2, case 3, etc ...

In situations where the cases are random and spread over a large range, suitable searching methods are needed. Normally, binary search is used. The correct case can then be selected in about log2(N) steps (3 or 4 steps for 8 to 16 cases) without performing N comparisons each time.
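A minimal sketch of the binary search idea for sparse case values. The case values, the struct case_entry and the function find_case() are hypothetical; a compiler would emit a tree of compare-and-branch instructions rather than a data table, but the number of comparisons grows the same way.

```c
#include <stddef.h>

struct case_entry { int value; int block; };

/* Hypothetical sparse case values, kept sorted for the search. */
const struct case_entry cases[] = {
    { 3, 0 }, { 17, 1 }, { 42, 2 }, { 100, 3 },
    { 250, 4 }, { 999, 5 }, { 4096, 6 }, { 30000, 7 },
};

/* Returns the block number for a case value, or -1 for "default". */
int find_case(int value)
{
    size_t lo = 0;
    size_t hi = sizeof cases / sizeof cases[0];

    while (lo < hi) {                       /* halve the range each step */
        size_t mid = lo + (hi - lo) / 2;
        if (cases[mid].value == value)
            return cases[mid].block;
        if (cases[mid].value < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```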

Wednesday, November 3, 2010

Assembling in MSP430G2231

Sample Program

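        mov   #0x0260,r5      ; r5 = source address
        mov   #0x0270,r6      ; r6 = destination address

Loop:
        cmp   #0,@r5          ; is the current source word zero?
        jz    End             ; yes: copying is finished
        mov   @r5,@r6         ; copy one word (assembled as mov @r5,0x0(r6))
        incd  r5              ; advance the source by 2 bytes
        incd  r6              ; advance the destination by 2 bytes
        jmp   Loop

End:
        mov   #0x01,&0x22     ; P1DIR: make P1.0 an output
        mov   #0x01,&0x21     ; P1OUT: drive P1.0 high (red led on)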

This code demonstrates a simple implementation of 'strcpy' in msp430 assembly code. The first string is present in the location 0x0260. It is to be copied to another memory location starting from 0x0270. The RAM area of MSP430G2231 lies in the range 0x0200 to 0x027F. Note that the loop works on 16-bit words: it stops at the first all-zero word, so it is a word-wise variant of strcpy rather than a byte-exact one.

What's worth noticing is the ease with which some operations are defined which are otherwise very difficult in other assembly codes.

The program exits gracefully by lighting the red led, after successfully copying the string.
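For comparison, the copy loop can be written in C as below. This is a sketch of the loop's behaviour only: the function name copy_words() and the returned count are hypothetical additions, and the LED handling at the end of the assembly program is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies 16-bit words from src to dst until a zero word is found.
   Like the assembly loop, the terminating zero word is not copied.
   Returns the number of words copied. */
size_t copy_words(uint16_t *dst, const uint16_t *src)
{
    size_t n = 0;

    while (*src != 0) {      /* cmp #0,@r5 ; jz End              */
        *dst++ = *src++;     /* mov @r5,@r6 ; incd r5 ; incd r6  */
        n++;
    }
    return n;
}
```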

Notations

# - This symbol is used to indicate an immediate value,
   i.e. a pure number. The number can be written in
   decimal, binary or hexadecimal.
   For example, "mov #0x0260,r5" will move the hex
   number 0260 to register r5.

@ - It can happen that, the data stored in a register is
   the address of another memory location. The actual
   value inside this address can be accessed by using
   the '@' symbol. When '@' is used, the value in a
   register is interpreted to be the address of a memory
   location, and the actual data present in this location
   is fetched.
   The line "cmp #0,@r5" compares the number 0 with
   the data in the memory location pointed to by the
   value of r5.

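& - When the address of a location is to be used directly
   (absolute addressing), the '&' symbol is used. For
   example, "mov #0x01,&0x21" stores 1 in the location
   at address 0x0021. Without the '&', the operand is not
   treated as an absolute address, thereby generating errors.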

Notable Feature

@ and again @
   The line "mov @r5,@r6" is simple, sleek, easy-to-understand, self explanatory and normally illegal in other assembly languages.

   Technically, the '@' operation is emulated for the destination part. The "mov @r5,@r6" line will be changed to "mov @r5,0x0(r6)" after running msp430-gcc.

Conclusion

The MSP-EXP430G2 Launchpad (TI) for the MSP430 family

Altogether, there are only 27 core instructions with 7 addressing modes in the MSP430 family, which are easy to grasp and employ.

Coding in MSP430 family is fun!

Tuesday, October 26, 2010

Remote Debugging the MSP-EXP430G2 LaunchPad from TI

Remote Debugging in GDB

gdb has a built-in ability to debug programs that reside on remote machines, using a gdb-specific protocol (the GDB Remote Serial Protocol). The remote machine is connected to the host via a serial line or a TCP port. The program that answers gdb at the remote end is called a gdb stub or proxy.

Inside gdb, enter:
(gdb) target remote localhost:2000

This would enable gdb to perform all debugging operations on a program connected to the localhost machine through the port 2000.

As a prerequisite, the program listening on that port must accept connections from an external debugger.
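To give a feel for the gdb-specific protocol mentioned above: each packet travels as "$", the payload, "#", and a two-digit hex checksum, where the checksum is the sum of the payload bytes modulo 256. The sketch below frames a payload this way. The function names rsp_checksum() and rsp_frame() are hypothetical, and which packets gdb and mspdebug actually exchange in the session described here is not shown in this post.

```c
#include <stdio.h>
#include <string.h>

/* Sum of all payload bytes, modulo 256. */
unsigned rsp_checksum(const char *payload)
{
    unsigned sum = 0;
    size_t i, n = strlen(payload);

    for (i = 0; i < n; i++)
        sum += (unsigned char)payload[i];
    return sum & 0xFFu;
}

/* Writes "$<payload>#<checksum>" into out; returns snprintf's result. */
int rsp_frame(const char *payload, char *out, size_t out_size)
{
    return snprintf(out, out_size, "$%s#%02x", payload, rsp_checksum(payload));
}
```

For instance, the payload "mfc00,2" is a read-memory request for 2 bytes at address 0xfc00, and it is framed as "$mfc00,2#f4".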

Sample Program

Fig 1. Sample program - led1.c

This sample program named 'led1.c', is only used to demonstrate remote debugging.

The Preparation

Connect the LaunchPad to the system. Now the sample program is cross-compiled, and downloaded into the LaunchPad.
For further details, refer:
switching-on-launchpad-leds.html

For convenience, I am calling the current terminal "Terminal1".

The 'mspdebug' has a built-in command that enables it to run a GDB remote stub on a specified TCP/IP port. If no port is specified, 2000 is taken as default.

Enter:
   (mspdebug) gdb

A message will be displayed as:
   Bound to port 2000. Now waiting for connection...

At this time, open another terminal. I'm calling it "Terminal2".
In Terminal2, enter:
   msp430-gdb -q a.out

Here, 'a.out' is the LaunchPad-specific executable binary obtained by cross-compiling the above sample program.

Now, connect to the remote machine already waiting on port 2000 as:
   (gdb) target remote localhost:2000

An acknowledgement message will be displayed as:
   Remote debugging using localhost:2000
   0x0000fc00 in _reset_vector__ ()

If you check back in Terminal1, messages similar to the following will have been displayed:
   Client connected from 127.0.0.1:47558
   Clearing all breakpoints...
   Reading 2 bytes from 0xfc00

The current states of the two terminals are as shown:

Fig 2. Terminal1 (left side) and Terminal 2 (right side)

On listing 'led1.c' in Terminal2, the memory addresses from which the bytes are read will be displayed in Terminal1, simultaneously.

Fig 3. Listing the sample program

Fig 4. Single stepping through runtime libraries

On further single steps from this point, the runtime libraries through which the control passes until main( ) is reached, can be observed directly !!!
Notice that a considerable number of bytes have been read.

Now, single step till the instruction 'P1OUT = 0x01' is reached.

The Action

At this point, the next single step executes this instruction, which drives P1.0 high (binary 1). Do it, and see the red LaunchPad led (P1.0) flash bright !!!

On next step, a binary 0 is passed to P1.0, causing it to be off.

Single step again, and see the green LaunchPad led (P1.6) flash before your eyes !!!

Turn it off too, and keep on single stepping, until you relish the wonderful thing that's happening in front of you ... This is GDB at its best !!!

At all these points, the memory addresses from which reading takes place are displayed in Terminal1.

N. B.

Properly exit from both mspdebug in Terminal1 and GDB in Terminal2, before disconnecting the LaunchPad from the system.
Exit from GDB in Terminal2 first, and then mspdebug in Terminal1.

Addendum

It was one of the cutest moments, to actually 'see' GDB at work.

I am crazy about LaunchPad !!!

Switching on the LaunchPad LEDs ...

Installation

The basic amenities are:

mspgcc
libusb-dev
libreadline-dev
mspdebug

After these are installed, msp430-gcc or msp430-gcc-4.4.3 can be used to cross-compile the C code.

The 'libusb-dev' package contains the libraries necessary for successfully connecting to the LaunchPad through the USB cable.

The 'libreadline-dev' library is for enabling history for the commands typed inside the 'mspdebug' environment.

Ensure that the LaunchPad has been detected by:

dmesg | tail

Your LaunchPad will be assigned to the device: /dev/ttyACM0.

mspdebug is used for interacting with, erasing or programming the flash memory of the MSP chip. It also allows debugging the downloaded program present inside the MSP chip flash memory, through the inbuilt JTAG or Spy-Bi-Wire support.

The rf2500 driver of mspdebug, written for the eZ430-RF2500 tool, supports the USB connection and also provides Spy-Bi-Wire debugging.

Switch on your LEDs !!!

A sample program, led2.c, lights up both the LEDs on the LaunchPad when the switch S2 on P1.3 is pressed.

Fig 1. The sample program
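The decision at the heart of such a program can be sketched as a pure function that runs on any host. This is not the led2.c listing: the function leds_for_input() and the macro names are hypothetical, and it assumes that S2 is active low, i.e. that P1.3 reads 0 while the switch is pressed. On the board, the argument would be the value read from the port 1 input register and the result would be written to P1OUT.

```c
#include <stdint.h>

#define LED_RED    0x01u   /* P1.0 */
#define LED_GREEN  0x40u   /* P1.6 */
#define SWITCH_S2  0x08u   /* P1.3 */

/* Given the port 1 input bits, returns the output bits to drive.
   Assumption: S2 is active low, so a 0 on P1.3 means "pressed". */
uint8_t leds_for_input(uint8_t p1in)
{
    if ((p1in & SWITCH_S2) == 0)
        return LED_RED | LED_GREEN;   /* pressed: both LEDs on   */
    return 0;                         /* released: both LEDs off */
}
```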

First, cross-compile the code.

msp430-gcc-4.4.3 -g led2.c

Now connect the LaunchPad. Then:

sudo mspdebug rf2500

Fig 2. Inside mspdebug

To download the code, use:

(mspdebug) prog a.out

Fig 3. Downloading the code

Now run it, as:

(mspdebug) run

Fig 4. Running the code
