Description

5/5 - (1 vote)

Objective

The objective of this experiment is to familiarize you with assembly language programming
concepts, ARM’s Cortex instruction set (ISA), assembly addressing modes and ARM CMSIS
library. The ARM calling convention will need to be respected, such that the assembly code can
later be used with the C programming language.

In the follow-up experiments, the code developed
here will be used in larger programs written in C. We will introduce the Cortex Microprocessor
Software Interface Standard (CMSIS) application programming interface that incorporates a large
set of routines optimized for different ARM Cortex processors.

Preparation for the Lab

To prepare for Lab 1, you will need to attend Tutorial 1, where you will learn how to create and
define projects, specifically purely assembly code projects. The tutorial will show you how to let
the tool insert the proper startup code for a given processor, write and compile the code, as well as
provide the basics of program debugging.

Elaborate details about debugging in Keil are provided
in the document entitled “Debugging with Keil” which will be uploaded on my courses.
Other documents that will be of importance include the Cortex M4 programming manual, and
Quick reference cards for ARM ISA, all available within the course online documentation. More
useful references are found in the tutorial slides. But since the reference material is huge
(hundreds of pages over the course of the semester), make sure to attend the tutorials so that the
TA will guide you where to look.

Background on FIR filter

The design of FIR filters using windowing is a simple and quick technique. There are many pages
on the web that describe the process, but many fall short on providing real implementation details.
All the required information to put together your own algorithm for creating low pass, high pass,
band pass and band stop filters based on a number of different windows are accessible online.

The
impulse response of an Nth-order discrete-time FIR filter lasts exactly N + 1 samples (from first
nonzero element through last nonzero element) before it then settles to zero. For a causal discretetime FIR filter of order N, each value of the output sequence is a weighted sum of the most recent
input values:
2
Where:
x[n] is the input signal,
y[n] is the output signal,
N is the filter order,
bi is a coefficient of the filter.
The Experiment

Part I – FILTER:

The filter will hold its coefficients (b0 + b1 + …) that are five single-precision floatingpoint numbers. The filtering loop should only get one value at each time and generate one output.
At each time, a new input must be added to the filter and the oldest one deleted. This technique
is called Moving Average. You should replace zero for values in the past that do not exists, for
example x[-1]. The function should look as follows. The input is integer but the output will be float.
FIR_C(Input, Output);

Input Vector
Filter, order = 4
Output Vector
B0 … B4
X[n]
y[n]
3

In the following parts, you should find the maximum, minimum and RMS values of the
filtered data.
Initial values
For the purposes of this lab, we will initialize the values b0 = 0.1, b1 = 0.15, b2 = 0.5, b3 = 0.15,
b4 = 0.1. The initial value of x is the first element of your input vector.

Part II – Assembly implementations

You are required to write an assembly subroutine “asm_math” in ARM Cortex M4 assembly
language that takes a one-dimensional input data and return the RMS, max_value, min_value,
max_index and min_index of the input data in a one-dimensional vector. You should naturally use
the built-in floating-point unit where needed by using the existing floating-point assembler
instructions.

Your assembly function will take three parameters:

1. A pointer to the input data array
2. A pointer to the output array
3. The input arrays’ length
Your subroutine should follow the ARM compiler parameter passing convention. Recall
that up to four integers or 4 floating-point parameters can be passed by integer (R0 – R3) and
floating-point registers (S0 – S3), respectively.

For instance, R0 and R1 might contain the values
of the first two integers and S0 will contain the value of a floating-point parameter. If the datatype
is more complex (e.g, struct or a matrix), then a pointer to it is passed instead in R0 or R1. For the
function return value, the register R0/R1 or S0/S1 are used for integer and floating-point results of
the subroutine, respectively.

ARM CALLING CONVENTION

In assembly, parameters for a subroutine are passed on the memory stack and via internal
registers. In ARM processors, the scratch registers are R0:R3 for integer variables and S0:S3 for
floating point variables. Up to four parameters are placed in these registers, and the result is placed
in R0 – R1 (S0 – S1).

If any parameter requires more than 32 bits, then multiple registers are used.
If there are no free scratch registers, or the parameter requires more registers than remain, then the
parameter is pushed onto the stack. In addition to the class notes, please refer to the document
“Procedure Call Standard for the ARM Architecture”, especially its sections 5.3-5.5.

This
particular order of passing parameters is eventually a convention applied by specific compilers.
Please be aware that the several different procedures calling and ordering conventions exist beyond
the one used here, but this procedure call convention is standardized by ARM.
4

Part III – Performance Evaluation against C

You are required to write the same subroutine described in the previous part in the C language.
Then you are to call both the C and assembly subroutines from main. You should compare the
performance between the two implementations and present analysis on execution time differences.
(Use the time measurement features in Keil).

Moreover, use CMSIS-DSP library by passing the
same inputs to the appropriate function and compare the performance with your Assembly and C
functions. Regardless to say, the output of all three implementations should match when
functions have the same coefficients. Similar to the assembly part, the prototype of your C-function
should look as follows:
XXXX_math(InputArray, OutputArray, Length)

Here:

• InputArray is the array of measurements (address)
• OutputArray is the array of values obtained by calculations (address)
• Length specifies the length of the input data array to process

The deliverables of this part are:

1. An assembly based math function “asm_math”
2. A C based math functions “C_math”
3. A CMSIS-DSP based math functions “CMSIS_math”
Function Requirements for all parts
1. Your code should not use registers/variables unnecessarily whenever you can overwrite
registers/variables you do not use anymore.

2. Input array can be any length, but values are always Integers.
3. Your code should have minimum code density; that is; it should consume minimum
memory footprint.
4. You may use the stack for passing parameters if needed (Assembly part)
5. Your code should run as fast as possible. You should optimize beyond your initial crude
implementation.

6. Your code should make use of modular design and function reuse whenever possible
7. Codes will be compared against each other for the fastest speed and lowest memory usage
(Registers) during demo time. Those who achieve best results will get the highest demo
grades.
8. The subroutine should be robust and correct for all cases. Grading of the experiment will
depend on how efficiently you solve the problem, with all the corner cases (if any) being
correct.

9. All registers specified by ARM calling convention are to be preserved, as well as the stack
position. It should be unaffected by the subroutine call upon returning.
10. The calling convention will obey that of the compiler.
11. The subroutine should not use any global variables besides any that we have specified (if
any).
5

Reference Material and Required Reading

• Doc_01 – Cortex-M4 Devices – Generic User Guide
• Doc_02 – Cortex-M4 Programming Manual
• Doc_04 – The Definitive Guide to ARM Cortex-M3 and Cortex-M4 Processors, 3rd Edition
(Optional)

• Doc_07 – Procedure Call Standard for ARM Architecture
• Doc_08 – ARM® and Thumb®-2 Instruction Set
• Doc_09 – Vector Floating Point Instruction Set
• Doc_16 – Debugging with Keil
• Doc_17 – Introduction to Keil 5.xx

Demonstration and Documentation

On demonstration day, you will be given an input test array InputArray[] of a known length as
well as coefficients. Your code should generate the correct output for that specific array.
The demonstration involves showing your source code and demonstrating a working program.

You should be able to call your subroutine several times consecutively. You should be able to
explain what every line in your code does – there will be questions in the demo dealing with all the
aspects of your code and its performance.

The questions might regard the skeleton code to initiate
and start the assembly code that we gave you in Tutorial 1; ask if you do not know!
It is by far the most efficient that you have the full documentation on your assembly code
subroutine completed upon demonstrating the Lab 1, especially that future labs will require lots of
documentation and additional code development on its own.

1. Demos will be held on Friday, February 2nd, 2018. This lab has a weight of 8 marks.
There is NO report associated with this lab. The whole project should be submitted
for review, under the folder name as Gxx_LAB1, where xx is your group number.

2. Demo slots will be announced and students will reserve their own preferred slot on Friday.
Students should show up and be ready for demo 10 minutes prior to their reserved slot.
T.A.s will not wait for you if you are late and will move on to the next ready group. You
will demo on Monday with a huge penalty so make sure you are ready on time.

ECSE426 Lab 1 FIR Filter and Mathematical calculations solved

Download Details:

Description

Objective

Preparation for the Lab

Background on FIR filter

Part I – FILTER:

Part II – Assembly implementations

Your assembly function will take three parameters:

ARM CALLING CONVENTION

Part III – Performance Evaluation against C

Here:

The deliverables of this part are:

Reference Material and Required Reading

Demonstration and Documentation

ECSE426 Lab 1 FIR Filter and Mathematical calculations solved

Download Details:

Description

Objective

Preparation for the Lab

Background on FIR filter

Part I – FILTER:

Part II – Assembly implementations

Your assembly function will take three parameters:

ARM CALLING CONVENTION

Part III – Performance Evaluation against C

Here:

The deliverables of this part are:

Reference Material and Required Reading

Demonstration and Documentation

Related products

ECSE426 Lab 2 Analog Data Acquisition, Filtering, and Digital I/O solved