Cortex-M And Classical Series ARM Architecture Comparisons

By Guruprasad Vadhiraj Putty

1 ABSTRACT

ARM has introduced many processors. Each group of processors has a different core and different features. A new entrant or designer working with ARM can use this paper to understand the differences easily and to choose a processor that is well suited to the requirements. This paper gives a brief comparison of the architectures.

2 INTRODUCTION

There are many papers on ARM today, but most of them compare performance or describe the improvements made over the previous architecture. This paper brings out the architectural differences between the classical ARM processors and the Cortex-M3. The classical ARM series refers to processors from ARM9 to ARM11. The paper tries to explain each module and its usability for industrial control systems. Along the way, questions are raised about some modules. Whether a module is relevant to the requirements is left to the end user. Power management is not covered in detail.

3 COMPARISON OF ARCHITECTURES
Note: In the classical series, ITCM, DTCM, cache, MMU and MPU will not all be present in a single core. If a core has ITCM, DTCM and an MPU, then it will not have a cache and an MMU. Each block is explained below with respect to industrial control systems.

3.1 CLASSICAL SERIES

3.1.1 Instruction Tightly Coupled Memory (ITCM)

ITCM is useful where instruction fetch timing must be deterministic. In other words, the number of cycles required to read an instruction stays constant, and the fetch is fast. There is a disadvantage with this method: boot code is needed to read the code from external memory, or to download it through USB, into SRAM, and the code then has to be copied again into ITCM. ITCM is not useful when the code size becomes too large or when an OS is used. Industrial control systems mostly prefer to run the code from built-in flash or external flash.

3.1.2 Data Tightly Coupled Memory (DTCM)

DTCM is highly useful for storing data, because it gives faster access. Data segments can be assigned a DTCM address in the linker script file. DTCM always accompanies ITCM; there is no processor where only the DTCM block is available.

3.1.3 Cache

The cache plays an important role if the code or data resides in external memory. Though the number of cycles needed to read an instruction or data varies depending on a cache hit or miss, the cache greatly improves performance, especially in the case of an operating system. If you are loading only firmware without an operating system, then the cache is of less use.

3.1.4 Write Buffer

The cache in combination with the write buffer determines the type of cache, which is either write-back or write-through. A write buffer without a cache lets the processor write to external memory without waiting for the write operation to complete. The cache and write buffer modules are useful when external memory is used.
Industrial control systems make extensive use of the flash within the chip for programming and running the code, and use EEPROM (which can be accessed over I2C) for storing the database and industrial parameters. Yes, there are cases where external memory is used; for such scenarios there are chips in both the Cortex series and the classical series with a built-in memory controller. The classical ARM series provides a better option here, because DTCM can be used for storing the data.

3.1.5 Co-processor

The co-processor is relevant when the MMU, cache, ITCM, DTCM and MPU need to be used (in Cortex, the MPU has been designed so that it can be used without a co-processor). Its utility depends on the type of application. For example, if the application is a temperature measurement or fault detection system, then the absence of a co-processor will not have much impact.

3.1.6 MMU and MPU

When Linux or another operating system has to be ported, the classical series is best suited. Not all classical cores have an MMU, so you need to select a chip that has one. The MMU maps the virtual address, which is the address used by the OS, to the physical address understood by the memory controllers. uC/OS can be ported to the Cortex series, and this makes use of the MPU. Using the MPU you can mark certain sections of memory as No Access, Read Only or Read Write.

I was working in a semiconductor company and was part of the team designing a chip for a specific network protocol. The protocol was still being standardized and our company was part of the forum. We were asked to analyze the performance of the chip. We started writing the firmware, used software queues for message receive and transmit, did not use any operating system, and made all the events interrupt driven. My question is: are we using an operating system just because we need to use an operating system? The debate is not about the usefulness of an operating system but about where it is best applied.
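The interrupt-driven design with software queues described above can be sketched as a small ring buffer shared between an interrupt handler and the main loop. This is only an illustration under stated assumptions, not the firmware from that project: the names `msg_queue_t`, `msg_queue_put`, `msg_queue_get` and the depth `MSG_QUEUE_SIZE` are hypothetical, and the sketch assumes a single core with exactly one producer (the ISR) and one consumer (the main loop).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue depth; it must be a power of two so that the
 * free-running indices can be reduced with a simple mask. */
#define MSG_QUEUE_SIZE 8u

typedef struct {
    volatile uint32_t head;  /* advanced only by the producer (ISR)       */
    volatile uint32_t tail;  /* advanced only by the consumer (main loop) */
    uint8_t buf[MSG_QUEUE_SIZE];
} msg_queue_t;

/* Called from the receive interrupt. Returns false if the queue is full. */
bool msg_queue_put(msg_queue_t *q, uint8_t byte)
{
    if ((uint32_t)(q->head - q->tail) == MSG_QUEUE_SIZE) {
        return false;                       /* full: the caller drops or counts it */
    }
    q->buf[q->head & (MSG_QUEUE_SIZE - 1u)] = byte;
    q->head = q->head + 1u;                 /* publish only after the data is stored */
    return true;
}

/* Called from the main loop. Returns false if the queue is empty. */
bool msg_queue_get(msg_queue_t *q, uint8_t *out)
{
    if (q->head == q->tail) {
        return false;                       /* empty */
    }
    *out = q->buf[q->tail & (MSG_QUEUE_SIZE - 1u)];
    q->tail = q->tail + 1u;
    return true;
}
```

In such a design the receive ISR would call `msg_queue_put` for each incoming byte, and the main loop would poll `msg_queue_get` and act on the messages, so no scheduler is involved. Because each index has a single writer, the sketch does not disable interrupts; a design with several producers would need additional protection.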
I met a design engineer at a power station, and during our discussion he explained the design to me in broad terms. They were using a 32-bit processor running Linux. Being a firmware guy, I asked: why an OS? The reply was that each module delivers messages, and these are used by the tasks for control operations. In such cases, where the messages from each module are critical and numerous, an OS is best suited. The same cannot be said for operations involving temperature, a real-time clock, a UART and so on. Finally, it is for the end designer to carefully understand the requirements and needs.

3.1.7 Interrupts

Three parameters are always discussed when the topic of interrupts is raised:
Latency: Cortex scores over the classical series here. Its interrupt latency is lower because it can fetch the vector and branch to the ISR address while an LDM or STM instruction is still executing. The classical series (other than the ARM1176, which also abandons LDM and STM) completes the execution of the LDM or STM before branching to the ISR, and in such cases the ISR latency is determined by the number of registers specified in the instruction.

Other advantages of Cortex M
Number of interrupts: The Cortex core supports up to 240 interrupts (the actual number depends on the chip manufacturer), and a priority can be set for each interrupt. The Atmel AT91RM9200 SoC (which uses the ARM920T) provides many interrupt sources such as timer, RTC and UART, which is the same as what we find in the NXP LPC1768, which uses the Cortex-M3 (I just took these as examples). For the firmware developer there is no difference between the two. So the conclusion that Cortex-M is advantageous because it has an NVIC is arguable. Other features, such as the pending status, are the same as the pending register in the AT91RM9200, and both convey the same information.

Preemption: You have this luxury in the Cortex series. It has been designed such that if the first instruction of the ISR has not yet executed and a higher priority interrupt is raised, the core fetches the higher priority vector address and branches to it. The Cortex-M3 is mainly targeted at industrial applications where the events are mostly interrupt driven and preemption is essential, and this block is well suited to that.

3.2 CORTEX M series

Cortex M3 architecture blocks
3.2.1 System Block

All key control and status features are handled by this block, such as software reset, power management, fault status information and system exceptions. Some classical ARM cores have a block called the system controller, which communicates with the Bus Interface Unit to stall the processor while an AHB access is performed. An example is writing to external memory.

3.2.2 System timer

This is specifically designed for use by an operating system. It is a 24-bit counter. Let us consider two cases here:

a. When an OS is not ported: In this scenario the system timer will not have much use. If you think the firmware can make use of it, note that most Cortex-M3 chips, like the NXP LPC1768 and the Atmel SAM3S series, provide more than two timers/counters (the NXP part has four 32-bit timers and the Atmel part has six 16-bit timers).

b. When an OS is ported: If the application does not consume all the timers, then the OS can very well use one of the timers that are available. The system timer is not essential.

My observation is that the system timer is a luxury but not a necessity.

3.2.3 Nested Vectored Interrupt Controller

The improvement in the interrupt handling mechanisms of Cortex has already been explained. The further advantages in Cortex are tail chaining and the handling of late-arriving interrupts.

4 Instruction Set Architecture and reverse compatibility

Cortex supports the Thumb-2 instruction set, which is a blend of 32-bit and 16-bit instructions. Though Thumb-2 is advantageous, code written for the Cortex series cannot be ported as-is to ARM9, ARM10 and those ARM11 processors that do not have Thumb-2 support, because those cores cannot execute the 32-bit Thumb-2 encodings. In assembly source the .W suffix selects the 32-bit encoding; for example, ADD.W is a 32-bit instruction while ADD can be a 16-bit instruction. Some changes are required for reverse portability. The same applies to code written for the classical series that is to be ported to the Cortex series. The advantage of Cortex is not restricted to Thumb-2; it also has some bit manipulation instructions.

The only question mark is: can we write inline assembly? The answer is probably no with an inline assembler that accepts only ARM-state instructions, because Cortex-M has only Thumb mode. Toolchains whose inline assembler accepts Thumb-2 instructions, such as GCC, do not have this restriction.

Now let's focus on the various models of the two architectures.
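Whether inline assembly is possible in Thumb mode depends on the toolchain. As a hedged sketch, assuming GCC or Clang (whose extended `asm` syntax accepts Thumb-2 instructions), one of the Thumb-2 bit manipulation instructions, RBIT, can be wrapped as shown below. The function name `reverse_bits32` is hypothetical, and the portable fallback is included only so that the same source also builds on a non-ARM host.

```c
#include <stdint.h>

/* Reverse the bit order of a 32-bit word.
 * On a Thumb-2 target built with GCC or Clang this uses the RBIT
 * instruction through inline assembly; elsewhere it falls back to a loop. */
uint32_t reverse_bits32(uint32_t value)
{
#if defined(__GNUC__) && defined(__thumb2__)
    uint32_t result;
    __asm__ ("rbit %0, %1" : "=r" (result) : "r" (value));
    return result;
#else
    uint32_t result = 0u;
    for (int i = 0; i < 32; i++) {
        result = (result << 1) | (value & 1u);  /* move the lowest bit to the top side */
        value >>= 1;
    }
    return result;
#endif
}
```

With other compilers the syntax differs, and some inline assemblers may not accept Thumb instructions at all, so the toolchain documentation should be checked before relying on this.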
5 Programmers model

The table below gives the differences between the two architectures.
All exceptions are taken in handler mode on the Cortex, while in the classical series the mode can be abort, undefined, FIQ or IRQ. To summarize, the Cortex has simplified the processor modes for easy implementation. Talking about the stack, there are two kinds of stack as explained above: handler mode always uses the main stack, and thread mode can use either the main stack or the process stack. This can be configured in the CONTROL register. It is simpler for a firmware engineer to work with Cortex-M than with the classical series.

6 Exception Model and Fault Handling
The priorities of the interrupts will not be discussed in detail, as this paper concentrates mainly on architecture comparisons. Cortex-M has a better description of the exceptions, and fault analysis has been made simpler than in the classical series.

Let us consider the prefetch abort in the classical series. This exception is raised when the processor tries to fetch an instruction from a memory region whose attributes have been set to No Access by the MPU, or when the address given by the processor for the fetch does not exist. In Cortex-M, the exception caused by the MPU attributes is called the Memory Management exception, and the exception caused by a bad address is classified as a Bus Fault. So the cause of the exception can be clearly identified in Cortex-M.
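In software, this split can be read from the Configurable Fault Status Register (CFSR) of the Cortex-M3, located at address 0xE000ED28, whose low byte reports Memory Management faults, second byte reports Bus Faults and upper half reports Usage Faults. The sketch below is an illustration and not part of the original paper: the names `fault_class_t`, `classify_fault` and `is_instruction_fetch_fault` are hypothetical, and the functions take the register value as a parameter so that they can also be exercised off-target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of the Cortex-M3 Configurable Fault Status Register (CFSR). */
#define CFSR_MMFSR_MASK 0x000000FFu  /* Memory Management fault status */
#define CFSR_BFSR_MASK  0x0000FF00u  /* Bus Fault status               */
#define CFSR_UFSR_MASK  0xFFFF0000u  /* Usage Fault status             */
#define CFSR_IACCVIOL   (1u << 0)    /* MPU violation on instruction fetch */
#define CFSR_IBUSERR    (1u << 8)    /* bus error on instruction fetch     */

typedef enum {
    FAULT_NONE,
    FAULT_MEMMANAGE,  /* caused by MPU attributes       */
    FAULT_BUS,        /* caused by a bad address or bus */
    FAULT_USAGE       /* e.g. an undefined instruction  */
} fault_class_t;

/* Classify a CFSR value. On target the value would be read with
 *   uint32_t cfsr = *(volatile uint32_t *)0xE000ED28u;
 * If several fields are set, Memory Management is reported first. */
fault_class_t classify_fault(uint32_t cfsr)
{
    if (cfsr & CFSR_MMFSR_MASK) { return FAULT_MEMMANAGE; }
    if (cfsr & CFSR_BFSR_MASK)  { return FAULT_BUS; }
    if (cfsr & CFSR_UFSR_MASK)  { return FAULT_USAGE; }
    return FAULT_NONE;
}

/* True for the two cases that a classical core reports together
 * as a prefetch abort. */
bool is_instruction_fetch_fault(uint32_t cfsr)
{
    return (cfsr & (CFSR_IACCVIOL | CFSR_IBUSERR)) != 0u;
}
```

A fault handler could call `classify_fault` on the register value and log or branch on the result, which is the kind of distinction a single prefetch abort handler cannot make without further decoding.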
All exceptions use the main stack, while in thread mode there is an option to select the main stack or the process stack. Handling exceptions looks simpler and is easy to implement. The only debatable point is that the priorities of exceptions are configurable in Cortex (other than Reset, Non Maskable Interrupt and Hard Fault). My observation is that all processor and processor-related hardware exceptions (except external interrupts) should have a fixed priority, as in the classical series.

7 Fault Handling

Fault and exception handling is greatly simplified in the Cortex-M series. You do not need to subtract 4 or 8 (for base updated data abort models) from the link register in the handler, or do STMFD or LDMFD in the exception routine. All of this is taken care of internally, and the handler only needs to load the PC with LR.

8 Power Management

This requires reading the data sheets of the controllers that use Cortex-M and the classical series. The details may vary, but I would like to add the types of modes available in the two series.

Cortex series has
The NXP LPC1768 provides two further modes: power-down and deep power-down.
The selection depends on the requirements and the application.

9 Miscellaneous
10 Conclusion

The classical ARM series can be selected when
The Cortex-M3 can be selected when the selection criteria are
11 REFERENCES
12 CONTACT

Guruprasad
All material on this site Copyright © 2017 Design And Reuse S.A. All rights reserved.