Analyzing multithreaded applications - Identifying performance bottlenecks on multicore systems
Nandan Tripathi and Amrit Singh, Freescale Semiconductor
EETimes (4/7/2011 11:04 AM EDT)
Abstract
Several factors prevent applications from achieving the theoretical maximum utilization of multicore resources: the operating system (scheduling, synchronization, etc.), the application code (parallelization factor, data/function decomposition, etc.), and the scalability of the hardware architecture (cores, memory subsystem, interconnects, etc.).
We use various multithreaded execution scenarios generated with EEMBC's Multibench as stimulus. We introduce a step-by-step methodology for analyzing these scenarios and identifying the bottlenecks. We then discuss the techniques used for kernel tracing, time/function profiling, etc., and the tools used to deploy the methodology. The paper ends with case studies representing different bottlenecks.