||With the improvement in VLSI technology, realization of multiple processor cores on a single chip becomes easier. Therefore, more and more users execute applications on current multi-core architectures. The multi-core system has a brilliant performance in executing multi-threaded applications, but this system could not gain any performance in single-threaded applications. This paper proposes a multi-core architecture for enhancing single-threaded performance in embedded system, and focuses on four points:|
1. Construct a simple out-of-order execution core.
2. Design a dynamically scheduled instruction analyzer.
3. Design a mechanism for sharing operands between two cores.
4. Design a mechanism for committing instructions synchronously between two cores.
The architecture of each core is single-issue out-of-order instruction pipe. First, instruction analyzer will fetch instructions and generate instruction dependence tags by detecting the dependencies among the fetched instructions, then schedule instructions dynamically and dispatch to the cores. In the core, instructions can know where to get required operands according to the information of instruction tags, this mechanism enables data can be shared between two cores. Instructions are executed by data-driven approach, but in-order complete to maintain the correctness of the program order. Based on ARM instruction set, this paper tries to explore ways to achieve interaction control mechanisms between two cores and to accelerate a single-thread in the dual-core architecture.
We write a simulation model of the proposed architecture in C language as our trace-driven simulation framework and the MediaBench suite is selected for the experiments. According simulation result, the architecture can obtain average 40% performance speedup comparing to the five-stage pipelined architecture.