For real-time systems, detecting and recovering from transient faults has become an important issue in improving reliability. To detect control flow errors (CFEs) and recover from the data errors they cause, previous works check signatures and set a checkpoint that makes a complete backup of the data in each basic block. However, frequent checks and complete data backups can reduce performance. Moreover, after an error occurs, copying the data from checkpoint storage back to the registers adds recovery overhead that is not necessary.
The proposed technique has three main ideas. First, based on the results of a sensitivity analysis, it performs a critical signature assertion only for the basic blocks that have a higher probability of a CFE. Second, in the checkpoint phase, it backs up only the registers whose data might be corrupted. Third, in the recovery phase, it reads the checkpoint values directly from checkpoint storage for execution. In the experimental results, the proposed technique reduces the memory overhead of additional instructions by a factor of about 2.4 and the performance overhead of additional execution by a factor of up to about 11.9. It also shows a difference in error correction latency of about 20 cycles in the best case, but its fault coverage is about 1.5% lower.
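The three ideas can be illustrated with a small software model. The following Python sketch is not the implementation evaluated here. The `Block` structure, the `THRESHOLD` value, the per-block sensitivity score, and the choice to back up exactly the registers a block writes are all assumptions made for illustration.

```python
from dataclasses import dataclass

THRESHOLD = 0.5  # hypothetical cut-off for "higher probability of a CFE"


@dataclass
class Block:
    name: str
    signature: int       # expected signature of the basic block
    sensitivity: float   # hypothetical score from sensitivity analysis
    writes: tuple        # registers the block may modify (assumed known)


def is_critical(block):
    """Idea 1: only blocks above the threshold get a signature assertion."""
    return block.sensitivity >= THRESHOLD


def check_signature(block, runtime_signature):
    """Return True if a CFE is detected; non-critical blocks are not checked."""
    return is_critical(block) and runtime_signature != block.signature


def take_checkpoint(block, regs):
    """Idea 2: back up only the registers this block might corrupt."""
    return {r: regs[r] for r in block.writes}


def read_reg(name, regs, checkpoint, recovering):
    """Idea 3: during recovery, read the saved value straight from
    checkpoint storage instead of first copying it back to the register."""
    if recovering and name in checkpoint:
        return checkpoint[name]
    return regs[name]


# Usage: a critical block that writes r1, followed by a simulated CFE.
regs = {"r0": 1, "r1": 2, "r2": 3}
critical = Block("B1", signature=0xA5, sensitivity=0.9, writes=("r1",))
ckpt = take_checkpoint(critical, regs)       # only r1 is saved
regs["r1"] = 999                             # simulated data corruption
detected = check_signature(critical, 0x00)   # wrong runtime signature
value = read_reg("r1", regs, ckpt, recovering=detected)

# A non-critical block is never asserted, so a CFE in it goes undetected.
cold = Block("B2", signature=0x3C, sensitivity=0.1, writes=("r2",))
missed = check_signature(cold, 0x00)
```

In this model the skipped assertion on `B2` is what trades fault coverage for lower overhead, and `read_reg` avoids the restore copy during recovery.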