CSR: Small: Collaborative Research: Dependable Real-Time Computing on Heterogeneous Chip Multiprocessor Systems


Award Information: 

Sponsor: National Science Foundation
Grant Number: CNS-1422709 (Award Abstract at NSF)
Award Institution: The University of Texas at San Antonio (UTSA)
Period: August 15, 2014 - July 30, 2017 (no-cost extension to July 30, 2018)
Amount: $230,000 + $16,000 (REU supplement)

Status: on-going (as of August 2017).


People: 

  • PI: Dr. Dakai Zhu 
  • Student(s) supported: 
    • Hamidreza Moradi: Ph.D. student
    • Thinh Vo: undergraduate researcher
    • Lauro Perez: REU undergraduate student
    • Abhishek Roy (GMU Ph.D. student): supported through subaward to GMU for two months in summer 2017
  • External collaborators:
    • Dr. Hakan Aydin (George Mason University)


Research Synopsis: 


Chip multiprocessor (CMP) systems, also known as multicore systems, provide multiple processors on a single chip and have displaced single-processor architectures as the de facto standard for computing platforms. This change is due to the superior performance and power efficiency that CMPs offer compared to traditional designs. An emerging feature of the CMP era is the deployment of several different types of processing elements on the same platform, with varying computation speed and power consumption characteristics. An additional complicating factor is the trend toward over-provisioned designs, where only a subset of the available cores can be active at any time due to power and thermal constraints. Such heterogeneous CMPs are increasingly being deployed in systems where applications with different safety assurance (dependability) and timeliness requirements must co-exist on the same CMP. Hence, there is a growing need for an integrated framework that allocates the heterogeneous hardware resources of a CMP among applications in a way that uses the resources efficiently while assuring that the diverse safety and timeliness requirements of the applications are met.

This project aims to develop models, algorithms, and run-time management schemes for collections of applications with a mix of different timing and dependability requirements running on a shared heterogeneous CMP platform. In particular, a central objective is to develop a sound methodology for selectively applying known hardware and software fault tolerance mechanisms (such as modular redundancy, task replication, and re-execution) to such mixed-dependability applications, considering resource, power, and timing constraints simultaneously. A second objective is to extend the framework to tackle the challenge of intermittent run-time faults that occur in bursts and can affect multiple applications at once during a bounded time window. Success in these efforts could improve the safety and reduce the development and production costs of the increasingly complex cyber-physical systems upon which we have all come to depend.


Education and outreach activities include integration of aspects of the research into undergraduate and graduate courses at the two participating institutions, involvement of students as research assistants, and efforts to recruit student participants from under-represented demographic groups. 


Publications: 

  • Journal Articles
    • Yifeng Guo, Dakai Zhu, Hakan Aydin, Jian-Jun Han and Laurence T. Yang, Exploiting Primary/Backup Mechanism for Energy Efficiency in Dependable Real-Time Systems, Journal of Systems Architecture (JSA), vol. 78, pp. 68-80, Aug. 2017. [DOI: 10.1016/j.sysarc.2017.06.008]
    • Mohammad A. Haque, Hakan Aydin and Dakai Zhu, On Reliability Management of Energy-Aware Real-Time Systems through Task Replication, IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 28, no. 3, pp. 813-825, Mar. 2017. [DOI: 10.1109/TPDS.2016.2600595]
    • Hang Su, Dakai Zhu and Scott Brandt, An Elastic Mixed-Criticality Task Model and Early-Release EDF Scheduling Algorithms, ACM Transactions on Design Automation of Electronic Systems (TODAES), vol. 22, no. 2 (Article No. 28), Mar. 2017. [DOI: 10.1145/2984633]
    • Rehana Begam, Qin Xia, Dakai Zhu, and Hakan Aydin, Preference-Oriented Fixed-Priority Scheduling for Periodic Real-Time Tasks, Journal of Systems Architecture (JSA), vol. 69, pp. 1-14, Sept. 2016. [DOI: 10.1016/j.sysarc.2016.07.005]
    • Mohammad A. Haque, Hakan Aydin and Dakai Zhu, Energy-Aware Standby-Sparing for Fixed-Priority Real-Time Task Set, Journal of Sustainable Computing, Informatics and Systems (SUSCOM), vol. 6, pp. 81-93, Jun. 2015. [DOI: 10.1016/j.suscom.2014.05.001]
    • Yifeng Guo, Hang Su, Dakai Zhu, and Hakan Aydin, Preference-Oriented Real-Time Scheduling and Its Application in Fault-Tolerant Systems, Journal of Systems Architecture (JSA), vol. 61, no. 2, pp. 127-139, Feb. 2015. [DOI: 10.1016/j.sysarc.2014.12.001]
  • Conference and Workshop Papers
    • Abhishek Roy, Hakan Aydin and Dakai Zhu, Energy-Efficient Primary/Backup Scheduling Techniques for Heterogeneous Multicore Systems, Proc. of the 8th Int'l Green and Sustainable Computing Conference (IGSC), Oct. 2017
    • Abhishek Roy, Hakan Aydin and Dakai Zhu, Energy-Aware Standby-Sparing on Heterogeneous Multicore Systems, Proc. of the 54th Design Automation Conference (DAC), Jun. 2017
    • Abhishek Roy, Hakan Aydin and Dakai Zhu, On Task Period Assignment in Multiprocessor Real-Time Control Systems, Proc. of the 24th International Conference on Real-Time Networks and Systems (RTNS), Oct. 2016
    • Jian-Jun Han, Xin Tao, Dakai Zhu and Hakan Aydin, Criticality-Aware Partitioning for Multicore Mixed-Criticality Systems, International Conference on Parallel Processing (ICPP), Aug. 2016
    • Hang Su, Peng Deng, Dakai Zhu and Qi Zhu, Fixed-Priority Dual-Rate Mixed-Criticality Systems: Schedulability Analysis and Performance Optimization, IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), Aug. 2016
    • Hang Su, Dakai Zhu and Jiafeng Zhu, On the Implementation of RT-FAIR Scheduling Framework in Linux, Proc. of the 14th IEEE International Conference on Ubiquitous Computing and Communications (IUCC), Oct. 2015