INTRODUCTION TO CHIP MULTITHREADING (CMT) |
||
|
WEEK 1
January 8 |
TU |
LECTURE: Course introduction and overview |
|
TH |
Theme: History and Overview of Chip Multithreading
1-A. Operating System Scheduling for Chip Multithreaded Processors [10] (Section I.1)
1-B. Computer Architecture Book [15] (pages 172-181, up to "What limits multiple-issue processors")
1-C. Computer Architecture Book [15] (pages 249-257, Sun T1 multiprocessor) |
|
|
WEEK 2
January 15 |
TU |
Theme: Chip Multithreading = Chip Multiprocessing + Hardware Multithreading
2-A. SMT: A Platform for Next-Generation Processors [8]
2-B. Interleaving [21] (skip Section 6)
2-C. Single-chip multiprocessor [14] |
|
TH |
Theme: Modern CMT processors
2-D. Niagara [19]
2-E. Hyper-threaded Pentium [22] |
|
OVERVIEW OF CMT SYSTEMS RESEARCH |
||
|
WEEK 3
January 22 |
TU |
Theme: CMT performance analysis
3-A. Initial Observations of the SMT Pentium 4 [37]
Theme: Scheduling for CMT systems
3-B. Symbiotic Jobscheduling [32] |
|
TH |
Theme: Runtime support
3-C. Adaptive OpenMP Loop Scheduler [38]
Theme: Dynamic Resource Partitioning in Hardware
3-D. Fair Cache Sharing and Partitioning [18] |
|
|
WEEK 4
January 29 |
TU |
Theme: Performance Modeling
4-A. Methods for Modeling Resource Contention on SMT [25]
Theme: OS-hardware Interaction |
|
TH |
LECTURE: Tools and Techniques for CMT Systems Research |
|
SOFTWARE SCHEDULING ALGORITHMS |
||
|
WEEK 5
February 5 |
TU |
Theme: Sharing-Based Scheduling
5-A. Sharing-Based Thread Placement [36] (Justin)
5-B. Pool-Based Scheduling on NUMA systems [2] (Fernando)
Project consultations |
|
TH |
Theme: Compiler and Runtime Algorithms
5-C. Adaptive Execution Techniques [17] (Mike)
Project consultations |
|
|
WEEK 6
February 12 |
TU |
Theme: Load-Balancing Schedulers in Commercial OS
6-A. Linux Scheduler #1 [26] (Justin)
6-B. Linux Scheduler #2 [31] (Dan)
6-C. Optimizations in Solaris [23], pp. 795-814 (Dan)
Project consultations |
|
TH |
Theme: Performance Predictability and Fairness
6-D. Fair Scheduler [12]
PROJECT ABSTRACTS DUE ON FEBRUARY 15 |
|
RESOURCE PARTITIONING ALGORITHMS |
||
|
WEEK 7
February 19 |
TU |
Theme: Performance Optimization
7-A. Cooperative Caching [5] (Fernando)
7-B. Communist, Utilitarian and Capitalist Cache Policies [16] (Navid) |
|
TH |
Quiz #1 |
|
|
WEEK 8
February 26 |
TU |
Theme: Performance Predictability
8-A. Applications of Thread Prioritization [28] (Navid)
8-B. Predictable Performance in SMT Processors [3] (Mike) |
|
TH |
Theme: Fairness
8-C. Predicting Inter-Thread Contention [4] (Hossein) |
|
|
WEEK 9
March 5 |
TU |
Theme: More on performance
9-A. Utility-based Cache Partitioning [27] (Navid)
Project Progress Reports |
|
TH |
Theme: Fast Single-Thread Execution
9-B. Transparent Threads [7] (Justin) |
|
ARCHITECTURAL SUPPORT FOR SOFTWARE OPTIMIZATIONS |
||
|
WEEK 10
March 12 |
TU |
Theme: Architectural Support for OS-level Optimizations
10-A. Architectural Support for Scheduling [30] (Hossein)
10-B. Memory-Monitoring Scheme for Memory-Aware Scheduling [34] (Mike) |
|
TH |
10-C. Helper Threads on SMT [13] (Sven) |
|
|
WEEK 11
March 19 |
TU |
11-A. Evaluating Performance of Hardware Thread Priorities on SMT [24] (Sven)
11-B. Compatible Phase Scheduling [9] (Dan) |
TRENDS IN APPLICATION DEVELOPMENT |
||
|
WEEK 11
March 19 |
TH |
Theme: Trends in Applications
11-C. Multicore Chips Make Application Development Tough [1]
11-D. The Free Lunch is Over [35]
11-E. Hybrid Transactional Memory [6] (Fernando) |
THE BIG PICTURE |
||
|
WEEK 12
March 26 |
TU |
12-A. Performance/Watt Ratio [20] (Hossein)
12-B. Chip Multithreading: Opportunities and Challenges [33] (Sven) |
|
TH |
Quiz #2 |
|
|
WEEK 13
April 2 |
TU |
Project presentations |
|
TH |
Project presentations |
|
[1] Gary Anthes. Hard Cores: Multicore chips provide power but make app development tough. http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=112303, 2006
[2] Timothy Brecht. An Experimental Evaluation of Processor Pool-Based Scheduling for Shared-Memory NUMA Multiprocessors. In Proceedings of the 3rd Workshop on Job Scheduling Strategies for Parallel Processing, 1997
[3] F. J. Cazorla, Peter M. W. Knijnenburg, R. Sakellariou, E. Fernandez, A. Ramirez, and M. Valero. Predictable Performance in SMT Processors. In Proceedings of the 1st Conference on Computing Frontiers, 2004
[4] D. Chandra, F. Guo, S. Kim, and Y. Solihin. Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture. In Proceedings of the 11th International Symposium on High Performance Computer Architecture, 2005
[5] J. Chang and G. S. Sohi. Cooperative Caching for Chip Multiprocessors. In Proceedings of the 33rd Annual International Symposium on Computer Architecture, 2006
[6] Peter Damron, Alexandra Fedorova, Yosef Lev, Victor Luchangco, Mark Moir, and Daniel Nussbaum. Hybrid Transactional Memory. In Proceedings of the Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems, 2006
[7] G. Dorai and D. Yeung. Transparent Threads: Resource Sharing in SMT Processors for High Single-Thread Performance. In Proceedings of the 11th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2002
[8] Susan J. Eggers, Joel S. Emer, Henry M. Levy, Jack L. Lo, Rebecca M. Stamm, and Dean M. Tullsen. Simultaneous Multithreading: A Platform for Next-Generation Processors. IEEE Micro, September 1997
[9] A. El-Moursy, R. Garg, David Albonesi, and Sandhya Dwarkadas. Compatible phase co-scheduling on a CMP of multi-threaded processors. In Proceedings of the 20th International Parallel and Distributed Processing Symposium, 2006
[10] Alexandra Fedorova. Operating System Scheduling for Chip Multithreaded Processors. 2006
[11] Alexandra Fedorova. System Software Design For Chip Multithreaded Processors. Submitted for review. Not for wide distribution. 2006
[12] Alexandra Fedorova, Margo Seltzer, and Michael D. Smith. A Cache-Fair Operating System Scheduler for Chip Multiprocessors. In preparation for conference submission. Not for wide distribution. 2007
[13] I. Ganusov and M. Burtscher. Efficient Emulation of Hardware Prefetchers via Event-Driven Helper Threading. In Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques, 2006
[14] L. Hammond, B. Nayfeh, and K. Olukotun. A Single-Chip Multiprocessor. Computer, 30(9):79-85, 1997
[15] J. Hennessy and David A. Patterson. Computer Architecture, Fourth Edition: A Quantitative Approach. Morgan Kaufmann, 2006
[16] L. Hsu, S. K. Reinhardt, R. Iyer, and S. Makineni. Communist, Utilitarian, and Capitalist Cache Policies on CMPs: Caches as a Shared Resource. In Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques, 2006
[17] C. Jung, D. Lim, J. Lee, and S. Han. Adaptive Execution Techniques for SMT Multiprocessor Architectures. In Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005
[18] S. Kim, D. Chandra, and Y. Solihin. Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture. In Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2004
[19] Poonacha Kongetira. A 32-way Multithreaded SPARC(R) Processor. In Proceedings of the 16th Symposium On High Performance Chips (HOTCHIPS), 2004
[20] James Laudon. Performance/Watt: the New Server Focus. ACM SIGARCH Computer Architecture News, 33(4):5-13, 2005
[21] James Laudon, A. Gupta, and Mark Horowitz. Interleaving: A Multithreading Technique Targeting Multiprocessors and Workstations. In Proceedings of the Sixth International Conference On Architectural Support For Programming Languages And Operating Systems (ASPLOS), 1994
[22] Deborah T. Marr, Frank Binns, David L. Hill, Glenn Hinton, David A. Koufaty, J. Allan Miller, and Michael Upton. Hyper-threading Technology Architecture and Microarchitecture. Intel Technology Journal, 6(1):4-15, 2002
[23] Richard McDougall and Jim Mauro. Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture. 2nd edition. 2006
[24] M. Meswani and P. Teller. Evaluating the Performance Impact of Hardware Thread Priorities in Simultaneous Multithreaded Processors using SPEC CPU2000. In Proceedings of the 2nd Workshop on Operating System Interference In High Performance Applications, 2006
[25] Tipp Moseley, Joshua L. Kihm, Daniel A. Connors, and Dirk Grunwald. Methods for Modeling Resource Contention on Simultaneous Multithreading Processors. In Proceedings of the International Conference on Computer Design, 2005
[26] Jun Nakajima and Venkatesh Pallipadi. Enhancements for Hyper-Threading Technology in the Operating System - Seeking the Optimal Scheduling. In Proceedings of the Second Workshop on Industrial Experiences with Systems and Software, 2002
[27] M. K. Qureshi and Yale Patt. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches. In Proceedings of the 39th International Symposium on Microarchitecture, 2006
[28] S. E. Raasch and S. K. Reinhardt. Applications of Thread Prioritization in SMT Processors. In Proceedings of the Workshop On Multi-Threaded Execution, Architecture and Compilation, 1999
[29] N. Rafique, W. T. Lim, and M. Thottethodi. Architectural Support for Operating System-driven CMP Cache Management. In Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques, 2006
[30] Alex Settle, Joshua L. Kihm, Andrew Janiszewski, and Daniel A. Connors. Architectural Support for Enhanced SMT Job Scheduling. In Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2004
[31] Suresh Siddha and Venkatesh Pallipadi. Chip Multi Processing Aware Linux Kernel Scheduler. In Proceedings of the Linux Symposium, 2005
[32] Allan Snavely and Dean M. Tullsen. Symbiotic Jobscheduling for a Simultaneous Multithreaded Processor. In Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2000
[33] Lawrence Spracklen and Santosh G. Abraham. Chip Multithreading: Opportunities and Challenges. In Proceedings of the 11th International Symposium on High-Performance Computer Architecture, 2005
[34] G. E. Suh, S. Devadas, and L. Rudolph. A New Memory Monitoring Scheme for Memory-Aware Scheduling and Partitioning. In Proceedings of the 8th International Symposium on High Performance Computer Architecture, 2002
[35] Herb Sutter. The Free Lunch is Over: A Fundamental Turn Towards Concurrency in Software. Dr. Dobb's Journal, 30(3), 2005
[36] Radhika Thekkath and Susan J. Eggers. Impact of Sharing-Based Thread Placement on Multithreaded Architectures. In Proceedings of the 21st Annual International Symposium On Computer Architecture (ISCA), April 1994
[37] Nathan Tuck and Dean M. Tullsen. Initial Observations of the Simultaneous Multithreading Pentium 4. In Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques, 2003
[38] Y. Zhang, M. Burcea, V. Cheng, R. Ho, and M. Voss. An Adaptive OpenMP Loop Scheduler for Hyperthreaded SMPs. In Proceedings of the International Conference on Parallel and Distributed Systems, 2004