If you are interested in working with me, please read my First Contact before you email me.
Vendors have closely integrated reconfigurable logic (e.g., FPGAs) into multicores, enabling the software community to realize accelerators for specific program regions. We propose a methodical LLVM-based compiler approach to answer the question "what is acceleratable?" and to help software developers with early-stage exploration of acceleration targets. Our hypothesis is that an entirely new, program-execution-based abstraction is needed to extract acceleratable regions from programs and hand them to hardware synthesis tools. [IISWC'16, HPCA'16]
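One simple way to picture early-stage exploration of acceleration targets is profile-driven coverage ranking: run the program, count how often each region executes, and rank regions by their share of dynamic execution. The sketch below is illustrative only and not the compiler infrastructure from the cited work; the function and input names are hypothetical.

```python
from collections import Counter

def rank_candidate_regions(block_trace, region_of_block):
    """Rank candidate acceleration regions by dynamic coverage.

    block_trace: sequence of basic-block ids observed in a profiled run.
    region_of_block: maps each block id to the program region containing it.
    Returns (region, coverage) pairs sorted by the fraction of dynamic
    block executions each region accounts for -- a first-cut signal for
    "what is acceleratable?".
    """
    counts = Counter(region_of_block[b] for b in block_trace)
    total = sum(counts.values())
    return [(region, n / total) for region, n in counts.most_common()]
```

In practice a real tool would weight blocks by instruction count and account for data movement, but even this crude ranking surfaces the hot regions worth examining first.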
Chip designers have shown increasing interest in integrating specialized fixed-function coprocessors. With the increasing energy cost of wires and caches relative to compute operations, it is imperative to optimize data movement in order to retain the energy benefits of accelerators. We have developed a lightweight coherent cache hierarchy for accelerators that optimizes data movement, and we are studying coherence-based memory models for both GPUs [HPCA'13] and fixed-function coprocessors [ISCA'15].
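To make the coherence angle concrete, the sketch below encodes a textbook MSI (Modified/Shared/Invalid) state machine for a single cache line: the kind of protocol logic any coherent cache hierarchy, including a lightweight one for accelerators, must implement. This is a generic illustration, not the specific protocol developed in the cited work.

```python
# Transition table for a minimal MSI coherence protocol, one cache line.
# Local events: 'load', 'store'. Remote events (observed on the bus/
# directory): 'remote_load', 'remote_store'. Missing entries keep state.
MSI = {
    ('I', 'load'):  'S',         # read miss: fetch line in Shared
    ('I', 'store'): 'M',         # write miss: fetch line in Modified
    ('S', 'store'): 'M',         # upgrade: invalidate other sharers
    ('S', 'remote_store'): 'I',  # another cache writes: invalidate
    ('M', 'remote_load'):  'S',  # another cache reads: write back, downgrade
    ('M', 'remote_store'): 'I',  # another cache writes: write back, invalidate
}

def next_state(state, event):
    """Return the next MSI state; unlisted (state, event) pairs are no-ops."""
    return MSI.get((state, event), state)
```

The data-movement cost the paragraph refers to shows up in exactly these transitions: every downgrade or invalidation of an M line implies a write-back, which is the traffic a lightweight accelerator coherence design tries to minimize.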
Recent Publications and Talks
GPGPU-Sim + Wisconsin Ruby
Simulation infrastructure and workloads for Temporal Coherence
Simulation infrastructure for modelling hardware accelerator coherence
- Program Committee Member, ASPLOS 2017, MICRO 2016, HPCA 2015.
- Local Arrangements Chair, MICRO 2012.
Recently Graduated Students
PhD (Total: 1)
- Kumar, Snehasish. Generalized methods for application-specific hardware specialization. (Summer 2013 - Spring 2017)
MSc (Total: 4)
- Vedula, Naveen. Leveraging Compiler Alias Analysis to Free Accelerators from Load-Store Queues. (Fall 2014 - Fall 2016)
- Sharifian, Amirali. Specialized Macro-Instructions for von Neumann Accelerators. (Fall 2014 - Fall 2016)