
TODO AFTER FIRST CLASS

  • Form a group (3-4 students) and pick a topic for your presentation (topics are listed below under Group Presentation Topics).
  • Each student will be expected to present for 15-20 minutes.

Syllabus (Google Colaboratory)

  • Familiarize yourself with TensorFlow and how to create DNNs (see the sketch after this list).
  • TensorFlow tutorials
  • Clone the notebooks to your Drive before running them.
  • To run a notebook:
    • Reset all runtimes.
    • Run all.
    • Select TPU under Runtime -> Change runtime type. Do you notice an improvement in speed?
    • What is a TPU? (You will learn about it in class.)
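
As a quick sanity check of the Colab setup, the sketch below builds a small Keras DNN and runs it under a TPU distribution strategy when one is attached, falling back to CPU/GPU otherwise. This is a minimal example assuming TensorFlow 2 on Colab; the MNIST model and its hyperparameters are illustrative and are not taken from the course notebooks.

    import tensorflow as tf

    # Use the TPU if the Colab runtime has one attached
    # (Runtime -> Change runtime type -> TPU); otherwise fall
    # back to the default CPU/GPU strategy.
    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        strategy = tf.distribute.TPUStrategy(resolver)
        print("Running on TPU")
    except (ValueError, tf.errors.NotFoundError):
        strategy = tf.distribute.get_strategy()
        print("No TPU found; running on CPU/GPU")

    # A toy fully connected DNN on MNIST; the architecture is an
    # arbitrary example, not the one used in the course notebooks.
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(10, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])

    model.fit(x_train, y_train, epochs=2, batch_size=128)
    model.evaluate(x_test, y_test)

Timing model.fit under both the TPU runtime and the default runtime is one way to answer the speed question above.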

Syllabus (Slides)

Lecture notes

Links

  • Deep Learning for Computer Architects (chapters 1-3) Link
  • Efficient Processing of Deep Neural Networks: A Tutorial and Survey Link

Lecture notes

Links

  • Google Colaboratory Slides (Linear Model and Convolution layer) Link

Lecture notes

Links

  • Google Colaboratory Slides (Keras Model and Inception layers) Link

Lecture notes

Links

  • FFT Convolutions are Faster than Winograd on Modern CPUs, Here’s Why Link
  • Optimizing CNN Model Inference on CPUs Link
  • Optimizing NN on ARM CPUs

Lecture notes

Links

  • Introduction to DNN accelerators Link
  • A primer on DNN Dataflows Link
  • MAESTRO Analytical Model Link

Lecture notes

Links

  • DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning Link
  • Google TPU Link
  • Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks Link
  • MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects Link

Lecture notes

Links

  • Cambricon: An Instruction Set Architecture for Neural Networks Link
  • Learning to Optimize Tensor Programs (AutoTVM) Link
  • Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator Link
  • VTA: An Open Hardware-Software Stack for Deep Learning Link
  • From High-Level Deep Neural Models to FPGAs (DNN weaver) Link

Group Presentation Topics

Lecture notes

Links

  • UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition Link
  • EIE: Efficient Inference Engine on Compressed Deep Neural Network Link
  • Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing Link
  • SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks Link
  • Fine-grained accelerators for sparse machine learning workloads Link
  • A Sparse Matrix Vector Multiply Accelerator for Support Vector Machine Link

Lecture notes

Links

  • Timeloop: A Systematic Approach to DNN Accelerator Evaluation Link
  • Tangram: Optimized Coarse-Grained Dataflow for Scalable NN Accelerators Link
  • Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach Link

Lecture notes

Links

  • Stripes: Bit-Serial Deep Neural Network Computing Link
  • Mixed Precision Training Link
  • Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent Link
  • Gist: Efficient Data Encoding for Deep Neural Network Training Link

Lecture notes

Links

  • Survey Link
  • Fast-Classifying, High-Accuracy Spiking Deep Networks Through Weight and Threshold Balancing Link
  • A Case for Neuromorphic ISAs Link
  • IBM TrueNorth Link
  • Demonstrating Advantages of Neuromorphic Computation: A Pilot Study Link

Lecture notes

Links

  • DRISA: A DRAM-based Reconfigurable In-Situ Accelerator Link
  • ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars Link
  • Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks Link