TensorFlow
TODO AFTER FIRST CLASS
- Form a group (3-4 students) and pick a topic for presentation (topics are listed below under Group Presentation Topics).
- Each student will be expected to present for 15-20 minutes.
Syllabus (Google Colaboratory)
- Familiarize yourself with TensorFlow and how to create DNNs
- TensorFlow tutorials
- Clone notebooks to your drive before running them
- To run a notebook:
- Reset all runtimes.
- Run all.
- Select TPU (Runtime -> Change runtime type). Do you notice an improvement in speed?
- What is a TPU? (You will learn about it in class).
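Before running the notebooks, it helps to recall what a single DNN layer actually computes; a minimal NumPy sketch of a fully connected layer with ReLU (shapes are illustrative, not taken from the course notebooks):

```python
import numpy as np

# One Dense layer: y = relu(W @ x + b). Sizes below are illustrative only.
rng = np.random.default_rng(0)
x = rng.standard_normal(784)          # e.g. a flattened 28x28 input
W = rng.standard_normal((128, 784))   # weight matrix
b = np.zeros(128)                     # bias vector

y = np.maximum(W @ x + b, 0.0)        # ReLU activation
assert y.shape == (128,)
assert (y >= 0).all()                 # ReLU output is non-negative
```

A TPU (or GPU) accelerates exactly this kind of dense matrix arithmetic, which is why switching the Colab runtime type changes notebook run times.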
Syllabus (Slides)
Lecture notes
Links
- Deep Learning for Computer Architects (chap 1-3)
Link
- Efficient Processing of Deep Neural Networks: A Tutorial and Survey
Link
Lecture notes
Links
- Google Colaboratory Slides (Linear Model and Convolution layer)
Link
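The convolution layer covered in these slides is just a sliding dot product; a minimal direct (valid-padding) 2D convolution sketch in NumPy, with example values chosen here for illustration:

```python
import numpy as np

def conv2d(image, kernel):
    """Direct (valid) 2D convolution: slide the kernel over the image
    and take a dot product at each position."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

img = np.arange(16.0).reshape(4, 4)
k = np.ones((3, 3))   # box filter: each output is the sum of a 3x3 window
print(conv2d(img, k)) # [[45. 54.] [81. 90.]]
```

Framework layers such as Keras's Conv2D add batching, channels, and learned kernels on top of this same inner computation.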
Lecture notes
Links
- Google Colaboratory Slides (Keras Model and Inception layers)
Link
Lecture notes
Links
- FFT Convolutions are Faster than Winograd on Modern CPUs, Here’s Why
Link
- Optimizing CNN Model Inference on CPUs
Link
- Optimizing NN on ARM CPUs
Lecture notes
Links
- Introduction to DNN accelerators
Link
- A primer on DNN Dataflows
Link
- MAESTRO Analytical Model
Link
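The dataflow readings above describe accelerators as loop nests over the same computation: reordering the loops changes which operand stays resident in a PE's local storage, not the result. A toy sketch contrasting two schedules for y = W @ x (terminology borrowed loosely from the readings; this is not any paper's exact hardware model):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((4, 8))
x = rng.standard_normal(8)

# Output-stationary flavor: each y[m] accumulates fully before moving on,
# so the partial sum stays put while weights and inputs stream past.
y_os = np.zeros(4)
for m in range(4):
    for k in range(8):
        y_os[m] += W[m, k] * x[k]

# Weight-stationary flavor: k is outermost, so each x[k] and column W[:, k]
# is fetched once and its contribution is scattered across all outputs.
y_ws = np.zeros(4)
for k in range(8):
    for m in range(4):
        y_ws[m] += W[m, k] * x[k]

assert np.allclose(y_os, y_ws) and np.allclose(y_os, W @ x)
```

Analytical tools like MAESTRO estimate the data-reuse and energy consequences of exactly these kinds of loop-order and tiling choices.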
Lecture notes
Links
- DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning
Link
- Google TPU
Link
- Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks
Link
- MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects
Link
Lecture notes
Links
- Cambricon: An Instruction Set Architecture for Neural Networks
Link
- Learning to Optimize Tensor Programs (Auto TVM)
Link
- Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator
Link
- VTA: An Open Hardware-Software Stack for Deep Learning
Link
- From High-Level Deep Neural Models to FPGAs (DNN weaver)
Link
Group Presentation Topics
Lecture notes
Links
- UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition
Link
- EIE: efficient inference engine on compressed deep neural network
Link
- Cnvlutin: Ineffectual-neuron-free deep neural network computing
Link
- SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks
Link
- Fine-grained accelerators for sparse machine learning workloads
Link
- A Sparse Matrix Vector Multiply Accelerator for Support Vector Machine
Link
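Several of the readings above (EIE, SCNN, the SpMV accelerator) save work by storing and processing only nonzero values. A minimal sketch of a CSR (compressed sparse row) matrix-vector multiply, the core kernel these designs accelerate; the example matrix is made up for illustration:

```python
import numpy as np

def csr_spmv(values, col_idx, row_ptr, x):
    """y = A @ x with A in CSR form: only nonzero entries are touched."""
    y = np.zeros(len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y

# A = [[5, 0, 0],
#      [0, 0, 3],
#      [2, 0, 1]]   (4 nonzeros out of 9 entries)
values  = [5.0, 3.0, 2.0, 1.0]   # nonzero values, row by row
col_idx = [0, 2, 0, 2]           # column of each nonzero
row_ptr = [0, 1, 2, 4]           # where each row starts in values
x = np.array([1.0, 2.0, 3.0])
print(csr_spmv(values, col_idx, row_ptr, x))  # [5. 9. 5.]
```

The hardware papers differ mainly in how they parallelize this loop and handle the irregular, data-dependent memory accesses it creates.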
Lecture notes
Links
- Stripes: Bit-Serial Deep Neural Network Computing
Link
- Mixed Precision Training
Link
- Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent
Link
- Gist: Efficient Data Encoding for Deep Neural Network Training
Link
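Stripes, listed above, processes activations bit-serially, so runtime scales with the precision actually needed rather than a fixed word width. A toy shift-and-add sketch of the bit-serial idea (a software analogy, not the paper's hardware):

```python
def bit_serial_mul(w, a, nbits=8):
    """Multiply weight w by an unsigned n-bit activation a, consuming one
    activation bit per 'cycle': acc += bit_i(a) * (w << i)."""
    acc = 0
    for i in range(nbits):
        bit = (a >> i) & 1        # next activation bit
        acc += bit * (w << i)     # shift-and-add partial product
    return acc

assert bit_serial_mul(13, 200) == 13 * 200
```

Fewer activation bits means fewer loop iterations, which is the precision/performance trade-off these readings exploit.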
Lecture notes
Links
- Survey
Link
- Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing.
Link
- A Case for Neuromorphic ISAs
Link
- IBM True North
Link
- Demonstrating Advantages of Neuromorphic Computation: A Pilot Study
Link
Lecture notes
Links
- DRISA: a DRAM-based Reconfigurable In-Situ Accelerator
Link
- ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars
Link
- Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks
Link