Efficient CNNs

January, 2018

Deep neural networks, while unreasonably effective for several vision tasks, are limited in practice by their computational and memory requirements during both training and inference. Analyzing and improving the connectivity patterns between the layers of a network has produced several compact architectures, such as GoogLeNet, ResNet and DenseNet-BC. We survey the most recent developments that give CNN architectures better error vs. resource (parameters/FLOPs/energy/memory/fps) tradeoffs.

We also plan to learn how to implement CNNs efficiently on existing processing architectures. The most important processor at present is the GPU, so we plan to learn some GPU programming as well.


We meet for 3 hours every week. Broadly, each meeting will be divided into:

  • 1 hr on basics and pre-2017 techniques
  • 1 hr on more recent techniques or theory
  • 1 hr on GPU programming (possibly also something about FPGAs or general hardware?)


The topics covered can be broadly classified into the following categories. It is recommended that each participant take one topic and read all the papers related to it. If a particular topic has too many papers, we can find a way to share the load.

  • Explicit Compression Techniques
  • Efficient CNN Designs
  • Efficient Semantic Segmentation Architectures
  • GPU Programming
  • Theory for CNNs


  1. CNN Introduction and Survey | Depthwise Separable Convolutions (Inception, Xception, MobileNet) | Pruning and Quantization Introduction
  2. ResNeXt | GPU Programming 1
  3. FractalNet | Memory Efficient Convolutions (MEC)
  4. CapsuleNet | tvmlang | Semantic segmentation architectures (PSP Module, Atrous convolutions from DeepLab v3)
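
As a rough illustration of the error vs. resource tradeoff behind the depthwise separable convolutions in item 1 of the list above, the sketch below counts the weights of a standard convolution and of its depthwise separable counterpart (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution). The function names and the layer sizes are hypothetical and chosen only for this example; biases are ignored, and none of this is taken from a specific paper's code.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 128 input channels, 256 output channels.
k, c_in, c_out = 3, 128, 256
std = standard_conv_params(k, c_in, c_out)        # 294912
sep = depthwise_separable_params(k, c_in, c_out)  # 33920
ratio = sep / std                                 # equals 1/c_out + 1/k**2

print(std, sep, round(ratio, 4))
```

For this layer the separable version needs roughly 11.5% of the weights. The same ratio, 1/c_out + 1/k^2, applies to multiply-adds at a fixed output resolution, which is why 3 x 3 separable layers are often quoted as being about 8 to 9 times cheaper.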


  • Ameya Prabhu
  • Aniruddha Vivek Patil
  • Sriharsha Annamaneni
  • Aaron Varghese
  • Vallurupalli Nikitha
  • Sudhir Kumar Reddy
  • Soham Saha
  • Ashutosh Mishra