The "Black" Art & "Elusive" Science of Training Deep Neural Networks
School of Computing, University of Nebraska-Lincoln
Spring 2021 (3-week session): CSCE 496/896
Synopsis: Deep Learning, or representation learning, is an exciting venture of modern Artificial Intelligence for solving hard problems in computer vision, natural language processing, and speech recognition, to name a few. A Deep Neural Network (DNN) is a gigantic conglomeration of computational units, or artificial neurons, structured hierarchically in many successive layers. DNNs discover hidden patterns in the input data by creating layers of increasingly meaningful representations of the data. To unleash the full potential of DNNs, one must have knowledge of effective DNN architectures, learning algorithms, and optimization strategies. Training DNNs is tricky, no less than summoning a “genie”; many consider it a “black art”. This course will attempt to demystify the training process of DNN models, with an emphasis on modern state-of-the-art DNN architectures. It will explore the “black” art and “elusive” science of training DNNs.
Philosophy: I will advocate my mentor Richard Feynman's philosophy on learning: "What I cannot create, I do not understand." Your understanding of the DNN training issues will be incomplete unless you can implement DNNs such as CNNs or RNNs from scratch (without using any high-level ML libraries such as Scikit-Learn, Keras, or PyTorch). Start by implementing a multi-layer perceptron (MLP) using NumPy, referring to the pseudocode in my GitHub notebook on MLPs; a minimal sketch is given below. Then try implementing a CNN using NumPy, referring to the pseudocode in my GitHub notebook on CNNs.
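To give a flavor of what "from scratch" means, here is a minimal sketch of an MLP in pure NumPy, in the spirit of the pseudocode in the notebook. The architecture, toy XOR dataset, and hyperparameters below are illustrative assumptions of this sketch, not the notebook's exact code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset (XOR): 4 samples, 2 features -- an illustrative stand-in
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 units; Glorot-style scaling for the weights
W1 = rng.normal(0, np.sqrt(2.0 / (2 + 8)), (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, np.sqrt(2.0 / (8 + 1)), (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)          # hidden activations
    p = sigmoid(h @ W2 + b2)          # output probabilities

    # Backward pass: gradients of mean binary cross-entropy
    n = X.shape[0]
    dz2 = (p - y) / n                 # sigmoid + cross-entropy simplification
    dW2 = h.T @ dz2;  db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T
    dz1 = dh * (1.0 - h**2)           # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dz1;  db1 = dz1.sum(axis=0)

    # Plain gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(np.round(p, 3))  # predictions should approach [0, 1, 1, 0]
```

Once the forward/backward mechanics are clear at this scale, swapping the dense layers for convolutions is the natural next step toward the CNN notebook.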
- Instructor
- Dr. M. R. Hasan
- Office Hours
- See the course Canvas page
- Lecture Time
- Monday through Friday: 10:00 AM - 11:30 AM via Zoom
- Convolutional Neural Networks
- Recurrent Neural Networks
- How to Use TensorFlow-Keras for Building Effective MLP Classifiers
- CNN - Image Classification - Tricks of the Trade
- Data Augmentation for Deep Learning
- Efficient Methodology for Deep Learning
- Review: Hidden Layer Activations (sigmoid, tanh, ReLU) & Weight Initializers (LeCun, Glorot)
- Vanishing & exploding gradient problem-II: ReLU, He Initializer, Variants of ReLU
- Vanishing & exploding gradient problem-III: Gradient clipping & Batch Normalization
- Achilles' Heel of SGD: Learning rate schedules and the optimal learning rate via the Hessian (several of these training tricks are sketched in code after this list)
- The lecture slides and Jupyter notebooks are thorough and extensive, and should provide a detailed account of these topics.
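Several of the topics above fit together naturally in one TensorFlow-Keras model. The sketch below is a minimal illustration, not the course's notebook code: it assumes Fashion-MNIST as a stand-in dataset and illustrative hyperparameters, and combines He initialization with ReLU, Batch Normalization, gradient clipping, and an exponential learning-rate decay schedule:

```python
import tensorflow as tf
from tensorflow import keras

# Illustrative data: Fashion-MNIST, a common stand-in for MLP exercises
(X_train, y_train), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    # He initialization pairs with ReLU-family activations
    keras.layers.Dense(300, kernel_initializer="he_normal"),
    keras.layers.BatchNormalization(),   # fights vanishing/exploding gradients
    keras.layers.Activation("relu"),
    keras.layers.Dense(100, kernel_initializer="he_normal"),
    keras.layers.BatchNormalization(),
    keras.layers.Activation("relu"),
    keras.layers.Dense(10, activation="softmax"),
])

# Gradient clipping caps the norm of each update; exponential decay
# schedules the learning rate, addressing SGD's sensitivity to it
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.05, decay_steps=10000, decay_rate=0.5)
optimizer = keras.optimizers.SGD(learning_rate=lr_schedule, clipnorm=1.0)

model.compile(loss="sparse_categorical_crossentropy",
              optimizer=optimizer, metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, validation_split=0.1)
```

For the data augmentation topic, Keras's ImageDataGenerator (random shifts, rotations, flips) is the customary starting point; which transformations are safe depends on the task, since, for example, horizontal flips are ill-suited to digit recognition.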
Text Resources
Though there is no one required text for this course, my lectures will draw references from the following books.
- Dive into Deep Learning by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola
- Deep Learning by Ian Goodfellow and Yoshua Bengio and Aaron Courville
- Deep Learning with Python by Francois Chollet
- Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd Edition, 2019) by Aurélien Géron (O'Reilly)
- Machine Learning: A Probabilistic Perspective by Kevin P. Murphy
- Pattern Recognition and Machine Learning by Christopher M. Bishop
- Introduction to Machine Learning (3rd ed.) by Ethem Alpaydin
- Data Science from Scratch by Joel Grus (O’Reilly)
- Python for Data Analysis (2nd Edition) by Wes McKinney (O'Reilly)
- Python Machine Learning by Sebastian Raschka (Packt Publishing)
- Advanced Engineering Mathematics (10th Ed.) by Erwin Kreyszig
- All of Statistics: A Concise Course in Statistical Inference by Larry Wasserman
- The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos
- Artificial Intelligence: A Guide for Thinking Humans by Melanie Mitchell
- The Deep Learning Revolution by Terrence J. Sejnowski
- Thinking, Fast and Slow by Daniel Kahneman
- Rebooting AI: Building Artificial Intelligence We Can Trust by Gary Marcus and Ernest Davis
- Interpretable Machine Learning: A Guide for Making Black Box Models Explainable by Christoph Molnar
- Introduction to Deep Learning - Carnegie Mellon University
- Convolutional Neural Networks for Visual Recognition - Stanford University
- Deep Learning - Stanford University
- Deep Learning Specialization - Andrew Ng (Coursera)
- Get started with Google Colaboratory (Coding TensorFlow)
- Getting Started with TensorFlow in Google Colaboratory (Coding TensorFlow)
- UC Irvine ML Repository
- Kaggle Datasets
- Amazon’s AWS Datasets
- StatLib Datasets Archive
- The CIFAR-10 dataset (Canadian Institute For Advanced Research) for computer vision problems
- MILA Public Dataset
- Machine Learning
- Journal of Machine Learning Research
- IEEE Transactions on Neural Networks and Learning Systems
- Conference and Workshop on Neural Information Processing Systems (NeurIPS)
- International Conference on Learning Representations (ICLR)
- International Conference on Machine Learning (ICML)
- Conference on Computer Vision and Pattern Recognition (CVPR)
- International Conference on Computer Vision (ICCV)
- European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)
- AAAI Conference on Artificial Intelligence (AAAI)
- International Joint Conference on Artificial Intelligence (IJCAI)
- Annual Meeting of the Association for Computational Linguistics (ACL)
- Conference on Empirical Methods in Natural Language Processing (EMNLP)
- DeepMind Research
- Google Research
- Facebook Research
- arXiv Machine Learning Publications
GitHub repositories of my tutorials on Machine Learning and Deep Learning
Key repositories:
Schedule
Week | Date | Topic & PDF Slides | Video Links |
---|---|---|---|
1 | Jan 4 | | |
1 | Jan 5 | | |
1 | Jan 6 | Jupyter Notebooks: | |
1 | Jan 7 | | |
1 | Jan 8 | Jupyter Notebooks: | |
1 | Jan 9 | | |
2 | Jan 11 | | |
2 | Jan 12 | | |
2 | Jan 13 | | |
2 | Jan 14 | | |
2 | Jan 15 | | |
2 | Jan 17 | | |
3 | Jan 18 | Martin Luther King Holiday | |
3 | Jan 19 | CNN Visualization; Jupyter Notebooks: | |
3 | Jan 20 | | |
3 | Jan 21 | Jupyter Notebooks: | |