Deep Learning can be described as a new machine learning toolkit that is likely to lead to more advanced forms of artificial intelligence. The evidence for this is in the sheer number of breakthroughs that have occurred since the beginning of this decade. There is a newfound optimism in the air and we are once again in an AI spring. Unfortunately, the current state of deep learning appears in many ways to be akin to alchemy. Everybody seems to have their own black-magic methods of designing architectures. The field thus needs to move forward and strive towards chemistry, or perhaps even a periodic table for deep learning. Although deep learning is still in its infancy, this book strives towards some kind of unification of the ideas in deep learning. It leverages a method of description called pattern languages.
Pattern languages are languages derived from entities called patterns that, when combined, form solutions to complex problems. Each pattern describes a problem and offers alternative solutions. Pattern languages are a way of expressing complex solutions that were derived from experience. The benefit of an improved language of expression is that other practitioners gain a much better understanding of the complex subject, as well as a better way of expressing solutions to problems.
In the majority of the computer science literature, the phrase "design patterns" is used rather than "pattern language". We purposely use "pattern language" to reflect that Deep Learning is a nascent, but rapidly evolving, field that is not as mature as other topics in computer science. Some of the patterns we describe are not actually patterns, but rather may be fundamental concepts. We are never certain which are truly fundamental, and only further exploration and elucidation can bring about a common consensus in the field. Perhaps in the future, a true design patterns book will arise as a reflection of the maturity of this field.
In machine learning (ML) there are many new terms that one encounters, such as Artificial Neural Networks (ANN), Random Forests, Support Vector Machines (SVM) and Non-negative Matrix Factorization (NMF). These, however, usually refer to a specific kind of machine learning algorithm. Deep Learning (DL), in contrast, is not really one kind of algorithm; rather, it is a whole class of algorithms that tend to exhibit similar characteristics. DL systems are ANNs constructed with multiple layers (sometimes called Multilayer Perceptrons). The idea is not entirely new, since it was first proposed back in the 1960s. However, interest in the domain has exploded with the help of advancing computational technology (i.e., GPUs) and bigger training data sources. Since 2011, DL systems have been exhibiting impressive results in the field of machine learning.
The confusion with DL arises when one realizes that there are actually many algorithms, not just a single kind. We find the conventional Feedforward Networks, also known as Fully Connected Networks (FCN), Convolutional Networks (ConvNets), Recurrent Neural Networks (RNN) and the less commonly used Restricted Boltzmann Machines (RBM). They all share a common trait in that these networks are constructed using a hierarchy of layers. One common pattern, for example, is the employment of differentiable layers; this constraint on the construction of DL systems leads to an incremental way of evolving the machine into something that learns classification. There are many patterns that have been discovered recently, and it would be fruitful for practitioners to have at their disposal a compilation of these patterns.
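To make this shared trait concrete, here is a minimal sketch of a hierarchy of differentiable layers trained incrementally by gradient descent. The XOR task, the layer sizes, and the use of NumPy are illustrative choices of ours, not taken from the book.

```python
import numpy as np

# A tiny fully connected network: every layer is differentiable, so the
# chain rule lets us nudge the weights downhill step by step -- the
# "incremental evolution" that the differentiable-layer pattern enables.

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR: a task a single linear layer cannot solve, but a two-layer network can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Hierarchy of layers: input (2) -> hidden (4) -> output (1)
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)

lr, losses = 0.5, []
for step in range(5000):
    # Forward pass through the differentiable layers
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))

    # Backward pass: gradients flow through each layer via the chain rule
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Incremental update: each step slightly improves the classifier
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The same forward/backward skeleton underlies FCNs, ConvNets and RNNs; they differ mainly in how each layer transforms its input, which is why they can be described with a common vocabulary of patterns.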
Here are some preview graphics. First, a periodic table of the various design patterns:
And here are the relationships among some of the patterns described in the book: