Everything You Need to Know About Neural Architecture Search

Neural architecture search is a subfield of AutoML. Here is an in-depth look at how NAS systems work.

Handcrafting neural networks to find the best-performing structure has always been a tedious and time-consuming task. Besides, as humans, we naturally gravitate towards structures that make sense from our point of view, even though the most intuitive structures are not always the most performant. Neural Architecture Search (NAS) is a subfield of AutoML that aims to replace such manual design with something more automatic. Having neural networks design themselves would save significant time, and would let us discover novel, well-performing architectures better adapted to their use case than the ones we design by hand.

NAS is the process of automating architecture engineering, i.e., finding the design of a machine learning model. Given a dataset and a task (classification, regression, etc.), a NAS system comes up with an architecture that, when trained on the provided dataset, performs best among the candidate architectures for that task. NAS can be seen as a subfield of AutoML and has a significant overlap with hyperparameter optimization.
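To make the inputs and outputs concrete, here is a minimal sketch of what a NAS system's interface might look like. All names here (NASSearcher, Architecture, search) are hypothetical, not taken from any particular library:

```python
# A minimal, hypothetical sketch of a NAS system's interface.
# NASSearcher, Architecture, and search are illustrative names only.
from dataclasses import dataclass, field

@dataclass
class Architecture:
    layers: list = field(default_factory=list)  # e.g. ["conv3x3", "relu"]

class NASSearcher:
    def __init__(self, task: str):
        self.task = task  # e.g. "classification" or "regression"

    def search(self, train_data, val_data) -> Architecture:
        """Explore a search space and return the candidate with the best
        validation score for the given task and dataset."""
        # Placeholder: a real system would train and evaluate many
        # candidate architectures here.
        return Architecture(layers=["conv3x3", "relu", "pool2x2", "dense"])
```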

Walking into the Depths of Neural Architecture Search

One family of NAS approaches is based on evolutionary algorithms, which have been employed by several groups. An evolutionary algorithm for neural architecture search generally performs the following procedure. First, a pool of different candidate architectures along with their validation scores (fitness) is initialized. At each step, architectures in the candidate pool are mutated (e.g., replacing a 5×5 convolution with a 3×3 convolution). The new architectures are then trained from scratch for a few epochs and their validation scores are obtained. The lowest-scoring architectures in the candidate pool are replaced with the better, newer architectures. This procedure is repeated many times, so the candidate pool is refined over time. Mutations in the context of evolving ANNs are operations such as adding or removing a layer, changing the type of a layer (e.g., from convolution to pooling), changing the hyperparameters of a layer, or changing the training hyperparameters. A minimal sketch of this loop follows.
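The sketch below implements the loop just described. It assumes a hypothetical train_and_score(arch) helper that trains a candidate for a few epochs and returns its validation score; the layer vocabulary is illustrative:

```python
# A minimal sketch of an evolutionary NAS loop. train_and_score is an
# assumed helper (train briefly, return validation score); the layer
# choices are illustrative.
import copy
import random

LAYER_CHOICES = ["conv3x3", "conv5x5", "pool2x2", "identity"]

def random_architecture(depth=6):
    return [random.choice(LAYER_CHOICES) for _ in range(depth)]

def mutate(arch):
    """One random mutation: change a layer's type, add, or remove a layer."""
    child = copy.copy(arch)
    op = random.choice(["change", "add", "remove"])
    if op == "change" or len(child) <= 1:
        child[random.randrange(len(child))] = random.choice(LAYER_CHOICES)
    elif op == "add":
        child.insert(random.randrange(len(child) + 1),
                     random.choice(LAYER_CHOICES))
    else:
        child.pop(random.randrange(len(child)))
    return child

def evolve(train_and_score, pool_size=20, steps=100):
    # Initialize the pool with random candidates and their fitness.
    pool = [(a, train_and_score(a))
            for a in (random_architecture() for _ in range(pool_size))]
    for _ in range(steps):
        parent, _ = random.choice(pool)      # pick a candidate to mutate
        child = mutate(parent)
        score = train_and_score(child)       # train from scratch, briefly
        pool.sort(key=lambda p: p[1])        # weakest candidate first
        if score > pool[0][1]:
            pool[0] = (child, score)         # replace the weakest member
    return max(pool, key=lambda p: p[1])[0]  # best architecture found
```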

NAS algorithms define a specific search space and hunt through it for better architectures. In early reinforcement-learning-based work, for example, a controller generated convolutional architectures layer by layer, stopping once the number of layers exceeded a maximum value. In later experiments, the authors also added skip connections, batch normalization, and ReLU activations to the search space, and they created RNN architectures by generating different recurrent cell designs in a similar way. The biggest drawback of this approach was the time it took to navigate the search space before arriving at a definite solution: 800 GPUs ran for 28 days before the best architecture was found. There was a clear need for controllers that could navigate the search space more intelligently.
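To give a sense of scale, a layer-wise convolutional search space can be written down as a set of per-layer choices. The specific options below are assumptions for illustration, not the exact ones used in any particular paper:

```python
# An illustrative layer-wise search space for convolutional networks.
# The concrete choices here are assumptions for this sketch.
SEARCH_SPACE = {
    "filter_height":   [1, 3, 5, 7],
    "filter_width":    [1, 3, 5, 7],
    "num_filters":     [24, 36, 48, 64],
    "stride":          [1, 2],
    "skip_connection": [True, False],  # optional skip to an earlier layer
}
MAX_LAYERS = 15  # the search stops once this depth is exceeded

# Count configurations for a single layer, then for the full fixed-depth
# space, to see why exhaustive search is hopeless.
per_layer = 1
for choices in SEARCH_SPACE.values():
    per_layer *= len(choices)
print(per_layer)                 # 256 configurations per layer
print(per_layer ** MAX_LAYERS)   # ~1.3e36 for 15 layers
```

Even this modest space has on the order of 10^36 fixed-depth configurations, which is why the search strategy, rather than the space itself, is where most of the engineering effort goes.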

Reinforcement Learning

Reinforcement learning has been used successfully to drive the search for better architectures. The ability to navigate the search space efficiently, saving precious computational and memory resources, is typically the major bottleneck in a NAS algorithm. Models built with the sole objective of high validation accuracy often end up highly complex: a greater number of parameters, more memory required, and higher inference times. One way to address this is a reward that also penalizes complexity, as in the sketch below.
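The following is a REINFORCE-style controller sketch under strong simplifying assumptions: train_and_score(arch) (validation accuracy) and param_count(arch) are hypothetical helpers, and a per-position categorical policy stands in for the RNN controllers usually used in practice:

```python
# A minimal REINFORCE-style NAS controller sketch. train_and_score and
# param_count are assumed helpers; real controllers are typically RNNs.
import numpy as np

LAYER_CHOICES = ["conv3x3", "conv5x5", "pool2x2", "identity"]
DEPTH, LR = 6, 0.1
rng = np.random.default_rng(0)

# Policy: independent logits over the layer choices at each position.
logits = np.zeros((DEPTH, len(LAYER_CHOICES)))

def sample():
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    idx = [int(rng.choice(len(LAYER_CHOICES), p=p)) for p in probs]
    return idx, probs

def reward(accuracy, params, penalty=1e-8):
    # Trade accuracy off against complexity (parameter count).
    return accuracy - penalty * params

def reinforce_step(train_and_score, param_count, baseline=0.0):
    idx, probs = sample()
    arch = [LAYER_CHOICES[i] for i in idx]
    r = reward(train_and_score(arch), param_count(arch))
    # REINFORCE: push up the log-probability of the sampled choices,
    # scaled by the advantage (r - baseline).
    for pos, i in enumerate(idx):
        grad = -probs[pos]          # d log-softmax / d logits = onehot - p
        grad[i] += 1.0
        logits[pos] += LR * (r - baseline) * grad
    return arch, r
```

Penalizing the parameter count in the reward is what steers the controller away from needlessly complex models; swapping in measured latency instead of parameters is a common variant of the same idea.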

Neuroevolution

Floreano et al. (2008) claim that gradient-based methods outperform evolutionary methods for optimizing neural network weights, and that evolutionary approaches should only be used to optimize the architecture itself. Besides deciding on the right genetic evolution parameters, such as the mutation rate and death rate, there is also the question of how exactly the topologies of neural networks are represented in the genotypes used for digital evolution; a minimal direct encoding is sketched below.
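As one concrete possibility (an assumption for illustration, not a standard scheme), a direct encoding represents the topology as an ordered list of layer genes that is decoded into the network layer by layer:

```python
# A minimal sketch of a direct genotype encoding for a feed-forward
# topology: each gene describes one layer. The encoding and field names
# are assumptions for illustration, not a standard scheme.
from dataclasses import dataclass

@dataclass
class LayerGene:
    kind: str      # "conv" or "pool"
    kernel: int    # e.g. 3 or 5
    channels: int  # ignored for pooling layers

# A genotype is an ordered list of layer genes; decoding it yields the
# phenotype (the actual network), layer by layer.
genotype = [
    LayerGene("conv", kernel=3, channels=32),
    LayerGene("conv", kernel=5, channels=64),
    LayerGene("pool", kernel=2, channels=0),
]

def decode(genotype):
    """Translate genes into a human-readable layer specification."""
    return [f"{g.kind}{g.kernel}x{g.kernel}"
            + (f"-{g.channels}ch" if g.kind == "conv" else "")
            for g in genotype]

print(decode(genotype))  # ['conv3x3-32ch', 'conv5x5-64ch', 'pool2x2']
```

Mutations then become simple list operations on the genotype (change a gene's fields, insert a gene, delete a gene), which is exactly what an evolutionary loop like the one sketched earlier exploits.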

Designing the Search Strategy

Most of the work that has gone into neural architecture search has focused on this part of the problem: finding out which optimization methods work best, and how they can be changed or tweaked so that the search produces better results faster and with consistent stability. Several approaches have been attempted, including Bayesian optimization, reinforcement learning, neuroevolution, network morphing, and game theory.
