From our smartphones to our cars and homes, artificial intelligence increasingly touches every part of our lives. Its applications have already proved disruptive across diverse industries, including manufacturing, healthcare, and retail. Given this progress, it is fair to say that artificial intelligence has evolved impressively in recent years. Research around the technology has also surged, shaping the way individuals and businesses interact with AI. Analytics Insight has listed 10 must-read artificial intelligence research papers worth looking at now.
Author(s): Diederik P. Kingma, Jimmy Ba
Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, computationally efficient, invariant to diagonal rescaling of the gradients, and has low memory requirements. It is well suited to problems that are large in terms of data and parameters, as well as to non-stationary objectives and problems with very noisy and/or sparse gradients. Adam has since become a default optimizer for the millions of neural networks trained today.
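To make the update rule concrete, here is a minimal NumPy sketch of a single Adam step, using the hyperparameter defaults recommended in the paper; it is an illustration, not the authors' reference implementation.

```python
import numpy as np

def adam_update(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: adaptive estimates of the first and second moments of the gradient."""
    m = beta1 * m + (1 - beta1) * grad          # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates (t is the step count, starting at 1)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```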
Author(s): Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le
This research paper presents Meena, a multi-turn open-domain chatbot that is trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is simply trained to minimize the perplexity of the next token. The researchers also propose a new human evaluation metric to capture key elements of a human-like multi-turn conversation, dubbed Sensibleness and Specificity Average (SSA).
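For context on the training objective, perplexity is the exponential of the average negative log-likelihood the model assigns to the next token; the tiny self-contained sketch below illustrates the metric and is not Meena's actual training code.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood of the observed next tokens)."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# e.g. a model assigning probability 0.25 to every next token has perplexity ~4
print(perplexity([math.log(0.25)] * 10))
```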
Author(s): Sergey Ioffe, Christian Szegedy
Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. The researchers refer to this phenomenon as "internal covariate shift", and address the problem by normalizing layer inputs. Batch Normalization allows the researchers to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps and surpasses the original model by a significant margin.
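A minimal sketch of the normalization step described above, assuming a 2-D mini-batch of shape (batch_size, num_features); it covers only the training-time forward pass and omits the running statistics used at inference.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.
    x: (batch_size, num_features); gamma, beta: learnable (num_features,) parameters."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta
```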
Author(s): Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei
Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, the researchers provide an extensive empirical evaluation of CNNs on large-scale video classification, using a new dataset of 1 million YouTube videos belonging to 487 classes. Published at the IEEE Conference on Computer Vision and Pattern Recognition, this research paper has been cited 865 times, with a HIC score of 24 and a CV of 239.
Author(s): Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh
In this research paper, the authors point out the inadequacies of existing approaches to evaluating the performance of NLP models. Inspired by the principles of behavioural testing in software engineering, they introduce CheckList, a task-agnostic methodology for testing NLP models. It comprises a matrix of general linguistic capabilities and test types that facilitates comprehensive test ideation, as well as a software tool for producing a large and diverse number of test cases quickly.
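The authors released a companion software tool; rather than reproduce its exact API, the hypothetical, library-free sketch below illustrates one of the test types the paper describes, an invariance test that perturbs inputs in a label-preserving way and expects predictions to stay unchanged. `model_predict` and the example pairs are placeholders.

```python
def invariance_test(model_predict, pairs):
    """Invariance test: predictions should not change under a label-preserving perturbation.
    model_predict: callable mapping a sentence to a label (placeholder for a real NLP model).
    pairs: list of (original, perturbed) sentence pairs."""
    failures = [(a, b) for a, b in pairs
                if model_predict(a) != model_predict(b)]
    print(f"failure rate: {len(failures)}/{len(pairs)}")
    return failures

# Hypothetical perturbation: changing a person's name should not flip sentiment.
pairs = [("I had a great call with Anna.", "I had a great call with Maria."),
         ("John's support was useless.", "Peter's support was useless.")]
# invariance_test(my_sentiment_model, pairs)
```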
Author(s): Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
In this AI research paper, the authors propose a new framework for estimating generative models via an adversarial process, now widely known as generative adversarial networks (GANs). They simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G. The training procedure for G is to maximize the probability of D making a mistake.
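A condensed, hypothetical PyTorch training loop on toy 2-D data can make this minimax setup concrete; the tiny network sizes and hyperparameters here are arbitrary choices for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Toy 2-D data: the "real" distribution is a unit Gaussian shifted to (4, 4).
real_data = torch.randn(512, 2) + 4.0

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))                 # generator: noise -> sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())   # discriminator: sample -> P(real)

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # --- Train D: distinguish real samples from G's fakes ---
    fake = G(torch.randn(128, 8)).detach()
    real = real_data[torch.randint(0, 512, (128,))]
    loss_d = bce(D(real), torch.ones(128, 1)) + bce(D(fake), torch.zeros(128, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # --- Train G: maximize the probability of D making a mistake ---
    loss_g = bce(D(G(torch.randn(128, 8))), torch.ones(128, 1))   # non-saturating form of the objective
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```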
Author(s): Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
Advances like SPPnet and Fast R-CNN have minimized the running time of state-of-the-art detection networks, exposing region proposal computation as a bottleneck. In this context, the authors introduce a Region Proposal Network (RPN), a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.
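A rough PyTorch sketch of what such an RPN head can look like follows, assuming 512 feature channels and 9 anchors per position; the full Faster R-CNN pipeline adds anchor generation, proposal filtering, and loss terms that are omitted here.

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """Sketch of a Region Proposal Network head: slides over the shared feature map and,
    at every spatial position, predicts k objectness scores and k sets of box deltas."""
    def __init__(self, in_channels=512, k=9):   # k anchors per position (assumption)
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 512, kernel_size=3, padding=1)
        self.objectness = nn.Conv2d(512, k, kernel_size=1)       # object vs. background score per anchor
        self.bbox_deltas = nn.Conv2d(512, 4 * k, kernel_size=1)  # (dx, dy, dw, dh) per anchor

    def forward(self, features):
        x = torch.relu(self.conv(features))
        return self.objectness(x), self.bbox_deltas(x)

# features = backbone(image)            # shared full-image convolutional features
# scores, deltas = RPNHead()(features)
```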
Author(s): Min-Ling Zhang, Zhi-Hua Zhou
Multi-label learning studies the problem where each example is represented by a single instance while simultaneously being associated with a set of labels. While significant progress has been made on this machine learning paradigm over the past decade, this paper aims to provide a timely review of the area, with an emphasis on state-of-the-art multi-label learning algorithms.
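As a minimal illustration of the setting (not an algorithm taken from the survey itself), the scikit-learn sketch below encodes each example's label set as a binary indicator vector and fits a simple baseline, one independent binary classifier per label, on a synthetic toy dataset.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

# Toy multi-label data: each instance carries a set of labels, encoded as a binary indicator matrix.
X, Y = make_multilabel_classification(n_samples=200, n_features=20, n_classes=5, random_state=0)

# Baseline: train one binary classifier per label independently.
clf = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
print(clf.predict(X[:3]))   # each row is the predicted label set for one instance
```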
Author(s): Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders, involving an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation.
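A bare-bones PyTorch sketch of such an encoder-decoder follows, assuming GRU layers and arbitrary vocabulary and hidden sizes; it shows only how the fixed-length vector passes from encoder to decoder and leaves out training and decoding details.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encodes a source sentence into a single fixed-length vector (the final hidden state)."""
    def __init__(self, vocab_size=10000, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src_tokens):               # src_tokens: (batch, src_len) token ids
        _, h = self.rnn(self.embed(src_tokens))
        return h[-1]                              # (batch, hidden_dim) fixed-length summary

class Decoder(nn.Module):
    """Generates the translation one token at a time, conditioned on that vector."""
    def __init__(self, vocab_size=10000, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt_tokens, context):       # context: the encoder's fixed-length vector
        h0 = context.unsqueeze(0)                 # use it as the decoder's initial hidden state
        out, _ = self.rnn(self.embed(tgt_tokens), h0)
        return self.out(out)                      # next-token scores at each position
```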
Author(s): David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, and others
The paper introduces a new approach to computer Go that uses 'value networks' to evaluate board positions and 'policy networks' to select moves. Go has long been perceived as the most challenging of classic games for artificial intelligence. These deep neural networks are trained by a novel combination of supervised learning from human expert games and reinforcement learning from games of self-play.
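Purely as an illustration of how the two networks divide the work (the paper itself combines them with Monte Carlo tree search, which is omitted here), the hypothetical sketch below treats `policy_net`, `value_net`, and `apply_move` as placeholder callables.

```python
import torch

def select_move(position, legal_moves, policy_net, value_net, apply_move, top_k=10):
    """Illustrative sketch only.
    policy_net(position) -> 1-D tensor of move logits; value_net(position) -> expected outcome;
    apply_move(position, move) -> the resulting position. All three are placeholders."""
    with torch.no_grad():
        priors = torch.softmax(policy_net(position), dim=-1)                  # policy network: move probabilities
        candidates = sorted(legal_moves, key=lambda m: priors[m].item(), reverse=True)[:top_k]
        scores = {m: value_net(apply_move(position, m)).item()                # value network: position evaluation
                  for m in candidates}
    return max(scores, key=scores.get)
```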