Forecasting with Recurrent Neural Networks: 12 Tricks
THE MIND-BRAIN, BIG DATA, AND AUTONOMOUS LEARNING
The course focuses on mathematical models of the fundamental principles of mind-brain neural mechanisms and on practical applications in several fields. Big-data and autonomous-learning algorithms are discussed for modeling the mind-brain and for engineering applications: cybersecurity, gene-phenotype associations, financial prediction, data mining, learning patterns under noise, and the interaction of language and cognition in the mental hierarchy. The course also covers mathematical models of the mechanisms of concepts, emotions, instincts, language, cognition, intuition, the conscious and the unconscious, the ability for symbols, and the functions of the beautiful and of musical emotions in cognition and evolution.
Instructor: Dr. L. Perlovsky is Professor of Psychology at Northeastern University, CEO of LPIT, a past Visiting Scholar at the Harvard School of Engineering and Applied Sciences and at the Harvard Medical School Martinos Brain Imaging Center, and a past Principal Research Physicist and Technical Advisor at the AFRL. His research and publications cover neural networks, modeling of the mind and cognitive algorithms, language, music cognition, and cultures. He has served as a professor at Novosibirsk University and New York University, and as a principal in commercial startups in biotechnology, text understanding, and financial prediction; his company predicted the market crash following 9/11 a week before the event. He is invited worldwide as a keynote and plenary speaker and tutorial lecturer, including at venues such as the Nobel Forum, and has published more than 500 papers, 17 book chapters, and 4 books. He serves on several editorial boards, including as Editor-in-Chief of “Physics of Life Reviews” (IF = 9.5, ranked #4 in the world). He has received national and international awards, including the Gabor Award, the top engineering award from the INNS, and the John McLucas Award, the highest US Air Force award for basic research.
Noninvasive Electroencephalogram-based Brain-Computer Interfaces.
Abstract: Brain-computer interfaces (BCIs) are a form of communication that enables individuals unable to perform movements to operate external assistive devices using the electroencephalogram (EEG) or other brain signals. Noninvasive BCIs capture changes in blood flow or fluctuations in electric and magnetic fields caused by the activity of large populations of neurons. The EEG, a noninvasive technique, measures the electrical activity of the brain at different locations on the head, typically using electrodes placed on the scalp. With proper artifact removal, signal processing, and machine learning, the human EEG carries enough information about the intention, planning, and execution of movement. The objective of the proposed tutorial is to show how an understanding of the electrical activity of the brain, measured noninvasively by EEG, can provide a way to allow communication without muscle movements. The intention is, from the study of the neurodynamic behavior of the brain, to investigate ways and propose models that enable noninvasive brain-computer interfaces.
Robust Model-Based Learning
Finite mixture models are a powerful and flexible way to represent arbitrarily complex probability distributions of data. Mixture model-based approaches have become increasingly popular and have been applied in a wide range of fields over the past decades. They are used for density estimation in unsupervised clustering, for estimating class-conditional densities in supervised learning settings, and for outlier detection. Usually the parameters of a mixture model are estimated by maximum likelihood estimation (MLE) via the expectation-maximization (EM) algorithm. It is well known that the MLE, and hence EM, can be very sensitive to outliers. To overcome this limitation, various robust alternatives have been developed. The goals of this tutorial are (1) to review various robust EM methods; (2) to elaborate Spatial-EM and its algorithm; (3) to illustrate the application of robust EM in supervised and unsupervised learning; and (4) to introduce concepts of robust statistics to the machine learning community.
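As a reference point for the robust variants surveyed above, the following is a minimal sketch of the standard (non-robust) EM algorithm for a one-dimensional, two-component Gaussian mixture; the synthetic data, initialization, and fixed iteration count are illustrative assumptions, not part of the tutorial material:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: two well-separated 1-D Gaussian clusters.
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])

def gaussian_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Initial guesses for mixing weights, means, and variances.
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

for _ in range(100):
    # E-step: posterior responsibility of each component for each point.
    dens = pi * gaussian_pdf(x[:, None], mu, var)     # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted parameter updates.
    nk = resp.sum(axis=0)
    pi = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print(np.sort(mu))   # the two means, recovered near -3 and 3
```

A single gross outlier added to `x` would pull one mean noticeably, which is precisely the sensitivity that robust alternatives such as Spatial-EM address.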
Simulating an entire nervous system? An exemplary Caenorhabditis elegans emulation case study.
What computational mechanisms and feedback loops need to be considered to faithfully mimic nervous system function? And what processes allow one of the most minimalistic nervous systems in nature, that of the nematode Caenorhabditis elegans, not only to sustain vital body function, but to give rise to a rich behavioral repertoire and basic forms of learning? Although there is no doubt that neuroscience will ultimately provide the answers, we ask whether and which neurocomputational approaches are suited to simulate neural processing events in an organism and thereby confirm or even anticipate some of the underlying principles. This tutorial will introduce the elements required to mimic the nervous system function of a real-world organism in a virtual behavioural context. Software- and hardware-based representations of neural function will be compared. Currently available toolsets to define neural response models and brain-mimetic parallel information transfer techniques will be discussed. Means of virtually embodying a neural network in a meaningful behavioural context will be explored. We will finally illustrate how a holistic simulation/emulation platform could be accessed by users with diverse expertise and be exploited for neurocomputational studies.
Computational Intelligence For Wearable Physiological Sensing
This tutorial will bring together the latest advances in computational intelligence and signal processing for body sensor networks, focusing on real-world applications for next-generation personalised healthcare, where the sensors must be unobtrusive, self-operated and discreet. To this end, we will discuss an open-source biosensing platform, equipped with multimodal miniaturised sensors (ECG, EEG, respiration, etc.), and will use this platform to generate our real-world examples.
Computational intelligence is, in principle, well equipped to deal with the issues arising from the requirements of 24/7 unobtrusive and user-operated physiological sensing, but the community is still lacking a coherent approach to this problem. To this end, this tutorial will cover the following aspects:
Compositionality and Self-Organization in Cognitive Minds: Lessons from Neuro-Robotics Experimental Studies
Abstract: This tutorial addresses the crucial problem of how compositionality can be naturally developed in cognitive agents through iterative sensorimotor interactions with the environment. In considering this problem, a recent promising proposal in the embodied cognition and neuro-robotics community is to reconstruct higher-order cognition by means of continuous neuro-dynamic systems that can elaborate delicate interactions with the sensorimotor level while sharing the same metric space. The tutorial attempts to deepen understanding of this proposal by introducing the basic theories supporting the scheme of self-organizing compositionality and hierarchy in neuro-dynamic systems, the practices in related developmental and neuro-robotic studies, and interdisciplinary discussions spanning neuroscience, nonlinear dynamical systems, developmental psychology, and the phenomenology of embodied minds as they relate to the current theme.
Computational Neuroscience: Past, Present, and Future
Computational neuroscience has now become an accepted discipline of
Data visualization with dimensionality reduction and manifold learning
Dimensionality reduction (DR) aims at providing faithful low-dimensional (LD) representations of high-dimensional (HD) data. DR is a ubiquitous tool in many branches of science, like sociology, psychometrics, statistics, and, more recently, in (big) data mining. Intuitively, a faithful LD representation of HD data preserves key properties of the original data. Choosing the key property (Euclidean distances, geodesic distances, similarities,…) largely influences the resulting representation. DR techniques being mostly unsupervised, quality assessment is also an important issue. This tutorial aims at presenting the latest DR advances in the field of machine learning. It will be accessible to researchers, scientists and practitioners with basic knowledge in mathematics. After a brief historical perspective, the tutorial will present modern DR methods relying on distance, neighborhood or similarity preservation, and using either spectral methods or nonlinear optimization tools. The tutorial will cover important issues such as scalability to big data, user interaction for dynamical exploration, reproducibility and stability.
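As a concrete illustration of distance-preserving linear DR, here is a minimal PCA sketch via the SVD, together with a simple quality check (fraction of variance retained); the synthetic near-planar HD data are an assumption made for the example:

```python
import numpy as np

rng = np.random.default_rng(1)
# HD data lying near a 2-D plane embedded in 10 dimensions, plus noise.
basis = rng.normal(size=(2, 10))
coords = rng.normal(size=(300, 2))
X = coords @ basis + 0.05 * rng.normal(size=(300, 10))

# PCA: linear, Euclidean-distance-preserving DR via the SVD of centered data.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Y = Xc @ Vt[:2].T          # faithful 2-D representation

# Quality assessment: fraction of variance retained by the 2-D map.
explained = (s[:2] ** 2).sum() / (s ** 2).sum()
print(round(explained, 3))   # close to 1 for near-planar data
```

Nonlinear methods (neighborhood- or similarity-preserving) follow the same pattern but replace the preserved key property and the optimization tool.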
Dynamic Systems and Learning in the Model Space
Traditional machine learning for temporal data, such as time series, relies on representations in the data space. With the recent big-data era, "learning in the model space" has been proposed to provide a more robust and compact representation than the data space, and to offer greater explanatory potential. The core idea of "learning in the model space" is to use dynamic models fitted on parts of the data as more stable and parsimonious representations of the data. Learning is then performed directly in the model space instead of the original data space. In this tutorial we will present a unified view of dynamic systems as non-autonomous input-driven systems. In addition, we will focus on three core questions for temporal data: the generation of the model space, the metric on the model space, and the learning algorithms in the dynamic model space. The tutorial introduces the theory and algorithms for generating the model space and its representation and classification abilities, the metric based on functional analysis in the model space, and online learning algorithms in the model space. We will also demonstrate how to use dynamic systems to represent nonlinear multi-input multi-output (MIMO) systems.
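The core idea can be sketched in a few lines: fit a simple dynamic model to each series (here an AR(2) model, chosen purely for illustration) and then classify in the space of fitted coefficients rather than in the raw data space:

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_ar2(x):
    """Least-squares fit of an AR(2) model; the two coefficients are
    the 'model space' representation of the series."""
    X = np.column_stack([x[1:-1], x[:-2]])   # x[t-1], x[t-2]
    y = x[2:]                                # predict x[t]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def make_series(a1, a2, n=400):
    x = np.zeros(n)
    for t in range(2, n):
        x[t] = a1 * x[t - 1] + a2 * x[t - 2] + rng.normal(scale=0.1)
    return x

# Two classes of dynamics, distinguished only by their AR coefficients.
class_a = [fit_ar2(make_series(0.5, -0.3)) for _ in range(20)]
class_b = [fit_ar2(make_series(-0.5, 0.2)) for _ in range(20)]

# In model space, a trivial nearest-centroid rule separates the classes.
ca, cb = np.mean(class_a, axis=0), np.mean(class_b, axis=0)
test_coef = fit_ar2(make_series(0.5, -0.3))
label = "A" if np.linalg.norm(test_coef - ca) < np.linalg.norm(test_coef - cb) else "B"
print(label)
```

The fitted coefficients form a compact, stable representation: two numbers per series instead of 400 samples, directly usable with any metric or classifier defined on the model space.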
Artificial Intelligence and Machine Learning in Quantum Computing
According to Time Magazine, “Quantum computing represents the marriage of two of the great scientific undertakings of the 20th century, quantum physics and digital computing.” Quantum computing takes advantage of the rather odd and counter-intuitive rules of quantum mechanics: superposition (a quantum system can be in more than one state, or even one place, at the same time), entanglement (instantaneous interaction at a distance), and quantum tunneling (a quantum system can switch states without surmounting the energy barrier between them). Using these, a quantum computer can be built from very small, atomic-sized devices that can solve classical computing “hard problems”, and even quantum problems that are classically impossible to formulate and solve. The downsides are that (1) quantum computers are very difficult to build, and (2) they are difficult or impossible to program. Recent advances have dramatically addressed the first issue. In this tutorial, we show how AI and machine learning can address the second. We will give an introduction to quantum mechanics, then to the emerging field of quantum computing, and then show how the use of AI and machine learning in quantum computing can be a powerful way to “program” quantum computers. The tutorial is intended for the general AI community, and no prior knowledge or background in quantum mechanics will be assumed.
Multi-Task Learning Primer
In practice, one is often faced with a group of related recognition or regression tasks for which the lack of sufficient data prevents training powerful enough models for each task independently. Multi-Task Learning (MTL) addresses this problem by leveraging the latent relationships among these tasks via co-training the respective models. This tutorial aims to provide an informative MTL primer. Specifically, the tutorial will start by discussing the origins of and the motivation behind MTL in the context of real-world applications. Next, an assortment of noteworthy approaches for sharing information between tasks, along with their modelling and algorithmic frameworks, will be examined. Applications of MTL to Big Data problems, current trends in MTL research, and open problems in MTL will also be reviewed. Finally, the tutorial will conclude with a brief overview of available MTL-related resources. Our exposition assumes a basic machine learning background, familiarity with elementary optimization concepts and methods, and rudimentary prior exposure to kernel methods, graphical models, and Bayesian inference.
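One simple information-sharing scheme can be sketched directly: mean-regularized multi-task linear regression, where each task's weights are shrunk toward a shared mean vector estimated from all tasks. The synthetic tasks, the ridge solver, and the alternating estimation loop are illustrative assumptions, not a method from the tutorial:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_tasks, n_per_task = 5, 10, 3
w_shared = rng.normal(size=d)
tasks = []
for _ in range(n_tasks):
    w = w_shared + 0.05 * rng.normal(size=d)   # tasks are closely related
    X = rng.normal(size=(n_per_task, d))       # too few samples per task
    y = X @ w + 0.1 * rng.normal(size=n_per_task)
    tasks.append((X, y, w))

def ridge(X, y, lam=1.0):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Baseline: each task learned independently (underdetermined, so poor).
indep = [ridge(X, y) for X, y, _ in tasks]

# MTL: alternate between estimating the shared mean weight vector and
# refitting each task's deviation from it (information sharing).
w_hat = list(indep)
for _ in range(20):
    mean_w = np.mean(w_hat, axis=0)
    w_hat = [ridge(X, y - X @ mean_w) + mean_w for X, y, _ in tasks]

err = lambda est: np.mean([np.linalg.norm(e - w) for e, (_, _, w) in zip(est, tasks)])
print(err(w_hat) < err(indep))   # sharing helps when per-task data is scarce
```

With only 3 samples in 5 dimensions, each task alone is hopeless; pooling 10 related tasks through the shared mean recovers the common structure.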
Learning in Indefinite Proximity Spaces
The tutorial provides a comprehensive overview of the field of learning with non-metric proximities. Non-metric proximities are often obtained from domain-specific measures, such as sequence alignment functions or shape measures, where an explicit vector-space representation of the data is missing. We introduce the formalism used in non-metric spaces and motivate specific treatments for non-metric proximity data. We also address the problem of large-scale proximity learning. The algorithms and concepts discussed are widely applicable in supervised and unsupervised learning.
The tutorial is open to computer science researchers interested
Tutorial material will be available at:
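One standard treatment in this line of work is eigen-spectrum correction: an indefinite symmetric similarity matrix can be "clipped" to its nearest positive semidefinite counterpart so that kernel methods become applicable. A minimal sketch on a random symmetric matrix (the data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
# A symmetric similarity matrix that is NOT positive semidefinite,
# as often produced by domain-specific (non-metric) measures.
S = rng.normal(size=(6, 6))
S = (S + S.T) / 2

vals, vecs = np.linalg.eigh(S)
print(vals.min() < 0)   # indefinite: at least one negative eigenvalue

# "Clip" correction: zero out negative eigenvalues, yielding the nearest
# PSD matrix in Frobenius norm, directly usable as a kernel matrix.
S_psd = vecs @ np.diag(np.clip(vals, 0, None)) @ vecs.T
```

Other corrections discussed in the literature (flipping or shifting the spectrum) replace only the `np.clip` line; each preserves different information from the original measure.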
Successful Applications of Neural Networks for Information Fusion
Information fusion presents a number of applications with significant opportunities for the effective use of neural networks. This tutorial will present both the need for neural network techniques to supply higher-level interpretations and combinations of data, and the existing methodologies that apply them. The tutorial will review techniques currently used in information fusion and examine problems where neural networks can provide the robustness and adaptability needed for complex challenges that are still considered open areas of research. As a foundation, we begin with a model of six levels of fusion decomposition. This then leads to the use of neural networks in the first four levels of the fusion model. The presenters have a combined 35 years of information fusion experience and have used neural networks and other intelligent approaches in many of their applications.
To learn more about this tutorial, please see NNFusionTutorial at:
Conformal Prediction: A Valid Approach to Confidence Predictions
Advances in Universum Learning
Most learning methods developed in statistics, machine learning, and pattern recognition assume a standard inductive learning formulation, in which the goal is to estimate a predictive model from finite training data. While this inductive setting is very general, several emerging non-standard learning settings are particularly attractive for data-analytic modeling with sparse high-dimensional data. Such recent non-standard learning approaches include transduction, learning using privileged information, Universum learning, and multi-task learning. This tutorial describes the methodology called Universum learning, or learning through contradiction (Vapnik 1998, 2006; Weston et al. 2006; Sinz et al. 2008). It provides a formal mechanism for incorporating a priori knowledge into binary classification problems. This knowledge is provided in the form of unlabeled Universum data samples, in addition to labeled training samples (under the standard inductive setting). The Universum samples belong to the same application domain as the training data. However, they do not belong to either class, so they are treated as contradictions under a modified SVM-like Universum formulation. Several recent analytical and empirical studies provide ample evidence that Universum learning can improve generalization performance, especially in very ill-posed sparse settings. This tutorial will present an overview of Universum learning for binary classification, along with practical conditions for evaluating the effectiveness of Universum learning relative to standard SVM classifiers (Cherkassky et al., 2011; Cherkassky, 2013). We also describe an extension of the Universum SVM to cost-sensitive classification settings (Dhar and Cherkassky, 2012).
The Universum learning methodology is known only for the classification setting. It is not clear how to extend the idea of learning through contradiction to other types of learning problems, because the notion of ‘contradiction’ was originally introduced for binary classification (Vapnik 1998, 2006). In the second part of this tutorial we present a general methodology for incorporating the Universum into other types of learning problems. For these problems, one can also expect to achieve improved generalization performance by including additional data samples reflecting a priori knowledge about an application domain. In particular, we present new SVM-based formulations for regression and single-class learning problems that incorporate additional Universum data. We then briefly discuss computational implementations of these new Universum optimization formulations. We also present several application examples to illustrate the advantages of these new Universum formulations relative to standard SVM regression and the single-class SVM. Further, we discuss how Universum single-class learning can be used for difficult classification problems in changing (nonstationary) environments.
TUTORIAL LENGTH: 2 hours.
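To make the "contradiction" mechanism concrete, here is a toy subgradient-descent sketch of a linear Universum-SVM-style objective: the usual hinge loss on labeled samples, plus an epsilon-insensitive penalty that pushes Universum samples toward the decision boundary. The synthetic data, hyperparameters, and plain subgradient solver are illustrative assumptions, not the formulations from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy binary problem plus "Universum" points lying between the classes.
X_pos = rng.normal(loc=[2, 0], scale=0.5, size=(40, 2))
X_neg = rng.normal(loc=[-2, 0], scale=0.5, size=(40, 2))
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 40 + [-1] * 40)
U = rng.normal(loc=[0, 0], scale=0.5, size=(40, 2))   # neither class

w, b = np.zeros(2), 0.0
C, Cu, eps, lr = 1.0, 0.5, 0.1, 0.01
for _ in range(500):
    f = X @ w + b
    fu = U @ w + b
    gw, gb = w.copy(), 0.0                 # gradient of the regularizer
    # Hinge loss subgradient on labeled samples violating the margin.
    viol = y * f < 1
    gw -= C * (y[viol, None] * X[viol]).sum(axis=0)
    gb -= C * y[viol].sum()
    # Eps-insensitive penalty: Universum samples should satisfy |f(u)| <= eps,
    # i.e. sit near the boundary ("contradictions" to both classes).
    uviol = np.abs(fu) > eps
    sgn = np.sign(fu[uviol])
    gw += Cu * (sgn[:, None] * U[uviol]).sum(axis=0)
    gb += Cu * sgn.sum()
    w -= lr * gw
    b -= lr * gb

acc = ((X @ w + b > 0).astype(int) * 2 - 1 == y).mean()
print(acc)   # well-separated toy classes: accuracy should be high
```

Setting `Cu = 0` recovers a plain linear SVM trained by subgradient descent, which makes the effect of the Universum term easy to isolate experimentally.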
Learning Autonomously from Big Data
Feature Selection Technique for Gene Expression Data Analysis
Classification of gene expression data plays a significant role in the prediction and diagnosis of diseases. Gene expression data has the special characteristic that the gene dimension is very large as opposed to the small number of available samples.
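A minimal filter-style sketch of this setting: a synthetic expression matrix with far more genes than samples, where each gene is ranked by a per-gene two-sample t statistic. The data-generation scheme and the chosen statistic are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
# Synthetic "expression" matrix: 50 samples x 1000 genes, two classes,
# with only the first 10 genes differentially expressed.
n, p = 50, 1000
y = np.array([0] * 25 + [1] * 25)
X = rng.normal(size=(n, p))
X[y == 1, :10] += 2.0

# Filter-style selection: score every gene by a two-sample t statistic.
m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
v0, v1 = X[y == 0].var(axis=0, ddof=1), X[y == 1].var(axis=0, ddof=1)
t = np.abs(m0 - m1) / np.sqrt(v0 / 25 + v1 / 25)

top10 = set(np.argsort(t)[-10:])
print(len(top10 & set(range(10))))   # most informative genes recovered
```

Filter methods like this are fast and classifier-agnostic; wrapper and embedded methods trade that speed for selection criteria tied to the downstream classifier.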
Spiking Neural Networks in Silicon: From Building Blocks to Architectures of Neuromorphic Systems
Spiking Neural Networks (SNNs) have gained popularity as the third generation of neural networks, with more computational power than earlier generations and a closer match to their biological counterparts. Hence, there is an increasing amount of effort devoted to implementing low-power and large-scale versions of these networks in integrated circuits (VLSI) for various applications. This tutorial will introduce advances in the design of such low-power neural network circuits and systems over the last decade. It is organized in three parts: the first part introduces applications requiring low-power spiking neural circuits for smart sensors. The second part will cover different circuit implementations of the two major building blocks, the neuron and the synapse, with design styles ranging from phenomenological to bio-realistic. The last part will describe system-level architectures (such as address-event representation, field-programmable analog arrays, etc.) and issues in creating large networks from these fundamental building blocks.
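As a software reference for the phenomenological end of the design-style spectrum, here is a minimal leaky integrate-and-fire neuron, the model most commonly realized as a compact silicon building block; all parameter values are illustrative assumptions:

```python
# Leaky integrate-and-fire (LIF) neuron with Euler integration.
dt, tau, v_thresh, v_reset = 1e-3, 20e-3, 1.0, 0.0

def simulate_lif(i_in, steps=1000):
    """Simulate a LIF neuron driven by a constant input current;
    return the list of spike times (in steps)."""
    v, spikes = 0.0, []
    for t in range(steps):
        v += dt / tau * (-v + i_in)   # membrane leak plus input drive
        if v >= v_thresh:             # threshold crossing emits a spike
            spikes.append(t)
            v = v_reset               # membrane potential resets
    return spikes

# A stronger input current yields a higher firing rate.
low, high = simulate_lif(1.2), simulate_lif(2.0)
print(len(low) < len(high))
```

Bio-realistic circuit implementations replace this single leak term with conductance-based dynamics, at the cost of silicon area and power, which is exactly the trade-off the second part of the tutorial examines.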
Learning architectures and training algorithms: comparative studies
The traditional approach to solving complex problems and processes usually follows these steps: first we try to understand them, and then we try to describe them in the form of mathematical formulas. This classical Da Vinci approach was used for the last several centuries, but unfortunately it cannot be applied to many current complex problems, which are very difficult for humans to understand and process. Notice that many environmental, economic, and often engineering problems cannot be described by equations, and it seems that adaptive learning architectures are the only way to tackle these complex problems. Many smaller-scale problems have already been solved using shallow architectures such as ANNs, SVMs, or ELMs; however, more complex problems require deeper learning systems with enhanced capabilities. It has already been demonstrated that super-compact architectures can have 10 to 100 times more processing power than commonly used learning architectures such as the MLP. It is possible to train networks very efficiently when they are shallow, like the SVM or ELM; however, the power of learning systems grows linearly with their width but exponentially with their depth. For example, a shallow MLP architecture with 10 neurons can solve only a Parity-9 problem, but a special deep FCC (fully connected cascade) architecture with the same 10 neurons can solve a problem as large as Parity-1023. Therefore, a natural approach would be to use these deep architectures. Unfortunately, because of the vanishing-gradient problem, such deep architectures are very difficult to train, so a mixture of different approaches has been used with partial success. Until now, it has been assumed that it is not possible to train neural networks with more than 6 hidden layers. We have demonstrated that it is possible to efficiently train much deeper networks.
This became possible by introducing additional connections across layers and by using our new, very powerful NBN algorithm.
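The Parity capacity claim can be illustrated with an explicit hand-constructed cascade (our illustrative weights, not the result of NBN training): in a fully connected cascade, neuron k receives every input with weight +1 and the output of each earlier neuron j with weight -2^(n-j), so the cascade extracts the binary digits of the input sum and the last neuron outputs the parity. With n = 3 threshold neurons this solves Parity-7 = Parity-(2^3 - 1):

```python
import itertools

def step(x):
    return 1 if x >= 0 else 0

def fcc_parity(bits):
    """Parity of 2**n - 1 binary inputs using only n cascaded threshold
    neurons: the cascade computes the binary expansion of sum(bits),
    and the last digit is the parity."""
    n = (len(bits) + 1).bit_length() - 1
    s = sum(bits)
    outputs = []
    for k in range(1, n + 1):
        # Neuron k sees all inputs (+1 each) and earlier neurons (-2**(n-j)).
        net = s - sum(2 ** (n - j) * y for j, y in enumerate(outputs, 1))
        outputs.append(step(net - 2 ** (n - k) + 0.5))
    return outputs[-1]

# Exhaustive check: Parity-7 computed correctly with just 3 neurons.
ok = all(fcc_parity(b) == sum(b) % 2
         for b in itertools.product([0, 1], repeat=7))
print(ok)   # True
```

A one-hidden-layer MLP needs on the order of N hidden threshold units for Parity-N, so the cascade's exponential advantage in depth is visible already at this small scale.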