Number Title Organizer(s)
T01 Forecasting with Recurrent Neural Networks: 12 Tricks Hans-Georg Zimmermann, Christoph Tietz and Ralph Grothmann
T02 The Mind-Brain, Big Data and Autonomous Learning Leonid I. Perlovsky
T03 Noninvasive Electroencephalogram-based Brain-Computer Interfaces Joao Luis Rosa
T04(a) Robust Model-based Learning: Methods, Algorithms and Applications Yixin Chen and Xin Dang
T04(b) Simulating an entire nervous system? An exemplary Caenorhabditis elegans emulation case study Axel Blau, Martin McGinnity, Brian Mc Ginley, Andoni Mujika
T05 Computational Intelligence for Wearable Physiological Sensing Danilo Mandic and Valentin Goverdovsky
T06 Compositionality and Self-Organization in Cognitive Minds: Lessons from Neuro-Robotics Experimental Studies Jun Tani
T07 Computational Neuroscience: Past - Present - Future Péter Érdi
T08 Data visualization with dimensionality reduction and manifold learning Michel Verleysen and John A. Lee
T09 Dynamic Systems and Learning in the Model Space Huanhuan Chen and Peter Tino
T10 Use of Artificial Intelligence and Machine Learning in Quantum Computing Elizabeth Behrman and James Steck
T11 Multi-Task Learning Primer Georgios Anagnostopoulos and Cong Li
T12 Learning in indefinite proximity spaces: Mathematical foundations, representations, and models Peter Tino and Frank-Michael Schleif
T13 Successful Applications of Neural Networks for Information Fusion Stephen Stubberud and Kathleen Kramer
T14 Conformal Prediction: A Valid Approach to Confidence Predictions Henrik Boström, Alex Gammerman, Ulf Johansson, Lars Carlsson, Henrik Linusson
T15 Advances in Universum Learning Vladimir Cherkassky and Sauptik Dhar
T16 Learning Autonomously from Big Data Streams Plamen Angelov and Asim Roy
T17 Feature Selection Technique for Gene Expression Data Analysis B. Chandra
T18 Spiking Neural Networks in Silicon: From Building Blocks to Architectures of Neuromorphic Systems Arindam Basu
T19 Learning architectures and training algorithms - comparative studies Bogdan Wilamowski


Tutorial 1

Forecasting with Recurrent Neural Networks: 12 Tricks

Hans-Georg Zimmermann
Senior Principal Research Scientist at Siemens AG, Corporate Technology

State space models are favored as descriptions of general dynamical systems. Constructing such models as recurrent neural networks emphasizes the separation of autonomous and externally driven subdynamics, which are identified with neural training algorithms that have been proven to be universal approximators. The superposition of these subdynamics creates models that are successful in real-world applications. New applications often show us where we have to improve our modeling approach.
How can we meet the challenge that, for our applications, only rarely are all external drivers known? How can we identify time-independent low-dimensional manifolds in parallel with time-dependent dynamics, to facilitate model building for high-dimensional data series? Applications like the modeling of load curves depend heavily on the solution of this problem. How can we improve long-term forecasts? Many approaches depend on external drivers which are unrealistically assumed to be constant in the future. Our solution avoids this problem by internalizing all variables. The resulting approximator is again universal, but the learning task becomes much more difficult. We can meet this challenge with embedding techniques that help to control the training process. Long-term forecasting can depend on long-term memory, which has haunted many modeling approaches in the past. How can we build models with long-term memory? Beyond such techniques, there are ways to look at the world with causal or retro-causal views, which can be combined to cover systems on only implicitly defined manifolds that are partially expectation-driven. Last but not least, we derive quantitative measures of uncertainty from our models that are not available for classical approaches.
In this tutorial we will talk less about data and more about architectures and the challenges that real-world applications pose.

Instructor: Hans-Georg Zimmermann studied mathematics, computer science and operations research. He has been with Siemens AG, Corporate Technology since 1987, and was a founding member of the neural networks group at Siemens Munich in 1988. He is the scientific leader of the Learning Systems Group. Current research interests: system identification, forecasting and control with neural networks, with applications in economics (time series analysis, forecasting, diagnosis, uncertainty analysis, and decision support) and engineering (process modeling, renewable energies, control and process surveillance).

Reference: Zimmermann, H.G.; Tietz, C.; Grothmann, R.: Forecasting with Recurrent Neural Networks, In: Neural Networks: Tricks of the Trade, Second Edition; Eds.: Montavon, G; Orr G. B., Müller K.R.; p. 687-707; Springer, 2012; ISBN 978-3-642-35288-1

Tutorial 2

The Mind-Brain, Big Data and Autonomous Learning


Leonid I. Perlovsky
Northeastern and Harvard University, lperl@rcn.com

The course focuses on mathematical models of the fundamental principles of mind-brain neural mechanisms, and on practical applications in several fields. Big data and autonomous learning algorithms are discussed for modeling the mind-brain and for engineering applications: cybersecurity, gene-phenotype associations, financial predictions, data mining, learning of patterns under noise, and the interaction of language and cognition in the mental hierarchy. Mathematical models are presented for the mechanisms of concepts, emotions, instincts, language, cognition, intuitions, the conscious and unconscious, abilities for symbols, and the functions of the beautiful and of musical emotions in cognition and evolution.
Dynamic logic, a mathematical and cognitive breakthrough, is described. It models cognitive processes “from vague and unconscious to crisp and conscious.” It has resulted in more than 1000-fold improvements in several engineering applications; recent brain imaging experiments at Harvard Medical School and several labs around the world have confirmed it as a valid model of brain-mind processes. New cognitive and mathematical principles are discussed: language-cognition interaction, and the function of music in cognition and the evolution of cultures. How does language interact with cognition? Do we think using language, or is language just a label for completed thoughts? Why has the ability for music evolved from animal cries to Bach and Lady Gaga?
Why does human cognition need emotions of the beautiful, music, and the sublime? Dynamic logic applies Gödelian ideas to modeling the mind and to engineering, and to the knowledge instinct and the language instinct; why are they different? How do languages affect the evolution of cultures? Language networks are scale-free and small-world; what does this tell us about cultural values? What are the biases of English, Spanish, French, German, Arabic, Chinese, and Russian, and what is the role of language in cultural differences? Mathematical models of the mind and cultures bear on the contemporary world, and may be used to improve mutual understanding among peoples around the globe and reduce tensions among cultures.
Tutorial contents and related publications can be accessed at the website

Instructor: Dr. L. Perlovsky is Professor of Psychology at Northeastern University, CEO of LPIT, past Visiting Scholar at the Harvard School of Engineering and Applied Science and the Harvard Medical School Martinos Brain Imaging Center, and Principal Research Physicist and Technical Advisor at the AFRL. He has researched and published on neural networks, modeling the mind and cognitive algorithms, language, music cognition, and cultures. He has served as a professor at Novosibirsk University and New York University, and as a principal in commercial startups in biotechnology, text understanding, and financial predictions. His company predicted the market crash following 9/11 a week before the event. He has been invited as a keynote and plenary speaker and tutorial lecturer worldwide, including venues such as the Nobel Forum; he has published more than 500 papers, 17 book chapters, and 4 books. He serves on several editorial boards, including as Editor-in-Chief of “Physics of Life Reviews” (IF = 9.5, ranked #4 in the world). He has received national and international awards, including the Gabor Award, the top engineering award from the INNS, and the John McLucas Award, the highest US Air Force award for basic research.


Tutorial 3

Noninvasive Electroencephalogram-based Brain-Computer Interfaces.

Joao Luis Garcia Rosa <joaoluis@icmc.usp.br>

Abstract: A Brain-Computer Interface (BCI) is a form of communication that enables individuals unable to perform movements to operate external assistive devices using the electroencephalogram (EEG) or other brain signals. Noninvasive BCIs capture changes in blood flow or fluctuations in electric and magnetic fields caused by the activity of large populations of neurons. The EEG, a noninvasive technique, measures the electrical activity of the brain at different locations on the head, typically using electrodes placed on the scalp. With proper artifact removal, signal processing and machine learning, human EEG carries enough information about movement planning and execution intentions. The objective of this tutorial is to show how the understanding of the electrical activity of the brain, measured noninvasively by EEG, can provide a way to allow communication without muscle movements. The intention is, starting from the study of the neurodynamic behavior of the brain, to investigate ways and propose models that enable noninvasive brain-computer interfaces.
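To make the signal-processing step concrete, here is a minimal, hypothetical sketch (not taken from the tutorial materials): the Goertzel algorithm estimates the power of an EEG epoch in a single frequency bin, e.g. a 10 Hz alpha-band feature of the kind commonly fed to a BCI classifier.

```python
import math

def goertzel_power(samples, freq, fs):
    """Power of `samples` at the DFT bin nearest `freq` (Hz), sampled at `fs` Hz.
    A cheap way to extract one band-power feature from an EEG epoch."""
    n = len(samples)
    k = int(0.5 + n * freq / fs)        # nearest DFT bin index
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:                   # second-order resonator recursion
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # squared magnitude of the selected bin
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
```

Band powers computed this way at several electrodes could then serve as input features for the machine-learning stage the abstract mentions.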



Tutorial 4(a)

Robust Model-Based Learning

Yixin Chen and Xin Dang

Finite mixture models are powerful and flexible tools for representing arbitrarily complex probability distributions of data. Mixture model-based approaches have become increasingly popular and have been applied in a wide range of fields over the past decades. They are used for density estimation in unsupervised clustering, for estimating class-conditional densities in supervised learning settings, and for outlier detection. Usually, the parameters of a mixture model are estimated by maximum likelihood estimation (MLE) via the expectation-maximization (EM) algorithm. It is well known that the MLE, and hence EM, can be very sensitive to outliers. To overcome this limitation, various robust alternatives have been developed. The goals of this tutorial are (1) to review various robust EM methods; (2) to elaborate on Spatial-EM and its algorithm; (3) to illustrate applications of robust EM to supervised and unsupervised learning; and (4) to introduce concepts of robust statistics to the machine learning community.


Tutorial 4(b)

Simulating an entire nervous system? An exemplary Caenorhabditis elegans emulation case study.

Axel Blau, Martin McGinnity, Brian Mc Ginley, Andoni Mujika.

What computational mechanisms and feedback loops need to be considered to faithfully mimic nervous system function? And what processes allow one of the most minimalistic nervous systems in nature - that of the nematode Caenorhabditis elegans - to not only sustain vital body function, but to give rise to a rich behavioral repertoire and basic forms of learning? Although there is no doubt that neuroscience will ultimately provide the answers, we ask whether and what neurocomputational approaches are suited to simulate neural processing events in an organism and thereby confirm or even anticipate some of the underlying principles. This tutorial will introduce the required elements to mimic the nervous system function of a real-world organism in a virtual behavioural context. Software and hardware-based representations of neural function will be compared. Currently available toolsets to define neural response models and brain-mimetic parallel information transfer techniques will be discussed. Means of virtually embodying a neural network in a meaningful behavioural context will be explored. We will finally illustrate how a holistic simulation/emulation platform could be accessed by users with diverse expertise and be exploited for neurocomputational studies.



Tutorial 5

Computational Intelligence For Wearable Physiological Sensing

Danilo Mandic and Valentin Goverdovsky

"This tutorial will bring together the latest advances in computational intelligence and signal processing for body sensor networks, focusing on real-world applications for next-generation personalised healthcare, where the sensors must be unobtrusive, self-operated and discreet. To this end, we will discuss an open-source biosensing platform, equipped with multimodal miniatuarised sensors (ECG, EEG, respiration, etc.) and will use this platform to generate our real-world examples."

Computational intelligence is, in principle, well equipped to deal with the issues arising from the requirements of 24/7 unobtrusive and user-operated physiological sensing, but the community is still lacking a coherent approach to this problem. To this end, this tutorial will cover the following aspects:

  • Biophysics behind data acquisition on the human body
  • Current technologies: clinical, ambulatory, tele-operating
  • Multimodal sensors - next generation space-saving and unobtrusive solutions
  • Challenges to data analysis arising from the miniature size of sensors and supporting electronics, such as data power levels and artefacts
  • An example of a fully integrated ultra-wearable sensing platform
  • Signal processing solutions for wearable physiological sensing (data conditioning, detection, estimation)
  • Computational intelligence solutions (data fusion, association, classification)
  • Putting this all together: in-the-ear wearable sensing and ultra-wearable sensing in an orthopedic clinic
  • Application examples: auditory and visual brain computer interfaces, fatigue, sleep and physiological stress
  • Links with big data, point-of-care healthcare, and distributed systems


Tutorial 6

Compositionality and Self-Organization in Cognitive Minds: Lessons from Neuro-Robotics Experimental Studies

Jun Tani, KAIST, South Korea

Abstract: This tutorial addresses the crucial problem of how compositionality can develop naturally in cognitive agents through iterative sensory-motor interactions with the environment. A recent promising proposal in the embodied cognition and neuro-robotics community is to reconstruct higher-order cognition by means of continuous neuro-dynamic systems that can elaborate delicate interactions with the sensory-motor level while sharing the same metric space. The tutorial attempts to deepen understanding of this proposal by introducing the basic theories supporting the scheme of self-organizing compositionality and hierarchy in neuro-dynamic systems, the practices in related developmental and neuro-robotic studies, and interdisciplinary discussions spanning neuroscience, nonlinear dynamical systems, developmental psychology and the phenomenology of embodied minds.


Tutorial 7

Computational Neuroscience: Past - Present - Future

Péter Érdi [perdi@kzoo.edu]

Computational neuroscience has now become an accepted discipline within neuroscience. Topics:

  1. From single cell to network models: history and newer developments
  2. GPS in the brain: about the Nobel prizes of 2014
  3. Neural rhythms and neural disorders: generation and control
  4. Newer developments in neuroinformatics


Tutorial 8

Data visualization with dimensionality reduction and manifold learning

John A. Lee and Michel Verleysen

Dimensionality reduction (DR) aims at providing faithful low-dimensional (LD) representations of high-dimensional (HD) data. DR is a ubiquitous tool in many branches of science, like sociology, psychometrics, statistics, and, more recently, in (big) data mining. Intuitively, a faithful LD representation of HD data preserves key properties of the original data. Choosing the key property (Euclidean distances, geodesic distances, similarities,…) largely influences the resulting representation. DR techniques being mostly unsupervised, quality assessment is also an important issue. This tutorial aims at presenting the latest DR advances in the field of machine learning. It will be accessible to researchers, scientists and practitioners with basic knowledge in mathematics. After a brief historical perspective, the tutorial will present modern DR methods relying on distance, neighborhood or similarity preservation, and using either spectral methods or nonlinear optimization tools. The tutorial will cover important issues such as scalability to big data, user interaction for dynamical exploration, reproducibility and stability.
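As a minimal, hypothetical illustration of the simplest linear DR method (plain PCA, a historical precursor rather than one of the modern techniques the tutorial covers), the leading principal direction of 2-D data can be found by power iteration on the covariance matrix, and the data projected onto it:

```python
import math

def pca_project_2d_to_1d(points, n_iter=200):
    """Project 2-D points onto their first principal component.
    Uses power iteration on the 2x2 covariance matrix; returns the
    principal direction and the 1-D coordinates of the centred data."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centred = [(x - mx, y - my) for x, y in points]
    # covariance matrix entries
    cxx = sum(x * x for x, _ in centred) / n
    cyy = sum(y * y for _, y in centred) / n
    cxy = sum(x * y for x, y in centred) / n
    # power iteration for the leading eigenvector
    vx, vy = 1.0, 1.0
    for _ in range(n_iter):
        wx = cxx * vx + cxy * vy
        wy = cxy * vx + cyy * vy
        norm = math.hypot(wx, wy)
        vx, vy = wx / norm, wy / norm
    return (vx, vy), [x * vx + y * vy for x, y in centred]
```

The "key property" preserved here is variance along a straight line; the nonlinear methods in the tutorial generalise this by preserving geodesic distances, neighbourhoods or similarities instead.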

Tutorial 9

Dynamic Systems and Learning in the Model Space

Huanhuan Chen and Peter Tino

Traditional machine learning for temporal data, such as time series, relies on representations in the data space. With the recent big data era, "learning in the model space" has been proposed to provide more robust and compact representations than the data space, and to offer greater explanatory potential. The core idea of "learning in the model space" is to use dynamic models fitted on parts of the data as more stable and parsimonious representations of the data. Learning is then performed directly in the model space instead of the original data space. In this tutorial we will present a unified view of dynamic systems as non-autonomous input-driven systems. In addition, we will focus on three core questions for temporal data: the generation of the model space, the definition of a metric on the model space, and learning algorithms in the dynamic model space. The tutorial introduces theory and algorithms for generating the model space, the representation and classification abilities of the model space, metrics based on functional analysis in the model space, and online learning algorithms in the model space. We will also demonstrate how to use dynamic systems to represent nonlinear multi-input multi-output (MIMO) systems.


Tutorial 10

Artificial Intelligence and Machine Learning in Quantum Computing

Speakers: Elizabeth Behrman and James Steck

According to Time Magazine, “Quantum computing represents the marriage of two of the great scientific undertakings of the 20th century, quantum physics and digital computing.” Quantum computing takes advantage of the rather odd and counter-intuitive rules of quantum mechanics: superposition (a quantum system can be in more than one state, or even one place, at the same time), entanglement (instantaneous interaction at a distance) and quantum tunneling (a quantum system can switch states without surmounting the energy barrier between them). Using these, a quantum computer can be built from very small, atomic-sized devices that can solve classical computing “hard problems”, and even quantum problems that are classically impossible to formulate and solve. The downside is that (1) quantum computers can be very difficult to build, and (2) they are difficult or impossible to program. Recent advances have dramatically addressed the first issue. In this tutorial, we show how AI and machine learning can address the second. We will give an introduction to quantum mechanics, then to the emerging field of quantum computing, and then show how the use of AI and machine learning in quantum computing can be a powerful way to “program” quantum computers. The tutorial is intended for the general AI community, and no prior knowledge or background in quantum mechanics will be assumed.


Tutorial 11

Multi-Task Learning Primer

Georgios Anagnostopoulos and Cong Li

In practice, one is often faced with a group of several related recognition or regression tasks for which the lack of sufficient data prevents training powerful enough models for each task independently. Multi-Task Learning (MTL) addresses this problem by leveraging the latent relationships among these tasks via co-training the respective models. This tutorial aims to provide an informative MTL primer. Specifically, the tutorial will start by discussing the origins of and the motivation behind MTL in the context of real-world applications. Next, an assortment of noteworthy information-sharing approaches between tasks, along with their modelling and algorithmic frameworks, will be examined. Applications of MTL to Big Data problems, current trends in MTL research, as well as open problems in MTL will also be reviewed. Finally, the tutorial will conclude with a brief overview of available MTL-related resources. Our exposition assumes a basic machine learning background, familiarity with elementary optimization concepts and methods, as well as very rudimentary prior exposure to kernel methods, graphical models and Bayesian inference.
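One classic information-sharing scheme can be sketched very simply. The toy below (a hypothetical illustration, not drawn from the tutorial) is mean-regularised ridge regression with scalar features: each task's weight is shrunk towards the average weight across all tasks, so data-poor tasks borrow strength from data-rich ones.

```python
def multitask_ridge_1d(tasks, lam=5.0, n_iter=100):
    """Mean-regularised multi-task ridge regression for scalar-feature tasks.
    tasks: list of (xs, ys) pairs, one per task. Each task minimises
        sum_i (y_i - w_t * x_i)^2 + lam * (w_t - w_bar)^2,
    where w_bar is the mean weight across tasks (the shared information).
    Solved by alternating between updating w_bar and each task's
    closed-form minimiser."""
    ws = [0.0] * len(tasks)
    for _ in range(n_iter):
        w_bar = sum(ws) / len(ws)
        for t, (xs, ys) in enumerate(tasks):
            sxy = sum(x * y for x, y in zip(xs, ys))
            sxx = sum(x * x for x in xs)
            # closed-form minimiser of the task objective given w_bar
            ws[t] = (sxy + lam * w_bar) / (sxx + lam)
    return ws
```

A task with a single noisy observation would be estimated wildly on its own; co-trained this way, its weight is pulled towards the consensus of the related tasks, which is the essence of MTL's benefit in low-data regimes.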


Tutorial 12

Learning in Indefinite Proximity Spaces

Peter Tino and Frank-Michael Schleif

The tutorial provides a comprehensive overview of the field of learning with non-metric proximities. Non-metric proximities are often obtained by domain-specific measures, like sequence alignment functions or shape measures, where an explicit vector-space representation of the data is missing. We introduce the formalism used in non-metric spaces and motivate specific treatments for non-metric proximity data. We also address the problem of large-scale proximity learning. The discussed algorithms and concepts are widely applicable in supervised and unsupervised learning.

The tutorial is open to computer science researchers interested in data analysis and machine learning. The addressed topics focus on the processing of non-standard data given as proximities only.

Tutorial material will be available at:

Tutorial 13

Successful Applications of Neural Networks for Information Fusion

Stephen Stubberud and Kathleen Kramer

Information fusion offers a number of applications that present significant opportunities for the effective use of neural networks. This tutorial will present both the need for neural network techniques that supply higher-level interpretations and combinations of data, and existing methodologies that apply them. The tutorial will cover existing techniques used in information fusion and examine problems where neural networks can provide the robustness and adaptability needed for complex challenges that are still considered open areas of research. As a foundation, we begin with a model of six levels of fusion decomposition. This then leads to the use of neural networks in the first four levels of the fusion model. The presenters have a combined 35 years of information fusion experience and have used neural networks and other intelligent approaches in many of their applications.

To learn more about this tutorial please see: NNFusionTutorial at: 

Tutorial 14

Conformal Prediction: A Valid Approach to Confidence Predictions

Henrik Bostrom, Alex Gammerman, Ulf Johansson, Lars Carlsson and Henrik Linusson

Tutorial summary:
How good is your prediction? In risk-sensitive applications, it is crucial to be able to assess the quality of a prediction; however, traditional classification and regression models do not provide their users with any information regarding prediction trustworthiness. In contrast, conformal classification and regression models associate each of their multi-valued predictions with a measure of statistically valid confidence, and let their users specify a maximal threshold on the model's error rate. The price to be paid is that predictions made with a higher confidence cover a larger area of the possible output space. This tutorial aims to provide its attendees with the knowledge necessary to implement conformal prediction in their daily data science work, be it research- or practice-oriented, as well as to highlight current research topics on the subject.
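The trade-off described above (higher confidence means wider predictions) can be seen in a few lines of split (inductive) conformal regression. This is a standard textbook construction, sketched here as an illustration rather than taken from the tutorial materials:

```python
import math

def conformal_interval(cal_residuals, point_pred, alpha=0.1):
    """Split (inductive) conformal regression interval.
    cal_residuals: absolute errors |y - y_hat| of an underlying model
    on a held-out calibration set. Returns an interval around the new
    point prediction that covers the true value with probability
    >= 1 - alpha, assuming exchangeable data."""
    n = len(cal_residuals)
    rs = sorted(cal_residuals)
    # conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest residual
    k = math.ceil((n + 1) * (1 - alpha))
    q = rs[min(k, n) - 1]
    return point_pred - q, point_pred + q
```

Note that lowering `alpha` (demanding higher confidence) selects a larger calibration residual and hence a wider interval, which is exactly the validity-versus-informativeness trade-off the summary describes.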

Web page:

Tutorial 15

Advances in Universum Learning


Vladimir Cherkassky (1) and Sauptik Dhar (2)
(1) Dept. of Electrical & Computer Eng., University of Minnesota, Minneapolis, MN 55455
(2) Research and Technology Center, Robert Bosch LLC, Palo Alto, CA 94304

Most learning methods developed in statistics, machine learning, and pattern recognition assume a standard inductive learning formulation, in which the goal is to estimate a predictive model from finite training data. While this inductive setting is very general, there are several emerging non-standard learning settings that are particularly attractive for data-analytic modeling with sparse high-dimensional data. Such recent non-standard learning approaches include transduction, learning using privileged information, universum learning and multi-task learning. This tutorial describes the methodology called Universum learning or learning through contradiction (Vapnik 1998, 2006, Weston et al 2006, Sinz et al 2008). It provides a formal mechanism for incorporating a priori knowledge for binary classification problems. This knowledge is provided in the form of unlabeled Universum data samples, in addition to labeled training samples (under standard inductive setting). The Universum samples belong to the same application domain as training data. However, they do not belong to either class, so they are treated as contradictions under a modified SVM-like Universum formulation. Several recent analytical and empirical studies provide ample evidence that Universum learning can improve generalization performance, especially for very ill-posed sparse settings. This tutorial will present an overview of Universum learning for binary classification along with practical conditions for evaluating the effectiveness of Universum learning, relative to standard SVM classifiers (Cherkassky et al, 2011; Cherkassky, 2013). We also describe an extension of Universum SVM to cost-sensitive classification settings (Dhar and Cherkassky, 2012). 

The Universum learning methodology is known only for the classification setting. It is not clear how to extend the idea of learning through contradiction to other types of learning problems, because the notion of ‘contradiction’ was originally introduced for binary classification (Vapnik 1998, 2006). In the second part of this tutorial we present a general methodology for incorporating Universum data into other types of learning problems. For these problems, one can also expect to achieve improved generalization performance by including additional data samples reflecting a priori knowledge about an application domain. In particular, we present new SVM-based formulations for regression and single-class learning problems that incorporate additional Universum data. We then briefly discuss computational implementations of these new Universum optimization formulations. We also present several application examples to illustrate the advantages of these new Universum formulations relative to standard SVM regression and single-class SVM. Further, we discuss how Universum single-class learning can be used for difficult classification problems in changing (nonstationary) environments.

Tutorial link:


Researchers and practitioners interested in understanding advanced SVM-based methods and applications. Participants are expected to have background knowledge of standard Support Vector Machine (SVM) classifiers.


  • Bai, X. and V. Cherkassky, Gender classification of human faces using inference through contradictions, Proc. IJCNN, Hong Kong, pp. 746–750, 2008.
  • Cherkassky, V. and F. Mulier, Learning from Data, second edition, Wiley, 2007.
  • Cherkassky, V., Predictive Learning, http://vctextbook.com/, 2013.
  • Cherkassky, V., Dhar, S., and W. Dai, Practical Conditions for Effectiveness of the Universum Learning, IEEE Transactions on Neural Networks, vol. 22, no. 8, pp. 1241–1255, 2011.
  • Dhar, S. and V. Cherkassky, Cost-Sensitive Universum-SVM, Proc. ICMLA, 2012.
  • Vapnik, V., Statistical Learning Theory, Wiley, 1998.
  • Vapnik, V., Empirical Inference Science: Afterword of 2006, Springer, 2006.
  • Weston, J., Collobert, R., Sinz, F., Bottou, L. and V. Vapnik, Inference with the Universum, Proc. ICML, 2006.
  • Sinz, F., O. Chapelle, A. Agarwal, and B. Schölkopf, An analysis of inference with the Universum, Proc. NIPS-21, pp. 1–8, 2008.

Tutorial 16

Learning Autonomously from Big Data

Plamen Angelov and Asim Roy 

One of the important research challenges today is to cope effectively and efficiently with the ever-growing amount of data being produced exponentially by sensors, Internet activity, nature and society. Dealing with this ocean of zettabytes of data and data streams, and navigating to the small islands of human-interpretable knowledge and information, requires new types of analytics approaches and autonomous learning systems and processes.

This tutorial is timely in view of the recent investments in Big Data and Data Science in general, including the UK's £42M Alan Turing Institute. Traditionally, for decades or even centuries, machine learning, AI and cognitive science were developed under the assumption that the data available to test and validate hypotheses is of small, finite volume and can be processed iteratively and offline. The realities of dynamically evolving big data streams and big data sets (e.g. petabytes of data from the retail industry, high-frequency trading, genomics and other areas) have become prominent only during the last decade or so. This poses new challenges and requires new, revolutionary approaches.



Tutorial 17

Feature Selection Technique for Gene Expression Data Analysis

B Chandra

Classification of gene expression data plays a significant role in the prediction and diagnosis of diseases. Gene expression data has a special characteristic: the gene dimension vastly exceeds the sample dimension. Not all genes contribute to efficient classification of the samples. A robust feature selection algorithm is required to identify the important genes that help classify the samples efficiently. The tutorial will focus on both supervised and unsupervised feature selection techniques suited to gene expression data. Techniques like ReliefF, mRMR and the Laplacian score will be discussed in depth with benchmark microarray datasets. A new feature selection technique based on the statistically defined effective range of features, termed Effective Range based Gene Selection, ERGS (Chandra et al., 2011), will also be dealt with in depth; it helps identify the most relevant genes responsible for diseases like leukemia and colon cancer. Software implementations of all the feature selection algorithms will also be illustrated.


Tutorial 18

Spiking Neural Networks in Silicon: From Building Blocks to Architectures of Neuromorphic Systems

Organizer: Asst. Prof. Arindam Basu, NTU, Singapore

Spiking Neural Networks (SNN) have gained popularity as the third generation of neural networks, with more computational power than the earlier generations and with closer matches to their biological counterparts. Hence, an increasing amount of effort is being devoted to implementing low-power and large-scale versions of these networks in integrated circuits (VLSI) for various applications. This tutorial will introduce the advances in the design of such low-power neural network circuits and systems over the last decade. It is organized in three parts. The first part introduces some applications requiring low-power spiking neural circuits for smart sensors. The second part will cover different circuit implementations of the two major building blocks: the neuron and the synapse. Different design styles, from phenomenological to bio-realistic, will be covered. The last part will describe system-level architectures (such as address-event representation, field-programmable analog arrays, etc.) and issues in creating large networks from these fundamental building blocks.


Tutorial 19

Learning architectures and training algorithms - comparative studies

Bogdan M. Wilamowski, IEEE Fellow

The traditional approach to solving complex problems and processes usually follows these steps: first we try to understand them, and then we try to describe them in the form of mathematical formulas. This classical Da Vinci approach has been used for the last several centuries, but unfortunately it cannot be applied to many current complex problems, which are very difficult for humans to understand and process. Notice that many environmental, economic, and often engineering problems cannot be described by equations, and it seems that adaptive learning architectures are the only way to tackle them. Many smaller-scale problems have already been solved using shallow architectures such as ANN, SVM, or ELM, which can be trained very efficiently. For more complex problems, however, deeper learning systems with enhanced capabilities are needed. It has already been demonstrated that super-compact deep architectures can have 10 to 100 times more processing power than commonly used learning architectures such as the MLP. The power of learning systems grows linearly with their width but exponentially with their depth. For example, a shallow MLP architecture with 10 neurons can solve only a Parity-9 problem, while a special deep FCC architecture with the same 10 neurons can solve a problem as large as Parity-1023. A natural approach would therefore be to use these deep architectures. Unfortunately, because of the vanishing gradient problem, such deep architectures are very difficult to train, so mixtures of different approaches have been used with partial success. Until now, it has been assumed that it is not possible to train neural networks with more than 6 hidden layers. We have demonstrated that it is possible to efficiently train much deeper networks. This became possible by introducing additional connections across layers and by using our new, very powerful NBN algorithm.