A Reservoir of Adaptive Algorithms for Online Learning from Evolving Data Streams

Author: Ali Pesaranghader
Publisher:
Total Pages:
Release: 2018
Genre:
ISBN:

Continuous change and development are essential aspects of evolving environments and applications, including, but not limited to, smart cities, military, medicine, nuclear reactors, self-driving cars, aviation, and aerospace. That is, the fundamental characteristics of such environments may evolve and, if no reaction is adopted, cause dangerous consequences, e.g., putting people's lives at stake. Therefore, learning systems need to apply intelligent algorithms to monitor changes in their environments and update themselves effectively. Further, the performance of learning algorithms may fluctuate because the incoming data evolves continuously. That is, a learning approach that is efficient now may become obsolete after a change in the data or environment. Hence, the question 'how to have an efficient learning algorithm over time against evolving data?' has to be addressed. In this thesis, we make two contributions to address the challenges described above.
In the machine learning literature, the phenomenon of (distributional) change in data is known as concept drift. Concept drift may shift decision boundaries and cause a decline in accuracy. Learning algorithms therefore have to detect concept drift in evolving data streams and replace their predictive models accordingly. To address this challenge, adaptive learners have been devised which may utilize drift detection methods to locate the drift points in dynamic and changing data streams. A drift detection method that discovers the drift points quickly, with the lowest false positive and false negative rates, is preferred. A false positive is an alarm raised when no concept drift has occurred, and a false negative is a concept drift for which no alarm is raised.
In this thesis, we introduce three algorithms, called the Fast Hoeffding Drift Detection Method (FHDDM), the Stacking Fast Hoeffding Drift Detection Method (FHDDMS), and the McDiarmid Drift Detection Methods (MDDMs), for detecting drift points with the minimum delay, false positive, and false negative rates. FHDDM is a sliding window-based algorithm and applies Hoeffding's inequality (Hoeffding, 1963) to detect concept drift. FHDDM slides its window over the prediction results, which are either 1 (for a correct prediction) or 0 (for a wrong prediction). Meanwhile, it compares the mean of the elements inside the window with the maximum mean observed so far; a significant difference between the two means, i.e., one exceeding the threshold given by Hoeffding's inequality, indicates the occurrence of concept drift. FHDDMS extends the FHDDM algorithm by sliding multiple windows over its entries to improve the detection delay and the false negative rate. In contrast to FHDDM/S, the MDDM variants assign weights to their entries, i.e., higher weights are associated with the most recent entries in the sliding window, for faster detection of concept drift. The rationale is that recent examples reflect the ongoing situation adequately, so putting higher weights on the latest entries may allow concept drift to be detected quickly. An MDDM algorithm bounds the difference between the weighted mean of the elements in the sliding window and the maximum weighted mean seen so far, using McDiarmid's inequality (McDiarmid, 1989). It alarms for concept drift once a significant difference is observed. We experimentally show that FHDDM/S and MDDMs outperform the state of the art, presenting promising results in terms of the adaptation and classification measures.
Due to the evolving nature of data streams, the performance of an adaptive learner, which is defined by the classification, adaptation, and resource consumption measures, may fluctuate over time. In fact, a learning algorithm, in the form of a (classifier, detector) pair, may perform well before a concept drift point, but not after. We frame this problem as the question 'how can we ensure that an efficient classifier-detector pair is present at any time in an evolving environment?' To answer this, we have developed the Tornado framework, which runs various kinds of learning algorithms simultaneously against evolving data streams. Each algorithm incrementally and independently trains a predictive model and updates the statistics of its drift detector. Meanwhile, our framework monitors the (classifier, detector) pairs and recommends the most efficient one, with respect to classification, adaptation, and resource consumption performance, to the user. We further define the holistic CAR measure, which integrates the classification, adaptation, and resource consumption measures for evaluating the performance of adaptive learning algorithms. Our experiments confirm that the most efficient algorithm may differ over time because of the developing and evolving nature of data streams.
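
The FHDDM test described in the abstract above can be illustrated with a short Python sketch. It is not the thesis's implementation: the class name `FHDDMSketch`, the helper `first_drift_index`, and the default values `window_size=25` and `delta=1e-7` are assumptions chosen for illustration. The threshold is taken from the one-sided Hoeffding bound for values in [0, 1], epsilon = sqrt(ln(1/delta) / (2n)), where n is the window size.

```python
import math
from collections import deque


class FHDDMSketch:
    """Illustrative sliding-window drift test based on Hoeffding's inequality.

    window_size and delta are hypothetical defaults, not values taken
    from the abstract.
    """

    def __init__(self, window_size=25, delta=1e-7):
        self.window_size = window_size
        # Hoeffding threshold for the mean of n values bounded in [0, 1].
        self.epsilon = math.sqrt(math.log(1.0 / delta) / (2.0 * window_size))
        self.reset()

    def reset(self):
        # The window holds the most recent prediction results (1 or 0).
        self.window = deque(maxlen=self.window_size)
        self.max_mean = 0.0

    def add(self, correct):
        """Record one prediction result; return True if drift is signalled."""
        self.window.append(1 if correct else 0)
        if len(self.window) < self.window_size:
            return False  # wait until the window is full
        mean = sum(self.window) / self.window_size
        if mean > self.max_mean:
            self.max_mean = mean  # track the best window mean seen so far
        if self.max_mean - mean > self.epsilon:
            self.reset()  # start over after signalling drift
            return True
        return False


def first_drift_index(results, **kwargs):
    """Return the 0-based index of the first drift alarm, or None."""
    detector = FHDDMSketch(**kwargs)
    for i, result in enumerate(results):
        if detector.add(result):
            return i
    return None
```

For example, feeding 100 correct predictions followed by a run of wrong ones raises the alarm once the window mean has fallen more than epsilon (about 0.57 with these assumed defaults) below the maximum mean. A weighted mean with a McDiarmid-based threshold would follow the same structure for the MDDM variants, but that is not shown here.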

Machine Learning for Data Streams

Author: Albert Bifet
Publisher: MIT Press
Total Pages: 289
Release: 2023-05-09
Genre: Computers
ISBN: 026254783X

A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework. Today many information sources—including sensor networks, financial markets, social networks, and healthcare monitoring—are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations. The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.

Adaptive Stream Mining

Author: Albert Bifet
Publisher: IOS Press
Total Pages: 224
Release: 2010
Genre: Computers
ISBN: 1607500906

This book is a significant contribution to the subject of mining time-changing data streams and addresses the design of learning algorithms for this purpose. It introduces new contributions on several different aspects of the problem, identifying research opportunities and increasing the scope for applications. It also includes an in-depth study of stream mining and a theoretical analysis of proposed methods and algorithms. The first section is concerned with the use of an adaptive sliding window algorithm (ADWIN). Since ADWIN has rigorous performance guarantees, using it in place of counters or accumulators offers the possibility of extending such guarantees to learning and mining algorithms not initially designed for drifting data. Testing with several methods, including Naïve Bayes, clustering, decision trees and ensemble methods, is discussed as well. The second part of the book describes a formal study of connected acyclic graphs, or 'trees', from the point of view of closure-based mining, presenting efficient algorithms for subtree testing and for mining ordered and unordered frequent closed trees. Lastly, a general methodology to identify closed patterns in a data stream is outlined. This is applied to develop an incremental method, a sliding-window based method, and a method that mines closed trees adaptively from data streams. These are used to introduce classification methods for tree data streams.
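
The adaptive sliding window idea mentioned above can be sketched in Python as follows. This is a deliberately naive illustration, not the book's algorithm as implemented: it stores the whole window in a list and rechecks every split point, whereas the real ADWIN compresses the window to save memory and time. The class name `AdwinSketch` and the default `delta=0.002` are assumptions. The cut threshold used is the Hoeffding-based one commonly given for ADWIN, eps_cut = sqrt(ln(4n/delta) / (2m)), with m the harmonic-style combination 1 / (1/n0 + 1/n1) of the two sub-window lengths.

```python
import math


class AdwinSketch:
    """Naive adaptive window: shrink while two sub-windows differ too much.

    delta is a hypothetical confidence parameter chosen for illustration.
    """

    def __init__(self, delta=0.002):
        self.delta = delta
        self.window = []  # oldest value first

    def _has_cut(self):
        """Return True if some split old|recent has significantly different means."""
        n = len(self.window)
        if n < 2:
            return False
        total = sum(self.window)
        log_term = math.log(4.0 * n / self.delta)
        head_sum = 0.0
        for n0 in range(1, n):
            head_sum += self.window[n0 - 1]
            n1 = n - n0
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps_cut = math.sqrt(log_term / (2.0 * m))
            mean_old = head_sum / n0
            mean_recent = (total - head_sum) / n1
            if abs(mean_old - mean_recent) >= eps_cut:
                return True
        return False

    def add(self, value):
        """Append a value in [0, 1]; return True if the window was shrunk."""
        self.window.append(float(value))
        changed = False
        while self._has_cut():
            self.window.pop(0)  # drop the oldest value and test again
            changed = True
        return changed

    def mean(self):
        return sum(self.window) / len(self.window)
```

On a stream of 200 zeros followed by 200 ones, the window stays intact during the stationary prefix and then sheds nearly all of the old zeros once enough ones have arrived, so the window mean ends close to 1.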

Learning from Data Streams in Dynamic Environments

Author: Moamar Sayed-Mouchaweh
Publisher: Springer
Total Pages: 82
Release: 2015-12-10
Genre: Technology & Engineering
ISBN: 331925667X

This book addresses the problems of modeling, prediction, classification, data understanding and processing in non-stationary and unpredictable environments. It presents major and well-known methods and approaches for the design of systems able to learn, to fully adapt their structure, and to adjust their parameters according to changes in their environments. It also presents the problem of learning in non-stationary environments, its interest, applications and challenges, and studies the complementarities and links between the different methods and techniques of learning in evolving and non-stationary environments.

Knowledge Discovery from Data Streams

Author: Joao Gama
Publisher: CRC Press
Total Pages: 256
Release: 2010-05-25
Genre: Business & Economics
ISBN: 1439826129

Since the beginning of the Internet age and the increased use of ubiquitous computing devices, the large volume and continuous flow of distributed data have imposed new constraints on the design of learning algorithms. Exploring how to extract knowledge structures from evolving and time-changing data, Knowledge Discovery from Data Streams presents

Big Data Analysis: New Algorithms for a New Society

Author: Nathalie Japkowicz
Publisher: Springer
Total Pages: 334
Release: 2015-12-16
Genre: Technology & Engineering
ISBN: 3319269895

This edited volume is devoted to Big Data Analysis from a Machine Learning standpoint as presented by some of the most eminent researchers in this area. It demonstrates that Big Data Analysis opens up new research problems which were either never considered before, or were only considered within a limited range. In addition to providing methodological discussions on the principles of mining Big Data and the difference between traditional statistical data analysis and newer computing frameworks, this book presents recently developed algorithms affecting such areas as business, financial forecasting, human mobility, the Internet of Things, information networks, bioinformatics, medical systems and life science. It explores, through a number of specific examples, how the study of Big Data Analysis has evolved and how it has started and will most likely continue to affect society. While the benefits brought about by Big Data Analysis are underlined, the book also discusses some of the warnings that have been issued concerning the potential dangers of Big Data Analysis along with its pitfalls and challenges.

Learning in Non-Stationary Environments

Author: Moamar Sayed-Mouchaweh
Publisher: Springer Science & Business Media
Total Pages: 439
Release: 2012-04-13
Genre: Technology & Engineering
ISBN: 1441980202

Recent decades have seen rapid advances in automatization processes, supported by modern machines and computers. The result is significant increases in system complexity and state changes, information sources, the need for faster data handling and the integration of environmental influences. Intelligent systems, equipped with a taxonomy of data-driven system identification and machine learning algorithms, can handle these problems partially. Conventional learning algorithms in a batch off-line setting fail whenever dynamic changes of the process appear due to non-stationary environments and external influences. Learning in Non-Stationary Environments: Methods and Applications offers a wide-ranging, comprehensive review of recent developments and important methodologies in the field. The coverage focuses on dynamic learning in unsupervised problems, dynamic learning in supervised classification and dynamic learning in supervised regression problems. A later section is dedicated to applications in which dynamic learning methods serve as keystones for achieving models with high accuracy. Rather than rely on a mathematical theorem/proof style, the editors highlight numerous figures, tables, examples and applications, together with their explanations. This approach offers a useful basis for further investigation and fresh ideas and motivates and inspires newcomers to explore this promising and still emerging field of research.

5th International Symposium on Data Mining Applications

Author: Mamdouh Alenezi
Publisher: Springer
Total Pages: 257
Release: 2018-03-28
Genre: Technology & Engineering
ISBN: 3319787535

The 5th Symposium on Data Mining Applications (SDMA 2018) provides valuable opportunities for technical collaboration among data mining and machine learning researchers in Saudi Arabia, Gulf Cooperation Council (GCC) countries and the Middle East region. This book gathers the proceedings of SDMA 2018. All papers were peer-reviewed based on a strict policy concerning the originality, significance to the area, scientific rigor and quality of the contribution, and address the following research areas:
• Applications: Applications of data mining in domains including databases, social networks, web, bioinformatics, finance, healthcare, and security.
• Algorithms: Data mining and machine learning foundations, algorithms, models, and theory.
• Text Mining: Semantic analysis and mining text in Arabic, semi-structured, streaming, multimedia data.
• Framework: Data mining frameworks, platforms and systems implementation.
• Visualizations: Data visualization and modeling.

Adaptive Micro Learning - Using Fragmented Time To Learn

Author: Geng Sun
Publisher: World Scientific
Total Pages: 151
Release: 2020-02-18
Genre: Computers
ISBN: 981120747X

This compendium introduces an artificial intelligence-supported solution to realize adaptive micro learning over open education resources (OERs). The advantages of cloud computing and big data are leveraged to promote the categorization and customization of OERs in the micro learning context. For a micro-learning service, OERs are tailored into fragmented pieces to be consumed within shorter time frames. Firstly, the current status of mobile-learning, micro-learning, and OERs is described. Then, the significance and challenges of Micro Learning as a Service (MLaaS) are discussed. A framework of a service-oriented system is provided, which adopts both online and offline computation domains working in conjunction to improve the performance of learning resource adaptation. In addition, a comprehensive learner model and a knowledge base are prepared to semantically profile the learners and learning resources. The novel delivery and access mode of OERs suffers from the cold start problem because of the shortage of already-known learner information versus the continuously released new micro OERs. This unique volume provides an excellent feasible algorithmic solution to overcome the cold start problem.

Learning from Data Streams

Author: João Gama
Publisher: Springer Science & Business Media
Total Pages: 486
Release: 2007-10-11
Genre: Computers
ISBN: 3540736786

Processing data streams has raised new research challenges over the last few years. This book provides the reader with a comprehensive overview of stream data processing, including famous prototype implementations like the Nile system and the TinyOS operating system. Applications in security, the natural sciences, and education are presented. The huge bibliography offers an excellent starting point for further reading and future research.