Combined data stream classification

Prof. Dr. Michal Wozniak
Department of Systems and Computer Networks
Wroclaw University of Technology
Poland

Abstract

Advances in computer science have led many institutions to collect huge amounts of data, far more than human beings can analyse unaided. Simple methods of data analysis are no longer sufficient for the efficient management of an average enterprise, because smart decisions require the knowledge hidden in the data. Among the methods for extracting such knowledge, collective decision-making methods known as multiple classifier systems are the focus of intense research. Unfortunately, a major disadvantage of traditional classification methods is that they “assume” that the statistical properties of the discovered concept (whose model is being predicted) remain unchanged. In real situations we can observe so-called concept drift, which may be caused by changes in the prior probabilities of the classes and/or in the class-conditional probability distributions. The occurrence of this phenomenon dramatically decreases classification accuracy. The ability to take new training data into account is therefore an important feature of machine learning methods used in security applications (spam filters or IDS/IPS) and in decision support systems for marketing departments, which need to follow changing client behaviour.

The talk will focus on the following aspects of classification:
- the pattern recognition task,
- combined classifiers (motivations, structure, models),
- selected methods of data stream recognition,
- detecting concept change,
- applications.