
Tutorials

The role of the tutorials is to provide a platform for more intensive scientific exchange among researchers interested in a particular topic and to serve as a meeting point for the community. Tutorials complement the depth-oriented technical sessions by providing participants with broad overviews of emerging fields. A tutorial can be scheduled for 1.5, 3 or 6 hours.

Alfred Inselberg
Tel Aviv University
Israel

Tutorial Outline


Visualization & Data Mining for High Dimensional Datasets
Abstract
A dataset with M items has 2^M subsets, any one of which may be the one fulfilling our objectives.
With a good data display and interactivity, our fantastic pattern-recognition ability can not only cut great swaths through this combinatorial explosion but also extract insights from the visual patterns.
These are the core reasons for data visualization.
With parallel coordinates (abbr. ||-cs) the search for relations in multivariate datasets is transformed into a 2-D pattern recognition problem.
The foundations are developed interlaced with applications. Guidelines and strategies for knowledge discovery are illustrated on several real datasets (financial, process control, credit-score, intrusion-detection, etc.), one with hundreds of variables.
A geometric classification algorithm is presented and applied to complex datasets. It has low computational complexity and provides the classification rule explicitly and visually.
The minimal set of variables required to state the rule (features) is found and ordered by their predictive value.
Multivariate relations can be modeled as hypersurfaces and used for decision support.
A model of a (real) country's economy reveals sensitivities, the impact of constraints, trade-offs and economic sectors unknowingly competing for the same resources. An overview of the methodology provides foundational understanding: learning the patterns corresponding to various multivariate relations. These patterns are robust in the presence of errors, which is good news for applications. We stand at the threshold of breaching the gridlock of multidimensional visualization.
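
As a small illustration of how parallel coordinates turn a relation into a 2-D pattern, the sketch below checks the standard point-line duality: every point on the line x2 = m*x1 + b becomes a segment between two parallel axes, and for m != 1 all of those segments pass through one common point. This is not taken from the tutorial material; the function names, the axis spacing d and the sample values are assumptions chosen for the example.

```python
# Illustrative sketch only: names, axis spacing d and sample values are assumptions.

def to_segment(x1, x2, d=1.0):
    """Map the 2-D point (x1, x2) to its parallel-coordinates segment.

    The first axis stands at horizontal position 0, the second at d.
    """
    return (0.0, x1), (d, x2)


def height_at(segment, t):
    """Height of the straight line through `segment` at horizontal position t."""
    (t0, y0), (t1, y1) = segment
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


def dual_point(m, b, d=1.0):
    """Point shared by the segments of all points on x2 = m*x1 + b (needs m != 1)."""
    if m == 1:
        raise ValueError("for m == 1 the segments are parallel and share no finite point")
    return d / (1.0 - m), b / (1.0 - m)


m, b = -2.0, 3.0
t_star, y_star = dual_point(m, b)

# Sample a few points on the line and evaluate each segment at t_star.
heights = [height_at(to_segment(x1, m * x1 + b), t_star)
           for x1 in (-1.0, 0.0, 0.5, 2.0)]
```

Every entry of `heights` equals `y_star` up to rounding, so a linear relation between two variables shows up as a single crossing point. With a negative slope that point lies between the axes; with a positive slope it lies outside them.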

The parallel coordinates methodology has been applied to collision avoidance and conflict resolution algorithms for air traffic control (3 USA patents), computer vision (1 USA patent), data mining (1 USA patent), optimization, decision support and elsewhere.

Keywords
Exploratory Data Analysis, Classification for Data Mining, Multidimensional Visualization, Parallel Coordinates, Multidimensional/Multivariate Applications

Audience
The accurate visualization of multidimensional problems and multivariate data unlocks insights into the role of dimensionality. The tutorial is designed to provide such insights for people working on complex problems.

Biography of Alfred Inselberg
Alfred Inselberg received a Ph.D. in Mathematics and Physics from the University of Illinois (Champaign-Urbana) and was Research Professor there until 1966. He held research positions at IBM, where he developed a Mathematical Model of the Ear (TIME, Nov. 74), concurrently holding joint appointments at UCLA, USC and later at the Technion and Ben Gurion University. Since 1995 he has been Professor at the School of Mathematical Sciences at Tel Aviv University. He was elected Senior Fellow at the San Diego Supercomputing Center in 1996, Distinguished Visiting Professor at Korea University in 2008 and Distinguished Visiting Professor at National University of Singapore in 2011. Alfred invented and developed the multidimensional system of Parallel Coordinates, for which he received numerous awards and patents (on Air Traffic Control, Collision-Avoidance, Computer Vision, Data Mining). The textbook "Parallel Coordinates: VISUAL Multidimensional Geometry and its Applications", Springer (October 2009), has a full chapter on Data Mining and was acclaimed by, among others, Stephen Hawking.

Contacts
e-mail: kdir.secretariat@insticc.org
