Interpretable Discriminative Dimensionality Reduction and Feature Selection
on the Manifold
Babak Hosseini*, Barbara Hammer
*Bielefeld University (formerly); Dortmund University (currently)
Twitter: @Babak_hss
ECML 2019, 19 September 2019
Outline:
• Introduction
• Proposed Method
• Experiments
• Conclusion
Dimensionality reduction (DR):
• Mapping: high-dimensional data → a low-dimensional space
• Visualization
• Reducing data complexity
Relational representation:
• No vectorial representation anymore (data given only through pairwise relations, e.g. a kernel matrix)
DR on the manifold:
[Figure: data in the input space → relational representation in the feature space → dimensionality reduction into the projected space]
Interpretation of the projection:
[Figure: which parts of the feature space do the projected dimensions correspond to?]
Class-based interpretation:
• Applicable to kernel-based DR methods
• Kernel-PCA: each embedding dimension is reconstructed from a selection of data points
• Q: are all of them selected from one class?
• If yes → the dimension represents (or is related to) class q
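As a toy check of this question, kernel-PCA's dual coefficients can be inspected directly: each column of the eigenvector matrix tells which training points (and hence which classes) reconstruct an embedding dimension. A minimal sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def kernel_pca_coeffs(K, n_dims):
    """Dual coefficients of kernel-PCA: embedding dimension i is
    reconstructed as sum_j alpha[j, i] * Phi(x_j)."""
    N = K.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N        # centering matrix
    vals, vecs = np.linalg.eigh(J @ K @ J)     # centered kernel
    order = np.argsort(vals)[::-1][:n_dims]    # top eigenvectors
    return vecs[:, order]

def class_purity(alpha, labels):
    """Fraction of each dimension's coefficient mass on its dominant class:
    close to 1 -> the dimension is reconstructed from (mostly) one class."""
    w = np.abs(alpha)
    mass = np.array([[w[labels == c, i].sum() for c in np.unique(labels)]
                     for i in range(alpha.shape[1])])
    return mass.max(axis=1) / mass.sum(axis=1)

# toy data: two well-separated classes, linear kernel
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
A = kernel_pca_coeffs(X @ X.T, 2)
print(class_purity(A, y))   # unsupervised KPCA: purity is typically far from 1
```

Even with perfectly separated classes, the class-separating dimension draws roughly equal coefficient mass from both classes, which is exactly the weak class-based interpretation motivating the talk.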
Babak Hosseini, Barbara Hammer ECML 2019, 19 September 2019
Class-based interpretation:
• a & b: each dimension uses all classes
• c & d: each dimension uses almost only one class
• → separation of data in the label space
Class-based interpretation:
Supervised kernel-based DR methods
• e.g. K-FDA (kernel Fisher discriminant analysis)
• Within-class (S_w) and between-class (S_b) covariance matrices
• Good class separation
• Weak class-based interpretation
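For reference, the within-class and between-class scatter matrices of (kernel) FDA can be sketched as follows; for brevity this works in the input space rather than the RKHS, and the function names are illustrative:

```python
import numpy as np
from scipy.linalg import eigh

def fda_scatter(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices,
    here in the input space (K-FDA uses their RKHS analogues)."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    return Sw, Sb

def fda_direction(X, y, reg=1e-6):
    """Direction maximizing the Fisher ratio w^T Sb w / w^T Sw w."""
    Sw, Sb = fda_scatter(X, y)
    vals, vecs = eigh(Sb, Sw + reg * np.eye(Sw.shape[0]))  # generalized EVP
    return vecs[:, -1]                                     # largest eigenvalue

# toy data: the classes differ only along the first coordinate
rng = np.random.default_rng(1)
Xa = rng.normal([0.0, 0.0], [0.1, 1.0], (30, 2))
Xb = rng.normal([4.0, 0.0], [0.1, 1.0], (30, 2))
X, y = np.vstack([Xa, Xb]), np.array([0] * 30 + [1] * 30)
w = fda_direction(X, y)
```

The recovered direction concentrates on the discriminative coordinate, which gives the good class separation noted above; nothing in the construction, however, ties a projection to a single class, hence the weak class-based interpretation.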
Outline:
• Introduction
• Proposed Method
• Experiments
• Conclusion
Notations:
• Training matrix: X = [x_1, …, x_N]
• Label matrix: one-hot class labels per column
• Mapping to the RKHS (relational representation): Φ(X)
• Embedding dimensions: directions in the RKHS reconstructed from the training data
• Embedding of the data in the low-dimensional space
Objectives:
• O1: Increase the class-based interpretability of the embedding dimensions.
• O2: The embedding should make the classes more separated in the low-dimensional space.
• O3: The classes should be locally more condensed in the embedded space.
• O4: Perform feature selection if a multiple-kernel representation is provided.
Optimization framework:
Interpretability term (O1):
• Each embedding vector is reconstructed from the training data in the RKHS
• The term prefers 1. few non-zero weights, which are 2. large for nearby points
• → reconstruction from close data points in the RKHS
• → labeling is smooth in local neighborhoods, so each dimension tends towards one class
Inter-class dissimilarity (O2):
• Defined on the projected vectors
Goal:
• Reduce the similarity between each class and all other classes in the embedded space
Intra-class similarity (O3):
• Acts on the non-zero entries of each embedding vector that belong to one class
Goal:
• If such an entry is large, the embedding dimension is constructed mostly from that class
Feature selection (O4):
• m projections into the RKHS, one per base kernel
• Multiple-kernel representation of the data
Feature selection (O4):
• One base kernel per dimension/group in the data, e.g.:
• multivariate time-series
• multi-view image data
• multi-domain information
• …
• Scaling of the RKHS via the kernel weights
Goal:
• Given the supervised information, only the relevant dimensions are chosen
Feature selection (O4):
• Injecting the kernel weights into the optimization framework
• Affine constraint + non-negativity constraint → interpretable solution
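A minimal sketch of the scaled multiple-kernel representation, assuming the common convention K(β) = Σ_m β_m K_m with β ≥ 0 and Σ_m β_m = 1 (function names are illustrative):

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian kernel matrix on the rows of X."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def combine_kernels(K_list, beta):
    """Scaled RKHS: K(beta) = sum_m beta_m K_m with beta >= 0, sum(beta) = 1.
    A zero weight discards the corresponding feature group entirely."""
    beta = np.asarray(beta, dtype=float)
    assert np.all(beta >= 0) and np.isclose(beta.sum(), 1.0)
    return sum(b * K for b, K in zip(beta, K_list))

# two "views" of the same 10 samples; beta = [1, 0] keeps only view 1
rng = np.random.default_rng(0)
V1, V2 = rng.normal(size=(10, 3)), rng.normal(size=(10, 4))
K = combine_kernels([rbf_kernel(V1), rbf_kernel(V2)], [1.0, 0.0])
```

Because the weights are non-negative and sum to one, a learned β reads directly as the relative importance of each feature group, which is what makes the selection interpretable.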
Optimization scheme:
Convexity of the terms:
• PSD terms → convex
• One non-convex term (w.r.t. the embedding vectors)
• → relaxation of the optimization problem
• → alternating optimization scheme
Sub-problems solved via:
• Closed-form solution
• ADMM algorithm
• QP (quadratic programming)
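The affine plus non-negativity constraints restrict the kernel weights to the probability simplex, so the constrained QP step involves projecting onto that set; a standard routine for this (the sorting-based algorithm of Duchi et al., 2008, shown here as an illustrative building block, not as the paper's exact solver) looks like:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}, the feasible
    set of the affine + non-negativity constraints (Duchi et al., 2008)."""
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]   # support size - 1
    theta = (1.0 - css[rho]) / (rho + 1)                 # shift
    return np.maximum(v + theta, 0.0)
```

Feasible points are left unchanged, while infeasible ones are mapped to the nearest valid weight vector, e.g. `project_simplex([0.5, 0.2, -0.1])` zero-clips and renormalizes.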
Outline:
• Introduction
• Proposed Method
• Experiments
• Conclusion
Datasets:
Different domains:
• face, text, image, etc.
• UCI & feature-selection repositories
• a wide range of dimensionalities
Alternative methods:
• Supervised: K-FDA, LDR, SDR, KDR
• Unsupervised: JSE, S-KPCA, KEDR
Dimensionality reduction results:
• Classification accuracy (%)
• 1-NN classifier on the projected data
• 10-fold cross-validation
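The evaluation protocol above can be sketched with scikit-learn; the 2-d slice of iris below is only a stand-in for a learned low-dimensional embedding:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# 1-nn classification accuracy of a projection under 10-fold cross-validation
X, y = load_iris(return_X_y=True)
Y_proj = X[:, :2]          # placeholder for the projected data
scores = cross_val_score(KNeighborsClassifier(n_neighbors=1), Y_proj, y, cv=10)
print(f"mean 1-nn accuracy: {scores.mean():.3f}")
```

Using 1-NN keeps the classifier assumption-free, so differences in accuracy reflect the quality of the embedding rather than of the classifier.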
Interpretation of the embedding dimensions:
• Interpretability measure:
• becomes 1 if a dimension is reconstructed using one class
• becomes small if it is reconstructed using all the classes
• Projecting the embedding dimensions onto the label space
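One plausible instantiation of such a measure is the class purity of a dimension's reconstruction weights (the function name and exact definition are assumptions, not taken from the paper):

```python
import numpy as np

def interpretability(gamma, labels):
    """Class purity of one embedding dimension's reconstruction weights:
    1 when a single class contributes, about 1/c when all c classes
    contribute equally."""
    w = np.abs(gamma)
    mass = np.array([w[labels == c].sum() for c in np.unique(labels)])
    return mass.max() / mass.sum()
```

For example, weights concentrated on one class score 1.0, while weights split evenly over two classes score 0.5.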
Feature selection results:
• Multiple-kernel (MK) representation of the data
• Number of non-zero entries in beta (selected kernels)
• Alternative methods:
• MKL algorithms: MKL-TR, MKL-DR, KNMF-MKL, and DMKL
• Classification accuracy & …
Conclusion:
• A novel method for discriminative dimensionality reduction.
• Focuses on local neighborhoods in the RKHS.
• Aims at class-based interpretation of the embedding dimensions.
• A good trade-off between interpretation and separation of classes.
• Feature-selection extension using a multiple-kernel data representation.
Thank you very much!
Questions?
Twitter: @Babak_hss