
2018 EMNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP)

Learning Explanations from Language Data

David Harbecke*, Robert Schwarzenberg*, Christoph Alt

German Research Center for Artificial Intelligence (DFKI)
Alt-Moabit 91c, 10559 Berlin, Germany
{firstname.lastname}@dfki.de

Abstract

PatternAttribution is a recent method, introduced in the vision domain, that explains classifications of deep neural networks. We demonstrate that it also generates meaningful interpretations in the language domain.

1 Introduction

In the last decade, deep neural classifiers achieved state-of-the-art results in many domains, among others in vision and language. Due to the complexity of a deep neural model, however, it is difficult to explain its decisions. Understanding its decision process potentially allows us to improve the model and may reveal new knowledge about the input.

Recently, Kindermans et al. (2018) claimed that "popular explanation approaches for neural networks (...) do not provide the correct explanation, even for a simple linear model." They show that in a linear model, the weights serve to cancel noise in the input data; thus the weights show how to extract the signal, but not what the signal is. This is why explanation methods need to move beyond the weights, the authors explain, and they propose the methods "PatternNet" and "PatternAttribution" that learn explanations from data. We test their approach in the language domain and point to room for improvement in the new framework.

2 Methods

Kindermans et al. (2018) assume that the data $x$ passed to a linear model $w^T x = y$ is composed of signal $s$ and noise $d$ (for distraction), $x = s + d$. Furthermore, they assume a linear relation between signal and target, $s = a_s y$, where $a_s$ is a so-called signal base vector, which is in fact the "pattern" that PatternNet finds for us. As mentioned in the introduction, the authors show that in the model above, $w$ serves to cancel the noise such that

$$w^T d = 0, \qquad w^T s = y. \tag{1}$$

They go on to explain that a good signal estimator $S(x) = \hat{s}$ should comply with the conditions in Eq. 1, but that these alone form an ill-posed quality criterion, since $S(x) = u (w^T u)^{-1} y$ already satisfies them for any $u$ for which $w^T u \neq 0$. To address this issue, they introduce another quality criterion over a batch of data $x$:
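To make these conditions concrete, the following minimal numpy sketch (our own construction, not from the paper) builds data satisfying the signal-plus-noise model and checks Eq. 1: the weights recover the target exactly because they cancel the noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim = 10_000, 4

w = np.array([1.0, -2.0, 0.5, 1.5])    # weights of the linear model y = w^T x
a_s = np.array([0.2, 0.1, -0.3, 0.4])  # signal base vector ("pattern")
a_s = a_s / (w @ a_s)                  # rescale so that w^T a_s = 1

y = rng.normal(size=n)                 # targets
s = np.outer(y, a_s)                   # signal s = a_s * y, one row per example

# Draw noise in the orthogonal complement of w, so that w^T d = 0 holds.
d = rng.normal(size=(n, dim))
d -= np.outer(d @ w, w) / (w @ w)

x = s + d                              # observed data: signal plus noise
assert np.allclose(x @ w, y)           # w^T s = y: w extracts the target ...
assert np.allclose(d @ w, 0.0)         # ... because w^T d = 0 cancels the noise
```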

$$\rho(S) = 1 - \max_{v} \operatorname{corr}\big(\underbrace{w^T x}_{y},\; v^T \underbrace{(x - S(x))}_{\hat{d}}\big) \tag{2}$$

and point out that Eq. 2 yields maximum values for signal estimators that remove most of the information about $y$ from the noise.
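The maximization over $v$ in Eq. 2 has a closed form: the linear projection of the estimated noise that correlates best with $y$ is the ordinary least-squares fit, so the criterion can be evaluated directly. A minimal sketch (helper names are ours):

```python
import numpy as np

def max_corr(y, z):
    """max_v corr(y, z @ v): regress y on the columns of z (OLS), then
    correlate y with the fitted values."""
    z_c = z - z.mean(axis=0)
    y_c = y - y.mean()
    v, *_ = np.linalg.lstsq(z_c, y_c, rcond=None)
    return np.corrcoef(y_c, z_c @ v)[0, 1]

def rho(S, x, w):
    """Quality criterion of Eq. 2 for a signal estimator S over a batch x."""
    y = x @ w
    d_hat = x - S(x)                 # estimated noise
    return 1.0 - max_corr(y, d_hat)
```

On the synthetic data from the previous sketch, the perfect estimator $S(x) = s$ leaves no target information in the noise and scores $\rho \approx 1$, whereas $S(x) = 0$ leaves everything in $\hat{d}$ and scores $\rho \approx 0$.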

We argue that Eq. 2 still is not exhaustive. Consider the artificial estimator

$$S_m(x) = m x + (1 - m) s = s + m d,$$

which arguably is a bad signal estimator for large $m$, as its estimate contains scaled noise $m d$. Nevertheless, it still satisfies Eq. 1 and yields maximum values for Eq. 2, since

$$x - S_m(x) = (1 - m)(x - s) = (1 - m) d$$

is again just scaled noise and thus does not correlate with the output $y$. To solve this issue, we propose the following criterion:

$$\rho'(S) := \max_{v_1} \operatorname{corr}\big(w^T x,\, v_1^T S(x)\big) - \max_{v_2} \operatorname{corr}\big(w^T x,\, v_2^T (x - S(x))\big).$$

The minuend measures how much noise is left in the signal estimate, the subtrahend how much signal is left in the noise estimate. Good signal estimators split signal and noise well and thus yield a large $\rho'(S)$.


Figure 1: Contributions to positive classification. [Heatmap over the review: "Great book for travelling Europe: I currently live in Europe, and this is the book I recommend for my visitors. It covers many countries, colour pictures, and is a nice starter for before you go, and once you are there."]

We leave it to future research to evaluate existing signal estimators with our new criterion.
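The failure of Eq. 2 on $S_m$ is easy to reproduce numerically. The self-contained sketch below (reusing our synthetic construction and OLS helper from the previous sketches) shows that Eq. 2 assigns $S_m$ the same, maximal score as the true signal, even for large $m$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim = 10_000, 4
w = np.array([1.0, -2.0, 0.5, 1.5])
a_s = np.array([0.2, 0.1, -0.3, 0.4]); a_s = a_s / (w @ a_s)
y = rng.normal(size=n)
s = np.outer(y, a_s)                        # true signal
d = rng.normal(size=(n, dim)); d -= np.outer(d @ w, w) / (w @ w)
x = s + d

def max_corr(y, z):
    z_c = z - z.mean(axis=0); y_c = y - y.mean()
    v, *_ = np.linalg.lstsq(z_c, y_c, rcond=None)
    return np.corrcoef(y_c, z_c @ v)[0, 1]

def rho(S_x):                               # Eq. 2, for a precomputed estimate
    return 1.0 - max_corr(x @ w, x - S_x)

m = 10.0
S_m = m * x + (1 - m) * s                   # = s + m*d: dominated by noise

print(rho(s))    # ~1: the true signal gets the maximal score
print(rho(S_m))  # ~1 as well: Eq. 2 cannot tell the bad estimator apart
```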

For our experiments, the authors equip us with expressions for the signal base vectors $a_s$ for simple linear layers and for ReLU layers. For the simple linear model, for instance, it turns out that $a_s = \operatorname{cov}(x, y) / \sigma_y^2$. To retrieve contributions with PatternAttribution, the authors replace the weights by $w \odot a_s$ in the backward pass.
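For the single linear model both steps fit in a few lines. Below is a minimal numpy sketch (function names are ours, and the per-feature contribution $w \odot a_s \, y$ is our reading of the weight-replacement rule for this one-layer case):

```python
import numpy as np

def linear_pattern(x, w):
    """Estimate the signal base vector a_s = cov(x, y) / var(y)
    for a linear model y = x @ w from a batch of data."""
    y = x @ w
    x_c = x - x.mean(axis=0)
    y_c = y - y.mean()
    return (x_c.T @ y_c) / (y_c @ y_c)   # per-feature cov(x_i, y) / var(y)

def pattern_attribution(x_single, w, a):
    """Backward pass with the weights replaced by w * a: per-feature
    signal contributions to the output y = w^T x_single."""
    y = x_single @ w
    return w * a * y
```

On the synthetic data from the earlier sketches, `linear_pattern(x, w)` recovers $a_s$ up to sampling noise, since the noise $d$ is uncorrelated with $y$.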

3 Experiments

To test PatternAttribution in the NLP domain, we trained a CNN text classifier (Kim, 2014) on a subset of the Amazon review polarity data set (Zhang et al., 2015). We used 150 bigram filters, dropout regularization, and a dense FC projection with 128 neurons. Our classifier achieves an F1 score of 0.875 on a fixed test split. We then used Kindermans et al.'s (2018) PatternAttribution to retrieve neuron-wise signal contributions in the input vector space.¹
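The classifier itself is standard. A minimal PyTorch sketch of the architecture described above (embedding size, dropout rate, and vocabulary size are placeholders not specified in the paper):

```python
import torch
import torch.nn as nn

class BigramCNN(nn.Module):
    """Kim-style (2014) text CNN: 150 bigram filters, dropout, and a
    128-unit dense projection, as described in Section 3."""

    def __init__(self, vocab_size=50_000, emb_dim=300, n_filters=150):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bigram filters: convolution over two consecutive word vectors.
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=2)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(n_filters, 128)
        self.out = nn.Linear(128, 2)            # positive / negative polarity

    def forward(self, tokens):                  # tokens: (batch, seq_len)
        e = self.embed(tokens).transpose(1, 2)  # (batch, emb_dim, seq_len)
        h = torch.relu(self.conv(e))            # (batch, n_filters, seq_len-1)
        h = h.max(dim=2).values                 # max-over-time pooling
        h = self.dropout(h)
        return self.out(torch.relu(self.fc(h)))
```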

To align these contributions with plain text, we summed the contribution scores over the word vector dimensions for each word and used the accumulated scores to scale RGB values for word highlights in the plain text space. Positive scores are highlighted in red, negative scores in blue.
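Concretely, if PatternAttribution returns one contribution per embedding dimension per word, the alignment step reduces to a sum and a color ramp. A sketch with dummy inputs (the normalization choice is ours):

```python
import numpy as np

def word_scores(contribs):
    """Collapse per-dimension contributions (seq_len x emb_dim) to one
    scalar score per word."""
    return contribs.sum(axis=1)

def highlight_rgb(score, max_abs):
    """Map a word score to an RGB tuple: red for positive contributions,
    blue for negative ones; weaker scores fade toward white."""
    fade = int(255 * (1.0 - abs(score) / max_abs))
    return (255, fade, fade) if score > 0 else (fade, fade, 255)

contribs = np.random.randn(12, 300)   # dummy attribution matrix
scores = word_scores(contribs)
colors = [highlight_rgb(sc, np.abs(scores).max()) for sc in scores]
```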

This approach is inspired by Arras et al. (2017a). Example contributions are shown in Figs. 1 and 2.

4 Results

We observe that bigrams are highlighted; in particular, no highlighted token stands isolated. Bigrams with clear positive or negative sentiment contribute heavily to the sentiment classification. In contrast, stop words and uninformative bigrams make little to no contribution. We consider these meaningful explanations of the sentiment classifications.

¹ Our experiments are available at https://github.com/DFKI-NLP/language-attributions.

Figure 2: Contributions to negative classification. [Heatmap over the review: "DVD Player crapped out after one year: I also began having the incorrect disc problems that I've read about on here. The VCR still works, but hte DVD side is useless. I understand that DVD players sometimes just quit on you, but after not even one year? To me that's a sign on bad quality. I'm giving up JVC after this as well. I'm sticking to Sony or giving another brand a shot."]


5 Related Work

Many of the approaches used to explain and interpret models in NLP mirror methods originally developed in the vision domain, such as the recent approaches by Li et al. (2016), Arras et al. (2017a), and Arras et al. (2017b). In this paper we implemented a similar strategy.

Following Kindermans et al. (2018), however, our approach improves upon the latter methods for the reasons outlined above. Furthermore, PatternAttribution is related to Montavon et al. (2017), who make use of Taylor decompositions to explain deep models. PatternAttribution reveals a good root point for the decomposition, the authors explain.

6 Conclusion

We successfully transferred a new explanation method to the NLP domain. We were able to demonstrate that PatternAttribution can be used to identify meaningful signal contributions in text inputs. Our method should be extended to other popular models in NLP. Furthermore, we introduced an improved quality criterion for signal estimators. In the future, estimators can be deduced from and tested against our new criterion.

* Co-first authorship.

This research was partially supported by the German Federal Ministry of Education and Research through the projects DEEPLEE (01IW17001) and BBDC (01IS14013E).


References

Leila Arras, Franziska Horn, Grégoire Montavon, Klaus-Robert Müller, and Wojciech Samek. 2017a. "What is relevant in a text document?": An interpretable machine learning approach. PLOS ONE, 12(8).

Leila Arras, Grégoire Montavon, Klaus-Robert Müller, and Wojciech Samek. 2017b. Explaining recurrent neural network predictions in sentiment analysis. In Proceedings of the 8th EMNLP Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 159–168.

Yoon Kim. 2014. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1746–1751.

Pieter-Jan Kindermans, Kristof T. Schütt, Maximilian Alber, Klaus-Robert Müller, Dumitru Erhan, Been Kim, and Sven Dähne. 2018. Learning how to explain neural networks: PatternNet and PatternAttribution. In International Conference on Learning Representations (ICLR).

Jiwei Li, Xinlei Chen, Eduard Hovy, and Dan Jurafsky. 2016. Visualizing and understanding neural models in NLP. In Proceedings of NAACL-HLT, pages 681–691.

Grégoire Montavon, Sebastian Lapuschkin, Alexander Binder, Wojciech Samek, and Klaus-Robert Müller. 2017. Explaining nonlinear classification decisions with deep Taylor decomposition. Pattern Recognition, 65:211–222.

Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. In Advances in Neural Information Processing Systems (NIPS), pages 649–657.
