
Explainable NLP Survey


TODO

Survey

A Survey of the State of Explainable AI for Natural Language Processing

This survey reviews the state of explainable NLP. The Introduction discusses two criteria for distinguishing explainability models: (1) whether the explanation is for each prediction individually or for the model’s prediction process as a whole, and (2) whether generating the explanation requires post-processing. In Categorization of Explanations, the paper categorizes explanations as local (providing information or justification for the model's prediction on a specific input) vs. global (providing similar justification by revealing how the model's predictive process works, independently of any particular input), and as self-explaining (also called directly interpretable: the explanation is generated at the same time as the prediction, e.g. decision trees, rule-based models, and feature saliency models such as attention models) vs. post-hoc (an additional operation is performed after the predictions are made). The section also notes that these categories can overlap. In Aspects of Explanations, the paper introduces three aspects of explanations: (1) explainability techniques (feature importance, surrogate model, example-driven, provenance-based, declarative induction), (2) operations that enable explainability (first-derivative saliency, layer-wise relevance propagation, input perturbations, attention, LSTM gating signals, explainability-aware architecture design), and (3) visualization techniques (saliency, raw declarative representations, natural language explanation). The Evaluation section introduces several evaluation metrics.
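
To make the local, post-hoc category concrete, the following is a minimal sketch of feature importance obtained through input perturbation (leave-one-out). It is not taken from the survey: the bag-of-words scorer, its `WEIGHTS`, and the function names `predict_proba` and `leave_one_out_importance` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy model: a fixed-weight bag-of-words sentiment scorer.
WEIGHTS = {"great": 2.0, "good": 1.0, "bad": -1.5, "not": -0.5}


def predict_proba(tokens):
    """Probability of the positive class under a logistic bag-of-words model."""
    score = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def leave_one_out_importance(tokens):
    """Local, post-hoc feature importance via input perturbation.

    Each token is removed in turn, and the resulting drop in the predicted
    probability is recorded as that token's importance.
    """
    base = predict_proba(tokens)
    importances = []
    for i, tok in enumerate(tokens):
        perturbed = tokens[:i] + tokens[i + 1:]
        importances.append((tok, base - predict_proba(perturbed)))
    return importances


sentence = "the movie was great".split()
for tok, delta in leave_one_out_importance(sentence):
    print(f"{tok:>6}: {delta:+.3f}")
```

The explanation is local because it is computed for one input, and post-hoc because it requires extra forward passes after the prediction is made. A self-explaining model would instead expose such scores as part of the prediction itself.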

Opinion Papers

Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data (2020)

This paper argues, drawing on both scientific and philosophical theories, that modern NLP models trained only on form have no ability to understand natural language. It is structured as follows. In Large LMs: Hype and analysis, the paper samples passages from news and academic literature that exaggerate the understanding abilities of LMs with words such as "understand", "comprehension", and "recall factual knowledge", and argues that current LMs can only learn the surface linguistic form of language rather than understand it. In What is meaning?, the paper defines the meaning of language as the communicative intent that an utterance is meant to express, and distinguishes "meaning" from "truth": truth is meaning that is "grounded" in the real world. In The octopus test, the paper describes in detail a thought experiment about a super-intelligent octopus that can mimic human responses without ever having access to the real-world grounding of the language's meaning, and uses it to argue that how the receiver decodes communicative intents may affect the conventional meaning of language. In More constrained thought experiments, two further thought experiments are provided: training an LM on Java code and training an LM on English text, in the first case without any means of executing the code and in the second without access to the communicative intents. The paper argues that learning meaning in such settings is impossible. In Human language acquisition, the paper supports its argument with the observation that children's language acquisition is grounded not only in their perception of the world but also in interaction with other people. In Distributional semantics, the paper argues that two NLP approaches consistent with the intuitions above are training distributional models on corpora augmented with perceptual data, and looking to interaction data (following Wittgenstein's "meaning in use").

Information Theory-based Compositional Distributional Semantics (2021)

According to the abstract, the contribution of this paper can be summarized as proposing the notion of Information Theory-based Compositional Distributional Semantics (ICDS): (i) we first establish formal properties for embedding, composition, and similarity functions based on Shannon’s Information Theory; (ii) we analyze the existing approaches under this prism, checking whether or not they comply with the established desirable properties; (iii) we propose two parameterizable composition and similarity functions that generalize traditional approaches while fulfilling the formal properties; and finally (iv) we perform an empirical study on several textual similarity datasets that include sentences with a high and low lexical overlap, and on the similarity between words and their description. In the Introduction, the authors introduce Frege's principles of compositionality and contextuality, which state, respectively, that "the meaning of the whole is a function of the meaning of its parts and the syntactic way in which they are combined", and that "the meaning of words and utterances is determined by their context". The section also presents linguists' main concern about NLP, its lack of systematicity, where systematicity is defined as follows: "A system is said to exhibit systematicity if, whenever it can process a sentence, it can process systematic variants, where systematic variation is understood in terms of permuting constituents or (more strongly) substituting constituents of the same grammatical category." The section then states the paper's aim: to propose a novel framework called Information Theory-based Compositional Distributional Semantics (ICDS). In Related Work, the authors introduce a set of desirable properties for text representation paradigms, namely "systematicity", "usage context", "continuity", and "information measurability", and review a series of previous work against these criteria.
In Theoretical Framework, the paper first establishes a geometric interpretation of ICDS, namely that "The direction of an embedding represents the pragmatic meaning, and the vector norm of embedding represents how much information the literal utterance provides about its meaning in the pragmatic context", and then states the idea behind ICDS as "there are minimal linguistic units whose semantics are determined by their use and whose amount of information is determined by their specificity. On the other hand, the systematicity of language can be captured by compositional mechanisms while preserving the amount of information of the composite utterance". The section Formal Definition and Properties formally defines the concepts involved in ICDS, where (\(\pi\), \(\delta\), \(\bigodot\)) stand for the "embedding", "semantic similarity", and "composition" functions respectively. It lists the embedding function properties (information measurability and angular isometry), the composition function properties (composition neutral element, composition norm monotonicity, and sensitivity to structure), and the similarity function properties (angular distance similarity monotonicity, orthogonal embedding similarity monotonicity, and equidistant embedding similarity monotonicity). In Function Analysis and Generalization, the paper analyzes several existing embedding approaches under the proposed framework, while in Experiment, the semantic representation abilities of several prevailing LLMs, including BERT and GPT, are evaluated.
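
The triple of embedding, similarity, and composition functions can be sketched with toy stand-ins. Everything here is an assumption made for illustration and not the paper's definitions: the probabilities in `PROB`, the unit vectors in `DIRECTION`, the use of \(-\log p\) as the information content, cosine as the similarity, and a plain vector sum as the composition. The checks are simplified readings of the geometric interpretation and of the neutral element property, not the formal properties themselves.

```python
import math

# Hypothetical unigram probabilities and unit directions (toy values only).
PROB = {"the": 0.5, "cat": 0.01, "sat": 0.02}
DIRECTION = {"the": (1.0, 0.0), "cat": (0.0, 1.0), "sat": (0.6, 0.8)}


def embed(word):
    """Embedding: direction encodes meaning, norm encodes -log p(word)."""
    info = -math.log(PROB[word])
    return tuple(info * d for d in DIRECTION[word])


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def similarity(u, v):
    """Similarity: cosine, which depends only on the angle between u and v."""
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))


def compose(u, v):
    """Composition: plain vector sum, used here only as a simple stand-in."""
    return tuple(a + b for a, b in zip(u, v))


zero = (0.0, 0.0)
the, cat, sat = embed("the"), embed("cat"), embed("sat")

# Rarer words carry more information, hence a larger norm.
print("norm(cat) > norm(the):", norm(cat) > norm(the))
# The zero vector acts as a neutral element for this composition.
print("compose(cat, 0) == cat:", compose(cat, zero) == cat)
# The sum ignores word order, so on its own it cannot reflect structure.
print("order-insensitive:", compose(cat, sat) == compose(sat, cat))
```

The last check hints at why a plain sum is a weak candidate for a composition function when word order or syntactic structure should matter.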

Contrastive Explanations for Model Interpretability (2021)

This paper proposes a data augmentation method that generates counterexamples on the basis of NLI datasets, and shows that, by training on "why A rather than B" patterns with contrastive learning methods, the model outperforms previous NLI baselines.

Using counterfactual contrast to improve compositional generalization for multi-step quantitative reasoning (2023)