Introduction |
Combining Machine Learning and Semantic Features in the Classification of Corporate Disclosures
Abstract: We investigate an approach to improving statistical text classification by combining machine learners with an ontology-based identification of domain-specific topic categories. We apply this approach to ad hoc disclosures by public companies. This form of obligatory publicity concerns all information that might affect the stock price; relevant topic categories are governed by stringent regulations. Our goal is to classify disclosures according to their effect on stock prices (negative, neutral, positive). In the study reported here, we combine natural language parsing with a formal background ontology to recognize disclosures concerning particular topics from a prescribed list. The semantic analysis identifies some of these topics with reasonable accuracy. We then demonstrate that machine learners benefit from the additional ontology-based information when predicting the cumulative abnormal return attributed to the disclosure at hand. |
On Involutive Nonassociative Lambek Calculus
Abstract: Involutive Nonassociative Lambek Calculus (InNL) is a nonassociative version of Noncommutative Multiplicative Linear Logic (MLL) (Abrusci in J Symb Log 56:1403–1451, 1991), but the multiplicative constants are not admitted. InNL adds two linear negations to Nonassociative Lambek Calculus (NL); it is a strongly conservative extension of NL (Buszkowski in Amblard, de Groote, Pogodalla, Retoré (eds) Logical aspects of computational linguistics. LNCS, vol 10054. Springer, Berlin, pp 68–84, 2016). Here we also add unary modalities satisfying the residuation law and De Morgan laws. For the resulting logic InNLm, we define and study phase spaces (some frame models, typical for linear logics). We use them to prove the cut elimination theorem for a one-sided sequent system for InNLm, introduced here. Phase spaces are also employed in studying auxiliary systems InNLm(k), assuming the k-cyclic law for negation. The latter behave similarly to Classical Nonassociative Lambek Calculus, studied in de Groote and Lamarche (Stud Log 71(3):355–388, 2002) and Buszkowski (2016). We reduce the provability in InNLm to that in InNLm(k). This yields the equivalence of type grammars based on InNLm with (\(\epsilon\)-free) context-free grammars and the PTIME complexity of InNLm. |
Natural Language Semantics and Computability
Abstract: This paper is a reflection on the computability of natural language semantics. It does not contain a new model or new results in the formal semantics of natural language: it is rather a computational analysis, in the context of type-logical grammars, of the logical models and algorithms currently used in natural language semantics. Semantics is defined here as a function from a grammatical sentence to a (non-empty) set of logical formulas; because a statement can be ambiguous, it can correspond to multiple formulas, one for each reading. We argue that as long as we do not explicitly compute the interpretation in terms of possible world models, one can compute the semantic representation(s) of a given statement, including aspects of lexical meaning. This is a very generic process, so the results are, at least in principle, widely applicable. We also discuss the algorithmic complexity of this process. |
A Type-Driven Vector Semantics for Ellipsis with Anaphora Using Lambek Calculus with Limited Contraction
Abstract: We develop a vector space semantics for verb phrase ellipsis with anaphora using type-driven compositional distributional semantics based on the Lambek calculus with limited contraction (LCC) of Jäger (Anaphora and type logical grammar, Springer, Berlin, 2006). Distributional semantics has a lot to say about the statistical, collocation-based meanings of content words, but provides little guidance on how to treat function words. Formal semantics, on the other hand, has powerful mechanisms for dealing with relative pronouns, coordinators, and the like. Type-driven compositional distributional semantics brings these two models together. We review previous compositional distributional models of relative pronouns, coordination and a restricted account of ellipsis in the DisCoCat framework of Coecke et al. (Mathematical foundations for a compositional distributional model of meaning, 2010. arXiv:1003.4394, Ann Pure Appl Log 164(11):1079–1100, 2013). We show that DisCoCat cannot deal with general forms of ellipsis, which rely on copying of information, and develop a novel way of connecting type-logical grammar to distributional semantics by assigning vector-interpretable lambda terms to derivations of LCC in the style of Muskens and Sadrzadeh (in: Amblard, de Groote, Pogodalla, Retoré (eds) Logical aspects of computational linguistics, Springer, Berlin, 2016). What follows is an account of (verb phrase) ellipsis in which word meanings can be copied: the meaning of a sentence is now a program with non-linear access to individual word embeddings. We present the theoretical setting, work out examples, and demonstrate our results with a state-of-the-art distributional model on an extended verb disambiguation dataset. |
Construction-Based Compositional Grammar
Abstract: The paper presents a system for construction classification representing multiple levels of specification, such as grammatical functions, grammatically reflected actants, and lexical semantics, aligned with a compositional system of sign combination mediating between a construction perspective and a valence perspective. The system uses a feature structure formalism based on Head-Driven Phrase Structure Grammar (HPSG) but with essential elements from Lexical Functional Grammar (LFG; cf. Bresnan in Lexical functional syntax. Blackwell, Oxford, 2001), and has large-scale HPSG grammars as its implementation background. At one extreme, the system can encode word-level selection in multi-word patterns; at the other, it provides a compact format for construction specification, allowing for cross-language comparison of both construction and valence frame inventories. The grammatical functions are pivotal in these capacities as well as in sign formalization in general. The paper motivates the usefulness of the various functionalities and illustrates the way in which they work together in a formally uniform system. |
Parsing/Theorem-Proving for Logical Grammar CatLog3
Abstract: CatLog3 is a 7000-line Prolog parser/theorem-prover for logical categorial grammar. In logical categorial grammar, syntax is universal and grammar is reduced to logic: an expression is grammatical if and only if an associated logical statement is a theorem of a fixed calculus. Since the syntactic component is invariant, being the logic of the calculus, logical categorial grammar is purely lexicalist and a particular language model is defined by just a lexical dictionary. The foundational logic of continuity was established by Lambek (Am Math Mon 65:154–170, 1958) (the Lambek calculus), while a corresponding extension that also includes a logic of discontinuity was established by Morrill and Valentín (Linguist Anal 36(1–4):167–192, 2010) (the displacement calculus). CatLog3 implements a logic including as primitive connectives the continuous (concatenation) and discontinuous (intercalation) connectives of the displacement calculus, additives, first-order quantifiers, normal modalities, bracket modalities, and universal and existential subexponentials. In this paper we review the rules of inference for these primitive connectives and their linguistic applications, and we survey the principles of Andreoli's focusing, and of a generalisation of van Benthem's count-invariance, on the basis of which CatLog3 is implemented. |
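As a rough illustration of the count-invariance idea mentioned in this abstract, the sketch below implements the basic check for the plain Lambek calculus only (product, left and right division). It is not CatLog3's Prolog code and does not cover the generalisation the paper surveys. The tuple encoding of types and the names `atom_counts` and `passes_count_check` are assumptions made for this example. The check is a necessary condition: a sequent that fails it cannot be a theorem, while one that passes may still be underivable.

```python
# Hypothetical encoding (an assumption of this sketch, not CatLog3's):
#   atom            -> "np", "s", ...
#   product A*B     -> ("*", A, B)
#   right div B/A   -> ("/", B, A)    # result B, argument A
#   left div  A\B   -> ("\\", A, B)   # argument A, result B

def atom_counts(typ):
    """Signed count of every atom in a type: results count +, arguments count -."""
    if isinstance(typ, str):
        return {typ: 1}
    op, left, right = typ
    if op == "*":
        positive, negative = [left, right], []
    elif op == "/":
        positive, negative = [left], [right]
    elif op == "\\":
        positive, negative = [right], [left]
    else:
        raise ValueError(f"unknown connective: {op!r}")
    total = {}
    for sign, parts in ((1, positive), (-1, negative)):
        for part in parts:
            for atom, n in atom_counts(part).items():
                total[atom] = total.get(atom, 0) + sign * n
    return total


def passes_count_check(antecedent, succedent):
    """True if every atom has the same count on both sides of the sequent."""
    lhs = {}
    for typ in antecedent:
        for atom, n in atom_counts(typ).items():
            lhs[atom] = lhs.get(atom, 0) + n
    rhs = atom_counts(succedent)
    return all(lhs.get(a, 0) == rhs.get(a, 0) for a in set(lhs) | set(rhs))


# Transitive verb type (np\s)/np
tv = ("/", ("\\", "np", "s"), "np")
print(passes_count_check(["np", tv, "np"], "s"))  # balanced: np cancels, one s remains
print(passes_count_check(["np", tv], "s"))        # unbalanced: one np argument is missing
```

Because the check is linear in the size of the sequent, a prover can use it to prune hopeless search branches cheaply before attempting a full proof.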
Variable Handling and Compositionality: Comparing DRT and DTS
Abstract: This paper provides a detailed comparison between discourse representation theory (DRT) and dependent type semantics (DTS), two frameworks for discourse semantics. Although it is often stated that DRT and those frameworks based on dependent types are mutually exchangeable, we argue that they differ with respect to variable handling, more specifically, how substitution and other operations on variables are defined. This manifests itself in two recalcitrant problems posed for DRT; namely, the overwrite problem and the duplication problem. We will see that these problems still pose a challenge for various extended compositional systems based on DRT, while they do not arise in a framework of DTS where substitution and other operations are defined in the standard type-theoretic manner without stipulating any additional constraints. We also compare the notions of contexts underlying these two kinds of frameworks, namely, contexts represented as assignment functions and contexts represented as proof terms, and see what different predictions they make for some linguistic examples. |
Boon or Burden? The Role of Compositional Meaning in Figurative Language Processing and Acquisition
Abstract: We critically address current theories of figurative language, focusing on the role of literal or compositional meaning in the interpretation of non-literal expressions, including idioms and metaphors. Specifically, we formulate and discuss the processing hypothesis that compositional meaning may either facilitate or impede the recovery or construction of the intended figurative meaning depending on multiple factors, in particular the expression's decomposability and the "strength" of semantic relations between the compositional and figurative meanings. As a case study, we consider research on processing and acquisition of figurative expressions in highly verbal individuals with autism spectrum disorder (ASD) in comparison to neurotypical individuals, and examine the factors that may account for the observed comprehension deficits in the ASD group. Based on this evidence, we discuss some of the strategies employed by language users in processing non-compositional or non-literal expressions, and we highlight implications for research on natural language comprehension and processing systems in the domain of figurative meaning. |
Representing Types as Neural Events
Abstract: One of the claims of Type Theory with Records is that it can be used to model types learned by agents in order to classify objects and events in the world, including speech events. That is, the types can be represented by patterns of neural activation in the brain. This claim would be empty if it turned out that the types are in principle impossible to represent on a finite network of neurons. We will discuss how to represent types in terms of neural events on a network and present a preliminary computational implementation that maps types to events on a network. The networks we use are closely related to the transparent neural networks discussed by Strannegård. |