GIRNet: Interleaved Multi-Task Recurrent State Sequence Models

Gupta, Divam; Chakraborty, Tanmoy; Chakrabarti, Soumen (2019). GIRNet: Interleaved Multi-Task Recurrent State Sequence Models. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), pp. 6497-6504. ISSN 2159-5399.


Official URL: http://doi.org/10.1609/aaai.v33i01.33016497


Abstract

In several natural language tasks, labeled sequences are available in separate domains (say, languages), but the goal is to label sequences from mixed domains (such as code-switched text). Or, we may have available models for labeling whole passages (say, with sentiments), which we would like to exploit toward better position-specific label inference (say, target-dependent sentiment annotation). A key characteristic shared across such tasks is that different positions in a primary instance can benefit from different ‘experts’ trained from auxiliary data, but labeled primary instances are scarce, and labeling the best expert for each position entails unacceptable cognitive burden. We propose GIRNet, a unified position-sensitive multi-task recurrent neural network (RNN) architecture for such applications. Auxiliary and primary tasks need not share training instances. Auxiliary RNNs are trained over auxiliary instances. A primary instance is also submitted to each auxiliary RNN, but their state sequences are gated and merged into a novel composite state sequence tailored to the primary inference task. Our approach is in sharp contrast to recent multi-task networks like the cross-stitch and sluice networks, which do not control state transfer at such fine granularity. We demonstrate the superiority of GIRNet using three applications: sentiment classification of code-switched passages, part-of-speech tagging of code-switched text, and target position-sensitive annotation of sentiment in monolingual passages. In all cases, we establish new state-of-the-art performance beyond recent competitive baselines.
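The core idea in the abstract, running the primary instance through each auxiliary RNN and merging the resulting state sequences via position-wise gates, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the names `gate_states` and `W_g`, and the softmax gating form, are assumptions for exposition.

```python
# Hedged sketch: position-wise gating of auxiliary RNN state sequences into a
# composite sequence, in the spirit of GIRNet's abstract. All names here
# (gate_states, W_g) are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gate_states(aux_states, W_g):
    """aux_states: (K, T, d) hidden states of K auxiliary RNNs run on the
    same length-T primary instance. W_g: (K*d, K) scoring matrix that yields
    per-position gate logits. Returns a (T, d) composite sequence: at each
    position, a convex combination of the K auxiliary states."""
    K, T, d = aux_states.shape
    concat = aux_states.transpose(1, 0, 2).reshape(T, K * d)  # (T, K*d)
    gates = softmax(concat @ W_g, axis=-1)                    # (T, K), rows sum to 1
    return np.einsum('tk,ktd->td', gates, aux_states)

# Toy example: 2 auxiliary 'experts', a length-5 primary sequence, state dim 8.
K, T, d = 2, 5, 8
aux = rng.standard_normal((K, T, d))   # stand-ins for frozen auxiliary RNN outputs
W_g = rng.standard_normal((K * d, K))
composite = gate_states(aux, W_g)
print(composite.shape)  # (5, 8)
```

In the actual model the composite sequence would then feed the primary task's decoder; here the point is only that the gate is computed separately at every position, which is the fine granularity the abstract contrasts with cross-stitch and sluice networks.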

Item Type: Article
Source: Copyright of this article belongs to Association for the Advancement of Artificial Intelligence
ID Code: 130893
Deposited On: 01 Dec 2022 06:12
Last Modified: 27 Jan 2023 09:38
