Mark Granroth-Wilding

What Happens Next?
Event Prediction Using a Compositional Neural Network Model

Mark Granroth-Wilding and Stephen Clark (2016).
In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI 2016).

Abstract

We address the problem of automatically acquiring knowledge of event sequences from text, with the aim of providing a predictive model for use in narrative generation systems. We present a neural network model that simultaneously learns embeddings for words describing events, a function to compose the embeddings into a representation of the event, and a coherence function to predict the strength of association between two events.

We introduce a new development of the narrative cloze evaluation task, better suited to a setting where rich information about events is available. We compare models that learn vector-space representations of the events denoted by verbs in chains centering on a single protagonist. We find that recent work on learning vector-space embeddings to capture word meaning can be effectively applied to this task, including simple incorporation of a verb's arguments in the representation by vector addition. These representations provide a good initialization for learning the richer, compositional model of events with a neural network, vastly outperforming a number of baselines and competitive alternatives.
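To make the shape of the model concrete, here is a minimal sketch (not the released implementation) of the three components described above: word embeddings for an event's verb and arguments, a composition function, and a pairwise coherence function. The vocabulary size, vector dimensions, layer forms and random weights are illustrative assumptions only; the released code below contains the actual model.

# A minimal sketch of the model shape described in the abstract.
# All names, dimensions and weights here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

VOCAB, DIM, HIDDEN = 1000, 100, 50

# Word embeddings, randomly initialised for the sketch. In the paper,
# embeddings learned with simple vector addition of a verb's arguments
# provide the initialization for the richer compositional model.
E = rng.normal(scale=0.1, size=(VOCAB, DIM))

# Composition: map concatenated verb/argument vectors to one event
# representation through a single hidden layer (an assumed form).
W_compose = rng.normal(scale=0.1, size=(3 * DIM, HIDDEN))

# Coherence: score a pair of event representations with a logistic output.
W_coherence = rng.normal(scale=0.1, size=(2 * HIDDEN,))

def compose_event(verb_id, subj_id, obj_id):
    """Compose word embeddings into an event representation."""
    x = np.concatenate([E[verb_id], E[subj_id], E[obj_id]])
    return np.tanh(x @ W_compose)

def coherence(event_a, event_b):
    """Strength of association between two events, in (0, 1)."""
    pair = np.concatenate([event_a, event_b])
    return 1.0 / (1.0 + np.exp(-(pair @ W_coherence)))

# Score two (verb, subject, object) events sharing a protagonist:
e1 = compose_event(verb_id=12, subj_id=40, obj_id=7)
e2 = compose_event(verb_id=99, subj_id=40, obj_id=3)
print(coherence(e1, e2))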

Files

Code

We release here the code used to process the Gigaword data, build the models and run the experiments, including all the code relevant to the results reported in the paper.

In order to reproduce the results, you also need the Gigaword corpus and a list of the duplicate documents (below). See README.md for more details.

Duplicate articles in the Gigaword corpus are listed in a file gigaword_duplicates.gz, produced by Nate Chambers. (It wasn't released together with the code, but I see no reason not to release it now, since it contains only filenames, not document data.)
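For illustration, here is a minimal sketch of filtering documents against the duplicate list. The file format is an assumption here (one document identifier per line); see README.md for how the released code actually consumes the file.

# Read the gzipped duplicate list and filter documents against it.
# Assumes one document identifier per line, which may not match the
# actual format of gigaword_duplicates.gz.
import gzip

with gzip.open("gigaword_duplicates.gz", "rt") as f:
    duplicates = set(line.strip() for line in f if line.strip())

def keep(doc_id):
    """True if the document is not listed as a duplicate."""
    return doc_id not in duplicates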

Pre-trained models

We release here the trained models that were used to produce the results reported in the paper.

The models are distributed as tarballs, each containing a single directory with all of the model's parameter files. Extract them in the models directory: each tarball creates the required directory structure for its model (entitychains/[model-type]/[model-name]/).
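For example, the tarballs can be unpacked with Python's tarfile module; the *.tar.gz naming pattern is an assumption about how the archives are named.

# Unpack every model tarball into the models directory, so each one
# creates its entitychains/[model-type]/[model-name]/ structure.
import glob
import tarfile

for archive in glob.glob("*.tar.gz"):
    with tarfile.open(archive) as tar:
        tar.extractall(path="models")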

Dataset split

Models were trained on the NYT portion of the Gigaword corpus. 10% of documents were held out as a development set and 10% as a test set.
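As an illustration only, one way to make such an 80/10/10 document split deterministic is to hash document identifiers. This is an assumed scheme with a made-up identifier, not necessarily the split used in the paper; the released code defines the actual document sets.

# Deterministically assign a document to train/dev/test by hashing its
# identifier. The scheme and the example identifier are illustrative.
import hashlib

def split_of(doc_id):
    """Map a document identifier to one of train/dev/test (80/10/10)."""
    h = int(hashlib.md5(doc_id.encode("utf8")).hexdigest(), 16) % 10
    if h == 8:
        return "dev"
    if h == 9:
        return "test"
    return "train"

print(split_of("nyt_eng_199501.0001"))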