KILT: a Benchmark for Knowledge Intensive Language Tasks

Solving knowledge-intensive tasks, such as fact checking or open-domain question answering, requires access to a large body of information. A recent study proposes a benchmark and library for this type of task.

The benchmark makes it possible to formulate several knowledge-intensive natural language processing tasks using a common interface and the same knowledge source (a single snapshot of Wikipedia). It can be used for datasets that take a textual input and perform one of five tasks: fact checking, entity linking (assigning a unique Wikipedia page to each entity mentioned in the text), slot filling (collecting information on certain relations of entities), question answering, or dialogue with a user. In every instance, textual spans in specific Wikipedia pages are provided as provenance to support the output. The benchmark can be used to develop and evaluate different models that solve knowledge-intensive tasks.
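To make the idea of a common interface concrete, here is a minimal sketch of how one record type could cover all five tasks. It is an illustration only, not the KILT library's actual data format: the names `Example`, `Provenance`, `TASKS` and `make_example`, as well as the field names and the sample records, are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical task labels for the five task families named in the article.
TASKS = {
    "fact_checking",
    "entity_linking",
    "slot_filling",
    "question_answering",
    "dialogue",
}


@dataclass
class Provenance:
    """A textual span in a Wikipedia page that supports an output (assumed shape)."""
    page_title: str
    span_text: str


@dataclass
class Example:
    """One instance in a shared format: text in, text out, plus supporting spans."""
    task: str
    input_text: str
    output_text: str
    provenance: List[Provenance] = field(default_factory=list)


def make_example(task: str, input_text: str, output_text: str,
                 provenance: List[Provenance]) -> Example:
    """Build an Example, rejecting task labels outside the five families."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return Example(task, input_text, output_text, list(provenance))


# Two invented records from different tasks that share the same structure.
fact = make_example(
    "fact_checking",
    "Paris is the capital of France.",
    "SUPPORTS",
    [Provenance("Paris", "Paris is the capital and most populous city of France.")],
)
question = make_example(
    "question_answering",
    "What is the capital of France?",
    "Paris",
    [Provenance("France", "Its capital is Paris.")],
)
```

Because every task reduces to the same input, output and provenance fields, a single model or retrieval component could in principle be reused across tasks without task-specific data handling.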

Challenging problems such as open-domain question answering, fact checking, slot filling and entity linking require access to large, external knowledge sources. While some models do well on individual tasks, developing general models is difficult as each task might require computationally expensive indexing of custom knowledge sources, in addition to dedicated infrastructure. To catalyze research on models that condition on specific information in large textual resources, we present a benchmark for knowledge-intensive language tasks (KILT). All tasks in KILT are grounded in the same snapshot of Wikipedia, reducing engineering turnaround through the re-use of components, as well as accelerating research into task-agnostic memory architectures. We test both task-specific and general baselines, evaluating downstream performance in addition to the ability of the models to provide provenance. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline, outperforming more tailor-made approaches for fact checking, open-domain question answering and dialogue, and yielding competitive results on entity linking and slot filling, by generating disambiguated text. KILT data and code are available online.
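The abstract mentions evaluating both downstream performance and the ability to provide provenance. The sketch below shows one simple way such a joint evaluation could be computed. It is an assumption made for illustration, not the benchmark's official metric: the function name `evaluate`, the exact-match comparison, and the page-level overlap criterion are all hypothetical choices.

```python
from typing import Dict, List, Set, Tuple

# Each item pairs an answer string with a set of Wikipedia page titles.
Item = Tuple[str, Set[str]]


def evaluate(predictions: List[Item], references: List[Item]) -> Dict[str, float]:
    """Score answers and provenance separately and jointly (illustrative only)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    n = len(references)
    if n == 0:
        return {"accuracy": 0.0, "provenance_hit_rate": 0.0, "both": 0.0}

    correct = grounded = both = 0
    for (pred_answer, pred_pages), (gold_answer, gold_pages) in zip(predictions, references):
        # Downstream performance: case-insensitive exact match on the answer.
        answer_ok = pred_answer.strip().lower() == gold_answer.strip().lower()
        # Provenance: at least one predicted page is among the gold pages.
        pages_ok = bool(pred_pages & gold_pages)
        correct += answer_ok
        grounded += pages_ok
        both += answer_ok and pages_ok

    return {
        "accuracy": correct / n,
        "provenance_hit_rate": grounded / n,
        "both": both / n,
    }


# Invented toy data: three predictions scored against three references.
predictions = [("Paris", {"France"}), ("Berlin", {"Italy"}), ("rome", {"Spain"})]
references = [("paris", {"France", "Paris"}), ("Berlin", {"Germany"}), ("Madrid", {"Spain"})]
scores = evaluate(predictions, references)
```

Reporting the joint figure alongside plain accuracy separates models that answer correctly with supporting evidence from those that answer correctly without pointing to the right page.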

