Named Entity Processing on Historical Newspapers

© 2020 DHI EPFL

How good are systems at recognizing and disambiguating named entities in multilingual historical newspapers? Which approach is best?
Help us find out by participating in the HIPE shared task!
The Impresso project is happy to announce the release of sample data for HIPE, a CLEF shared task on Named Entity Processing for Historical Newspapers.

Since its introduction some twenty years ago, named entity (NE) processing has become an essential component of virtually any text mining application and has undergone major changes. Two main trends characterise its recent development: the adoption of deep learning architectures, and the consideration of textual material originating from historical and cultural heritage collections. While the former opens up new opportunities, the latter introduces new challenges with heterogeneous, historical and noisy inputs. Although NE processing tools are increasingly being used on historical documents, their performance remains below that achieved on contemporary data, and results are hardly comparable. In this context, the objective of HIPE is threefold:

  1. to strengthen the robustness of existing approaches on non-standard input;
  2. to enable performance comparison of NE processing on historical texts; and, in the long run,
  3. to foster efficient semantic indexing of historical documents in order to support scholarship on digital cultural heritage collections.


  • Task 1: Named Entity Recognition and Classification.
  • Task 2: Named Entity Linking.



Registration is open via the CLEF 2020 portal until 26 April 2020.