FINTOC 2021 Shared-Task: “Financial Document Structure Extraction”

This online event was held at the 3rd Financial Narrative Processing Workshop (FNP 2021), Lancaster, United Kingdom, on 15 September 2021.

21 September 2021

FINTOC 2021: An Overview

Financial documents are produced regularly and are usually published in machine-readable formats (mostly PDF) with only minimal structural information. Firms use these documents to report their activities, financial situation or potential investment plans to shareholders, investors and the financial markets. They are essentially annual corporate reports containing detailed financial and operational information.

In countries such as the US and France, regulators such as the SEC (via its EDGAR system) or the AMF expect firms to follow a specific template when reporting their financial results, so as to ensure standardisation and consistency across firms’ disclosures. In other European countries, however, management usually has more discretion over what, where and how to report. This leads to a lack of standardisation between financial documents published within the same market.

This shared task focuses on analysing financial prospectuses, i.e. official PDF documents in which investment funds describe their characteristics and investment modalities. Although their content is usually regulated, their format is not standardised, which allows considerable variety: it ranges from plain text to more graphical and tabular presentations of data and information. The majority of prospectuses are published without a table of contents (TOC), which readers often need to navigate the document and which helps legal teams check that the mandatory content has been included.

Hence, automatically analysing prospectuses to extract their structure is becoming increasingly important to many firms across the globe.

What does the FINTOC 2021 shared task consist of?

The third edition of the FinTOC shared task is based on the same two tracks as the FinTOC’2 edition: 

  • One track for English documents;
  • And another for French documents.

The task scores systems on both title detection and TOC generation performance. The data formats have been simplified to make it as easy as possible for any interested researcher to participate and submit their systems’ outputs to FinTOC’3.
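As an illustration of what scoring title detection can look like, the sketch below computes precision, recall and F1 between a set of predicted titles and a set of annotated titles. It is only a minimal example under stated assumptions: the function name title_detection_f1, the (page, text) representation of a title and the exact-match criterion are hypothetical choices made here, not the official FinTOC evaluation script or data format.

```python
def title_detection_f1(predicted, gold):
    """Return (precision, recall, F1) for predicted vs. annotated titles.

    Assumption: each title is a hashable item, here a (page, text) tuple,
    and a prediction counts as correct only on an exact match.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: three annotated titles, three predictions, two matches.
gold_titles = {(1, "Investment objective"), (2, "Risk profile"), (3, "Fees")}
predicted_titles = {(1, "Investment objective"), (3, "Fees"), (4, "Contact")}

precision, recall, f1 = title_detection_f1(predicted_titles, gold_titles)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the simplest possible criterion. A real evaluation may tolerate small differences in the extracted text, and it would also need to assess the hierarchy levels of the generated TOC, which this sketch does not cover.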

All participants must register. Once registered, the participating teams will be provided with a common training dataset containing PDF documents and the associated TOC annotation.

How to participate in the FINTOC 2021 Shared-Task

If you wish to take part, please use the registration form below to enter your team’s details: https://forms.gle/qawe1dP13MAsTdLu6

Key Dates

  • Blind test set release: 15 August 2021
  • Systems submission: 23 August 2021
  • Release of results: 23 August 2021

Prizes/Awards

The winning team of the FinTOC 2021 shared task will receive an achievement certificate and a cash prize of US $650. The winners will also be given the opportunity to present their work at the workshop.

Shared-Task Organisers – Fortia Financial Solutions

  • Dr Ismail El Maarouf
  • Dr Juyeon Kang
  • Abderrahim Aitazzi
  • Sandra Bellato
  • Mei Gan

Contact

Questions about the FinTOC-2021 shared task can be sent to:

fin.toc.task@gmail.com
