[AISWorld] FinTOC'4 Call For Participation
FinTOC SharedTask
fin.toc.task at gmail.com
Fri Feb 4 03:00:10 EST 2022
The first CFP went out with some errors (the link to the participation form
and the full list of the organising committee). Sorry about that; please see
the updated link and CFP below.
Apologies for cross-posting.
Call for participation
FNP-2022 Shared Task: “FINTOC’4 -Table Of Content extraction from Financial
Documents”
To be held as part of the 4th Financial Narrative Processing Workshop (FNP
2022), at the 13th Edition of the Language Resources and Evaluation
Conference (LREC 2022).
Lancaster, United Kingdom [online] on 24 June 2022. A free one-day event.
===================
Shared Task URL: http://wp.lancs.ac.uk/cfie/fintoc2022/
Workshop URL: http://wp.lancs.ac.uk/cfie/fnp2022/
Participation Form: https://tinyurl.com/wb76cjxj
_____________________________________________
Awards and Prizes:
The winning team of the FinTOC 2022 shared task will receive an achievement
certificate and a cash prize, which will be announced shortly.
_____________________________________________
Shared Task Description:
A vast number of financial documents are created and published constantly
in machine-readable formats (generally PDF), with only minimal structural
information. Firms use such documents to report their activities, financial
situation or potential investment plans to shareholders, investors and the
financial markets; these are typically corporate annual reports containing
detailed financial and operational information.
In some countries, such as the US or France, regulators such as the SEC
(EDGAR) or the AMF require firms to follow a certain template when reporting
their financial results, to ensure standardisation and consistency across
firms’ disclosures. In other European countries, on the other hand,
management usually has more discretion over what, where and how to report,
resulting in a lack of standardisation between financial documents published
within the same market.
Existing work on book and document table of contents (TOC) recognition has
relied almost entirely on small, application-dependent, domain-specific
datasets. However, the TOCs of documents from different domains differ
significantly in their visual layout and style, making TOC recognition a
challenging problem for a large-scale collection of heterogeneous documents
and books. Compared to regular books (mostly provided in a full-text format
with limited structural information such as pages and paragraphs),
financial documents, containing textual and non-textual content, have a
more sophisticated structure including parts, sections, sub-sections and
sub-sub-sections.
In this shared task, we focus on analysing financial prospectuses: official
PDF documents in which investment funds precisely describe their
characteristics and investment modalities. Although the content they must
include is often regulated, their format is not standardised and displays a
great deal of variability, ranging from plain text to more graphical and
tabular presentations of data and information. The majority of prospectuses
are published without a table of contents (TOC), which is usually needed to
help readers navigate within the document by following a simple outline of
headers and page numbers, and to assist legal teams in checking that all the
required contents are fully included. Thus, automatic analysis of
prospectuses to extract their structure is becoming more and more vital to
many firms across the world.
Thanks to the contribution of the Autonomous University of Madrid (UAM,
Spain), the fourth edition of the FinTOC shared task welcomes a new track
for Spanish documents in addition to English and French. Systems will be
scored on both title detection and TOC generation performance, as has been
the practice in previous editions.
Participants need to register. Once registered, all participating teams
will be provided with a common training dataset containing PDF documents
and the associated TOC annotation.
To participate please use the registration form below to add details of
your team: https://tinyurl.com/286d67sc (this is now open as of 03/02/2022)
_____________________________________________
Important dates:
- 1st Call for papers & shared task participants: 10 January 2022
- 2nd Call for papers & shared task participants: 14 February 2022
- Training set release: 25 February 2022
- Blind test set release: 25 March 2022
- Systems submission: 1 April 2022
- Release of results: 5 April 2022
- Paper submission deadline: 8 April 2022
- Papers notification of acceptance: 3 May 2022
- Workshop date: 24 June 2022 (full day event)
_____________________________________________
Contact:
For any questions about the shared task, please contact us at:
fin.toc.task at gmail.com
_____________________________________________
Shared Task Organisers:
- Abderrahim Aitazzi, Fortia Financial Solutions
- Sandra Bellato, Fortia Financial Solutions
- Blanca Carbajo Coronado, Universidad Autónoma de Madrid
- Dr Ismail El Maarouf, Fortia Financial Solutions
- Mei Gan, Fortia Financial Solutions
- Dr Juyeon Kang, Fortia Financial Solutions
- Prof. Ana Gisbert, Universidad Autónoma de Madrid
- Prof. Antonio Moreno Sandoval, Universidad Autónoma de Madrid