Development of Scientific Figure Parser

Over the summer, a group of six University of Helsinki Computer Science students will develop an open source scientific figure parser for the Aalto Datahub, in collaboration with the research group of Professor Juha Joenväärä (Department of Finance, Aalto University).
Financial chart displayed on a tablet

The Aalto Datahub (AUDH) is taking part in a University of Helsinki software development course as a customer, commissioning the development of an open source tool that parses scientific figures into structured data. Six Computer Science students will work on the project over the summer, independently developing a product that meets the requirements we set out as the customer.

Reliably turning large quantities of unstructured data across different file types into structured data is a non-trivial task. One key challenge is making use of the graphical data contained in those documents. High-quality, reproducible graph-to-data pipelines open up new, cost-effective avenues of research in behavioral finance and behavioral economics, as well as in many other fields where the completeness of the information held in a document matters.

The tool is being developed in collaboration with the research group of Professor Juha Joenväärä (Department of Finance, Aalto University). Once complete, the parser will be used in the research of Professor Joenväärä and other Aalto finance researchers. We also hope the tool will prove useful across a wide range of other fields and research projects.

By supporting the development of this parser as an open source tool, the Aalto Datahub aims to make these pipelines accessible to researchers at Aalto University as well as the wider research community.
