PIXLS - Preprint Information eXtraction for Life Sciences

Profile & Description

Preprints are a relatively new way of making scientific results available to the wider research community before peer review. There are now a large number of different preprint servers in use by the research community, which differ both technically and in terms of content. In this project, the partners TH Köln – University of Applied Sciences and ZB MED - Information Centre for Life Sciences aim to systematically open up preprint servers as a previously neglected source of information, make them more accessible through value-added services, and ensure the reusability of the metadata and full texts obtained. To this end, the project will conduct research and develop an e-research technology comprising an information extraction pipeline in which the data are homogenized and merged into the ZB MED Knowledge Environment (ZB MED KE). Value-added services, such as Linked Open Data interfaces and innovative reputation and trend indicators, will then be built on this data and made available to the library and scientific community for subsequent use.

Funding Agency
DFG - Deutsche Forschungsgemeinschaft
Partner Institution
ZB MED - Information Centre for Life Sciences
People Involved
Prof. Dr. Philipp Schaer (Technische Hochschule Köln)
Fabian Haak, M.Sc. (Technische Hochschule Köln)
Prof. Dr. Konrad Förstner (ZB MED - Information Centre for Life Sciences)
Benjamin Wolff, M.Sc. (ZB MED - Information Centre for Life Sciences)

Project
PIXLS - Preprint Information eXtraction for Life Sciences

Duration
2023 - 2025

Publications

2024

BATS: BenchmArking Text Simplicity.
In: L.-W. Ku, A. Martins and V. Srikumar, editors, Findings of the Association for Computational Linguistics ACL 2024, pages 11968-11989. Association for Computational Linguistics, Bangkok, Thailand and virtual meeting, 2024.
Christin Kreutz, Fabian Haak, Björn Engelmann and Philipp Schaer.
Dynamics in Search Engine Query Suggestions for European Politicians.
In: WEBSCI '24: Proceedings of the 16th ACM Web Science Conference, pages 279–289. 2024.
Franziska Pradel, Fabian Haak, Sven-Oliver Proksch and Philipp Schaer.
The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias.
2024.
Timo Spinde, Smilla Hinterreiter, Fabian Haak, Terry Ruas, Helge Giese, Norman Meuschke and Bela Gipp.

2023

Text Simplification of Scientific Texts for Non-Expert Readers.
In: SimpleText@CLEF-2023, volume abs/2307.03569, series CEUR Workshop Proceedings. 2023.
Björn Engelmann, Fabian Haak, Christin Katharina Kreutz, Narjes Nikzad-Khasmakhi and Philipp Schaer.
Preliminary Results of a Scientometric Analysis of the German Information Retrieval Community 2020-2023.
In: M. Leyer and J. Wichmann, editors, Proceedings of the LWDA 2023 Workshops: BIA, DB, IR, KDML and WM. Marburg, Germany, 09.-11. October 2023, volume 3630, series CEUR Workshop Proceedings, pages 222-230. CEUR-WS.org, 2023.
Philipp Schaer, Svetlana Myshkina and Jüri Keller.