IR Cologne at TREC 2021
We will participate in the 2021 TREC News Track. We just submitted our experimental runs, which include Björn’s relation extraction methods to enhance background linking tasks. Based on the 2020 test collection, we get a significant improvement. We are curious to see how our approach performs in 2021. Hope to see you all at NIST this winter.
Last week, we enjoyed welcoming our newest team member Björn Engelmann. Björn, who has a Master’s degree in computer science with a focus on Data Science and philosophy, will work in the JoIE project. Together with our colleagues from Science Media Center, we are going to dive right into this promising research and development project with the idea to enable data journalists to get meaningful information out of mostly unstructured web data. Welcome on board, Björn!
New Project: JoIE - We are hiring!
We are happy to announce that Klaus Tschira Stiftung has accepted our grant application for the JoIE project (Journalistic Information Extraction). Together with our partner from Science Media Center Germany, we will build an information extraction pipeline for data journalists. Sounds interesting? Apply today and be part of the team!
We participate in the TREC-COVID Challenge
NIST, the U.S. National Institute of Standards and Technology, has started the TREC-COVID Challenge to build and evaluate infrastructures and systems to support the search for relevant information on COVID-19.
Researchers, clinicians, and policy makers involved with the response to COVID-19 are constantly searching for reliable information on the virus and its impact. This presents a unique opportunity for the information retrieval (IR) and text processing communities to contribute to the response to this pandemic, as well as to study methods for quickly standing up information systems for similar future events. The results of the TREC-COVID Challenge will identify answers for some of today’s questions while building infrastructure to improve tomorrow’s search systems.
Our Information Retrieval Group at TH Köln (University of Applied Sciences), led by Philipp Schaer, is taking part in this challenge. Timo Breuer, a PhD student in our group, is the lead developer of our three systems participating in this international campaign. We released all of our code as open source. Check our GitHub repository for code and implementation details.
LiLAS at CLEF 2020 - Call for Papers
We invite you to submit an early research or open ideas paper to the LiLAS workshop. The Living Labs for Academic Search (LiLAS) Workshop at CLEF 2020 will be held in Thessaloniki, Greece, from 22 to 25 September 2020. The Call for Contributions is open until 24 May 2020. Please feel free to contact us if you have any questions!
CLEF 2019 - Replicability and Reproducibility
Timo’s first paper got accepted at CLEF 2019 for the CENTRE workshop. The goal of CENTRE is to run a joint CLEF/NTCIR/TREC task that challenges participants to reproduce the best results of the most interesting systems submitted in previous editions of CLEF/NTCIR/TREC and to contribute the additional components and resources developed to reproduce the results back to the community.
New Project: STELLA
We are happy to announce that DFG - German Research Foundation has accepted our grant application for the STELLA project (Infrastructure for Living Labs). Together with our partners from GESIS and ZB MED, we will build a new evaluation environment for retrieval and recommender systems. These online evaluations will differ considerably from classical TREC studies and will enable researchers to use an evaluation method that was previously reserved for industrial research.
Smart Harvesting II presentation and Hands-on Lab at the 107th German Librarian's Day
Mandy Neumann introduced the DFG-funded Smart Harvesting II project to the attendees of the 107th German Librarian's Day in Berlin. Her presentation gave an overview of the project, its objectives, the work packages already completed, and an outlook on future work. In addition, Mandy Neumann and Christopher Michels from the University of Trier led a Hands-on Lab in which the participants were introduced to OXPath on the basis of a concrete example and enabled to design their own expressions for their specific use cases.
Mandy Neumann participated in the JCDL 2018
From 3rd to 6th June Mandy Neumann participated in the Joint Conference on Digital Libraries (JCDL 2018) in Fort Worth, TX, USA. You can find our paper Prioritizing and Scheduling Conferences for Metadata Harvesting in dblp besides all other accepted papers on 2018.jcdl.org now. Also have a look at the presentation that was held.
Paper accepted for JCDL 2018 available on arXiv
We are happy to announce that the paper we wrote together with Christopher and Ralf from dblp on “Prioritizing and Scheduling Conferences for Metadata Harvesting in dblp” was accepted at the Joint Conference on Digital Libraries (JCDL 2018). As usual, we deposited a preprint of the paper at arXiv.org.
Poster presentation at ID@NRW 2018
Mandy Neumann will present her PhD proposal related to the Smart Harvesting II project at the conference Innovationstag Digitalisierung NRW 2018 at the Rheinische Fachhochschule Köln on 1st of March 2018. The main goal of the ID@NRW 2018 is to provide a forum for PhD students and scientists from the Graduate Institute NRW to discuss their research projects.
New Project: PRIOR - PRepublicatIOn Radar
We are pleased to announce that the Google Digital News Initiative has approved funding for a prototype project we will carry out in collaboration with the SMC Lab. PRIOR, the PRepublicatIOn Radar, will be an integrated tool for science journalists to keep up to date with the latest scientific research, enabling them to detect and filter potentially interesting studies in a diverse set of scientific journals. Find out more about PRIOR on our project page.
New Project: ESUPOL - The Influence of Web Search Engines on Political Opinion Formation
The Ministry of Culture and Science of the German State of North Rhine-Westphalia has approved funding for a state-wide graduate institute on Digital Societies. Philipp Schaer (Professor for Information Retrieval at TH Köln, University of Applied Sciences) and Sven-Oliver Proksch (Cologne Center for Comparative Politics) will conduct an interdisciplinary project on the influence of search engines on political opinion formation. The project will collect large amounts of web data from various search engines and analyze them using natural language processing and investigate the effects on opinion formation using laboratory experiments.
Publication in Code4Lib Journal Vol. 38
We got an article published in the Code4Lib Journal (Issue 38): “Web-Scraping for Non-Programmers: Introducing OXPath for Digital Library Metadata Harvesting”. Thanks to our co-author Jan Steinberg from GESIS! For a full list of publications, check our group’s publication list.
New Project: Smart Harvesting II
Within the Smart Harvesting project, we would like to develop a ‘smart’ set of tools and workflows that allow non-programmers to create a rich set of web scrapers for building online bibliographies out of freely available web resources.
We would like to welcome our new colleague Mandy Neumann who joined us last week. She is going to work in the Smart Harvesting II project.