The research community uses a large number of preprint servers, which differ both technically and in terms of content. In the PIXLS project, we would like to systematically open up preprint servers as a previously neglected source of information, make them more accessible through value-added services, and ensure the reusability of the metadata and full texts obtained.
Information science is a very heterogeneous and multidisciplinary scientific discipline. In this project, various methods will be used to extract and create a corpus that contains only dissertations relevant to information science.
The Journalistic Information Extraction (JoIE) project aims to address the problem of extracting information from unstructured sources that are relevant for (data) journalism. Based on the two state-of-the-art tools Workbench and Fonduer, a solution will be developed that can handle the different web data sources and make them usable for journalism by putting them into a structured form.
The STELLA project aims to create an infrastructure for evaluating search and recommendation services within production web-based search systems with real users. STELLA provides an integrated e-Research environment that allows researchers in the field of information retrieval and recommendation services to conduct studies with real users in real environments. The experimental set-ups differ considerably from classical TREC studies, which can only be carried out offline, and from user studies, which only allow laboratory experiments. They thus enable researchers to use an evaluation method that was previously reserved for industrial research or the operators of large online platforms.
This project addresses the question of how search engines can influence political opinion-forming and political issues, and what influence factors such as 'filter bubbles', collaborative filtering, and users' lack of search or media competence have on these processes.
PRIOR, the PRepublicatIOn Radar, will be an integrated tool for science journalists to keep up with the latest scientific research in important domains of knowledge. It will enable them to detect and filter potentially interesting studies in a diverse set of scientific journals. The challenge is to deal with unstructured and heterogeneous types of incoming information. PRIOR will extract, harmonize, and process new embargoed research publications to allow searching, browsing, and filtering. The prototype will consist of two modules: a data extraction and harmonization framework, and a web-based user interface for finding new scientific publications and filtering relevant ones.
Within the Smart Harvesting project, we would like to develop a 'smart' set of tools and workflows that allow non-programmers to create a rich set of web scrapers for building online bibliographies from freely available web resources.
GESIS - Leibniz Institute for the Social Sciences