Feasibility of applying off-the-shelf Artificial Intelligence tools on digital library images collections

This poster is part of the OR2020 Virtual Poster Session which takes place in the week of June 1-5. We encourage you to ask questions and engage in discussion on this poster by using the comments feature. Authors will respond to comments during this week.

Author:

Harish Maringanti

Poster description:

Digital libraries rely on keywords to make their collections discoverable. Missing keywords are less of an issue for textual collections because Optical Character Recognition (OCR) can make the text itself searchable. With visual collections, if item-level information is not available, discovering relevant items becomes a major problem. Creating item-level information for visual collections is still a manual process in most cultural heritage institutions. Coupled with collection processing backlogs, this presents a serious problem for humanities scholars: most cultural heritage material is still either not available online or difficult to find in an online environment. Leveraging existing machine learning tools is one way to address this issue.

The Marriott Library received one-year funding (July 2018 – June 2019) to explore the feasibility of using machine learning algorithms to generate descriptive metadata for archival images. I will share the results of our year-long project along with lessons learned.

About the presenter:

Harish Maringanti is the Associate Dean for IT & Digital Library Services at the J. Willard Marriott Library, University of Utah. He is responsible for advancing the Library’s technology initiatives. He was the lead PI on several grant-funded projects, including the recently concluded Lyrasis Catalyst Fund grant, “Machine Learning meets Library Archives”, which explored the feasibility of applying machine learning tools to digital library data. His primary research interests include applications of emerging technologies in digital libraries.

Open Polar: Thematic harvesting of Polar region resources

Authors:

Tamer S. Abu-Alam, Obiajulu Odu, Per Pippin Aspaas, Stein Høydalsvik, Leif Longva, Karl Magnus Nilsen

Poster description:

Recent growth in the number of scholarly documents has intensified the need to discover, share, exchange and reuse scholarly information across the scientific community. However, our work has shown that there is a 60% findability gap for polar-related scholarly documents (doi.org/10.7557/7.4682). This gap signals the need for a database of open-access records about the Polar Regions that is available to researchers, students and the wider public through a single search platform.

Based on the obligations and motivations of UiT The Arctic University of Norway and the Norwegian Polar Institute, we are working on a project, Open Polar, which aims to create a searchable, homogeneous and seamless database of polar-related open-access records and make it available to the scientific community.

About the presenter:

Tamer S. Abu-Alam, UiT The Arctic University of Norway

Modifying, refining and developing new features for a Research Information System (RIS) at Helmut-Schmidt-University Hamburg

Authors:

Antje Groneberg, Andrea Bollini

Poster description:

This poster describes the various ways in which new features can be inspired and developed for an open source platform, specifically DSpace-CRIS, during the ongoing implementation of a RIS (openHSU at the Helmut-Schmidt-University / University of the Federal Armed Forces Hamburg). Our DSpace-CRIS installation has been significantly and positively influenced by other DSpace-CRIS instances. Some of our new features were not available in the official DSpace-CRIS code base but were inspired by features already available in other DSpace-CRIS installations, and only had to be modified and customized. Some could be developed by refining software elements already available in the official code base of DSpace or DSpace-GLAM. Others have to be developed from scratch.

About the presenter:

Antje Groneberg graduated with a Master’s degree in Sociology and German Studies in 2011. Afterwards, she completed a two-year library traineeship and graduated with a Master’s degree in Library and Information Science. Since 2013, she has worked as a subject librarian at the Helmut-Schmidt-University / University of the Federal Armed Forces Hamburg. During the optimization of the discovery system, her interest in innovative library services grew. After organizing the redesign of the library webpage, she now dedicates herself, as head of the Digital Library Unit within the university library, to building a research information system for the university.

OAPEN Open Access Books Toolkit

Author:

Tom Mosterd

Poster description:

There is growing interest in making academic books Open Access (OA), with the number of OA books increasing each year along with the introduction of additional funder mandates. However, limited awareness amongst authors, a lack of understanding, and common misconceptions about licensing and quality form barriers in the transition to OA for books.

Within the context of open scholarship, this calls for an open resource that is easy to use, kept up to date and relevant for authors and research support worldwide. This has resulted in the concept of an Open Access Books Toolkit.

About the presenter:

Tom Mosterd is the Community Manager for the Directory of Open Access Books (DOAB) and OAPEN. He is responsible for communication plans, for managing the SCOSS fundraising campaign for both DOAB and OAPEN as well as several community-oriented projects, including the OA Books Toolkit.

Data mining, NLP and Machine Learning at the service of a scalable open repository

Authors:

Yann Mahé, Carolina Sanchez, Manuel Guzman

Poster description:

Apart from making research outputs visible and accessible, open repositories nowadays face another major challenge. Research institutions generate large amounts of data, and controlling, managing and structuring that data remains one of the most important goals, to ensure that it is not only collected but also put to meaningful use.

After the release of the new open source repository Polaris OS in March 2018, MyScienceWork developed and embedded several Text & Data Mining (TDM) technologies and algorithms that enable the solution to convert raw data into usable information. These technologies make possible data extraction, automatic topic classification and analytics, amongst others.

The presentation will demonstrate how the integration of Text and Data Mining and Artificial Intelligence into Polaris OS responds to different institutional repository stakeholders’ needs.

The experience of getting a “No Code” open repository

Authors:

Yann Mahé, Carolina Sanchez, Manuel Guzman

Poster description:

In March 2018, MyScienceWork released Polaris OS, a new open source solution which seeks to address the major challenges of open repositories. Before developing the Polaris OS platform, we examined existing open repositories and noticed that all of them required advanced programming skills to set up and even greater expertise to customize.

By choosing to develop a “No Code” solution (Polaris OS), MyScienceWork decided to make setting up and modifying open repositories easier, with little to no programming skills required. Our poster will present the benefits of setting up a “No Code” solution for institutional repositories through the example of Polaris OS.

About the presenter:

Carolina Sanchez holds a Bachelor’s degree in Law and Political Sciences. She worked as a paralegal at an international law firm before joining ICEX-CECO Business School in Madrid to complete her studies with an MBA in International Management. After graduating, she worked as an International Trade consultant at the Spanish Embassy in Morocco. Following this, she joined a high-tech company in Mexico City as a Business Development Manager. She returned to Europe to join MyScienceWork’s team, bringing her international commercial experience to business development.

UnityFVG – Regional Research Portal implementing OpenAIRE CRIS/Data/Literature Guidelines with DSpace-CRIS

Author:

Jordan PIŠČANC

Poster description:

The Friuli-Venezia Giulia “Regional Scientific System” includes three public universities that started a common project (UnityFVG) in 2014 to integrate and expose their research “entities”. The UnityFVG Research Portal is based on the DSpace-CRIS solution and uses CERIF-XML over OAI-PMH to harvest the main entities (Researchers, Organizations, Publications) from the institutional DSpace-CRIS systems, also exploiting their REST interface to enrich the data exposed on the Portal. In 2018 we started harvesting other entities (Research Groups, Public Engagement Events, Journals, Conferences, Datasets) and linking them to Researcher Profiles, developing a special interface to search and view Researchers’ “skills”. All this data is collected and linked together using persistent identifiers (PIDs) such as DOI, Handle and ORCID. The use of PIDs provides an effective response to a major challenge of the project: collecting plenty of information from different sources and matching it to unique entities/items, without ambiguities or duplicates, that represent the research life-cycle.
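
The PID-based matching described above can be illustrated with a minimal Python sketch. It is not the UnityFVG implementation: the record layout (`source`, `pids`, `fields`), the function names and the normalization rules are assumptions made for illustration only. Records harvested from different sources are merged into a single entity whenever they share a normalized DOI, Handle or ORCID.

```python
# Hypothetical sketch: merge harvested records that share a persistent identifier.
# Record layout and helper names are illustrative assumptions, not UnityFVG code.

_PREFIXES = {
    "doi": ("https://doi.org/", "http://doi.org/",
            "https://dx.doi.org/", "http://dx.doi.org/", "doi:"),
    "orcid": ("https://orcid.org/", "http://orcid.org/"),
    "handle": ("https://hdl.handle.net/", "http://hdl.handle.net/", "hdl:"),
}


def normalize_pid(scheme, value):
    """Return a comparable (scheme, value) key for a persistent identifier."""
    scheme = scheme.strip().lower()
    value = value.strip()
    # Strip a resolver or scheme prefix if one is present.
    for prefix in _PREFIXES.get(scheme, ()):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    if scheme == "doi":
        value = value.lower()   # DOIs are case-insensitive
    elif scheme == "orcid":
        value = value.upper()   # the final checksum character may be "X"
    return (scheme, value)


def merge_by_pid(records):
    """Group records into entities; records sharing any PID become one entity."""
    entities = []   # merged entities, in first-seen order
    index = {}      # normalized PID -> position in `entities`
    for record in records:
        keys = [normalize_pid(s, v) for s, v in record["pids"]]
        position = next((index[k] for k in keys if k in index), None)
        if position is None:
            position = len(entities)
            entities.append({"pids": set(), "sources": [], "fields": {}})
        entity = entities[position]
        entity["pids"].update(keys)
        entity["sources"].append(record["source"])
        for name, value in record["fields"].items():
            entity["fields"].setdefault(name, value)  # first source wins
        for key in keys:
            index.setdefault(key, position)
    return entities
```

The sketch attaches a record to the first entity it matches. A record that bridges two entities created earlier would need an extra merge step (for example a union-find structure), which is omitted here.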

About the presenter:

Jordan PIŠČANC graduated in Electronic Engineering and holds a Master’s degree in “Privacy & Security ICT in PA”. He has been working at the University of Trieste since 1998, currently at the Information Technology Services for Knowledge Transfer, and has been IR/CRIS IT Manager of the Institutional Repositories for more than 10 years. He participated in the pilot project for harvesting PhD theses of National Libraries and developed the DSpace plugin for the NBN:IT project. He has been actively contributing to the activities of the DSpace community with talks at various conferences (OpenRepositories, OAI, euroCRIS). His research interests focus on Open Archives and the DSpace-CRIS/GLAM infrastructures. He follows topics related to Open Science with great interest, is a member of IOSSG, and was a member of the DSpace Leadership Group in 2018.

The Reproducibility and Reusability Platform

Authors:

R. Barbera, R. Bruno, M. Fargetta, R. Rotondo, A. Anagnostou, S. J. E. Taylor

Poster description:

For Open Science to become common practice, its enabling technologies must prove to be useful and easy to use. Building and executing software on distributed computing infrastructures (DCIs), with input data related to Open Access publications and coming from FAIR repositories, should hence be as easy as surfing the web.

The Reproducibility and Reusability Platform (RRP) addresses precisely this issue. It consists of standards-based components: (i) the FutureGateway Framework for Science Gateways, (ii) the INFN Open Access Repository (OAR), and (iii) the Science Software on Demand (SSOD) service.

About the presenter:

Riccardo Bruno, born in Catania, received his Master’s degree in Computer Science in 1999. He works at the Italian Institute of Nuclear Physics (INFN) on distributed computing infrastructures (Grid, Cloud and HPC). He also developed the FutureGateway, a software framework that eases the creation of Science Gateways.

Shared repositories: Building a multi-tenancy repository service at the British Library

Authors:

Sara Gould, Rachael Kotarski, Ellen Ramsey

Poster description:

The British Library launched a Shared Research Repository infrastructure for 6 partner heritage organisations with active research bases, classed as Independent Research Organisations. To meet expectations of Open Access and data sharing, we put expertise into building a multi-tenant research repository, developed by Ubiquity Press, shared across IRO partners. Pathways to long-term preservation and community codebase contributions are ongoing priorities.

The University of Virginia and Ubiquity Press are our partners in the Advancing Hyku project, funded by Arcadia. We’re all partnering to further work with the repository community, representing wider needs within the Advancing Hyku project.

About the presenter:

Rachael Kotarski is the Head of Research Infrastructure Services at the British Library. After a brief stint developing data and image services with an Open Access publisher, Rachael joined the British Library and has been working on developing data-focussed services for 10 years. The bulk of this time has been delivering DataCite to UK organisations and building the UK community around data citation.

A widely deployable and OpenAIRE-compatible DSpace usage data collector for LA Referencia

Authors:

Lautaro Matas (LA Referencia), César Olivares (CONCYTEC – Perú), Washington Segundo (IBICT – Brazil), Vanderlino Neto (CNEN – Brazil), Rino Vargas (CONCYTEC – Perú) and Guilhermo Murilo (LA Referencia)

Poster description:

The poster summarizes the implementation of a lightweight, easy-to-deploy, read-only alternative for a DSpace usage data collector compatible with Matomo and the OpenAIRE usage statistics infrastructure. It sends usage data from individual repositories to an external regional aggregator by issuing read-only queries to the out-of-the-box DSpace Solr statistics subsystem.
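
As a rough illustration of this read-only approach, the following Python sketch builds a Solr query URL for recent view events and maps one result document to Matomo tracking parameters. It is not the LA Referencia collector. The Solr field names (`statistics_type`, `isBot`, `time`, `id`, `userAgent`), the item URL pattern and both function names are assumptions that should be checked against the actual DSpace statistics schema and the Matomo Tracking HTTP API before use.

```python
# Hypothetical sketch of a read-only usage collector; names and fields are assumptions.
from urllib.parse import urlencode


def build_solr_query_url(solr_base, since_iso, rows=100):
    """Build a read-only select URL for non-bot view events newer than `since_iso`."""
    params = {
        "q": "*:*",
        "fq": [
            "statistics_type:view",          # assumed field: only view events
            "isBot:false",                   # assumed field: skip known crawlers
            f"time:[{since_iso} TO NOW]",    # assumed field: event timestamp
        ],
        "sort": "time asc",
        "rows": rows,
        "wt": "json",
    }
    # doseq=True expands the list of filter queries into repeated fq parameters.
    return f"{solr_base.rstrip('/')}/select?{urlencode(params, doseq=True)}"


def to_matomo_params(doc, site_id, repository_url):
    """Map one Solr statistics document to parameters for a Matomo tracking request."""
    return {
        "idsite": site_id,
        "rec": 1,
        # Illustrative URL pattern; a real repository may expose items differently.
        "url": f"{repository_url.rstrip('/')}/items/{doc['id']}",
        "cdt": doc["time"],                  # original event time
        "ua": doc.get("userAgent", ""),
    }
```

Because the collector only reads from Solr and never writes to the repository, a sketch like this can run as a separate scheduled job without changes to the DSpace installation itself.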

The success of this kind of service depends on installing a collector component in every repository, so one of the main requirements was to provide a user-friendly, non-invasive and reliable deployment process for repository managers. This development is part of LA Referencia’s tasks in the OpenAIRE Advance project, which aims to build a pilot on usage data exchange between Latin American and European open science infrastructures.

About the presenters:

Lautaro Matas (LA Referencia), César Olivares (CONCYTEC – Perú), Washington Segundo (IBICT – Brazil), Vanderlino Neto (CNEN – Brazil), Rino Vargas (CONCYTEC – Perú) and Guilhermo Murilo (LA Referencia)