Online seminar PIDs for Software

PIDs for software can make scientific work processes more transparent, networked and efficient (Recap)

As digitalisation in research and teaching progresses, the number of software solutions at scientific institutions is increasing significantly. Reliable provision and use of this software is essential for making research results comprehensible and reusable. Recognising software as a scientific product is crucial for making its importance for research and innovation visible and for promoting its sustainable use and citation. Persistent identifiers (PIDs) play a central role in this: they identify software unambiguously and make it easier to find, access, integrate and reuse, in line with the FAIR principles.

The “PID Network Germany” project hosted the online seminar “PIDs for Software” on June 30, 2025. We highlighted various aspects relating to the documentation and use of software. In addition to an introduction to national and international initiatives for dealing with research software, the focus was also on practical challenges and possible solutions.

The event began with an introduction to the topic of “research software” and the importance of PIDs in the context of software. Alexander Struck from HU Berlin outlined the relevance of the topic for the national and international research landscape.

The HERMES publication system, presented by Stephan Druskat (German Aerospace Center), was developed to facilitate the publication, documentation and visibility of research software. It offers researchers a platform to systematically present their software projects, add metadata and make them permanently accessible. The system also supports the integration of PIDs so that the software can be referenced unambiguously and cited more easily.

Morane Gruenpeter (Software Heritage) presented the SoftWare Hash IDentifier (SWHID), which enables a permanent and unchangeable reference over the entire life cycle of software. This allows software versions, source code and development processes to be reliably linked with one another in order to ensure the traceability and reproducibility of scientific work. In addition, Paul Vierkant shed light on the perspective of the DOI registration agency DataCite, and Esther Scheven (DNB) presented the extent to which software products can be mapped in the Integrated Authority File (GND).
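
To illustrate why a SWHID is "unchangeable", here is a minimal Python sketch, not taken from the presentation. It assumes the `swh:1:cnt:` form of the identifier, which refers to the content of a single file and uses the same SHA-1 hash that Git assigns to a blob. The function name `swhid_for_content` is illustrative; other object types (directories, revisions, releases, snapshots) are computed differently and are not covered here.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Return the SWHID of a single file's content (object type 'cnt')."""
    # The hash is intrinsic: it is derived only from the bytes themselves.
    # Git-compatible blob hashing prepends the header "blob <length>\0".
    header = b"blob %d\x00" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


if __name__ == "__main__":
    # Any change to the content yields a different identifier.
    print(swhid_for_content(b"print('hello')\n"))
    print(swhid_for_content(b"print('hello!')\n"))
```

Because the identifier is computed from the content rather than assigned by a registry, anyone holding the file can recompute it and verify that it matches the reference.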

It became clear in all the presentations that referencing software using PIDs can make scientific work processes more transparent, networked and efficient.

Group exchange: community needs and challenges

During the event, participants also had the opportunity to exchange ideas in three groups. In this interactive part, they answered and discussed three questions. The following is a brief summary.

Research software faces significant hurdles in reproducibility, long-term usability, and recognition. Key challenges include:

  • Maturity & Policy: Software is often developed quickly for specific projects with limited documentation, standards, and a lack of institutional incentives for proper citation or publication.
  • Publication & Sharing: Inconsistent workflows, missing identifiers & metadata, and dependency issues complicate software publication and reuse.
  • Resource Constraints: Limited time & resources hinder the implementation of good software development and publication practices.
  • Technical Barriers: Lack of expertise, compatibility issues, and difficulties with hosting/sharing code (e.g., using platforms like GitHub) contribute to problems.
  • Data & Documentation: Decisions about integrating datasets and a consistent lack of time for thorough documentation further complicate usability and reproducibility.

Workshop participants identified several challenges to adopting PIDs for research software. Key concerns center around workflow integration, metadata management, and long-term accessibility.

  • Workflow & Tooling: Integrating PIDs into existing workflows is complex, particularly with manual publishing processes. Participants highlighted the need for convenient tools that fit specific needs.
  • Metadata Synchronization: A “chicken-egg problem” exists regarding syncing PIDs with comprehensive metadata, requiring careful planning of who creates and manages it. The choice of metadata format also impacts usability.
  • Version Control & DOIs: Existing version control platforms (like GitLab) offer stable URLs, lessening the perceived benefit of additional PID mechanisms. Decisions around version-specific vs. latest-version DOIs require consideration.
  • Accessibility & Ownership: Ensuring long-term software accessibility is a concern, especially when developers and publishers differ, or software is embedded within larger publications.
  • Emerging Issues: Participants raised concerns about PID referencing for AI-generated software due to copyright and valuation questions.

Participants identified a comprehensive set of documentation elements crucial for ensuring software reusability. Key areas include:

  • Licensing & Restrictions: Clear licensing information (adhering to the REUSE standards, reuse.software) and a defined scope outlining what the software can and cannot do.
  • Technical Requirements: Detailed specifications for the software stack, operating systems, dependencies (with version numbers), and necessary resources.
  • Usage & Execution: Comprehensive guides for installation, execution (including build/compile instructions), and minimal usage examples with test data.
  • Context & Purpose: A clear description of the software’s scope, purpose, the problems it solves, and its FAIR indicators.
  • Contribution & Support: Information on how to contribute, mechanisms for feedback, and contact information for authors/contributors.
  • Dependencies & Citation: Clear citation information and details on all software dependencies.
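
As a rough illustration of how these elements could be captured in machine-readable form, the following Python sketch builds a metadata record and checks it against the six areas listed above. It assumes CodeMeta 2.0 property names (for example `softwareRequirements`, `buildInstructions`, `issueTracker`); the workshop did not prescribe a format. The project name, URLs, DOI and the mapping from areas to properties are hypothetical placeholders.

```python
import json

# Hypothetical record; every value is a placeholder, not a real project.
record = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "example-tool",
    "description": "Converts instrument logs to CSV; does not validate them.",
    "license": "https://spdx.org/licenses/MIT",
    "operatingSystem": ["Linux", "macOS"],
    "softwareRequirements": ["python >= 3.10", "numpy == 1.26.4"],
    "buildInstructions": "https://example.org/example-tool/INSTALL.md",
    "readme": "https://example.org/example-tool/README.md",
    "issueTracker": "https://example.org/example-tool/issues",
    "identifier": "https://doi.org/10.5555/example",
    "author": [
        {"@type": "Person", "givenName": "Ada", "familyName": "Example",
         "email": "ada@example.org"}
    ],
}

# Assumed mapping from the documentation areas to metadata properties.
AREAS = {
    "Licensing & Restrictions": ["license"],
    "Technical Requirements": ["operatingSystem", "softwareRequirements"],
    "Usage & Execution": ["buildInstructions", "readme"],
    "Context & Purpose": ["description"],
    "Contribution & Support": ["author", "issueTracker"],
    "Dependencies & Citation": ["softwareRequirements", "identifier"],
}


def missing_areas(metadata: dict) -> dict:
    """Map each documentation area to the properties that are absent or empty."""
    gaps = {}
    for area, keys in AREAS.items():
        absent = [key for key in keys if not metadata.get(key)]
        if absent:
            gaps[area] = absent
    return gaps


if __name__ == "__main__":
    print(json.dumps(record, indent=2))
    print("Missing:", missing_areas(record))
```

A check like this could run before each release so that gaps in the documentation are noticed before a PID is registered for the new version.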

Considering the identified challenges in reproducibility, usability, and long-term viability, participants advocate for more user-focused solutions and enhancements to clearly demonstrate the value of PID adoption within existing research software workflows. They emphasized that thorough documentation – essential for users, funders, and contributors – is a cornerstone of this effort, ensuring software remains reusable and impactful over time.

At the end of the event, a Zoom survey posed additional questions to help the PID Network Germany project record, in a structured manner, the requirements and conditions for using PIDs for software. The results underline the findings from the group work. More than 70% of participants stated that they find guidelines or policies for the use of research software very helpful. These respondents also agreed that binding requirements (e.g. from third-party funders) could support PID use and implementation. In line with the earlier discussions, most participants cited a lack of awareness or training and the challenge of integrating PIDs into existing workflows as the biggest hurdles. Participants also pointed out that sustainable funding for the implementation and maintenance of PIDs must be ensured.

We would like to thank all participants and speakers for the exchange.

Program

Time | Topic | Slides | Speaker
13:00 | Welcome | Accompanying slides | Steffi Genderjahn, Helmholtz Open Science Office
13:10 | Research Software and PIDs for Discovery | doi.org/10.5281/zenodo.15773857 | Alexander Struck, HU Berlin
13:25 | Automating the creation of persistently identifiable software publications with HERMES | doi.org/10.5281/zenodo.15646509 | Stephan Druskat, DLR
13:40 | SoftWare Hash IDentifier (SWHID) and use cases | doi.org/10.5281/zenodo.15750889 | Morane Gruenpeter, Software Heritage
13:55 | Interactive Part | Miro-Board (summary in the text) |
14:20 | Break | |
14:30 | Software in the Integrated Authority File (GND) | https://doi.org/10.5281/zenodo.15828850 | Esther Scheven, German National Library (DNB)
14:45 | Software publications from a DataCite perspective | https://doi.org/10.5281/zenodo.15772977 | Paul Vierkant, DataCite
from 15:00 | Open for discussion | Survey results |

Event-DOI: https://doi.org/10.25798/d4gz-gc97