Record Details

Replication Materials for: Engineering Formality and Software Risk in Debian Python Packages

Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)

Field Value
 
Title Replication Materials for: Engineering Formality and Software Risk in Debian Python Packages
 
Identifier https://doi.org/10.7910/DVN/WENTBH
 
Creator Gaughan, Matthew
Champion, Kaylea
Hwang, Sohyeon
 
Publisher Harvard Dataverse
 
Description

These materials were produced as part of:



Gaughan, Matthew, Champion, Kaylea, and Hwang, Sohyeon. (2024) "Engineering Formality and Software Risk in Debian Python Packages." 31st IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2024).



They also include data initially produced in:



Champion, Kaylea; Hill, Benjamin Mako, 2021, "Replication data and online supplement for: Underproduction: An Approach for Measuring Risk in Open Source Software", https://doi.org/10.7910/DVN/PUCD2P, Harvard Dataverse, V2



In this archive, you'll find:




  • inst_all_packages_full_results.tab: Summary data for all Debian packages as they appear in Champion and Hill (2021).

  • mmt_data_final.csv: Summary data for all Python-language Debian packages as they appear in the paper. This data set is novel and includes package age in days, GitHub milestone usage, two different calculations of mean membership type (MMT), and package name.

  • calculatePower.R: R code to reproduce the linear regression and power analysis methods as they appear in the paper.
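
As a quick way to inspect the two data files before running the R script, the following is a minimal Python sketch, not part of the archive. It assumes the .tab file is tab-separated and the .csv file is comma-separated, and that both start with a header row; the function name summarize_table is hypothetical. It reports whatever column names it finds rather than assuming any.

```python
import csv
from pathlib import Path


def summarize_table(path):
    """Return (column_names, row_count) for a delimited text file.

    Assumption: files ending in .tab are tab-separated and all other
    files are comma-separated, with the first line holding column names.
    """
    path = Path(path)
    delimiter = "\t" if path.suffix == ".tab" else ","
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])          # first row is treated as the header
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count


if __name__ == "__main__":
    # File names are taken from the archive listing; they must be
    # downloaded into the current directory for this to find them.
    for name in ("inst_all_packages_full_results.tab", "mmt_data_final.csv"):
        if Path(name).exists():
            columns, rows = summarize_table(name)
            print(f"{name}: {rows} rows, columns = {columns}")
        else:
            print(f"{name}: not found in the current directory")
```

Only the standard library is used, so the sketch runs without installing anything; the analysis itself is reproduced by calculatePower.R.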



For more information, please contact:


Matt Gaughan (he/him)


gaughan@u.northwestern.edu



Abstract:



While Free/Libre and Open Source Software (FLOSS) is critical to global computing infrastructure, the maintenance of widely-adopted FLOSS packages is dependent on volunteer developers who select their own tasks. Risk of failure due to the misalignment of engineering supply and demand --- known as underproduction --- has led to code base decay and subsequent cybersecurity incidents such as the Heartbleed and Log4Shell vulnerabilities. FLOSS projects are self-organizing but can often expand into larger, more formal efforts. Although some prior work suggests that becoming a more formal organization decreases project risk, other work suggests that formalization may in fact increase the likelihood of project abandonment. We evaluate the relationship between underproduction and formality, focusing on formal structure, developer responsibility, and work processes management. We analyze 182 GNU/Linux packages made available via the Debian distribution and find that although more formal structures are associated with higher risk of underproduction, more elevated developer responsibility is associated with less underproduction while the relationship between formal work process management and underproduction is not statistically significant. Our analysis suggests that a FLOSS organization's transformation into a more formal structure may face unintended consequences which must be carefully managed.


 
Subject Computer and Information Science
Social Sciences
 
Date 2024-01-10
 
Contributor Gaughan, Matthew