15 May 2024 1 min read Publication

Exploring Pre-trained Models in Software Engineering

As AI models become integral to software, understanding their lifecycle is crucial. Our PeaTMOSS study examines how pre-trained models spread and evolve in open-source projects, highlighting challenges in reproducibility, maintenance, and trust.

Cubist-inspired artwork to accompany this posting. Generated by DALL-E.

In our recent article, "PeaTMOSS: A Dataset for Investigating the Supply Chain of Pre-trained Models in Open-Source Software," we examine how deep learning models are integrated into software development. The growing reliance on pre-trained models (PTMs) raises important questions about their trustworthiness, maintenance, and evolution within software projects. By curating and analyzing a dataset that tracks PTMs in open-source repositories, we provide insights into how these models propagate, how they are updated, and what challenges arise in their long-term use. Our goal is to help researchers and practitioners better understand the software supply chain dynamics of AI-driven components.

Since its publication, the article has drawn interest from both the software engineering and machine learning communities, particularly those concerned with reproducibility, licensing, and security. As PTMs become more prevalent, ensuring their responsible use in software projects will require collaboration between AI and software engineering researchers. We hope this work contributes to ongoing discussions about the sustainability of AI models in production environments. If you're curious, the full paper is cited below.

Wenxin Jiang, Jerin Yasmin, Jason Jones, Nicholas Synovic, Jiashen Kuo, Nathaniel Bielanski, Yuan Tian, George K. Thiruvathukal, and James C. Davis, Challenges and practices of deep learning model reengineering: A case study on computer vision. Empirical Software Engineering 29, 142 (2024). https://doi.org/10.1007/s10664-024-10521-0
