Resources
Which Refactoring Reduces Bug Rate?
Which Refactoring Reduces Bug Rate? - repository
The Corrective Commit Probability Code Quality Metric
The Corrective Commit Probability Code Quality Metric - repository
Commit Classification using Natural Language Processing: Experiments over Labeled Datasets
Just R, Jalali D, Ernst MD (2014) Defects4j: A database of existing faults to enable controlled testing studies for java programs. In: Proc. of the 2014 Int. Symposium on Softw. Testing and Analysis (ISSTA), ACM repository
Gyimesi P, Vancsics B, Stocco A, Mazinanian D, Beszedes A, Ferenc R, Mesbah A (2019) Bugsjs: a benchmark of javascript bugs. In: 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), pp 90–101
Le Goues C, Holtschulte N, Smith EK, Brun Y, Devanbu P, Forrest S, Weimer W (2015) The manybugs and introclass benchmarks for automated repair of c programs. IEEE Transactions on Software Engineering 41(12):1236–1256
Saha R, Lyu Y, Lam W, Yoshida H, Prasad M (2018) Bugs.jar: A large-scale, diverse dataset of real-world java bugs. In: 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR), pp 10–13
Herbold S, Trautsch A, Trautsch F, Ledel B (2019) Issues with szz: An empirical assessment of the state of practice of defect prediction data collection. arXiv:1911.08938
https://github.com/rpau/git-commit-classifier
Enabling the Continuous Analysis of Security Vulnerabilities with VulData7 repository
Mills C, Parra E, Pantiuchina J, Bavota G, Haiduc S (2020) On the relationship between bug reports and queries for text retrieval-based bug localization. Empirical Software Engineering pp 1–42
Wang Q, Parnin C, Orso A (2015) Evaluating the usefulness of ir-based fault localization techniques. In: Proceedings of the 2015 International Symposium on Software Testing and Analysis, Association for Computing Machinery, New York, NY, USA, ISSTA 2015, p 1–11, DOI 10.1145/2771783.2771797, URL https://doi.org/10.1145/2771783.2771797
Ye X, Bunescu R, Liu C (2015) Mapping bug reports to relevant files: A ranking model, a fine-grained benchmark, and feature evaluation. IEEE Transactions on Software Engineering 42(4):379–402
Chaparro O, Marcus A (2016) On the reduction of verbose queries in text retrieval based software maintenance. In: 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C), IEEE, pp 716–718
S. Levin and A. Yehudai. Boosting automatic commit classification into maintenance activities by utilizing source code changes. In Proceedings of the 13th International Conference on Predictive Models and Data Analytics in Software Engineering, PROMISE, pages 97–106, New York, NY, USA, 2017. ACM
B. Ray, D. Posnett, V. Filkov, and P. Devanbu. A large scale study of programming languages and code quality in github. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014, pages 155–165, New York, NY, USA, 2014. ACM.
N. C. Shrikanth and T. Menzies. Assessing practitioner beliefs about software defect prediction. In Intl. Conf. Softw. Eng., number 42, May 2020.
A. Hindle, D. M. German, M. W. Godfrey, and R. C. Holt. Automatic classification of large changes into maintenance categories. In 2009 IEEE 17th International Conference on Program Comprehension, pages 30–39, May 2009
J. J. Amor, G. Robles, J. M. Gonzalez-Barahona, and A. Navarro. Discriminating development activities in versioning systems: A case study, Jan 2006.
G. Antoniol, K. Ayari, M. Di Penta, F. Khomh, and Y.-G. Guéhéneuc. Is it a bug or an enhancement? a text-based approach to classify change requests. In Proceedings of the 2008 Conference of the Center for Advanced Studies on Collaborative Research: Meeting of Minds, CASCON ’08, New York, NY, USA, 2008. Association for Computing Machinery
On the Relationship between Refactoring Actions and Bugs: A Differentiated Replication replication package
How Often Do Single-Statement Bugs Occur? The ManySStuBs4J Dataset dataset
BugsInPy: A Database of Existing Bugs in Python Programs to Enable Controlled Testing and Debugging Studies repo
Large-Scale Manual Validation of Bug Fixing Commits: A Fine-grained Analysis of Tangling
Eman Abdullah AlOmar, Mohamed Wiem Mkaouer, and Ali Ouni, "Can refactoring be self-affirmed? an exploratory study on how developers document their refactoring activities in commit messages", the 3rd International Workshop on Refactoring (IWoR'2019). dataset
Early Life Cycle Software Defect Prediction. Why? How? dataset
The Technical Debt Dataset dataset
An Automatically Created Novel Bug Dataset and its Validation in Bug Prediction dataset
Mining and Managing Big Data Refactoring for Design Improvement: Are We There Yet?
Mining Software Repositories with a Collaborative Heuristic Repository dataset
Exploring the communication functions of comments during bug fixing in Open Source Software projects
What Causes Wrong Sentiment Classifications of Game Reviews? repo
Bug or not bug? That is the question
Classifying Code Commits with Convolutional Neural Networks repo
Improve Classification of Commits Maintenance Activities with Quantitative Changes in Source Code
An empirical study on the use of SZZ for identifying inducing changes of non-functional bugs repo
PySStuBs: Characterizing Single-Statement Bugs in Popular Open-Source Python Projects repo
Denchmark: A Bug Benchmark of Deep Learning-related Software repo
Recommending Bug-fixing Comments from Issue Tracking Discussions in Support of Bug Repair
Enhancing Source Code Refactoring Detection with Explanations from Commit Messages repo
Bug Severity Prediction using Keywords in Imbalanced Learning Environment
Where should the bugs be fixed? repo
[ANDROR2: A Dataset of Manually-Reproduced Bug Reports for Android apps](https://arxiv.org/pdf/2106.08403.pdf) repo
What makes a good Node.js package? Investigating Users, Contributors, and Runnability repo
CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software repo
MSR Mining Challenge: The SmartSHARK Repository Mining Data repo
CrossVul: A Cross-Language Vulnerability Dataset with Commit Data repo
PYREF: Refactoring Detection in Python Projects repo
On the differences between quality increasing and other changes in open source Java projects repo
Comparing Commit Messages and Source Code Metrics for the Prediction Refactoring Activities repo
Quick remedy commits and their impact on mining software repositories repo
What Makes a Good Commit Message? repo
Defect Identification, Categorization, and Repair: Better Together repo
BugBuilder: An Automated Approach to Building Bug Repository BugBuilder GrowingBugRepository
TSSB-3M: Mining single statement bugs at massive scale repo
What really changes when developers intend to improve their source code: a commit-level study of static metric value and static analysis warning changes dataset
Defectors: A Large, Diverse Python Dataset for Defect Prediction dataset
An Empirical Study on Bugs Inside PyTorch: A Replication Study dataset
Commit Message Matters: Investigating Impact and Evolution of Commit Message Quality dataset
Shedding Light on Software Engineering-specific Metaphors and Idioms dataset
BugBuilder: An Automated Approach to Building Bug Repository dataset
gitmoji - structured and colorful commit classification using emojis
An Analysis of Improving Bug Fixing in Software Development
Known Vulnerabilities of Open Source Projects: Where Are the Fixes?
[Review of Open Software Bug Datasets](https://link.springer.com/chapter/10.1007/978-3-031-45648-0_1)
On Refining the SZZ Algorithm with Bug Discussion Data dataset
Multilabel classification for defect prediction in software engineering dataset