Project Overview
AI for Code applications such as AI-assisted programming, code summarization, and defect detection are on the rise. These applications help software developers improve productivity and make programming accessible to domain experts, who are increasingly writing code for scientific applications. As more machine learning models are used for such tasks, there is a need for better-annotated datasets that capture complex characteristics of code beyond surface-level features. The field of Natural Language Processing has numerous datasets of this kind, each focusing on specific linguistic features, and they have led to significant advancements. Our aim is to create meaningful labels for source code characteristics and to automatically annotate source code with those labels. The annotated data can be used to train machine learning models, as well as to probe pretrained models to determine whether they have learned human-recognizable characteristics or only spurious correlations.
Team Members
Maxwell Sutcliffe
Client Communication Coordinator
Senior in Software Engineering.
Robby Rice
Digital Content Coordinator
Senior in Software Engineering.
Gavin Canfield
Quality Control
Senior in Computer Engineering.
Tanner Dunn
Agile Framework Organizer
Senior in Software Engineering.
Amon McAllister
Individual Component Design
Senior in Software Engineering.
Weekly Reports
Report 1
Report 2
Report 3
Report 4
Report 5
Report 6
Report 7
Report 8
Design Documents
User Needs
Requirements
Project Plan
Design Context and Exploration
Proposed Design
Testing
Final Planning Document
S E 492 Final Document
Final Slides
491 Final Slides
492 Final Slides