The Role of Generative AI in Software Development Productivity: A Pilot Case Study
ABSTRACT
With software development increasingly reliant on innovative technologies, there is a growing interest in exploring the potential of generative AI tools to streamline processes and enhance productivity. In this scenario, this paper investigates the integration of generative AI tools within software development, focusing on understanding their uses, benefits, and challenges to software professionals, in particular, looking at aspects of productivity. Through a pilot case study involving software practitioners working in different roles, we gathered valuable experiences on the integration of generative AI tools into their daily work routines. Our findings reveal a generally positive perception of these tools in individual productivity while also highlighting the need to address identified limitations. Overall, our research sets the stage for further exploration into the evolving landscape of software development practices with the integration of generative AI tools.

CCS CONCEPTS
• Software and its engineering → Programming teams.

KEYWORDS
software engineering, generative AI, LLMs, productivity

ACM Reference Format:
Mariana Coutinho, Lorena Marques, Anderson Santos, Marcio Dahia, Cesar França, and Ronnie de Souza Santos. 2024. The Role of Generative AI in Software Development Productivity: A Pilot Case Study. In Proceedings of the 1st ACM International Conference on AI-Powered Software (AIware ’24), July 15–16, 2024, Porto de Galinhas, Brazil. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3664646.3664773

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
AIware ’24, July 15–16, 2024, Porto de Galinhas, Brazil
© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 979-8-4007-0685-1/24/07
https://doi.org/10.1145/3664646.3664773

1 INTRODUCTION
Over the last decade, software engineering research has explored various aspects of teamwork in software development. Such investigations involved understanding how software engineers work together, how they collaborate to tackle problems, how they share information to complete tasks, their choices in adopting tools, and the obstacles they encounter in their activities [12, 25, 32]. Understanding these aspects is crucial for improving individual and team performance and ultimately achieving success in software processes. Despite these research efforts, understanding productivity in software development remains challenging due to its multifaceted nature [12, 17]. Productivity is influenced by technical, social, and psychological factors, making it complex to fully grasp. Additionally, subjective metrics, diverse tasks, and evolving team dynamics add further layers of complexity [11, 26]. Therefore, despite progress in understanding teamwork in software engineering, unlocking the essence of productivity in software teams remains difficult [9, 11, 29].
Currently, the introduction of generative AI has elevated investigations into productivity in software development to a new level as discussions shift towards the incorporation of generative AI-based tools to enhance productivity, increase work efficiency, reduce errors in software tasks, and accelerate software production [8, 21, 22]. Notably, tools like GitHub Copilot have emerged as promising coding aids to improve code writing time. However, despite the general belief in the potential benefits that generative AI brings to software development, particularly regarding productivity gains, empirical evidence remains scarce, especially when considering the context of complex real-world projects, such as industrial settings [20, 24].
The lack of empirical evidence regarding the effectiveness of generative AI in complex real-world software development projects has motivated the present research. Hence, our goal was to explore how generative AI tools might impact the productivity of software professionals working in different roles and activities, including those focused on the delivery of software (e.g., coding and testing), supporting activities (e.g., management and IT infrastructure) and related tasks (e.g., data science). To this end, we investigated this
phenomenon in a large software company. Specifically, the following research question guided this study: How does the integration of generative AI tools influence the work of software professionals across different roles and activities? We are particularly interested in investigating the relationship between the usage of these tools and aspects associated with productivity.
The remainder of this paper is organized as follows. In Section 2, we present a literature review on productivity and generative AI in software development. In Section 3, we present the case under study, our data collection, and our data analysis strategy. In Section 4, we introduce the key insights derived from the case, which are discussed in Section 5. Finally, Section 6 discusses the limitations of our study, and Section 7 outlines our conclusions and plans for future work.

2 BACKGROUND
In this section, we explore previous research on productivity and generative AI in software development, specifically addressing the challenges associated with measuring productivity and integrating generative AI tools into software development tasks.

2.1 Challenges in Productivity Analysis
Understanding and measuring productivity among software developers has long been a challenge due to the absence of universally accepted metrics. The complex nature of development tasks, coupled with multifaceted dimensions of productivity, poses significant hurdles in devising accurate metrics. Factors such as individual work styles, team dynamics, task complexities, and subjective perceptions contribute to the intricate web of challenges inherent in measuring developer productivity [9, 11].
Diverse industry discussions commonly refer to productivity as the relationship between output and input, but diverse fields adopt varying notions and measurement units [29]. However, looking specifically at software engineering, these conventional metrics, such as considering lines of code as the relationship between input and output, inadequately capture the essence of software development since this work thrives on collaborative efforts stemming from a diverse group of individuals, each contributing unique expertise [9].
Software developers operate within the domain of knowledge workers, where productivity lies in the exchange of ideas, knowledge, and skills, where collaboration becomes the cornerstone, fostering innovation and comprehensive solutions [9, 28]. These encompass individual values, professional objectives, and the standards established by the organization. Moreover, their work revolves around activities that demand creativity as part of their responsibilities [14].
Historically, the evolution of productivity measures and discussions in software development reflects a quest for a comprehensive definition that considers multifaceted aspects influencing project outcomes [1]. Over the years, diverse methodologies emerged, from quantifying delivered code to embracing Agile principles, and the rapid evolution in technology continues to shape this quest, as changes in software practices and tools directly impact the productivity of software professionals [1, 3, 15, 33]. For instance, generative AI has recently emerged as a prominent topic, influencing the debate on productivity in software development.
Recently, expanding on several research insights, important dimensions of developers’ productivity were highlighted, including [9]:
• Satisfaction and well-being: Focuses on happiness, job satisfaction, work-life balance, feeling valued, and a positive work environment. High levels of satisfaction boost productivity, creativity, and retention.
• Performance: Encompasses delivering high-quality work efficiently and accurately within set timelines. Metrics like lead time for changes, deployment frequency, and change failure rate gauge performance, impacting customer satisfaction and business outcomes.
• Activity: Evaluates the volume and nature of teamwork, such as completed tasks, lines of code, and commits. However, activity metrics shouldn’t be the sole focus as they might not directly correlate with productivity or value delivered.
• Communication and collaboration: Assesses how well team members communicate, share knowledge, and work together towards common goals, vital for successful software development teamwork.
• Efficiency and flow: Efficiency involves optimizing workflows to maximize output by eliminating bottlenecks and unnecessary steps. Flow describes a state of full immersion, fostering creativity, focus, and efficiency among team members.
These dimensions challenge myths and misconceptions surrounding developers’ productivity, demonstrating that only the combination of several metrics allows for a nuanced understanding of productivity in software engineering.

2.2 Productivity and Generative AI
Recent studies have shown that generative AI can enhance productivity in software development primarily through automated code generation [16]. In this context, large language models are trained on several code datasets to produce usable code in response to specific task prompts. While common chat interfaces can generate both code and natural language responses, tools designed specifically for code generation, such as GitHub Copilot, CodeWhisperer, and ChatGPT, have gained popularity among developers [4, 31].
GitHub Copilot, developed collaboratively by GitHub, OpenAI, and Microsoft, is described as an AI pair programmer, providing developers with real-time code suggestions based on the context of comments and existing code, supporting their productivity by minimizing disruptions and increasing focus on several aspects of programming [10]. Additionally, CodeWhisperer is an AI code generator by AWS that offers real-time code suggestions as developers write, anticipating the completion of lines of code or comments, or generating entire functions and code blocks, allowing faster task completion [2]. ChatGPT, developed by OpenAI, serves as a conversational AI and virtual assistant, helping developers in coding, debugging, and learning, ultimately enhancing productivity in software development [23].
Preliminary studies focused on the use of these tools and discussed their potential effects on the productivity of developers. In [24], the authors highlighted the positive correlation between the use of GitHub Copilot and the productivity of developers, with a focus on the automation of repetitive tasks. Following this, [13] expands this scenario and confirms the positive correlations while also recognizing the need for broader productivity metrics beyond coding time to obtain insights on developer satisfaction. Further, [7] also explores the speed gains offered by generative AI-based tools, acknowledging that these gains vary with task complexity and developer experience. Lastly, [36] offers a comprehensive review of AI integration trends in software development, including the increased role of AI in task automation and decision support.
The reports primarily focus on task completion time as a productivity metric, but productivity in software development entails complex factors beyond this [29]. In [20], a more comprehensive investigation details an application of ChatGPT in software development, covering the entire software creation process from requirements to deployment. Though productivity measurement is not the primary focus, the study offers insights into the nuanced challenges of measuring the effects of generative AI on productivity.

3 METHOD
Recognizing the limited empirical evidence from real-world settings regarding the use of generative AI to enhance productivity in software development, we chose a methodology centered on a case study [35]. Software engineering case studies analyze real-life settings, like companies or teams, using a systematic approach to collect and analyze data that can inform industrial practice [27]. In this research, prior to conducting a comprehensive investigation considering the nuanced nature of productivity, we began with a simplified pilot case study [19, 34]. This pilot study aimed to gather insights from developers who were using these tools for the first time as an official working tool. By focusing on this initial experience from developers, we plan to gain a preliminary understanding of how we can explore the nuanced characteristics of productivity in this context. Further details regarding our methodology are outlined in the sections below.

3.1 The Case
The company selected for our case study was established in 1996 and specializes in on-demand software solutions across various sectors like finance, telecommunications, government, manufacturing, services, and utilities. Of its workforce of more than 1,200 professionals, over 70% are directly engaged in software development across 50 distinct teams. These teams comprise individuals from diverse technical backgrounds, including programmers, quality assurance (QA) specialists, and designers, who are proficient in popular software development methodologies such as Scrum, Kanban, and Waterfall and work to develop systems for global clients spanning North America, Latin America, Europe, and Asia. Additionally, the company employs several professionals in related fields like data science and supporting functions such as IT and human resources.
This company provides an excellent setting for our case study as it offers diverse tasks ranging from common coding activities to other crucial stages of the software development process. By exploring the impact of generative AI tools across these tasks, we can gain comprehensive insights. Moreover, given that various contextual factors and backgrounds influence productivity, the company’s involvement in multiple industrial sectors allows us to assess productivity within varied contexts, enhancing the study’s robustness and applicability. Additionally, the company’s interest in experimenting with the use of generative AI tools provides an opportunity to leverage the insights gained towards devising processes and policies for their integration into projects.

3.2 Defining the Pilot Study Scope
Given the diverse array of projects, contexts, backgrounds, and professionals within the company, we chose to refine our research approach by initiating a preliminary investigation within a more focused subset. This entailed selecting a specific group of individuals to engage with generative AI tools and document their experiences. By focusing on this targeted group, we aimed to gather valuable insights that would guide us in the formulation of a broader, more extensive investigation within the case, exploring various productivity facets in software development.
During this initial phase, we obtained 17 licenses for various tools, including ChatGPT Plus, OpenAI API, Midjourney, and GitHub Copilot, for professionals to utilize in their work and provide feedback on these tools. In this pilot study, any interested professional could participate, provided they met three specific criteria: a) they hadn’t regularly used generative AI tools in their work (priority given to those who hadn’t used them at all); b) they received approval from their team manager to integrate the tool into their tasks; c) they committed to reporting their experiences, including both positive and negative impacts on their perceived productivity.
After extending invitations to participate via the company’s communication channels, we selected 14 volunteers who were either actively engaged in software development or working in supporting roles to join the pilot case study. These participants were chosen based on the aforementioned criteria, with diversity across backgrounds, experience levels, and project types considered to ensure broad representation. This approach aimed to capture a spectrum of perspectives and experiences, thereby enhancing the insights derived from the pilot study.

3.3 Data Collection
In line with the case study methodology, we employed various data collection methods to explore the case, including questionnaires with open-ended questions [18] and observations [30]. The primary data collection technique utilized was the questionnaire, which directly gathered insights from the professionals who volunteered to use the AI tools. Unlike traditional case studies that often rely on interviews, we opted for a questionnaire-based approach in this pilot study. This decision was made because the use of the tools was voluntary and not consistently integrated into the volunteers’ daily work life or projects. We aimed to minimize interference with their work dynamics or team activities. Additional data were obtained from observing the company’s communication channels, such as Slack, both to identify potential study volunteers
AI, software professionals were able to save time while completing their tasks. In particular, this time saving was apparent through the support for various activities that involve writing artifacts, such as reports. The tool’s ability to generate coherent and relevant content and provide valuable insights and suggestions significantly supported participants’ writing activities, enabling them to produce these types of artifacts with greater ease. Furthermore, participants highlighted the AI tools’ versatility in supporting a wide range of software tasks as a visible benefit. The effectiveness of providing timely and relevant support across these diverse tasks was demonstrated to be a valuable advantage of the tools.

4.3 Generative AI Tools: Challenges
The participants reported challenges associated with utilizing generative AI in their software development, with reliability and refinement emerging as the most recurring issues. Software professionals described encountering difficulties in ensuring the reliability and functionality of generated responses, especially when attempting to use multiple questions simultaneously. They also faced challenges in crafting precise prompts to obtain objective and accurate responses, along with concerns about the absence of sources to reinforce the reliability of results. Additionally, participants highlighted the need for refinement and fine-tuning of the generated results to achieve optimal usage. Despite the AI’s capability to produce responses, participants found that the outputs often required manual adjustments and polishing before they could be effectively incorporated into their work. Finally, security measures emerged as a potential challenge, with at least two participants noting their inability to use the support of the tools with sensitive project data due to security constraints.

4.4 Generative AI Tools: Effects of Perceived Productivity
Aligned with the previous findings, participants reported a positive effect of generative AI tools on their perceived productivity. They highlighted how these tools facilitated efficiency gains in various aspects of their software development activities, thus primarily relating the optimization of time with productivity gains. One notable characteristic of this optimization was the consolidation of several individual tools into a single tool to perform various activities. Therefore, despite encountering challenges such as reliability concerns or limited outcomes, with the exception of one individual, software professionals mostly reported a positive impact on their perceived productivity.
On an additional note, software professionals related their increased productivity with the value that the AI tools incorporated into their work, especially in facilitating the creation of relevant and insightful content, whether reports, code, or design models. More specifically, the tools supported their productivity by contributing to learning and knowledge acquisition by providing quick access to information because, despite needing external verification, using these tools is much more productive than seeking information through queries in search engines.

5 DISCUSSIONS
We focused our discussions on the nature of software developers’ productivity presented in the literature, particularly emphasizing that software developers operate within the domain of knowledge workers, where productivity lies in the exchange of ideas, knowledge, and skills. In this context, we compared how the positive impact of using generative AI tools reported by participants aligns with several dimensions of productivity, namely, satisfaction and well-being, performance, activity, communication and collaboration, and efficiency and flow.
Primarily, participants emphasized how these tools improve efficiency and flow, with gains being observed in various aspects of their software development activities. By optimizing time and consolidating multiple tools into a streamlined workflow, participants maximized the efficiency of their outputs with less effort. Moreover, the reported positive impact on productivity suggests that participants experienced an improvement in their performance as they were able to create relevant and insightful content, such as reports, code, or design models.
Additionally, while not explicitly mentioned by the participants, we understand that the use of generative AI tools can indirectly impact communication and collaboration within software development teams. By providing quick access to information and facilitating knowledge acquisition, these tools can enhance communication and collaboration by enabling team members to share insights and align their understanding toward common goals.

5.1 Implications
Our study has implications for research, as our pilot case study takes on a real-world perspective to explore the evolving landscape of software development practices with the integration of generative AI tools. As participants expressed positive experiences despite encountering challenges, they suggested that the benefits of these tools outweighed the drawbacks. Even though our findings are preliminary, they underscore the need for further investigation into the efficacy and impact of these tools across various dimensions of productivity. In particular, we highlight the need for research focused on refining these tools to address reliability concerns and expand their capabilities, particularly around the dimensions of productivity not identified in this study, e.g., activity and satisfaction.
Additionally, our study has implications for industrial practice, as it sheds light on the practical benefits of integrating generative AI tools into software development workflows, considering different professional roles, including programming, testing, and design. The positive experiences reported by participants indicate the potential of these tools to enhance productivity. Therefore, by addressing challenges and leveraging the advantages offered by generative AI, software companies can potentially optimize their development processes. Moreover, considering the challenges reported by practitioners regarding reliability concerns and usage difficulties, our findings suggest the importance of providing adequate training and support to facilitate the effective adoption of these tools within development teams, ensuring that they maximize their potential benefits while minimizing any associated risks.
Supporting Ideation Processes:
“to experiment with ideation processes. I also used it to support the creation of presentation material”. (P06)
“I used midjourney as part of the creative process, combining prompts and images we already use to generate new texture ideas.” (P07)

Resolving Doubts in Code Construction:
“I used the tool to solve some small doubts about Python code construction.” (P01)
“I was able to evaluate how the tool can support data exploration processes, code generation, and others.” (P03)
“The tool was used for code syntax research, automatic generation of simple algorithms, writing unit tests.” (P11)

Conducting Formal Writing:
“The use of ChatGPT has been very useful at the beginning of text production activities.” (P04)
“Aiding in composing texts from topics, suggesting ideas for product and process names.” (P09)
“I have been using the OpenAI GPT-4 model for a variety of tasks, including text generation.” (P10)

Benefits / Time Optimization:
“It allowed me to be faster in text writing situations and debugging.” (P09)
“The tool has proven to be useful in saving time and effort, allowing me to focus on more critical tasks.” (P10)
“The main gain was in development and research time.” (P12)

Versatility:
“I believe that using our own materials as data and making the AI generate new combinations.” (P07)
“The flexibility and adaptability of the tool have been crucial aspects that enhance its value and applicability in various contexts.” (P10)
“It is a much more productive way of seeking knowledge when compared to the model of using search engines (e.g., Google) up to that point.” (P13)

Challenges / Reliability:
“Difficulty in verifying the reliability of some information.” (P06)
“I faced challenges, mainly in ensuring that the generated responses are accurate and reliable, which sometimes requires manual review and adjustment.” (P10)
“The absence of sources is one of the main barriers to the reliability of the results.” (P13)

Security:
“The adoption of a model that I can use sensitive data from the company.” (P04)
“I would like to add a crucial observation about the ethical and privacy challenges associated with the use of AI tools like GPT.” (P10)
“The main difficulty was not exposing code used in clients.” (P11)

Productivity:
“Absolutely. The value in the productivity and speed of generating results from my prompts is undeniable.” (P05)
“Yes, the AI tool provided significant value in various areas of my activities. Firstly, it significantly improved my efficiency.” (P10)
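As a reading aid for the quotes above, the short Python sketch below tallies how the excerpted quotations are spread across themes and participants. It is only an illustration: the mapping is transcribed from the participant IDs attached to the quotes shown here, the names QUOTES_BY_THEME, themes_per_participant, and participants_quoted are hypothetical, and the counts describe the excerpts only, not the full set of questionnaire responses or the prevalence of any theme.

```python
from collections import Counter

# Theme -> participant IDs, transcribed from the quotes excerpted above.
# Assumption: this covers only the quotations shown, not the full dataset.
QUOTES_BY_THEME = {
    "Supporting Ideation Processes": ["P06", "P07"],
    "Resolving Doubts in Code Construction": ["P01", "P03", "P11"],
    "Conducting Formal Writing": ["P04", "P09", "P10"],
    "Time Optimization": ["P09", "P10", "P12"],
    "Versatility": ["P07", "P10", "P13"],
    "Reliability": ["P06", "P10", "P13"],
    "Security": ["P04", "P10", "P11"],
    "Productivity": ["P05", "P10"],
}


def themes_per_participant(quotes_by_theme):
    """Count, for each participant, the number of themes they are quoted under."""
    counts = Counter()
    for participants in quotes_by_theme.values():
        # set() guards against a participant being listed twice under one theme.
        counts.update(set(participants))
    return counts


def participants_quoted(quotes_by_theme):
    """Return the sorted list of distinct participants appearing in any quote."""
    return sorted({p for ps in quotes_by_theme.values() for p in ps})


if __name__ == "__main__":
    for participant, n in themes_per_participant(QUOTES_BY_THEME).most_common():
        print(f"{participant}: quoted under {n} theme(s)")
    print("Distinct participants quoted:", len(participants_quoted(QUOTES_BY_THEME)))
```

A tally like this makes it easy to see when the illustrative quotes lean on a few voices, which is worth keeping in mind when reading excerpts from a small pilot sample.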
5.2 Future Work
Following our pilot case study, our immediate future work involves conducting a comprehensive case study, capitalizing on the availability of the company that participated in this study. Our focus will be twofold. Firstly, we aim to explore the particularities arising from the utilization of generative AI tools across various software development roles, ranging from developers to QAs and designers, thereby gaining insights into how different professionals perceive and utilize these tools within their specific tasks. Secondly, we aim to further explore the relationship between generative AI tools and the dimensions of productivity by expanding our participant cohort within the case study to encompass a diverse range of project configurations and software development methodologies. By doing so, we aim to offer a more detailed analysis of the impact of generative AI tools on productivity across different software development contexts, thereby facilitating a more nuanced discussion on this subject.

6 THREATS TO VALIDITY
While our pilot case study provides valuable insights into integrating generative AI tools into software development workflows, some limitations inherent in the method must be acknowledged. Firstly, as a pilot study, our investigation involved a small number of participants from a single company, and our findings are not statistically generalizable to a broader population. Instead, we anticipate that researchers and practitioners can draw insights from our discussions, learn about our findings, and transfer the knowledge acquired from our pilot case study to their unique situations and contexts.
Additionally, the study focused primarily on participants’ perceptions and experiences without comprehensive quantitative metrics to assess the impact of generative AI tools on productivity. Therefore, as with any qualitative research, there is a potential for researcher bias in data interpretation and analysis. To mitigate this threat to validity, we heavily relied on the raw reports provided by the participants, consistently comparing our interpretations with their views throughout our analysis.
Finally, the pilot nature of the study constrained the depth of data collection and analysis, preventing a thorough exploration of the topic. These limitations underscore the need for future research endeavors with broader and more varied samples, integrating both qualitative and quantitative approaches to offer a comprehensive understanding of the topic.

7 CONCLUSIONS
In this paper, we have explored the integration of generative AI tools into software development workflows, aiming to understand [...] opportunities, optimize time, and facilitate the creation of relevant and insightful content. However, the practitioners reported challenges that mainly revolved around reliability concerns and difficulties in obtaining the desired outcomes, forcing them to manually fix the obtained outcomes for inaccuracies or inconsistencies in generated content.
In conclusion, our pilot case study provides insights into the integration of generative AI tools within software development practices. While our findings suggest promising benefits associated with the utilization of these tools, it is important to address the identified limitations through future research efforts to seamlessly integrate them into the software development process. Overall, our study sets the stage for continued exploration into the evolving landscape of software development practices with the integration of generative AI tools.

REFERENCES
[1] A. J. Albrecht. 1979. Measuring Application Development Productivity. In Proceedings of IBM Applications Development Symposium. Monterey, 83.
[2] Amazon Web Services. 2023. Amazon CodeWhisperer. https://aws.amazon.com/codewhisperer/. Accessed: 2023-12-10.
[3] Barry Boehm et al. 2000. Software Cost Estimation with COCOMO II. Prentice Hall, Upper Saddle River.
[4] Alexia Cambon, Brent Hecht, Benjamin Edelman, Donald Ngwe, Sonia Jaffe, Amy Heger, Mihaela Vorvoreanu, Sida Peng, Jake Hofman, Alex Farach, et al. 2023. Early LLM-based Tools for Enterprise Information Workers Likely Provide Meaningful Boosts to Productivity. Technical Report. MSFT Technical Report. https://www. microsoft. com/en-us/research . . . .
[5] Kathy Charmaz. 2014. Constructing grounded theory. Sage.
[6] Daniela S Cruzes and Tore Dyba. 2011. Recommended steps for thematic synthesis in software engineering. In 2011 international symposium on empirical software engineering and measurement. IEEE, 275–284.
[7] McKinsey Digital. 2023. Unleashing developer productivity with generative AI. https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/unleashing-developer-prod Accessed on Mar 22, 2024.
[8] Christof Ebert and Panos Louridas. 2023. Generative AI for software practitioners. IEEE Software 40, 4 (2023), 30–38.
[9] Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, and Jenna Butler. 2021. The SPACE of Developer Productivity: There’s more to it than you think. Queue 19, 1 (2021), 20–48.
[10] GitHub. 2021. GitHub Copilot. https://copilot.github.com. Accessed on November 23, 2023.
[11] Marcela Guerrero-Calvache and Giovanni Hernández. 2022. Team productivity in agile software development: a systematic mapping study. In International Conference on Applied Informatics. Springer, 455–471.
[12] Martin Hoegl, K Praveen Parboteeah, and Hans Georg Gemuenden. 2003. When teamwork really matters: task innovativeness as a moderator of the teamwork–performance relationship in software development projects. Journal of Engineering and Technology Management 20, 4 (2003), 281–302.
[13] Eirini Kalliamvakou. 2022. Research: quantifying GitHub Copilot’s impact on developer productivity and happiness. https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-produ Accessed on Mar 22, 2024.
[14] Young-Ho Kim, Eun Kyoung Choe, Bongshin Lee, and Jinwook Seo. 2019. Understanding personal productivity: How knowledge workers define, evaluate, and reflect on their productivity. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–12.
[15] B Lakhanpal. 1993. Understanding the factors influencing the per-
their impact on productivity from the perspective of software pro- formance of software development groups: An exploratory group-level
fessionals. Our goal was to provide an understanding of how these analysis. Information and Software Technology 35, 8 (1993), 468–473.
https://doi.org/10.1016/0950-5849(93)90044-4
tools can be utilized within real-world projects. Through a pilot [16] Hongxin Li, Jingran Su, Yuntao Chen, Qing Li, and ZHAO-XIANG ZHANG. 2024.
case study involving software professionals, we collected insights SheetCopilot: Bringing Software Productivity to the Next Level through Large
Language Models. Advances in Neural Information Processing Systems 36 (2024).
into their experiences while integrating these tools into their daily [17] Yngve Lindsjørn, Dag IK Sjøberg, Torgeir Dingsøyr, Gunnar R Bergersen, and
work routines. Tore Dybå. 2016. Teamwork quality and project success in software develop-
Our findings revealed a generally positive perception of gener- ment: A survey of agile development teams. Journal of Systems and Software
122 (2016), 274–286.
ative AI tools among participants. These tools were particularly [18] Jefferson Seide Molléri, Kai Petersen, and Emilia Mendes. 2016. Survey guide-
valued for their ability to streamline workflows through learning lines in software engineering: An annotated review. In Proceedings of the 10th
AIware ’24, July 15–16, 2024, Porto de Galinhas, Brazil Mariana Coutinho, Lorena Marques, Anderson Santos, Marcio Dahia, Cesar França, and Ronnie de Souza Santos
ACM/IEEE international symposium on empirical software engineering and measurement. 1–6.
[19] Cleviton VF Monteiro, Fabio QB da Silva, and Luiz Fernando Capretz. 2016. The innovative behaviour of software engineers: Findings from a pilot case study. In Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. 1–10.
[20] Mauricio Monteiro, Bruno Castelo Branco, Samuel Silvestre, Guilherme Avelino, and Marco Tulio Valente. 2023. End-to-End Software Construction using ChatGPT: An Experience Report. arXiv preprint arXiv:2310.14843 (2023).
[21] Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, and Brad Myers. 2024. Using an llm to help with code understanding. In 2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE). IEEE Computer Society, 881–881.
[22] Shakked Noy and Whitney Zhang. 2023. Experimental evidence on the productivity effects of generative artificial intelligence. Science 381, 6654 (2023), 187–192.
[23] OpenAI. 2023. ChatGPT: Optimizing Language Models for Dialogue. https://openai.com/blog/chatgpt. Accessed on December 10, 2023.
[24] Sida Peng, Eirini Kalliamvakou, Peter Cihon, and Mert Demirer. 2023. The impact of ai on developer productivity: Evidence from github copilot. arXiv preprint arXiv:2302.06590 (2023).
[25] Rafael Prikladnicki, Yvonne Dittrich, Helen Sharp, Cleidson De Souza, Marcelo Cataldo, and Rashina Hoda. 2013. Cooperative and human aspects of software engineering: Chase 2013. ACM SIGSOFT Software Engineering Notes 38, 5 (2013), 34–37.
[26] Daniel Rodríguez, MA Sicilia, E García, and Rachel Harrison. 2012. Empirical findings on team size and productivity in software development. Journal of Systems and Software 85, 3 (2012), 562–570.
[27] Per Runeson and Martin Höst. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical software engineering 14 (2009), 131–164.
[28] Anastasia Ruvimova, Alexander Lill, Jan Gugler, Lauren Howe, Elaine Huang, Gail Murphy, and Thomas Fritz. 2022. An exploratory study of productivity perceptions in software teams. In Proceedings of the 44th International Conference on Software Engineering. 99–111.
[29] Caitlin Sadowski and Thomas Zimmermann. 2019. Rethinking productivity in software engineering. Springer Nature.
[30] Carolyn B. Seaman. 1999. Qualitative methods in empirical studies of software engineering. IEEE Transactions on software engineering 25, 4 (1999), 557–572.
[31] Samarth Sikand, Kanchanjot Kaur Phokela, Vibhu Saujanya Sharma, Kapil Singi, Vikrant Kaulgud, Teresa Tung, Pragya Sharma, and Adam P Burden. 2024. How much SPACE do metrics have in GenAI assisted software development?. In Proceedings of the 17th Innovations in Software Engineering Conference. 1–5.
[32] Diane Strode, Torgeir Dingsøyr, and Yngve Lindsjørn. 2022. A teamwork effectiveness model for agile software development. Empirical Software Engineering 27, 2 (2022), 56.
[33] Claes Wohlin and Mattias Ahlgren. 1995. Soft factors and their impact on time to market. Software Quality Journal 4, 3 (1995), 189–205. https://doi.org/10.1007/bf01351923
[34] Qing Xie and Atif M Memon. 2008. Using a pilot study to derive a GUI model for automated testing. ACM Transactions on Software Engineering and Methodology (TOSEM) 18, 2 (2008), 1–35.
[35] Robert K Yin. 1994. Discovering the future of the case study. Method in evaluation research. Evaluation practice 15, 3 (1994), 283–290.
[36] Beiqi Zhang, Peng Liang, Xiyu Zhou, Aakash Ahmad, and Muhammad Waseem. 2023. Practices and challenges of using github copilot: An empirical study. arXiv preprint arXiv:2303.08733 (2023).