Pair programming is a development work mode in which two people collaborate on the same task in split, but frequently reversed, roles (one thinking high level, the other implementing the changes). Compared to solo programming, pair programming may have multiple positive effects, for example, on code quality. Using pairing correctly is therefore an important, but non-trivial, aspect of coordination within highly effective development teams. With the emergence of capable generative AI coding assistants in the last few years, pairing between a human programmer and an AI coding assistant (sometimes called pAIr programming) has also developed into an established pattern. Since I personally enjoy traditional human/human pair programming but also use and see the benefits of AI coding assistants, I was curious about what the literature has to say on the pros and cons of both work modes. Here is what I found out.
TL;DR (for those in a rush)
Based on the research in the reviewed papers, human/human pair programming is still a very useful collaboration mode due to its positive effects on productivity, code quality, team satisfaction, and learning compared to solo programming. However, human/AI pair programming (pAIr programming) has a significant overlap with regard to productivity, code quality, and satisfaction. Given that pAIr programming is also more flexible than pair programming (for example, because AI coding assistants are available around the clock), it certainly is a good addition, at least in cases where traditional pairing is not reasonable and solo programming would be less efficient. To utilize both traditional pair programming and pAIr programming optimally, I suggest a hybrid workflow that starts with a pairing session to share knowledge, break down tasks, and find solutions to problems, followed by a switch to pAIr programming or solo programming depending on the task.
Introduction
I’m a big fan of pair programming, the work mode in development in which two people work together on the same task, one thinking high level (navigator) and the other implementing the changes (driver), with frequent role reversal. There are a couple of reasons for this. First and foremost, it is, in my opinion, one of the best ways to distribute knowledge within development teams. Additionally, pair programming ensures continuous communication and mutual understanding, especially in scenarios in which team members are distributed geographically and collaborate remotely. Under the right conditions, pair programming may also yield notable improvements to productivity and code quality.
With the general availability and affordability of generative AI coding assistants, like GitHub Copilot, however, a new—potentially rivaling—pattern of pairing a human programmer with an AI coding assistant (pAIr programming) has emerged. Choosing tasks that benefit from pair programming has not been trivial before and with this additional option of human/AI pair programming in the mix, it has become even more difficult. The most interesting question in this context, obviously, is whether or when to choose pairing with an AI over pairing with a human colleague. Does pAIr programming replace traditional pairing? Or, if not, in which situations should we put it to use?
Pair programming vs pAIr programming
A short overview of the pros and cons of pair programming
Pair programming used to be exclusively compared to solo programming (the default work mode in most development teams). Some of the questions that arise naturally when we think about introducing pair programming as a work mode for development teams (either as a default or situational mode of working together) include
- What are the benefits of pair programming over solo programming?
- When should we add pair programming to our workflows?
There is a somewhat broad consensus that pair programming, under the right conditions, may have notable and measurable positive effects on
- code quality,
- implicit knowledge transfers, as well as
- team cohesion and satisfaction.
Code quality: Higher code quality, alongside readability and maintainability, typically means fewer bugs and thus lower bug-fixing costs. Although there are no recent studies on this topic, a study published in 2010 (Integrating Software Assurance into the Software Development Life Cycle (SDLC); Dawson et al.; 2010) on the cost of fixing bugs in different software lifecycle stages found that fixing bugs during maintenance may cost up to 15x as much as fixing them during development.
Pair programming, by virtue of having two pairs of eyes looking out for potential bugs instead of just one, may result in finding more bugs earlier and, therefore, may lower the cost of fixing bugs.
Knowledge transfer: Moreover, by having two programmers collaborating on the same task, a lot of communication and thus knowledge transfers happen naturally and on-the-go. This is also very important, since it may
- reduce review times,
- lower the need for knowledge sharing sessions, and it may
- secure continued work in case a team member leaves, becomes ill, or is otherwise unable to work.
Team cohesion and satisfaction: Pair programming may moreover reduce silos and heighten mutual understanding within development teams. This may in turn increase satisfaction, confidence, and motivation, thus upholding a positive team spirit and reducing the risk of developers leaving the team.
Pair programming, however, does not always yield positive effects. For example, simple tasks (like refactoring or writing documentation) typically will not benefit from pairing. The human factor also plays an important role. Some people prefer working on their own and may, therefore, neither be more productive in a pair nor respond well to having to pair up for a task.
Enter human/AI pair programming
Obviously, a lot has happened since early 2022 (when I wrote the post linked above). In particular, generative artificial intelligence (GenAI) chat bots like ChatGPT or GitHub Copilot emerged that can be used to write and analyze code and even allow new programming methodologies like Vibe Coding.
Programming in conjunction with an AI chat bot often superficially resembles human/human pair programming with the AI taking the driver part (code writing). There is a back and forth dialogue where the human developer breaks down the actual coding tasks and the AI suggests pieces of code. The human developer integrates the code and tests it (potentially with support of the AI for writing test code).
Some AI-assisted code editors like Cursor actively change the code base and thus allow the human programmer to fully concentrate on the navigator part.
Role reversal is also possible but, from my limited experience, current AIs are not very good at navigating changes to a code base given a set of requirements. Conversely, they often excel at writing quality code when presented with a concrete task. So, having the AI take the part of the navigator is likely not as efficient. One major difference between pAIr programming and human/human pair programming, consequently, is the fixed role setup.
Another way in which pAIr programming differs from human/human pair programming is the knowledge distribution and team cohesion aspect. Talking to an AI chat bot is, for one thing, usually one-sided. The human programmer will not receive critical feedback or suggestions unless specifically asking for them. Moreover, by not talking to a human colleague, the human programmer does not share the knowledge about technical aspects or business requirements within the team. Lastly, pAIr programming sidesteps direct team collaboration and therefore has little if any impact on team cohesion.
Some pointers on how to choose between solo programming, human/human pair programming, and pAIr programming
Given that we now have at least a trifecta of development work modes (solo programming, human/human pair programming, and pAIr programming), how do we pick the right work mode for a given programming task? Thankfully, there are some relatively recent studies available that provide answers to some of the most pressing questions. I picked the following two studies from 2023:
- Is AI the better programming partner? Human-Human Pair Programming vs. Human-AI pAIr Programming (Ma, Wu, & Koedinger; 2023)
- Classification of Human-Human and Human-AI Pair Programming Effects and Expansion for AI Pair Programming Patterns (Takai et al.; 2023)
This is obviously a limited selection and it may, therefore, not perfectly reflect the current state of research on this topic. That being said, both papers appear methodologically rigorous and unbiased, and contain some interesting findings.
Ma, Wu, & Koedinger: The first paper analyzes all three work modes in the dimensions of
- quality,
- productivity,
- satisfaction,
- learning, and
- cost.
An overview of the comparison results is shown in the table below.
From the table, we can gather that pAIr programming may yield some notable improvements over solo programming when considering quality, productivity, and satisfaction. There is, however, no noteworthy difference in learning, and there are no robust results yet for cost comparisons. Notably, the paper does not contain a direct estimation of the pros and cons of human/human pair programming compared to pAIr programming. We can infer, however, that both work modes have their respective positive effects on quality, productivity, and satisfaction, while traditional pair programming is better with respect to learning. Given that pairing with an AI coding assistant is more flexible than pairing with a human colleague, the authors make the argument that pAIr programming may fill a niche in situations where human/human pair programming typically runs into problems. More concretely, this niche comprises situations where it’s difficult to
- find tasks of the right complexity,
- assign human programmers that are compatible with respect to skill level and/or working styles,
- establish proper communication and collaboration, and
- handle logistics (for example, scheduling pairing sessions).
Hence, pAIr programming may be a good substitute for human/human pair programming in case development teams encounter one or more of the above challenges.
Takai et al.: The second paper assesses human/human pair programming and pAIr programming by how they influence code, coding, coder, and pairing. A summary is shown below.
While the results are somewhat muddied in that there are no clear decision rules for how to choose between human/human pair programming and pAIr programming, there are some important pointers that may guide development teams. The first point made is that, in the code dimension, experts may benefit more from working with AI coding assistants than novices, since the latter have more trouble gauging the quality of AI-generated code. Consequently, pAIr programming may not be an adequate replacement for human/human pair programming if one of the programmers is comparatively inexperienced.
Moreover, the authors note that productivity gains of pAIr programming in terms of code generated may be held back by the higher effort of fixing deficiencies and cleaning up tech debt. As for the impact on the coders themselves, pAIr programming does not necessarily lead to skill improvements on the human side, especially if responses contain errors and if sources are not transparent.
For human/human pair programming, it is reported that quality and productivity gains are disputed and may rest on conditions that are not precisely outlined. Takai et al., however, also confirm the assessment that human/human pair programming at least has a positive impact on commitment and satisfaction and facilitates knowledge distribution within the team, thus making team shake-ups and handovers less risky.
The authors also mirror the finding made in the Ma, Wu, & Koedinger paper that pAIr programming has a definite advantage over human/human pair programming when some of the limiting factors like personal incompatibility or scheduling difficulties are present.
The paper, moreover, contains a very interesting proposal that test-driven development could be an instrumental way to improve how intent is communicated by the human programmer to the AI in pAIr programming (a pattern called test-driven AI coding). Conveying intent via test cases may lead to fewer ambiguities compared to communicating in natural language and thus may reduce the amount of unsuitable or faulty code produced by the coding assistant. The authors suggest the following four-step workflow for test-driven AI coding:
- writing tests (potentially also using an AI assistant),
- writing code (using the AI assistant and running against the test cases),
- refactoring code (applying coding conventions, restructuring, and reducing duplications), and finally
- analyzing (with static code analysis tools like SonarQube).
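To make the first two steps of this workflow a bit more tangible, here is a minimal Python sketch. It is my own illustration and not taken from the paper: the function `slugify`, the helper `check_spec`, and the test cases are all hypothetical. The idea is that the human writes the test cases first as an unambiguous statement of intent, and a proposed implementation (for example, one suggested by the AI assistant) is only accepted if it passes all of them.

```python
import re
from typing import Callable, List

# Step 1: the human writes the tests first. They state the intent
# unambiguously and can be handed to the assistant together with the prompt.
# (slugify and these cases are hypothetical, chosen only for illustration.)
SPEC_CASES = [
    ("Hello World", "hello-world"),
    ("  Pair   Programming  ", "pair-programming"),
    ("pAIr programming: pros & cons!", "pair-programming-pros-cons"),
    ("", ""),
]


def check_spec(candidate: Callable[[str], str]) -> List[str]:
    """Run a candidate implementation against the spec; return failure messages."""
    failures = []
    for title, expected in SPEC_CASES:
        actual = candidate(title)
        if actual != expected:
            failures.append(f"{title!r}: expected {expected!r}, got {actual!r}")
    return failures


# Step 2: code that an assistant might propose for the spec above.
# It is accepted only if check_spec reports no failures.
def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


if __name__ == "__main__":
    failures = check_spec(slugify)
    print("all spec cases pass" if not failures else "\n".join(failures))
```

If a candidate fails, the failure messages can be fed back to the assistant instead of a natural-language description of what went wrong. Steps three and four (refactoring and static analysis) would then operate on the accepted code and are not shown here.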
A hybrid workflow
Based on the results of the two research papers summarized above and my personal experience, an example workflow that makes good use of all three work modes (pair programming, pAIr programming, and solo programming) may look like this:
- begin work in human/human pairing mode to touch base on the current state and create a collective understanding of requirements, problems, conventions, and so on; then,
- continue with traditional pairing to break down hard problems and find solutions; eventually,
- split up into individual pAIr programming sessions once the tasks have been refined to a degree that allows parallel and independent work, and lastly
- fall back to solo programming wherever reasonable.
Working like this should leverage the respective advantages of pair programming, pAIr programming as well as solo programming optimally.
Summary
Pair programming is a development work mode where two human developers work together on the same task. One person thinks high level (navigator) and the other person implements the changes (driver). Pair programming has traditionally been compared to solo programming and there is research suggesting positive effects on productivity, code quality, team satisfaction, and knowledge distribution. With the advent of capable AI coding assistants, a new working mode where a human developer works together with an AI (pAIr programming) has emerged. This new work mode may rival human/human pair programming.
From the reviewed literature, it does not become completely clear which mode would typically be the best choice under what circumstances. That being said, what is clear is that
- with respect to productivity, code quality, and team satisfaction both traditional pair programming and human/AI pair programming may have positive effects over solo programming; that
- regarding learning and knowledge distribution, traditional pair programming is still better than solo programming or human/AI pair programming; and that,
- traditional pair programming requires good communication and collaboration skills and proper logistics which limit its usefulness and open up a specific niche for human/AI pair programming where it may serve as a very good addition due to its flexible and accessible nature.
Additionally, test-driven development may improve pAIr programming by conveying intent to the AI coding assistant more effectively (test-driven AI coding).
Consequently, human/human pair programming should still be considered one of the best work modes for effective development teams, and it pairs well with the use of AI coding assistants if the pros and cons of both work modes are understood well. Lastly, solo programming also still has its place, especially as a fallback, but will likely have an even more diminished role.