Beyond the stoplight chart: How to perform Agile oversight at scale.

Four years ago I co-wrote the federal government’s guide to budgeting and oversight of major software projects. In the intervening years, the advice contained in that document has been put into practice in a number of state legislatures’ and governors’ offices, to the point where a new problem has appeared: How do you oversee software projects at scale?

So you’ve left behind the fiction of stoplight charts. You’re getting demos instead of memos. Teams are ostensibly delivering software incrementally, based on constant user research. You have two, five, ten, thirty projects working in this way. But, oh no…you have thirty projects working in this way. Now your calendar is just demos of functioning software, all day, week after week. This is a great problem to have, but it’s a problem just the same.

I have worked on solving this problem, and I’ve applied parts of a solution, but my solution is by no means a tested theory of change. But I think it’s important to work out loud, so what follows is the four-stage process that I recommend to anybody facing the problem of scaling up oversight of major software projects. (I wrote a bit about this last May, though targeted at agency heads, in “How an agency principal should oversee a major custom software project.” That’s meant for the actual head of an agency; this guide, on the other hand, is meant for their deputies, advisors, etc., for legislative staff in similar positions, and for state IT agencies and procurement shops.)

Here are the four steps: 1. Get access to the backlog. 2. Require sprintly ship notices sent to leadership. 3. Engage with the ship notices. 4. Provide direct support. Let’s take each of these in turn.

1. Get Access to the Backlog

Any Agile team practicing user-centered software development is going to maintain a backlog of user stories, in GitHub Issues, Jira, or a similar tool. (And if they are not, that is the biggest, reddest of red flags.) Tell the team that they need to provide you with access to that backlog, so that you can monitor their work. This is the single source of truth for any software project—absent some serious deception, looking at it will reveal exactly how much work they’re getting done and how they are doing it. If the team is reluctant or claims a technological obstacle, tell them that they have 30 days to provide access; they will spend that time frantically getting their act together, like somebody tidying up their home before the house-cleaner arrives.

When you have access to the backlog, periodically review it to get a sense of what they’re doing, why they’re doing it, how it’s going, and the extent to which that work is rooted in user needs uncovered through one-on-one user research. Look at the software they’re developing to see those user stories realized in staging and production. It is important to fight the urge to extract metrics from the backlog. You might think “I can tally the story points completed per sprint to measure productivity,” or “I can measure the time allocation per activity.” Down this path lies madness. A scrum team is a small, self-organizing, cross-functional group that incrementally builds software to address user needs. The one external metric that they should be held to is “are they meeting user needs?” Turning their activities into metrics in the name of transparency will wreck that work product.

2. Require Sprintly Ship Notices Sent to Leadership

A reasonable request to make of a team is that they send ship notices at the end of each sprint—a summary of what they’ve been up to for the past two weeks. A ship notice should also include what’s planned for the next sprint, a list of blockers that they need help clearing, and a one-sentence budget update that indicates how much has been spent against what total dollar value, with a forecast end date based on that burn rate. To report what they’ve been doing for the past two weeks, have them simply paste in a list of all user stories that they completed, with each one linked to that user story in the backlog, for traceability. These only take a few minutes to prepare, since it’s mostly copying and pasting from their backlog, and a cadence of one per sprint is sufficient for leadership. This email should be CCed to the most important people who need to receive these updates; ship notices for a major agency initiative should be CCed to the agency principal.
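The burn-rate forecast in that budget sentence can be entirely mechanical. Here is a minimal sketch (the function name and all dollar figures are invented for illustration) of a deliberately naive straight-line projection: assume spending continues at the average rate observed so far, and compute when the total runs out.

```python
from datetime import date, timedelta

def forecast_end_date(start: date, as_of: date, spent: float, total: float) -> date:
    """Project when funds run out, assuming spending continues at the
    average daily rate observed between `start` and `as_of`."""
    days_elapsed = (as_of - start).days
    burn_per_day = spent / days_elapsed          # average daily spend to date
    days_remaining = (total - spent) / burn_per_day
    return as_of + timedelta(days=round(days_remaining))

# e.g. $1.2M spent of a $4M total after 180 days of work
print(forecast_end_date(date(2024, 1, 1), date(2024, 6, 29), 1_200_000, 4_000_000))
```

A straight-line forecast is crude—spending is rarely uniform—but for a one-sentence budget update, crude and consistent beats precise and unreported.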

3. Engage with the Ship Notices

It is not enough to simply receive the ship notices. For teams, that will eventually feel like they’re CCing /dev/null. It is important to actually read every ship notice, understand every ship notice and, when appropriate, engage. When you see that a particularly valuable or interesting user story has shipped, you should try out the new feature on the site and respond with praise for the team. If you see that there’s a blocker that you could or should address for them, do so. If you see that a team’s velocity has slowed down over the past few sprints, you should check up to see if they need support (not to complain, but to help). Remember that your emailed responses will be shared with the team—seeing that leadership is engaged, interested, and supportive can make a big difference to their morale. The frequency with which you engage with ship notices should be proportional to the level of oversight required for the project.

4. Provide Direct Support

Sometimes projects will need help. “Help” does not mean haranguing them, insisting that they work faster, or micromanaging them. Many projects will be performing Agile, but not actually practicing Agile. Sprintly ship notices will reveal this to be the case, but they won’t teach the team how to work correctly. Demanding that they “be Agile” is not going to do it.

The solution is to teach them how to work, and that’s best done in the form of an internal digital service team. That can be a very small team, as few as four people: a user researcher, a software developer, an Agile coach, and an Agile project manager. It’s possible that one person could check more than one of those boxes. If your agency outsources a lot of software development, then a contracting officer should be in the mix, too. That team needs to have experience with consulting, because their task is ticklish.

If the software development team in question is composed of agency employees, the digital service team can work directly with them to help them improve their processes. If the software development is outsourced, the digital service team can help agency employees learn how to let the vendor team do their best work, or demonstrate that they need to replace that under-performing vendor with a better one (and then help them do so).

It is not enough to tell them “do better,” because they may not know what “better” looks like, or how to get there within the realities of your agency. You need to give them the capacity to work better.


The goal here is to ensure that each project has spun up the flywheel of using constant user research to inform the development of in-production software that’s providing benefits to users.

This is all about scaling up “demos, not memos,” but it’s no replacement for actual demos. It’s important to attend sprint reviews as often as feasible, especially for high-risk and high-value projects, to ensure that value is being delivered to end users.

At the risk of straying into management consulting, I must caution that there is such a thing as too much oversight. Is the team consistently delivering high-value improvements to end users who are measurably, objectively happy? Then leave them alone! If they are falling short of that goal, you will determine that through getting access to their backlog, reading their sprintly ship reports, and engaging with those ship reports, and the way to address the problem is by providing them with direct support from an experienced, cross-functional digital services team. Just because you are qualified to observe that there is a problem does not mean that you are qualified to fix the problem. Only an experienced Agile coach can do that.

Again, this four-part approach is by no means a tested theory of change, but instead a series of things that I and/or my colleagues have tried at least once and had success with. I’m working every day on testing out these and other approaches to Agile oversight at scale, as I know many other people are doing as well. We’re on a path to collectively arrive at a theory of change over the next couple of years, and exercising this four-step process will help get us there.

Preventing lousy vendors from bidding on your custom software project.

Baffling illustration courtesy of DALL-E attempting to illustrate the introductory paragraph.

Government agencies that want different results from their software procurements recognize that they need to get bids from new vendors to make that happen. A lot of work needs to go into that (market research, dividing up projects differently, simplifying terms & conditions, circulating solicitations more widely, etc.), but three changes stand out above all others for their effectiveness in warding off incompetent vendors while attracting new, well-qualified vendors: owning copyright, publishing the software as open source, and issuing smaller contracts. Let’s review each of these in turn.

Own Copyright

Some of the major vendors in the custom software space use government contracts as a source of R&D funding to develop software that they retain ownership of, and then sell to other agencies. They create it as custom software for the first agency, and then license it as “commercial off-the-shelf” (COTS) software to subsequent customers. This is wasteful spending by government, which is left licensing near-identical software that they should have retained ownership over in the first place. An enormous amount of work is required on the part of agencies to support vendors’ development services, and there is no reason whatsoever that the benefit should accrue to the vendor. (Sometimes the private sector should retain copyright for custom software! This is what SBIR funding is for. But if it’s not SBIR-funded, the vendor should not retain any ownership over the rights.)

This predatory contracting model is not something that agencies should participate in. A solicitation that makes crystal clear that copyright will be held by the agency will repel these vendors, while attracting vendors who have no interest in owning these work products.

Publish Software as Open Source

Once the agency owns the software, they should release it under an open source license; or, if it’s a federal agency, it should be released into the public domain or, better, under a Creative Commons Zero license (public domain, but with international applicability). I recently wrote at length about the value of agencies publishing their software as open source, so I won’t belabor this.

Issue Smaller Solicitations

Agencies often turn a monolithic need into a monolithic procurement. “The legislature requires us to stand up a new benefits system” turns into a solicitation to create a new benefits system. This is extraordinarily risky, with vanishingly slim odds of success. There is no vendor that’s able to write software, host software, operate a server farm, run a help desk, run a training program, run a call center, and manage payments, and there are only a literal handful who will bid on such a contract (intending to subcontract most or all of the actual work). Issuing solicitations like this guarantees that you’re only going to receive bids from the usual suspects, and ensures that you’ll get the exact same results that you’ve always gotten.

Competent software development vendors don’t want to run your help desk or train your employees—they just want to write software. They’re not going to bid on $100 million projects, because the time and effort required to get a contract like that is enormous, making it too big of a risk to bother with. (Also, for a small vendor, actually getting a contract that big would pose an existential risk to them.)

The solution is to break up the contract into smaller pieces. Have one contract for software development, one for help desk services, one for hosting, etc. Is that more work for the contracting shop? Possibly, certainly the first time they take this approach. But then one failed contract won’t sink your whole effort, you’ll get vendors who are actually good at the thing you’re hiring them for, and you’ll get bids from vendors who aren’t the usual suspects.


There is a laundry list of other things to do differently to attract new vendors (and fend off bad ones), but the the highest-impact ones are the trinity of owning copyright, publishing the software as open source, and issuing smaller contracts. This combination will make a big difference.

Publishing your agency’s software as open source puts you at a major advantage.

In the United States, nearly all agencies and departments, at all levels of government, rely on open source software. Some do so as an explicit policy decision, others merely as a matter of practice. Objections around software quality and security collapsed long ago in the face of reality. But publishing open source software is a very different situation.

When agencies procure custom software, but keep the source closed, they are putting themselves at a significant disadvantage. There are some enormously compelling arguments—and some surprising arguments—in favor of agencies preemptively publishing source code.

Works of government may be public by default

Under the Copyright Act of 1976, “Copyright protection under this title is not available for any…work prepared by an officer or employee of the United States Government as part of that person’s official duties.” Although there are exceptions, it is broadly true that software developed by the federal government is in the public domain. Custom software produced for the federal government by vendors is often public domain, but not always. If the copyright is assigned to government—as it really, really should be—then it is generally public domain. (This is governed by FAR Subpart 27.4—Rights in Data and Copyrights. See also the prescribed clause at FAR 52.227-14—Rights in Data—General: “Government shall have unlimited rights in…[computer software] first produced in the performance of this contract.”)

At a state level, things are much messier. There is no consistency between states, and sometimes no consistency between agencies within a state. Custom software produced by or for state agencies might be public, or it might not be—you’d need to consult a lawyer to be sure.

Open-records laws may require you to turn source code over to requestors

Federal and state open-records laws have often been used to compel agencies to provide custom software’s source code to anybody who asks. It’s broadly true, at a state and federal level, that there are exemptions for releasing records that are otherwise prohibited by law, and every government has a law prohibiting the release of information that could undermine computer security. But if that exemption doesn’t apply, then agencies need to turn over the source code, which the recipient is then free to make public. Again, there are huge differences between states’ freedom of information laws, so research is necessary to know how those apply to source code in a given state.

Open source is more secure than closed-source software

People who aren’t software developers often assume that software is more secure if the source code is kept secret. If that were true, source code would be inherently exempt from open-records laws, its copyright held closely by government. Source code would be treated as a state secret, stored and transferred like nuclear waste. But that’s not how we treat software, because we know that premise isn’t true.

A strong argument for open source’s security comes from the U.S. Department of Defense’s excellent Open Source Software FAQ, specifically their answer to “Doesn’t hiding source code automatically make software more secure?” which opens with, simply, “No.” The document goes into a great deal of detail, but suffice it to say, no less of an authority than the nation’s military thinks that it’s important for government to publish source code openly.

Open source supports agencies’ need to tell the public that they are being transparent in their work

It can be important for many agencies to help the public have confidence in their operations, and that’s especially true when they issue decisions that are made by or augmented by software. Judges have often agreed with defendants’ demands that breathalyzer software source code be provided so that the defendant has a chance to prove that it contains bugs that led to an erroneous reading. Public benefits systems increasingly automate qualification decisions, and advocates are putting a corresponding amount of pressure on agencies to allow them to verify that those decisions are consistent with relevant laws and regulations. For types of software likely to be subject to such scrutiny, publishing the source code preemptively shows that the agency has nothing to hide.

Open source prevents vendors from including copyright poison pills

When government procures custom software from vendors, with government owning the copyright, unscrupulous vendors may see an opening to make some money on the back end, by including software that they already wrote and hold copyright on. They can deliver software that is 99% government-owned, but has interwoven into it a critical layer of software that’s owned by the vendor, for which they can demand licensing fees for as long as the software is in use. One way to avoid this problem is to contractually require that all software delivered by the vendor be published under an open source license, absent written permission providing exceptions.

For a high-profile example of a copyright poison pill, see Georgia v. Public.Resource.Org, the Supreme Court case that resulted from LexisNexis weaving copyrighted text throughout the Code of Georgia to ensure that they’d receive licensing fees from anybody attempting to reproduce the state’s laws.

Open source sets up a powerful incentive to get the best work from the best vendors

Pity the software developer working at a standard-issue government consulting firm building a software product on contract with an agency. Odds are low that their code will ever make it into production, even lower that anybody will ever use it for anything at all, and basically nonexistent that anybody else will ever see a single line of code that they wrote. It’s a miserable way to make a living. What sort of work product would you deliver under those circumstances?

Incentives change substantially if the product is advertised as open source from the time that the first RFP is published to create it. Lousy vendors usually know that their work is bad. They don’t want to produce software for government that will be public, for potential future clients to see. And good vendors know that their work is good. They’re proud to have their work be public, to have something to show potential future clients. They want to put high-performing employees on those projects, who will make the vendor look good. And if the source code is on GitHub, those employees know that their daily work is being promoted to their professional network on GitHub, and available for review by potential future employers. They’re no longer toiling in obscurity, but instead working very much in public, doing their best work on behalf of an employer who wants to give them the resources to do their best work.

You can filter out lousy vendors and elevate good vendors by declaring a product to be open source from the outset.

Open source reduces risk for subsequent vendors, which reduces the dollar value of their bids

Software is never done. The vendor who gets the contract to build custom software may well not be the vendor who gets the contract to maintain it, or the vendor who gets the contract to add new functionality years later. There’s a lot of risk for vendors working on existing software if they can’t inspect it first. Is the code garbage? Is there documentation? Is it well linted? Is there good test coverage? Can developers run it in Docker on their laptops?

If vendors receive an RFP to improve existing software, and it’s trivial for them to look at the source code, and what they see is good, that reduces their risk, which in turn reduces their bid. Open source software is cheaper software.


There are vanishingly few downsides associated with government publishing source code publicly, but some substantial upsides. Agencies should default to working in the open, and reap the benefits of reduced cost, increased trust, and higher-quality results.

Budgeting for software projects in “scrum team years.”

It’s often said that if you want to know how long it will take to complete an Agile software project, then you should get started on building it. The theory is that once you have your backlog built out, your stories sized, and you know your velocity thanks to a few months of work, you can get a rough idea of how long it’ll take to complete the whole thing. (Henrik Kniberg explains this in his “Agile Product Ownership in a Nutshell” video.) And that might be fine in some scenarios, but in organizational contexts, it’s often impossible to get started without funding, and it’s impossible to get funding without having a defensible estimate of how long a project will take…which is difficult to do without getting started. What’s to be done?

There are a lot of bad ways to estimate software project costs, and they fall into two camps: qualitative estimates (“I did something like this once and it took about 10,000 hours, so that’s how long this will take”) and quantitative estimates (“this is similar to these three other projects, which have an average of 600,000 lines of code, and historical data shows that a line of code takes 1 minute to write, so this will require 10,000 hours”).

In government, in practice, neither of these is used. Instead, an agency publishes a request for information, vendors provide ballpark figures that are rarely rooted in any defensible math, the agency then makes a request for funding based on those responses (e.g. by tossing out the high and low numbers and averaging the remainder), funding is awarded, the agency publishes a request for proposals, and the bids come back at prices real close to that awarded funding. At any step of the way, if anybody asks why the cost is, say, $20 million…well, that’s not actually explainable. There is no internal logic that underlies this price tag. The result has been 20 years of spiraling costs for custom software in government, as prices have gradually gone up because they are tethered to nothing but the amount of money that vendors say it’ll cost, and vendors have every incentive to provide a big number.

There is a better way: “scrum team years.”

Federal labor data shows us that the blended hourly rate for each member of a scrum team averages about $125, or about $235,000 per year (at 1,880 hours per year). A scrum team, therefore, will run you $1–2 million per year, depending on whether it’s closer to four members or nine.

When procuring a major custom software project, you can interrogate the reasonableness of the price by breaking it down into scrum team years. Is the bid for $20 million? Then you should be getting between 10 and 20 scrum team years—perhaps as 5 scrum teams working for 4 years, perhaps as 5 scrum teams working for 2 years, or any number of other mathematically plausible variants. Experienced software developers can compare the complexity of a project to the number of scrum team years and have a sense as to whether the price makes sense.
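The arithmetic behind this sanity check is simple enough to sketch. A minimal version, using the ~$235,000-per-member annual cost cited above (the function name is my own, and the team-size range follows the four-to-nine-member scrum convention):

```python
# Assumed fully loaded annual cost per scrum-team member, from the
# ~$125/hr blended rate x 1,880 hours/year cited above.
ANNUAL_COST_PER_MEMBER = 235_000

def scrum_team_years(bid: float, team_size_range=(4, 9)) -> tuple:
    """Convert a bid price into a plausible range of scrum-team years.
    A larger team costs more per year, so it yields fewer team-years."""
    small, large = team_size_range
    most = bid / (small * ANNUAL_COST_PER_MEMBER)    # cheap teams -> more team-years
    fewest = bid / (large * ANNUAL_COST_PER_MEMBER)  # expensive teams -> fewer
    return fewest, most

lo, hi = scrum_team_years(20_000_000)
print(f"a $20M bid should buy roughly {lo:.0f}-{hi:.0f} scrum-team years")
```

The exact endpoints shift with the labor-rate assumption; the point is not precision but having a unit—team-years—that a non-specialist can interrogate.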

This works at an agency level, this works at a procurement level, this works at a budgeting level. It allows people who lack deep expertise in software development (which is to say nearly everybody involved in the entire budgeting and procurement process) to have some basic unit of value to compare and debate. (Does this project really require 500 people working for 5 years? What could we get from 10 people working for 6 months? Wait, we’re only getting 10 scrum-team years but we’re paying $50 million? And so on.)

It’s even possible to interrogate the price tag more deeply, if it’s arrived at thoughtfully. In the same way that a scrum team will generally estimate story sizes (“pointing stories”) before pulling them into a sprint, it’s also possible to use experience—some of that qualitative analysis—to make estimates at much larger scales. This will suffer from the same problems that plague standard estimation methods, but for the purpose of establishing a rough order of magnitude, when used by people with significant experience executing comparable projects, it’s an internally-coherent, externally-validatable approach. Reasonable minds can disagree as to whether creating (say) a basic account-creation system might require more or less than one month of work by a scrum team, but breaking down a project into a dozen or so such units and recording the estimated effort level for each unit allows those minds to disagree meaningfully and productively.
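A unit-by-unit decomposition of that sort can be sketched in a few lines. Everything here is invented for illustration—the units, the effort numbers, and the $1–2M-per-team-year cost range from above:

```python
# Hypothetical rough-order-of-magnitude estimate: break the project into
# coarse units and record each one's estimated scrum-team months.
estimates = {
    "account creation & login": 1,
    "application intake forms": 3,
    "eligibility rules engine": 6,
    "case-worker review queue": 4,
    "notifications (email/SMS)": 2,
    "reporting dashboard": 2,
}

team_months = sum(estimates.values())               # total scrum-team months
team_years = team_months / 12                       # convert to scrum-team years
cost_range = (team_years * 1e6, team_years * 2e6)   # at $1-2M per team-year
print(f"{team_months} team-months = {team_years:.1f} team-years, "
      f"~${cost_range[0]/1e6:.1f}M-${cost_range[1]/1e6:.1f}M")
```

Each line item is something reasonable minds can argue about individually (“is a rules engine really only six team-months?”), which is exactly the meaningful, productive disagreement described above.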

Scrum-team years are a good estimation tool for aligning program teams, budgeting, procurement, and oversight, giving everybody a common currency of understanding.

How an agency principal should oversee a major custom software project.

The success of many government agencies now hinges on their ability to successfully execute large custom software projects. And yet as a rule, the principals of those agencies lack the ability to ensure that those projects will succeed, or even to oversee them meaningfully. As a result, they have lost the ability to ensure that their agency can achieve its mission.

The Problem

Agencies are pretty specialized—in federal government, their key needs are unique, and in state government they’re one of just 50 (or as many as 56, depending on how you count) agencies with those needs. The truly generic needs can be addressed via commercial off-the-shelf software (COTS), but the mission-unique stuff must be met by what I call load-bearing software, which is inherently custom.

Load-bearing software became a thing in federal government midway through the last century. The ur-example of this is the IRS’s Individual Master File, their core computing system, written in COBOL and IBM System/360 assembly, which debuted in 1960. Load-bearing software became an increasingly common need for government agencies in the 1980s and 1990s, with that software almost entirely internal-facing. That changed in the 2000s and 2010s, as the public gradually came to expect internet-intermediated interactions with agencies, especially for application processes. In 2020, Covid forced agencies at all levels of government to move service delivery online, and three years later there is no sign of that shift receding.

If a state unemployment agency’s UI system doesn’t work, in what sense are they a UI agency? If a state’s EBT system goes down, in what sense do they provide SNAP benefits? If the IRS’s Individual Master File crashes, in what sense are they a taxation agency?

Load-bearing software must work for agencies to achieve their missions. And yet, under the standard outsourcing paradigm, agencies outsource every aspect of the construction, maintenance, enhancement, support, and hosting of this software. In doing so, they outsource their mission. This is a terrifically dangerous practice.

When an agency principal lacks the knowledge or even interest to understand and control these software projects, they are handing their control of the agency to a consulting firm’s project manager. No leader wants to do that.

The Solution

In short, agency leaders need to give a damn about technology procurement, budgeting, oversight, and implementation. Load-bearing software is not a detail—it’s the whole ballgame.

There are four things that principals need to learn if they’re to control their agency’s ability to achieve its mission:

  1. How modern software is made
  2. What’s possible, at what level of effort
  3. How much software costs
  4. How to oversee software development

Let’s review each of these.

How modern software is made

To know how software gets built today, there are six core concepts that agency leaders need to grasp:

  1. User-centered design
  2. Agile software development
  3. Product ownership
  4. DevOps
  5. Building with loosely coupled parts
  6. Modular contracting

There’s a short overview of each of these in GSA’s “State Software Budgeting Handbook,” which I co-wrote in 2019, so I won’t re-explain them here. It’s not enough for agency principals to read a paragraph about each of these, though. With about an hour of training in each of these subjects, agency leaders can build a good base of knowledge about how projects are being executed—or should be executed—by vendors and agency staff.

What’s possible, at what level of effort

Much of the work overseen by agency leaders draws from fields whose complexity they’re already equipped to understand. If an agency needs to hire 500 new employees, a leader knows intuitively that this is possible, and that it will take longer than two months but less than two years. If it needs to buy new office equipment for 100 people, that’s very achievable, and will cost more than $100,000 but less than $1,000,000. If the agency needs to move into an entirely new building in two months on a budget of $5,000, that is not possible. And so on.

There is absolutely nothing that has prepared an agency principal for understanding the cost of software. There are a pair of xkcd comics that address this concept:

  • “Tasks,” by Randall Munroe
  • “Easy or Hard,” by Randall Munroe

The best corrective for this is to observe actual Agile software development teams actually developing software, by joining a series of sprint review sessions for multiple projects. That makes it possible to see what e.g. six people are capable of accomplishing within two weeks of work.

How much software costs

Grasping the cost of software is difficult in the space of government software because of the absurd levels of pricing distortion brought about by decades of procurement practices unsuited to the problem. At a state or federal level, $100 million is a normal price for the development of a load-bearing software system, and that price tag is not meaningfully decomposed into any part of that system.

Again, observing actual Agile projects will do a lot of good here. By understanding the level of effort, and connecting it to the billing rate of a vendor, it becomes possible to see that the work produced by our six-person team in two weeks cost $60,000 in billed time. After observing this for a while, it soon becomes evident what software should actually cost.

How to oversee software development

I’ve already written a guide to overseeing major software projects, albeit intended for legislatures, but the 14 listed plays largely apply to an agency principal, either for them to apply themselves (especially “Demos, not memos”) or to ensure that their staff are applying them.

But there are a few leader-specific admonitions that I want to include here.

  • No stoplight charts. Agency principals receive regular updates on major software projects that say everything is going great, with green lights all the way…right up until the update that says everything is red and the project has failed. What happened? In short, strategic misrepresentation. Nobody wanted to provide troubling news to their boss, so as the state of the project got handed up the chain, the view got rosier and rosier, until the principal was told, with every update, that things were going great. The solution to this is to eschew reports in favor of live demos of the actual work being done. If leadership requires reports, make them narrative, authored by the agency’s product owner for the software project.
  • Require that live software be deployed to production regularly. Projects will spin their wheels in isolation. This allows bad decisions to be hidden, and a lack of progress to be obscured. Leaders should insist that software improvements be incrementally delivered to end users. Continuous delivery paired with continuous user research makes it very difficult to waste much money on a software project.
  • Require weekly ship reports. At the end of each week (or each sprint), a project’s leadership should write a “ship report,” which briefly describes what was shipped since the last ship report, along with what’s coming up, what blockers are preventing the project from progressing, and how much of the budget has been spent on the project to date. For any project that requires particularly close monitoring, it helps to require the inclusion of a list of all user stories that were completed within the period in question—this makes it crystal clear what the project team has accomplished.
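To make the expectation concrete, here is a sketch of what a ship report might look like. Every project name, figure, and detail below is invented for illustration:

```
SHIP REPORT: Benefits Modernization, Sprint 23 (Mar. 6-17)

SHIPPED:   Claimants can now upload wage documents from the claim
           status page; fixed the address-validation bug that was
           blocking ~2% of new applications.
UP NEXT:   Identity-proofing integration; Spanish-language intake flow.
BLOCKERS:  Awaiting security sign-off on the document-storage service.
BUDGET:    $2.41M of $5.90M spent to date (41%).
STORIES:   #412, #418, #421, #430 completed this sprint.
```

A report this short takes minutes to write, and a principal can read a dozen of them over coffee, which is the whole point.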

This is all meant to avoid precisely this scenario:

“Deloitte presented much too rosy of a picture to us,” [Governor Gina Raimondo] said. “I sat in meetings with Deloitte and questioned them and they gave us dashboards that showed us everything was green and ready to go, and the fact of the matter was it wasn’t.”

“Raimondo Faults Vendor Deloitte For Delivering ‘Defective’ UHIP System,” The Public’s Radio, Feb. 15, 2017

It can be tremendously difficult for the principal of an agency to oversee a load-bearing software project, in no small part because leaders tend to get in the way, providing bad ideas, issuing self-important new project requirements, and measuring the wrong things. Learning how modern software is made will ensure that principals set their expectations properly and can engage in an appropriate fashion. Learning what’s possible, at what level of effort, will allow principals to understand and make demands that are reasonable. Learning how much software costs will allow principals to have reasonable expectations of costs, both in preparation for projects and when executing them. And learning how to oversee software development will draw together the prior three skills so that principals can effectively manage the projects on which the viability of their agency’s mission depends.

This approach will allow an agency principal to take back control of their agency from vendors, and to stop outsourcing their agency’s mission.

“Agile” versus “agile.”

When writing about Agile software development, I always capitalize the word. This isn’t an affectation, but instead an effort to communicate an important distinction.

The word “agile” has been à la mode for a few years now. Organizations should be “agile,” teams should be “agile,” leadership should be “agile,” employees should be “agile,” software should be “agile.” This use of the word is intended to indicate being nimble, flexible, and adaptive.

This is almost completely unrelated to “Agile” software development. Agile is a software development practice, summarized as valuing:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

There are different methodologies for implementing Agile—Scrum being the most common—but, in general, capital-A “Agile” means delivering software every two weeks, with all completed work being based on user needs that have been identified and validated through user research.

Lowercase-A “agile,” on the other hand, means none of that. It’s puffery. It means nothing.

When a government agency or a contractor says “oh, yes, we’re agile,” it’s important to find out if they mean “agile” or “Agile.” And when communicating with that audience, it’s important to make clear if you mean “Agile.” The mere capitalization of a letter isn’t the totality of how to accomplish that—it’s better to ask clear and direct questions about how they build their software—but it does help to consistently write “Agile” when you mean Agile software development and “agile” when you mean nimbleness and flexibility. Even somebody only dimly aware of Agile software development is liable to take note of the capitalization of the word and realize that something very particular is being communicated there.

Capitalizing “Agile” helps to be clear in communications. I recommend it.

The work before the work: what agencies need to do before bringing on an Agile vendor.

Government agencies often hire Agile software vendors to build software for them, but then fail to do the pre-work that would allow the vendor to be successful. Interfacing a great scrum team with a standard government IT shop is like dropping a Ferrari engine into a school bus. There’s work to be done up front before there’s any sense in bringing on a team.

18F, the federal government’s software development shop, has done a lot of work to address this problem, and much of what I know about how to do this right comes from my four years there.

I’ve seen what happens when a high-performing team is dropped into a low-performing agency. The agency spends weeks or months running the team through mandatory security trainings, getting the team their PIV cards, procuring and issuing them agency laptops, getting them accounts on the VPN, getting them access to the various servers that they’ll be using, etc. The team expected to start writing software on day one, but instead they’re left twiddling their thumbs at a collective $1,000/hour. They get bored, and after a few weeks, anybody competent gets sprung from their purgatory and put onto a project that’s actually doing stuff. When the team can finally get to work, all that remain are the people who weren’t good enough to escape, and morale is low.

Don’t do this.

Agencies need to prepare for vendors, not a couple of weeks before the vendor shows up, but months beforehand. Ideally before a solicitation is even issued. That’s because it’s a lot of work! It requires user experience research, journey mapping, process automation, coordination between agency silos, and perhaps even a prior procurement. Here are some of the specific things to do:

  • Allow the vendor to work entirely in the cloud. To the greatest extent possible, make it possible for the vendor to never touch your environment. That means that your agency needs to contract with a cloud vendor (but you did this long ago because it’s 2023, right?), ideally Microsoft Azure or Amazon Web Services, since those are the most widely used. That way the vendor can replicate your environment within their own cloud environment, and never have to touch your cloud environment, and certainly not your agency’s VPN and warren of physical servers. This eliminates a big part of the onboarding process, with the happy side benefits of significantly reducing stress among the vendor team (they can’t break your environment if they don’t have access to it) and reducing the cost of the vendor’s professional liability insurance policy.
  • Don’t make the vendor use GFE. Government-furnished equipment is awful, the cheapest stuff that Dell or HP makes, bought in lots of 1,000. Developers are disproportionately likely to be Mac users, but even the Windows users don’t want to use the laptops you got from the lowest bidder. They want 32 GB of memory so they can run the software entirely in Docker containers; your laptops have 8 GB of memory because they’re optimized for people using Word and Outlook. It’s like hiring a great woodworker to build stuff for you, but forcing them to use your collection of Harbor Freight tools. If they’re working on agency equipment, they have to jump through a bunch of hoops like acceptable-use policies and mandatory trainings, and you have to actually procure the hardware, send it to them, and get it back when they’re done. No Agile team wants to use your lousy equipment, and you don’t want to deal with issuing it, so just don’t.
  • Put together a journey map of the vendor onboarding process. Once you’ve ensured that the vendor team will work in the cloud on their own equipment, step through the process of what’s left in bringing on a vendor. Talk to employees of vendors who have recently gone through it, or are currently going through it. Map out every step. Now, worst case, you have a document you can share with the new vendor to understand what’s ahead of them and where they are in that process. But, better: optimize the hell out of that process, removing anything that you can, simplifying anything that you can. Every hour that you shave off this process will save $1,000.
  • Study and document the process when the vendor starts. The first time you bring on an Agile vendor, the process you’ve put together won’t yet be perfect. It will be better than before, but there will still be frustrations. Discuss this with the team during their first days on the project, ask that it be a subject of their first sprint retrospective, and turn that feedback into specific changes for the next time that a new person joins the scrum team.
  • Designate a product owner. Agile projects’ success is heavily dependent on the product owner’s ability to do their job well. I’ve written about this elsewhere, but the short version is that the agency needs an empowered product owner to start to take an ownership role over this project before the vendor shows up. This person will be the vendor team’s primary interface with the agency, the fixer for onboarding problems, the smiling face who will greet them at 9 AM on day one of sprint one. Have them in place many weeks before the vendor starts.
  • Create a path to production. Two weeks after the vendor starts, they’ll have code that’s ready to go to production. If your agency has an authority to operate (ATO) process that requires completing a 250-page system security plan (SSP) as a Word file to get anything into production, you’re going to have a bad time. You need to pilot a new ATO process that can move at the speed of software development, or else your vendor team will be unable to get their software in front of actual end users. And Agile is impossible without that crucial feedback loop. This problem is really hard. Don’t be fooled by the fact that this is a single bullet point in a list of six items—it’s of greater difficulty and complexity than the others. This is a 6–12-month process for most agencies, and probably longer for federal agencies. Consider making the immediate project a pilot, so that instead of proposing that your CIO overhaul the ATO process entirely, you’re simply proposing that an experiment be run to see if a continuous ATO process could work.

You don’t want to do all of the hard work of procuring a top-notch scrum team only to send them face-first into a brick wall of bureaucracy. You’ll lose all of the momentum, lose the best members of the team, waste $40,000/week, and when the project finally starts it will be with a demoralized team. By doing the right prep work, you can leave that team free to do what they do best—develop software—and get the best possible performance out of the vendor.

“Customized COTS” is the worst of both.

Some vendors who sell commercial off-the-shelf (COTS) software to government bristle when their software is described as such. They want you to know that it’s not COTS: it’s “customized COTS” or “modifiable COTS.” That is intended to reassure agencies that their software is flexible, that it can meet the agency’s needs. But “customized COTS” is actually much worse than COTS.

Vendors are generally eager to have their offerings fall under the umbrella of “COTS,” because both state and federal government heavily favor buying existing software over having custom software built. This is good and sensible. It would be foolish to have a custom word processor built when Microsoft Word and Google Docs exist and are available for licensing at a significantly lower cost than custom development.

But the COTS label is a millstone around the neck for vendors of software that drives the operations of agencies operating under highly localized regulatory regimes. Take unemployment insurance. Every state labor agency operates under a dizzying array of state and federal laws and regulations about who can get coverage, for how long, for how much money, under what circumstances, on what timeline, through what qualification processes. And those laws and regulations change continuously. I don’t want to say that it’s impossible to build a COTS tool that could handle all of those variations, but I will say that it would be enormously difficult, and would require some very specific architectural decisions that I know that none of the vendors have made. In practice, every new state that becomes a customer for such a system would introduce vast new complexities into the code base, requiring that the “COTS” product actually be forked, with customizations made for that new state. That brings its own complexities, because any changes across the code base need to be grafted manually into each forked copy. This is no longer “COTS,” by any reasonable definition, but is instead what 18F calls “UMOTS,” or “Unrecognizably Modified Off the Shelf Software.”

Customized COTS is just custom software that the agency doesn’t own, the equivalent of paying for extensive renovation to a home that you are renting. If a state is going to use COTS, they should do so because they are happy with the software as it exists, and do not require modifications. Nobody would buy Microsoft Word and then demand that Microsoft add an essential feature that is missing. That violates the purpose of COTS, which is that the vendor has made those decisions for you. If you don’t like their decisions, don’t buy their product.

Sean Boots has a good test for the legitimacy of COTS: “If you can get a software solution to successfully meet your needs in one day, it’s a real COTS product.” I propose a corollary for testing the legitimacy of customized COTS: Will all customers receive identical software updates? If yes, it’s probably COTS. If no, it’s customized COTS, and you’re paying to renovate a house you are renting.

COTS can be great. Custom software can be great. Customized COTS is a tar pit, a way to pay for extensive renovations to software that you do not own, and now feel that you cannot leave, because the sunk cost fallacy is real. Don’t license customized COTS.

Why governors put this over here, with the rest of the fire.

It happens at least once in every gubernatorial administration: presented with a disastrous, multi-year, failing software project that’s preventing an agency from accomplishing its mission, the governor awards a big contract to a big vendor, maybe even the vendor that’s the source of the problem. Some major culprits are unemployment insurance, enterprise resource planning, Medicaid, child welfare, and payroll—all load-bearing systems for their agencies. Solving these failures by signing another big contract nearly always makes things worse. So why do governors do this?

Serving as governor means being presented with a never-ending stream of decisions to be made, all of which have been vetted through several layers of people. Those decisions are generally teed up to include options, in the form of a right option and a wrong option, with the governor’s advisors fervently hoping that their principal will simply make the “right” choice. There is rarely time for the governor to go deep in any area. A state is a stage full of spinning plates; the governor’s job is to go where directed and give each plate a quick push, and to repeat this many times each day, for 4–8 years.

Decision-making at this level is all about triaging. The easiest option is the preferred option. It’s better to dispose of a problem permanently than temporarily, better for a longer time than a shorter time. The top priority is to get things off the governor’s desk.

An animated GIF of a man cautiously carrying a flaming fire extinguisher, saying “I’ll just put this over here, with the rest of the fire.”
Maurice Moss, in “The IT Crowd,” triaging.

This imperative is embodied in the chief of staff, who spends the bulk of their time blocking for the governor, ensuring that questions only come to the governor when they are ripe for a decision, and that their principal has enough information to be able to make that decision.

So what should a governor do when faced with a failed software system that is preventing an agency from delivering on its mission? The correct response is to have the state take control from the vendor, because no vendor will ever care about the state’s mission as much as the state. They need to move to shorter contract periods, an Agile delivery cadence, agency product ownership, and root all work in user research. But you can’t tell that to a governor. Literally, you can’t—the chief of staff will hurl themselves in front of your body to stop you. Because what you’re saying to that governor is “what if, instead of making a single decision, we replaced this with a large amount of time-consuming work and a series of decisions over the course of months or years?” It’s completely contrary to the entire process used to operate governors’ offices.

What the governor is hearing from vendors, on the other hand, is very compelling. The incumbent vendor and their competitors are all saying the same thing: write us a big check and we’ll make all of your problems go away. Will they actually make all of those problems go away? Absolutely not. Will throwing more money on the money fire make the fire go away? No, that’s not how fire works. But this message from the vendor is exactly what the governor’s office is optimized for, and exactly what the agency secretary’s office is optimized for. They cannot escape the siren song of the vendors, and nobody warned them about the need for mast-lashing and beeswax.

Even if a governor knew that the problem would return with full fury in 6–12 months, with accompanying admonishing headlines and stern editorials, it’s entirely possible that they would still elect to award that big contract, simply because it makes the problem go away for 6–12 months. A lot can happen in 6–12 months! Maybe something more important will be in the news cycle then. Maybe they’ll be out of office. Maybe they’ll be less busy and will have time to really buckle down and pay attention to this UI / PFML / MMIS / childcare / whatever situation. But all of that is a problem for Future Them. Current Them has other stuff going on.

In short, governors make the worst decision because, in that moment, it feels like the safest decision, and it may even be the safest decision, although only in a political sense. Although the outcome is generally terrible, governors are behaving rationally. Without changing the incentives, they’ll keep throwing money on the money fires.

An Excel error caused a $202 million state budget shortfall.

On Monday, the Richmond Times-Dispatch broke the story about a thorny budgeting problem for Gov. Glenn Youngkin that illustrates how bad technical practices can lead to bad public policy outcomes:

Local school divisions in Virginia just learned they will receive $201 million less in state aid than they expected — including $58 million less for the current K-12 school year that is almost three-quarters done.

The Virginia Department of Education has acknowledged the mistake in calculating state basic aid for K-12 school divisions after the General Assembly adopted a two-year budget and Gov. Glenn Youngkin signed it last June. The error failed to reflect a provision to hold localities harmless from the elimination of state’s portion of the sales tax on groceries as part of a tax cut package pushed by Youngkin and his predecessor, Gov. Ralph Northam.

The Washington Post provided more specifics about the source (or perhaps manifestation) of the mistake:

The problem originated with an online tool that allows school districts to see how much funding they should expect from the state, a number that takes into account the district’s number of students, how much it receives in property tax revenue and other factors.

The tool has been up since June 2022, allowing districts to build their budgets around the estimations. But last week, someone — the state would not say who — realized that the numbers were wrong. The miscalculation occurred after the state failed to account for funding changes connected to the elimination of the state’s tax on groceries, which took effect Jan. 1.

It’s not clear whether this was a conceptual problem (a failure to realize that it was necessary to account for funding changes) or a technical problem (an error in implementing that math). If the latter, this is a high-impact error arising from a software failure.

(I’d be remiss if I didn’t point out that some reasonable people are suspicious of this explanation, noting that Youngkin is no supporter of public education, and that it’s convenient that this mistake aligns with his policy preferences. I think it’s much more likely to be a mistake, one that a governor would have no knowledge of or insight into. But that explanation is awkward for Youngkin, who has presented himself as a hard-nosed budget wonk whose private-sector financial experience translates to fiscal competence. And yet, this.)

It’s instructive to look at the “online tool” in question, which turns out to be an Excel file. (Here’s a Wayback Machine link to the Excel file, because I expect that the problematic one will disappear. I’ve also put it in Google Sheets.) It has 38 worksheets, with a heterogeneous and puzzling series of titles like “Enroll. & At-Risk,” “FINAL SOURCE DATA,” “March 31, 2021 ADM,” “ASRFIN Queries,” and “Bedford County-City.” The message on the first worksheet would seem to indicate that the state published this without removing all of the placeholder text, which raises the question of what else might be unreviewed or incomplete.

Many of these worksheets are dizzying, some hundreds of columns wide, most containing unexplained acronyms like “DABS,” “RLE,” “PPAs,” “ADJ ADM.” I don’t doubt that these make a lot of sense to state and local budget officials, but I have none of the subject-matter expertise to make heads or tails of them.

Excel is a fine way to build lightweight calculation software—you can build some pretty sophisticated systems in Excel and Google Sheets—but it shouldn’t serve as load-bearing infrastructure. Excel files can’t be diffed or version-controlled using standard revision control systems (e.g., Git). It’s impractical to perform automated tests on Excel files as part of a continuous integration process. Excel files become unwieldy as the number of worksheets increases—I can’t say where the tipping point is, but it’s for sure lower than 38. For a tool as critical as this one, it’s important to be able to at least perform some smoke tests, so you can check that providing particular sets of financial assumptions returns the correct numbers, and those tests should be run automatically every time that the tool is updated.
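To make the smoke-test idea concrete, here is a minimal sketch in Python. The aid formula below is entirely hypothetical—it is not Virginia’s actual basic-aid calculation—but the pattern applies: express the calculation as an ordinary function, then automatically assert known-good outputs, including a guard against exactly the class of omission at issue here.

```python
# Minimal smoke-test sketch for a budget calculation. The formula is
# HYPOTHETICAL (not Virginia's actual basic-aid math); the point is the
# pattern: encode the calculation, then assert hand-checked answers.

def basic_aid(enrollment: int, per_pupil_amount: float,
              local_share: float, grocery_tax_offset: float) -> float:
    """State aid: the non-local share of per-pupil funding, plus a
    hold-harmless offset for the eliminated grocery tax."""
    state_share = enrollment * per_pupil_amount * (1 - local_share)
    return state_share + grocery_tax_offset

# Known-good case, worked out by hand: 1,000 students at $6,000 each,
# a 50% local share, and a $250,000 grocery-tax hold-harmless payment.
assert basic_aid(1000, 6000.0, 0.5, 250_000.0) == 3_250_000.0

# Guard against the class of error in this story: omitting the
# hold-harmless offset must change the answer.
assert basic_aid(1000, 6000.0, 0.5, 0.0) != basic_aid(1000, 6000.0, 0.5, 250_000.0)

print("smoke tests passed")
```

In a continuous-integration setup, assertions like these would run on every change to the tool, so an omitted term fails loudly in front of a developer instead of silently in front of 130 school divisions.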

No doubt this started as some small, simple file, many years ago, put together by somebody at the Department of Education for internal purposes, shared informally with some municipalities, but gradually shared more broadly and standardized on. And then it grew and grew, without being given support resources commensurate with its newfound importance. Surely most other state agencies are vulnerable to similar failures, with similar impacts, due to the same problem.

Software failures causing public policy failures are a defining feature of our era. In that sense, this is a normal failure, although I’m not familiar with another instance of an Excel error in a government budgeting document leading to such a large financial problem. But history does give us one example to draw on: Fidelity Investments’ 1994 omission of a minus sign (-) from a spreadsheet, which turned their $1.3 billion loss into a $1.3 billion profit. They dutifully notified the three million investors in the Magellan mutual fund that they’d receive a dividend of $4.32 per share…only to have to notify all of them that they’d actually receive nothing, after outside auditors caught the mistake.

Spreadsheets are great tools. But some applications require more rigor, and we can see here that mutual funds and state budgeting are two strong examples.