Beyond the stoplight chart: How to perform Agile oversight at scale.

Four years ago I co-wrote the federal government’s guide to budgeting and oversight of major software projects. In the intervening years, the advice contained in that document has been put into practice in a number of state legislatures’ and governors’ offices, to the point where a new problem has appeared: How do you oversee software projects at scale?

So you’ve left behind the fiction of stoplight charts. You’re getting demos instead of memos. Teams are ostensibly delivering software incrementally, based on constant user research. You have two, five, ten, thirty projects working in this way. But, oh no…you have thirty projects working in this way. Now your calendar is just demos of functioning software, all day, week after week. This is a great problem to have, but it’s a problem just the same.

I have worked on solving this problem, and I’ve applied parts of a solution, but my solution is by no means a tested theory of change. But I think it’s important to work out loud, so what follows is the four-stage process that I recommend to anybody facing the problem of scaling up oversight of major software projects. (I wrote a bit about this last May, though targeted at agency heads, in “How an agency principal should oversee a major custom software project.” That’s meant for the actual head of an agency; this guide, on the other hand, is meant for their deputies, advisors, etc., for legislative staff in similar positions, and for state IT agencies and procurement shops.)

Here are the four steps: 1. Get access to the backlog. 2. Require sprintly ship notices sent to leadership. 3. Engage with the ship notices. 4. Provide direct support. Let’s take each of these in turn.

1. Get Access to the Backlog

Any Agile team practicing user-centered software development is going to maintain a backlog of user stories, in GitHub Issues, Jira, or a similar tool. (And if they are not, that is the biggest, reddest of red flags.) Tell the team that they need to provide you with access to that backlog, so that you can monitor their work. This is the single source of truth for any software project—absent some serious deception, looking at it will reveal exactly how much work they’re getting done and how they are doing it. If the team is reluctant or claims a technological obstacle, tell them that they have 30 days to provide access; they will spend that time frantically getting their act together, like somebody tidying up their home before the house-cleaner arrives.

When you have access to the backlog, periodically review it to get a sense of what they’re doing, why they’re doing it, how it’s going, and the extent to which that work is rooted in user needs uncovered through one-on-one user research. Look at the software they’re developing to see those user stories realized in staging and production. It is important to fight the urge to extract metrics from the backlog. You might think “I can tally the story points completed per sprint to measure productivity,” or “I can measure the time allocation per activity.” Down this path lies madness. A scrum team is a small, self-organizing, cross-functional group that incrementally builds software to address user needs. The one external metric that they should be held to is “are they meeting user needs?” Turning their activities into metrics in the name of transparency will wreck that work product.

2. Require Sprintly Ship Notices Sent to Leadership

A reasonable request to make of a team is that they send ship notices at the end of each sprint—a summary of what they’ve been up to for the past two weeks. A ship notice should also include what’s planned for the next sprint, a list of blockers that they need help clearing, and a one-sentence budget update that indicates how much has been spent against what total dollar value, with a forecast end date based on that burn rate. To report what they’ve been doing for the past two weeks, have them simply paste in a list of all user stories that they completed, with each one linked to that user story in the backlog, for traceability. These only take a few minutes to prepare, since it’s mostly copying and pasting from their backlog, and a cadence of one per sprint is sufficient for leadership. This email should be CCed to the most important people who need to receive these updates; ship notices for a major agency initiative should be CCed to the agency principal.
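The burn-rate forecast in that budget sentence can be entirely mechanical. Here is a minimal sketch (the function name and all dollar figures are invented for illustration) of a deliberately naive straight-line projection: assume spending continues at the average rate observed so far, and compute when the total runs out.

```python
from datetime import date, timedelta

def forecast_end_date(start: date, as_of: date, spent: float, total: float) -> date:
    """Project when funds run out, assuming spending continues at the
    average daily rate observed between `start` and `as_of`."""
    days_elapsed = (as_of - start).days
    burn_per_day = spent / days_elapsed          # average daily spend to date
    days_remaining = (total - spent) / burn_per_day
    return as_of + timedelta(days=round(days_remaining))

# e.g. $1.2M spent of a $4M total after 180 days of work
print(forecast_end_date(date(2024, 1, 1), date(2024, 6, 29), 1_200_000, 4_000_000))
```

A straight-line forecast is crude—spending is rarely uniform—but for a one-sentence budget update, crude and consistent beats precise and unreported.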

3. Engage with the Ship Notices

It is not enough to simply receive the ship notices. For teams, that will eventually feel like they’re CCing /dev/null. It is important to actually read every ship notice, understand every ship notice and, when appropriate, engage. When you see that a particularly valuable or interesting user story has shipped, you should try out the new feature on the site and respond with praise for the team. If you see that there’s a blocker that you could or should address for them, do so. If you see that a team’s velocity has slowed down over the past few sprints, you should check up to see if they need support (not to complain, but to help). Remember that your emailed responses will be shared with the team—seeing that leadership is engaged, interested, and supportive can make a big difference to their morale. The frequency with which you engage with ship notices should be proportional to the level of oversight required for the project.

4. Provide Direct Support

Sometimes projects will need help. “Help” does not mean haranguing them, insisting that they work faster, or micromanaging them. Many projects will be performing Agile, but not actually practicing Agile. Sprintly ship notices will reveal this to be the case, but they won’t teach the team how to work correctly. Demanding that they “be Agile” is not going to do it.

The solution is to teach them how to work, and that’s best done in the form of an internal digital service team. That can be a very small team, as few as four people: a user researcher, a software developer, an Agile coach, and an Agile project manager. It’s possible that one person could check more than one of those boxes. If your agency outsources a lot of software development, then a contracting officer should be in the mix, too. That team needs to have experience with consulting, because their task is ticklish.

If the software development team in question is composed of agency employees, the digital service team can work directly with them to help them improve their processes. If the software development is outsourced, the digital service team can help agency employees learn how to let the vendor team do their best work, or demonstrate that they need to replace that under-performing vendor with a better one (and then help them do so).

It is not enough to tell them “do better,” because they may not know what “better” looks like, or how to get there within the realities of your agency. You need to give them the capacity to work better.


The goal here is to ensure that each project has spun up the flywheel of using constant user research to inform the development of in-production software that’s providing benefits to users.

This is all about scaling up “demos, not memos,” but it’s no replacement for actual demos. It’s important to attend sprint reviews as often as feasible, especially for high-risk and high-value projects, to ensure that value is being delivered to end users.

At the risk of straying into management consulting, I must caution that there is such a thing as too much oversight. Is the team consistently delivering high-value improvements to end users who are measurably, objectively happy? Then leave them alone! If they are falling short of that goal, you will determine that through getting access to their backlog, reading their sprintly ship reports, and engaging with those ship reports, and the way to address the problem is by providing them with direct support from an experienced, cross-functional digital services team. Just because you are qualified to observe that there is a problem does not mean that you are qualified to fix the problem. Only an experienced Agile coach can do that.

Again, this four-part approach is by no means a tested theory of change, but instead a series of things that I and/or my colleagues have tried at least once and had success with. I’m working every day on testing out these and other approaches to Agile oversight at scale, as I know many other people are doing as well. We’re on a path to collectively arrive at a theory of change over the next couple of years, and exercising this four-step process will help get us there.

Preventing lousy vendors from bidding on your custom software project.

Baffling illustration courtesy of DALL-E attempting to illustrate the introductory paragraph.

Government agencies that want different results from their software procurements recognize that they need to get bids from new vendors to make that happen. A lot of work needs to go into that (market research, dividing up projects differently, simplifying terms & conditions, circulating solicitations more widely, etc.), but three changes stand out above all others for their effectiveness in warding off incompetent vendors while attracting new, well-qualified vendors: owning copyright, publishing the software as open source, and issuing smaller contracts. Let’s review each of these in turn.

Own Copyright

Some of the major vendors in the custom software space use government contracts as a source of R&D funding to develop software that they retain ownership of, and then sell to other agencies. They create it as custom software for the first agency, and then license it as “commercial off-the-shelf” (COTS) software to subsequent customers. This is wasteful spending by government, which is left licensing near-identical software that they should have retained ownership over in the first place. An enormous amount of work is required on the part of agencies to support vendors’ development services, and there is no reason whatsoever that the benefit should accrue to the vendor. (Sometimes the private sector should retain copyright for custom software! This is what SBIR funding is for. But if it’s not SBIR-funded, the vendor should not retain any ownership over the rights.)

This predatory contracting model is not something that agencies should participate in. A solicitation that makes crystal clear that copyright will be held by the agency will repel these vendors, while attracting vendors who have no interest in owning these work products.

Publish Software as Open Source

Once the agency owns the software, they should release it under an open source license; or, if it’s a federal agency, it should be released into the public domain or, better, under a Creative Commons Zero license (public domain, but with international applicability). I recently wrote at length about the value of agencies publishing their software as open source, so I won’t belabor this.

Issue Smaller Solicitations

Agencies often turn a monolithic need into a monolithic procurement. “The legislature requires us to stand up a new benefits system” turns into a solicitation to create a new benefits system. This is extraordinarily risky, with vanishingly slim odds of success. There is no vendor that’s able to write software, host software, operate a server farm, run a help desk, run a training program, run a call center, and manage payments, and there are only a literal handful who will bid on such a contract (intending to subcontract most or all of the actual work). Issuing solicitations like this guarantees that you’re only going to receive bids from the usual suspects, and ensures that you’ll get the exact same results that you’ve always gotten.

Competent software development vendors don’t want to run your help desk or train your employees—they just want to write software. They’re not going to bid on $100 million projects, because the time and effort required to get a contract like that is enormous, making it too big of a risk to bother with. (Also, for a small vendor, actually getting a contract that big would pose an existential risk to them.)

The solution is to break up the contract into smaller pieces. Have one contract for software development, one for help desk services, one for hosting, etc. Is that more work for the contracting shop? Possibly, certainly the first time they take this approach. But then one failed contract won’t sink your whole effort, you’ll get vendors who are actually good at the thing you’re hiring them for, and you’ll get bids from vendors who aren’t the usual suspects.


There is a laundry list of other things to do differently to attract new vendors (and fend off bad ones), but the the highest-impact ones are the trinity of owning copyright, publishing the software as open source, and issuing smaller contracts. This combination will make a big difference.

Publishing your agency’s software as open source puts you at a major advantage.

In the United States, nearly all agencies and departments, at all levels of government, rely on open source software. Some do so as an explicit policy decision, others merely as a matter of practice. Objections around software quality and security collapsed long ago in the face of reality. But publishing open source software is a very different situation.

When agencies procure custom software, but keep the source closed, they are putting themselves at a significant disadvantage. There are some enormously compelling arguments—and some surprising arguments—in favor of agencies preemptively publishing source code.

Works of government may be public by default

Under the Copyright Act of 1976, “Copyright protection under this title is not available for any…work prepared by an officer or employee of the United States Government as part of that person’s official duties.” Although there are exceptions, it is broadly true that software developed by the federal government is in the public domain. Custom software produced for the federal government by vendors is often public domain, but not always. If the copyright is assigned to government—as it really, really should be—then it is generally public domain. (This is governed by FAR Subpart 27.4—Rights in Data and Copyrights. See also the prescribed clause at FAR 52.227-14—Rights in Data—General: “Government shall have unlimited rights in…[computer software] first produced in the performance of this contract.”)

At a state level, things are much messier. There is no consistency between states, and sometimes no consistency between agencies within a state. Custom software produced by or for state agencies might be public, or it might not be—you’d need to consult a lawyer to be sure.

Open-records laws may require you to turn source code over to requestors

Federal and state open-records laws have often been used to compel agencies to provide custom software’s source code to anybody who asks. It’s broadly true, at a state and federal level, that there are exemptions for releasing records that are otherwise prohibited by law, and every government has a law prohibiting the release of information that could undermine computer security. But if that exemption doesn’t apply, then agencies need to turn over the source code, which the recipient is then free to make public. Again, there are huge differences between states’ freedom of information laws, so research is necessary to know how those apply to source code in a given state.

Open source is more secure than closed-source software

People who aren’t software developers often assume that software is more secure if the source code is kept secret. If that were true, source code would be inherently exempt from open-records laws, its copyright held closely by government. Source code would be treated as a state secret, stored and transferred like nuclear waste. But that’s not how we treat software, because we know that premise isn’t true.

A strong argument for open source’s security comes from the U.S. Department of Defense’s excellent Open Source Software FAQ, specifically their answer to “Doesn’t hiding source code automatically make software more secure?” which opens with, simply, “No.” The document goes into a great deal of detail, but suffice it to say, no less of an authority than the nation’s military thinks that it’s important for government to publish source code openly.

Open source supports agencies’ need to tell the public that they are being transparent in their work

It can be important for many agencies to help the public have confidence in their operations, and that’s especially true when they issue decisions that are made by or augmented by software. Judges have often agreed with defendants’ demands that breathalyzer software source code be provided so that the defendant has a chance to prove that it contains bugs that led to an erroneous reading. Public benefits systems increasingly automate qualification decisions, and advocates are putting a corresponding amount of pressure on agencies to allow them to verify that those decisions are consistent with relevant laws and regulations. For types of software likely to be subject to such scrutiny, publishing the source code preemptively shows that the agency has nothing to hide.

Open source prevents vendors from including copyright poison pills

When government procures custom software from vendors, with government owning the copyright, unscrupulous vendors may see an opening to make some money on the back end, by including software that they already wrote and hold copyright on. They can deliver software that is 99% government-owned, but has interwoven into it a critical layer of software that’s owned by the vendor, for which they can demand licensing fees for as long as the software is in use. One way to avoid this problem is to contractually require that all software delivered by the vendor be published under an open source license, absent written permission providing exceptions.

For a high-profile example of a copyright poison pill, see Georgia v. Public.Resource.Org, the Supreme Court case that resulted from LexisNexis weaving copyrighted text throughout the Code of Georgia to ensure that they’d receive licensing fees from anybody attempting to reproduce the state’s laws.

Open source sets up a powerful incentive to get the best work from the best vendors

Pity the software developer working at a standard-issue government consulting firm building a software product on contract with an agency. Odds are low that their code will ever make it into production, even lower that anybody will ever use it for anything at all, and basically nonexistent that anybody else will ever see a single line of code that they wrote. It’s a miserable way to make a living. What sort of work product would you deliver under those circumstances?

Incentives change substantially if the product is advertised as open source from the time that the first RFP is published to create it. Lousy vendors usually know that their work is bad. They don’t want to produce software for government that will be public, for potential future clients to see. And good vendors know that their work is good. They’re proud to have their work be public, to have something to show potential future clients. They want to put high-performing employees on those projects, who will make the vendor look good. And if the source code is on GitHub, those employees know that their daily work is being promoted to their professional network on GitHub, and available for review by potential future employers. They’re no longer toiling in obscurity, but instead working very much in public, doing their best work on behalf of an employer who wants to give them the resources to do their best work.

You can filter out lousy vendors and elevate good vendors by declaring a product to be open source from the outset.

Open source reduces risk for subsequent vendors, which reduces the dollar value of their bids

Software is never done. The vendor who gets the contract to build custom software may well not be the vendor who gets the contract to maintain it, or the vendor who gets the contract to add new functionality years later. There’s a lot of risk for vendors working on existing software if they can’t inspect it first. Is the code garbage? Is there documentation? Is it well linted? Is there good test coverage? Can developers run it in Docker on their laptops?

If vendors receive an RFP to improve existing software, and it’s trivial for them to look at the source code, and what they see is good, that reduces their risk, which in turn reduces their bid. Open source software is cheaper software.


There are vanishingly few downsides associated with government publishing source code publicly, but some substantial upsides. Agencies should default to working in the open, and reap the benefits of reduced cost, increased trust, and higher-quality results.

Budgeting for software projects in “scrum team years.”

It’s often said that if you want to know how long it will take to complete an Agile software project, then you should get started on building it. The theory is that once you have your backlog built out, your stories sized, and you know your velocity thanks to a few months of work, you can get a rough idea of how long it’ll take to complete the whole thing. (Henrik Kniberg explains this in his “Agile Product Ownership in a Nutshell” video.) And that might be fine in some scenarios, but in organizational contexts, it’s often impossible to get started without funding, and it’s impossible to get funding without having a defensible estimate of how long a project will take…which is difficult to do without getting started. What’s to be done?

There are a lot of bad ways to estimate software project costs, and they fall into two camps: qualitative estimates (“I did something like this once and it took about 10,000 hours, so that’s how long this will take”) and quantitative estimates (“this is similar to these three other projects, which have an average of 600,000 lines of code, and historical data shows that a line of code takes 1 minute to write, so this will require 10,000 hours”).

In government, in practice, neither of these is used. Instead, an agency publishes a request for information, vendors provide ballpark figures that are rarely rooted in any defensible math, the agency then makes a request for funding based on those responses (e.g. by tossing out the high and low numbers and averaging the remainder), funding is awarded, the agency publishes a request for proposals, and the bids come back at prices real close to that awarded funding. At any step of the way, if anybody asks why the cost is, say, $20 million…well, that’s not actually explainable. There is no internal logic that underlies this price tag. The result has been 20 years of spiraling costs for custom software in government, as prices have gradually gone up because they are tethered to nothing but the amount of money that vendors say it’ll cost, and vendors have every incentive to provide a big number.

There is a better way: “scrum team years.”

Federal labor data shows us that the blended hourly rate for each member of a scrum team averages about $125, or about $235,000 per year (at 1,880 hours per year). A scrum team, therefore, will run you $1–2 million per year, depending on whether it’s closer to four members or nine.

When procuring a major custom software project, you can interrogate the reasonableness of the price by breaking it down into scrum team years. Is the bid for $20 million? Then you should be getting between 10 and 20 scrum team years—perhaps as 5 scrum teams working for 4 years, perhaps as 5 scrum teams working for 2 years, or any number of other mathematically plausible variants. Experienced software developers can compare the complexity of a project to the number of scrum team years and have a sense as to whether the price makes sense.
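The arithmetic behind this sanity check is simple enough to sketch. A minimal version, using the ~$235,000-per-member annual cost cited above (the function name is my own, and the team-size range follows the four-to-nine-member scrum convention):

```python
# Assumed fully loaded annual cost per scrum-team member, from the
# ~$125/hr blended rate x 1,880 hours/year cited above.
ANNUAL_COST_PER_MEMBER = 235_000

def scrum_team_years(bid: float, team_size_range=(4, 9)) -> tuple:
    """Convert a bid price into a plausible range of scrum-team years.
    A larger team costs more per year, so it yields fewer team-years."""
    small, large = team_size_range
    most = bid / (small * ANNUAL_COST_PER_MEMBER)    # cheap teams -> more team-years
    fewest = bid / (large * ANNUAL_COST_PER_MEMBER)  # expensive teams -> fewer
    return fewest, most

lo, hi = scrum_team_years(20_000_000)
print(f"a $20M bid should buy roughly {lo:.0f}-{hi:.0f} scrum-team years")
```

The exact endpoints shift with the labor-rate assumption; the point is not precision but having a unit—team-years—that a non-specialist can interrogate.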

This works at an agency level, this works at a procurement level, this works at a budgeting level. It allows people who lack deep expertise in software development (which is to say nearly everybody involved in the entire budgeting and procurement process) to have some basic unit of value to compare and debate. (Does this project really require 500 people working for 5 years? What could we get from 10 people working for 6 months? Wait, we’re only getting 10 scrum-team years but we’re paying $50 million? And so on.)

It’s even possible to interrogate the price tag more deeply, if it’s arrived at thoughtfully. In the same way that a scrum team will generally estimate story sizes (“pointing stories”) before pulling them into a sprint, it’s also possible to use experience—some of that qualitative analysis—to make estimates at much larger scales. This will suffer from the same problems that plague standard estimation methods, but for the purpose of establishing a rough order of magnitude, when used by people with significant experience executing comparable projects, it’s an internally-coherent, externally-validatable approach. Reasonable minds can disagree as to whether creating (say) a basic account-creation system might require more or less than one month of work by a scrum team, but breaking down a project into a dozen or so such units and recording the estimated effort level for each unit allows those minds to disagree meaningfully and productively.
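A unit-by-unit decomposition of that sort can be sketched in a few lines. Everything here is invented for illustration—the units, the effort numbers, and the $1–2M-per-team-year cost range from above:

```python
# Hypothetical rough-order-of-magnitude estimate: break the project into
# coarse units and record each one's estimated scrum-team months.
estimates = {
    "account creation & login": 1,
    "application intake forms": 3,
    "eligibility rules engine": 6,
    "case-worker review queue": 4,
    "notifications (email/SMS)": 2,
    "reporting dashboard": 2,
}

team_months = sum(estimates.values())               # total scrum-team months
team_years = team_months / 12                       # convert to scrum-team years
cost_range = (team_years * 1e6, team_years * 2e6)   # at $1-2M per team-year
print(f"{team_months} team-months = {team_years:.1f} team-years, "
      f"~${cost_range[0]/1e6:.1f}M-${cost_range[1]/1e6:.1f}M")
```

Each line item is something reasonable minds can argue about individually (“is a rules engine really only six team-months?”), which is exactly the meaningful, productive disagreement described above.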

Scrum-team years are a good estimation tool for aligning program teams, budgeting, procurement, and oversight, giving everybody a common currency of understanding.

How an agency principal should oversee a major custom software project.

The success of many government agencies now hinges on their ability to successfully execute large custom software projects. And yet as a rule, the principals of those agencies lack the ability to ensure that those projects will succeed, or even to oversee them meaningfully. As a result, they have lost the ability to ensure that their agency can achieve its mission.

The Problem

Agencies are pretty specialized—in federal government, their key needs are unique, and in state government they’re one of just 50 (or as many as 56, depending on how you count) agencies with those needs. The truly generic needs can be addressed via commercial off-the-shelf software (COTS), but the mission-unique stuff must be met by what I call load-bearing software, which is inherently custom.

Load-bearing software became a thing in federal government midway through the last century. The ur-example of this is the IRS’s Individual Master File, their core computing system, written in COBOL and IBM System/360 assembly, which debuted in 1960. Load-bearing software became an increasingly common need for government agencies in the 1980s and 1990s, with that software almost entirely internal-facing. That changed in the 2000s and 2010s, as the public gradually came to expect internet-intermediated interactions with agencies, especially for application processes. In 2020, Covid forced agencies at all levels of government to move service delivery online, and three years later there is no sign of that shift receding.

If a state unemployment agency’s UI system doesn’t work, in what sense are they a UI agency? If a state’s EBT system goes down, in what sense do they provide SNAP benefits? If the IRS’s Individual Master File crashes, in what sense are they a taxation agency?

Load-bearing software must work for agencies to achieve their missions. And yet, under the standard outsourcing paradigm, agencies outsource every aspect of the construction, maintenance, enhancement, support, and hosting of this software. In doing so, they outsource their mission. This is a terrifically dangerous practice.

When an agency principal lacks the knowledge or even interest to understand and control these software projects, they are handing their control of the agency to a consulting firm’s project manager. No leader wants to do that.

The Solution

In short, agency leaders need to give a damn about technology procurement, budgeting, oversight, and implementation. Load-bearing software is not a detail—it’s the whole ballgame.

There are four things that principals need to learn if they’re to control their agency’s ability to achieve its mission:

  1. How modern software is made
  2. What’s possible, at what level of effort
  3. How much software costs
  4. How to oversee software development

Let’s review each of these.

How modern software is made

To know how software gets built today, there are six core concepts that agency leaders need to grasp:

  1. User-centered design
  2. Agile software development
  3. Product ownership
  4. DevOps
  5. Building with loosely coupled parts
  6. Modular contracting

There’s a short overview of each of these in GSA’s “State Software Budgeting Handbook,” which I co-wrote in 2019, so I won’t re-explain them here. It’s not enough for agency principals to read a paragraph about each of these, though. With about an hour of training in each of these subjects, agency leaders can build a good base of knowledge about how projects are being executed—or should be executed—by vendors and agency staff.

What’s possible, at what level of effort

Much of the work overseen by agency leaders draws from fields whose complexity they’re already equipped to understand. If an agency needs to hire 500 new employees, a leader knows intuitively that this is possible, and that it will take longer than two months but less than two years. If it needs to buy new office equipment for 100 people, that’s very achievable, and will cost more than $100,000 but less than $1,000,000. If the agency needs to move into an entirely new building in two months on a budget of $5,000, that is not possible. And so on.

There is absolutely nothing that has prepared an agency principal for understanding the cost of software. There are a pair of xkcd comics that address this concept:

  • “Tasks,” by Randall Munroe
  • “Easy or Hard,” by Randall Munroe

The best corrective for this is to observe actual Agile software development teams actually developing software, by joining a series of sprint review sessions for multiple projects. That makes it possible to see what e.g. six people are capable of accomplishing within two weeks of work.

How much software costs

Grasping the cost of software is difficult in the space of government software because of the absurd levels of pricing distortion brought about by decades of procurement practices unsuited to the problem. At a state or federal level, $100 million is a normal price for the development of a load-bearing software system, and that price tag is not meaningfully decomposed into any part of that system.

Again, observing actual Agile projects will do a lot of good here. By understanding the level of effort, and connecting it to the billing rate of a vendor, it becomes possible to see that the work produced by our six-person team in two weeks cost $60,000 in billed time. After observing this for a while, it soon becomes evident what software should actually cost.

How to oversee software development

I’ve already written a guide to overseeing major software projects, albeit intended for legislatures, but the 14 listed plays largely apply to an agency principal, either for them to apply themselves (especially “Demos, not memos”) or to ensure that their staff are applying them.

But there are a few leader-specific admonitions that I want to include here.

  • No stoplight charts. Agency principals receive regular updates on major software projects that say everything is going great, with green lights all the way…right up until the update that says everything is red and the project has failed. What happened? In short, strategic misrepresentation. Nobody wanted to provide troubling news to their boss, so as the state of the project got handed up the chain, the view got rosier and rosier, until the principal was told, with every update, that things were going great. The solution to this is to eschew reports in favor of live demos of the actual work being done. If leadership requires reports, make them narrative, authored by the agency’s product owner for the software project.
  • Require that live software be deployed to production regularly. Projects will spin their wheels in isolation. This allows bad decisions to be hidden, and a lack of progress to be obscured. Leaders should insist that software improvements be incrementally delivered to end users. Continuous delivery paired with continuous user research makes it very difficult to waste much money on a software project.
  • Require weekly ship reports. At the end of each week (or each sprint), a project’s leadership should write a “ship report,” which briefly describes what was shipped since the last ship report, along with what’s coming up, what blockers are preventing the project from progressing, and how much of the budget has been spent on the project to date. For any project that requires particularly close monitoring, it helps to require the inclusion of a list of all user stories that were completed within the period in question—this makes it crystal clear what the project team has accomplished.
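To make the expectation concrete, here is a sketch of what a ship report might look like. Every project name, figure, and detail below is invented for illustration:

```
SHIP REPORT: Benefits Modernization, Sprint 23 (Mar. 6-17)

SHIPPED:   Claimants can now upload wage documents from the claim
           status page; fixed the address-validation bug that was
           blocking ~2% of new applications.
UP NEXT:   Identity-proofing integration; Spanish-language intake flow.
BLOCKERS:  Awaiting security sign-off on the document-storage service.
BUDGET:    $2.41M of $5.90M spent to date (41%).
STORIES:   #412, #418, #421, #430 completed this sprint.
```

A report this short takes minutes to write, and a principal can read a dozen of them over coffee, which is the whole point.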

This is all meant to avoid precisely this scenario:

“Deloitte presented much too rosy of a picture to us,” [Governor Gina Raimondo] said. “I sat in meetings with Deloitte and questioned them and they gave us dashboards that showed us everything was green and ready to go, and the fact of the matter was it wasn’t.”

“Raimondo Faults Vendor Deloitte For Delivering ‘Defective’ UHIP System,” The Public’s Radio, Feb. 15, 2017

It can be tremendously difficult for the principal of an agency to oversee a load-bearing software project, in no small part because leaders tend to get in the way, providing bad ideas, issuing self-important new project requirements, and measuring the wrong things. Learning how modern software is made will ensure that principals set their expectations properly and can engage in an appropriate fashion. Learning what’s possible, at what level of effort, will allow principals to understand and make demands that are reasonable. Learning how much software costs will allow principals to have reasonable expectations of costs, both in preparation for projects and when executing them. And learning how to oversee software development will draw together the prior three skills so that principals can effectively manage the projects on which the viability of their agency’s mission depends.

This approach will allow an agency principal to take back control of their agency from vendors, and to stop outsourcing their agency’s mission.

“Agile” versus “agile.”

When writing about Agile software development, I always capitalize the word. This isn’t an affectation, but instead an effort to communicate an important distinction.

The word “agile” has been à la mode for a few years now. Organizations should be “agile,” teams should be “agile,” leadership should be “agile,” employees should be “agile,” software should be “agile.” This use of the word is intended to indicate being nimble, flexible, and adaptive.

This is almost completely unrelated to “Agile” software development. Agile is a software development practice, summarized as valuing:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

There are different methodologies for implementing Agile—Scrum being the most common—but, in general, capital-A “Agile” means delivering software every two weeks, with all completed work being based on user needs that have been identified and validated through user research.

Lowercase-A “agile,” on the other hand, means none of that. It’s puffery. It means nothing.

When a government agency or a contractor says “oh, yes, we’re agile,” it’s important to find out if they mean “agile” or “Agile.” And when communicating with that audience, it’s important to make clear if you mean “Agile.” The mere capitalization of a letter isn’t the totality of how to accomplish that—it’s better to ask clear and direct questions about how they build their software—but it does help to consistently write “Agile” when you mean Agile software development and “agile” when you mean nimbleness and flexibility. Even somebody only dimly aware of Agile software development is liable to take note of the capitalization of the word and realize that something very particular is being communicated there.

Capitalizing “Agile” helps to be clear in communications. I recommend it.

The work before the work: what agencies need to do before bringing on an Agile vendor.

Government agencies often hire Agile software vendors to build software for them, but then fail to do the pre-work that would allow the vendor to be successful. Interfacing a great scrum team with a standard government IT shop is like dropping a Ferrari engine into a school bus. There’s work to be done up front before there’s any sense in bringing on a team.

18F, the federal government’s software development shop, has done a lot of work to address this problem, and much of what I know about how to do this right comes from my four years there.

I’ve seen what happens when a high-performing team is dropped into a low-performing agency. The agency spends weeks or months running the team through mandatory security trainings, getting the team their PIV cards, procuring and issuing them agency laptops, getting them accounts on the VPN, getting them access to the various servers that they’ll be using, etc. The team expected to start writing software on day one, but instead they’re left twiddling their thumbs at a collective $1,000/hour. They get bored, and after a few weeks, anybody competent gets sprung from their purgatory and put onto a project that’s actually doing stuff. When the team can finally get to work, all that remain are the people who weren’t good enough to escape, and morale is low.

Don’t do this.

Agencies need to prepare for vendors, not a couple of weeks before the vendor shows up, but months beforehand. Ideally before a solicitation is even issued. That’s because it’s a lot of work! It requires user experience research, journey mapping, process automation, coordination between agency silos, and perhaps even a prior procurement. Here are some of the specific things to do:

  • Allow the vendor to work entirely in the cloud. To the greatest extent possible, make it possible for the vendor to never touch your environment. That means that your agency needs to contract with a cloud vendor (but you did this long ago because it’s 2023, right?), ideally Microsoft Azure or Amazon Web Services, since those are the most widely used. That way the vendor can replicate your environment within their own cloud environment, and never have to touch your cloud environment, and certainly not your agency’s VPN and warren of physical servers. This eliminates a big part of the onboarding process, with the happy side benefits of significantly reducing stress among the vendor team (they can’t break your environment if they don’t have access to it) and reducing the cost of the vendor’s professional liability insurance policy.
  • Don’t make the vendor use GFE. Government-furnished equipment is awful, the cheapest stuff that Dell or HP makes, bought in lots of 1,000. Developers are disproportionately likely to be Mac users, but even the Windows users don’t want to use the laptops you got from the lowest bidder. They want 32 GB of memory so they can run the software entirely in Docker containers; your laptops have 8 GB of memory because they’re optimized for people using Word and Outlook. It’s like hiring a great woodworker to build stuff for you, but forcing them to use your collection of Harbor Freight tools. If they’re working on agency equipment, they have to jump through a bunch of hoops like acceptable-use policies and mandatory trainings, and you have to actually procure the hardware, send it to them, and get it back when they’re done. No Agile team wants to use your lousy equipment, and you don’t want to deal with issuing it, so just don’t.
  • Put together a journey map of the vendor onboarding process. Once you’ve ensured that the vendor team will work in the cloud on their own equipment, step through the process of what’s left in bringing on a vendor. Talk to employees of vendors who have recently gone through it, or are currently going through it. Map out every step. Now, worst case, you have a document you can share with the new vendor to understand what’s ahead of them and where they are in that process. But, better: optimize the hell out of that process, removing anything that you can, simplifying anything that you can. Every hour that you shave off this process will save $1,000.
  • Study and document the process when the vendor starts. The first time you bring on an Agile vendor, the process you’ve put together won’t yet be perfect. It will be better than before, but there will still be frustrations. Discuss this with the team during their first days on the project, ask that it be a subject of their first sprint retrospective, and turn that feedback into specific changes for the next time that a new person joins the scrum team.
  • Designate a product owner. Agile projects’ success is heavily dependent on the product owner’s ability to do their job well. I’ve written about this elsewhere, but the short version is that the agency needs an empowered product owner to start to take an ownership role over this project before the vendor shows up. This person will be the vendor team’s primary interface with the agency, the fixer for onboarding problems, the smiling face who will greet them at 9 AM on day one of sprint one. Have them in place many weeks before the vendor starts.
  • Create a path to production. Two weeks after the vendor starts, they’ll have code that’s ready to go to production. If your agency has an authority to operate (ATO) process that requires completing a 250-page system security plan (SSP) as a Word file to get anything into production, you’re going to have a bad time. You need to pilot a new ATO process that can move at the speed of software development, or else your vendor team will be unable to get their software in front of actual end users. And Agile is impossible without that crucial feedback loop. This problem is really hard. Don’t be fooled by the fact that this is a single bullet point in a list of six items—it’s of greater difficulty and complexity than the others. This is a 6–12-month process for most agencies, and probably longer for federal agencies. Consider making the immediate project a pilot, so that instead of proposing that your CIO overhaul the ATO process entirely, you’re simply proposing that an experiment be run to see if a continuous ATO process could work.

You don’t want to do all of the hard work of procuring a top-notch scrum team only to send them face-first into a brick wall of bureaucracy. You’ll lose all of the momentum, lose the best members of the team, waste $40,000/week, and when the project finally starts it will be with a demoralized team. By doing the right prep work, you can leave that team free to do what they do best—develop software—and get the best possible performance out of the vendor.

“Customized COTS” is the worst of both.

Some vendors who sell commercial off-the-shelf (COTS) software to government bristle when their software is described as such. They want you to know that it’s not COTS: it’s “customized COTS” or “modifiable COTS.” That is intended to reassure agencies that their software is flexible, that it can meet the agency’s needs. But “customized COTS” is actually much worse than COTS.

Vendors are generally eager to have their offerings fall under the umbrella of “COTS,” because both state and federal government heavily favor buying existing software over having custom software built. This is good and sensible. It would be foolish to have a custom word processor built when Microsoft Word and Google Docs exist and are available for licensing at a significantly lower cost than custom development.

But the COTS label is a millstone around the neck for vendors of software that drives the operations of agencies operating under highly localized regulatory regimes. Take unemployment insurance. Every state labor agency operates under a dizzying array of state and federal laws and regulations about who can get coverage, for how long, for how much money, under what circumstances, on what timeline, through what qualification processes. And those laws and regulations change continuously. I don’t want to say that it’s impossible to build a COTS tool that could handle all of those variations, but I will say that it would be enormously difficult, and would require some very specific architectural decisions that I know that none of the vendors have made. In practice, every new state that becomes a customer for such a system would introduce vast new complexities into the code base, requiring that the “COTS” product actually be forked, with customizations made for that new state. That brings its own complexities, because any changes across the code base need to be grafted manually into each forked copy. This is no longer “COTS,” by any reasonable definition, but is instead what 18F calls “UMOTS,” or “Unrecognizably Modified Off the Shelf Software.”

Customized COTS is just custom software that the agency doesn’t own, the equivalent of paying for extensive renovation to a home that you are renting. If a state is going to use COTS, they should do so because they are happy with the software as it exists, and do not require modifications. Nobody would buy Microsoft Word and then demand that Microsoft add an essential feature that is missing. That violates the purpose of COTS, which is that the vendor has made those decisions for you. If you don’t like their decisions, don’t buy their product.

Sean Boots has a good test for the legitimacy of COTS: “If you can get a software solution to successfully meet your needs in one day, it’s a real COTS product.” I propose a corollary for testing the legitimacy of customized COTS: Will all customers receive identical software updates? If yes, it’s probably COTS. If no, it’s customized COTS, and you’re paying to renovate a house you are renting.

COTS can be great. Custom software can be great. Customized COTS is a tar pit, a way to pay for extensive renovations to software that you do not own, and now feel that you cannot leave, because the sunk cost fallacy is real. Don’t license customized COTS.

Why governors put this over here, with the rest of the fire.

It happens at least once in every gubernatorial administration: presented with a disastrous, multi-year, failing software project that’s preventing an agency from accomplishing its mission, the governor awards a big contract to a big vendor, maybe even the vendor that’s the source of the problem. Some major culprits are unemployment insurance, enterprise resource planning, Medicaid, child welfare, and payroll—all load-bearing systems for their agencies. Solving these failures by signing another big contract nearly always makes things worse. So why do governors do this?

Serving as governor means being presented with a never-ending stream of decisions to be made, all of which have been vetted through several layers of people. Those decisions are generally teed up to include options, in the form of a right option and a wrong option, with the governor’s advisors fervently hoping that their principal will simply make the “right” choice. There is rarely time for the governor to go deep in any area. A state is a stage full of spinning plates; the governor’s job is to go where directed and give each plate a quick push, and to repeat this many times each day, for 4–8 years.

Decision-making at this level is all about triaging. The easiest option is the preferred option. It’s better to dispose of a problem permanently than temporarily, better for a longer time than a shorter time. The top priority is to get things off the governor’s desk.

An animated GIF of a man cautiously carrying a flaming fire extinguisher, saying “I’ll just put this over here, with the rest of the fire.”
Maurice Moss, in “The IT Crowd,” triaging.

This imperative is embodied in the chief of staff, who spends the bulk of their time blocking for the governor, ensuring that questions only come to the governor when they are ripe for a decision, and that their principal has enough information to be able to make that decision.

So what should a governor do when faced with a failed software system that is preventing an agency from delivering on its mission? The correct response is to have the state take control from the vendor, because no vendor will ever care about the state’s mission as much as the state. They need to move to shorter contract periods, an Agile delivery cadence, agency product ownership, and root all work in user research. But you can’t tell that to a governor. Literally, you can’t—the chief of staff will hurl themselves in front of your body to stop you. Because what you’re saying to that governor is “what if, instead of making a single decision, we replaced this with a large amount of time-consuming work and a series of decisions over the course of months or years?” It’s completely contrary to the entire process used to operate governors’ offices.

What the governor is hearing from vendors, on the other hand, is very compelling. The incumbent vendor and their competitors are all saying the same thing: write us a big check and we’ll make all of your problems go away. Will they actually make all of those problems go away? Absolutely not. Will throwing more money on the money fire make the fire go away? No, that’s not how fire works. But this message from the vendor is exactly what the governor’s office is optimized for, and exactly what the agency secretary’s office is optimized for. They cannot escape the siren song of the vendors, and nobody warned them about the need for mast-lashing and beeswax.

Even if a governor knew that the problem would return with full fury in 6–12 months, with accompanying admonishing headlines and stern editorials, it’s entirely possible that they would still elect to award that big contract, simply because it makes the problem go away for 6–12 months. A lot can happen in 6–12 months! Maybe something more important will be in the news cycle then. Maybe they’ll be out of office. Maybe they’ll be less busy and will have time to really buckle down and pay attention to this UI / PFML / MMIS / childcare / whatever situation. But all of that is a problem for Future Them. Current Them has other stuff going on.

In short, governors make the worst decision because, in that moment, it feels like the safest decision, and it may even be the safest decision, although only in a political sense. Although the outcome is generally terrible, governors are behaving rationally. Without changing the incentives, they’ll keep throwing money on the money fires.

An Excel error caused a $202 million state budget shortfall.

On Monday, the Richmond Times-Dispatch broke the story about a thorny budgeting problem for Gov. Glenn Youngkin that illustrates how bad technical practices can lead to bad public policy outcomes:

Local school divisions in Virginia just learned they will receive $201 million less in state aid than they expected — including $58 million less for the current K-12 school year that is almost three-quarters done.

The Virginia Department of Education has acknowledged the mistake in calculating state basic aid for K-12 school divisions after the General Assembly adopted a two-year budget and Gov. Glenn Youngkin signed it last June. The error failed to reflect a provision to hold localities harmless from the elimination of state’s portion of the sales tax on groceries as part of a tax cut package pushed by Youngkin and his predecessor, Gov. Ralph Northam.

The Washington Post provided more specifics about the source (or perhaps manifestation) of the mistake:

The problem originated with an online tool that allows school districts to see how much funding they should expect from the state, a number that takes into account the district’s number of students, how much it receives in property tax revenue and other factors.

The tool has been up since June 2022, allowing districts to build their budgets around the estimations. But last week, someone — the state would not say who — realized that the numbers were wrong. The miscalculation occurred after the state failed to account for funding changes connected to the elimination of the state’s tax on groceries, which took effect Jan. 1.

It’s not clear whether this was a conceptual problem (a failure to realize that it was necessary to account for funding changes) or a technical problem (an error in implementing that math). If the latter, this is a high-impact error arising from a software failure.

(I’d be remiss if I didn’t point out that some reasonable people are suspicious of this explanation, noting that Youngkin is no supporter of public education, and that it’s convenient that this mistake aligns with his policy preferences. I think it’s much more likely to be a mistake, one that a governor would have no knowledge of or insight into. But that explanation is awkward for Youngkin, who has presented himself as a hard-nosed budget wonk whose private-sector financial experience translates to fiscal competence. And yet, this.)

It’s instructive to look at the “online tool” in question, which turns out to be an Excel file. (Here’s a Wayback Machine link to the Excel file, because I expect that the problematic one will disappear. I’ve also put it in Google Sheets.) It has 38 worksheets, with a heterogeneous and puzzling series of titles like “Enroll. & At-Risk,” “FINAL SOURCE DATA,” “March 31, 2021 ADM,” “ASRFIN Queries,” and “Bedford County-City.” The message on the first worksheet would seem to indicate that the state published this without removing all of the placeholder text, which raises the question of what else might be unreviewed or incomplete.

Many of these worksheets are dizzying, some hundreds of columns wide, most containing unexplained acronyms like “DABS,” “RLE,” “PPAs,” “ADJ ADM.” I don’t doubt that these make a lot of sense to state and local budget officials, but I have none of the subject-matter expertise to make heads or tails of them.

Excel is a fine way to build lightweight calculation software—you can build some pretty sophisticated systems in Excel and Google Sheets—but it shouldn’t serve as load-bearing infrastructure. Excel files can’t be diffed or version-controlled using standard revision control systems (e.g., Git). It’s impractical to perform automated tests on Excel files as part of a continuous integration process. Excel files become unwieldy as the number of worksheets increases—I can’t say where the tipping point is, but it’s for sure lower than 38. For a tool as critical as this one, it’s important to be able to at least perform some smoke tests, so you can check that providing particular sets of financial assumptions returns the correct numbers, and those tests should be run automatically every time that the tool is updated.
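To make the smoke-test idea concrete, here is a minimal sketch in Python. The aid formula below is entirely hypothetical—it is not Virginia’s actual basic-aid calculation—but the pattern applies: express the calculation as an ordinary function, then automatically assert known-good outputs, including a guard against exactly the class of omission at issue here.

```python
# Minimal smoke-test sketch for a budget calculation. The formula is
# HYPOTHETICAL (not Virginia's actual basic-aid math); the point is the
# pattern: encode the calculation, then assert hand-checked answers.

def basic_aid(enrollment: int, per_pupil_amount: float,
              local_share: float, grocery_tax_offset: float) -> float:
    """State aid: the non-local share of per-pupil funding, plus a
    hold-harmless offset for the eliminated grocery tax."""
    state_share = enrollment * per_pupil_amount * (1 - local_share)
    return state_share + grocery_tax_offset

# Known-good case, worked out by hand: 1,000 students at $6,000 each,
# a 50% local share, and a $250,000 grocery-tax hold-harmless payment.
assert basic_aid(1000, 6000.0, 0.5, 250_000.0) == 3_250_000.0

# Guard against the class of error in this story: omitting the
# hold-harmless offset must change the answer.
assert basic_aid(1000, 6000.0, 0.5, 0.0) != basic_aid(1000, 6000.0, 0.5, 250_000.0)

print("smoke tests passed")
```

In a continuous-integration setup, assertions like these would run on every change to the tool, so an omitted term fails loudly in front of a developer instead of silently in front of 130 school divisions.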

No doubt this started as some small, simple file, many years ago, put together by somebody at the Department of Education for internal purposes, shared informally with some municipalities, but gradually shared more broadly and standardized on. And then it grew and grew, without being given support resources commensurate with its newfound importance. Surely most other state agencies are vulnerable to similar failures, with similar impacts, due to the same problem.

Software failures causing public policy failures are a defining feature of our era. In that sense, this is a normal failure, although I’m not familiar with another instance of an Excel error in a government budgeting document leading to such a large financial problem. But history does give us one example to draw on: Fidelity Investments’ 1994 omission of a minus sign (-) from a spreadsheet, which turned their $1.3 billion loss into a $1.3 billion profit. They dutifully notified the three million investors in the Magellan mutual fund that they’d receive a dividend of $4.32 per share…only to have to notify all of them that they’d actually receive nothing, after outside auditors caught the mistake.

Spreadsheets are great tools. But some applications require more rigor, and we can see here that mutual funds and state budgeting are two strong examples.