Units of automation

Pretty much every presentation I give about AI includes the following slide, in the context of automatically authenticating users using face recognition.

The idea is that even automation-reliant businesses don’t need to go 100% automatic to start serving customers: you can start with a basic model that outputs a confident “yes” (green region), a confident “no” (red region), or delegates to a human because it is unable to decide (yellow region). The goal of automation product development is to decrease the relative size of the yellow area.

In other words, the goal is to increase the automation rate: the percentage of all incoming face photos where the decision is made completely automatically, i.e. with no human intervention. Let’s generalise:

A unit of automation is a chunk of work that can either be performed fully automatically or, if the system is not confident enough, assigned to a person to solve through human fallback.

Automation rate measures the proportion of all units of automation that were performed without using human fallback.
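
As a minimal sketch of this definition, assume every unit of automation is logged with a flag saying whether it needed human fallback. The `Unit` record and the `automation_rate` function are hypothetical names used for illustration, and the data is made up.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One unit of automation, e.g. one face photo to verify (hypothetical record)."""
    used_human_fallback: bool


def automation_rate(units: list[Unit]) -> float:
    """Share of units that were completed with no human fallback."""
    if not units:
        raise ValueError("automation rate is undefined for zero units")
    automatic = sum(1 for u in units if not u.used_human_fallback)
    return automatic / len(units)


# Illustrative data: 1,000 units, 50 of which were handed to a human.
units = [Unit(used_human_fallback=(i < 50)) for i in range(1000)]
print(f"{automation_rate(units):.1%}")  # 95.0%
```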

Units of product

The unit of automation may or may not be exactly the same as a unit of product.

Starship’s sidewalk robots deliver groceries, food, and packages. A natural unit of product for them is a delivery, but a unit of automation could be smaller: a predefined segment of road, a road crossing, or something else. The reason to distinguish between the two is that a unit of product (a delivery) may consist of hundreds of units of automation (road segments), and if your AI predictably fails in only one of these small segments, it would be wasteful to fall back to a human for the whole delivery.

Defining a unit of automation more granularly also makes metrics clearer. If robots start servicing areas that require driving distances twice as long, the % of deliveries done fully automatically (product automation rate) will decrease simply because instead of the previous hundred road segments, two hundred segments are tried, so the aggregate chance of encountering at least one failure increases. This is confusing because the % of segments done fully automatically stays exactly the same, but we get the impression of our algorithms getting worse due to the definition of the metric.
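
To make the effect concrete, here is a small sketch. It assumes, purely for illustration, that segments succeed independently of each other and that 99.9% of segments are driven fully automatically; neither number comes from Starship.

```python
def product_automation_rate(segment_rate: float, segments_per_delivery: int) -> float:
    """Chance that every segment of a delivery is automatic.

    Assumes segments succeed or fail independently, which is a simplification.
    """
    return segment_rate ** segments_per_delivery


segment_rate = 0.999  # assumed share of segments driven with no human fallback

for segments in (100, 200):
    rate = product_automation_rate(segment_rate, segments)
    print(f"{segments} segments per delivery: {rate:.1%} of deliveries fully automatic")
```

Under these assumptions, doubling the route length drops the product automation rate from roughly 90% to roughly 82%, even though the per-segment rate has not changed at all.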

Let’s look at another example. x.ai’s automated assistant schedules meetings, and a unit of product is one scheduled meeting. Since the scheduling happens over email threads, which naturally decompose into single emails, we can define a unit of automation as one email reply from the x.ai assistant. Since email is not perceived as a real-time communication channel, there is time for humans to handle the most difficult cases. The definition of automation rate for their business then becomes “% of all email replies from the x.ai assistant that require no human intervention”.

Human fallback time

Another very useful concept is human fallback time:

Human fallback time is the amount of time a human spends completing a unit of automation.

For the robotic delivery case above, human fallback time shows the average time it takes a human to control the robot for the length of a road segment. For scheduling meetings over email, it shows the average time a human spends writing the email reply.

We can now decompose the overarching goal of automation into two main directions.

Increasing automation rate means reducing the proportion of units of automation where human intervention is required. Since by definition we need to increase the number of cases where humans never enter the process, this can only be done by building better software, training higher-quality models, and handling more corner cases. Robots will have to drive more segments and email bots will have to write more replies without human intervention.

Reducing human fallback time means making human interventions faster. Obvious approaches here are improving operations (e.g. training employees) or UX (e.g. more intuitive placement of controls or faster page loads), but you could also combine these with building software and models. Instead of writing a whole email reply, an x.ai employee could select from a few prebuilt templates and fill in some variables like the date and time. You could go further and ask the employee to simply classify the intention of the email, and then do the rest automatically, with the added advantage of collecting training data for the future.
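
The two directions multiply together, which a rough sketch makes visible. The function name and all the numbers below are illustrative assumptions, not figures from any real company.

```python
def human_hours(units: float, automation_rate: float, fallback_seconds: float) -> float:
    """Total human time needed: units that fall back, times the time each one takes."""
    fallback_units = units * (1 - automation_rate)
    return fallback_units * fallback_seconds / 3600


# Assumed baseline: 1 million units, 90% automated, 2 minutes per human fallback.
baseline = human_hours(1_000_000, 0.90, 120)
better_models = human_hours(1_000_000, 0.95, 120)   # halve the fallback proportion
faster_fallback = human_hours(1_000_000, 0.90, 60)  # halve the human fallback time

print(f"baseline:        {baseline:,.0f} h")
print(f"better models:   {better_models:,.0f} h")
print(f"faster fallback: {faster_fallback:,.0f} h")
```

In this simple model, halving the share of units that need a human saves exactly as much human time as halving the time each intervention takes, so it is worth comparing the engineering cost of both.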

Startups need to be mindful of unit economics, i.e. the revenues and costs related to a single unit of product when produced at scale. The further along the company, the more thought goes into this. In automation-critical business models the concepts laid out above play a critical role: automation rate and human fallback time directly affect your ability to scale and the cost of doing so. If your plan requires a million-person workforce, that is a bad sign.

Automation calculator

My initial impulse for writing this post came from a simple calculation: I wanted to know what automation rate was required for reaching certain product volumes. It’s a useful back-of-the-napkin feasibility study: can you service a significant proportion of the market with a reasonable number of employees (say, a thousand) and an automation rate that seems technically viable?

Let’s take the example of x.ai. Suppose they are looking into the far future and want to serve a billion customers with a maximum of 2000 employees. Each customer schedules on average 2 meetings per week, coming to a total of approximately 100 meetings per year. Typically I need about 3 back-and-forth emails to schedule one meeting, so x.ai has 3 × 100 × 1 billion = 300 billion units of automation to deliver per year.

For the rest of the calculation I built the simple calculator shown below. The result for our hypothetical x.ai case is 99.94% required automation rate, so for every email written by a human, AI needs to write about 1700 emails automatically.

(everything you enter into the calculator stays in your browser)
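
The arithmetic behind such a calculator can be sketched in a few lines of Python. The human fallback time (1 minute per email) and the working time (1,500 hours per employee per year) are my assumptions, chosen because they land close to the figures quoted above; the function and variable names are hypothetical too.

```python
def required_automation_rate(
    units_per_year: float,
    employees: int,
    fallback_minutes: float,
    work_hours_per_year: float,
) -> float:
    """Smallest automation rate at which the workforce can absorb all human fallback."""
    # How many units the whole workforce can handle by hand in a year.
    human_capacity = employees * work_hours_per_year * 60 / fallback_minutes
    return max(0.0, 1 - human_capacity / units_per_year)


customers = 1_000_000_000
meetings_per_customer_per_year = 100
emails_per_meeting = 3
units = customers * meetings_per_customer_per_year * emails_per_meeting  # 300 billion

rate = required_automation_rate(
    units,
    employees=2000,
    fallback_minutes=1.0,      # assumed human fallback time per email
    work_hours_per_year=1500,  # assumed productive hours per employee
)

print(f"required automation rate: {rate:.2%}")
print(f"automatic emails per human-written email: {rate / (1 - rate):.0f}")
```

With these inputs the required rate comes out at 99.94%, with somewhere between 1,600 and 1,700 automatic emails per human-written one. Changing the assumed fallback time or working hours shifts the result, which is the point of experimenting with the inputs.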

For most tasks that are simple for humans, 90% sounds easy, 99% still reasonable, and even 99.9% (one human fallback in a thousand units) is not intimidating given a few years. The 99.94% for x.ai looks difficult, but then again they focus on a very narrow task for which there is plenty of example data in users’ mailboxes. And this would serve everyone in the world with English as their first or second language.

Your intuition will obviously vary based on the difficulty of your problem and state of the art in the relevant academic field, but for me the takeaway is clear:

Serving billions of customers is possible without hundreds of thousands of employees, or cosmic automation rates.

Thanks to Joonatan Samuel and Brett Astrid Võmma for reading drafts of this.