Autonomous Travel Recovery Agent:

Designing a system that proactively detects missed-flight connections and prepares personalized recovery actions before a travel disruption.

Product Designer

2 months

Product Strategy · Systems Design · Interaction Design · User Testing · MVP Engineering

Framing

Most travel apps help users understand disruptions. I explored what happens when the system begins acting on the traveler’s behalf.


In 2022, over 200 million passengers in the US alone experienced flight delays or cancellations. The economic cost across travelers, airlines, and the broader ecosystem reaches $60 billion annually. The infrastructure to handle disruption exists. What's missing is a recovery layer that works for the traveler, not just the airline.

[Figure: trip screen in the missed-connection state. AA 123, MIA (Miami) to DFW (Dallas), is running 27m late, leaving a 30m connection to AA 789, DFW (Dallas) to SEA (Seattle). Status messages: "You will not have time to connect." and "Preparing flight rebooking... We'll notify you when ready."]

Most travel apps already tell you when something goes wrong. Delay notifications, gate changes, rebooking prompts — the visibility layer is largely solved.


What isn't solved is what happens next.


When a connection starts collapsing, the operational burden falls entirely on the traveler. You're refreshing apps, scanning for alternatives, calculating whether you can make it, and competing for the last seats on the next flight — all while physically moving through an airport. The stress isn't from not knowing. It's from having to act, fast, with incomplete information, alone.


I started this project thinking the answer was better guidance. A smarter, more proactive way to surface what travelers should do during disruption. I was wrong about that — and the user testing that proved it became the most important moment of the project.

V1 Building to Learn

I didn't start with a concept. I built a functioning MVP using live flight and airport data — real arrival delays, connection windows, time estimates — to test whether proactive disruption guidance was worth building.


The system classified connections into four states: safe, tight, likely missed, impossible. Each state triggered different monitoring behavior, recovery recommendations, and user messaging. The interface focused on explanation — surfacing what was happening, why it was happening, and what the traveler should do next.
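A minimal sketch of how a four-state classification like this could work, assuming the state depends only on how many minutes the traveler has to spare. The thresholds, field names, and the `classify` function are hypothetical and are not the values the MVP used.

```python
from dataclasses import dataclass
from enum import Enum


class ConnectionState(Enum):
    SAFE = "safe"
    TIGHT = "tight"
    LIKELY_MISSED = "likely missed"
    IMPOSSIBLE = "impossible"


@dataclass(frozen=True)
class Connection:
    connection_minutes: int  # projected gate arrival to next departure
    minimum_minutes: int     # time needed to deplane, walk, and board


# Hypothetical thresholds, chosen only for illustration.
SAFE_SPARE_MINUTES = 15
IMPOSSIBLE_DEFICIT_MINUTES = 15


def classify(conn: Connection) -> ConnectionState:
    """Map minutes to spare onto one of the four connection states."""
    spare = conn.connection_minutes - conn.minimum_minutes
    if spare >= SAFE_SPARE_MINUTES:
        return ConnectionState.SAFE
    if spare >= 0:
        return ConnectionState.TIGHT
    if spare > -IMPOSSIBLE_DEFICIT_MINUTES:
        return ConnectionState.LIKELY_MISSED
    return ConnectionState.IMPOSSIBLE
```

With these assumed thresholds, a 40-minute connection against a 32-minute minimum classifies as tight, and a 30-minute connection against the same minimum classifies as likely missed.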


At the time, I thought that was the product.

The Pivot

When I reviewed the V1 testing, I made a deliberate choice about what to focus on. Instead of cataloguing UI fixes — and there were plenty — I wanted to understand whether the product was solving something users actually needed.


And it wasn't. Not really.


The core value proposition of V1 was the "what to do now" guidance layer. But users could already figure that out. In most disruption scenarios, the path forward is straightforward: rebook through the app, call an agent, or find a counter. A smarter explanation of that process wasn't going to change behavior. Users would still default to their airline app.


What they actually wanted was to not have to deal with it at all.


The strongest signal from testing wasn't frustration with information. It was frustration with standing in long customer service lines, fighting for limited remaining seats, and manually repairing poor airline auto-rebookings. Users liked when airlines rebooked them automatically. They just didn't see it happening enough, and when it did, the result rarely matched what they actually wanted.


The real pain wasn't understanding the disruption. It was escaping the recovery process.

V2 Designing the Recovery Layer

The pivot wasn't just a product direction change. It forced a more fundamental question: if the system is going to act on a traveler's behalf, what does it actually need in order to do that?


Digging into how airline booking infrastructure actually works changed everything. Same-ticket rebooking during a disruption — the kind that happens instantly, without a counter, without a phone call — requires the original ticketing authority. The agency that issued the ticket is the only one that can reissue it. That meant a standalone monitoring app layered on top of someone else's booking was never going to work. The recovery agent needed to live inside a booking platform, not on top of one.


That constraint became the product's biggest strategic advantage.


Most booking platforms compete on price, inventory, and interface. This one competes on what happens when things go wrong. Book through the platform, and the system watches your trip from the moment you confirm. When something goes wrong, it doesn't send you a notification. It starts solving the problem.

Constrained Autonomy

Full automation was never the goal. A system that freely purchases flights, spends money without asking, and removes the traveler from the loop entirely would fail on trust before it ever got the chance to prove itself useful.


So I designed around a specific autonomy model: the system prepares, the traveler confirms.


Within that boundary the system could monitor itinerary viability continuously, evaluate recovery options against user preferences, prepare an alternative itinerary, and present it for a single confirmation tap. Outside that boundary it wouldn't act. No irreversible changes without explicit approval.


This wasn't a technical limitation. It was a deliberate trust decision. The goal was a system that felt like a capable travel agent working on your behalf — not an algorithm making decisions you didn't sanction.
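The "system prepares, traveler confirms" boundary can be sketched as a small state object in which preparing is always allowed but committing requires explicit approval. This is an illustrative sketch only; the `Recovery` class, its states, and the `ApprovalRequired` exception are hypothetical names, not part of any real booking API.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class RecoveryStatus(Enum):
    MONITORING = auto()
    PREPARED = auto()
    CONFIRMED = auto()


class ApprovalRequired(Exception):
    """Raised when an irreversible step is attempted without approval."""


@dataclass
class Recovery:
    status: RecoveryStatus = RecoveryStatus.MONITORING
    proposed_flight: Optional[str] = None

    def prepare(self, flight: str) -> None:
        # Reversible step: nothing is ticketed yet, so no approval is needed.
        self.proposed_flight = flight
        self.status = RecoveryStatus.PREPARED

    def commit(self, traveler_approved: bool) -> str:
        # Irreversible step: refuse to act unless the traveler said yes.
        if self.status is not RecoveryStatus.PREPARED or self.proposed_flight is None:
            raise ValueError("nothing prepared to commit")
        if not traveler_approved:
            raise ApprovalRequired("rebooking needs explicit traveler approval")
        self.status = RecoveryStatus.CONFIRMED
        return self.proposed_flight
```

The design choice worth noting is that the approval check lives inside `commit` itself, so no calling code can skip it by accident.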

Designing for Trust

As the system took on more responsibility, trust became a harder design problem than automation.


User testing surfaced a consistent pattern. People weren't asking "can this system rebook my flight?" They were asking "how does the airline communication work?", "can I still see other options?", "what if the booking fails?", "will my luggage transfer?" The questions weren't about capability. They were about visibility and control.


That shifted the design focus from "can the system automate recovery?" to "what does the traveler need to see to feel safe letting it?"


The answer wasn't more information. It was the right information at the right moment. What the system knew, what it predicted, what could still change, and what required their input. Automation that explains itself feels trustworthy. Automation that disappears into the background feels dangerous.

Designing for Failure

Automated recovery introduces a new category of failure that monitoring tools never have to face. The system could prepare a perfect alternative and then lose the seat inventory before confirmation. The airline system could go offline mid-rebooking. The traveler could disagree with the recommendation entirely.


Each of those needed a designed response, not an error screen.


I added a "waiting for airline to declare disruption" state — because third-party rebooking has real timing constraints, and pretending otherwise would have broken trust the first time it failed silently. A "can't wait?" prompt gave users willing to pay for a separate ticket a way out. Airline system unavailability got its own state with a retrying indicator. Ticketing failures exposed the specific point of breakdown with three clear recovery paths.


The goal wasn't a system that never failed. It was a system that failed in ways the traveler could understand and act on.

Interaction Design Iteration

Disruption scenarios are unforgiving. A traveler mid-connection has seconds to glance at a screen and know whether to run or relax. Hesitation has a cost. That reality shaped every interface decision in V2.

The interface had to be readable before it could be trusted.

[Figure: two iterations of the connection component for the MIA (Miami) to DFW (Dallas) leg, arriving 17m late. One reads "You will barely make this connection" with a "+3m buffer" label and "You will have 25 min to connect." The other reads "Your connection in DFW is tight." with "You'll have only 8 minutes to spare." Timing breakdown shown: 40m connection, 10m to deplane, 12m from A8 to B6, MCT 32 minutes.]

Early testing revealed a consistent problem: users were doing math they shouldn't have to do. The connection component displayed timing information in a way that required mental calculation to interpret. Buffer duration, connection time remaining, and the minimum connection time breakdown were all present, but none was immediately clear. Users tried to verify the numbers in their heads mid-test.


The "buffer" label was the sharpest offender. Users initially inferred it meant time being added. I removed the term entirely and restructured the information hierarchy so the most critical piece, "You'll have only 8 minutes to spare," was the largest element on the component. I replaced the static timing breakdown with a visual progress widget showing the traveler's position in the connection in real time, and I condensed the minimum connection time (MCT) breakdown into an expandable widget. The math didn't disappear. It just stopped being the traveler's problem.
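The arithmetic the component takes off the traveler's hands is small, and a sketch makes it concrete. It uses the figures shown on the component (a 40-minute connection, a 32-minute MCT, 10 minutes to deplane, 12 minutes between gates). The third breakdown entry is an assumption: it is simply the remainder needed to reach 32 minutes, and the function names are hypothetical.

```python
from typing import Dict

MCT_BREAKDOWN: Dict[str, int] = {
    "deplane": 10,
    "walk A8 to B6": 12,
    "gate close": 10,  # assumed: the remainder needed to reach the 32-minute MCT
}


def minutes_to_spare(connection_minutes: int, breakdown: Dict[str, int]) -> int:
    """Connection time minus the summed minimum connection time."""
    return connection_minutes - sum(breakdown.values())


def headline(spare: int) -> str:
    """The single line the traveler reads instead of doing the math."""
    if spare < 0:
        return "You will not have time to connect."
    return f"You'll have only {spare} minutes to spare."
```

For the 40-minute connection this yields 8 minutes to spare, which is the number the redesigned component puts in the largest type.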

Giving users control without giving them a decision.

When the rebooking overlay first appeared in testing, users felt cornered. One option, two buttons, no visibility into what else was available. The instinct was to not confirm anything until they understood what they were giving up.


The fix wasn't showing more options upfront. That just moved the decision overload earlier. Instead I anchored the overlay around a single recommended path — the best available option based on user preferences and calculated risk, with explicit reasoning shown beneath it. Earliest reliable arrival. Same airline. Below that, an expandable section let users swipe between alternatives filtered by earlier arrival, better buffer, and lower cost, each with its own reasoning and a separate call to action.
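One way to sketch the "single recommendation, alternatives on demand" structure: pick one option by a fixed preference order, then group the rest under the three filters. The scoring rule here (same airline first, then earliest arrival among options with enough margin) is an assumption for illustration, as are the `Option` fields and the 15-minute reliability threshold.

```python
from dataclasses import dataclass
from typing import Dict, List

MIN_RELIABLE_SPARE_MINUTES = 15  # hypothetical cutoff for a "reliable" connection


@dataclass(frozen=True)
class Option:
    flight: str
    arrival_minutes: int  # arrival time as minutes after midnight
    spare_minutes: int    # connection margin on this option
    extra_cost: float
    same_airline: bool


def recommend(options: List[Option]) -> Option:
    """Earliest reliable arrival, preferring the same airline."""
    reliable = [o for o in options if o.spare_minutes >= MIN_RELIABLE_SPARE_MINUTES]
    candidates = reliable or list(options)  # fall back if nothing is reliable
    return min(candidates, key=lambda o: (not o.same_airline, o.arrival_minutes))


def alternatives(options: List[Option], recommended: Option) -> Dict[str, List[Option]]:
    """Group the remaining options under the three filters from the overlay."""
    others = [o for o in options if o != recommended]
    return {
        "earlier arrival": [o for o in others if o.arrival_minutes < recommended.arrival_minutes],
        "better buffer": [o for o in others if o.spare_minutes > recommended.spare_minutes],
        "lower cost": [o for o in others if o.extra_cost < recommended.extra_cost],
    }
```

Each filter is defined relative to the recommendation, so every alternative carries its own reason for existing: it beats the default on exactly the dimension the traveler asked about.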


The confidence rating that had always been present became clickable, opening an overlay that explained exactly how the system arrived at its prediction. Users didn't need to trust a number. They needed to understand what the number meant.

The interface needed to constantly answer: what is happening right now?

[Figure: two iterations of the rebooking confirmation state, each with a "View" button. One reads "Rebooking is ready for confirmation. You will not have time to connect... A new flight was prepared for you." The other reads "You're now eligible for free rebooking. We've selected the best alternative for you... Select 'view' to confirm." Both show the itinerary: MIA (Miami) to DFW (Dallas), 27m late, arriving 12:20 pm, then a 30m connection to the DFW (Dallas) to SEA (Seattle) flight departing 12:50 pm from Gate B6, on time.]

As the system became more autonomous, a new problem surfaced. Users were losing track of where they were in the process. Was the flight already rebooked? Was the original flight still active? Was this a recommendation or a confirmation?


One user inferred that the "Ready for Confirmation" screen meant the rebooking had already happened. It hadn't. That single misread was the most consequential UX failure in testing — in a real scenario it could mean a traveler misses both flights.


I restructured the state hierarchy entirely. "Select view to confirm" moved to the largest type on the screen. The confirm button increased in size and filled with the primary color. I removed green from the status pill color coding — it had been appearing in moments of normalcy rather than moments of positive action, creating false reassurance. Every state in the flow got a single clear answer to the question a traveler was actually asking in that moment.

Tradeoffs

The most consequential call was how much autonomy to give the system. Full automation was technically possible to design toward — but V2 user testing pushed back hard. Users wanted to see options before confirming anything. Removing that step entirely felt like the system was acting without permission. I landed on semi-autonomous: the system prepares, the traveler confirms. That wasn't a compromise. It was the foundation of a trust ladder. Full automation is a future state the product earns through repeated successful recoveries, not something it starts with.


On coverage, the question was whether to surface same-airline options only or expand to alliance partners. Show too many cross-airline alternatives and you dilute confidence in the recommendation. Show none and users question whether the system actually has access to meaningful inventory. V2 prioritizes same-airline rebooking because it keeps the system's core promise intact — low-friction recovery with high reliability. Alliance partners come in V3, when expanding coverage meaningfully improves outcomes rather than just adding options.


The minimum connection time display created a specific problem. Users kept trying to verify the math mentally — deplane time plus walking time plus gate close — while already under stress. Making the breakdown expandable rather than always visible removed the mental load from the default state without hiding the information entirely. It's there when you want it. It doesn't demand attention when you don't.


The MCT label itself was a deliberate simplicity call. Spelling it out in full added clutter to a component that was already information-dense. I kept the acronym, trusting that users who expanded the section would understand it in context. Clarity at a glance, detail on demand.


What I Learned

The most important thing this project taught me had nothing to do with interaction design.


When V1 testing came back, my instinct was to fix the interface. Clearer labels, better hierarchy, smoother flows. That work was real and it needed to happen. But the deeper problem wasn't comprehension. It was that the product wasn't doing enough to matter. Users understood the disruption just fine. They just didn't need my help understanding it.


That distinction, between interface friction and product friction, is one I won't forget. Polishing the wrong product is still the wrong product.


The trust design work in V2 changed how I think about automation entirely. I went into it assuming the hard problem was technical: how does the system access airline infrastructure, how does it prepare a rebooking, how does it handle edge cases. Those were real challenges. But the harder problem turned out to be behavioral: what does a person need to see, and when, to feel safe handing control to a system during one of the most stressful moments of travel?


That question has no clean answer. It's calibrated through testing, through watching someone hesitate before hitting confirm, through noticing that a single misread state made a user think their flight was already changed when it wasn't. Trust is designed in details most people never consciously notice.


The last thing, and maybe the most practically useful, is that operational constraints are not the enemy of good product thinking. Learning that same-ticket IROPS (irregular operations) reissuance requires original ticketing authority didn't kill the concept. It clarified it. The product stopped being a layer on top of someone else's infrastructure and became a reason to use a different booking platform entirely. The constraint was the strategy.

V2 establishes the core recovery loop — monitor, prepare, confirm. But there's meaningful product surface left unexplored.


The delegation model is the most interesting frontier. Right now the system operates within a fixed confirmation requirement. A natural evolution is a preference-based autonomy dial — travelers who've confirmed enough recoveries to trust the system could opt into fuller automation for defined scenarios. Same airline, economy cabin, arrival within two hours of original. The system acts, then notifies. No confirmation required.
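A rough sketch of what that autonomy dial could look like as a policy check, using the example scope from the paragraph above (same airline, economy cabin, arrival within two hours of original). The class names, fields, and defaults are hypothetical, and nothing like this exists in V2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationPolicy:
    opted_in: bool = False                # default: always ask
    same_airline_only: bool = True
    allowed_cabin: str = "economy"
    max_arrival_shift_minutes: int = 120  # "within two hours of original"


@dataclass(frozen=True)
class Proposal:
    same_airline: bool
    cabin: str
    arrival_shift_minutes: int  # new arrival minus original arrival


def needs_confirmation(policy: DelegationPolicy, proposal: Proposal) -> bool:
    """True unless the traveler opted in AND the proposal is inside the scope."""
    if not policy.opted_in:
        return True
    within_scope = (
        (proposal.same_airline or not policy.same_airline_only)
        and proposal.cabin == policy.allowed_cabin
        and abs(proposal.arrival_shift_minutes) <= policy.max_arrival_shift_minutes
    )
    return not within_scope
```

Anything outside the delegated scope falls back to the confirmation step, so opting in widens what the system may do without ever removing the default.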


The onboarding experience also needs real design attention. Traveler preferences — seat preferences, airline loyalty, connection time comfort, cost tolerance — are what make the recovery agent personal rather than generic. Right now those preferences are assumed. Designing the moment a traveler actually sets them, and understands what they're delegating, is its own significant design problem.


If this shipped, here's what I'd measure:


Rebooking confirmation rate — what percentage of prepared recoveries do travelers actually confirm. A low rate signals a trust or relevance problem in the recommendation engine.


Preference match score — how closely confirmed rebookings align with stated traveler preferences. This is the metric that separates this product from airline auto-rebooking.


Time to resolution — from disruption detection to confirmed alternative, measured against the airline baseline of standing in a customer service line. This is the core value proof.


Autonomy adoption rate — over time, what percentage of users move toward higher delegation settings. This measures whether trust is actually being built through repeated use.
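Two of these metrics are simple enough to sketch directly. The example below assumes each prepared recovery is logged with a detection time and an optional confirmation time, both in minutes. The record shape and function names are hypothetical, and using the median for time to resolution is my choice rather than something the testing established.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass(frozen=True)
class RecoveryRecord:
    detected_at_min: int             # when the disruption was detected
    confirmed_at_min: Optional[int]  # None if the traveler never confirmed


def confirmation_rate(records: List[RecoveryRecord]) -> float:
    """Share of prepared recoveries that travelers actually confirmed."""
    if not records:
        return 0.0
    confirmed = sum(1 for r in records if r.confirmed_at_min is not None)
    return confirmed / len(records)


def median_time_to_resolution(records: List[RecoveryRecord]) -> Optional[float]:
    """Median minutes from detection to confirmation, over confirmed recoveries."""
    durations = [
        r.confirmed_at_min - r.detected_at_min
        for r in records
        if r.confirmed_at_min is not None
    ]
    return median(durations) if durations else None
```

Unconfirmed recoveries count against the confirmation rate but are excluded from time to resolution, so the two numbers need to be read together.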


*Further research into ARC accreditation rules revealed that same-ticket execution authority requires a declared disruption — a cancellation or a delay of 3 or more hours — with at least 2 hours remaining before departure. A next version would redesign around two execution models: same-ticket reissuance for itineraries booked through the platform when a qualifying disruption is declared (most likely cancellation), and a separate ticket model for all other scenarios, covering any itinerary regardless of booking source. The first is elegant and constrained. The second is broader and more universally useful. The long-term product likely needs both.
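The routing between the two execution models can be sketched as a single decision function. It encodes the rule as stated in the note above and nothing more: the parameter names are hypothetical, and which departure the two-hour window is measured against is left to the caller because the note does not specify it.

```python
from enum import Enum


class ExecutionModel(Enum):
    SAME_TICKET_REISSUE = "same-ticket reissuance"
    SEPARATE_TICKET = "separate ticket"


QUALIFYING_DELAY_MINUTES = 180      # declared delay of 3 or more hours
MIN_MINUTES_BEFORE_DEPARTURE = 120  # at least 2 hours remaining


def choose_execution_model(
    booked_on_platform: bool,
    cancelled: bool,
    declared_delay_minutes: int,
    minutes_until_departure: int,
) -> ExecutionModel:
    """Pick same-ticket reissuance only when every stated condition holds."""
    qualifying = cancelled or declared_delay_minutes >= QUALIFYING_DELAY_MINUTES
    enough_time = minutes_until_departure >= MIN_MINUTES_BEFORE_DEPARTURE
    if booked_on_platform and qualifying and enough_time:
        return ExecutionModel.SAME_TICKET_REISSUE
    return ExecutionModel.SEPARATE_TICKET
```

Under this rule, the 27-minute delay used in the mockups would not qualify for same-ticket reissuance, which is why the separate-ticket model matters for the common tight-connection case.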