Hiring note · internal memo

The PM at Juno is not what you think.

Execution used to be the bottleneck. At Juno, taste is the bottleneck — the PM is the person who decides what "a trip that feels right for this family" actually means, in the places the Memory agent can't.

Most PMs were never bottlenecked by execution. They were bottlenecked by taste and judgment. Team capacity functioned as a governor that prevented bad ideas from shipping. Remove that governor and you discover who was driving and who was just steering.

What we're looking for
Role

Product Manager — First Trip

We hire operators. On day one, this PM prototypes the onboarding flow in Claude Code in an afternoon — not a Figma, not a PRD, a clickable v0 the team reacts to. They write their own evals in Braintrust, read a LangSmith trace without asking for help, and move fluently across the eight layers from foundation models to strategy. They are craftspeople with model uncertainty: they can write the sentence "Juno is 90% sure this villa fits all nine of you — here's what we couldn't verify" and know why that sentence builds more trust than a confident lie. Above all, they have taste — the judgment to decide what "a trip that feels right for this family" means when the agent can't, in the moments that earn the second booking.

One outcome they own
Everything lands here.
% of Juno bookings that repeat within 18 months.
Target 65%
First Trip is where Juno has no memory yet — the hardest problem Juno has. If trip one doesn't earn trip two, Juno has no moat.
A week in the life

Five days. Prototype → eval → ship → talk to humans. Repeat.

Monday
Prototype.

Builds a new first-trip onboarding flow in Claude Code before standup. Clickable v0 live by 11am.

Tuesday
Evals.

Writes 24 evals against last month's first-trip failure logs. Each one becomes a regression test the Memory agent has to pass.

Wednesday
Ship.

Ships the new onboarding to 10% of traffic. No sprint ceremony. Posts a 90-second Loom walking through the change + eval deltas.

Thursday
Read the trace.

Reviews overnight traces in LangSmith. Spots that Matching is under-weighting accessibility on first trips. Writes the fix themselves. Engineer cleans up before lunch.

the bit evals can't give you
Friday
Talk to Sarah.

Calls three Sarah-type users who hit the failure mode. Rewrites the uncertainty copy live. Ships by 4pm.

Habits

Five rituals this role refuses.

PRDs
no.
Prototype in Claude Code instead. A clickable v0 kills a 12-page spec every time. Evidence precedes documentation.
Quarterly roadmap decks
no.
Stakeholder theatre. Ship the thing, then show it. A Loom of the feature running beats a deck about the feature coming.
Waiting for eng capacity
no.
At Juno, capacity is infinite — the PM is the bottleneck. If a decision can't wait, it doesn't; the PM ships it themselves and an engineer cleans up.
Handoff drift
no.
The PM builds in the codebase. No PRD → Figma → Jira → ticket relay. The idea and the artefact share a git history.
Stakeholder-management theatre
no.
One Loom beats four status meetings. If an exec wants a demo, they get a link, not a calendar invite.
The stack

What a Juno PM lives in.

PROTOTYPING
Claude Code
Cursor
UI / DEMOS
v0
Bolt
EVALS
Braintrust
OBSERVABILITY
LangSmith
ISSUES
Linear
(not sprints)
RESEARCH
Loom + phone
THINKING PARTNER
Claude / ChatGPT
NOT ON THIS LIST
Jira.
Confluence.
Quarterly decks.
One agent-PM day · end to end

Prototype at 8:45am. Shipped by noon. Fixed by 3:30pm. Called a user by 4:30.

OVERNIGHT
Pricing agent shipped a 3% markup experiment on first-trip bundles.
8:45am
PM opens Braintrust.
Spots regression on "family paying in a second currency."
9:30am
Writes four new evals capturing the failure.
Multi-currency · missing billing country · mixed-family payers.
11:00am
Prototypes a fix in Cursor.
First-trip checkout asks one clarifying question. Matching agent rescores in real time.
12:15pm
Fix passes evals. Shipped to 10% of first-trip traffic.
2:00pm
Re-runs traffic on the original markup experiment with the fix in place.
3:30pm
LangSmith confirms regression gone.
90-second Loom posted to #product.
4:30pm
Calls a user who hit the failure yesterday.
Asks what the checkout felt like now. That's the bit no eval can give.
That's taste.