Hiring note · internal memo

The PM at Juno is not what you think.

Execution used to be the bottleneck. At Juno, taste is the bottleneck — the PM is the person who decides what "a trip that feels right for this family" actually means, in the places the Memory agent can't.

Most PMs were never bottlenecked by execution. They were bottlenecked by taste and judgment. Team capacity functioned as a governor that prevented bad ideas from shipping. Remove that governor and you discover who was driving and who was just steering.

— Gemini, Head of Product

What we're looking for

the hardest role at Juno

Role

Product Manager — First Trip

We hire operators. On day one, this PM prototypes the onboarding flow in Claude Code in an afternoon — not a Figma, not a PRD, a clickable v0 the team reacts to. They write their own evals in Braintrust, read a LangSmith trace without asking for help, and move fluently across the eight layers from foundation models to strategy. They are craftspeople with model uncertainty: they can write the sentence "Juno is 90% sure this villa fits all nine of you — here's what we couldn't verify" and know why that sentence builds more trust than a confident lie. Above all, they have taste — the judgment to decide what "a trip that feels right for this family" means when the agent can't, in the moments that earn the second booking.

One outcome they own

Everything lands here.

% of Juno bookings that repeat within 18 months.

Target 65%

First Trip is where Juno has no memory yet — the hardest problem Juno has. If trip one doesn't earn trip two, Juno has no moat.

A week in the life

Five days. Prototype → eval → ship → talk to humans. Repeat.

Monday

Prototype.

Builds a new first-trip onboarding flow in Claude Code before standup. Clickable v0 live by 11am.

Tuesday

Evals.

Writes 24 evals against last month's first-trip failure logs. Each one becomes a regression test the Memory agent has to pass.

Wednesday

Ship.

Ships the new onboarding to 10% of traffic. No sprint ceremony. Posts a 90-second Loom walking through the change + eval deltas.

Thursday

Read the trace.

Reviews overnight traces in LangSmith. Spots that Matching is under-weighting accessibility on first trips. Writes the fix themselves. Engineer cleans up before lunch.

the bit evals can't give you

Friday

Talk to Sarah.

Calls three Sarah-type users who hit the failure mode. Rewrites the uncertainty copy live. Ships by 4pm.

Habits

Five rituals this role refuses.

PRDs

no.

Prototype in Claude Code instead. A clickable v0 kills a 12-page spec every time. Evidence precedes documentation.

Quarterly roadmap decks

no.

Stakeholder theatre. Ship the thing, then show it. A Loom of the feature running beats a deck about the feature coming.

Waiting for eng capacity

no.

At Juno, capacity is infinite — the PM is the bottleneck. If a decision can't wait, it doesn't; the PM ships it themselves and an engineer cleans up.

Handoff drift

no.

The PM builds in the codebase. No PRD → Figma → Jira → ticket relay. The idea and the artefact share a git history.

Stakeholder-management theatre

no.

One Loom beats four status meetings. If an exec wants a demo, they get a link, not a calendar invite.

The stack

What a Juno PM lives in.

PROTOTYPING

Claude Code

Cursor

UI / DEMOS

v0

Bolt

EVALS

Braintrust

OBSERVABILITY

LangSmith

ISSUES

Linear

(not sprints)

RESEARCH

Loom + phone

THINKING PARTNER

Claude / ChatGPT

NOT ON THIS LIST

Jira.

Confluence.

Quarterly decks.

One agent-PM day · end to end

Prototype at 8:45am. Shipped by noon. Fixed by 3:30pm. Called a user by 4:30.

OVERNIGHT

Pricing agent shipped a 3% markup experiment on first-trip bundles.

8:45am

PM opens Braintrust.

Spots regression on "family paying in a second currency."

9:30am

Writes four new evals capturing the failure.

Multi-currency · missing billing country · mixed-family payers.

11:00am

Prototypes a fix in Cursor.

First-trip checkout asks one clarifying question. Matching agent rescores in real time.

12:15pm

Fix passes evals. Shipped to 10% of first-trip traffic.

2:00pm

Re-runs traffic on the original markup experiment with the fix in place.

3:30pm

LangSmith confirms regression gone.

90-second Loom posted to #product.

4:30pm

Calls a user who hit the failure yesterday.

Asks what the checkout felt like now. That's the bit no eval can give.
That's taste.