Productside Webinar

How PMs 10× Their Role with AI: Part 2

Building Smarter — Live Demo

Date:

11/13/2025

Time EST:

2:00 pm
Watch Now

Stop spending weeks validating ideas and months building prototypes that miss the mark. In this 60-minute live demo, Dean Peters and Tom Evans show you four practical AI techniques that compress the build-measure-learn cycle from weeks to days. You’ll see live demonstrations of context engineering, synthetic evals, agentic workflow automation, and vibe coding. No theory, just working examples you can replicate Monday morning. 

What You Will Learn: 

  • Context Engineering: Build AI workspaces with persistent domain knowledge. Stop explaining the same context every session. Work as a team instead of in individual chatbot silos. Get outputs that actually understand your product, your users, your constraints. 
  • Synthetic Evals: Generate Acceptance Criteria and Test cases, then automate quality checks before you have real user data. See where your product breaks before building. 
  • Agentic Workflows: Automate repetitive research and draft sprint backlogs. Get competitive intelligence and user insights gathered automatically. Your backlog writes itself while you think strategically. 
  • Vibe Coding: Build touch-and-feel, proof-of-life prototypes in minutes instead of days using Gemini and Claude Code. Turn concepts into clickable demos fast enough to test and iterate in the same day. 

Welcome & Introductions

Tom Evans | 00:00–00:32
Nice. All right. Well, hello everyone. Welcome to today’s webinar. Good morning or afternoon or evening depending on which part of the world you are in. We’re going to wait just a few minutes here to let some more people join us in the webinar before we move on into the content.

Tom Evans | 00:32–00:55
But in the meantime, we’d love to know a little bit about you. So our question that we always ask — and I think many others ask — is: tell us where you’re from. We’re just trying to get to know what part of the world we have participants joining us from. So, if you would, just type that into the chat window.

Dean Peters | 00:55–01:12
Oxford. Cool. Yeah, we get consistent attendance from the UK. That is awesome.

Tom Evans | 01:12–01:25
Yeah, I know, right? Well, I think it’s getting near the evening. Trying to get that last hour in. Got Iowa.

Dean Peters | 01:25–01:40
Oh man, how cold is it there? Got Indianapolis. Indianapolis. Toronto. That’s probably a little colder. Yeah, I would imagine it is. You get that nice little lake effect.

Audience Check-In & Northern Lights Conversation

Dean Peters | 01:40–02:10
Oh, here, I’ve got another question while we’re waiting. Was anyone who’s farther up north able to see the Aurora Borealis the last night or two? Because down in Austin, some folks said they were able to see it outside the city, but within Austin I wasn’t able to see anything.

Dean Peters | 02:10–02:26
But who was able to see that? Don’t know. Kevin’s giving us greetings from the UK. He was with us in some of the Vodafone episodes we’ve had.

Tom Evans | 02:26–02:42
Saw it in the India area. Excellent. 60° in Iowa. Dang, it’s colder here in Raleigh.

Dean Peters | 02:42–03:05
Sushil. I think I know Sushil. I think I know where… I think… and all the peeps there in Indy from the Arimo days.

Tom Evans | 03:05–03:30
Oh, wow. Yeah, it’s raining in the UK. Raining in the UK — who ever heard of that? I’ve actually been there on a very beautiful, sunshiny day. I was there this time last year and caught their first snow when I was in London.

Renovation Stories & Setting Up the Webinar Theme

Tom Evans | 03:30–04:00
Okay, if you’re just joining us, let us know where you’re calling in from. We want to know what part of the world — and also if you were able to see the northern lights the last couple of nights.

Dean Peters | 04:00–04:22
It pushed its way through fairly far south — further south than usual. I’d love to see who was able to see the northern lights too, because I’ll be envious of you.

Tom Evans | 04:22–04:40
Lenor is visiting us from beautiful San George. Peters, don’t listen to that guy. He’s in Los Alamos. Sil, my man. We’ve got to get out to Indy. Maybe catch one of those football games.

Dean Peters | 04:40–05:05
Who’s Joe Galy? Joey. I’ve never heard of that guy. They let anyone in here.

Tom Evans | 05:05–06:00
Let me transition us a little bit. We’re going to get into renovation stories later because they tie directly into how we think about AI and workflow changes — but for now, let’s keep meeting everyone joining the room.

Context Engineering: Breaking Team Silos

Dean Peters | 06:00–06:28
All right, folks, let’s go ahead and shift gears. Today we’re going to get into what I think is one of the most under-leveraged, under-taught, but absolutely transformative pieces of using AI as a product manager — and that is context engineering.

Dean Peters | 06:28–06:52
If you were in Part 1, you heard me say this before: Context is the new code. If AI doesn’t understand your product, your domain, your users, and your constraints, then you’re basically prompting into the void. And every PM on your team is doing their own thing in their own chatbot silo.

Tom Evans | 06:52–07:11
And that’s the biggest gap we see with PM teams. Five PMs, ten PMs — everyone has an individual chat thread with AI, none of them share the same memory, none of them share the same definitions, and everyone gets different answers.

Dean Peters | 07:11–07:32
Exactly. So, what we want to show you here is how to build persistent AI workspaces — think of them like shared brains for your entire product team — so you stop re-explaining the basics every time you start a new session.

Dean Peters | 07:32–07:54
Context engineering is the foundation of everything else: better acceptance criteria, better prototyping, better research automation. If you don’t get this right, everything downstream is messy.

Demonstrating Notebook LM & Gemini Projects

Dean Peters | 07:54–08:15
So let me pull something up here. This is Notebook LM. If you’ve never used it, think of it as a place where you can store your product domain knowledge — docs, specs, customer interviews, journeys — and the AI remembers it session to session.

Dean Peters | 08:15–08:36
But what’s powerful is the Gemini Projects version because it lets you scale that memory across your entire team. Everyone is working on the same shared brain. Everyone pulls from the same domain context. Everyone’s outputs are consistent.

Tom Evans | 08:36–08:49
Which means if Dean generates acceptance criteria and I generate acceptance criteria, we’re not getting two completely different interpretations of the product anymore.

Dean Peters | 08:49–09:12
Exactly. For example, I can drop in our core personas, our constraints, our architecture doc, our roadmap themes, our user pains — and then ask Gemini to generate prototypes or user flows or PRDs, and it’s pulling from the same persistent context every time.
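As a rough illustration of the “shared brain” Dean describes, here is a minimal sketch of assembling a reusable context pack in Python. The file names and helper functions are hypothetical, and this is not the Notebook LM or Gemini Projects API; it simply shows the pattern of prepending the same persistent domain context to every request instead of re-explaining it in each chat.

```python
# Minimal sketch of a shared "context pack": the same domain files every PM's
# prompt starts from. File names and helpers are illustrative assumptions.
from pathlib import Path

CONTEXT_FILES = [            # hypothetical team artifacts
    "personas.md",
    "constraints.md",
    "architecture.md",
    "roadmap_themes.md",
    "user_pains.md",
]

def build_context_pack(folder: str = "product_context") -> str:
    """Concatenate the team's shared domain docs into one reusable preamble."""
    sections = []
    for name in CONTEXT_FILES:
        path = Path(folder) / name
        if path.exists():
            sections.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(sections)

def make_prompt(task: str, folder: str = "product_context") -> str:
    """Every request starts from the same persistent context, not a blank chat."""
    return f"{build_context_pack(folder)}\n\n# Task\n{task}"

# Example: ask for a prototype flow grounded in the shared context.
# print(make_prompt("Generate an onboarding prototype flow that addresses the top friction points."))
```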

Dean Peters | 09:12–09:34
This is the difference between “an AI model answering a question” and “an AI model that knows your product.” And that gap — that is where most PMs lose 80% of the potential value.

Dean Peters | 09:34–10:00
All right, so let me show you a quick demo. I’m going to switch screens. So here: imagine I tell the model, “Here’s our onboarding journey. Here are the friction points. Here are the error paths. Here’s our user research summary.” And then I ask it to generate a prototype flow.

[Screen demonstration occurs — dialogue continues as narration]

Dean Peters | 10:00–10:30
See what it’s doing? It’s stitching together insights from the entire knowledge base. No single prompt could have pulled that off. This is all context engineering — and we’re only scratching the surface.

Synthetic Evals: What Good Looks Like Before You Have Users

Dean Peters | 10:30–11:02
How would I start working once I had these broken down into these six steps here? How would I go ahead and start building an agent? How would I go ahead and start vibe coding a piece of software, especially if I don’t have evals to test and validate the steps?

Dean Peters | 11:02–11:27
Now, we’ve all been hearing a lot about evals these days. You go listen to Lenny’s podcast or something else and they’re all talking about it being some magical mystical thing. Folks, we’re here to tell you this is nothing more than evolved acceptance criteria.

Dean Peters | 11:27–11:47
And it’s a combination of what we used to do in acceptance criteria plus an observability play, which is why you’re starting to see so much data science language brought into this. But the bottom line is, it’s just describing what good looks like. It’s painting a picture of what good looks like.

Dean Peters | 11:47–12:12
And what’s really important here is that we need to understand that if we’re working with generative AI, it is nondeterministic. So instead of being black and white or on and off, we need to write our evaluations to basically land within a range of acceptability and a range of behaviors.

Tom Evans | 12:12–12:31
Dean, just real quick, just to re-emphasize that point. If I’m working with standard software and I write acceptance criteria, I know that if I do this input, this input, and this input, I’m always going to get this output, right?

Dean Peters | 12:31–12:44
Yep. Yep. Absolutely.

Tom Evans | 12:44–13:03
With AI, you can enter the same inputs and you’re going to get a variation on the output. And so this is really about: did it fall within an appropriate range?

Dean Peters | 13:03–13:25
Yep. A range of behaviors or a range of answers. Exactly — because you’re working with generative AI. Now of course, if we’re working with machine learning or other modalities, that’s a different story. But yes — generative AI behaves within ranges.

Dean Peters | 13:25–13:49
Here’s some language borrowed from our friends in data science. There’s the concept of “goldens,” which is basically a reference file of the expected outputs and the desired behaviors. For example, you might see a prompt and an expected response. That’s it — nothing mystical.

Dean Peters | 13:49–14:17
You’ll also see “traces.” Traces are logs of actual responses or behaviors — what came back and why. It’s our job as product managers to look at those traces and figure out which ones point to weak prompts or a weak model that needs attention. It helps us answer questions like:

Is our model good enough?

Are our prompts good enough?

Is our filtering good enough?

Is our data corpus good enough?
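As a concrete illustration of goldens and traces, here is a minimal sketch in Python. The field names are assumptions rather than a standard schema; the point is that the eval checks whether each logged response lands inside an acceptable range of behaviors rather than matching one exact output.

```python
# Minimal sketch: goldens hold expected behaviors, traces log what actually
# came back, and the eval flags which traces fall outside the acceptable range.
GOLDENS = [
    {
        "prompt": "Help me finish onboarding",
        "must_mention": ["next step", "profile"],   # desired behaviors
        "must_not_mention": ["pricing upsell"],     # out-of-bounds behavior
    },
]

TRACES = [  # normally pulled from production logs; hard-coded here for the sketch
    {"prompt": "Help me finish onboarding",
     "response": "Your next step is completing your profile."},
]

def evaluate(golden: dict, trace: dict) -> dict:
    """Check one trace against one golden and report what is missing or violated."""
    text = trace["response"].lower()
    missing = [m for m in golden["must_mention"] if m not in text]
    violations = [m for m in golden["must_not_mention"] if m in text]
    return {"passed": not missing and not violations,
            "missing": missing, "violations": violations}

for g, t in zip(GOLDENS, TRACES):
    print(evaluate(g, t))   # failing traces point at weak prompts, filtering, or model
```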

Dean Peters | 14:17–14:33
So by walking through these processes, we can help make a better product. But here’s the chicken-and-egg problem, don’t you think?

Dean Peters | 14:33–14:46
How do we create evals based on prompt responses if we don’t have any users yet?

Dean Peters | 14:46–15:02
So what I’m going to suggest is that we can actually synthesize responses here.

Synthetic Evals Continued: Turning “What Good Looks Like” Into Something Testable

Dean Peters | 15:02–15:25
So here’s the trick: if we don’t have real users yet, we can actually synthesize responses. And it’s not cheating. It’s simply generating the types of responses you expect users to give — good ones, bad ones, weird ones, edge cases — and using those to test whether your system behaves inside the acceptable range.

Dean Peters | 15:25–15:51
So think about it like this: you write user stories before you have users. You write acceptance criteria before the feature is built. Synthetic evals follow the same pattern — you define the conditions, the boundaries, the expected behaviors before the thing exists.

Dean Peters | 15:51–16:14
And by the way, this is why AI product management feels familiar: we’ve done versions of this for years. We just never had to do it with nondeterministic systems before. And nondeterministic systems require ranges, not absolutes.

Dean Peters | 16:14–16:37
Now let me show you an example. I can prompt the model: “Give me 30 examples of likely user responses for onboarding, including outliers, misinterpretations, incomplete answers, and one-word replies.” Instant synthetic dataset.

Dean Peters | 16:37–17:02
Then I can say: “Based on these, generate the eval criteria: functional, quality, safety, and user-level evaluations.” Boom. Now we have a way to test whether our prototype falls within the defined ranges.
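A minimal sketch of the two prompts Dean describes, assuming a generic `call_llm` helper wired to whatever approved model your team uses; the helper, the prompt wording, and the JSON shapes are assumptions for illustration, not a specific SDK.

```python
# Sketch: synthesize likely user responses, then derive eval criteria from them.
import json

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your approved model endpoint")

def synthesize_responses(journey: str, n: int = 30) -> list[dict]:
    prompt = (
        f"Give me {n} examples of likely user responses for {journey}, "
        "including outliers, misinterpretations, incomplete answers, and "
        "one-word replies. Return them as a JSON list of objects with a "
        "'response' field."
    )
    return json.loads(call_llm(prompt))

def derive_eval_criteria(examples: list[dict]) -> list[dict]:
    prompt = (
        "Based on these synthetic responses, generate eval criteria covering "
        "functional, quality, safety, and user-level checks. Return JSON.\n"
        + json.dumps(examples)
    )
    return json.loads(call_llm(prompt))
```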

Tom Evans | 17:02–17:19
And this is the part most PMs get hung up on — they think evals require real data, or weeks of collecting samples. Nope. Not anymore. You generate it.

Dean Peters | 17:19–17:36
Right. And once you have evals, you can turn them into agents. Let’s talk about that.

From Synthetic Evals to Agentic Workflows

Dean Peters | 17:36–17:54
So now that you’ve defined what good looks like, the next logical step is: “How do I automate the process of checking, analyzing, and using that good?” That’s where agentic workflows come in.

Dean Peters | 17:54–18:15
Agents are not magic. They are structured sequences of steps — your steps — that AI executes on your behalf. And if you give those agents your evals, they can test, verify, pull research, rewrite, compare, and even draft ticket backlogs.

Dean Peters | 18:15–18:35
This is why evals come first. If you don’t know what good looks like, your agents don’t know what to do. Evals tell agents how to behave.

Tom Evans | 18:35–18:49
And this is also where PMs get confused because they see tools like LangFlow, or n8n, or Gemini workflows and think: “This feels like engineering.” But it’s just your product thinking structured out loud.

Dean Peters | 18:49–19:11
Exactly. Let me show you the six-step breakdown I use almost every time. It looks like this:

1. Define the task.
2. Generate synthetic examples.
3. Derive eval criteria.
4. Convert evals into agent steps.
5. Run the workflow.
6. Review traces and adjust.
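Read as a pipeline, the six steps above can be sketched as a skeleton like this. Every function body is a stub standing in for your own tooling and the sample data is invented, so treat it as an outline of the loop rather than an implementation.

```python
# Skeleton of the six-step loop; each function is a stub to replace with real tooling.
def define_task() -> str:
    return "Summarize onboarding friction for the weekly PM review"

def generate_synthetic_examples() -> list[str]:
    return ["I'm stuck on step 2", "what profile?", "ok"]   # stand-in for model output

def derive_eval_criteria(examples: list[str]) -> list[dict]:
    return [{"must_mention": ["next step"], "must_not_mention": []}]

def convert_to_agent_steps(evals: list[dict]) -> list[str]:
    return ["draft the summary", "check the draft against the evals"]

def run_workflow(steps: list[str]) -> list[dict]:
    return [{"step": s, "output": "", "passed": False} for s in steps]  # stub "traces"

def review_traces(traces: list[dict]) -> list[dict]:
    return [t for t in traces if not t["passed"]]   # failures mean: update the evals

task = define_task()
examples = generate_synthetic_examples()
evals = derive_eval_criteria(examples)
steps = convert_to_agent_steps(evals)
traces = run_workflow(steps)
needs_attention = review_traces(traces)
```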

Dean Peters | 19:11–19:33
That’s it. It’s literally “write acceptance criteria → automate acceptance criteria.” And again, traces tell us where the system falls short. If something is failing, you update your evals, not your entire project.

Agentic Workflows: Automating Research, Drafting Backlogs, and Synthesizing Insights

Dean Peters | 19:33–19:57
All right, so let’s shift into a demo of agentic workflows. This is where the lightbulb really goes on for most PMs because you’ll see the grunt work disappear.

Dean Peters | 19:57–20:23
For example, competitive intelligence: instead of spending hours on Google, YouTube, Reddit, app reviews, LinkedIn, or wherever you look, you build an agent that goes through those sources in the background and returns structured comparisons.

Dean Peters | 20:23–20:47
Or user research synthesis: instead of manually summarizing 20 discovery calls, the agent pulls patterns, pain points, quotes, contradictions, and opportunities.
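As a sketch of the research-synthesis step Dean mentions, assuming discovery-call transcripts sit in a local folder and reusing the same placeholder `call_llm` helper as above (not a real SDK call); the output keys are assumptions, not a fixed schema.

```python
# Sketch: turn a folder of discovery-call transcripts into structured insights.
import json
from pathlib import Path

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your approved model endpoint")

def synthesize_research(transcript_dir: str = "discovery_calls") -> dict:
    """Summarize discovery calls into pain points, quotes, contradictions, opportunities."""
    calls = [p.read_text() for p in sorted(Path(transcript_dir).glob("*.txt"))]
    prompt = (
        "From the discovery calls below, return JSON with the keys "
        "pain_points, quotes, contradictions, and opportunities.\n\n"
        + "\n---\n".join(calls)
    )
    return json.loads(call_llm(prompt))
```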

Tom Evans | 20:47–21:02
And this is where PMs really start to see leverage. Because once the agent can pull insights for you, your backlog starts writing itself.

Dean Peters | 21:02–21:31
Exactly. Let me show you a workflow I built in VS Code with Claude Code plugged in. I generated a small “ProdBot.” I asked it to identify market size for AI chatbots. It made a live call. Then I asked it for stakeholder maps — another live call. This is all automated behind the scenes.

Dean Peters | 21:31–21:54
Then I said: “Give me more delight. I want magic.” And the agent rewrote the UI scaffolding. This is the future — PMs shaping tools, not waiting for engineering.

Vibe Coding: Proof-of-Life Prototypes in Minutes

Dean Peters | 21:54–22:17
All right — let’s go into vibe coding. This is going to be the most fun part of the session. Vibe coding is basically “give me a working prototype that communicates the idea without needing engineering.”

Dean Peters | 22:17–22:45
And it works beautifully when paired with evals and agents. Because vibe coding is about showing, not telling. Stakeholders don’t want a 25-page PRD. They want something they can click on — they want to feel the idea.

Dean Peters | 22:45–23:11
And the tools have gotten wild. Claude Code can scaffold an entire front-end. Gemini can generate an interactive simulation. VS Code with AI extensions can generate UI components, data mocks, and event handlers.

Dean Peters | 23:11–23:40
Let me show you what I built. I didn’t like the initial UI that GPT generated, so I pasted it into VS Code, used the integrated AI, and asked: “Make this delightful. Make this magical.” And it regenerated the layout, rearranged components, and created a little joy.

Dean Peters | 23:40–24:04
And the final step — I can hand this prototype to stakeholders. Or to a user. Or to engineering. Or to design. And everyone gets aligned instantly because it’s not a paragraph describing the feature — it’s the feature.

Tom Evans | 24:04–24:18
And by the way, the beauty of vibe coding is that it dramatically reduces misalignment. Everyone sees the same thing.

Dean Peters | 24:18–24:54
Exactly. All right, we could talk about this all day, but we’ve got to keep moving. Let’s move toward key takeaways before we hit the poll.

Poll #1 – What Information Do You Include in Requirements?

Tom Evans | 24:54–25:12
All right, folks, we’re going to launch a quick poll here. We want to get a sense for how you all think about requirements today. Specifically: What information do you typically include in your requirements documents?

Tom Evans | 25:12–25:33
This is always fascinating because every company, every team, every PM seems to have their own version — some super detailed, some really lightweight.

Dean Peters | 25:33–25:55
Yeah, and the reason we’re asking is because when you start using AI, especially when you build context libraries like we showed, your requirements shift dramatically. They become more structured and more reusable.

Tom Evans | 25:55–26:10
All right — poll is open. Go ahead and choose the answer that best describes what you usually include in your requirements. User stories, acceptance criteria, flows, constraints, design comps… whatever it is you usually include.

Dean Peters | 26:10–26:32
Yeah and we’ve got folks across the spectrum. So don’t overthink it — give us what you actually do today, not the ideal version.

Tom Evans | 26:32–26:52
Okay, we’re getting responses coming in pretty fast. We’ll give it just another moment here.

Dean Peters | 26:52–27:10
All right, Tom, what do the results look like?

Tom Evans | 27:10–27:32
Looks like the majority of you focus heavily on user stories and acceptance criteria — no surprise there. A good chunk of you also include flows, UI comps, constraints, or edge cases.

Dean Peters | 27:32–27:54
That tracks. And the reason this question matters is because everything you put in requirements today becomes input for your persistent context workspace tomorrow. That’s the shift — requirements become memory for the whole team.

Q&A and Closing Remarks

Tom Evans | 27:54–28:08
All right, we’re going to shift into Q&A. We’ve had a ton of questions come in, so we’ll get through as many as we can.

Audience Question | 28:08–28:12
“How would you approach this if your company restricts AI tool usage?”

Dean Peters | 28:12–28:38
Fantastic question. First, always align with your company’s policy — never put proprietary info into an unapproved tool. That said, you can still practice context engineering and evals locally using approved environments. Build the structure internally so that when the door opens, your team is ready.

Tom Evans | 28:38–28:55
Yeah, and the reality is: most companies will eventually approve controlled AI usage. The risk isn’t “using AI too early,” it’s your team falling behind by not learning the workflows.

Audience Question | 28:55–28:59
“How do we keep the team aligned when multiple PMs work with AI?”

Dean Peters | 28:59–29:25
Great. That’s exactly why we emphasize persistent context. You don’t want five PMs each with their own version of the product model. You want one shared memory — a single substrate — so every PM’s work reinforces everyone else’s.

Tom Evans | 29:25–29:40
Otherwise you’re basically multiplying chaos. AI makes alignment easier, not harder, when you use shared spaces.

Audience Question | 29:40–29:44
“How do you know when your evals are good enough?”

Dean Peters | 29:44–30:10
When they help you catch drift early. If your evaluations can flag weak outputs before they hit customers, they’re working. Don’t wait for perfect — build evals that cover the fundamentals, then expand as the system grows.

Audience Question | 30:10–30:14
“How do you get engineering bought in?”

Tom Evans | 30:14–30:36
By showing them prototypes and agent workflows that actually help them. When developers see that your requirements are clearer, your traces are structured, your evals are automated, and your prototypes are clickable — they shift from skeptical to grateful.

Dean Peters | 30:36–30:54
Yeah, engineers love clarity. When your AI workflows reduce ambiguity, you’re not taking power away from engineering — you’re giving them better inputs.

Audience Question | 30:54–30:59
“What’s the biggest mistake PMs make with AI?”

Dean Peters | 30:59–31:26
Treating AI like a one-off chatbot instead of a team member. If every PM works separately in siloed chats, you lose 70% of the value. The shift is building shared memory, not one-off conversations.

Tom Evans | 31:26–31:44
All right, folks — we’re at time here. I want to thank all of you for spending this hour with us. This is Part 2 of the series, and we’re excited to keep showing you the practical side of AI for product managers.

Dean Peters | 31:44–32:10
Yeah, this stuff is fun and transformative. Start small: build a context workspace, define a few synthetic evals, create your first agent, and vibe code something rough. You’ll be shocked at how quickly this compounds.

Tom Evans | 32:10–32:28
We’ll send out the recording and the templates, so keep an eye out for that.

Dean Peters | 32:28–32:45
Thank you again. Have a great rest of your day, and we hope to see you at Part 3 of the series.

Tom Evans | 32:45–33:00
Take care everyone.

Webinar Panelists

Dean Peters

Dean Peters, a visionary product leader and Agile mentor, blends AI expertise with storytelling to turn complex tech into clear, actionable product strategy.

Tom Evans

Tom Evans, Senior Principal Consultant at Productside, helps global teams build winning products through proven strategy and practical expertise.

Webinar Q&A

To 10× your role as a Product Manager with AI means dramatically reducing the time required for validation, research, prototyping, and stakeholder alignment. In this webinar, Dean Peters and Tom Evans demonstrate how AI compresses weeks of PM work into days using context engineering, synthetic evals, agentic workflows, and vibe coding — allowing PMs to learn earlier, validate smarter, and deliver higher-quality solutions faster than traditional workflows.
Context engineering helps Product Managers build persistent AI workspaces that “remember” your product, your users, and your domain constraints. Instead of re-explaining the same information in every chat session, PMs upload artifacts (PRDs, research, frameworks, logs, transcripts) so the AI produces 10× more accurate, domain-aligned outputs. This shared context also breaks PM teams out of isolated chatbot silos and aligns everyone on the same source of truth.
Synthetic evals are AI-generated Acceptance Criteria and Test Cases used to predict and validate product behavior before real users or real data exist. They let PMs catch product drift, identify quality issues, and run scenario testing long before engineering builds anything. This shifts evaluation from reactive to proactive and gives PMs early insight into whether their product direction is feasible, safe, and aligned with user needs.
Agentic workflows use multi-step AI agents to automate the most time-consuming PM tasks—competitive analysis, user research synthesis, backlog drafting, and ongoing insight collection. Instead of manually pulling data, agents run in the background, gather insights, and assemble structured outputs (backlogs, reports, recommendations). PMs shift from “doing the grunt work” to reviewing, refining, and making strategic decisions, massively increasing their leverage.
Vibe coding is an AI-assisted method for generating touch-and-feel, proof-of-life prototypes in minutes using tools like Gemini and Claude Code. PMs provide a prompt and context, and AI instantly generates HTML/JS prototypes with editable components. Instead of waiting weeks for engineering support, PMs can visualize concepts the same day, gather feedback early, and iterate quickly—dramatically compressing the idea-to-prototype cycle.