A framework for building generative user experiences
Inputs, outputs, and strategic considerations
I’ve been thinking about generative UX—generative user experiences—ever since this Stratechery article articulated the idea of generative UI:
Once again, however, the real enabler will be AI. In the smartphone era, user interfaces started out being pixel perfect, and have gradually evolved into being declarative interfaces that scale to different device sizes. AI, however, will enable generative UI, where you are only presented with the appropriate UI to accomplish the specific task at hand. This will be somewhat useful on phones, and much more compelling on something like a smartwatch; instead of having to craft an interface for a tiny screen, generative UIs will surface exactly what you need when you need it, and nothing else.
Nielsen Norman Group has written about the topic as well, defining generative UI as “a user interface that is dynamically generated in real time by artificial intelligence to provide an experience customized to fit the user’s needs and context.”
For reasons I’ll articulate below, I think the strategic questions around this topic are broader than just the user interface, per se, so rather than “generative UI” I’m going to talk about “generative UX,” which I’ll define as this:
Generative UX refers to bespoke user experiences dynamically generated at the time of use, for each individual user, based on a comprehensive understanding of the relevant context.
One key distinction of generative vs. traditional deterministic user experiences is that generative experiences are generated on-the-fly, resulting in a new-to-the-world user experience for each user that wasn’t predefined by a designer. User experiences that dynamically adapt are not uncommon, but if they’re just repackaging existing designs then they’re not what I mean by generative UX. To be crystal clear, here are some examples of user experiences that are AI-informed but are not what I mean by generative UX [1]:
When the iPhone Wallet app shows a notification for a boarding pass to a flight that’s about to depart. While this is a great use of contextual factors (like location or time of day), it’s not a bespoke experience for a given user because anyone with that same flight will receive the same predetermined notification at the same time. Therefore, it’s not generative UX.
When Microsoft Word surfaces a different context menu for different styles of text. These menus may be streamlined to eliminate irrelevant options, but the contents of the menu are not generated on-the-fly, nor are they bespoke to any individual user, and as such don’t qualify as generative UX.
Now, I’m not suggesting that generative user experiences are some distant-future possibility. In fact, there are already several examples of generative user experiences in the most popular apps, particularly with respect to the content that is shown to users. Algorithmically-driven consumer apps like TikTok, YouTube, or Instagram have been delivering bespoke content feeds for years, and we’re not far from a future where the content in these feeds is itself dynamically generated, as dystopian as that may seem. So we’ve already crossed the Rubicon of generative UX when it comes to content—but what about other aspects of the experience?
The generative UX stack
To make sense of generative user experiences, it helps to think of them as a stack of five layers that combine to deliver a bespoke user experience to each user. At the top are three layers—content, chrome, and command—that constitute the parts of the user experience with which users directly engage. Beneath them lies context, a pivotal layer that influences exactly how those top layers manifest for each user. And at the foundation is the core, the default UX before any context-driven adjustments are made.
There’s a lot to unpack with this, so I’ll start with a quick overview of each layer, followed by three examples of generative user experiences. These examples are where the concept really comes to life—you’ll see when it makes sense to apply these ideas to the user experience and when it’s better left unchanged.
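(If you like to think in code, here’s one minimal way the stack could be modeled. This is purely an illustrative sketch of mine, written in TypeScript; every type and field name below is an assumption rather than any real API.)

```typescript
// Hypothetical sketch only: these types are illustrative, not a real API.

// The three layers users directly engage with.
interface Experience {
  content: unknown;  // what the user came for (music, maps, spreadsheets)
  chrome: unknown;   // the interactive components surfaced around it
  command: unknown;  // the input/output modalities currently in play
}

// Context: the signals that shape how the top three layers manifest
// (broken into six dimensions later in the article).
type Context = Record<string, unknown>;

// Core: the default experience before any context-driven adjustment.
const core: Experience = {
  content: "default feed",
  chrome: "default navigation and controls",
  command: "touch input on a screen",
};

// A generative UX system maps (core, context) to a bespoke experience,
// regenerated at the time of use for each individual user.
type GenerativeUx = (core: Experience, context: Context) => Experience;
```

The key point is the shape of that last function: a default core plus rich context goes in, and a bespoke experience comes out, re-evaluated at the time of use.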
Content
Content is the what and why of the app, the reason you’re using the app in the first place. It’s the music and podcasts in Spotify, the maps in Google Maps, and the spreadsheets in Excel. And as we’ve already seen, it’s the layer of the experience most amenable to generative UX. Beyond algorithmically-driven feeds, we see this starting to show up in productivity apps, where Google Sheets proactively proposes formulas based on your data and almost every document app can help you generate text on the fly.
Chrome
Chrome is how the app is presented, specifically the things we interact with to engage with content. These are all the different interactive components that make up a design system [2], things like navigation bars, buttons, text boxes, sliders, calendar widgets, etc. Chrome is trickier to change via generative UX for a variety of reasons (see “The importance of control,” below), but there are glimpses of this in Spotify’s home screen, pictured below:
Towards the top of the screen there is a bespoke set of eight quick-link buttons, which will be different for each user. Granted, each button is not bespoke (other people who obsessively listen to the Tron soundtrack will see the same button), but the set of buttons is bespoke and generated at the time of use. I’m classifying this as Chrome rather than Content because these are interactive elements rather than pure content.
Command
Command refers to how users interact with an app’s Chrome and Content. It includes both input methods, such as voice, gestures, taps, and clicks, and output modalities, such as audio, screen size, screen brightness, and cursor speed. Since the contextual factors that influence this layer are more likely to be stable rather than fleeting (for example, the preferences of people with disabilities or particular accessibility needs are unlikely to change from one minute to the next), this layer is moderately amenable to generative UX adaptations.
As with Chrome, examples of generative UX at the Command layer are still rare. But a proto-example of this is how voice assistants like Alexa or Google Assistant can adjust themselves to account for the user’s tone of voice and pronunciation. Since this is a one-time calibration rather than a dynamic, continuous adjustment, it doesn’t quite meet my definition of truly generative UX, but it’s easy to see how it could become such.
Context
Products have been incorporating context into their user experiences for a while, and the idea for such inputs stretches back to a seminal paper [3] by a trio of Xerox PARC researchers, who described context-aware computing as systems that adapt “according to the location of use, the collection of nearby people, hosts, and accessible devices.”
Since that seminal paper in the 1990s, computing systems have gotten ever-more contextually aware—modern phones have a wide array of sensors to measure ambient light, physical movement, location, and even barometric pressure. But the dynamism enabled by generative UI unleashes a ton of possibilities for leveraging these sensors to improve the user experience, so I thought it’d be instructive to understand the different dimensions of context that might be incorporated into generative UI experiences.
There are several taxonomies [4] breaking down different dimensions of context [5], but I’ve settled on these six dimensions:
User context: Personal characteristics, including stable traits of our baseline personality (e.g. preferences, abilities) and dynamic states that vary across situations (e.g. mood, fatigue, focus) [6]
Task context: Details of the activity being performed, including the implicit goals of the task and the sequence of steps to be accomplished to reach said goals
Temporal context: The timeframe of the user experience and any time-related factors such as time of day, task duration, and task urgency
Environmental context: Physical surroundings and ambient conditions, such as location, weather, lighting, and noise
Social context: Who the user is currently surrounded by or communicating with, as well as any relevant group dynamics or social norms
Technology context: The device(s) being used and their input/output capabilities, battery level, and quality of connectivity
These dimensions of context of course don’t live in isolation, but I’ve found it helpful to break them out as a reminder that there’s much more to context than obvious factors such as our physical surroundings or how we’re feeling in the moment. And in the examples that I’ll walk through below, it’s often the combination of these contexts that yields the most compelling use cases for generative UX.
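To make those dimensions concrete, here’s a sketch of how they might be captured as a single context object. Again, this is TypeScript purely for illustration; every field is an assumption of mine, not a signal any platform actually exposes today.

```typescript
// Illustrative only: no platform exposes a context object like this today.
interface Context {
  user: {
    traits: { preferences: string[]; abilities: string[] };      // stable
    state: { mood?: string; fatigue?: number; focus?: number };  // fleeting
  };
  task: { goal: string; steps: string[]; currentStep: number };
  temporal: { localTime: Date; urgency: "low" | "medium" | "high" };
  environmental: {
    location?: string;
    weather?: string;
    ambientLightLux?: number;
    noiseDb?: number;
  };
  social: { nearbyPeople: string[]; activeConversation?: string };
  technology: {
    device: string;
    inputs: string[];
    outputs: string[];
    batteryPct: number;
    connectivity: "offline" | "poor" | "good";
  };
}
```

Most apps already hold fragments of an object like this; the strategic question (more on this below) is who gets to assemble the whole thing.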
Core
The Core layer is simply the default user experience prior to context-driven adjustments. As mentioned above, the core user experience could be adaptive and dynamic even without being bespoke and generated at the time of use. Recall the iPhone Wallet notification for boarding passes—this dynamic user experience would become generative if, for example, the boarding pass notification used custom text to tell me when I should arrive at the airport, accounting for security wait times or my known walking speed.
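As a toy illustration of what that generative version would involve, the notification copy could be derived from a few contextual inputs. The signal names and numbers below are invented; no such Wallet API exists.

```typescript
// Toy sketch: compute a personalized "arrive at the airport by" message.
// All inputs are hypothetical signals, not real Wallet or iOS APIs.
function arrivalRecommendation(
  departure: Date,
  securityWaitMin: number,   // e.g. from airport wait-time data
  walkToGateMin: number,     // derived from the user's known walking speed
  boardingBufferMin = 30     // slack before boarding starts
): string {
  const arriveBy = new Date(
    departure.getTime() -
      (boardingBufferMin + securityWaitMin + walkToGateMin) * 60_000
  );
  return `Arrive at the airport by ${arriveBy.toLocaleTimeString()} ` +
    `(security is running ~${securityWaitMin} min).`;
}

// Example: a 6:00 pm departure, 25-minute security line, 12-minute walk.
console.log(arrivalRecommendation(new Date("2025-06-01T18:00:00"), 25, 12));
```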
Three examples
Okay, enough theory—let’s get into some examples! I’ll explore three popular apps across a range of personal and work-related use cases, imagining how generative UX could improve them by contrasting the current experience with a generative version. And—naturally—I used ChatGPT (with some very pointed prompts) to help brainstorm these ideas!
Example #1: Spotify
Let’s imagine a situation where a user is working on their computer late at night with an urgent deadline. Spotify knows the current time, of course, but also that the user has an urgent deadline because the operating system has visibility across apps (a permission granted earlier by the user). Based on that context, as well as Spotify’s knowledge of the user in similar situations, it infers that the job-to-be-done (JTBD) the user is hiring Spotify for is to obtain maximal focus.
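Here’s a rough sketch of how that inference might work under the hood. Every signal name and threshold is invented for illustration; none of these are real Spotify or OS APIs.

```typescript
// Hypothetical sketch: inferring the "maximal focus" job-to-be-done.
interface FocusSignals {
  localHour: number;                    // e.g. 23 for 11 pm
  hoursToNearestDeadline: number;       // from an OS-level calendar signal
  activeAppCategory: "productivity" | "entertainment" | "other";
  pastFocusSessionsAtThisHour: number;  // prior behavior in similar context
}

function wantsFocusExperience(s: FocusSignals): boolean {
  const lateNight = s.localHour >= 21 || s.localHour < 5;
  const deadlinePressure = s.hoursToNearestDeadline <= 12;
  const working = s.activeAppCategory === "productivity";
  return (lateNight && deadlinePressure && working) ||
    s.pastFocusSessionsAtThisHour > 3;
}

// If this returns true, Spotify might generate an instrumental, low-tempo
// queue (Content), strip the home screen down to a single "Focus" surface
// (Chrome), and hold non-urgent notifications (Command) until the deadline.
```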
Example #2: Microsoft Excel
In this example, our user is a structural engineer working at a construction site under bright sunlight. They’re manually entering data from a temporary strain gauge sensor into an Excel spreadsheet on an iPad to ensure the structural integrity of the building is within safety limits. Excel knows that the user is at the construction site, uses the iPad’s ambient light sensor to detect bright sunlight, and knows that the user has already created a spreadsheet for a task that, it reasons, is related to this construction site.
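And here’s one way the Chrome and Command layers might respond to those signals. As before, the signal names, thresholds, and settings are illustrative assumptions of mine, not real Excel or iPadOS capabilities.

```typescript
// Hypothetical sketch: adapting Excel's chrome and command layers on site.
interface SiteSignals {
  ambientLightLux: number;   // from the iPad's ambient light sensor
  onKnownWorksite: boolean;  // from location plus project metadata
  gloveLikelyWorn: boolean;  // inferred, say, from touch precision
}

interface SheetPresentation {
  theme: "standard" | "high-contrast";
  fontScale: number;         // multiplier on default cell text size
  tapTargetScale: number;    // larger controls for imprecise input
  voiceEntryEnabled: boolean;
}

function adaptForWorksite(s: SiteSignals): SheetPresentation {
  const brightSun = s.ambientLightLux > 10_000; // roughly full daylight
  return {
    theme: brightSun ? "high-contrast" : "standard",
    fontScale: brightSun ? 1.4 : 1.0,
    tapTargetScale: s.gloveLikelyWorn ? 1.5 : 1.0,
    voiceEntryEnabled: s.onKnownWorksite && s.gloveLikelyWorn,
  };
}
```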
Example #3: Google Maps (with AirPods)
Imagine a user cycling—for the first time—through the hilly terrain of Marin County, where the scenery is amazing but the mobile connectivity is spotty. The user wants clear navigation guidance and to be able to anticipate changes in elevation, so they’ve mounted their iPhone to their bike and are wearing a future version of AirPods (equipped with hypothetical haptic feedback capabilities [7]). Google Maps understands the user’s location, their speed and other aspects of their movement, their route and its terrain, and traffic conditions.
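Here’s a sketch of how route and connectivity context might drive that hypothetical haptic channel. The AirPods haptic “API” is entirely imagined, as is everything else in the snippet.

```typescript
// Hypothetical sketch: turning upcoming elevation changes into haptic cues.
// AirPods expose nothing like this today; the signal names are invented.
type HapticPattern = "none" | "single-pulse" | "double-pulse" | "long-buzz";

interface RouteContext {
  gradePctNext500m: number;  // + uphill, - downhill, from route elevation data
  connectivity: "offline" | "poor" | "good";
  speedKmh: number;
}

function elevationCue(r: RouteContext): HapticPattern {
  if (r.gradePctNext500m > 8) return "long-buzz";     // steep climb ahead
  if (r.gradePctNext500m > 3) return "single-pulse";  // moderate climb
  if (r.gradePctNext500m < -8) return "double-pulse"; // steep descent
  return "none";
}

// With spotty connectivity, the route, elevation profile, and cue logic
// would need to be prefetched so guidance keeps working offline.
function shouldPrefetchRoute(r: RouteContext): boolean {
  return r.connectivity !== "good";
}
```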
The importance of control
“Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should.”
– Jeff Goldblum, as Ian Malcolm in Jurassic Park
One consistent theme in these examples is the tension between control and friction. Should designs proactively change in anticipation of user needs, with users able to opt out, or should users be asked to opt in before any changes are made? The former has the potential to decrease cognitive friction (“don’t make me think!”), but could come at the expense of user control. And vice versa for the latter.
I suspect that as generative UX becomes more and more possible, there’ll be an overreaction, and way too many things that should remain static will become dynamic. It’s exactly the kind of “could vs. should” trap that Ian Malcolm warns about in the epigraph above [8].
I think the synthesis here is to ensure that users always feel [9] like they’re in control. As Jakob Nielsen’s canonical third usability heuristic states, “Users often perform actions by mistake [and] need a clearly marked ‘emergency exit’ to leave the unwanted action without having to go through an extended process.”
The generative UX pendulum will swing back and forth, but over time user expectations will likely settle into a new equilibrium in which user experiences are much more dynamic—and much more helpful—than they’ve historically been.
Strategic considerations
As exciting as the potential of generative UX may be, its adoption hinges on overcoming some significant strategic challenges. As the examples above illustrate, generative UX depends on a rich understanding of context, but today not all context is readily available to all apps. To fully unlock the power of generative UX, we need to address three key challenges:
Overcoming data silos: Today, contextual data is fragmented across different apps, devices, and ecosystems, meaning different companies—with different incentives and business models—control different pieces of the context puzzle. Will Spotify be able to view my Google Calendar to infer that I need a bespoke ‘focus’ playlist? Finding ways to bridge these silos without introducing unnecessary complexity will be critical to unlocking the power of generative UX.
Low-friction permissions: Of course we could prompt users for permission to share data across apps every time it’s needed, but that will often be more trouble than it’s worth. Do I really want to click through three popups to link my Google Calendar to Spotify just to get a bespoke playlist? The challenge here is designing low-friction permission systems that preserve user control (that sacred third heuristic of Nielsen’s) and maintain users’ trust that their data is protected.
Privacy-preserving context abstractions: One way to thread the needle between those two challenges is OS-level context management, where platforms like iOS or Android provide high-level signals (e.g., “User has urgent deadline” or “Strong likelihood of desire for hands-free operation”) rather than exposing raw data to every app. A key challenge here is ensuring enough contextual detail is preserved to make the generative UX changes valuable. Providing such a layer of abstraction could be a way for OS providers to differentiate their platforms from rivals while preserving privacy.
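To make that last idea concrete, here’s a thought-experiment sketch of what such an OS-level context broker might look like. Nothing below corresponds to a real iOS or Android API.

```typescript
// Thought experiment: an OS-level context broker exposing only coarse,
// privacy-preserving signals, never the raw calendar, location, or sensor data.
type ContextSignal =
  | { kind: "urgent-deadline"; confidence: number }       // from calendar, on-device
  | { kind: "hands-free-preferred"; confidence: number }  // from motion/sensors
  | { kind: "low-distraction-mode"; confidence: number }; // from time + app usage

interface ContextBroker {
  // Apps declare which coarse signals they want; the OS mediates consent once,
  // rather than prompting for raw-data access app by app.
  subscribe(
    kinds: ContextSignal["kind"][],
    onSignal: (s: ContextSignal) => void
  ): void;
}

// Spotify (for example) never sees the calendar itself; it only receives a
// high-level hint that a focus experience is probably welcome right now.
```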
Toward contextual computing
I hope the generative UX stack framework presented here is helpful in guiding the creation of generative user experiences—I certainly found it helpful just in imagining the three examples shared above. My prompts for generating the examples (which I then edited quite a bit) were to first imagine various conditions within the six dimensions of context, then mash them up into specific scenarios, and finally articulate potential UX changes at the content, chrome, and command levels. I suspect that designers and engineers who apply this framework will imagine even more creative ideas than these!
But of course we would do well to remember that just because we can do something doesn’t mean we should. We need to ensure that generative user experiences add real value while being careful to preserve privacy and ensure users remain in control, lest we erode user trust. And like the integration of any new technology into user experiences, the process won’t be perfect—products will inevitably over-rotate in favor of doing too much, users will inevitably push back, and over time a new equilibrium will be reached. “Life,” as they say, “finds a way.”
Footnotes

[1] Just to be crystal clear, I’m also not talking about using tools like Cursor or Claude to generate deterministic UX designs.

[3] Schilit, Adams, and Want’s paper describes context as “where you are, who you are with, and what resources are nearby,” adding that this includes “lighting, noise level, network connectivity, communication costs, communication bandwidth, and even the social situation; e.g., whether you are with your manager or with a co-worker.”

[4] Bradley and Dunlop’s literature review discussed several different taxonomies of context, if you’re interested in a super deep dive into the topic.

[5] Wigelius and Väätäjä found five dimensions of context: social, spatial, temporal, infrastructural, and task.

[6] “Traits and states” is a classic concept in psychology. Traits are sort of our baseline personality, i.e. the ways that we typically think, feel, and act. States are temporary conditions—think mood or stress—that can temporarily override or amplify our traits. For example, you might normally be pretty laid back (trait), but when you’re rushing through airport security to catch a flight, you’ll probably be stressed out and full of anxiety (state).

[8] This is such a great moment in Jurassic Park. I thought about embedding the clip in this article but didn’t want readers to get distracted!

[9] Understanding whether a design affords users a sense of control is a great use case for qualitative UX research.