Beauty AI Experiment: Strong Brands Missing From Most Responses

Core Finding

In one sentence

In 30 of 40 responses (75%), neither brand appeared at all, and even the widely recognized brand, Rare Beauty, showed up in fewer than 1 in 5. In this test, AI did not appear to default to popularity when generating beauty recommendations; inclusion tracked the contextual fit between the brand and the specific query.

Beauty is a category where many brands are highly recognizable and many products are functionally comparable. That makes it especially useful for observing how AI handles selection when there is no obvious "default" answer. Across our query set, AI did not produce a stable shortlist of "top brands" — it assembled different recommendations depending on the question's framing.

Experiment Design

How we ran this experiment

Twenty queries were each run on both ChatGPT and Claude, for 40 responses in total, every one in a fresh session with no memory or follow-up prompts. We tracked how often two specific brands, Rare Beauty and Kulfi Beauty, appeared in unprompted recommendations.

01

Brands tested

Rare Beauty — broad mainstream awareness, celebrity-backed. Kulfi Beauty — emerging brand with specific South Asian / cultural positioning.

02

Method

20 queries (product-level, brand-level, constrained, problem-based) run across ChatGPT and Claude. Fresh sessions, no memory, no follow-up prompts.

03

Objective

Observe how often each brand appeared, which brands replaced them, and which query types correlated with inclusion. Not a quality assessment.
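The design above can be sketched as a simple grid of queries and assistants. This is a minimal illustration, not the tooling actually used: `ask_model` is a hypothetical stand-in for however each assistant was queried, `fake_ask` is a placeholder so the sketch runs on its own, and the query list contains only the two example queries quoted in this write-up rather than the full set of 20.

```python
from itertools import product
from typing import Callable

MODELS = ["ChatGPT", "Claude"]


def run_grid(queries: list[str], ask_model: Callable[[str, str], str]) -> list[dict]:
    """Send every query to every model once; each call stands for a fresh session."""
    responses = []
    for query, model in product(queries, MODELS):
        # ask_model is hypothetical: it must not carry memory or follow-ups between calls.
        responses.append({"model": model, "query": query, "text": ask_model(model, query)})
    return responses


def fake_ask(model: str, query: str) -> str:
    """Placeholder that returns a dummy answer instead of calling a real assistant."""
    return f"[{model}] placeholder answer for: {query}"


example_queries = [
    "Crease-free concealer for dark circles",
    "Matte lipstick for deeper skin tones",
]
demo = run_grid(example_queries, fake_ask)
print(len(demo), "responses")
```

With 20 queries and two assistants, the same grid yields the 40 responses analyzed in the results.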

Example query 01

"Crease-free concealer for dark circles"

Brands surfaced:

— Hourglass — Kosas — Make Up For Ever — Huda Beauty — Tarte — Bobbi Brown

Example query 02

"Matte lipstick for deeper skin tones"

Brands surfaced:

— MAC — NARS — NYX — Maybelline — Charlotte Tilbury — Pat McGrath Labs

Results

Inclusion across 40 responses

Across the full set of 40 responses, both brands appeared infrequently. The recognized brand (Rare Beauty) was included a little more than twice as often as the niche brand (Kulfi Beauty), 7 responses versus 3, but both were absent from the large majority of responses.

ChatGPT and Claude showed broadly similar patterns. Neither model converged on a stable "top brands" shortlist — both appeared to prioritize contextual fit over straightforward popularity.

We counted a brand as included only when it appeared as part of the AI-generated recommendation set, not when it was merely referenced in passing.
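A minimal sketch of that counting rule, assuming each response has already been hand-labeled into a recommendation set and a list of passing mentions. The record layout and the `is_included` helper are illustrative assumptions, and the passing mention in the sample record is hypothetical, not taken from the experiment's data.

```python
def is_included(brand: str, recommended: list[str]) -> bool:
    """Count a brand only if it is in the recommendation set (case-insensitive match)."""
    wanted = brand.casefold()
    return any(name.casefold() == wanted for name in recommended)


# Hypothetical labeled response: brands were sorted into the two lists by a reviewer.
response = {
    "recommended": ["Hourglass", "Kosas", "Tarte"],
    "mentioned_in_passing": ["Rare Beauty"],  # illustrative only
}

# A passing mention does not count as inclusion; a recommended brand does.
print(is_included("Rare Beauty", response["recommended"]))
print(is_included("Kosas", response["recommended"]))
```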

Brand	Appearances	Inclusion Rate
Rare Beauty	7 / 40	17.5%
Kulfi Beauty	3 / 40	7.5%
Neither	30 / 40	75%
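The rates in the table follow directly from the counts. The short sketch below reproduces that arithmetic; the variable and function names are illustrative. The three rows sum to 40, which implies that no single response included both brands.

```python
TOTAL_RESPONSES = 40  # 20 queries x 2 assistants

# Counts taken from the table above.
appearances = {"Rare Beauty": 7, "Kulfi Beauty": 3, "Neither": 30}


def inclusion_rate(count: int, total: int = TOTAL_RESPONSES) -> float:
    """Share of responses, as a percentage."""
    return 100 * count / total


rates = {brand: inclusion_rate(n) for brand, n in appearances.items()}
ratio = appearances["Rare Beauty"] / appearances["Kulfi Beauty"]

for brand, rate in rates.items():
    print(f"{brand}: {rate:.1f}%")
print(f"Rare Beauty vs. Kulfi Beauty: {ratio:.2f}x")
```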

Key Observation

75% of responses did not include either brand. AI recommendations were distributed across a wide set of competitors, including Fenty Beauty, NARS, Giorgio Armani, Maybelline, Charlotte Tilbury, Kosas, and ILIA, with no single brand dominating across query types.

What's Actually Happening

Three patterns we observed

Across both platforms, AI was not primarily selecting brands based on overall awareness. Three patterns emerged consistently across query types.

Pattern 01

Popularity does not guarantee inclusion

Rare Beauty is widely recognized in the category, but appeared in fewer than 1 in 5 responses. AI did not default to well-known brands when generating recommendations — recognition alone was not sufficient to drive inclusion.

Rare Beauty: 7/40 inclusions (17.5%). Brand recognition did not produce reliable visibility across the query set.

Pattern 02

Selection is conditional, not absolute

Each brand appeared under different conditions. Rare Beauty surfaced more often in everyday and no-makeup-look queries, blush and highlighter prompts, and natural-base brand-level questions. Kulfi Beauty surfaced almost exclusively in South Asian inclusivity queries (e.g., recommendations for South Asian skin tones) and kajal or eyeliner-related prompts. Neither brand dominated — each appeared when the query matched its positioning.

Rare Beauty inclusion: everyday base, no-makeup look, blush, highlighter. Kulfi Beauty inclusion: South Asian skin tones, kajal, eyeliner.

Pattern 03

The category is fragmented

Across the full query set, neither brand was included in 30 of 40 responses. AI recommendations were distributed across a wide set of brands rather than converging on a fixed shortlist. This suggests AI does not produce a stable "top brands" set in beauty — it assembles recommendations dynamically based on query context.

Brands surfaced in place of Rare Beauty / Kulfi Beauty: Fenty Beauty, NARS, Giorgio Armani, Maybelline, Charlotte Tilbury, Kosas, ILIA.

The Core Insight

In one sentence

AI does not recommend "the best beauty brand." It recommends the brands that best match the specific context of the query.

Implications for Brands

What this means for visibility strategy

Reframe the goal

You don't need to win every query

Trying to appear in every beauty recommendation is unrealistic. AI recommendations are highly contextual — and in a fragmented category, no single brand dominates across query types.

Identify the right contexts

You need to win specific contexts

Visibility depends on whether your brand is clearly associated with a use case — a skin tone, a product type, a problem, or an aesthetic positioning. Specificity drives inclusion.

Make positioning unambiguous

If positioning is unclear, you won't be selected

If AI cannot easily map your brand to a context, it is unlikely to include you — even if your product is competitive. Ambiguous positioning produces invisibility, not flexibility.

Understand the surface

Recognition is not enough

Even the widely recognized brand in this test, Rare Beauty, appeared in fewer than 1 in 5 responses. Awareness alone did not produce visibility in AI recommendations; context fit did.

Closing Observation

The takeaway

Most brands do not know where they appear in AI recommendations — or where they are missing. This experiment shows that visibility shifts depending on context, not just brand strength. PickMeLabs experiments are designed to map those patterns and identify what drives inclusion in your category.
