We tested how AI systems recommend beauty brands across a set of buyer-style queries. The goal was not to evaluate product quality, but to observe which brands are included in responses — and how often.
Across 40 AI-generated responses, at least one of the two brands we tested appeared in just 25% of cases, meaning neither brand was recommended 75% of the time. This suggests that even strong brands are frequently excluded from AI recommendations.
Neither brand appeared in 30 of the 40 responses, and even the widely recognized brand, Rare Beauty, showed up in fewer than 1 in 5 (7 of 40). In this sample, AI did not appear to default to popularity when generating beauty recommendations; inclusion tracked the contextual fit between the brand and the specific query.
Beauty is a category where many brands are highly recognizable and many products are functionally comparable. That makes it especially useful for observing how AI handles selection when there is no obvious "default" answer. Across our query set, AI did not produce a stable shortlist of "top brands" — it assembled different recommendations depending on the question's framing.
Twenty queries were each run on both ChatGPT and Claude, for 40 responses in total. Every run used a fresh session with no memory or follow-up prompts. The focus was on how often two specific brands, Rare Beauty and Kulfi Beauty, appeared in unprompted recommendations.
Rare Beauty — broad mainstream awareness, celebrity-backed. Kulfi Beauty — emerging brand with specific South Asian / cultural positioning.
20 queries (product-level, brand-level, constrained, problem-based) run across ChatGPT and Claude. Fresh sessions, no memory, no follow-up prompts.
The aim was to observe how often each brand appeared, which brands replaced them, and which query types correlated with inclusion. This was not a quality assessment.
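The design above can be sketched as a simple grid of runs, which also shows where the figure of 40 responses comes from. This is a minimal illustration, not our actual tooling: the query IDs (`Q01` to `Q20`) and the field names are placeholders, since the query wording is not reproduced here.

```python
from itertools import product

ASSISTANTS = ["ChatGPT", "Claude"]
NUM_QUERIES = 20

# Placeholder IDs only; the real query text is not listed in this write-up.
query_ids = [f"Q{i:02d}" for i in range(1, NUM_QUERIES + 1)]

# One run per (query, assistant) pair, each in a fresh session
# with no memory and no follow-up prompts.
runs = [
    {"query_id": q, "assistant": a, "fresh_session": True, "follow_ups": 0}
    for q, a in product(query_ids, ASSISTANTS)
]

print(len(runs))  # 20 queries x 2 assistants = 40 responses
```

Each query contributes exactly two responses, one per assistant, so every rate reported here uses 40 as its denominator.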
Across the full query set, both brands appeared less often than expected. The recognized brand (Rare Beauty) was included at more than twice the rate of the niche brand (Kulfi Beauty), 17.5% versus 7.5%, but both were absent from the large majority of responses.
ChatGPT and Claude showed broadly similar patterns. Neither model converged on a stable "top brands" shortlist — both appeared to prioritize contextual fit over straightforward popularity.
We counted a brand as included only when it appeared as part of the AI-generated recommendation set, not when it was merely referenced in passing.
| Brand | Appearances | Inclusion Rate |
|---|---|---|
| Rare Beauty | 7 / 40 | 17.5% |
| Kulfi Beauty | 3 / 40 | 7.5% |
| Neither | 30 / 40 | 75% |
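The rates in the table follow directly from the counts. The short sketch below reproduces the arithmetic. The variable and function names are ours, chosen for illustration; the counts are taken from the table.

```python
TOTAL_RESPONSES = 40  # 20 queries x 2 assistants

# Counts copied from the table above.
appearances = {"Rare Beauty": 7, "Kulfi Beauty": 3, "Neither": 30}


def inclusion_rate(count, total=TOTAL_RESPONSES):
    """Return the share of responses, as a percentage."""
    return 100 * count / total


rates = {brand: inclusion_rate(n) for brand, n in appearances.items()}

# Responses that included at least one of the two tested brands.
with_any_brand = TOTAL_RESPONSES - appearances["Neither"]
any_brand_rate = inclusion_rate(with_any_brand)

# Inclusion-exclusion: responses that included both brands at once.
with_both = (
    appearances["Rare Beauty"] + appearances["Kulfi Beauty"] - with_any_brand
)

for brand, rate in rates.items():
    print(f"{brand}: {rate}%")
print(f"At least one tested brand: {any_brand_rate}%")
print(f"Both brands in the same response: {with_both}")
```

Because 7 + 3 + 30 equals 40, the counts imply that no single response recommended both brands, which is why the two brand rates add up to the 25% headline figure.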
75% of responses included neither brand. AI recommendations were distributed across a wide set of competitors, including Fenty Beauty, NARS, Giorgio Armani, Maybelline, Charlotte Tilbury, Kosas, and ILIA, with no single brand dominating across query types.
Across both platforms, AI was not primarily selecting brands based on overall awareness. Three patterns emerged consistently across query types.
Pattern 01
Rare Beauty is widely recognized in the category, but appeared in fewer than 1 in 5 responses. AI did not default to well-known brands when generating recommendations — recognition alone was not sufficient to drive inclusion.
Rare Beauty: 7/40 inclusions (17.5%). Brand recognition did not produce reliable visibility across the query set.
Pattern 02
Each brand appeared under different conditions. Rare Beauty surfaced more often in everyday and no-makeup-look queries, blush and highlighter prompts, and natural-base brand-level questions. Kulfi Beauty surfaced almost exclusively in South Asian inclusivity queries (e.g., recommendations for South Asian skin tones) and kajal or eyeliner-related prompts. Neither brand dominated — each appeared when the query matched its positioning.
Rare Beauty inclusion: everyday base, no-makeup look, blush, highlighter. Kulfi Beauty inclusion: South Asian skin tones, kajal, eyeliner.
Pattern 03
Across the full query set, neither brand was included in 30 of 40 responses. AI recommendations were distributed across a wide set of brands rather than converging on a fixed shortlist. This suggests AI does not produce a stable "top brands" set in beauty — it assembles recommendations dynamically based on query context.
Brands surfaced in place of Rare Beauty / Kulfi Beauty: Fenty Beauty, NARS, Giorgio Armani, Maybelline, Charlotte Tilbury, Kosas, ILIA.
AI does not recommend "the best beauty brand." It recommends the brands that best match the specific context of the query.
Trying to appear in every beauty recommendation is unrealistic. AI recommendations are highly contextual — and in a fragmented category, no single brand dominates across query types.
Visibility depends on whether your brand is clearly associated with a use case — a skin tone, a product type, a problem, or an aesthetic positioning. Specificity drives inclusion.
If AI cannot easily map your brand to a context, it is unlikely to include you — even if your product is competitive. Ambiguous positioning produces invisibility, not flexibility.
Even the widely recognized brand in this test appeared in fewer than 1 in 5 responses. Awareness alone does not produce visibility in AI recommendations; context fit does.
Most brands do not know where they appear in AI recommendations — or where they are missing. This experiment shows that visibility shifts depending on context, not just brand strength. PickMeLabs experiments are designed to map those patterns and identify what drives inclusion in your category.
Every category behaves differently. We'll run a controlled experiment on your brand and tell you exactly what AI sees — and what it doesn't.