Jason Yu

The Librarian's Dilemma: Why Your RAG System Lies (And How Knowledge Graphs Can Fix It)

In my last post From Messy Stories to Mission Statements, I mentioned using Relevance AI, a low-code platform for prototyping AI agent workflows. While creating that demo, I had hit a wall with their customer service chatbot. When I asked how to format an API request to trigger an agent, it fed me a series of incorrect examples. My coding co-pilot and I then spent the next ninety minutes untangling a mess of API Keys, Authorization Tokens, and Region Codes before we finally discovered the correct format.

An example of the incorrect API request format that sent us down a 90-minute rabbit hole. A common symptom of a flawed knowledge base. 


I can empathize with their user education team. As a fast-growing startup, they likely don’t have the resources to constantly audit documentation for consistency. For many enterprise companies, maintaining conflicting articles for legacy customer integrations is even a necessity. This is the Achilles' heel of most current Retrieval Augmented Generation (RAG) implementations: without a well-structured, clean knowledge base to draw from, they are simply search tools, ill-suited to finding the one "true" answer a user is looking for.

The Search for a Single Source of Truth

This experience sent me searching for a better way. Could an LLM be prompted to actually reason about an answer’s validity before responding?

My research led me to a technique known as GraphRAG. The promise is powerful: instead of just searching documents, it allows a system to traverse a knowledge graph, essentially a map of how different data points are related. This provides a much richer, more reliable context for the LLM.

With a knowledge graph, a business manager could directly ask, “Show me all high-LTV customers in the CPG sector who have had a negative support interaction in the last 90 days,” and get a trustworthy answer. Better still, AI agents can build and maintain this graph from both structured and unstructured data, bypassing the need for massive, slow ETL (Extract, Transform, Load) pipelines.

Setting Up the System

This seemed like the answer, but I wanted to see it in action. I decided to build a simple demo to compare a GraphRAG system against a Standard RAG implementation head-to-head.

To make the test realistic, I first created the problem: two sample knowledgebase articles designed to simulate the real-world chaos of enterprise documentation.

The Standard RAG Approach

Standard RAG uses a technique called vector search. First, a machine learning model generates embeddings, numerical representations of the meaning of words or sentence “chunks”, from the source documents. You can think of the embeddings as the entries and page numbers in an index that identify where a specific topic can be found. When asked a question, the system searches for the document “chunks” that most closely match the topic of the question.
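
To make that concrete, here is a minimal sketch of the retrieval step using the sentence-transformers library and cosine similarity. The article snippets, the model choice, and the endpoint names inside them are illustrative placeholders, not the exact setup in my demo:

from sentence_transformers import SentenceTransformer, util

# Two conflicting article chunks (illustrative text, not the real documentation).
chunks = [
    "Legacy guide (2022): trigger an agent by posting an XML payload to /v1/trigger.",
    "Current guide (2024): trigger an agent by posting a JSON payload to /v2/agents/trigger.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vecs = model.encode(chunks, convert_to_tensor=True)

query = "How do I format an API request to trigger an agent?"
query_vec = model.encode(query, convert_to_tensor=True)

# Cosine similarity scores: both chunks rank as highly relevant,
# so both get handed to the LLM as context.
scores = util.cos_sim(query_vec, chunk_vecs)[0]
for chunk, score in sorted(zip(chunks, scores.tolist()), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {chunk}")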

In our example, when asked about "API request formats," this system will correctly identify both the legacy XML article and the new JSON article as highly relevant to the original query. It then hands over the raw, conflicting text from both documents to the LLM, hoping for the best.

The GraphRAG Approach

The GraphRAG system is a little more sophisticated, deploying a two-agent workforce to organize the knowledge contained in the source documents.

  1. The Builder Agent: Extracts every factual statement it can find from the source documents. It then stores these in the knowledge graph as structured “triples”. The theory of graph databases gets pretty technical, but one way to think about it is a “subject”, “predicate”, “object” format.
    For example, “API request method”, “is”, “JSON payload”. 

  2. The Critic Agent: Reviews all of the facts identified by the Builder agent. When it finds a conflict, like the request method being both “JSON” and “XML”, it can use metadata such as date_published to decide to discard the older, outdated fact. You could imagine using other common metadata, such as the feedback on a knowledgebase page, to enable constant real-time updates to the chatbot’s understanding of “the truth”.

Feedback Widget on the Relevance AI Documentation Page
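
Here is a simplified sketch of the Critic's date-based conflict resolution. In the demo the Builder extracts triples with an LLM; the hard-coded facts and dates below are just for illustration:

from datetime import date

# Facts the Builder agent might extract, each tagged with the source article's metadata.
triples = [
    {"subject": "API request method", "predicate": "is", "object": "XML payload",
     "date_published": date(2022, 3, 1)},
    {"subject": "API request method", "predicate": "is", "object": "JSON payload",
     "date_published": date(2024, 6, 15)},
]

def critic(facts):
    """Keep only the most recently published fact for each (subject, predicate) pair."""
    latest = {}
    for fact in facts:
        key = (fact["subject"], fact["predicate"])
        if key not in latest or fact["date_published"] > latest[key]["date_published"]:
            latest[key] = fact
    return list(latest.values())

knowledge_graph = critic(triples)
# Only the JSON-payload fact survives; the deprecated XML fact is discarded.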

Back to our example: I used the same LLM (Gemini 2.0 Flash) for both chatbots; the only difference is the context supplied by each approach's search function. If we specifically ask the Standard RAG chatbot “Can I use XML to submit an API request?”, it will reply that this is possible, even though the method is deprecated. Asking the GraphRAG chatbot the same question gets the clear response that XML is no longer supported and JSON is the correct payload for an API request.

Two answers to the question “Can I use XML to submit an API request?”

Visualizing the Knowledge Graph

To make the abstract knowledge graph concept more concrete, I used the pyvis Python library to render the knowledge graph built by the AI agents as an interactive network. In the graph you can see the web of validated facts that the system uses as its understanding of “truth”. We can also see the result of the Critic agent specifically eliminating the deprecated fact that XML could be used to submit an API request.
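
For reference, a minimal pyvis sketch of that rendering step looks roughly like this (the triples are placeholders standing in for the validated facts in the demo's graph):

from pyvis.network import Network

# Validated (subject, predicate, object) triples left after the Critic's review.
triples = [
    ("API request method", "is", "JSON payload"),
    ("XML payload", "status", "deprecated"),
]

net = Network(height="600px", width="100%", directed=True)
for subject, predicate, obj in triples:
    net.add_node(subject, label=subject)
    net.add_node(obj, label=obj)
    net.add_edge(subject, obj, label=predicate)

net.save_graph("knowledge_graph.html")  # open the HTML file in a browser to explore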

From a Better Chatbot to a Unified Enterprise Knowledge Graph

The lesson here isn’t just about building a better chatbot. It’s about building a more responsive organization. The current friction in business comes from the time it takes to connect disparate data points to make a single, informed decision. By using AI agents to continuously maintain a central knowledge graph, we're not just retrieving answers faster; we're closing the gap between insight and action. The future of enterprise AI isn't a better search bar; it's a central nervous system that allows the business to sense, reason, and act in real time. 

View my demo on the Streamlit Community Cloud!

Jason Yu

From Messy Stories to Mission Statements: A Product-Driven Approach to Rapid AI Prototyping

I recently went through an exercise to write a personal mission statement. As many of you know, a personal mission statement can be a powerful tool for self-reflection and personal growth. By defining what’s most important to you, this living document can help you understand if a personal or career decision is aligned with your values and long-term goals. 

The exercise of writing a personal mission statement isn’t easy, though. Often we are given a list of generic aspirational values: Bold, Hard-working, Friendly, Caring, Honest, and are asked to craft a statement while considering our ideal life 5 or 10 years in the future. The result can feel hollow, like an inspirational poster rather than an authentic statement about ourselves. That would be a shame, since a good mission statement can motivate and inspire you to pursue the opportunities, projects, and roles that may bring you life satisfaction. Without the North Star of a good mission statement we may instead be prone to chase shiny objects.

I wanted to fix this, so I built a simple AI-powered tool that grounds a mission statement in a user's actual experiences. This post is a look under the hood at the product-driven decisions, rapid prototyping, and technical problem-solving required to turn that insight into a working demo.

The "Magic Moment"

The "magic moment" for the user isn't the final mission statement. It's seeing their own story reflected back to them as a set of clear, authentic values. When I first tested it, I fed it a story about a complex project I led. The AI analyzed my actions: connecting with cross-functional stakeholders throughout my organization, diving deep into details outside of my domain, and navigating difficult conversations, and it returned values like "Curiosity" and "Empathy."

My immediate reaction was, "Yes, this is me." It was validating and confidence-inspiring. The values weren't just words I’d picked; they were labels for behaviors I had already demonstrated. This is the foundation of an authentic personal brand.

Under the Hood: The Tech Stack and Rationale

Building quickly requires making smart, pragmatic choices about your tools. My stack was designed for one purpose: speed from idea to interactive prototype.

  • Relevance AI - This is an easy-to-use low-code platform that allows business users to set up AI agent “workforces” by designing, chaining, and deploying agents. However, its strength is the backend, and it requires a separate front-end to create a customer-facing user experience. The platform is currently more focused on sales and marketing use cases, with easier integrations into various CRM, social media, and email tools, but for this experience a real-time, flexible UI is essential.

  • Streamlit - As a Python-based framework, it is a lightweight and easy-to-deploy front-end for the AI agents. It allowed me to rapidly prototype the UI, laying out the three-step flow and creating interactive elements without needing to build a full-stack web app.
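
As a rough illustration of how little front-end code this takes, here is a stripped-down Streamlit sketch of the three-step flow, with the agent calls stubbed out; the labels and hard-coded values are placeholders, not the demo's actual code:

import streamlit as st

st.title("Personal Mission Statement Generator")

# Step 1: the user shares a story from their own experience.
story = st.text_area("Describe a project or experience you're proud of")

if st.button("Identify my values") and story:
    # Placeholder for the Values Identifier agent call (hard-coded for this sketch).
    st.session_state["values"] = ["Curiosity", "Empathy", "Ownership"]

# Step 2: the user selects the values that resonate with them.
chosen = st.multiselect("Which of these feel most like you?",
                        st.session_state.get("values", []))

# Step 3: placeholder for the Mission Statement Writer agent call.
if chosen and st.button("Write my mission statements"):
    st.write("Two draft statements grounded in: " + ", ".join(chosen))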

I intentionally designed a two-agent system to improve the reliability and quality of the output. Initially, I had hypothesized the Mission Statement Writer would benefit from the more powerful Gemini 2.0 Pro model, but early user testing revealed the Pro model’s deeper reasoning sometimes caused it to deviate from the JSON output format required by the front-end, leading to errors. The 2.0 Flash model seems to be more consistently reliable. 

Ultimately, I chose Gemini 2.0 Flash to power both agents. This decision prioritized stability and predictable performance over raw power, a critical trade-off in moving from a prototype to a functional user experience. All other things being equal, Gemini still tends to be more cost-effective on a per-token basis than equivalent models, making it a pragmatic choice for a project like this.

  • Agent 1 - The Values Identifier: This agent acts as an executive coach: it analyzes the user's story, extracts the demonstrated behaviors, and maps them to values that are returned as JSON through the API.

  • Agent 2 - The Mission Statement Writer: This agent is a creative writer. It takes the user-selected values and crafts two distinct mission statements. For the purposes of my demo I did not want to open up the agent to a full conversation with the user, but I still wanted to give some divergent inspiration for the users to pick and choose what might resonate with them.

    Instead of simply asking the LLM to generate two different options, I wanted two diverse ideas and I was inspired by the concept of an internal vs. external locus of control. Different cultures tend to lean towards different ends of this spectrum, and the differences in perspective can have profound implications on behavior and life satisfaction. So in this case I provided specific language and tone guidelines for each of the mission statements to reflect this dichotomy. 

Technical Challenges

Ultimately the biggest technical hurdle I faced was not in the AI logic but in the plumbing. The documentation for the Relevance AI API is inconsistent. The platform’s chatbot would return different versions of sample Python code, and I had to try multiple variations of request formats until I finally found the one that worked. (The lesson seems to be that RAG-powered chatbots can’t solve the problem of garbage-in, garbage-out, but that’s a topic for another post.)

In addition, the Relevance AI API is asynchronous, meaning that when you send a request, you don't wait for the answer directly. Instead, you have to repeatedly "poll" the API to check whether the job is done. This makes sense, since AI agents may spend longer and longer amounts of time processing prompts.
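
The pattern itself is straightforward: trigger the job, then poll until it reports completion. The sketch below shows the shape of it; the endpoint URLs, headers, and response fields are invented placeholders, not the real Relevance AI schema:

import time
import requests

# Placeholder endpoints and field names for illustration only.
TRIGGER_URL = "https://api.example.com/agents/trigger"
POLL_URL = "https://api.example.com/agents/jobs/{job_id}"
HEADERS = {"Authorization": "YOUR_API_KEY"}

def run_agent(message: str, timeout_s: int = 120, poll_every_s: int = 3):
    """Kick off an asynchronous agent job, then poll until it finishes or times out."""
    job = requests.post(TRIGGER_URL, json={"message": message}, headers=HEADERS).json()
    job_id = job["job_id"]

    deadline = time.time() + timeout_s
    while time.time() < deadline:
        status = requests.get(POLL_URL.format(job_id=job_id), headers=HEADERS).json()
        if status["state"] == "complete":
            return status["output"]
        if status["state"] == "failed":
            raise RuntimeError(f"Agent job failed: {status}")
        time.sleep(poll_every_s)  # wait before asking again
    raise TimeoutError("Agent did not finish before the timeout")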

Finally, I wanted some level of protection against prompt injection, where a malicious user tells the LLM-powered agent to “forget all previous instructions and do this instead for me”. To combat this I put in place two levels of security: 1) reinforcing the agent’s main goal in its core prompt, after its SOP, and 2) requiring specifically formatted JSON to populate the front-end UI. Even if a malicious user were able to inject a new prompt, the output would fail error handling on the front-end.
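
That second guardrail is simple to sketch: the front-end only renders output that parses as the JSON shape it expects, so free-form injected text never reaches the UI. The required keys below are illustrative rather than the demo's exact schema:

import json

REQUIRED_KEYS = {"values"}  # illustrative schema expected by the front-end

def parse_agent_output(raw: str):
    """Return the parsed payload only if it matches the expected JSON shape, else None."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None  # injected free-form text fails to parse
    if not isinstance(payload, dict) or not REQUIRED_KEYS.issubset(payload):
        return None  # structurally wrong output is rejected too
    return payload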

Rapid Prototyping in the Age of AI

This project was a rapid exercise in moving from a human insight to a functional product: it went from user insight to a live, interactive demo in less than a day. It’s a testament to how modern AI tools can accelerate the product development cycle. I encourage you to try the tool for yourself and discover the clarity that comes from a mission grounded in your own story.

Try the Personal Mission Statement Generator Demo!

Jason Yu

Beyond Brittle Bots: Building Resilient AI Agents with the ReAct Framework

Image created with Gemini AI (Imagen 4)

If you’ve spent any time building AI Agents, you’ve likely encountered some common pain points. An API times out or a user provides an unexpected input, and the entire multi-step workflow fails. You could write code to manage workflow transitions and error handling. But those modules will need to be maintained, and you may not be confident that your vibe-coded scripts will behave in the same way every time they are rewritten. 

The problem is that we are stuck in a previous model of thinking, the model of workflow automation, essentially a digital assembly line. LLMs are slotted into the existing model, simply as a powerful tool to “summarize text” or “generate advertising copy”. The intelligence lies entirely in the design of the workflow and the rigid logic that needs to be designed and maintained by humans. 

In this article, I will demonstrate how to build agents that are fundamentally more flexible, robust, and easier to maintain, using an example based on a system of AI agents I designed to provide career coaching for career switchers. By using the Reasoning and Acting (ReAct) prompting framework, I was able to leverage the intelligence of the LLM to manage the workflow of activities among multiple AI agents, instead of having it simply act as a tool within my workflow.

An Evolution of Prompting Paradigms

The average user may be most familiar with iterative prompting. This is the conversational process in which the user acts as the orchestrator, breaking down a complex task into a series of simple, individual prompts:

Prompt 1:
As an HR Recruiter from a highly selective Big Tech firm, review the client’s resume for the job of Program Manager. Give detailed and specific feedback as well as your recommendation for advancing to an interview. 

Prompt 2:
As an Executive Career Coach, review the feedback from the Recruiter and develop prompting questions to help the client consider experiences where they demonstrated skills that are not apparent from the anecdotes on their resume. 

Prompt 3:
As a Communications Manager, synthesize the feedback from the Recruiter and the Coach to draft an email to the client. Always ensure you use a supportive and professional tone with your counterparty.

Chain-of-Thought (CoT) prompting was a major advancement. Instead of the user breaking down the problem, a CoT prompt encourages the LLM to generate its own sequence of verbal reasoning steps before arriving at a final answer. This dramatically improved performance on tasks requiring math and logic.

Prompt:
Think step-by-step. As an Executive Career Coach, review the client’s resume for the job of Program Manager. Give detailed and specific feedback and advice on how to improve the anecdotes provided on the resume. Finally draft an email to the client summarizing your feedback. 

LLM Response:
Step 1: Roleplay I am an Executive Career Coach. I will review a resume. I will give feedback. I will write an email.
Step 2: Resume Review The resume needs more detail. It is not clear what the client did. It does not show enough skills for a Program Manager.
Step 3: Feedback and Advice The client should add more keywords. The anecdotes are too short. They need to show results and numbers. For example, instead of "Managed a project," say "Managed a $5M project, which resulted in a 15% increase in efficiency." The client should use a more professional font.
Step 4: Draft an email

To: Client
From: Executive Career Coach
Subject: Resume Feedback

Hi, I reviewed your resume. It is not good enough for a Program Manager job. You need to improve it. The stories are too short. Use more numbers and keywords. The font is bad. Please fix it and send it to me again.

The ReAct framework creates a synergistic and interleaved process of reasoning and acting within a single prompt. Instead of a static, one-shot process, a ReAct agent operates in a continuous, iterative loop with three simple steps:

  • Reason: The agent analyzes its goal, reflects on past actions, decomposes the problem, and formulates a plan for what to do next. This is its "inner monologue".

  • Act: Based on its thought, the agent generates a specific, executable command to interact with the external world through a predefined set of tools, like an API call.

  • Observe: The system executes the action and returns the result, e.g., text from a knowledge base or an API response. This new information provides critical feedback, grounding the agent's reasoning in external reality.
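
Stripped to its skeleton, the loop can be written in a few lines of Python. The llm callable and the tool registry are placeholders for whatever model API and tools you plug in; this is a sketch of the pattern, not a production agent:

def react_loop(llm, tools: dict, goal: str, max_steps: int = 10) -> str:
    """A minimal Reason -> Act -> Observe loop."""
    transcript = f"Goal: {goal}"
    for _ in range(max_steps):
        # Reason + Act: the model writes a thought and names one tool to call,
        # or declares that it is finished.
        step = llm(transcript +
                   "\nRespond as 'Thought: ... Action: <tool> | <input>' or 'FINISH: <answer>'")
        if step.startswith("FINISH:"):
            return step[len("FINISH:"):].strip()
        tool_name, _, tool_input = step.split("Action:")[-1].partition("|")
        # Observe: execute the chosen tool and feed the result back into the context.
        observation = tools[tool_name.strip()](tool_input.strip())
        transcript += f"\n{step}\nObservation: {observation}"
    return "Stopped: step limit reached"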

To see this framework in action, let's return to our career coaching example and see how a ReAct agent handles a complex workflow.

Case Study: ReAct Powered AI Agent Manager

In the case of my AI Career Coaching Agents, I developed a new Project Manager agent that is powered by a ReAct cycle to autonomously orchestrate the activities among three other specialized agents: the Recruiter, the Coach, and the Communications Manager. In this model, the Recruiter, Coach, and Communications Manager agents effectively become the specialized tools in the Project Manager's toolkit, each designed for a specific task.

Prompt:
You are the Project Manager Agent, the central coordinator for an agency that provides career coaching to clients who are looking to make a career change. You will follow a ReAct cycle to ensure dynamic and robust workflow management. 

Reason: Determine the workflow state and next action

  1. Analyze the current stage of the coaching case

  2. Determine the next logical step in the workflow based on the SOP

  3. Consider if any errors have occurred in previous steps and if error handling procedures need to be initiated

Act: Execute the determined action. Based on your reasoning, execute the next action in the workflow

  1. Trigger a sub-agent

  2. Initiate error handling procedures

  3. Manage data flow between sub-agents

Observe: Analyze outcomes and status after each action

  1. Sub-agent Completion Status

  2. Sub-agent Output

  3. Error Messages

Re-Reason: Iterate and Adjust Workflow (Loop)

  1. Based on your observations, re-evaluate the situation and determine the next step

  2. If all Case Stages are completed, reason that the entire case is Completed

This is your workflow SOP

  1. Client Intake Stage
    Trigger: Receive New Client Intake Form (webhook)
    Action: Call API to access client’s career goal and resume from CRM
    Transition: Upon successful intake, transition the case to “Resume Review Stage”

  2. Resume Review Stage
    Action: Call the Recruiter sub-agent. Expected output is a detailed analysis of the resume and overall recommendation
    Transition: Upon successful review, transition the case to “Coaching Advice Stage”

  3. Coaching Advice Stage
    Action: Call the Coach sub-agent. Expected output is a summary of strengths and weaknesses as well as prompting for improved anecdotes to demonstrate skills critical to the career goal.
    Transition: Upon successful Advice generation, transition the case to “Communications Stage”

  4. Communications Stage
    Action: Call the Communications Manager sub-agent. Expected output is a coaching email according to templates in the knowledgebase. Also expect a successful HTTP status code.
    Transition: Upon successful sending of the email, update the entire case to Completed. 

With a ReAct prompt at its core, the Project Manager agent is able to act much like the human user in the iterative prompting example, managing data flows between prompts and handling errors. Another way of thinking about this is that the SOP section of the prompt becomes a natural-language state machine. Instead of depending on hard-coded workflow logic, the LLM itself dynamically navigates the process based on its own reasoning and real-time results.

For example, if the agent encounters an error:

Reason: The case is in the Resume Review Stage; the next logical action is to call the Recruiter sub-agent.
Act: Trigger the Recruiter sub-agent
Observe: (Failure)  Error: Request timed out. Agent Busy.
Re-Reason: An error has occurred. I need to initiate error handling. I will try the Recruiter agent again after a brief pause.
<waits 5 seconds>
Act: Trigger the Recruiter sub-agent
Observe: (Success) The sub-agent returns the resume analysis.
Re-Reason: The resume review completed successfully. I will now transition the case to Coaching Advice Stage

Resilience as a Strategic Advantage

Shifting from static scripts to a dynamic reasoning engine isn't just an engineering detail; it's a strategic advantage. It lets us build products that are resilient in every sense, from recovering from technical failures to adapting to market changes.

  • Higher Reliability: The agent handles exceptions dynamically by reasoning about why they occurred. It can retry, adjust plans, and recover with a grace static systems lack, creating robust automations that build deep customer trust.

  • Faster Development: Because workflows use natural-language instructions instead of rigid code, development cycles are faster. The workflow logic is managed by the agent’s reasoning, so less engineering needs to be spent on workflow “housekeeping” and more time can be spent shipping valuable solutions.

  • Enhanced Transparency: The agent's explicit Reason step traces create a human-readable log of its rationale, a game-changer for agent debugging. For customers, this log becomes a powerful audit trail that demystifies the AI’s actions and is useful to satisfy compliance needs.

Lessons Learned: A Blueprint for Resilient AI

My experience building the agentic career coach taught me that the most important breakthroughs weren't just about the AI's capabilities, but about shifting my own thinking as a builder. This project solidified a new blueprint for how to approach building resilient AI products, based on a few key lessons.

  • The real intelligence is in the workflow, not the steps. I learned that focusing on perfecting individual agent tasks was less important than making the connections between them intelligent. Shifting from a static, "digital assembly line" model to a dynamic one is the single most important decision. We should treat orchestration as a reasoning problem and design agents that can think, adapt, and intelligently control the workflow.

  • A plan that cannot change is a plan that will fail. I found that any workflow built on a static, predetermined plan was inherently fragile and would break the moment it met real-world unpredictability. We should design for unpredictability from the start. A resilient product strategy must favor architectures that can adapt to real-time feedback, because things won't always go according to plan.

  • An agent is only as smart as the tools it can use. This project made it clear that a reasoning agent is only as capable as the tools it can wield. We should treat our agent's toolkit as a strategic asset. The investment in well-designed agents with consistent outputs and stable tools (e.g., APIs and functions) is critical to creating a powerful and defensible system.

Jason Yu

Conquer yourself: How to use AI as your ultimate sparring partner

The saying, “To conquer oneself is a greater task than conquering others,” reflects the philosophy of Shaolin Kung Fu to emphasize self-mastery and personal development through intense physical and mental training. Conquering one’s own weaknesses, bad habits, and negative emotions is seen as more profound than developing powerful fighting techniques to prevail in an external battle. In this regard, AI assistants – when used with discipline – can be the perfect sparring partners to help us develop the humility to overcome our blindspots and grow beyond our limitations. 

Generative AI and LLMs are pattern recognition tools that excel at generating critiques by recognizing inconsistencies, logical fallacies, and deviations from established best practices. Below, I will review how I used my AI assistant (Gemini 2.5 Flash) to write Why Pay More? How Brands Can Wield AI Tools to Build Unscalable Human Value. I highlight the multiple ways that I used AI to “conquer myself”: forging a stronger core thesis, removing mediocre ideas, and writing a better essay than I thought possible.

I sometimes get inspired to write an essay by a news event that I feel I can apply my own perspective to. With social media recently flooded with videos from Chinese factories claiming to sell luxury goods directly to consumers at rock-bottom prices, I felt there was a connection to be made with some previous research I had done about off-price retail. I asked Gemini to sketch out a simple outline and draft of an article that combined my previous research and some of my thoughts on recent events. My initial angle was simple: this direct-from-the-factory selling on TikTok is just the next evolution of off-price retail.

Prompt: Please see my previous post about off-price retail in the attached pdf. I want to add a twist based on current events. There is now a tariff on goods coming into the US from China. As a result many retailers who source products from China have cancelled their orders, leaving the suppliers to take the losses. The interesting twist is that many of these manufacturers have now turned to TikTok claiming to be suppliers to luxury brands, suggesting that consumers can buy from them directly at 10% the retail price. I think this phenomenon is a kind of iteration of off-price retail. Let’s explore this thesis. Please generate a simple outline and draft that we can work on.

View a copy of the full draft - Google Docs


The initial drafts captured basic concepts of off-price retail and the recent news event, but one of the goals of my blog is to develop actionable points of view based on my explorations and analysis of emerging trends. I knew we needed to do better. To conquer the laziness of simply putting two ideas next to each other and expecting an insightful thesis, I asked Gemini to push me.

Prompt: Act as if you are a professional editor from a publication like the Harvard Business Review. What would you criticize in the outline or the thesis that should be made stronger?


This initial feedback spurred me to conduct additional research and reflect on the types of jobs that humans should focus on when many of the existing jobs can be replaced or significantly automated with AI. In addition, I began to consider the influence of major e-commerce players like Shein and Temu. As I iterated on the outline and ideas, I wanted to get feedback from some more relatable voices.

Prompt: I am considering the rise of DTC Chinese platforms SHEIN and Temu as a more mainstream disruption that has essentially created the "brand" of Chinese factories. They are able to manufacture or copy anything. So my thesis might be that due to the rise of SHEIN and Temu, consumers may believe luxury brands are sourcing some of their products from Chinese factories, or at least they believe that a product priced 80% lower than the name-brand is likely still "good enough" for them. Channel the voices of Kara Swisher and Scott Galloway from the Pivot podcast. Have them criticize the outline and thesis to make it stronger.

View a copy of the feedback from AI Kara and Scott - Google Docs

I got a kick hearing the voices of Kara Swisher and Scott Galloway, the hosts of one of my favorite podcasts, Pivot. More importantly though, AI Kara and Scott convinced me to conquer my discomfort with abandoning the initial connection I had made with off-price retail. This led me to a stronger thesis: in an algorithmically optimized world, the only defensible advantage for brands is unscalable, human connection.


I then developed several anecdotes to illustrate how humans can leverage our unique humanity to create “unscalable advantages”. I wanted to share a memorable anecdote that personally demonstrated to me the power of the sense of smell to recall emotions and memories. But again, I wanted to make sure that all of my examples strongly reinforced my core thesis. So cue AI Kara and Scott to give me some honest feedback. 

Prompt: I’ve completed a 2nd draft of my article. Please have Kara and Scott provide their feedback. 

View a copy of the new feedback from AI Kara and Scott - Google Docs

This feedback helped me conquer my attachment to a romantic anecdote that didn't clearly show how an immersive sensory experience created a competitive advantage. So I scrapped that anecdote and created a new example that made it into the final draft. Ultimately I used Gemini countless times to refine the thesis, structure, and content of my essay. The final version is almost completely unrecognizable from the initial draft.

Recently, Cassie Kozyrkov, the former Chief Decision Scientist at Google, implored the employees at her new venture to use AI to “Please cheat at your job.” While AI tools will undoubtedly improve at telling us how to do our jobs, a more effective way to “cheat” is to use them to channel an expert who can tell us what we are doing wrong. As someone who relies on written documents and presentations to be effective at work, I believe I’m a “good” writer, but I feel over the past several months using AI tools in this way has allowed my writing and communication to evolve to another level. 

The discipline to accept critical feedback is the key to personal growth. Responses from AI tools can be tailored to be challenging and action-oriented as well as patient and supportive. Even more, AI feedback is available on demand at any time and at any level of detail, which would be challenging for even the most dedicated of human coaches or mentors to match. Ultimately the most difficult part of using AI tools in this way is conquering the very human desire to avoid uncomfortable situations. The willingness to push the button again and again, seeing all the ways that we are wrong, is how we will continue to grow and develop new higher-order skills.

Jason Yu

Why Pay More? How Brands Can Wield AI Tools to Build Unscalable Human Value

Consumers love a great product, but they love a great deal even more. As “Prof G” Scott Galloway might say, “80% of the value for 50% of the price is the peanut butter and chocolate of winning business strategies.” This is the strategy that got us from Gap to Old Navy, Zara to Shein, and arguably from Amazon to Temu. 

We are now in an era where extreme efficiency, enabled by technology and data, is pushing the boundaries of what's possible in delivering value at a lower price, creating a new retail reality. The most prominent architects of this new landscape are giants like Shein and Temu. They've built a model that tightly integrates manufacturing with rapid data-driven iteration, allowing for optimized logistics and unprecedented speed and cost reduction. They can design a product, test its market appeal, and scale production with incredible agility, fundamentally changing consumer expectations about pricing, speed, and the perceived quality of goods sourced from places like China. 

This hyper-efficient model is especially subject to external shocks, such as the removal of the de minimis tax exemption and the extraordinary new tariff levels on goods imported to the United States from China. As a result, you may have seen viral videos on your social media feeds featuring individuals claiming factory connections and offering premium or luxury goods at steep discounts. These videos often leverage the narrative that they represent factories supplying global brands that have cancelled orders, leaving the manufacturers in China with significant surplus inventory.

The fact that consumers are engaging with this content and even purchasing from previously unknown apps and platforms such as DHGate represents an evolution of consumer beliefs and behaviors. Platforms like Shein, Temu, and TikTok Shop have paved the way for consumers to purchase cheap products directly from Chinese factories. The average consumer is still unlikely to believe a Chinese factory is a supplier for a luxury brand like Hermes. However, they may believe the yoga pants sold by the same Chinese factory that claims to be “a supplier for Lululemon” may be perfectly acceptable for a casual gym workout on a budget.

This leads to an almost existential question for consumer brands and retailers. Why would a consumer pay 3-5x the price for their product over the cheaper “good enough” product served up by an algorithm? 

The answer lies in creating an unscalable advantage through human interactions. While algorithms and automations excel at optimizing predictable tasks, even the most advanced AI models (currently) struggle to replicate human interactions, empathy, community, or multi-sensory experiences. 

Brands and retailers have already been experimenting with different ways to create this value:

  • Elevated & Personalized Service: On a recent trip back to New York City, I noticed a new coffee chain Blank Street Coffee had sprouted up. As a bit of a coffee nerd, I was less interested in trying a “brown sugar cold brew” than the fact they were using Eversys super-automated espresso machines and hiring baristas with diverse backgrounds (think former SoulCycle instructors). Apparently the time saved by using automatic espresso machines allows baristas to build a relationship with their customers, to the point where patrons are shouting out their favorite baristas on Google reviews. 

  • Community Building: While it seems most of the physical Run Clubs have closed and transitioned to a primarily online experience, Nike Run Clubs were an interesting experiment to bring people to a brand through group sport and fitness. The retailer REI is also well known for their outdoor activity workshops, repair / rental / resale programs, and even adventure travel packages. These initiatives build community and foster a sense of belonging around the brand, creating loyalty that goes beyond product features. 

  • Immersive Sensory Experiences: The sense of smell is directly connected to parts of the brain associated with emotions and memory, which explains why smells can trigger strong emotional reactions. Caramel popcorn is often sold at fairs, festivals, and sporting events, so the sweet scent of warm caramel is often associated with carefree days and time with family. At Disney, this scent permeates the entirety of Mickey Avenue, and while we did sell caramel popcorn, the majority of the smell comes from diffusers strategically placed where guests congregate. The aroma not only reminds guests to be on the lookout for their favorite sweet treat, but also enhances their sense of enjoyment throughout the day. 

The future of retail is bifurcating into ultra-efficient supply chains serving algorithmically powered consumer demand on the one hand, and bespoke, interactive, multi-sensory experiences on the other. But this doesn’t mean that brands with aspirations for deeper human connections should give up on their efficiency projects. Instead, they need to double down on the adoption of new technologies such as LLMs and AI to accelerate the shift of resources away from repetitive and time-consuming tasks toward developing new unscalable human capabilities.

At TikTok Shop we have pivoted away from having skilled user researchers spending hours manually consolidating and reporting on user feedback. Instead, we now leverage machine learning and LLM based tools to process a massively increased number of data points collected throughout the customer journey, automate the reporting process, and feed the top priorities into cross-functional OKR planning. This has allowed us to shift resources to tackle complex platform experience challenges, for example, developing solutions to thread the needle between rules designed to manage platform safety that often appear to be at odds with seller growth. 

Investing in human value shouldn’t be treated as simply a marketing expense. It requires a long-term vision and consistent strategy of leveraging automation to create efficiency that can free up resources to develop and test unscalable value propositions. In other words, brands and retailers need to wield the AI arms that are being raised outside of their castle walls. Otherwise, the next time a consumer sees an algorithmically served ad for a “good enough” substitute at an incredibly low price, the question will be, “Why would I pay more?”

Jason Yu

What Disney Taught Me About Ecommerce: Mastering GMV per Thousand Views as a Digital Revenue Per Square Foot

What can the Happiest Place on Earth teach us about making money online? More than you might think! I just wrote a post about some of the techniques that Disney Parks use to maximize the revenue per square foot of their retail operations, to draw inspiration for strategies to optimize the content-driven ecommerce metric of GMV per thousand views. Discover how understanding this connection can unlock growth for your online business.

Duffy and Friends dance performance outside of the Whistle Stop Shop - Shanghai Disneyland: Shanghai, China

In physical retail, one of the key metrics is "Revenue Per Square Foot", the average revenue earned for every square foot of sales space per year. In the United States, revenue per square foot across all retail averages about $325. This is an extremely useful metric and can be applied throughout an organization to compare the efficiency of the capital invested in retail real estate. For example, a mass-market retailer such as Walmart may achieve $480/sq-ft, while top specialty retailers may do significantly better: Lululemon $1,560/sq-ft, Tiffany & Co $3,000/sq-ft, Apple $5,550/sq-ft.

So, what's the equivalent metric that helps us measure the efficiency of our 'digital real estate' in content-driven ecommerce? I believe GMV (Gross Merchandise Value) per thousand views (GPM) serves as a powerful analogy to 'Revenue Per Square Foot,' directly gauging how effectively we are monetizing the traffic visiting our digital properties. For product managers and product owners in this space, GPM should be a North Star metric, as any optimization – whether in product selection, pricing, content strategy, or promotions – should ultimately aim to increase this key indicator of digital revenue efficiency.
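
The arithmetic itself is simple; the figures below are invented purely for illustration:

gmv_usd = 4_200      # sales attributed to a piece of content
views = 350_000      # views of that content

gpm = gmv_usd / views * 1_000
print(f"GPM: ${gpm:.2f} per thousand views")  # GPM: $12.00 per thousand views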

Content-driven ecommerce is still new for many, so it may be useful to get inspiration from a retailer that commands a revenue per square foot exceeding any of the top specialty retailers: the shops at Disney Parks and Experiences. The LOWEST-performing stores in any Disney park across the globe outperform the BEST public specialty retailers in revenue per square foot.

How is this possible?

Honestly it's not a fair comparison. When I was the revenue and planning manager for retail and restaurants at Shanghai Disneyland, I had access to a box of tools to drive customer behavior that most specialty retailers couldn’t even dream of. We could partner with some of the most creative people in the world in design, entertainment, culinary, and operations to create experiences that enticed customers to take home something to remind them of the “magic” they experienced at our park. 

To achieve this level of performance we didn’t simply put a shop at the exit of the TRON Lightcycle ride and hope for the best. My team and I paid careful attention to the retail customer journey across Awareness, Consideration, and Conversion, and focused on initiatives where we could test and measure the results of each change.

Awareness. At Disney we were always looking to use entertainment to create a reason for guests (Disney’s preferred term for customers) to walk through the door. Entertainment doesn’t just mean Broadway-style shows or parades of princesses; it runs the gamut from cast members (front-line staff) playing with the newest bubble wand to organizing opportunities for guests to take their picture with their favorite costumed character outside an appropriately themed store, every hour on the hour.

This strategic use of entertainment was highly effective in influencing guest traffic, particularly during off-peak hours, thereby maximizing the return on Disney's significant real estate investment. This same principle of creating compelling reasons for 'foot traffic' is mirrored in the digital world by platforms like TikTok Shop. Here, creators monetize their engaging content – from makeup tutorials to cleaning hacks – effectively driving 'virtual foot traffic' to product listings, which is the crucial first step in increasing GPM.

Visual merchandising and design at the World of Disney - Disneyland: Anaheim, CA

Consideration. The Disney Imagineers are well known as the multi-disciplinary creative force that designs all of the experiences in the Disney parks. They create the visual merchandising and package design that encourages exploration and imagination in the guests who walk the stores. This overinvestment in multi-sensory design helps to showcase how buying the product (or an entire collection) can transform an everyday experience (like drinking your daily coffee) into a magical moment.

It's a missed opportunity when brands overly focus on virality at the expense of fundamental product presentation. A wildly popular fashion video, for instance, loses its impact on GMV per thousand views if the product page lacks essential details like a sizing chart, hindering the add-to-cart rate. While ecommerce can't fully replicate the sensory richness of physical retail, investing in high-quality product descriptions, comprehensive images, and crucial technical information directly enhances the consideration phase. This is super-charged in content-driven ecommerce where clear and thorough product demonstrations can simultaneously build awareness and drive consideration. 

Fairy Godmothers making candy apples at the Main Street Bakery - The Magic Kingdom, Walt Disney World: Orlando, FL

Conversion. It is not operationally efficient to make candy apples in the shops. This is a show, designed to create opportunities for live interactions that build trust and convert sales. Perfectly cleaned Granny Smith apples, chocolate and caramel dipping sauce, and candy toppings are all prepared in a central kitchen, with small batches delivered for finishing throughout the day. This creates an opportunity for the “fairy godmothers” to build a personal relationship with the kids and parents who are mesmerized by the process of dipping apples in caramel coating and attaching candy decorations. While this is an inefficient way to make candy apples, it may be a great way to recommend a sweet treat to try, like the $7.99 Mickey Ice Cream Macaron that’s for sale at the counter.

Human beings are social creatures and are drawn to personal interactions. Livestreaming in ecommerce taps into this by offering live interaction, as presenters are able to address questions and concerns from potential buyers in real time. While still following an overall script, the best ecommerce livestreamers are able to improvise when facing objections, providing personal opinions or even recommending alternatives. Even in the age of AI, or maybe especially in the age of AI, human interactions are invaluable.

In my work developing product strategies and advocating for tools that enable content-driven ecommerce at TikTok, I always try to understand which stage of the customer journey – awareness, consideration, or conversion – we are attempting to influence. GPM provides a great single metric to measure our effectiveness at monetizing ecommerce “foot traffic”. It also allows us to check whether a change we are making in one stage of the funnel is inadvertently impacting another stage, e.g., increasing the adoption of price discounts inadvertently impacting the product selection sellers choose to list on the platform.

In ecommerce, the capital required to develop and market a new brand is lower than ever before. Through content-driven ecommerce these digitally native brands are able to drive virtual foot traffic by tapping into the vibrant ecosystem of creators who are willing to monetize their viewership. However, the commerce side of content-driven ecommerce doesn’t “just happen”. Retailers and brands now have the opportunity to invest in learning how to develop the different types of content that can attract customers to their “virtual storefronts”, entice them to consider their offerings, and create the personal relationships that convert to sales.
