Story

Building the future of interactive learning

The Vision

I'm building a video-based learning platform where users can explore a product's supply chain, environmental impact, and health effects—all in an interactive, gamified way. Think Alchemy, where players combine “elements” to create products, or multiplayer quizzes with leaderboards to keep engagement high.

Instead of social media reels, people will be checking out product reels—discovering the fascinating stories behind everyday items through engaging, bite-sized video content that educates while it entertains.

Initially, I planned to generate videos using Veo 3, but at $15–$45 for a single one-minute video, mass adoption wasn't realistic. Video generation will eventually become cheaper, but I didn't want to wait. Instead, I designed a cost-optimized architecture that works now and can upgrade to newer technology later.

Optimization #1 – RAG for Video Reuse

Rather than regenerating similar videos, I index every generated video with a detailed description. A semantic search system retrieves and reuses relevant footage, so as the library grows, the generation cost of each new video approaches zero.
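A minimal sketch of the reuse lookup, under stated assumptions: the `embed` function here is a toy bag-of-words stand-in for a real embedding model, and the `ClipIndex` class, the clip IDs, and the 0.6 similarity threshold are all hypothetical. It only illustrates the idea of searching existing descriptions before paying for a new generation.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ClipIndex:
    """Hypothetical in-memory index of clip descriptions."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold  # assumed cutoff for "similar enough to reuse"
        self.clips = []             # list of (clip_id, vector)

    def add(self, clip_id, description):
        self.clips.append((clip_id, embed(description)))

    def find_reusable(self, query):
        """Return the best matching clip id, or None if nothing is close enough."""
        q = embed(query)
        best_id, best_score = None, 0.0
        for clip_id, vec in self.clips:
            score = cosine(q, vec)
            if score > best_score:
                best_id, best_score = clip_id, score
        return best_id if best_score >= self.threshold else None

index = ClipIndex()
index.add("clip-001", "workers harvesting cocoa beans on a farm")
index.add("clip-002", "roasting coffee beans in a drum roaster")

hit = index.find_reusable("harvesting cocoa beans on a farm")
miss = index.find_reusable("assembling a smartphone circuit board")
```

When `find_reusable` returns `None`, the caller would fall back to generating a new clip and adding it to the index, which is how the library grows over time.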

Optimization #2 – Step-based Segmentation

Many products share most of their steps. For example, two chocolate products might share around 80% of their supply chain footage. By segmenting videos into reusable steps or effects, I can assemble complete videos from existing clips without regenerating them.

Example: Chocolate Products Sharing Steps

🍫 Dark Chocolate Bar

Cocoa Bean Harvesting: Shared ✓

Fermentation & Drying: Shared ✓

Roasting: Shared ✓

Grinding: Shared ✓

Dark Chocolate Mixing: Unique

Molding & Packaging: Shared ✓

🥛 Milk Chocolate Bar

Cocoa Bean Harvesting: Shared ✓

Fermentation & Drying: Shared ✓

Roasting: Shared ✓

Grinding: Shared ✓

Milk Chocolate Mixing: Unique

Molding & Packaging: Shared ✓

Shared Steps: 5/6 (83%)

🎯 Cost Savings: 83% reuse rate
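The 5/6 figure can be reproduced with a few lines of code. This sketch assumes the dark chocolate video was generated first, so its clips already sit in the library. The `reuse_stats` helper is a hypothetical name used only for illustration.

```python
def reuse_stats(steps, library):
    """Count how many of a product's steps already have a clip in the library."""
    shared = [step for step in steps if step in library]
    return len(shared), len(steps), len(shared) / len(steps)

dark_chocolate = [
    "Cocoa Bean Harvesting", "Fermentation & Drying", "Roasting",
    "Grinding", "Dark Chocolate Mixing", "Molding & Packaging",
]
milk_chocolate = [
    "Cocoa Bean Harvesting", "Fermentation & Drying", "Roasting",
    "Grinding", "Milk Chocolate Mixing", "Molding & Packaging",
]

# Assumption: dark chocolate was generated first, so its clips are in the library.
library = set(dark_chocolate)
shared, total, rate = reuse_stats(milk_chocolate, library)
summary = f"Shared Steps: {shared}/{total} ({rate:.0%})"
print(summary)
```

Only the "Milk Chocolate Mixing" step is missing from the library, so that is the only clip the second product needs to generate.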

Optimization #3 – Modular Media

I separate audio, text, and charts from the main visuals. This makes the video clips more reusable: translating a video only requires new audio and text tracks, and real-time, data-driven charts can be overlaid on the same footage, making the learning experience richer than static, fully generated videos.
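One way to picture this separation is a segment record that keeps the language-neutral footage apart from its per-language tracks. The sketch below is an assumption about how such a record could look. The `Segment` fields, the `render_plan` helper, the file names, and the English fallback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Segment:
    visual_clip_id: str                                       # language-neutral footage
    narration: Dict[str, str] = field(default_factory=dict)   # language code -> audio file
    captions: Dict[str, str] = field(default_factory=dict)    # language code -> caption text
    chart_source: Optional[str] = None                        # hypothetical live data feed name

def render_plan(segment, lang, fallback="en"):
    """Pick the tracks for one language while reusing the same visual clip."""
    use = lang if lang in segment.narration else fallback
    return {
        "visual": segment.visual_clip_id,
        "audio": segment.narration[use],
        "caption": segment.captions.get(use, ""),
        "chart": segment.chart_source,
    }

roasting = Segment(
    visual_clip_id="clip-roasting",
    narration={"en": "roasting_en.mp3", "es": "roasting_es.mp3"},
    captions={"en": "Roasting the beans", "es": "Tostado de los granos"},
)

spanish = render_plan(roasting, "es")
french = render_plan(roasting, "fr")  # no French track yet, falls back to English
```

Adding a language in this layout means adding entries to `narration` and `captions`. The visual clip itself is never regenerated.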

Optimization #4 – Image-first Video Generation

Even with segmentation, generating every clip as video is costly. To start, I use image sequences + text-to-speech to create videos at a fraction of the cost. Over time, I track which products and segments are most popular, then selectively replace those with higher-quality generated video.
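The selective upgrade step could be as simple as ranking segments by views and upgrading from the top until a budget runs out. This is a sketch only. The `pick_upgrades` function, the view counts, the per-clip cost, and the budget are hypothetical values chosen for illustration.

```python
def pick_upgrades(view_counts, cost_per_clip, budget):
    """Choose the most-viewed segments to regenerate as video within a budget."""
    chosen = []
    remaining = budget
    ranked = sorted(view_counts.items(), key=lambda item: item[1], reverse=True)
    for segment, _views in ranked:
        if cost_per_clip > remaining:
            break  # cannot afford another upgrade
        chosen.append(segment)
        remaining -= cost_per_clip
    return chosen

# Hypothetical view counts per segment.
views = {"harvesting": 900, "roasting": 1500, "molding": 200}
upgrades = pick_upgrades(views, cost_per_clip=15.0, budget=30.0)
```

With these made-up numbers the two most-viewed segments are upgraded, and the least popular one stays as an image sequence.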

Platform Architecture

How the Product Knowledge Platform works

👤 User Input: Product Name, Analysis Type

Analysis types: 🏭 Supply Chain, 🌍 Environmental, 🏥 Health Effects, 📚 History

⚡ Groq LLaMA-4: Content Generation

🎨 Parallel Processing: 🖼️ Gemini Imagen (Images) and 🔊 OpenAI TTS (Audio)

🎥 Video Presentation: Assembly

▶️ Interactive Player with Navigation

Input & Analysis

User enters any product name

Choose analysis dimension

AI generates structured overview

Content Generation

Detailed step analysis

Parallel image & audio generation

Interactive video assembly

Prototype Status

What I've built so far

I've already built a working pipeline:

User enters a product name.

The system generates a detailed plan.

Multiple LLMs process different steps in parallel.

Images are generated with Gemini Imagen.

Audio is generated with OpenAI TTS.

Both are combined into a complete video.
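The steps above can be sketched as an asynchronous pipeline in which image and audio generation run concurrently for each step. The generator functions below are stubs that return file names. They do not call Gemini Imagen or OpenAI TTS, and all function names are hypothetical.

```python
import asyncio

async def generate_image(step):
    """Stub for an image-generation call; returns a placeholder file name."""
    await asyncio.sleep(0)  # stands in for network latency
    return step.lower().replace(" ", "_") + ".png"

async def generate_audio(step):
    """Stub for a text-to-speech call; returns a placeholder file name."""
    await asyncio.sleep(0)
    return step.lower().replace(" ", "_") + ".mp3"

async def build_segment(step):
    """Generate the image and the narration for one step concurrently."""
    image, audio = await asyncio.gather(generate_image(step), generate_audio(step))
    return {"step": step, "image": image, "audio": audio}

async def build_video(steps):
    """Process all steps in parallel; gather preserves the input order."""
    return await asyncio.gather(*(build_segment(step) for step in steps))

segments = asyncio.run(build_video(["Roasting", "Grinding"]))
```

Because `asyncio.gather` returns results in the order the tasks were passed in, the segments come back in plan order and can be assembled into the final video without re-sorting.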

Cost-Effective Technology Stack

The pipeline runs on Groq LLaMA-4 for fast inference, and the result is cost-effective: $0.10 for a first-time generation, and $0 for replays or indexed reuse.
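As a rough illustration of what these numbers imply, the average cost per view falls as a video is replayed, because only the first generation is paid for. The view counts below are hypothetical, and the helper name is an assumption.

```python
FIRST_GENERATION_COST = 0.10  # USD, the first-time generation cost quoted above
REPLAY_COST = 0.0             # USD, replays and indexed reuse

def average_cost_per_view(total_views):
    """Average cost per view when only the first view triggers generation."""
    total_cost = FIRST_GENERATION_COST + REPLAY_COST * (total_views - 1)
    return total_cost / total_views

for views in (1, 10, 100):  # hypothetical view counts
    print(views, round(average_cost_per_view(views), 4))
```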

Over time, the platform will evolve from an image-based MVP into a fully cinematic, dynamically assembled video learning system—without ever losing cost efficiency.

Ready to Explore?

Experience the future of product learning today. Try our Product Knowledge Platform and see how we make complex supply chains accessible and engaging.