Story

Building the future of interactive learning

The Vision

I'm building a video-based learning platform where users can explore a product's supply chain, environmental impact, and health effects—all in an interactive, gamified way. Think Alchemy, where players combine “elements” to create products, or multiplayer quizzes with leaderboards to keep engagement high.

Instead of social media reels, people will be checking out product reels—discovering the fascinating stories behind everyday items through engaging, bite-sized video content that educates while it entertains.

Initially, I planned to generate videos using Veo 3, but with costs ranging from $15–$45 for a single one-minute video, mass adoption wasn't realistic. Video generation will eventually become cheaper, but I didn't want to wait. Instead, I designed a cost-optimized architecture that works now and can seamlessly upgrade to new tech later.

Optimization #1 – RAG for Video Reuse

Rather than regenerating similar videos, I index every generated video with a detailed description. A semantic search system retrieves and reuses relevant footage, so as the library grows, the marginal cost of generating a new video approaches zero.
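A minimal sketch of the reuse lookup, under loud assumptions: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `ClipIndex`, the clip IDs, and the 0.6 similarity threshold are hypothetical names and values chosen for illustration, not the platform's actual code.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ClipIndex:
    """Hypothetical index of generated clips, keyed by their descriptions."""

    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold  # assumed cutoff for "similar enough to reuse"
        self.clips = []             # list of (clip_id, vector) pairs

    def add(self, clip_id: str, description: str) -> None:
        self.clips.append((clip_id, embed(description)))

    def find_reusable(self, description: str):
        """Return the best-matching clip ID, or None if nothing is close enough."""
        query = embed(description)
        best_id, best_score = None, 0.0
        for clip_id, vector in self.clips:
            score = cosine(query, vector)
            if score > best_score:
                best_id, best_score = clip_id, score
        return best_id if best_score >= self.threshold else None


index = ClipIndex()
index.add("clip-001", "workers harvesting ripe cocoa pods on a farm")
index.add("clip-002", "molten steel poured in a foundry")
match = index.find_reusable("farm workers harvesting cocoa pods")
```

When `find_reusable` returns `None`, the pipeline would fall back to generating a new clip and adding it to the index, which is what drives the cost down as the library grows.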

Optimization #2 – Step-based Segmentation

Most products share many steps. For example, two chocolate products might share roughly 80% of their supply chain footage. By segmenting videos into reusable steps or effects, I can assemble complete videos from existing clips without regenerating them.

Example: Chocolate Products Sharing Steps

🍫 Dark Chocolate Bar
Cocoa Bean Harvesting: Shared ✓
Fermentation & Drying: Shared ✓
Roasting: Shared ✓
Grinding: Shared ✓
Dark Chocolate Mixing: Unique
Molding & Packaging: Shared ✓

🥛 Milk Chocolate Bar
Cocoa Bean Harvesting: Shared ✓
Fermentation & Drying: Shared ✓
Roasting: Shared ✓
Grinding: Shared ✓
Milk Chocolate Mixing: Unique
Molding & Packaging: Shared ✓

Shared Steps: 5/6 (83%)
🎯 Cost Savings: 83% reuse rate
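
The chocolate example can be checked with a few lines of code. This is only a sketch: `reuse_plan` is a hypothetical helper, and it assumes steps are matched by exact name, whereas a real system would more likely match them through the semantic index.

```python
DARK_CHOCOLATE = [
    "Cocoa Bean Harvesting",
    "Fermentation & Drying",
    "Roasting",
    "Grinding",
    "Dark Chocolate Mixing",
    "Molding & Packaging",
]
MILK_CHOCOLATE = [
    "Cocoa Bean Harvesting",
    "Fermentation & Drying",
    "Roasting",
    "Grinding",
    "Milk Chocolate Mixing",
    "Molding & Packaging",
]


def reuse_plan(steps, library):
    """Split a product's steps into clips already in the library and clips to generate."""
    reused = [step for step in steps if step in library]
    to_generate = [step for step in steps if step not in library]
    return reused, to_generate


# Assume the dark chocolate video was generated first, so its steps are in the library.
library = set(DARK_CHOCOLATE)
reused, to_generate = reuse_plan(MILK_CHOCOLATE, library)
reuse_rate = len(reused) / len(MILK_CHOCOLATE)
```

Here only "Milk Chocolate Mixing" needs to be generated, and the other five of six steps come from the library.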

Optimization #3 – Modular Media

I separate audio, text, and charts from the main visuals. This makes the video clips more reusable—translations become trivial, and real-time data-driven charts can be overlaid, making the learning experience richer than static, fully generated videos.
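
One way to picture this separation is a segment record that keeps the visuals language-neutral and attaches everything else as layers. The `Segment` class, its field names, and the asset IDs below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Segment:
    """Hypothetical record for one reusable step of a video."""
    clip_id: str                                   # language-neutral visuals
    narration: dict = field(default_factory=dict)  # language code -> audio asset ID
    captions: dict = field(default_factory=dict)   # language code -> caption text
    chart_source: Optional[str] = None             # assumed name of a live data feed


def layers_for(segment: Segment, lang: str) -> dict:
    """Collect the layers a player would composite for one language."""
    if lang not in segment.narration:
        raise KeyError(f"no narration for language {lang!r}")
    layers = {
        "visual": segment.clip_id,
        "audio": segment.narration[lang],
        "caption": segment.captions.get(lang, ""),
    }
    if segment.chart_source:
        layers["chart"] = segment.chart_source
    return layers


roasting = Segment(
    clip_id="clip-roasting",
    narration={"en": "roasting-en.mp3"},
    captions={"en": "The beans are roasted."},
)
# Adding a translation touches only the audio layer; the clip itself is unchanged.
roasting.narration["es"] = "roasting-es.mp3"
```

With this shape, a new language costs one text-to-speech call per segment instead of a new video.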

Optimization #4 – Image-first Video Generation

Even with segmentation, generating every clip as video is costly. To start, I use image sequences + text-to-speech to create videos at a fraction of the cost. Over time, I track which products and segments are most popular, then selectively replace those with higher-quality generated video.
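
The selective upgrade could be as simple as ranking segments by view count. In this sketch, `pick_upgrades`, the segment names, the view counts, and the idea of a fixed upgrade budget are all illustrative assumptions rather than the platform's actual policy.

```python
from collections import Counter


def pick_upgrades(view_counts, already_video, budget):
    """Return the most-viewed segments that are still image-based, up to `budget`."""
    candidates = [
        (segment, views)
        for segment, views in view_counts.items()
        if segment not in already_video
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [segment for segment, _ in candidates[:budget]]


# Made-up view counts per segment.
views = Counter({"cocoa-harvest": 120, "roasting": 45, "molding": 80})
upgrades = pick_upgrades(views, already_video={"cocoa-harvest"}, budget=1)
```

In this example "molding" is chosen, because "cocoa-harvest" has already been upgraded to generated video.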

Platform Architecture
How the Product Knowledge Platform works

👤 User Input: product name and analysis type (🏭 Supply Chain, 🌍 Environmental, 🏥 Health Effects, or 📚 History)
Content Generation: Groq LLaMA-4
🎨 Parallel Processing: 🖼️ images with Gemini Imagen, 🔊 audio with OpenAI TTS
🎥 Video Presentation Assembly
▶️ Interactive Player with Navigation

Input & Analysis

User enters any product name
Choose analysis dimension
AI generates structured overview

Content Generation

Detailed step analysis
Parallel image & audio generation
Interactive video assembly

Prototype Status
What we've built so far

I've already built a working pipeline:

User enters a product name.
The system generates a detailed plan.
Multiple LLMs process different steps in parallel.
Images are generated with Gemini Imagen.
Audio is generated with OpenAI TTS.
Both are combined into a complete video.
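
The parallel part of this pipeline might look like the following sketch. `generate_image` and `generate_audio` are placeholder coroutines standing in for the Gemini Imagen and OpenAI TTS calls; they do not use either real API, and all function names here are hypothetical.

```python
import asyncio


async def generate_image(step: str) -> str:
    """Placeholder for an image-generation request."""
    await asyncio.sleep(0)  # stands in for network latency
    return f"image:{step}"


async def generate_audio(step: str) -> str:
    """Placeholder for a text-to-speech request."""
    await asyncio.sleep(0)
    return f"audio:{step}"


async def build_segment(step: str) -> dict:
    # Image and audio for one step are requested concurrently.
    image, audio = await asyncio.gather(generate_image(step), generate_audio(step))
    return {"step": step, "image": image, "audio": audio}


async def build_video(steps):
    # All steps are processed concurrently; gather preserves input order.
    return await asyncio.gather(*(build_segment(step) for step in steps))


def run_pipeline(steps):
    """Synchronous entry point returning the ordered list of segments."""
    return asyncio.run(build_video(steps))


segments = run_pipeline(["Roasting", "Grinding"])
```

Because `asyncio.gather` returns results in the order the steps were submitted, the assembled video keeps the plan's sequence even when requests finish out of order.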

Cost-Effective Technology Stack

The pipeline runs on Groq LLaMA-4 for fast inference, and the result is very cost-effective: $0.10 for a first-time generation, and $0 in generation cost for replays or indexed reuse.
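
To make the arithmetic concrete, here is a small sketch of the average generation cost per view. It assumes the $0.10 figure above applies per newly generated video and that replays add nothing; the helper name and the view counts are made up, and hosting or bandwidth costs are not modeled.

```python
def average_cost_per_view(new_generations, total_views, cost_per_generation=0.10):
    """Average generation cost per view when replays and reuse are free."""
    if total_views <= 0:
        raise ValueError("total_views must be positive")
    return new_generations * cost_per_generation / total_views


# One generated video watched 100 times (made-up numbers).
per_view = average_cost_per_view(new_generations=1, total_views=100)
```

Under these assumptions the cost per view is a tenth of a cent, and it keeps falling as the same video is replayed or reused.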

Over time, the platform will evolve from an image-based MVP into a fully cinematic, dynamically assembled video learning system—without ever losing cost efficiency.

Ready to Explore?

Experience the future of product learning today. Try our Product Knowledge Platform and see how we make complex supply chains accessible and engaging.