The Secret to Easy Vibe Coding? Stop Building for the Cloud

Note: This is a copy of an article I posted on LinkedIn in June 2026.

I’m sure many of my fellow instructional designers have struggled with vibe coding. I share your pain: the endless prompting and re-prompting, the tedious wait between responses, and the hopeless feeling that the project will never be finished. Despite these irritants, I continue to push through vibe-coding projects because the result is usually worth the effort.

But a recent project reminded me why I fell in love with the vibe coding concept. Unlike previous slogs, this one came together in an effortless afternoon. The “secret sauce” didn’t have anything to do with prompts. Instead, I changed where the app lives. I stopped building for the cloud and brought everything local. (Note: This is also my first GitHub project!)

Background

While playing with AI video tools, I followed common Internet advice to concentrate on getting good first and last frames for each video clip. If I supply these two frames along with direction instructions (lighting, camera movement, character movement, dialog, sound effects, et cetera), the AI model combines this information and returns the video clip. Here’s an example of first and last frame images, and the resulting video.

Since the first and last frames are critical, it’s important to have consistent characters and locations across all the shots. The usual workflow is to create a starting image from a description in a generative AI model, then use that image as a “seed” to generate additional images. For example, if you want an over-the-shoulder shot of one character looking at another, you’d use a prompt like this:

“Cinematic over-the-shoulder shot. The camera is positioned closely behind [Foreground Subject/Character], looking past their shoulder to focus sharply on [Background Subject/Action]. Match the exact visual style, lighting, and color palette of the attached reference image. Depth of field with the foreground shoulder slightly blurred. [Optional: Add specific vibe or lighting, e.g., dramatic studio lighting, bright daylight].”

The AI Lift

Gemini easily created additional prompts for other common cinematic shots, like a close-up, bird’s eye view, two-shot, or three-quarter view. I’d cut and paste the prompts as needed, customizing them on the fly to make sure they matched the mood and lighting of the seed image. And I’d repeat some prompts multiple times, because the first AI attempt didn’t match the vision in my head, or because the AI made poor choices or didn’t keep the characters consistent. The result was a lot of tedious cut-and-paste, trying to get shots that worked. CTRL+C ➡️ CTRL+V, over and over and over.

So, after about three hours of cut/paste, I decided to vibe code a tool that would create multiple shots from one reference (aka seed) image. The finished app, ShotSmith, uses a “wizard” style workflow with three main tabs. First, the user uploads an image. Next, the user selects which types of shots they want and how many of each to generate (up to ten shots per shot type).

…
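To make the shot-selection step concrete, here is a minimal Python sketch of how a choice of shot types and counts might be expanded into a list of prompts. This is not ShotSmith's actual code. The function name `build_shot_prompts`, the template keys, and the close-up wording are all hypothetical, and only the over-the-shoulder text and the ten-shot cap come from the article.

```python
MAX_SHOTS_PER_TYPE = 10  # the article mentions up to ten shots per shot type

# Hypothetical templates. Only the over-the-shoulder wording echoes the
# article; the close-up text is an illustrative assumption.
SHOT_TEMPLATES = {
    "over_the_shoulder": (
        "Cinematic over-the-shoulder shot. The camera is positioned closely "
        "behind {foreground}, looking past their shoulder to focus sharply "
        "on {background}. Match the exact visual style, lighting, and color "
        "palette of the attached reference image."
    ),
    "close_up": (
        "Cinematic close-up of {foreground}. Match the exact visual style, "
        "lighting, and color palette of the attached reference image."
    ),
}


def build_shot_prompts(selection, foreground, background, mood=None):
    """Expand {shot_type: count} into a flat list of filled-in prompts."""
    prompts = []
    for shot_type, count in selection.items():
        if shot_type not in SHOT_TEMPLATES:
            raise ValueError(f"Unknown shot type: {shot_type}")
        if not 1 <= count <= MAX_SHOTS_PER_TYPE:
            raise ValueError(
                f"Count for {shot_type} must be 1 to {MAX_SHOTS_PER_TYPE}"
            )
        # Fill the placeholders; unused keyword arguments are ignored.
        text = SHOT_TEMPLATES[shot_type].format(
            foreground=foreground, background=background
        )
        if mood:
            text = f"{text} {mood}"  # optional vibe or lighting note
        # One copy of the prompt per requested shot of this type.
        prompts.extend([text] * count)
    return prompts


if __name__ == "__main__":
    for prompt in build_shot_prompts(
        {"over_the_shoulder": 2, "close_up": 1},
        foreground="the detective",
        background="the suspect",
        mood="Dramatic studio lighting.",
    ):
        print(prompt)
```

Each prompt would then be sent to an image model together with the seed image. Repeating the same prompt for each requested shot mirrors the manual habit of re-running a prompt until one attempt works.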