The Ultimate Midjourney Prompt Formula (With Examples)
May 29, 2026
Affiliate disclosure: some links below are affiliate links. If you sign up through them, captainsmeta may earn a small commission at no extra cost to you.
Most Midjourney prompts read like Google searches: “cat in space.” Then people wonder why the output looks like everyone else’s. The pros write prompts that read like art direction — and the difference in output is enormous.
There’s no magic. There’s a structure. Once you internalize it, you stop “trying random words” and start directing an image. Here’s the six-layer formula and exactly how to use it.
The formula
[Subject] + [Action / Pose] + [Setting] + [Lighting] + [Camera / Lens] + [Style + Finish]
That’s it. Stack the layers in roughly that order, be specific in each, and your results get noticeably more controlled and less generic, without changing tools or paying more.
Layer 1: Subject (who/what)
Be specific. Not “a woman” → “a woman in her 30s, short dark hair, focused expression, wearing a navy wool coat.” Detail in the subject = a defined image instead of a generic one. (For ongoing characters, lock the subject across images with the workflow in How to Get Consistent Characters in Midjourney.)
Layer 2: Action / pose
Give the subject one clear action. “Reaching for a coffee cup,” “leaning against a doorway, looking out a window,” “running through wet streets at night.” One action beats vague atmosphere — Midjourney renders single intentions far better than chaotic scenes.
Layer 3: Setting
Where this happens, with sensory detail. “A quiet café with brass fixtures and afternoon sun through tall windows.” “A neon-soaked Tokyo alley with rain on the pavement.” Setting words do enormous lifting.
Layer 4: Lighting (the secret sauce)
This is where “cinematic” actually lives. Real lighting language: golden hour, blue hour, soft diffused window light, hard rim light, neon glow, volumetric god rays, overcast. Just adding a precise lighting phrase often saves a flat image.
Layer 5: Camera / lens
This is what separates a snapshot from cinema. Medium shot, close-up, wide establishing shot, low-angle, over-the-shoulder, 35mm lens, 50mm, 85mm portrait lens, shallow depth of field, anamorphic. Camera words shape framing and feel as much as content.
Layer 6: Style + finish
Land the look: cinematic, editorial, documentary, painterly, retro film, hyperrealistic, 35mm film grain, teal-and-orange color grade, soft contrast. Reuse the same style words across a set so images feel related.
Worked example
Stack all six layers:
- Subject: a woman in her 30s, short dark hair, focused expression, wearing a navy wool coat
- Action: walking briskly down a wet sidewalk
- Setting: empty downtown street at dawn, glass storefronts reflecting light
- Lighting: cool blue-hour light, soft mist
- Camera: medium tracking shot, 35mm lens, shallow depth of field
- Style: cinematic, slight film grain, muted teal-and-amber grade
Glue it into one prompt:
“A woman in her 30s with short dark hair and a focused expression, wearing a navy wool coat, walking briskly down a wet downtown sidewalk at dawn, empty glass storefronts reflecting light, cool blue-hour light with soft mist, medium tracking shot, 35mm lens, shallow depth of field, cinematic, slight film grain, muted teal-and-amber grade.”
Compare that to “woman walking in the city.” Same image idea — completely different result.
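If you keep prompts in a notes file or script, the stacking step is easy to automate. Here is a minimal sketch in Python, assuming you store each layer as a short phrase. The `build_prompt` helper and the `LAYER_ORDER` name are made up for illustration and are not part of any Midjourney tooling. A plain join can't smooth the wording the way hand-editing does, so the result is close to the prompt above rather than identical.

```python
# Hypothetical helper: layer names mirror the six-layer formula;
# nothing here talks to Midjourney, it only assembles text.
LAYER_ORDER = ("subject", "action", "setting", "lighting", "camera", "style")


def build_prompt(layers: dict[str, str]) -> str:
    """Join the filled-in layers in formula order, skipping any left empty."""
    unknown = set(layers) - set(LAYER_ORDER)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    parts = [
        layers[name].strip()
        for name in LAYER_ORDER
        if layers.get(name, "").strip()
    ]
    return ", ".join(parts)


layers = {
    "subject": "a woman in her 30s with short dark hair and a focused expression, wearing a navy wool coat",
    "action": "walking briskly down a wet sidewalk",
    "setting": "empty downtown street at dawn, glass storefronts reflecting light",
    "lighting": "cool blue-hour light, soft mist",
    "camera": "medium tracking shot, 35mm lens, shallow depth of field",
    "style": "cinematic, slight film grain, muted teal-and-amber grade",
}

print(build_prompt(layers))
```

Because empty layers are skipped, the same helper still works when you deliberately leave a layer out.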
The cheat sheet
| Layer | Example words |
|---|---|
| Subject | Age, build, hair, expression, clothing, defining feature |
| Action | One clear verb + posture |
| Setting | Specific place + sensory detail |
| Lighting | Golden hour, soft side-light, rim light, neon glow |
| Camera | Wide / medium / close, 35mm/50mm/85mm, depth of field |
| Style | Cinematic, editorial, painterly, film grain, color grade |
How to iterate well
- Generate with the full 6-layer prompt.
- Identify the one thing that’s off (lighting flat? framing wrong? motion or mood off?).
- Change only that layer and regenerate.
- Repeat until it sings.
Changing one variable at a time is how you learn what Midjourney responds to. Change five and you learn nothing.
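The one-variable rule is simple to enforce if you generate the test prompts programmatically. This is a rough sketch, assuming prompts are stored as a dict of layers. `vary_one_layer` and `build_prompt` are hypothetical names used only for this example.

```python
LAYER_ORDER = ("subject", "action", "setting", "lighting", "camera", "style")


def build_prompt(layers):
    """Join non-empty layers in formula order."""
    return ", ".join(layers[name] for name in LAYER_ORDER if layers.get(name))


def vary_one_layer(base, layer, options):
    """Return one prompt per option, changing only `layer` and keeping the rest of `base`."""
    if layer not in LAYER_ORDER:
        raise ValueError(f"unknown layer: {layer}")
    # {**base, layer: option} copies the base, so the original dict is untouched.
    return [build_prompt({**base, layer: option}) for option in options]


base = {
    "subject": "a woman in a navy wool coat",
    "action": "walking briskly down a wet sidewalk",
    "setting": "empty downtown street at dawn",
    "lighting": "cool blue-hour light, soft mist",
    "camera": "medium tracking shot, 35mm lens",
    "style": "cinematic, slight film grain",
}

for prompt in vary_one_layer(base, "lighting", ["golden hour", "hard rim light", "overcast"]):
    print(prompt)
```

Every prompt printed differs from the others only in its lighting phrase, so any change in the output can be traced back to that layer.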
Common problems and fixes
- Generic, flat result → add specific lighting + camera layers. They lift average prompts the most.
- Faces/hands distorted → pull back to a medium or wide shot; tight close-ups stress detail.
- Ignored instructions → trim the prompt; lead with the most important layers (Subject + Lighting + Camera).
- Inconsistent images in a series → reuse the same Lighting, Camera, and Style words across every prompt.
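For the series fix in particular, it helps to keep the shared look in one place and reuse it verbatim. A small sketch, assuming a made-up set of café scenes. `SHARED_LOOK`, `scenes`, and `series_prompts` are illustrative names, not anything Midjourney provides.

```python
# Shared look: reused word for word in every prompt of the series.
SHARED_LOOK = {
    "lighting": "soft diffused window light",
    "camera": "medium shot, 50mm lens, shallow depth of field",
    "style": "editorial, 35mm film grain, soft contrast",
}

# Only the content layers (subject, action, setting) change per image.
scenes = [
    ("a barista in a denim apron", "pouring milk into a cup", "a quiet café with brass fixtures"),
    ("a barista in a denim apron", "wiping down the counter", "a quiet café with brass fixtures"),
]


def series_prompts(scenes, look):
    """Build one prompt per (subject, action, setting) scene, appending the same look to each."""
    look_text = ", ".join(look[key] for key in ("lighting", "camera", "style"))
    return [", ".join([*scene, look_text]) for scene in scenes]


for prompt in series_prompts(scenes, SHARED_LOOK):
    print(prompt)
```

Editing `SHARED_LOOK` once changes the look of the whole set, which is easier than hunting for stray style words across a dozen prompts.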
When to break the formula
You can omit layers — for example, a tight studio portrait might not need a setting beyond “clean gradient background.” The formula isn’t a checkbox list; it’s a menu of the levers that matter. Pull the ones that lift this specific image.
FAQ
Does this work in other generators too? The structure works in any generator (Flux, Ideogram, DALL·E, etc.). Each tool weighs words slightly differently — test and adjust.
How long should a prompt be? Long enough to be specific, short enough to focus. Trim filler words. If a layer doesn’t change the result when removed, it wasn’t doing anything.
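One way to run that removal test systematically is to generate a version of the prompt with each layer dropped in turn, then compare the results side by side. A minimal sketch, where `drop_each_layer` is a hypothetical helper and the fox prompt is just sample data:

```python
LAYER_ORDER = ("subject", "action", "setting", "lighting", "camera", "style")


def drop_each_layer(layers):
    """Map each filled layer name to the prompt built without that layer."""
    def join(selected):
        return ", ".join(selected[name] for name in LAYER_ORDER if selected.get(name))

    return {
        name: join({k: v for k, v in layers.items() if k != name})
        for name in LAYER_ORDER
        if layers.get(name)
    }


layers = {
    "subject": "a red fox",
    "setting": "a snowy birch forest",
    "lighting": "overcast",
    "style": "painterly",
}

for removed, prompt in drop_each_layer(layers).items():
    print(f"without {removed}: {prompt}")
```

If the image generated from a "without" prompt looks the same as the full version, that layer is a candidate for trimming.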
Do I need to know real photography terms? A handful — shot types, depth of field, common lighting and lens phrases. It’s the highest-ROI vocabulary you can pick up for AI image work.
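What about negative prompts? Use them sparingly to remove a specific recurring problem. In Midjourney, that means adding the --no parameter at the end of the prompt (e.g., “--no text, watermarks”) rather than writing “no text” in the prompt body, where the model may treat “text” as something to include. Over-using negatives can confuse the model. Always lead positive.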
The bottom line
A great Midjourney prompt isn’t a sentence — it’s six stacked layers of art direction: Subject, Action, Setting, Lighting, Camera, Style. Be specific in each, iterate one variable at a time, and reuse style words across a set. Internalize the formula and you stop hoping for good outputs — you direct them.
👉 Next: keep characters on-model with How to Get Consistent Characters in Midjourney, and grab ready prompts in 50 Copy-Paste Prompts for Stunning Social Media Graphics.