# Cloud GPU for Image Generation — Price Report

Generated 2026-03-19. Junwon's use case: run CivitAI models (Flux, SDXL, SD1.5) self-hosted on rented GPUs via ComfyUI. ~5 days/month, many images per session, interactive prompt refinement. Privacy requirements: raw GPU access, no prompt logging.

## Best Metric: Images per Dollar

The right metric is **images/$** = (images/hour) / ($/hour). A slower GPU at half the price can beat a faster one.

### GPU Speed Benchmarks (SDXL 1024x1024, 30 steps)

| GPU | VRAM | Sec/image | Images/hr | Relative speed |
|-----|------|-----------|-----------|----------------|
| RTX 4090 | 24 GB | ~3s | ~1,200 | 1.0x (baseline) |
| RTX 3090 | 24 GB | ~9s | ~400 | 0.33x |
| RTX 4080 | 16 GB | ~7s | ~514 | 0.43x |
| RTX 4070 Ti | 16 GB | ~11s | ~327 | 0.27x |
| RTX 3080 | 12 GB | ~13s | ~277 | 0.23x |
| RTX 3060 | 12 GB | ~25s | ~144 | 0.12x |

### Cheapest Hourly Rates by GPU (Vast.ai / Salad)

| GPU | Vast.ai $/hr | Salad $/hr |
|-----|-------------|------------|
| RTX 4090 | ~$0.20 | $0.20 |
| RTX 3090 | ~$0.09 | $0.12 |
| RTX 4080 | ~$0.13 | $0.15 |
| RTX 4070 Ti | ~$0.10 | $0.12 |
| RTX 3080 | ~$0.08 | $0.11 |
| RTX 3060 | ~$0.05 | $0.08 |

### The Answer: Images per Dollar (SDXL, cheapest provider)

| GPU | $/hr | Images/hr | **Images/$** | Monthly cost (40 hrs) |
|-----|------|-----------|-------------|----------------------|
| **RTX 4090** | **$0.20** | **~1,200** | **~6,000** | **$8.00** |
| RTX 3090 | $0.09 | ~400 | ~4,444 | $3.60 |
| RTX 4080 | $0.13 | ~514 | ~3,954 | $5.20 |
| RTX 3080 | $0.08 | ~277 | ~3,463 | $3.20 |
| RTX 4070 Ti | $0.10 | ~327 | ~3,270 | $4.00 |
| RTX 3060 | $0.05 | ~144 | ~2,880 | $2.00 |

**Winner on pure images/$: RTX 4090 at $0.20/hr (~6,000 images/$).** The 4090 is so much faster that even at roughly twice the hourly rate of a 3090, it produces more images per dollar. Speed dominance overcomes the price gap.

**BUT — that's pure throughput. For interactive use, what matters is wait time per image:**

| GPU | Wait per image (SDXL) | How it feels |
|-----|----------------------|-------------|
| RTX 4090 | ~3s | Instant — click, see result, iterate |
| RTX 4080 | ~7s | Tolerable |
| RTX 3090 | ~9s | Noticeable wait each generation |
| RTX 3080 | ~13s | Slow enough to break flow |
| RTX 3060 | ~25s | Painfully slow for refinement |

### Flux.1 Schnell (same analysis)

| GPU | $/hr | Sec/img | Images/hr | **Images/$** |
|-----|------|---------|-----------|-------------|
| **RTX 4090** | **$0.20** | **~5.5s** | **~655** | **~3,275** |
| RTX 3090 | $0.09 | ~14s | ~257 | ~2,857 |
| RTX 4080 | $0.13 | ~11s | ~327 | ~2,515 |

The RTX 4090 wins on images/$ for Flux too.

### Verdict: Which GPU, Which Service

**GPU: RTX 4090.** It wins on images/$ for both SDXL and Flux despite being the most expensive per hour. The speed advantage (~3x over the 3090 on SDXL, ~2.5x on Flux) more than compensates for the roughly 2x price. And for interactive prompt refinement, 3 seconds vs 9 seconds per image is the difference between flow state and frustration.

**Service: Vast.ai at ~$0.20/hr** for the RTX 4090. The cheapest reliable provider with SSH access and strong privacy.

**If budget is the absolute priority over speed: RTX 3090 on Vast.ai at $0.09/hr.** Half the monthly cost ($3.60 vs $8.00) and still decent images/$ (~4,444), but each image takes 3x longer. Acceptable for batch generation, frustrating for interactive refinement.

**The cheapest GPU (RTX 3060 at $0.05/hr) is NOT the best value.** It's so slow that its images/$ is actually worse than the 4090's despite costing 4x less per hour.
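The metric is simple enough to check by hand. Here is a minimal Python sketch that reproduces the ranking above from the benchmark times and cheapest rates (the report's figures, not live quotes):

```python
# Images per dollar = (3600 / sec_per_image) / dollars_per_hour.
GPUS = {
    # name: (sec/image at SDXL 1024x1024 30 steps, cheapest $/hr), report figures
    "RTX 4090": (3.0, 0.20),
    "RTX 3090": (9.0, 0.09),
    "RTX 4080": (7.0, 0.13),
    "RTX 4070 Ti": (11.0, 0.10),
    "RTX 3080": (13.0, 0.08),
    "RTX 3060": (25.0, 0.05),
}

def images_per_dollar(sec_per_image: float, rate: float) -> float:
    images_per_hour = 3600 / sec_per_image
    return images_per_hour / rate

# Sort descending by images/$: the 4090 lands on top despite the highest rate.
for name, (sec, rate) in sorted(GPUS.items(),
                                key=lambda kv: -images_per_dollar(*kv[1])):
    print(f"{name:12} {3600/sec:6.0f} img/hr  {images_per_dollar(sec, rate):6.0f} img/$")
```

Running it confirms the counterintuitive result: speed scales faster than price across this lineup, so the most expensive card is also the most cost-effective.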
## Provider Comparison (16 providers)

Sorted (roughly) by RTX 4090 price. Only providers offering raw SSH/Docker access (no managed APIs) are listed.

| # | Provider | RTX 4090 $/hr | RTX 3090 $/hr | Type | Privacy | ComfyUI Template | Reliability | Notes |
|---|----------|--------------|--------------|------|---------|-----------------|-------------|-------|
| 1 | **Salad** | $0.18 | $0.10 | P2P marketplace | High (your container) | No (DIY) | Low — consumer GPUs, random drops | Cheapest raw rate. Batch tier even lower. No SSH, container-only. |
| 2 | **Vast.ai** | $0.20–0.35 | $0.09–0.20 | P2P marketplace | High (Docker isolation) | Yes | Medium — varies by host | Best balance of cheap + usable. SSH access. Wide GPU selection. |
| 3 | **Hivenet** | €0.20 (~$0.22) | — | P2P marketplace | High | No | Medium | EU-focused. Newer entrant. |
| 4 | **TensorDock** | $0.25–0.40 | $0.15–0.25 | Marketplace | High (raw VM) | No | Medium | Budget-focused. Bidding model. |
| 5 | **RunPod Community** | $0.34–0.39 | — | Marketplace | High (Docker) | Yes (1-click) | Medium-High | Best UX. ComfyUI/A1111 templates. Slight premium over Vast. |
| 6 | **CloudRift** | $0.33+ | — | Cloud | High | Yes | Medium-High | Newer. Image-gen focused. |
| 7 | **RunPod Secure** | $0.44–0.59 | — | Verified DCs | High | Yes (1-click) | High | Same as Community but verified data centers. |
| 8 | **Fluence** | $0.44 | — | Cloud | High | No | Medium | Decentralized. |
| 9 | **Spheron** | $0.55–0.58 | — | Cloud | High | No | Medium-High | Web3-adjacent. |
| 10 | **DataCrunch** | ~$0.50 | — | Cloud | High | No | High | EU data centers. |
| 11 | **FluidStack** | ~$0.50 | — | Aggregator | High | No | Medium | Aggregates underutilized DCs. |
| 12 | **Lambda Labs** | $0.50+ | — | Cloud | High | No | High | Well-known. Often out of stock. |
| 13 | **Paperspace** | $0.80+ | — | Cloud (DigitalOcean) | High | Notebook env | High | Overpriced for this use case. |
| 14 | **Google Cloud** | $1.40+ | — | Hyperscaler | Low (logging) | No | Very High | 5–7x more expensive. Logging by default. |
| 15 | **AWS** | $1.50+ | — | Hyperscaler | Low (logging) | No | Very High | Same. Enterprise overhead. |
| 16 | **Azure** | $1.60+ | — | Hyperscaler | Low (logging) | No | Very High | Same. |

**Not included:** managed inference APIs (Replicate, fal.ai, Together, RunComfy, Comfy Cloud) — they see your prompts and images. The hyperscalers (#14–16) are listed only for reference: 5–10x markup and activity logging by default.

## Monthly Cost Estimate (5 days, 8 hrs/day = 40 hrs)

| Provider | RTX 4090 rate | Monthly cost | Images/month (SDXL) |
|----------|--------------|-------------|---------------------|
| Salad | $0.18/hr | **$7.20** | ~23,200 |
| Vast.ai | $0.22/hr | **$8.80** | ~23,200 |
| Hivenet | $0.22/hr | **$8.80** | ~23,200 |
| TensorDock | $0.30/hr | **$12.00** | ~23,200 |
| RunPod Community | $0.36/hr | **$14.40** | ~23,200 |
| RunPod Secure | $0.50/hr | **$20.00** | ~23,200 |
| Lambda | $0.50/hr | **$20.00** | ~23,200 |

The image count is the same across providers because GPU performance is identical — you're renting the same hardware; the only variable is the hourly rate. The ~23,200 figure assumes roughly half of each session is active generation (see the idle-time discussion below); at full tilt, 40 hrs × ~1,200 images/hr would be ~48,000.
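A small sketch of how these monthly numbers fall out. `ACTIVE_FRACTION` is an assumption (roughly half of an interactive session is actual generation, per the idle-time discussion below), not a measurement, which is why it lands near but not exactly on the table's ~23,200:

```python
# Monthly cost for Junwon's pattern: 5 days x 8 hrs = 40 GPU-hours/month.
HOURS = 5 * 8               # billable GPU-hours per month
IMG_PER_HR = 1200           # RTX 4090, SDXL, ~3 s/image
ACTIVE_FRACTION = 0.5       # ASSUMED share of session spent generating

rates = {"Salad": 0.18, "Vast.ai": 0.22, "RunPod Community": 0.36, "Lambda": 0.50}

for provider, rate in rates.items():
    cost = rate * HOURS
    images = IMG_PER_HR * HOURS * ACTIVE_FRACTION   # ~24,000, close to the table
    print(f"{provider:18} ${cost:6.2f}/mo  ~{images:,.0f} images")
```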
## Verdict

**Cheapest: Salad ($0.18/hr RTX 4090).** But Salad is container-only (no SSH), GPUs drop mid-session, and there's no ComfyUI template — you'd need to build your own container image. Good for batch jobs, bad for interactive prompt refinement where a GPU dropping mid-session kills your flow.

**Best for Junwon's use case: Vast.ai ($0.20–0.25/hr RTX 4090).** SSH access, Docker isolation, ComfyUI templates available, and marketplace competition keeps prices low. ~$9/month for 40 hours. Privacy is strong: you run your own software in an isolated container, Vast.ai doesn't see your prompts, and destroying the instance deletes everything.

**If UX matters more than $5/month: RunPod Community ($0.34–0.39/hr).** 1-click ComfyUI and A1111 templates, a polished dashboard, slightly more reliable hosts. ~$14/month. The convenience premium is real — zero setup time means more time generating.

## Billing Models: Hourly vs Per-Second vs Per-Image

This is the critical question for sparse usage (5 days/month). Three billing models exist:

### 1. Hourly on-demand (Vast.ai, RunPod Pods, TensorDock)

You rent a GPU instance. The clock starts when you create it and stops when you stop or destroy it. **You pay for every second the instance is running, including:**

- Cold start (container boot): ~1–2 min
- Model loading (first generation): ~30s–2 min, depending on model size and whether it's cached
- Idle time between generations (thinking about prompts, tweaking settings)
- The actual generation (~5–20s per image)

**The hidden cost is idle time.** If you generate an image every 30 seconds for an hour, you're productive. If you generate one image, stare at it for 5 minutes, tweak a prompt, and generate again, you're paying $0.20–0.35/hr for thinking time. In an 8-hour "session," realistic active generation might be 3–4 hours; the rest is idle.

**Mitigation:** On Vast.ai, you can **stop** the instance between sessions (stops GPU billing, keeps disk at ~$0.01/GB/month). But you can't stop/start within a session without a 1–2 min cold restart each time.

**Realistic cost:** If 50% of session time is idle, the effective rate doubles: $0.22/hr on Vast.ai becomes ~$0.44/hr of actual generation time.

### 2. Per-second serverless (RunPod Serverless)

You deploy a worker endpoint. It spins up only when you send a request, bills per second of active compute, then spins down after an idle timeout (default 5s). **You don't pay for thinking time.**

- Cold start (first request): 10–60s (model loading from disk to GPU)
- Execution: billed per second, ~$0.00039/s for an A100 (~$1.40/hr equivalent)
- Idle timeout: 5s default (configurable) — you pay for this
- After the idle timeout: the worker scales to zero; no charge until the next request

**For sparse, bursty usage this can be cheaper** because you're not paying for the 5 minutes between generations. But the per-second rate is 2–3x higher than on-demand hourly, and cold starts add latency.

**The catch for interactive use:** If the idle timeout is too short, every generation has a cold start (10–60s wait). If it's too long, you're paying for idle time anyway. Tuning the timeout is key.

**Cost math:** 200 SDXL images in a session at 6s each = 1,200s of compute. At $0.00039/s, that's **$0.47 per session**. Compare to 4 hours on Vast.ai at $0.22/hr = **$0.88**. Serverless wins if you generate in bursts with long pauses; hourly wins if you generate continuously.

**Privacy concern:** RunPod Serverless still gives you your own container/endpoint — they don't see your prompts. But it's more managed than raw SSH.

### 3. Per-image managed (Comfy Cloud, RunComfy)

Comfy Cloud charges 0.39 credits per second of GPU time (211 credits = $1, so ~$0.0018/s, or ~$6.65/hr equivalent). It only charges during active workflow execution — not during editing or model downloads.

**This is the most expensive option by far** when measured per GPU-second. The value proposition is zero setup (native ComfyUI in the browser, no Docker, no SSH). But at ~30x the hourly rate of Vast.ai, it only makes sense if you generate fewer than ~50 images a month.

**Privacy:** They see everything — your workflows, prompts, and generated images pass through their servers.
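A side-by-side sketch of the three models on the report's example session (200 SDXL images at ~6s each over 4 wall-clock hours). Cold starts and idle-timeout seconds are ignored for simplicity, so the serverless figure is a floor:

```python
# Session cost under the three billing models, using the report's rates.

def hourly_cost(session_hours: float, rate_per_hr: float = 0.22) -> float:
    """Vast.ai-style on-demand: pay for wall-clock time, idle included."""
    return session_hours * rate_per_hr

def serverless_cost(images: int, sec_per_image: float,
                    rate_per_sec: float = 0.00039) -> float:
    """RunPod-style per-second: pay only for compute seconds."""
    return images * sec_per_image * rate_per_sec

def per_image_cost(images: int, sec_per_image: float,
                   rate_per_sec: float = 0.0018) -> float:
    """Comfy Cloud-style credits: ~30x the Vast.ai rate per GPU-second."""
    return images * sec_per_image * rate_per_sec

imgs, spi, hrs = 200, 6.0, 4.0   # the report's example session
print(f"hourly:     ${hourly_cost(hrs):.2f}")            # $0.88
print(f"serverless: ${serverless_cost(imgs, spi):.2f}")  # $0.47
print(f"per-image:  ${per_image_cost(imgs, spi):.2f}")   # $2.16
```

Note how the ranking flips with utilization: double the images in the same 4 hours and hourly stays at $0.88 while serverless climbs to $0.94.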
### Which billing model wins for 5 days/month?

| Scenario | Best model | Why |
|----------|-----------|-----|
| Long continuous sessions (4–8 hrs, always generating) | Hourly (Vast.ai) | Idle time is minimal; lowest $/hr wins |
| Short bursty sessions (1–2 hrs, lots of pauses) | Serverless (RunPod) | Don't pay for thinking time |
| Very sparse (<50 images/month) | Per-image (Comfy Cloud) | High rate but near-zero idle cost |
| **Junwon's case: 5 heavy days, interactive refinement** | **Hourly (Vast.ai)** | Heavy days mean long sessions where hourly is cheapest; interactive refinement keeps the idle % low |

**Bottom line:** For heavy interactive sessions, hourly is still cheapest. The low $/hr dominates when you're actively using the GPU most of the time. Serverless only wins if the "5 days" means short bursts with long gaps — but Junwon's brief ("many many images," "fast interactive refinement") implies sustained sessions, where hourly billing is optimal.

## RTX 4090 vs RTX 3090 for This Use Case

The RTX 3090 is roughly 3x slower at SDXL (~2.5x at Flux; see the benchmarks above) but costs about half as much on Vast.ai ($0.09–0.15/hr). For interactive refinement, where you wait on each image, the 4090's speed is worth the premium. For unattended overnight batches, the 3090's lower hourly rate keeps the absolute cost down, though the 4090 still produces more images per dollar.

## Setup (Vast.ai, one-time ~15 min)

1. Create an account at vast.ai and deposit $10
2. Search the GPU marketplace → filter RTX 4090, sort by price
3. Select a machine with a ComfyUI Docker template (or use a `pytorch/pytorch` base image and install ComfyUI yourself)
4. Launch the instance → SSH in or open the web UI URL
5. Download models from CivitAI into `/workspace/models/`
6. Open ComfyUI in the browser → generate
7. When done for the day: stop the instance (keeps disk, stops GPU billing) or destroy it (deletes everything, stops all billing)

**Tip:** Use "stop" instead of "destroy" between sessions within the same week. Disk storage is ~$0.01/GB/month — keeping your models on disk saves 10–15 min of re-downloading next session. A rough break-even sketch follows.
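In this sketch the disk rate and GPU rate are the report's figures; the model library size and re-download time are illustrative assumptions:

```python
# Stop-vs-destroy break-even. Re-downloading models is billable GPU time
# if done with the instance running.

DISK_RATE = 0.01        # $/GB/month, Vast.ai stopped-instance storage
GPU_RATE = 0.22         # $/hr while the instance runs

models_gb = 30          # ASSUMED: a few SDXL/Flux checkpoints plus LoRAs
redownload_min = 12     # ASSUMED: the ~10-15 min quoted in the tip above
sessions = 5            # sessions per month

keep_disk_cost = models_gb * DISK_RATE                       # per month
redownload_cost = (redownload_min / 60) * GPU_RATE * sessions  # per month

print(f"keep disk:   ${keep_disk_cost:.2f}/month")    # $0.30
print(f"re-download: ${redownload_cost:.2f}/month")   # $0.22
```

Either way the dollar amounts are tiny; the real saving from stopping instead of destroying is the 10–15 minutes of setup time per session, not the pennies.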