# Cloud GPU for Image Generation — Price Report

Generated 2026-03-19. Junwon's use case: run CivitAI models (Flux, SDXL, SD1.5) self-hosted on rented GPUs via ComfyUI. ~5 days/month, many images per session, interactive prompt refinement. Privacy requirements: raw GPU access, no prompt logging.

## Best Metric: Images per Dollar

The right metric is **images/$** = (images/hour) / ($/hour). A slower GPU at half the price can beat a faster one.

### GPU Speed Benchmarks (SDXL 1024x1024, 30 steps)

| GPU | VRAM | Sec/image | Images/hr | Relative speed |
|-----|------|-----------|-----------|----------------|
| RTX 4090 | 24 GB | ~3s | ~1,200 | 1.0x (baseline) |
| RTX 3090 | 24 GB | ~9s | ~400 | 0.33x |
| RTX 4080 | 16 GB | ~7s | ~514 | 0.43x |
| RTX 4070 Ti | 16 GB | ~11s | ~327 | 0.27x |
| RTX 3080 | 12 GB | ~13s | ~277 | 0.23x |
| RTX 3060 | 12 GB | ~25s | ~144 | 0.12x |

### Cheapest Hourly Rates by GPU (Vast.ai / Salad)

| GPU | Vast.ai $/hr | Salad $/hr |
|-----|-------------|------------|
| RTX 4090 | ~$0.20 | $0.20 |
| RTX 3090 | ~$0.09 | $0.12 |
| RTX 4080 | ~$0.13 | $0.15 |
| RTX 4070 Ti | ~$0.10 | $0.12 |
| RTX 3080 | ~$0.08 | $0.11 |
| RTX 3060 | ~$0.05 | $0.08 |

### The Answer: Images per Dollar (SDXL, cheapest provider)

| GPU | $/hr | Images/hr | **Images/$** | Monthly cost (40 hrs) |
|-----|------|-----------|-------------|----------------------|
| **RTX 4090** | **$0.20** | **~1,200** | **~6,000** | **$8.00** |
| RTX 3090 | $0.09 | ~400 | ~4,444 | $3.60 |
| RTX 4080 | $0.13 | ~514 | ~3,954 | $5.20 |
| RTX 3080 | $0.08 | ~277 | ~3,463 | $3.20 |
| RTX 4070 Ti | $0.10 | ~327 | ~3,270 | $4.00 |
| RTX 3060 | $0.05 | ~144 | ~2,880 | $2.00 |

**Winner on pure images/$: RTX 4090 at $0.20/hr (~6,000 images/$).** The 4090 is so much faster that even at roughly twice the hourly rate of a 3090, it produces more images per dollar. Speed dominance overcomes the price gap.

**BUT — that's pure throughput. For interactive use, what matters is wait time per image:**

| GPU | Wait per image (SDXL) | How it feels |
|-----|----------------------|-------------|
| RTX 4090 | ~3s | Instant — click, see result, iterate |
| RTX 4080 | ~7s | Tolerable |
| RTX 3090 | ~9s | Noticeable wait each generation |
| RTX 3080 | ~13s | Slow enough to break flow |
| RTX 3060 | ~25s | Painfully slow for refinement |

### Flux.1 Schnell (same analysis)

| GPU | $/hr | Sec/img | Images/hr | **Images/$** |
|-----|------|---------|-----------|-------------|
| **RTX 4090** | **$0.20** | **~5.5s** | **~655** | **~3,275** |
| RTX 3090 | $0.09 | ~14s | ~257 | ~2,857 |
| RTX 4080 | $0.13 | ~11s | ~327 | ~2,515 |

The RTX 4090 wins on images/$ for Flux too.

### Verdict: Which GPU, Which Service

**GPU: RTX 4090.** It wins on images/$ for both SDXL and Flux despite being the most expensive per hour. The speed advantage (~3x over the 3090 on SDXL, ~2.5x on Flux) more than compensates for the roughly 2x price. And for interactive prompt refinement, 3 seconds vs 9 seconds per image is the difference between flow state and frustration.

**Service: Vast.ai at ~$0.20/hr** for the RTX 4090. The cheapest reliable provider with SSH access and strong privacy.

**If budget is the absolute priority over speed: RTX 3090 on Vast.ai at $0.09/hr.** Half the monthly cost ($3.60 vs $8.00) and still decent images/$ (~4,444), but each image takes 3x longer. Acceptable for batch generation, frustrating for interactive refinement.

**The cheapest GPU (RTX 3060 at $0.05/hr) is NOT the best value.** It's so slow that its images/$ is actually worse than the 4090's despite costing 4x less per hour.
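The metric is simple enough to check by hand. Here is a minimal Python sketch that reproduces the ranking above from the benchmark times and cheapest rates (the report's figures, not live quotes):

```python
# Images per dollar = (3600 / sec_per_image) / dollars_per_hour.
GPUS = {
    # name: (sec/image at SDXL 1024x1024 30 steps, cheapest $/hr), report figures
    "RTX 4090": (3.0, 0.20),
    "RTX 3090": (9.0, 0.09),
    "RTX 4080": (7.0, 0.13),
    "RTX 4070 Ti": (11.0, 0.10),
    "RTX 3080": (13.0, 0.08),
    "RTX 3060": (25.0, 0.05),
}

def images_per_dollar(sec_per_image: float, rate: float) -> float:
    images_per_hour = 3600 / sec_per_image
    return images_per_hour / rate

# Sort descending by images/$: the 4090 lands on top despite the highest rate.
for name, (sec, rate) in sorted(GPUS.items(),
                                key=lambda kv: -images_per_dollar(*kv[1])):
    print(f"{name:12} {3600/sec:6.0f} img/hr  {images_per_dollar(sec, rate):6.0f} img/$")
```

Running it confirms the counterintuitive result: speed scales faster than price across this lineup, so the most expensive card is also the most cost-effective.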
## Provider Comparison (16 providers)

Sorted (roughly) by RTX 4090 price. Only providers offering raw SSH/Docker access (no managed APIs) are listed.

| # | Provider | RTX 4090 $/hr | RTX 3090 $/hr | Type | Privacy | ComfyUI Template | Reliability | Notes |
|---|----------|--------------|--------------|------|---------|-----------------|-------------|-------|
| 1 | **Salad** | $0.18 | $0.10 | P2P marketplace | High (your container) | No (DIY) | Low — consumer GPUs, random drops | Cheapest raw rate. Batch tier even lower. No SSH, container-only. |
| 2 | **Vast.ai** | $0.20–0.35 | $0.09–0.20 | P2P marketplace | High (Docker isolation) | Yes | Medium — varies by host | Best balance of cheap + usable. SSH access. Wide GPU selection. |
| 3 | **Hivenet** | €0.20 (~$0.22) | — | P2P marketplace | High | No | Medium | EU-focused. Newer entrant. |
| 4 | **TensorDock** | $0.25–0.40 | $0.15–0.25 | Marketplace | High (raw VM) | No | Medium | Budget-focused. Bidding model. |
| 5 | **RunPod Community** | $0.34–0.39 | — | Marketplace | High (Docker) | Yes (1-click) | Medium-High | Best UX. ComfyUI/A1111 templates. Slight premium over Vast. |
| 6 | **CloudRift** | $0.33+ | — | Cloud | High | Yes | Medium-High | Newer. Image-gen focused. |
| 7 | **RunPod Secure** | $0.44–0.59 | — | Verified DCs | High | Yes (1-click) | High | Same as Community but verified data centers. |
| 8 | **Fluence** | $0.44 | — | Cloud | High | No | Medium | Decentralized. |
| 9 | **Spheron** | $0.55–0.58 | — | Cloud | High | No | Medium-High | Web3-adjacent. |
| 10 | **DataCrunch** | ~$0.50 | — | Cloud | High | No | High | EU data centers. |
| 11 | **FluidStack** | ~$0.50 | — | Aggregator | High | No | Medium | Aggregates underutilized DCs. |
| 12 | **Lambda Labs** | $0.50+ | — | Cloud | High | No | High | Well-known. Often out of stock. |
| 13 | **Paperspace** | $0.80+ | — | Cloud (DigitalOcean) | High | Notebook env | High | Overpriced for this use case. |
| 14 | **Google Cloud** | $1.40+ | — | Hyperscaler | Low (logging) | No | Very High | 5–7x more expensive. Logging by default. |
| 15 | **AWS** | $1.50+ | — | Hyperscaler | Low (logging) | No | Very High | Same. Enterprise overhead. |
| 16 | **Azure** | $1.60+ | — | Hyperscaler | Low (logging) | No | Very High | Same. |

**Not included:** managed inference APIs (Replicate, fal.ai, Together, RunComfy, Comfy Cloud) — they see your prompts and images. The hyperscalers (#14–16) are listed only for reference: 5–10x markup and activity logging by default.

## Monthly Cost Estimate (5 days, 8 hrs/day = 40 hrs)

| Provider | RTX 4090 rate | Monthly cost | Images/month (SDXL) |
|----------|--------------|-------------|---------------------|
| Salad | $0.18/hr | **$7.20** | ~23,200 |
| Vast.ai | $0.22/hr | **$8.80** | ~23,200 |
| Hivenet | $0.22/hr | **$8.80** | ~23,200 |
| TensorDock | $0.30/hr | **$12.00** | ~23,200 |
| RunPod Community | $0.36/hr | **$14.40** | ~23,200 |
| RunPod Secure | $0.50/hr | **$20.00** | ~23,200 |
| Lambda | $0.50/hr | **$20.00** | ~23,200 |

The image count is the same across providers because GPU performance is identical — you're renting the same hardware; the only variable is the hourly rate. The ~23,200 figure assumes roughly half of each session is active generation (see the idle-time discussion below); at full tilt, 40 hrs × ~1,200 images/hr would be ~48,000.
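A small sketch of how these monthly numbers fall out. `ACTIVE_FRACTION` is an assumption (roughly half of an interactive session is actual generation, per the idle-time discussion below), not a measurement, which is why it lands near but not exactly on the table's ~23,200:

```python
# Monthly cost for Junwon's pattern: 5 days x 8 hrs = 40 GPU-hours/month.
HOURS = 5 * 8               # billable GPU-hours per month
IMG_PER_HR = 1200           # RTX 4090, SDXL, ~3 s/image
ACTIVE_FRACTION = 0.5       # ASSUMED share of session spent generating

rates = {"Salad": 0.18, "Vast.ai": 0.22, "RunPod Community": 0.36, "Lambda": 0.50}

for provider, rate in rates.items():
    cost = rate * HOURS
    images = IMG_PER_HR * HOURS * ACTIVE_FRACTION   # ~24,000, close to the table
    print(f"{provider:18} ${cost:6.2f}/mo  ~{images:,.0f} images")
```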
## Verdict

**Cheapest: Salad ($0.18/hr RTX 4090).** But Salad is container-only (no SSH), GPUs drop mid-session, and there's no ComfyUI template — you'd need to build your own container image. Good for batch jobs, bad for interactive prompt refinement where a GPU dropping mid-session kills your flow.

**Best for Junwon's use case: Vast.ai ($0.20–0.25/hr RTX 4090).** SSH access, Docker isolation, ComfyUI templates available, and marketplace competition keeps prices low. ~$9/month for 40 hours. Privacy is strong: you run your own software in an isolated container, Vast.ai doesn't see your prompts, and destroying the instance deletes everything.

**If UX matters more than $5/month: RunPod Community ($0.34–0.39/hr).** 1-click ComfyUI and A1111 templates, a polished dashboard, slightly more reliable hosts. ~$14/month. The convenience premium is real — zero setup time means more time generating.

## Billing Models: Hourly vs Per-Second vs Per-Image

This is the critical question for sparse usage (5 days/month). Three billing models exist:

### 1. Hourly on-demand (Vast.ai, RunPod Pods, TensorDock)

You rent a GPU instance. The clock starts when you create it and stops when you stop or destroy it. **You pay for every second the instance is running, including:**

- Cold start (container boot): ~1–2 min
- Model loading (first generation): ~30s–2 min, depending on model size and whether it's cached
- Idle time between generations (thinking about prompts, tweaking settings)
- The actual generation (~5–20s per image)

**The hidden cost is idle time.** If you generate an image every 30 seconds for an hour, you're productive. If you generate one image, stare at it for 5 minutes, tweak a prompt, and generate again, you're paying $0.20–0.35/hr for thinking time. In an 8-hour "session," realistic active generation might be 3–4 hours; the rest is idle.

**Mitigation:** On Vast.ai, you can **stop** the instance between sessions (stops GPU billing, keeps disk at ~$0.01/GB/month). But you can't stop/start within a session without a 1–2 min cold restart each time.

**Realistic cost:** If 50% of session time is idle, the effective rate doubles: $0.22/hr on Vast.ai becomes ~$0.44/hr of actual generation time.

### 2. Per-second serverless (RunPod Serverless)

You deploy a worker endpoint. It spins up only when you send a request, bills per second of active compute, then spins down after an idle timeout (default 5s). **You don't pay for thinking time.**

- Cold start (first request): 10–60s (model loading from disk to GPU)
- Execution: billed per second, ~$0.00039/s for an A100 (~$1.40/hr equivalent)
- Idle timeout: 5s default (configurable) — you pay for this
- After the idle timeout: the worker scales to zero; no charge until the next request

**For sparse, bursty usage this can be cheaper** because you're not paying for the 5 minutes between generations. But the per-second rate is 2–3x higher than on-demand hourly, and cold starts add latency.

**The catch for interactive use:** If the idle timeout is too short, every generation has a cold start (10–60s wait). If it's too long, you're paying for idle time anyway. Tuning the timeout is key.

**Cost math:** 200 SDXL images in a session at 6s each = 1,200s of compute. At $0.00039/s, that's **$0.47 per session**. Compare to 4 hours on Vast.ai at $0.22/hr = **$0.88**. Serverless wins if you generate in bursts with long pauses; hourly wins if you generate continuously.

**Privacy concern:** RunPod Serverless still gives you your own container/endpoint — they don't see your prompts. But it's more managed than raw SSH.

### 3. Per-image managed (Comfy Cloud, RunComfy)

Comfy Cloud charges 0.39 credits per second of GPU time (211 credits = $1, so ~$0.0018/s, or ~$6.65/hr equivalent). It only charges during active workflow execution — not during editing or model downloads.

**This is the most expensive option by far** when measured per GPU-second. The value proposition is zero setup (native ComfyUI in the browser, no Docker, no SSH). But at ~30x the hourly rate of Vast.ai, it only makes sense if you generate fewer than ~50 images a month.

**Privacy:** They see everything — your workflows, prompts, and generated images pass through their servers.
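A side-by-side sketch of the three models on the report's example session (200 SDXL images at ~6s each over 4 wall-clock hours). Cold starts and idle-timeout seconds are ignored for simplicity, so the serverless figure is a floor:

```python
# Session cost under the three billing models, using the report's rates.

def hourly_cost(session_hours: float, rate_per_hr: float = 0.22) -> float:
    """Vast.ai-style on-demand: pay for wall-clock time, idle included."""
    return session_hours * rate_per_hr

def serverless_cost(images: int, sec_per_image: float,
                    rate_per_sec: float = 0.00039) -> float:
    """RunPod-style per-second: pay only for compute seconds."""
    return images * sec_per_image * rate_per_sec

def per_image_cost(images: int, sec_per_image: float,
                   rate_per_sec: float = 0.0018) -> float:
    """Comfy Cloud-style credits: ~30x the Vast.ai rate per GPU-second."""
    return images * sec_per_image * rate_per_sec

imgs, spi, hrs = 200, 6.0, 4.0   # the report's example session
print(f"hourly:     ${hourly_cost(hrs):.2f}")            # $0.88
print(f"serverless: ${serverless_cost(imgs, spi):.2f}")  # $0.47
print(f"per-image:  ${per_image_cost(imgs, spi):.2f}")   # $2.16
```

Note how the ranking flips with utilization: double the images in the same 4 hours and hourly stays at $0.88 while serverless climbs to $0.94.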
### Which billing model wins for 5 days/month?

| Scenario | Best model | Why |
|----------|-----------|-----|
| Long continuous sessions (4–8 hrs, always generating) | Hourly (Vast.ai) | Idle time is minimal; lowest $/hr wins |
| Short bursty sessions (1–2 hrs, lots of pauses) | Serverless (RunPod) | Don't pay for thinking time |
| Very sparse (<50 images/month) | Per-image (Comfy Cloud) | High rate but near-zero idle cost |
| **Junwon's case: 5 heavy days, interactive refinement** | **Hourly (Vast.ai)** | Heavy days mean long sessions where hourly is cheapest; interactive refinement keeps the idle % low |

**Bottom line:** For heavy interactive sessions, hourly is still cheapest. The low $/hr dominates when you're actively using the GPU most of the time. Serverless only wins if the "5 days" means short bursts with long gaps — but Junwon's brief ("many many images," "fast interactive refinement") implies sustained sessions, where hourly billing is optimal.

## RTX 4090 vs RTX 3090 for This Use Case

The RTX 3090 is roughly 3x slower at SDXL (~2.5x at Flux; see the benchmarks above) but costs about half as much on Vast.ai ($0.09–0.15/hr). For interactive refinement, where you wait on each image, the 4090's speed is worth the premium. For unattended overnight batches, the 3090's lower hourly rate keeps the absolute cost down, though the 4090 still produces more images per dollar.

## Setup (Vast.ai, one-time ~15 min)

1. Create an account at vast.ai and deposit $10
2. Search the GPU marketplace → filter RTX 4090, sort by price
3. Select a machine with a ComfyUI Docker template (or use a `pytorch/pytorch` base image and install ComfyUI yourself)
4. Launch the instance → SSH in or open the web UI URL
5. Download models from CivitAI into `/workspace/models/`
6. Open ComfyUI in the browser → generate
7. When done for the day: stop the instance (keeps disk, stops GPU billing) or destroy it (deletes everything, stops all billing)

**Tip:** Use "stop" instead of "destroy" between sessions within the same week. Disk storage is ~$0.01/GB/month — keeping your models on disk saves 10–15 min of re-downloading next session. A rough break-even sketch follows.
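In this sketch the disk rate and GPU rate are the report's figures; the model library size and re-download time are illustrative assumptions:

```python
# Stop-vs-destroy break-even. Re-downloading models is billable GPU time
# if done with the instance running.

DISK_RATE = 0.01        # $/GB/month, Vast.ai stopped-instance storage
GPU_RATE = 0.22         # $/hr while the instance runs

models_gb = 30          # ASSUMED: a few SDXL/Flux checkpoints plus LoRAs
redownload_min = 12     # ASSUMED: the ~10-15 min quoted in the tip above
sessions = 5            # sessions per month

keep_disk_cost = models_gb * DISK_RATE                       # per month
redownload_cost = (redownload_min / 60) * GPU_RATE * sessions  # per month

print(f"keep disk:   ${keep_disk_cost:.2f}/month")    # $0.30
print(f"re-download: ${redownload_cost:.2f}/month")   # $0.22
```

Either way the dollar amounts are tiny; the real saving from stopping instead of destroying is the 10–15 minutes of setup time per session, not the pennies.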