Stable Diffusion Local Installation and ComfyUI Setup Practical Guide • Meteora Web Agency

Have you ever paid for Midjourney or DALL·E subscriptions, only to realize you have no control over the model, privacy, or costs? Prompt limits, generation caps, your data sent to unknown servers. If this sounds familiar, it's time to go local: install Stable Diffusion on your own machine with ComfyUI.

Here at Meteora Web, we chose the local path for one simple reason: ownership and control. No monthly fees, no restrictions on what you generate, no dependence on third-party servers. Just your GPU and your creativity. This guide takes you from installation to your first generated image, step by step, with real code examples.

This guide is part of our series on AI for Images. If you're new to Stable Diffusion, start here. If you already know it, you'll find tips on ComfyUI and performance optimizations.

Why Install Stable Diffusion Locally?

Stable Diffusion is an open-source image generation model. Running it on your PC instead of a paid cloud service gives you:

Full privacy: your prompts and images stay local. No cloud, no unwanted eyes.
Zero ongoing cost: once you have the hardware, there are no subscription fees.
No limits: generate as many images as you want, no queues or usage caps.
Maximum customization: use any model (SD 1.5, SDXL, Flux) and add LoRAs, ControlNet, or custom embeddings.

The only downside is the initial setup, but with this guide you'll be done in about 20 minutes.

Hardware and Prerequisites

GPU

Stable Diffusion runs best on NVIDIA GPUs with at least 6 GB VRAM. 8 GB is comfortable for SDXL, and 12 GB gives you headroom. AMD (Radeon) and Intel Arc cards can work through alternative backends such as ROCm or OpenVINO, but support varies by card and operating system, and the experience may be less smooth. No GPU? You can still run on CPU, but expect roughly 5-10 minutes per image.

Practical advice: if your card has less than 4 GB VRAM, consider cloud services like RunPod or Google Colab. But most modern gaming PCs (RTX 2060 and up) handle local installation well.

Required Software

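Python 3.10 or newer. The older advice to avoid 3.12+ came from library compatibility problems that have largely been resolved, so check the ComfyUI README for the currently recommended version. Get it from python.org.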
Git for cloning repositories. Download from git-scm.com.
Disk space: at least 20 GB for models and generated images.
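
Before moving on, you can check these prerequisites from Python itself. The snippet below is a minimal sketch using only the standard library. The function name check_prerequisites and the 3.10 lower bound are our own choices for illustration; adjust the bound to whatever the ComfyUI README currently recommends.

```python
import shutil
import sys

MIN_FREE_GB = 20  # disk space suggested in this guide


def check_prerequisites(path: str = ".") -> dict:
    """Collect basic facts about the Python version, Git, and free disk space."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        "python_ok": sys.version_info >= (3, 10),  # assumed lower bound
        "git_found": shutil.which("git") is not None,
        "free_disk_gb": round(free_gb, 1),
        "disk_ok": free_gb >= MIN_FREE_GB,
    }


for name, value in check_prerequisites().items():
    print(f"{name}: {value}")
```

Run it from the drive where you plan to install ComfyUI, since free space is measured for the current folder.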

Installing ComfyUI

ComfyUI is the most powerful node-based interface for Stable Diffusion. It lets you build visual workflows combining models, prompts, samplers, and post-processing. It's our tool of choice.

Step 1: Clone the Repository

Open a terminal (PowerShell on Windows, bash on Linux/Mac) and run:

git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

Step 2: Create a Python Virtual Environment

To keep your system Python clean:

python -m venv venv
# Activate:
# Windows:
venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate
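
To confirm the activation worked, you can ask the interpreter where it is running from. This is a small sketch based on the standard-library behavior of venv, where sys.prefix differs from sys.base_prefix inside a virtual environment; the helper name in_virtualenv is ours.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv created by `python -m venv`."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints False, the packages from the next step would land in your system Python instead of the venv folder.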

Step 3: Install Dependencies

pip install -r requirements.txt

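If you have an NVIDIA GPU, check that pip installed a CUDA-enabled PyTorch build. On Windows in particular the default package can be CPU-only, in which case install PyTorch first using the command from the ComfyUI README. xformers is optional: it can reduce VRAM usage, but it must match your PyTorch version, and recent PyTorch releases already include memory-efficient attention. To install it: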

pip install xformers

For AMD or Intel, follow the official ComfyUI instructions on their GitHub repo.
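
A quick way to see whether the installed PyTorch can actually reach your GPU is to query it directly. The sketch below assumes PyTorch was installed by the previous step and falls back to a message if it was not. The function name describe_torch is hypothetical. Note that torch.cuda.is_available() also reports True on ROCm builds for AMD cards.

```python
import importlib.util


def describe_torch() -> str:
    """Summarize the PyTorch install and the first visible GPU, if any."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch

    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
        return f"PyTorch {torch.__version__}, GPU: {name}, {total_gb:.1f} GB VRAM"
    return f"PyTorch {torch.__version__} installed, but no GPU is visible to it"


print(describe_torch())
```

If it reports no visible GPU on a machine that has one, the usual cause is a CPU-only PyTorch build.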

Step 4: Download a Base Model

ComfyUI ships without models. You need to download them manually. Two main sources:

Hugging Face: official models like sd-v1-4.ckpt or sd_xl_base_1.0.safetensors.
CivitAI: thousands of community models (realistic, anime, stylized).

We recommend starting with a simple model like Realistic Vision V5.1 from CivitAI. Download the .safetensors file and copy it to ComfyUI/models/checkpoints/.

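Example using wget to fetch SD 2.1 from Hugging Face. Run it from the ComfyUI folder. If wget is not available, as on a default Windows setup, download the file in a browser instead and copy it to models/checkpoints/: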

# Install wget if needed
wget -P models/checkpoints/ https://huggingface.co/stabilityai/stable-diffusion-2-1/resolve/main/v2-1_768-ema-pruned.safetensors
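
Once the download finishes, it is worth confirming the file landed where ComfyUI looks for it. The following sketch lists checkpoint files under models/checkpoints/. The function name list_checkpoints and the suffix filter are our own assumptions, not part of ComfyUI.

```python
from pathlib import Path

CHECKPOINT_SUFFIXES = {".safetensors", ".ckpt"}


def list_checkpoints(comfy_root: str = ".") -> list[str]:
    """Return checkpoint file names found under <comfy_root>/models/checkpoints."""
    folder = Path(comfy_root) / "models" / "checkpoints"
    if not folder.is_dir():
        return []
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in CHECKPOINT_SUFFIXES
    )


found = list_checkpoints(".")
print(found if found else "no checkpoints found; run this from the ComfyUI folder")
```

An empty result usually means the file went to a different folder or is still downloading.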

Step 5: Launch ComfyUI

python main.py

After the startup messages, the console output should end with lines similar to:

Starting server
To see the GUI go to: http://127.0.0.1:8188

Open your browser and go to http://127.0.0.1:8188.
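
If the page does not load, a short script can tell you whether the server is answering at all. This is a minimal sketch with the standard library. The address is the default printed by ComfyUI, and the helper name server_is_up is ours.

```python
import http.client
import urllib.error
import urllib.request


def server_is_up(url: str = "http://127.0.0.1:8188", timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to the given address answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a malformed reply all count as "down".
        return False


print("ComfyUI reachable:", server_is_up())
```

A False result while the console shows the server running usually points to a different port or a firewall rule.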

First Workflow: Text-to-Image

ComfyUI uses nodes. Each node is a function: load model, process prompt, run sampler, save image.

For the first generation, use the default workflow. Click Load Default in the menu (it may appear automatically).

If not, create it manually:

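Add node → loaders → Load Checkpoint: select your model. This node has three outputs: MODEL, CLIP, and VAE.
Add two CLIP Text Encode (Prompt) nodes and connect the checkpoint's CLIP output to both. Enter the positive prompt in one and the negative prompt in the other.
Add an Empty Latent Image node and set the width, height, and batch size.
Add a KSampler node. Connect the checkpoint's MODEL output, the two conditioning outputs (positive and negative), and the empty latent. Set seed, steps, and CFG scale (7 is a good start).
Connect the KSampler's latent output to a VAE Decode node, and connect a VAE to it (the checkpoint's VAE output or a separately loaded one).
Connect the decoded image to a Save Image node.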
Press Queue Prompt (or Ctrl+Enter).

You'll see each node highlight as it runs, and the finished image appears in the Save Image node. If nothing happens, check the console for errors.
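
The same text-to-image graph can also be written as data, which makes the wiring explicit. The sketch below follows the API-format JSON that ComfyUI can export: each node has a class_type and inputs, and a link is written as [source node id, output index]. The checkpoint file name and prompts are placeholders, the node ids are arbitrary, and the function names are ours. Treat the queue_prompt helper as an assumption about a default local server on port 8188.

```python
import json
import urllib.request


def build_txt2img_workflow(ckpt_name, positive, negative="", seed=0,
                           steps=20, cfg=7.0, width=512, height=512):
    """Build a basic text-to-image graph in ComfyUI's API JSON layout."""
    return {
        # Outputs of the checkpoint loader: 0 = MODEL, 1 = CLIP, 2 = VAE
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt_name}},
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": steps, "cfg": cfg,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}},
    }


def queue_prompt(workflow, server="http://127.0.0.1:8188"):
    """Send the graph to a running ComfyUI server (assumed default address)."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    request = urllib.request.Request(
        f"{server}/prompt", data=data,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Placeholder file name: use a checkpoint that exists in models/checkpoints/.
workflow = build_txt2img_workflow("my_model.safetensors",
                                  "a lighthouse at sunset", "blurry, low quality")
print(json.dumps(workflow["5"], indent=2))
```

Calling queue_prompt(workflow) while ComfyUI is running should queue the job just like pressing Queue Prompt in the browser.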

Key Optimizations

With limited VRAM (6-8 GB), these tweaks save you:

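--lowvram: start ComfyUI with python main.py --lowvram. Generation is slower, but parts of the model are moved out of VRAM when not in use, which lowers peak VRAM usage.
xformers: optional. If it is installed and matches your PyTorch version, ComfyUI can use it for memory-efficient attention; recent PyTorch builds already include a comparable optimization.
Reduce resolution: for SD 1.5 use 512x512; for SDXL you can try 768x768 instead of 1024x1024, at some cost in quality because SDXL was trained around 1024x1024.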
Batch size = 1: generating one image at a time uses less VRAM.
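
To get a feel for why resolution matters so much, compare pixel counts and latent sizes. The sketch below assumes the 8x spatial downscale and 4 latent channels used by the SD 1.5 and SDXL VAEs (other model families differ). Actual VRAM use does not scale exactly with these numbers, so treat them as a rough guide. The function names are ours.

```python
def pixel_ratio(width, height, ref_width, ref_height):
    """Fraction of pixels in (width x height) relative to a reference size."""
    return (width * height) / (ref_width * ref_height)


def latent_shape(width, height, batch_size=1, channels=4, downscale=8):
    """Shape of the latent tensor the sampler works on (assumed SD 1.5/SDXL VAE)."""
    return (batch_size, channels, height // downscale, width // downscale)


print(f"768x768 vs 1024x1024: {pixel_ratio(768, 768, 1024, 1024):.0%} of the pixels")
print("latent for 512x512:", latent_shape(512, 512))
print("latent for 1024x1024, batch of 4:", latent_shape(1024, 1024, batch_size=4))
```

Dropping SDXL from 1024x1024 to 768x768 leaves a little over half the pixels to process, and the batch size multiplies the latent tensor directly.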

Extending ComfyUI: ControlNet, LoRA, Upscalers

Once you master the basic flow, add more power:

ControlNet: nodes that condition generation on reference images (pose, depth, edges). Download ControlNet models from Hugging Face and place them in models/controlnet/.
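LoRA: lightweight weights for specific styles or characters. Put files in models/loras/ and add a Load LoRA node after the checkpoint: feed it the checkpoint's MODEL and CLIP outputs, then connect its outputs to the KSampler and the CLIP Text Encode nodes.
Upscaler: increase resolution after generation. Install extra nodes like Ultimate SD Upscale or use an upscale model like 4x_NMKD (placed in models/upscale_models/), which enlarges the image by the factor in its name.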
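
As an illustration of where a LoRA sits in the graph, the sketch below patches an API-format workflow so that everything that used the checkpoint's MODEL and CLIP outputs reads from a LoraLoader node instead, while the VAE link stays on the checkpoint. The trimmed workflow, file names, node ids, and the add_lora helper are all hypothetical and only meant to show the rewiring.

```python
import copy


def add_lora(workflow, lora_name, strength=1.0, checkpoint_id="1", lora_id="10"):
    """Return a copy of the workflow with a LoraLoader placed after the checkpoint."""
    patched = copy.deepcopy(workflow)
    # Redirect consumers of the checkpoint's MODEL (0) and CLIP (1) outputs.
    for node in patched.values():
        for key, value in node["inputs"].items():
            if value == [checkpoint_id, 0]:
                node["inputs"][key] = [lora_id, 0]
            elif value == [checkpoint_id, 1]:
                node["inputs"][key] = [lora_id, 1]
    # The LoRA node itself still reads from the checkpoint.
    patched[lora_id] = {
        "class_type": "LoraLoader",
        "inputs": {"lora_name": lora_name,
                   "strength_model": strength, "strength_clip": strength,
                   "model": [checkpoint_id, 0], "clip": [checkpoint_id, 1]},
    }
    return patched


# Trimmed example graph (not a complete workflow), with placeholder file names.
base = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "my_model.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a cat", "clip": ["1", 1]}},
    "3": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0]}},
    "4": {"class_type": "VAEDecode",
          "inputs": {"samples": ["3", 0], "vae": ["1", 2]}},
}

with_lora = add_lora(base, "my_style.safetensors", strength=0.8)
print(with_lora["3"]["inputs"]["model"], with_lora["2"]["inputs"]["clip"])
```

Lowering the strength below 1.0 is a common way to blend a LoRA's style more gently into the base model.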

The community has created thousands of pre-made workflows. Find them on CivitAI or GitHub. Drag and drop a .json file into ComfyUI's window.
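
Before loading a downloaded workflow, it can help to see which node types it uses, since missing custom nodes are a common reason a workflow fails to load. The sketch below counts node types in a workflow file. It assumes the two layouts we have seen: UI exports with a top-level "nodes" list whose entries carry a "type", and API exports where each entry carries a "class_type". The function names are ours.

```python
import json
from collections import Counter


def node_types(workflow: dict) -> Counter:
    """Count node types in a workflow dict (UI export or API export)."""
    if isinstance(workflow.get("nodes"), list):  # assumed UI export layout
        return Counter(node.get("type", "unknown") for node in workflow["nodes"])
    return Counter(
        node.get("class_type", "unknown")
        for node in workflow.values()
        if isinstance(node, dict)
    )


def load_workflow(path: str) -> dict:
    """Read a workflow .json file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Inline example instead of a real file, for illustration only.
example = {"nodes": [{"id": 1, "type": "CheckpointLoaderSimple"},
                     {"id": 2, "type": "CLIPTextEncode"},
                     {"id": 3, "type": "CLIPTextEncode"}]}
for name, count in node_types(example).most_common():
    print(f"{count} x {name}")
```

With a real file, pass the result of load_workflow("some_workflow.json") to node_types, where the file name is whatever you downloaded.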

In a Nutshell — What to Do Now

Install ComfyUI by following the steps above. It should take 15-20 minutes.
Get a base model from CivitAI or Hugging Face (e.g., Realistic Vision or Dreamshaper).
Launch and generate your first image with the default workflow. Don't worry if it's not perfect — just make sure it works.
Experiment with parameters: change CFG, steps, sampler. You'll see differences immediately.
Explore pre-built workflows and add ControlNet or LoRA to level up.
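
For the experimentation step, it helps to change parameters systematically rather than at random. The sketch below builds a small grid of CFG, steps, and sampler combinations with a fixed seed, so differences between images come from the settings and not from the noise. The values are examples, the sampler names are ones we expect in ComfyUI's KSampler list, and the helper name parameter_grid is ours.

```python
from itertools import product


def parameter_grid(cfgs, steps, samplers, seed=42):
    """Return one settings dict per combination, all sharing the same seed."""
    return [
        {"seed": seed, "cfg": cfg, "steps": step_count, "sampler_name": sampler}
        for cfg, step_count, sampler in product(cfgs, steps, samplers)
    ]


grid = parameter_grid([5.0, 7.0, 9.0], [20, 30], ["euler", "dpmpp_2m"])
print(len(grid), "combinations")
for settings in grid[:3]:
    print(settings)
```

Enter each combination in the KSampler node by hand, or feed the values into a script if you drive ComfyUI programmatically.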

Hardware issues? Consider a cloud GPU instance. But if you have a GTX 1060 6GB, you can still generate SD 1.5 images in about 30 seconds each. That's enough to get started.

Want to dive deeper into AI image tools? Check our complete guide on Midjourney, Firefly, and AI for Images. And if you're interested in using AI for your SME's marketing, we have a practical guide for you.