Commit Graph

2015 Commits

Author SHA1 Message Date
comfyanonymous 76a7fa96db Make built in lora training work on anima. (#12402) 2026-02-10 22:04:32 -05:00
Kohaku-Blueleaf cdcf4119b3 [Trainer] training with proper offloading (#12189)
* Fix bypass dtype/device moving

* Force offloading mode for training

* training context var

* offloading implementation in training node

* fix wrong input type

* Support bypass load lora model, correct adapter/offloading handling
2026-02-10 21:45:19 -05:00
rattus 123a7874a9 ops: Fix vanilla-fp8 loaded lora quality (#12390)
This was missing the stochastic rounding required for fp8 downcast
to be consistent with model_patcher.patch_weight_to_device.

Missed in testing as I spend too much time with quantized tensors
and overlooked the simpler ones.
2026-02-10 13:38:28 -05:00
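
The stochastic rounding mentioned in the commit above can be illustrated with a minimal sketch. This is not the ComfyUI implementation: it assumes a uniform rounding grid with spacing `step`, whereas fp8 values are non-uniformly spaced, and the names `stochastic_round` and `nearest_round` are hypothetical. It only shows why stochastic rounding preserves a value on average where round-to-nearest does not.

```python
import math
import random


def stochastic_round(value: float, step: float, rng: random.Random) -> float:
    """Round to a multiple of step, rounding up with probability equal to
    the fractional distance from the lower grid point (illustrative only)."""
    scaled = value / step
    lower = math.floor(scaled)
    fraction = scaled - lower
    round_up = rng.random() < fraction
    return (lower + (1 if round_up else 0)) * step


def nearest_round(value: float, step: float) -> float:
    """Plain round-to-nearest on the same grid, for comparison."""
    return round(value / step) * step


rng = random.Random(0)
# 0.3 sits between grid points 0.0 and 1.0 when step is 1.0.
samples = [stochastic_round(0.3, 1.0, rng) for _ in range(20000)]
mean_stochastic = sum(samples) / len(samples)  # stays close to 0.3
mean_nearest = nearest_round(0.3, 1.0)         # always 0.0, so the value is lost
```

Averaged over many weights, the stochastic variant keeps small values from being systematically flattened, which is the kind of quality loss the commit message attributes to the missing step.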
rattus f719f9c062 sd: delay VAE dtype archive until after override (#12388)
VAEs have host specific dtype logic that should override the dynamic
_model_dtype. Defer the archiving of model dtypes until after.
2026-02-10 13:37:46 -05:00
rattus fe053ba5eb mp: don't deep-clone objects from model_options (#12382)
If there are non-trivial Python objects nested in the model_options, this
causes all sorts of issues. Traverse lists and dicts so clones can safely
override settings and BYO objects, but stop the deep clone there.
2026-02-10 13:37:17 -05:00
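
A minimal sketch of the cloning strategy described in the commit above: dicts and lists are copied recursively, while any other object is passed through by reference. The function name `clone_model_options`, the `Sampler` class, and the option keys are hypothetical stand-ins, not the code from the commit.

```python
def clone_model_options(value):
    """Copy dict and list containers recursively; share everything else."""
    if isinstance(value, dict):
        return {key: clone_model_options(item) for key, item in value.items()}
    if isinstance(value, list):
        return [clone_model_options(item) for item in value]
    # Non-container objects are NOT deep-copied; the clone shares them.
    return value


class Sampler:
    """Stand-in for a non-trivial object nested in the options."""


shared = Sampler()
original = {"options": {"hooks": [shared]}, "sampler": shared}
clone = clone_model_options(original)

# The clone can change its own containers without touching the original...
clone["options"]["hooks"].append("extra")
# ...while the nested object itself is still the same instance.
```

The trade-off is that mutating a shared object through one clone is visible through all of them; only the container structure is independent.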
comfyanonymous a4be04c5d7 Ace step prompts match now. (#12376) 2026-02-09 19:45:56 -05:00
blepping baf8c87455 Improvements to ACE-Steps 1.5 text encoding (part 2) (#12350) 2026-02-09 19:41:49 -05:00
rattus 62315fbb15 Dynamic VRAM fixes - Ace 1.5 performance + a VRAM leak (#12368)
* revert threaded model loader change

This change was only needed to get around the pytorch 2.7 mempool bugs,
and should have been reverted along with #12260. This fixes a different
memory leak where pytorch gets confused about cache emptying.

* load non comfy weights

* MPDynamic: Pre-generate the tensors for vbars

Apparently this is an expensive operation that slows down things.

* bump to aimdo 1.8

New features:
watermark limit feature
logging enhancements
-O2 build on linux
2026-02-09 16:16:08 -05:00
comfyanonymous f350a84261 Disable prompt weights for ltxv2. (#12354) 2026-02-07 19:16:28 -05:00
comfyanonymous 17e7df43d1 Pad ace step 1.5 ref audio if not long enough. (#12341) 2026-02-07 00:02:11 -05:00
comfyanonymous 039955c527 Some fixes to previous pr. (#12339) 2026-02-06 20:14:52 -05:00
tdrussell 6a26328842 Support fp16 for Cosmos-Predict2 and Anima (#12249) 2026-02-06 20:12:15 -05:00
comfyanonymous 204e65b8dc Fix bug with last pr (#12338) 2026-02-06 19:48:20 -05:00
asagi4 a831c19b70 Fix return_word_ids=True with Anima tokenizer (#12328) 2026-02-06 19:38:04 -05:00
comfyanonymous eba6c940fd Make ace step 1.5 base model work properly with default workflow. (#12337) 2026-02-06 19:14:56 -05:00
comfyanonymous c2d7f07dbf Fix issue when using disable_unet_model_creation (#12315) 2026-02-05 19:24:09 -05:00
comfyanonymous 458292fef0 Fix some lowvram stuff with ace step 1.5 (#12312) 2026-02-05 19:15:04 -05:00
comfyanonymous 6555dc65b8 Make ace step 1.5 work without the llm. (#12311) 2026-02-05 16:43:45 -05:00
comfyanonymous 35183543e0 Add VAE tiled decode node for audio. (#12299) 2026-02-05 01:12:04 -05:00
blepping a246cc02b2 Improvements to ACE-Steps 1.5 text encoding (#12283) 2026-02-05 00:17:37 -05:00
comfyanonymous a50c32d63f Disable sage attention on ace step 1.5 (#12297) 2026-02-04 22:15:30 -05:00
comfyanonymous 6125b80979 Add llm sampling options and make reference audio work on ace step 1.5 (#12295) 2026-02-04 21:29:22 -05:00
comfyanonymous c8fcbd66ee Try to fix ace text encoder slowness on some configs. (#12290) 2026-02-04 19:37:05 -05:00
comfyanonymous 26dd7eb421 Fix ace step nan issue on some hardware/pytorch configs. (#12289) 2026-02-04 18:25:06 -05:00
rattus ef73070ea4 mp: Fix checkpoint saving (#12268)
Fix regression in the recent model saving refactor. Pass the non unet
pieces down the layers so that checkpoints are complete.
2026-02-04 02:08:45 -05:00
rattus d30c609f5a utils: safetensors: don't slice data on torch level (#12266)
Torch has alignment enforcement when viewing with data type changes
but only relative to itself. Do all tensor constructions straight
off the memory-view individually so pytorch doesn't see an alignment
problem.

This is needed for handling misaligned safetensors weights, which are
reasonably common in third party models.

This limits usage of this safetensors loader to GPU compute only
as CPU kernels are very likely to bus error. But it works for
dynamic_vram, where we really don't want to take a deep copy and we
always use GPU copy_ which disentangles the misalignment.
2026-02-04 01:48:47 -05:00
comfyanonymous a31681564d Fix crash with ace step 1.5 (#12264) 2026-02-04 00:03:21 -05:00
rattus 855849c658 mm: Remove Aimdo exemption for empty_cache (#12260)
It's more important to get the torch caching allocator GC up and running
than supporting the pyt2.7 bug. Switch it on.

Defeature dynamic_vram + pyt2.7.
2026-02-03 21:39:19 -05:00
comfyanonymous fe2511468d Support the 4B ace step 1.5 lm model. (#12257)
Can be used as an alternative to the 1.7B
2026-02-03 19:01:38 -05:00
comfyanonymous b8315e66cb Fix tiled vae for ace step 1.5 (#12253) 2026-02-03 14:40:45 -05:00
comfyanonymous ab1050bec3 Support ace step 1.5 base model loras. (#12252) 2026-02-03 13:54:23 -05:00
comfyanonymous 85fc35e8fa Fix mac issue. (#12250) 2026-02-03 12:19:39 -05:00
comfyanonymous 223364743c llama: cast logits as a comfy-weight (#12248)
This is using a different layer's weight with .to(). Change it to use
the ops caster if the original layer is a comfy weight so that it picks
up dynamic_vram and async_offload functionality in full.

Co-authored-by: Rattus <rattus128@gmail.com>
2026-02-03 11:31:36 -05:00
comfyanonymous affe881354 Fix some issues with mac. (#12247) 2026-02-03 11:07:04 -05:00
comfyanonymous f5030e26fd Add progress bar to ace step. (#12242) 2026-02-03 04:09:30 -05:00
comfyanonymous 3c1a1a2df8 Basic support for the ace step 1.5 model. (#12237) 2026-02-03 00:06:18 -05:00
comfyanonymous c05a08ae66 Add back function. (#12234) 2026-02-02 19:52:07 -05:00
rattus de9ada6a41 Dynamic VRAM unloading fix (#12227)
* mp: fix full dynamic unloading

This was not unloading dynamic models when requesting a full unload via
the unpatch() code path.

This was OK if your workflow was all dynamic models, but fails with big
VRAM leaks if you need to fully unload something for a regular ModelPatcher.

It also fixes the "unload models" button.

* mm: load models outside of Aimdo Mempool

In dynamic_vram mode, escape the Aimdo mempool and load into the regular
mempool. Use a dummy thread to do it.
2026-02-02 17:35:20 -05:00
rattus 37f711d4a1 mm: Fix cast buffers with intel offloading (#12229)
Intel has offloading support but there were some nvidia calls in the
new cast buffer stuff.
2026-02-02 17:34:46 -05:00
comfyanonymous dd86b15521 Enable embeddings for some qwen 3 models. (#12218) 2026-02-02 03:51:09 -05:00
comfyanonymous 021ba20719 Fix issue with parameters on root model object. (#12216) 2026-02-01 20:12:52 -05:00
rattus 2b5da3b72e dynamic_vram: silence pytorch buffer warning (#12210)
This is log clutter and concerning to users. It's a false alarm.
2026-02-01 20:09:55 -05:00
rattus 794d05bdb1 dynamic_vram: respect argument cast dtypes in non-comfy weights (#12209)
This function has a dtype argument that allows the caller to set the
dtype in the cast. TIL some models override this on weight casts, which
means it's the highest priority.

Priority scheme is: argument > model dtype > state dict dtype
2026-02-01 20:09:21 -05:00
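
The priority scheme stated in the commit above (argument > model dtype > state dict dtype) can be sketched as a first-match lookup. The function name `resolve_cast_dtype` and its parameters are hypothetical, and plain strings stand in for real dtype objects so the example runs without PyTorch.

```python
def resolve_cast_dtype(argument_dtype=None, model_dtype=None, state_dict_dtype=None):
    """Return the first dtype that is set, in priority order:
    argument > model dtype > state dict dtype."""
    for candidate in (argument_dtype, model_dtype, state_dict_dtype):
        if candidate is not None:
            return candidate
    return None


# An explicit argument wins over everything else.
from_argument = resolve_cast_dtype("float32", "bfloat16", "float16")
# Without an argument, the model-defined dtype applies.
from_model = resolve_cast_dtype(None, "bfloat16", "float16")
# The state dict dtype is the last fallback.
from_state_dict = resolve_cast_dtype(None, None, "float16")
```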
rattus 361b9a82a3 fix pinning with model defined dtype (#12208)
pinned memory was converted back to pinning the CPU side weight without
any changes. Fix the pinner to use the CPU weight and not the model defined
geometry. This will either save RAM or stop buffer overruns when the types
mismatch.

Fix the model defined weight caster to use the [ s.weight, s.bias ]
interpretation, as xfer_dest might be the flattened pin now. Fix the detection
of needing to cast to not be conditional on !pin.
2026-02-01 08:42:32 -08:00
comfyanonymous 667a1b8878 Fix some custom nodes breaking. (#12203) 2026-02-01 01:55:18 -05:00
rattus f8acd9c402 Reduce RAM usage, fix VRAM OOMs, and fix Windows shared memory spilling with adaptive model loading (#11845) 2026-02-01 01:01:11 -05:00
comfyanonymous 873de5f37a KV cache implementation for using llama models for text generation. (#12195) 2026-01-31 21:11:11 -05:00
comfyanonymous b8f848bfe3 Fix model not working with any res. (#12186) 2026-01-31 00:12:48 -05:00
comfyanonymous c9b633d84f Add missing spatial downscale ratios. (#12146) 2026-01-28 20:52:51 -05:00
guill dcff27fe3f Add support for dev-only nodes. (#12106)
When a node is declared as dev-only, it doesn't show in the default UI
unless the dev mode is enabled in the settings. The intention is to
allow nodes related to unit testing to be included in ComfyUI
distributions without confusing the average user.
2026-01-27 13:03:29 -08:00
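
The visibility rule described in the commit above might be sketched as a simple filter. The `NodeInfo` class, its `dev_only` field, and `visible_nodes` are hypothetical names chosen for illustration; the commit message does not show how ComfyUI actually declares or filters dev-only nodes.

```python
from dataclasses import dataclass


@dataclass
class NodeInfo:
    """Illustrative node record; not the real ComfyUI node schema."""
    name: str
    dev_only: bool = False


def visible_nodes(nodes, dev_mode: bool):
    """Hide dev-only nodes unless dev mode is enabled in the settings."""
    return [node for node in nodes if dev_mode or not node.dev_only]


nodes = [NodeInfo("LoadImage"), NodeInfo("UnitTestHelper", dev_only=True)]
default_names = [node.name for node in visible_nodes(nodes, dev_mode=False)]
dev_names = [node.name for node in visible_nodes(nodes, dev_mode=True)]
```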