Commit Graph

1356 Commits

Author SHA1 Message Date
comfyanonymous 1cd6cd6080 Disable pytorch attention in VAE for AMD. 2025-02-14 05:42:14 -05:00
comfyanonymous d7b4bf21a2 Auto enable mem efficient attention on gfx1100 on pytorch nightly 2.7
I'm not sure which arches are supported yet. If you see improvements in
memory usage while using --use-pytorch-cross-attention on your AMD GPU let
me know and I will add it to the list.
2025-02-14 04:18:14 -05:00
comfyanonymous 019c7029ea Add a way to set a different compute dtype for the model at runtime.
Currently only works for diffusion models.
2025-02-13 20:34:03 -05:00
comfyanonymous 8773ccf74d Better memory estimation for ROCm cards that support mem efficient attention.
There is no way to check if the card actually supports it so it assumes
that it does if you use --use-pytorch-cross-attention with yours.
2025-02-13 08:32:36 -05:00
comfyanonymous 1d5d6586f3 Fix ruff. 2025-02-12 06:49:16 -05:00
zhoufan2956 35740259de mix_ascend_bf16_infer_err (#6794) 2025-02-12 06:48:11 -05:00
comfyanonymous ab888e1e0b Add add_weight_wrapper function to model patcher.
Functions can now easily be added to wrap/modify model weights.
2025-02-12 05:55:35 -05:00
comfyanonymous d9f0fcdb0c Cleanup. 2025-02-11 17:17:03 -05:00
HishamC b124256817 Fix for running via DirectML (#6542)
* Fix for running via DirectML

Fix DirectML empty image generation issue with Flux1. Add CPU fallback for unsupported path. Verified the model works on AMD GPUs.

* Fix formatting

* Update causal mask calculation
2025-02-11 17:11:32 -05:00
comfyanonymous af4b7c91be Make --force-fp16 actually force the diffusion model to be fp16. 2025-02-11 08:33:09 -05:00
comfyanonymous 4027466c80 Make lumina model work with any latent resolution. 2025-02-10 00:24:20 -05:00
comfyanonymous 095d867147 Remove useless function. 2025-02-09 07:02:57 -05:00
Pam caeb27c3a5 res_multistep: Fix cfgpp and add ancestral samplers (#6731) 2025-02-08 19:39:58 -05:00
comfyanonymous 3d06e1c555 Make error more clear to user. 2025-02-08 18:57:24 -05:00
catboxanon 43a74c0de1 Allow FP16 accumulation with --fast (#6453)
Currently only applies to PyTorch nightly releases (>=20250208).
2025-02-08 17:00:56 -05:00
comfyanonymous 079eccc92a Don't compress http response by default.
Remove argument to disable it.

Add new --enable-compress-response-body argument to enable it.
2025-02-07 03:29:21 -05:00
comfyanonymous 14880e6dba Remove some useless code. 2025-02-06 05:00:37 -05:00
comfyanonymous 37cd448529 Set the shift for Lumina back to 6. 2025-02-05 14:49:52 -05:00
comfyanonymous 94f21f9301 Upcasting rope to fp32 seems to make no difference in this model. 2025-02-05 04:32:47 -05:00
comfyanonymous 60653004e5 Use regular numbers for rope in lumina model. 2025-02-05 04:17:25 -05:00
comfyanonymous a57d635c5f Fix lumina 2 batches. 2025-02-04 21:48:11 -05:00
comfyanonymous 8ac2dddeed Lower the default shift of lumina to reduce artifacts. 2025-02-04 06:50:37 -05:00
comfyanonymous 3e880ac709 Fix on python 3.9 2025-02-04 04:20:56 -05:00
comfyanonymous e5ea112a90 Support Lumina 2 model. 2025-02-04 04:16:30 -05:00
comfyanonymous 44e19a28d3 Use maximum negative value instead of -inf for masks in text encoders.
This is probably more correct.
2025-02-02 09:46:00 -05:00
Dr.Lt.Data 0a0df5f136 better guide message for sageattention (#6634) 2025-02-02 09:26:47 -05:00
KarryCharon 24d6871e47 add disable-compres-response-body cli args; add compress middleware; (#6672) 2025-02-02 09:24:55 -05:00
comfyanonymous 9e1d301129 Only use stable cascade lora format with cascade model. 2025-02-01 06:35:22 -05:00
comfyanonymous 8d8dc9a262 Allow batch of different sigmas when noise scaling. 2025-01-30 06:49:52 -05:00
filtered 222f48c0f2 Allow changing folder_paths.base_path via command line argument. (#6600)
* Reimpl. CLI arg directly inside folder_paths.

* Update tests to use CLI arg mocking.

* Revert last-minute refactor.

* Fix test state pollution.
2025-01-29 08:06:28 -05:00
comfyanonymous 13fd4d6e45 More friendly error messages for corrupted safetensors files. 2025-01-28 09:41:09 -05:00
comfyanonymous 255edf2246 Lower minimum ratio of loaded weights on Nvidia. 2025-01-27 05:26:51 -05:00
comfyanonymous 67feb05299 Remove redundant code. 2025-01-25 19:04:53 -05:00
comfyanonymous 14ca5f5a10 Remove useless code. 2025-01-24 06:15:54 -05:00
comfyanonymous 96e2a45193 Remove useless code. 2025-01-23 05:56:23 -05:00
Chenlei Hu dfa2b6d129 Remove unused function lcm in conds.py (#6572) 2025-01-23 05:54:09 -05:00
comfyanonymous d6bbe8c40f Remove support for python 3.8. 2025-01-22 17:04:30 -05:00
chaObserv e857dd48b8 Add gradient estimation sampler (#6554) 2025-01-22 05:29:40 -05:00
comfyanonymous fb2ad645a3 Add FluxDisableGuidance node to disable using the guidance embed. 2025-01-20 14:50:24 -05:00
comfyanonymous d8a7a32779 Cleanup old TODO. 2025-01-20 03:44:13 -05:00
Sergii Dymchenko ebf038d4fa Use torch.special.expm1 (#6388)
* Use `torch.special.expm1`

This function provides greater precision than `exp(x) - 1` for small values of `x`.

Found with TorchFix https://github.com/pytorch-labs/torchfix/

* Use non-alias
2025-01-19 04:54:32 -05:00
catboxanon b1a02131c9 Remove comfy.samplers self-import (#6506) 2025-01-18 17:49:51 -05:00
comfyanonymous 507199d9a8 Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
comfyanonymous 2f3ab40b62 Add warning when using old pytorch versions. 2025-01-17 18:47:27 -05:00
comfyanonymous 0aa2368e46 Fix some cosmos fp8 issues. 2025-01-16 17:45:37 -05:00
comfyanonymous cca96a85ae Fix cosmos VAE failing with videos longer than 121 frames. 2025-01-16 16:30:06 -05:00
comfyanonymous 31831e6ef1 Code refactor. 2025-01-16 07:23:54 -05:00
comfyanonymous 88ceb28e20 Tweak hunyuan memory usage factor. 2025-01-16 06:31:03 -05:00
comfyanonymous 23289a6a5c Clean up some debug lines. 2025-01-16 04:24:39 -05:00
comfyanonymous 9d8b6c1f46 More accurate memory estimation for cosmos and hunyuan video. 2025-01-16 03:48:40 -05:00
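
Note on commit ebf038d4fa: its message says that `torch.special.expm1` is more precise than `exp(x) - 1` for small `x`. The sketch below illustrates that claim on a single scalar. As an assumption made for illustration, it uses the standard-library `math.expm1` as a stand-in for the tensor function, and the names `naive_expm1`, `reference`, and the test value `x = 1e-12` are hypothetical and not taken from the repository.

```python
import math


def naive_expm1(x: float) -> float:
    """Compute exp(x) - 1 directly; loses digits when x is close to zero."""
    return math.exp(x) - 1.0


# Hypothetical small input chosen only to make the cancellation visible.
x = 1e-12

naive = naive_expm1(x)
accurate = math.expm1(x)

# Second-order Taylor expansion of exp(x) - 1; for x this small the
# truncated terms are far below double precision, so it serves as a reference.
reference = x + x * x / 2.0

rel_err_naive = abs(naive - reference) / reference
rel_err_accurate = abs(accurate - reference) / reference

print(f"naive    relative error: {rel_err_naive:.3e}")
print(f"expm1    relative error: {rel_err_accurate:.3e}")
```

The naive form subtracts two nearly equal numbers, so most significant digits of the result cancel. A dedicated `expm1` routine avoids that subtraction, which is the motivation stated in the commit message.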