rattus128 653ceab414
Reduce Peak WAN inference VRAM usage - part II (#10062)
* flux: math: Use addcmul_ to avoid an expensive VRAM intermediate

The RoPE step can be the VRAM peak, and allocating an intermediate for the
addition result before the original tensor is released can OOM. Perform the
multiply-add in place with addcmul_ instead (see the sketch below).
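
A minimal sketch of the pattern, not the actual diff; the function and tensor names here are illustrative. `Tensor.addcmul_` adds `tensor1 * tensor2` into the existing buffer in place, so only one full-size intermediate is ever live instead of two products plus their sum:

```python
import torch

def fused_rope_mix(c0, x0, c1, x1):
    # Naive form: out = c0 * x0 + c1 * x1
    # keeps two full-size products plus their sum alive at the same moment,
    # which can set the VRAM peak during RoPE.
    out = c0 * x0            # single intermediate, reused as the output
    out.addcmul_(c1, x1)     # out += c1 * x1, fused and written in place
    return out

# Example shapes loosely modelled on a RoPE application (hypothetical sizes)
q0 = torch.randn(1, 24, 4096, 64)
q1 = torch.randn(1, 24, 4096, 64)
cos = torch.randn(1, 1, 4096, 64)
sin = torch.randn(1, 1, 4096, 64)
out = fused_rope_mix(cos, q0, sin, q1)
```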

* wan: Delete the self-attention output before cross attention

This saves VRAM when the cross attention and FFN stages set the VRAM peak
(see the sketch below).
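
A hedged sketch of the idea at the block level; the module layout is illustrative, not the WAN implementation. Once the self-attention output has been added into the residual stream, dropping the reference lets PyTorch's caching allocator reuse that memory while cross attention and the FFN run:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Illustrative transformer block (hypothetical names and layout)."""
    def __init__(self, dim, ctx_dim, heads=8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.cross_attn = nn.MultiheadAttention(
            dim, heads, kdim=ctx_dim, vdim=ctx_dim, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x, context):
        h = self.norm1(x)
        y, _ = self.self_attn(h, h, h)
        x = x + y
        # Drop the self-attention tensors before cross attention / FFN run,
        # so the allocator can reclaim their memory ahead of the next peak.
        del y, h
        h = self.norm2(x)
        y, _ = self.cross_attn(h, context, context)
        x = x + y
        del y, h
        return x + self.ffn(x)
```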
2025-09-27 18:14:16 -04:00