
[Issue]: SVD scripts are not working in directml #3342

Open · 2 tasks done
rodrigoandrigo opened this issue Jul 16, 2024 · 4 comments
Labels: platform (Platform specific problem)

Comments

@rodrigoandrigo

Issue Description

I tried the three video-generation scripts with DirectML; none of them worked:

Text-to-Video (Text tab)
Models: Potat v1, ZeroScope v2 Dark, ModelScope 1.7b

Image-to-Video (Image tab)
Models: VGen

Stable Video Diffusion (Image tab)
Models: SVD XT 1.1

Version Platform Description

2024-07-16 12:38:07,164 | sd | INFO | launch | Starting SD.Next
2024-07-16 12:38:07,169 | sd | INFO | installer | Logger: file="C:\StabilityMatrix\Data\Packages\SD.Next\sdnext.log" level=INFO size=899852 mode=append
2024-07-16 12:38:07,171 | sd | INFO | installer | Python version=3.10.11 platform=Windows bin="C:\StabilityMatrix\Data\Packages\SD.Next\venv\Scripts\python.exe" venv="C:\StabilityMatrix\Data\Packages\SD.Next\venv"
2024-07-16 12:38:07,474 | sd | INFO | installer | Version: app=sd.next updated=2024-07-10 hash=2ec6e9ee branch=master url=https://github.com/vladmandic/automatic/tree/master ui=main
2024-07-16 12:38:08,050 | sd | INFO | launch | Platform: arch=AMD64 cpu=AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD system=Windows release=Windows-10-10.0.22631-SP0 python=3.10.11
2024-07-16 12:38:08,053 | sd | DEBUG | installer | Torch allocator: "garbage_collection_threshold:0.80,max_split_size_mb:512"
2024-07-16 12:38:08,054 | sd | DEBUG | installer | Torch overrides: cuda=False rocm=False ipex=False diml=True openvino=False
2024-07-16 12:38:08,054 | sd | DEBUG | installer | Torch allowed: cuda=False rocm=False ipex=False diml=True openvino=False
2024-07-16 12:38:08,054 | sd | INFO | installer | Using DirectML Backend
2024-07-16 09:35:37,397 | sd | DEBUG | launch | Starting module: <module 'webui' from 'C:\StabilityMatrix\Data\Packages\SD.Next\webui.py'>
2024-07-16 09:35:37,397 | sd | INFO | launch | Command line args: ['--medvram', '--autolaunch', '--use-directml'] medvram=True autolaunch=True use_directml=True
2024-07-16 09:35:37,399 | sd | DEBUG | launch | Env flags: []
2024-07-16 09:37:38,790 | sd | INFO | loader | Load packages: {'torch': '2.3.1+cpu', 'diffusers': '0.29.1', 'gradio': '3.43.2'}
2024-07-16 09:37:42,767 | sd | DEBUG | shared | Read: file="config.json" json=35 bytes=1548 time=0.000
2024-07-16 09:37:42,821 | sd | INFO | shared | Engine: backend=Backend.DIFFUSERS compute=directml device=privateuseone:0 attention="Dynamic Attention BMM" mode=no_grad
2024-07-16 09:37:42,979 | sd | INFO | shared | Device: device=AMD Radeon RX 6600M n=1 directml=0.2.2.dev240614
2024-07-16 09:37:42,987 | sd | DEBUG | shared | Read: file="html\reference.json" json=45 bytes=25986 time=0.006
2024-07-16 09:38:04,704 | sd | DEBUG | init | ONNX: version=1.18.1 provider=DmlExecutionProvider, available=['AzureExecutionProvider', 'CPUExecutionProvider']

Relevant log output

Text-to-Video
Model: Potat v1
12:47:35-275745 ERROR    Arguments: args=('task(c5jnmnvhq3xjo9w)', 'woman,    
                         sitting on couch, female curvy, detailed face,        
                         perfect face, correct eyes, hairstyles, detailed     
                         muzzle, detailed mouth, five fingers, proper hands,   
                         proper shading, proper lighting, detailed character,  
                         high quality,', 'worst quality, bad quality, (text),  
                         ((signature, watermark)), extra limb, deformed hands, 
                         deformed feet, multiple tails, deformed, disfigured,  
                         poorly drawn face, mutated, extra limb, ugly, face out
                         of frame, oversaturated, sketch, comic, no pupils,    
                         simple background, ((blurry)), mutation, intersex, bad
                         anatomy, disfigured,', [], 20, 0, 26, True, False,    
                         False, False, 1, 1, 6, 6, 0.7, 0, 0.5, 1, 1, -1.0,    
                         -1.0, 0, 0, 0, 512, 512, False, 0.3, 2, 'None', False,
                         20, 0, 0, 10, 0, '', '', 0, 0, 0, 0, False, 4, 0.95,  
                         False, 0.6, 1, '#000000', 0, [], 11, 1, 'None',       
                         'None', 'None', 'None', 0.5, 0.5, 0.5, 0.5, None,     
                         None, None, None, 0, 0, 0, 0, 1, 1, 1, 1, None, None, 
                         None, None, False, '', 'None', 16, 'None', 1, True,   
                         'None', 2, True, 1, 0, True, 'none', 3, 4, 0.25, 0.25,
                         3, 1, 1, 0.8, 8, 64, True, True, 0.5, 600.0, 1.0, 1,  
                         1, 0.5, 0.5, 'OpenGVLab/InternVL-14B-224px', False,   
                         False, 'positive', 'comma', 0, False, False, '',      
                         'None', '', 1, '', 'None', 1, True, 10, 'Potat v1',   
                         True, 24, 'GIF', 2, True, 1, 0, 0, '', [], 0, '', [], 
                         0, '', [], False, True, False, False, False, False, 0,
                         'None', [], 'FaceID Base', True, True, 1, 1, 1, 0.5,  
                         False, 'person', 1, 0.5, True) kwargs={}              
12:47:35-284260 ERROR    gradio call: AttributeError                           
┌───────────────────── Traceback (most recent call last) ─────────────────────┐
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\call_queue.py:31 in f      │
│                                                                             │
│   30 │   │   │   try:                                                       │
│ > 31 │   │   │   │   res = func(*args, **kwargs)                            │
│   32 │   │   │   │   progress.record_results(id_task, res)                  │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\txt2img.py:89 in txt2img   │
│                                                                             │
│   88 │   p.script_args = args                                               │
│ > 89 │   processed = scripts.scripts_txt2img.run(p, *args)                  │
│   90 │   if processed is None:                                              │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\scripts.py:483 in run      │
│                                                                             │
│   482 │   │   parsed = p.per_script_args.get(script.title(), args[script.ar │
│ > 483 │   │   processed = script.run(p, *parsed)                            │
│   484 │   │   s.record(script.title())                                      │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\scripts\text2video.py:88 in run    │
│                                                                             │
│    87 │   │   │   shared.opts.sd_model_checkpoint = checkpoint              │
│ >  88 │   │   │   sd_models.reload_model_weights(op='model')                │
│    89                                                                       │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\sd_models.py:1572 in reloa │
│                                                                             │
│   1571 │   from modules import lowvram, sd_hijack                           │
│ > 1572 │   checkpoint_info = info or select_checkpoint(op=op) # are we sele │
│   1573 │   next_checkpoint_info = info or select_checkpoint(op='dict' if lo │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\sd_models.py:248 in select │
│                                                                             │
│    247 │   │   return None                                                  │
│ >  248 │   checkpoint_info = get_closet_checkpoint_match(model_checkpoint)  │
│    249 │   if checkpoint_info is not None:                                  │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\sd_models.py:197 in get_cl │
│                                                                             │
│    196 def get_closet_checkpoint_match(search_string):                      │
│ >  197 │   if search_string.startswith('huggingface/'):                     │
│    198 │   │   model_name = search_string.replace('huggingface/', '')       │
└─────────────────────────────────────────────────────────────────────────────┘
AttributeError: 'CheckpointInfo' object has no attribute 'startswith'
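
(For context: a minimal sketch of the type mismatch behind this traceback, assuming `shared.opts.sd_model_checkpoint` holds a `CheckpointInfo` object rather than a plain string. The `FakeCheckpointInfo` class and the string coercion at the end are illustrative assumptions, not SD.Next's actual fix.)

```python
class FakeCheckpointInfo:
    """Hypothetical stand-in for SD.Next's CheckpointInfo object."""
    def __init__(self, title: str):
        self.title = title

def get_closet_checkpoint_match(search_string):
    # Mirrors the check at modules/sd_models.py:197 in the traceback:
    # it assumes search_string is a str, so passing a CheckpointInfo
    # raises AttributeError on .startswith().
    if search_string.startswith('huggingface/'):
        return search_string.replace('huggingface/', '')
    return search_string

info = FakeCheckpointInfo('Potat v1')
try:
    get_closet_checkpoint_match(info)
except AttributeError as e:
    print(e)  # 'FakeCheckpointInfo' object has no attribute 'startswith'

# Coercing to a string first avoids the crash (an assumption, not the real fix):
print(get_closet_checkpoint_match(getattr(info, 'title', str(info))))  # Potat v1
```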


Text-to-Video
Model: ZeroScope v2 Dark
12:50:00-451738 ERROR    Arguments: args=('task(yfgrwdtd3i1wg4r)', 'woman,        
                         sitting on couch, female curvy, detailed face,        
                         perfect face, correct eyes, hairstyles, detailed     
                         muzzle, detailed mouth, five fingers, proper hands,   
                         proper shading, proper lighting, detailed character,  
                         high quality,', 'worst quality, bad quality, (text),  
                         ((signature, watermark)), extra limb, deformed hands, 
                         deformed feet, multiple tails, deformed, disfigured,  
                         poorly drawn face, mutated, extra limb, ugly, face out
                         of frame, oversaturated, sketch, comic, no pupils,    
                         simple background, ((blurry)), mutation, intersex, bad
                         anatomy, disfigured,', [], 20, 7, 26, True, False,    
                         False, False, 1, 1, 6, 6, 0.7, 0, 0.5, 1, 1, -1.0,    
                         -1.0, 0, 0, 0, 512, 512, False, 0.3, 2, 'None', False,
                         20, 0, 0, 10, 0, '', '', 0, 0, 0, 0, False, 4, 0.95,  
                         False, 0.6, 1, '#000000', 0, [], 11, 1, 'None',       
                         'None', 'None', 'None', 0.5, 0.5, 0.5, 0.5, None,     
                         None, None, None, 0, 0, 0, 0, 1, 1, 1, 1, None, None, 
                         None, None, False, '', 'None', 16, 'None', 1, True,   
                         'None', 2, True, 1, 0, True, 'none', 3, 4, 0.25, 0.25,
                         3, 1, 1, 0.8, 8, 64, True, True, 0.5, 600.0, 1.0, 1,  
                         1, 0.5, 0.5, 'OpenGVLab/InternVL-14B-224px', False,   
                         False, 'positive', 'comma', 0, False, False, '',      
                         'None', '', 1, '', 'None', 1, True, 10, 'ZeroScope v2 
                         Dark', True, 24, 'GIF', 2, True, 1, 0, 0, '', [], 0,  
                         '', [], 0, '', [], False, True, False, False, False,  
                         False, 0, 'None', [], 'FaceID Base', True, True, 1, 1,
                         1, 0.5, False, 'person', 1, 0.5, True) kwargs={}      
12:50:00-459258 ERROR    gradio call: TypeError                                
┌───────────────────── Traceback (most recent call last) ─────────────────────┐
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\call_queue.py:31 in f      │
│                                                                             │
│   30 │   │   │   try:                                                       │
│ > 31 │   │   │   │   res = func(*args, **kwargs)                            │
│   32 │   │   │   │   progress.record_results(id_task, res)                  │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\txt2img.py:89 in txt2img   │
│                                                                             │
│   88 │   p.script_args = args                                               │
│ > 89 │   processed = scripts.scripts_txt2img.run(p, *args)                  │
│   90 │   if processed is None:                                              │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\scripts.py:483 in run      │
│                                                                             │
│   482 │   │   parsed = p.per_script_args.get(script.title(), args[script.ar │
│ > 483 │   │   processed = script.run(p, *parsed)                            │
│   484 │   │   s.record(script.title())                                      │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\scripts\text2video.py:75 in run    │
│                                                                             │
│    74 │   │                                                                 │
│ >  75 │   │   if model['path'] in shared.opts.sd_model_checkpoint:          │
│    76 │   │   │   shared.log.debug(f'Text2Video cached: model={shared.opts. │
└─────────────────────────────────────────────────────────────────────────────┘
TypeError: argument of type 'CheckpointInfo' is not iterable
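
(The same mismatch explains this second traceback: `model['path'] in shared.opts.sd_model_checkpoint` applies the `in` operator, which requires the right-hand side to be iterable or to define `__contains__`. A minimal sketch, reusing the `'camenduru/potat1'` path that appears later in the thread; the stand-in class and the `str()` coercion are assumptions for illustration.)

```python
class FakeCheckpointInfo:
    """Hypothetical stand-in; defines neither __iter__ nor __contains__."""
    def __init__(self, title: str):
        self.title = title

checkpoint = FakeCheckpointInfo('camenduru/potat1')
model = {'path': 'camenduru/potat1'}

try:
    _ = model['path'] in checkpoint  # the failing check at scripts/text2video.py:75
except TypeError as e:
    print(e)  # argument of type 'FakeCheckpointInfo' is not iterable

# Checking against a string representation instead avoids the TypeError:
print(model['path'] in str(checkpoint.title))  # True
```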


Text-to-Video
Model: ModelScope 1.7b
13:02:06-745445 ERROR    Processing: args={'prompt': ['woman, sitting on      
                         couch, female curvy, detailed eyes, perfect eyes,     
                         detailed face, perfect face, perfectly rendered face, 
                         correct eyes, hairstyles, detailed muzzle, detailed   
                         mouth, five fingers, proper hands, proper shading,    
                         proper lighting, detailed character, high quality,'], 
                         'negative_prompt': ['worst quality, bad quality,      
                         (text), ((signature, watermark)), extra limb, deformed
                         hands, deformed feet, multiple tails, deformed,       
                         disfigured, poorly drawn face, mutated, extra limb,   
                         ugly, face out of frame, oversaturated, sketch, comic,
                         no pupils, simple background, ((blurry)), mutation,   
                         intersex, bad anatomy, disfigured,'],                 
                         'guidance_scale': 6, 'generator': [<torch._C.Generator
                         object at 0x0000017C89FBA530>], 'callback_steps': 1,  
                         'callback': <function diffusers_callback_legacy at    
                         0x0000017C8BF3ECB0>, 'num_inference_steps': 20, 'eta':
                         1.0, 'output_type': 'latent', 'width': 320, 'height': 
                         320, 'num_frames': 16} input must be 4-dimensional    
13:02:06-750699 ERROR    Processing: RuntimeError                              
┌───────────────────── Traceback (most recent call last) ─────────────────────┐
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\processing_diffusers.py:12 │
│                                                                             │
│   121 │   │   else:                                                         │
│ > 122 │   │   │   output = shared.sd_model(**base_args)                     │
│   123 │   │   if isinstance(output, dict):                                  │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\torch\utils │
│                                                                             │
│   114 │   │   with ctx_factory():                                           │
│ > 115 │   │   │   return func(*args, **kwargs)                              │
│   116                                                                       │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\diffusers\p │
│                                                                             │
│   596 │   │   │   │   # predict the noise residual                          │
│ > 597 │   │   │   │   noise_pred = self.unet(                               │
│   598 │   │   │   │   │   latent_model_input,                               │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\torch\nn\mo │
│                                                                             │
│   1531 │   │   else:                                                        │
│ > 1532 │   │   │   return self._call_impl(*args, **kwargs)                  │
│   1533                                                                      │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\torch\nn\mo │
│                                                                             │
│   1540 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hook │
│ > 1541 │   │   │   return forward_call(*args, **kwargs)                     │
│   1542                                                                      │
│                                                                             │
│                          ... 12 frames hidden ...                           │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\torch\nn\mo │
│                                                                             │
│   1540 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hook │
│ > 1541 │   │   │   return forward_call(*args, **kwargs)                     │
│   1542                                                                      │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\torch\nn\mo │
│                                                                             │
│    609 │   def forward(self, input: Tensor) -> Tensor:                      │
│ >  610 │   │   return self._conv_forward(input, self.weight, self.bias)     │
│    611                                                                      │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\torch\nn\mo │
│                                                                             │
│    604 │   │   │   )                                                        │
│ >  605 │   │   return F.conv3d(                                             │
│    606 │   │   │   input, weight, bias, self.stride, self.padding, self.dil │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\dml\amp\autocast_mode.py:4 │
│                                                                             │
│   42 │   │   op = getattr(resolved_obj, func_path[-1])                      │
│ > 43 │   │   setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: f │
│   44                                                                        │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\dml\amp\autocast_mode.py:1 │
│                                                                             │
│   14 │   if not torch.dml.is_autocast_enabled:                              │
│ > 15 │   │   return op(*args, **kwargs)                                     │
│   16 │   args = list(map(cast, args))                                       │
└─────────────────────────────────────────────────────────────────────────────┘
RuntimeError: input must be 4-dimensional
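
(This failure is DirectML-specific rather than a type bug: text-to-video UNets run `Conv3d` over 5-D tensors of shape (batch, channels, frames, height, width), and per the traceback the DirectML autocast wrapper forwards the call to a backend that only accepts 4-D convolution inputs. A minimal reproduction sketch; the CPU call runs as written, while the commented DirectML part is an assumption based on the log above.)

```python
import torch
import torch.nn.functional as F

# 5-D input as used by video UNets: (batch, channels, frames, height, width)
x = torch.randn(1, 4, 16, 40, 40)
w = torch.randn(8, 4, 3, 3, 3)  # (out_ch, in_ch, kT, kH, kW)

# Works on CPU (and CUDA): conv3d is defined for 5-D inputs.
print(F.conv3d(x, w, padding=1).shape)  # torch.Size([1, 8, 16, 40, 40])

# On a DirectML device the same call fails (assumption, per the log above):
# import torch_directml
# dml = torch_directml.device()
# F.conv3d(x.to(dml), w.to(dml), padding=1)
# -> RuntimeError: input must be 4-dimensional
```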


Image-to-Video
Model: VGen
13:08:33-673173 WARNING  Pipeline class change failed:                         
                         type=DiffusersTaskType.IMAGE_2_IMAGE                  
                         pipeline=I2VGenXLPipeline AutoPipeline can't find a   
                         pipeline linked to I2VGenXLPipeline for None          
13:08:34-378645 INFO     Base: class=I2VGenXLPipeline                          
13:08:47-883849 ERROR    Processing: args={'prompt': ['woman, sitting on      
                         couch, female curvy, detailed eyes, perfect eyes,     
                         detailed face, perfect face, perfectly rendered face, 
                         correct eyes, hairstyles, detailed muzzle, detailed   
                         mouth, five fingers, proper hands, proper shading,    
                         proper lighting, detailed character, high quality,'], 
                         'negative_prompt': ['worst quality, bad quality,      
                         (text), ((signature, watermark)), extra limb, deformed
                         hands, deformed feet, multiple tails, deformed,       
                         disfigured, poorly drawn face, mutated, extra limb,   
                         ugly, face out of frame, oversaturated, sketch, comic,
                         no pupils, simple background, ((blurry)), mutation,   
                         intersex, bad anatomy, disfigured,'],                 
                         'guidance_scale': 6, 'generator': [<torch._C.Generator
                         object at 0x0000026E161C7150>], 'num_inference_steps':
                         20, 'eta': 1.0, 'output_type': 'pil', 'width': 512,   
                         'height': 512, 'image': <PIL.Image.Image image        
                         mode=RGB size=512x512 at 0x26E118AE500>, 'num_frames':
                         16, 'target_fps': 8, 'decode_chunk_size': 8} the      
                         dimesion of at::Tensor must be 4 or lower, but got 5  
13:08:47-888378 ERROR    Processing: RuntimeError                              
┌───────────────────── Traceback (most recent call last) ─────────────────────┐
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\processing_diffusers.py:12 │
│                                                                             │
│   121 │   │   else:                                                         │
│ > 122 │   │   │   output = shared.sd_model(**base_args)                     │
│   123 │   │   if isinstance(output, dict):                                  │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\torch\utils │
│                                                                             │
│   114 │   │   with ctx_factory():                                           │
│ > 115 │   │   │   return func(*args, **kwargs)                              │
│   116                                                                       │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\diffusers\p │
│                                                                             │
│   639 │   │   image = self.video_processor.preprocess(resized_image).to(dev │
│ > 640 │   │   image_latents = self.prepare_image_latents(                   │
│   641 │   │   │   image,                                                    │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\diffusers\p │
│                                                                             │
│   465 │   │   # duplicate image_latents for each generation per prompt, usi │
│ > 466 │   │   image_latents = image_latents.repeat(num_videos_per_prompt, 1 │
│   467                                                                       │
└─────────────────────────────────────────────────────────────────────────────┘
RuntimeError: the dimesion of at::Tensor must be 4 or lower, but got 5


Stable Video Diffusion
Model: SVD XT 1.1
13:12:03-607975 ERROR    Processing: args={'generator': <torch._C.Generator    
                         object at 0x000001F873C34810>, 'callback_on_step_end':
                         <function diffusers_callback at 0x000001F84F665D80>,  
                         'callback_on_step_end_tensor_inputs': ['latents'],    
                         'num_inference_steps': 20, 'output_type': 'pil',      
                         'image': <PIL.Image.Image image mode=RGB size=1024x576
                         at 0x1F8531FC610>, 'width': 1024, 'height': 576,      
                         'num_frames': 14, 'decode_chunk_size': 6,             
                         'motion_bucket_id': 128, 'noise_aug_strength': 0.1,   
                         'min_guidance_scale': 1, 'max_guidance_scale': 3} the 
                         dimesion of at::Tensor must be 4 or lower, but got 5  
13:12:03-611978 ERROR    Processing: RuntimeError                              
┌───────────────────── Traceback (most recent call last) ─────────────────────┐
│ C:\StabilityMatrix\Data\Packages\SD.Next\modules\processing_diffusers.py:12 │
│                                                                             │
│   121 │   │   else:                                                         │
│ > 122 │   │   │   output = shared.sd_model(**base_args)                     │
│   123 │   │   if isinstance(output, dict):                                  │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\torch\utils │
│                                                                             │
│   114 │   │   with ctx_factory():                                           │
│ > 115 │   │   │   return func(*args, **kwargs)                              │
│   116                                                                       │
│                                                                             │
│ C:\StabilityMatrix\Data\Packages\SD.Next\venv\lib\site-packages\diffusers\p │
│                                                                             │
│   523 │   │   # image_latents [batch, channels, height, width] ->[batch, nu │
│ > 524 │   │   image_latents = image_latents.unsqueeze(1).repeat(1, num_fram │
│   525                                                                       │
└─────────────────────────────────────────────────────────────────────────────┘
RuntimeError: the dimesion of at::Tensor must be 4 or lower, but got 5
13:12:03-690490 WARNING  Pipeline class change failed:                         
                         type=DiffusersTaskType.TEXT_2_IMAGE                   
                         pipeline=StableVideoDiffusionPipeline AutoPipeline    
                         can't find a pipeline linked to                       
                         StableVideoDiffusionPipeline for None
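
(The VGen and SVD failures above share one root cause: both pipelines expand 4-D image latents into 5-D video latents via `unsqueeze`/`repeat`, and the DirectML backend rejects tensors with more than 4 dimensions, hence "the dimesion of at::Tensor must be 4 or lower, but got 5". A minimal sketch of the diffusers pattern from the tracebacks; the CPU part runs as written, and the commented DirectML part is an assumption based on the logs.)

```python
import torch

# 4-D image latents: (batch, channels, height, width)
image_latents = torch.randn(1, 4, 72, 128)

# Pattern from the StableVideoDiffusion traceback: add a frame axis, then
# repeat it, yielding a 5-D tensor (batch, frames, channels, height, width).
num_frames = 14
video_latents = image_latents.unsqueeze(1).repeat(1, num_frames, 1, 1, 1)
print(video_latents.shape)  # torch.Size([1, 14, 4, 72, 128])

# On a DirectML device the 5-D repeat fails (assumption, per the logs above):
# import torch_directml
# dml = torch_directml.device()
# image_latents.to(dml).unsqueeze(1).repeat(1, num_frames, 1, 1, 1)
# -> RuntimeError: the dimesion of at::Tensor must be 4 or lower, but got 5
```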

Backend: Diffusers
UI: Standard
Branch: Master
Model: StableDiffusion 1.5

Acknowledgements

  • I have read the above and searched for existing issues
  • I confirm that this is classified correctly and it's not an extension issue
@vladmandic vladmandic changed the title [Issue]: the Text2Video, image2video and Stable Video Diffusion scripts are not working in directml [Issue]: SVD scripts are not working in directml Oct 30, 2024
@vladmandic vladmandic added the platform Platform specific problem label Oct 30, 2024

genewitch commented Nov 2, 2024

I am unsure what DirectML is; it appears to be an alternative to --use-cuda. But these scripts don't work with CUDA either, and they fail with the same errors: "not iterable" and "startswith".

Hope that helps track down the issue!

16:01:05-437841 DEBUG    Text2Video: model={'name': 'Potat v1', 'path': 'camenduru/potat1', 'params': [24, 1024, 576]}
                         defaults=True frames=21, video=MP4 duration=2 loop=True pad=1 interpolate=0
16:01:05-439840 ERROR    Exception: argument of type 'CheckpointInfo' is not iterable
16:01:05-440840 ERROR    Arguments: args=('task(4tglye2wwa8ff2n)', '', ' there are trees and foliage visible in the
                         background on the right side, sunlight casts shadows across the scene creating a contrast
                         between light and dark areas, photograph taken during daytime. <lora:chas-notext-42e:1.4>',
                         'child, kid, baby, infant, low quality, ugly', [], 70, 20, 31, True, False, False, False, 1, 1,
                         7, 6, 0.7, 0, 0.5, 1, 1, -1.0, -1.0, 0, 0, 0, 253, 450, False, 0.3, 1, 1, 'Add with forward',
                         'None', False, 20, 0, 0, 20, 0, '', '', 0, 0, 0, 0, False, 4, 0.95, False, 0.6, 1, '#000000',
                         0, [], 20, 1, 'None', 'None', 'None', 'None', 0.5, 0.5, 0.5, 0.5, None, None, None, None,
                         False, False, False, False, 0, 0, 0, 0, 1, 1, 1, 1, None, None, None, None, False, '', False,
                         0, '', [], 0, '', [], 0, '', [], False, True, False, True, False, False, False, 0, 'None', [],
                         'FaceID Base', True, True, 1, 1, 1, 0.5, True, 'person', 1, 0.5, True, 'None', 16, 'None', 1,
                         True, 'None', 2, True, 1, 0, True, 'none', 3, 4, 0.25, 0.25, 1, -0.5, 0, 'THUDM/CogVideoX-2b',
                         'DDIM', 49, 6, 'balanced', True, 'None', 8, True, 1, 0, None, None, '', 0.5, 5, None, '', 0.5,
                         5, None, 3, 1, 1, 0.8, 8, 64, True, 0.65, True, False, 1, 1, 1, '', True, 0.5, 600.0, 1.0,
                         True, None, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0.5, 0.5, 'OpenGVLab/InternVL-14B-224px', False,
                         0.7, 1.2, 128, False, False, 'positive', 'comma', 0, False, False, '', 'None', '', 1, '',
                         'None', 1, True, 10, 'Potat v1', True, 21, 'MP4', 2, True, 1, 0, 0, '', [], 0, '', [], 0, '',
                         [], False, True, False, True, False, False, False, 0) kwargs={}
16:01:05-445841 ERROR    gradio call: TypeError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ D:\opt\automatic\modules\call_queue.py:31 in f                                                                       │
│                                                                                                                      │
│   30 │   │   │   try:                                                                                                │
│ ❱ 31 │   │   │   │   res = func(*args, **kwargs)                                                                     │
│   32 │   │   │   │   progress.record_results(id_task, res)                                                           │
│                                                                                                                      │
│ D:\opt\automatic\modules\txt2img.py:92 in txt2img                                                                    │
│                                                                                                                      │
│    91 │   p.state = state                                                                                            │
│ ❱  92 │   processed = scripts.scripts_txt2img.run(p, *args)                                                          │
│    93 │   if processed is None:                                                                                      │
│                                                                                                                      │
│ D:\opt\automatic\modules\scripts.py:502 in run                                                                       │
│                                                                                                                      │
│   501 │   │   if hasattr(script, 'run'):                                                                             │
│ ❱ 502 │   │   │   processed = script.run(p, *parsed)                                                                 │
│   503 │   │   else:                                                                                                  │
│                                                                                                                      │
│ D:\opt\automatic\scripts\text2video.py:75 in run                                                                     │
│                                                                                                                      │
│    74 │   │                                                                                                          │
│ ❱  75 │   │   if model['path'] in shared.opts.sd_model_checkpoint:                                                   │
│    76 │   │   │   shared.log.debug(f'Text2Video cached: model={shared.opts.sd_model_checkpoint}')                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: argument of type 'CheckpointInfo' is not iterable

@vladmandic (Owner)

@genewitch I've fixed your issue, but this is a totally different problem; please check before posting to an issue: (a) the error is completely different, and (b) the model used is completely different. This issue is about SVD, while you're posting about the T2V script.

@genewitch

I don't really want to argue. I tried all three of the text-to-video models that the OP said had the error and saw the same "CheckpointInfo is not iterable" error, as well as the error ending with "object has no attribute 'startswith'".

[image: screenshot of the CUDA tracebacks]
The tracebacks diverge, I assume, because of the different backends (DirectML vs. CUDA), but the errors have the same wording, and call_queue.py line 31 and text2video.py line 75 appear in both.

Sorry. I was trying to give context.

@vladmandic (Owner)

The original error is not in the same module, even if it may look the same to you.
This issue was created for SVD; you're trying T2V, which is very different.
Anyhow, it's fixed in the dev branch and will be included in the next release.
