Attention Processor Implementations for SDXL #4

@dibbla

Hi!

I am working on migrating this great work to SDXL, starting with AFA. However, I found that neither direct cascade nor AFA works.

I am using CLIP-ViT-bigG (projected to 1280) and CLIP-ViT-Large (projected to 768) for the image embeddings, and I concatenate them so that 768 + 1280 = 2048 matches the text embedding size. The result is treated as a single token, which I pass to the attention processors. Inside the attention, the image embedding is repeated 77 times to match the text features.
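
The shape handling described above can be sketched roughly as follows. This is only an illustration of the concatenate-then-repeat step, not the code from this issue or from the repository: the function name `build_image_tokens`, the random inputs, and the use of NumPy in place of the actual tensor library are all assumptions made for the example.

```python
import numpy as np

# Dimensions taken from the description above; all names are hypothetical.
DIM_L = 768      # CLIP-ViT-Large image embedding after projection
DIM_G = 1280     # CLIP-ViT-bigG image embedding after projection
NUM_TOKENS = 77  # text sequence length the image token is repeated to

def build_image_tokens(emb_l, emb_g, num_tokens=NUM_TOKENS):
    """Concatenate two pooled image embeddings and repeat along the token axis.

    emb_l: (batch, 768), emb_g: (batch, 1280) -> (batch, num_tokens, 2048)
    """
    if emb_l.shape[0] != emb_g.shape[0]:
        raise ValueError("batch sizes of the two embeddings differ")
    # One token per image: (batch, 768 + 1280) = (batch, 2048)
    token = np.concatenate([emb_l, emb_g], axis=-1)
    # Add a token axis and repeat it: (batch, 77, 2048)
    return np.repeat(token[:, None, :], num_tokens, axis=1)

# Random stand-ins for the projected image embeddings (assumption).
rng = np.random.default_rng(0)
emb_l = rng.standard_normal((2, DIM_L)).astype(np.float32)
emb_g = rng.standard_normal((2, DIM_G)).astype(np.float32)

image_tokens = build_image_tokens(emb_l, emb_g)
print(image_tokens.shape)
```

A check like this only confirms that the image features have the same `(batch, 77, 2048)` layout as the text features; it says nothing about whether the attention output is meaningful.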

However, even with the direct concatenation method, it still does not work. Any suggestions?