Can RotorQuant/TurboQuant and dFlash support be added? #2184

@Sourajit1234

Description

@Sourajit1234

Hi!
This project is very useful for working with llama.cpp from Python. However, it would be great to have some features even before upstream llama.cpp has full support for them, namely TurboQuant/RotorQuant quantization and dFlash speculative decoding. That would make this library amazing to use.
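For context on the second request: speculative decoding uses a cheap draft model to propose several tokens, which the expensive target model then verifies in a single pass; any accepted prefix is kept and the first mismatch is replaced by the target's own token, so the output is identical to decoding with the target alone. Below is a minimal, self-contained sketch of that loop using toy stand-in "models" (the functions `draft_next` and `target_next` are illustrative placeholders, not part of llama-cpp-python or dFlash).

```python
def draft_next(token):
    # Cheap draft model: deterministically proposes the next token.
    return (token * 2 + 1) % 97

def target_next(token):
    # Expensive target model: the ground truth the output must match.
    # It agrees with the draft except when the token is divisible by 3.
    return (token * 2 + 1) % 97 if token % 3 else (token + 5) % 97

def speculative_decode(start, n_tokens, k=4):
    """Generate n_tokens greedily: the draft proposes k tokens, the
    target verifies them, the matching prefix is accepted, and the
    first mismatch is replaced by the target's own token."""
    out = [start]
    while len(out) < n_tokens + 1:
        # Draft proposes k tokens autoregressively.
        proposals, t = [], out[-1]
        for _ in range(k):
            t = draft_next(t)
            proposals.append(t)
        # Target verifies each proposed position in order.
        prev = out[-1]
        for p in proposals:
            want = target_next(prev)
            out.append(want)  # always keep the target's token
            if want != p or len(out) == n_tokens + 1:
                break  # mismatch (or enough tokens): stop accepting
            prev = want
    return out[1:]
```

Because the target's token is always the one appended, the result matches plain target-only decoding exactly; the speedup in a real system comes from verifying all k proposals in one batched forward pass.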
