• brucethemoose@lemmy.world
    9 hours ago

It is! 143GB last I checked. I’m on 128GB RAM + a 3090, 1 NUMA node, so I think it’s juuust barely too tight. But it should be perfect with a few of the “sparsest” MoEs quantized.

    If KTransformers supports something like that, I may have to finally check it out, since v4 won’t need many esoteric features.