Performance on Intel QYFS, 512GB DDR5 and 96GB VRAM

#3, opened by phakio

Thanks for the quantizations! I was very excited to try this model out. I downloaded the Q4_X quant to see the best perplexity I could get on my system, but realistically, after playing with it a bit, I'll redownload a Q3 quant to leave room for a good context length.

Backend: ik_llama.cpp
GPUs: 1x 4090, 3x 3090


@ubergarm will be publishing his ik-specific quants, maybe next week or so. If you primarily use ik_llama, you'd definitely benefit from the KS / KT quants supported there, and you'd get better PPL / KLD. But thanks for trying out mine!
