Thanks for the quantizations! I was very excited to try this model out! I downloaded the Q4_X quant just to see the best perplexity I could get on my system, but realistically, after playing with it a bit, I'll redownload the Q3 quant to get a usable context length.
@ubergarm will be publishing his ik-specific quants maybe next week or so. If you use ik_llama primarily, you'd definitely benefit from the KS / KT quant types supported there and get better PPL / KLD. But thanks for trying out mine!