Performance on Intel QYFS, 512GB DDR5 and 96GB VRAM

by phakio - opened 21 days ago

Thanks for the quantizations! I was very excited to try this model out. I downloaded the Q4_X quant to see the best perplexity I could get on my system, but realistically, after playing with it a bit, I'll redownload a Q3 quant to get a good context length.

ik_llama.cpp
1x 4090, 3x 3090

AesSedai

Owner 21 days ago

@ubergarm will be publishing his ik-specific quants maybe next week or so. If you primarily use ik_llama, you'd definitely benefit from the KS / KT quants supported there, and you'd get better PPL / KLD. But thanks for trying out mine!
