Quantized LLM on Hugging Face Spaces

Run a 4-bit quantized Vicuna-13B model on CPU using llama.cpp
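
Under the hood the Space can call the model through llama.cpp's Python bindings (llama-cpp-python). The sketch below shows one way to load a 4-bit checkpoint and run a completion on CPU; the GGUF file name, context size, thread count, and sampling values are assumptions, not necessarily this Space's actual settings.

```python
# A minimal sketch, assuming the llama-cpp-python bindings and a locally
# downloaded 4-bit GGUF checkpoint (the file name below is hypothetical).
from llama_cpp import Llama

# Load the quantized Vicuna-13B weights; inference runs entirely on CPU.
llm = Llama(
    model_path="vicuna-13b.Q4_K_M.gguf",  # hypothetical file name
    n_ctx=2048,     # context window
    n_threads=4,    # CPU threads; tune to the Space's hardware tier
)

# Vicuna uses a simple USER/ASSISTANT prompt format.
prompt = "USER: Explain 4-bit quantization in one paragraph.\nASSISTANT:"

out = llm(
    prompt,
    max_tokens=256,
    temperature=0.7,
    top_p=0.95,
    stop=["USER:"],
)
print(out["choices"][0]["text"])
```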

Generation controls exposed in the demo: max new tokens (50–512), temperature (0.1–2), top-p (0.1–1).
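
These controls map directly onto the sampler arguments of the completion call. Below is a minimal sketch of how such a demo could be wired up with Gradio; the slider defaults, prompt template, and model path are assumptions.

```python
# A hedged sketch of the demo UI, assuming Gradio; slider ranges follow the
# controls above, while the defaults and model path are guesses.
import gradio as gr
from llama_cpp import Llama

llm = Llama(model_path="vicuna-13b.Q4_K_M.gguf", n_ctx=2048)  # hypothetical path

def generate(message, max_new_tokens, temperature, top_p):
    # Wrap the user message in Vicuna's USER/ASSISTANT template and sample.
    prompt = f"USER: {message}\nASSISTANT:"
    out = llm(
        prompt,
        max_tokens=int(max_new_tokens),
        temperature=temperature,
        top_p=top_p,
        stop=["USER:"],
    )
    return out["choices"][0]["text"]

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(50, 512, value=256, step=1, label="Max new tokens"),
        gr.Slider(0.1, 2.0, value=0.7, step=0.1, label="Temperature"),
        gr.Slider(0.1, 1.0, value=0.95, step=0.05, label="Top-p"),
    ],
    outputs=gr.Textbox(label="Response"),
    title="Quantized LLM on Hugging Face Spaces",
)

demo.launch()
```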