Quantized LLM on Hugging Face Spaces
Run a 4-bit quantized Vicuna-13B model on CPU using llama.cpp
Interface controls:
- Prompt: free-text input
- Max Tokens: slider, range 50–512
- Temperature: slider, range 0.1–2
- Top P: slider, range 0.1–1
- Clear / Submit: buttons
- Generated Text: model output
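The sliders above bound the sampling parameters before they reach the llama.cpp backend. A minimal sketch of that validation step, with a hypothetical helper name mirroring the UI ranges (the app's actual source is not shown here):

```python
# Hypothetical helper mirroring the demo's slider ranges; the name and
# structure are illustrative, not taken from the Space's source code.
def clamp_params(max_tokens, temperature, top_p):
    """Clamp sampling parameters to the ranges exposed by the UI sliders."""
    return {
        "max_tokens": max(50, min(512, int(max_tokens))),   # 50–512
        "temperature": max(0.1, min(2.0, float(temperature))),  # 0.1–2
        "top_p": max(0.1, min(1.0, float(top_p))),          # 0.1–1
    }

# With llama-cpp-python installed and a 4-bit quantized model file on disk,
# the validated parameters would be passed along these lines (sketch only;
# the model filename is an assumption):
#
#   from llama_cpp import Llama
#   llm = Llama(model_path="vicuna-13b-q4.gguf")
#   out = llm("Q: What is quantization? A:", **clamp_params(256, 0.7, 0.95))
#   print(out["choices"][0]["text"])
```

Clamping server-side keeps the generation call safe even if a request bypasses the UI and submits out-of-range values.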