Description: This repo contains GGUF format model files for Meta Llama 2's Llama 2 70B Chat.

About GGUF: GGUF is a new format introduced by the llama.cpp team on August 21st 2023.

AWQ models for GPU inference. GPTQ models for GPU inference, with multiple quantisation parameter options. 2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference.

Aug 5 2023. On Medium I mainly discussed QLoRA to run large language models (LLMs) on consumer hardware.

I was testing llama-2 70b q3_K_S at 32k context with the following arguments: -c 32384 --rope-freq-base 80000 --rope-freq-scale 0.5. These seem to be settings for 16k.
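To see why the quoted flags may not reach the full 32k, here is a rough sketch of the linear part of RoPE scaling only. It assumes Llama 2's native context of 4096 tokens, a head dimension of 128, and the default frequency base of 10000, and it treats a --rope-freq-scale of 0.5 as a plain multiplier on the position index. The function names are hypothetical, and the extra stretch contributed by --rope-freq-base has no simple closed form, so it is left out.

```python
def linear_scaled_context(native_ctx: int, freq_scale: float) -> int:
    """Context length reachable through linear RoPE scaling alone.

    A scale of 1/N compresses positions by N, so the usable context
    grows by a factor of N. Assumption: no frequency-base change.
    """
    if not 0.0 < freq_scale <= 1.0:
        raise ValueError("freq_scale must be in (0, 1]")
    return round(native_ctx / freq_scale)


def rope_angle(pos: int, pair_index: int, head_dim: int = 128,
               freq_base: float = 10000.0, freq_scale: float = 1.0) -> float:
    """Rotation angle for one (position, dimension pair) under RoPE."""
    inv_freq = freq_base ** (-2.0 * pair_index / head_dim)
    return pos * freq_scale * inv_freq


# Linear scaling by 0.5 doubles Llama 2's native 4096-token window.
print(linear_scaled_context(4096, 0.5))

# With scale 0.5, position 8000 gets the same angle as position 4000 unscaled.
print(rope_angle(8000, 0, freq_scale=0.5), rope_angle(4000, 0))
```

Under these assumptions the scale factor alone accounts for about 8k tokens, so any reach beyond that has to come from the raised frequency base, which is consistent with the commenter's suspicion that the settings fall short of 32k.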