# llm-hosting

This extended article exists so that not everything has to live in the main readme. It covers hosting LLM models on the server.

## deploy

```sh
kubectl apply -f llm/llama_cpp_hosting.yaml
```

A sketch for verifying the deployment afterwards is at the end of this page.

## development

```sh
```

A hedged sketch for running a model locally during development is at the end of this page.

## links

Two examples of model files that are currently being tried out:

* [https://huggingface.co/MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF/resolve/main/Meta-Llama-3-70B-Instruct.IQ1_S.gguf?download=true](https://huggingface.co/MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF/resolve/main/Meta-Llama-3-70B-Instruct.IQ1_S.gguf?download=true)
  * From [this page](https://huggingface.co/MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF/tree/main).
* [https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q8_0.gguf?download=true](https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q8_0.gguf?download=true)
  * From [this page](https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/tree/main).
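To pull one of the model files linked above onto a machine, plain `curl` works; the target directory `models/` is an assumption here, not something prescribed by the manifest.

```sh
# Fetch the 8B Q8_0 quantization listed above (roughly 8.5 GB);
# -L follows Hugging Face's redirect to the actual file host.
mkdir -p models
curl -L -o models/Meta-Llama-3-8B-Instruct.Q8_0.gguf \
  "https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q8_0.gguf?download=true"
```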
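For local development, a minimal sketch assuming a llama.cpp build whose `llama-server` binary is on the PATH (older builds name it `server`) and the model file downloaded above:

```sh
# Serve the downloaded GGUF over llama.cpp's built-in HTTP server.
llama-server \
  -m models/Meta-Llama-3-8B-Instruct.Q8_0.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  -c 4096  # context window; tune to available memory
```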
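To verify the cluster deployment from the deploy section, a sketch assuming the manifest labels its pods `app=llama-cpp`, names the deployment `llama-cpp`, and exposes the llama.cpp HTTP server on port 8080; all three are assumptions, so read `llm/llama_cpp_hosting.yaml` for the real values.

```sh
# Watch the pod come up (label selector is an assumption).
kubectl get pods -l app=llama-cpp -w

# Forward the server port (deployment name is an assumption)
# and probe llama.cpp's built-in endpoints.
kubectl port-forward deploy/llama-cpp 8080:8080 &
curl http://localhost:8080/health
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello", "n_predict": 16}'
```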