updated default model to ministral 8b

Mistral AI just released their latest generation of open-source models.
Ministral 8B shows strong performance even at very small quantizations.
So for now the Ministral 8B Q2 quantization will be used as the new default. This
significantly decreases the size of the container while improving performance.
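The size claim can be sanity-checked with a rough back-of-envelope estimate. The figures below are assumptions, not from this commit: fp16 stores 16 bits per weight, and Q2_K averages roughly 2.6 bits per weight:

```shell
# Rough model-file size estimate for an 8B-parameter model.
# Assumptions: fp16 = 16 bits/weight; Q2_K ~ 2.6 bits/weight on average.
PARAMS=8000000000
FP16_GB=$(( PARAMS * 16 / 8 / 1000000000 ))       # bits -> bytes -> decimal GB
Q2_GB=$((  PARAMS * 26 / 10 / 8 / 1000000000 ))   # 2.6 bits/weight, integer math
echo "fp16 ~ ${FP16_GB} GB, Q2_K ~ ${Q2_GB} GB"
```

So the quantized weights land on the order of 2–3 GB rather than ~16 GB, which is what makes the container-size reduction "significant".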
judge 2025-12-08 00:39:26 +01:00
parent 642ffc60c6
commit b3ad72a7a2
GPG key ID: 6512C30DD8E017B5
2 changed files with 5 additions and 2 deletions


@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Changed
- Changed default model shipped with paperless-llm-workflow to ministral 8b base (smaller model with better results)
### Fixed
- Increased default num GPU layers to 1024 for better performance with GPU
- Updated llama-cpp bindings to version b7314 (2025-12-07)


@@ -1,7 +1,7 @@
 ARG INFERENCE_BACKEND="vulkan"
 # using quantized version of qwen3 8b for more resource efficiency
-ARG MODEL_URL="https://huggingface.co/unsloth/Qwen3-8B-GGUF/resolve/main/Qwen3-8B-UD-Q2_K_XL.gguf?download=true"
-ARG MODEL_LICENSE_URL="https://huggingface.co/Qwen/Qwen3-8B-GGUF/resolve/main/LICENSE?download=true"
+ARG MODEL_URL="https://huggingface.co/robolamp/Ministral-3-8B-Base-2512-GGUF/resolve/main/Ministral-3-8B-Base-2512-Q2_K.gguf?download=true"
+ARG MODEL_LICENSE_URL="https://www.apache.org/licenses/LICENSE-2.0.txt"
 FROM docker.io/rust:latest as builder
 ARG INFERENCE_BACKEND
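Because the model is wired in through build args, a different model or quantization can be substituted at build time without editing the Dockerfile. A sketch, assuming a standard `docker build` (the image tag and override URLs below are placeholders, not from this commit):

```shell
# Build with the default Ministral 8B Q2_K model:
docker build -t paperless-llm-workflow .

# Or override the model via the MODEL_URL / MODEL_LICENSE_URL build args
# (placeholder URLs — point these at any other GGUF file and its license):
docker build -t paperless-llm-workflow \
  --build-arg MODEL_URL="https://example.com/some-other-model.gguf" \
  --build-arg MODEL_LICENSE_URL="https://example.com/LICENSE" \
  .
```

The same `--build-arg` flags work with `podman build`, which matches the fully-qualified `docker.io/rust:latest` base image reference used here.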