Framework 13 + Ryzen AI + Linux Distro + LLM

I thought about starting a new thread but figure this is relevant here… has anyone successfully gotten llama.cpp or similar working on the Framework 13 under Linux?

  • Using an AI 5 340 board in DIY form with Fedora 42
  • I’ve followed a number of guides; llama.cpp builds, but it consistently gives the same errors and will not load models:
    • load_backend: failed to find ggml_backend_init in ~/opt/llama.cpp-vulkan/build/bin/libggml-rpc.so
      load_backend: failed to find ggml_backend_init in ~/opt/llama.cpp-vulkan/build/bin/libggml-vulkan.so
      load_backend: failed to find ggml_backend_init in ~/opt/llama.cpp-vulkan/build/bin/libggml-cpu.so
      
      
    • the compilation itself doesn’t error; building with -DGGML_BACKEND_DL=ON does cause errors, so I have not tested that path
    • none of the public discussions I’ve found actually point to a fix
    • I know this looks like a llama.cpp-specific GitHub issue, but it is heavily tied to this hardware configuration, so I’m asking the community.

I’ve followed the desktop guides from @lhl and @Lars_Urban, with contributions by @kyuz0 (though I’d rather not use an arbitrary container, and the toolbox tool wasn’t found in the repo anyway).

https://community.frame.work/t/amd-strix-halo-llama-cpp-installation-guide-for-fedora-42/75856

I’m trying to use Vulkan, not ROCm. I think I’ve done all the steps:

  • set up the environment and build
  • check devices
  • try to load a model

See code below

# root user
dnf install gcc.x86_64    # says installed 15.2.1-1
dnf install gcc-c++       # says installed 15.2.1-1
dnf install libstdc++     # says installed 15.2.1-1
dnf install python3-devel
dnf install python3-pip
dnf install mesa-vulkan-drivers.x86_64
dnf install vulkan-tools.x86_64
dnf install vulkan-headers.noarch
dnf install vulkan-loader-devel
dnf install curl.x86_64     # but already there
dnf install curlpp.x86_64   # a C++ wrapper for libcURL
dnf install curlpp-devel    # development files for curlpp
dnf install libcurl-devel   # says already installed
dnf install glslc.x86_64
dnf install rocm.noarch
dnf install cmake
sudo grubby --update-kernel=ALL --args='amd_iommu=off amdgpu.gttsize=98304 ttm.pages_limit=25165824' # for 96 GB
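For completeness, after rebooting the applied kernel args can be confirmed like this:

```shell
# Confirm the args set via grubby are on the running kernel's command
# line (only true after a reboot).
cat /proc/cmdline
grep -o 'amdgpu.gttsize=[0-9]*' /proc/cmdline || echo "gttsize not active"
```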

# regular user
cd ~/opt/llama.cpp-vulkan/
git pull
cmake -B build -DGGML_VULKAN=ON && cmake --build build --config Release -j 11
~/opt/llama.cpp-vulkan/build/bin/llama-cli --list-devices
        ggml_vulkan: Found 1 Vulkan devices:
        ggml_vulkan: 0 = AMD Radeon 840M Graphics (RADV GFX1152) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
        register_backend: registered backend Vulkan (1 devices)
        register_device: registered device Vulkan0 (AMD Radeon 840M Graphics (RADV GFX1152))
        register_backend: registered backend RPC (0 devices)
        register_backend: registered backend CPU (1 devices)
        register_device: registered device CPU (AMD Ryzen AI 5 340 w/ Radeon 840M)
        load_backend: failed to find ggml_backend_init in ~/opt/llama.cpp-vulkan/build/bin/libggml-rpc.so
        load_backend: failed to find ggml_backend_init in ~/opt/llama.cpp-vulkan/build/bin/libggml-vulkan.so
        load_backend: failed to find ggml_backend_init in ~/opt/llama.cpp-vulkan/build/bin/libggml-cpu.so
        Available devices:
          Vulkan0: AMD Radeon 840M Graphics (RADV GFX1152) (65877 MiB, 65707 MiB free)
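The Vulkan device clearly registers despite the three load_backend lines, so another thing worth inspecting is what the build actually produced and whether llama-cli is even linked against those libraries (a diagnostic sketch; paths assume my layout):

```shell
# Which backend libraries did the build produce, and is llama-cli
# dynamically linked against them (vs. the backends being compiled in)?
ls -l ~/opt/llama.cpp-vulkan/build/bin/libggml-*.so
ldd ~/opt/llama.cpp-vulkan/build/bin/llama-cli | grep -i ggml
```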

~/opt/llama.cpp-vulkan/build/bin/llama-cli -m /home/<username>/opt/llm_models/models/mistral_models/7B-Instruct-v0.3/model.q8_0.gguf -ngl 99
        (a bunch of output that includes the same three errors above, then the failure to load the model)

        load_tensors: loading model tensors, this can take a while... (mmap = true)
        llama_model_load: error loading model: missing tensor 'token_embd.weight'
        llama_model_load_from_file_impl: failed to load model
        common_init_from_params: failed to load model '~/opt/llm_models/models/mistral_models/7B-Instruct-v0.3/model.q8_0.gguf', try reducing --n-gpu-layers if you're running out of VRAM
        main: error: unable to load model
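Since the final failure is a missing tensor rather than a Vulkan error, I also want to rule out the model file itself; a truncated or partial download can produce exactly this message. A minimal check (path is my model, adjust as needed):

```shell
# Check the GGUF magic bytes and on-disk size; compare the size against
# the one listed on Hugging Face. A short file usually means a bad pull.
MODEL=/home/<username>/opt/llm_models/models/mistral_models/7B-Instruct-v0.3/model.q8_0.gguf
head -c 4 "$MODEL"; echo    # a valid file prints: GGUF
stat -c '%s bytes' "$MODEL"
```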