The following is intended as a supplement to this link, which you should treat as a prerequisite: The Best Way to Get Deepseek-R1: By Using an LXC in Proxmox
I recommend Ubuntu 22.04, Jammy Jellyfish, for less drama. I used to prefer Debian, but its repo support is falling off a cliff.
gpu.sh
#!/bin/bash
if [ -z "$1" ]; then
    echo "Usage: $0 <LXC_CONTAINER_ID>"
    exit 1
fi
LXC_ID="$1"
LXC_CONF="/etc/pve/lxc/$LXC_ID.conf"
GPU_DEVICES=("card0" "renderD128") # Adjust based on your GPU setup (informational only; the udev rules below hardcode these names)
echo "Stopping LXC container $LXC_ID..."
pct stop "$LXC_ID"
echo "Updating LXC config at $LXC_CONF..."
{
    echo "unprivileged: 1"
    echo "lxc.apparmor.profile: unconfined"
    echo "lxc.cap.drop: "
    echo "lxc.cgroup.devices.allow: c 226:* rwm"
    echo "lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir"
    echo "lxc.mount.entry: /dev/kfd dev/kfd none bind,optional,create=file"
    echo "#lxc.idmap: u 0 100000 65536"
    echo "#lxc.idmap: g 0 100000 65536"
    echo "#lxc.idmap: u 100000 0 1"
    echo "#lxc.idmap: g 100000 0 1"
    echo "#lxc.idmap: g 100001 44 1"
} >> "$LXC_CONF"
echo "Detecting GPU devices..."
ls -l /dev/dri
echo "Setting up UDEV rules for GPU passthrough..."
cat <<EOF > /etc/udev/rules.d/99-gpu-passthrough.rules
KERNEL=="card0", SUBSYSTEM=="drm", MODE="0660", OWNER="100000", GROUP="100000"
KERNEL=="renderD128", SUBSYSTEM=="drm", MODE="0660", OWNER="100000", GROUP="100000"
EOF
echo "Reloading UDEV rules..."
udevadm control --reload-rules
udevadm trigger
echo "Creating media_group for GPU access..."
groupadd -f media_group
usermod -aG media_group root
echo "Creating GPU permission fix script at /usr/local/bin/gpu_permission_fix.sh..."
cat <<EOF > /usr/local/bin/gpu_permission_fix.sh
#!/bin/bash
chown root:media_group /dev/dri/renderD128
chmod 660 /dev/dri/renderD128
EOF
chmod +x /usr/local/bin/gpu_permission_fix.sh
echo "Creating systemd service for automatic GPU permission fix..."
cat <<EOF > /etc/systemd/system/gpu_permission_fix.service
[Unit]
Description=Run GPU permission fix at startup
After=network.target
[Service]
ExecStart=/usr/local/bin/gpu_permission_fix.sh
Restart=no
[Install]
WantedBy=multi-user.target
EOF
systemctl enable gpu_permission_fix.service
systemctl start gpu_permission_fix.service
echo "Starting LXC container $LXC_ID..."
pct start "$LXC_ID"
echo "Applying GPU group inside the container..."
pct exec "$LXC_ID" -- bash -c "
    groupadd -f media_group
    usermod -aG media_group root
    chown root:media_group /dev/dri/renderD128
    chmod 660 /dev/dri/renderD128
"
echo "Verifying GPU access inside the container..."
pct exec "$LXC_ID" -- ls -l /dev/dri
echo "GPU passthrough setup complete! 🎉"
chmod +x gpu.sh
sudo ./gpu.sh <CONTAINER#>
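A quick sanity check on the Proxmox host after running the script, before digging into the troubleshooting below. This is a sketch that assumes the example container ID 100 used later; also note that the allow rule only covers the DRM character major (226), so if you rely on /dev/kfd for ROCm compute you may need an additional allow line for its major, and on Proxmox releases running pure cgroup v2 the equivalent key is lxc.cgroup2.devices.allow.
cat /etc/pve/lxc/100.conf        # the bind mounts and "c 226:* rwm" line should now be present
ls -l /dev/dri /dev/kfd          # DRM nodes use char major 226; /dev/kfd gets its own dynamic major
stat -c '%t:%T %n' /dev/dri/*    # major:minor in hex, to double-check against the allow rule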
It looks like the GPU devices inside the LXC container are owned by nobody:nogroup, and the script is failing to change ownership and permissions due to restrictions in an unprivileged LXC container.
Fix: Use lxc.idmap to Remap GPU Device Ownership
Since the container is unprivileged, root inside the container is mapped to an unprivileged user on the Proxmox host, preventing permission changes. We need to explicitly map the GPU devices into the container’s user namespace.
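To see that mapping concretely, compare what the host delegates to unprivileged containers with what the container actually sees. A minimal sketch, assuming Proxmox's default sub-ID delegation and the example container ID 100:
# On the Proxmox host: default delegation for unprivileged containers
grep root /etc/subuid /etc/subgid
# /etc/subuid:root:100000:65536
# /etc/subgid:root:100000:65536

# From inside the container, host-owned device nodes with no mapping
# appear as UID/GID 65534, i.e. nobody:nogroup
pct exec 100 -- ls -ln /dev/dri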
Step 1: Modify the LXC Configuration
Edit /etc/pve/lxc/100.conf and add the following lines, IF the script failed to do so AND it’s not working:
unprivileged: 1
lxc.apparmor.profile: unconfined
lxc.cap.drop:
lxc.cgroup.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/kfd dev/kfd none bind,optional,create=file
# lxc.idmap: u 0 100000 65536
# lxc.idmap: g 0 100000 65536
# lxc.idmap: u 100000 0 1
# lxc.idmap: g 100000 0 1
# lxc.idmap: g 100001 44 1
This ensures that inside the container:
- root maps to an actual privileged root.
- The media_group inside the container can access the GPU.
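If you do uncomment the idmap lines, remember that each entry follows the syntax <u|g> <container ID start> <host ID start> <count>, and that an unprivileged container can only map host IDs that root is allowed to delegate in /etc/subuid and /etc/subgid. A hedged sketch of the host-side allowances those commented mappings would need, assuming the default root:100000:65536 entries are already present:
# "u 100000 0 1" maps container UID 100000 onto host UID 0;
# "g 100001 44 1" maps container GID 100001 onto host GID 44 (video on Debian)
echo "root:0:1"  >> /etc/subuid   # allow mapping host UID 0
echo "root:0:1"  >> /etc/subgid   # allow mapping host GID 0
echo "root:44:1" >> /etc/subgid   # allow mapping host GID 44
pct stop 100 && pct start 100     # idmap changes only take effect on a fresh start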
Step 2: Create a Media Group and Assign GPU Access
On the Proxmox Host, run:
groupadd -g 44 media_group
chown root:media_group /dev/dri/renderD128
chmod 660 /dev/dri/renderD128
This ensures the container’s group 44 (media_group) has access to the GPU.
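One caveat: on Debian-based hosts (Proxmox included) GID 44 is normally already taken by the video group, so the groupadd above may refuse to create media_group with that GID. A quick check, plus the non-unique-GID workaround if you want to keep the media_group name rather than reuse video:
getent group 44                    # usually shows "video:x:44:..."
groupadd -o -g 44 media_group      # -o allows a duplicate GID; alternatively just use the video group
ls -ln /dev/dri/renderD128         # numeric owner/group, as the container will interpret them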
Step 3: Apply Fix Inside the Container
Start the container:
pct start 100
Then, inside the container (pct exec 100 -- bash), run:
groupadd -g 44 media_group
usermod -aG media_group root
chown root:media_group /dev/dri/renderD128
chmod 660 /dev/dri/renderD128
This will apply the correct permissions.
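To confirm the changes took effect (group membership only applies to new sessions, but pct exec always starts a fresh one), a couple of quick checks from inside the container:
id root                       # should now list media_group
getent group media_group      # should show GID 44
ls -ln /dev/dri/renderD128    # numeric owner/group should read 0 / 44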
Step 4: Verify GPU Access
Inside the container, check:
ls -l /dev/dri
It should now show something like:
crw-rw---- 1 root media_group 226, 128 Feb 19 10:55 renderD128
This should fix the permission errors.
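As a last functional check, here is a minimal sketch run from the Proxmox host, again assuming container ID 100. The rocminfo line only applies if this is an AMD GPU (implied by the /dev/kfd bind mount) with ROCm installed inside the container; skip it otherwise.
# Plain read/write access test on the render node from inside the container
pct exec 100 -- bash -c 'test -r /dev/dri/renderD128 && test -w /dev/dri/renderD128 \
    && echo "renderD128: OK" || echo "renderD128: permission problem"'

# Optional, AMD/ROCm only: a compute-level check
pct exec 100 -- rocminfo | head -n 20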