Local And Distributed AI Redux (Proxmox)

Continued from Proxmox Symposium - Hybrid Cloud LOCAL AI - LLM + SDXL + LXC Containers + Kubernetes Fabric for AMD GPU's

In loving memory of the founder of Proxmox Helper Scripts, tteck/tteckster.

Why is Local and TRULY distributed AI SO IMPORTANT? Glad you asked… Bill? Take it away…

If it was not clear, the importance of TTeck’s work was, and will remain, profound in this space. Scripts ARE the future of TRULY distributed computing. Remain vigilant against those we are developing away from.

SAFETY NOTICE: I wouldn’t personally recommend playing around with MESH wifi. One instance is probably fine, but clusters… who knows? Be careful out there. God bless every one of you. LONG LIVE DISTRIBUTED TECH!

Now, these virtualizations are not necessary, but I consider the creature comforts of a good user experience to be just as important as, if not more important than, our actual LLM. After all, if it’s not fun, why bother? :slight_smile:

Some Great 2.5G and/or 10G Switches - Nice for Ceph

You’re gonna like the way you network, I guarantee it…

https://www.amazon.com/Unmanaged-VIMIN-2-5Gbase-T-10G-SFP/dp/B0DNDS2NRJ

https://www.amazon.com/dp/B0CT2F3ZDM

https://www.amazon.com/Managed-8X10G-SFP-Aggregation-Multi-gig/dp/B0CQJCQ17Q

https://www.amazon.com/s?k=10g+sfp+switch&s=price-asc-rank&qid=1737841282

Ethernet over power!

Power filtering and data? Power filtering? Why yes, thanks for considering my health. And data? Well sure, for a limited number of nodes until the tech improves. Speaking of which, have you heard of PoE (Power over Ethernet)? Or was it EoP (Ethernet over Power)? Which do you think will win the arms race?

https://www.amazon.com/gp/product/B01H74VKZU/

Mmm, single-cable nodes, so clean and so fresh…
https://www.amazon.com/BV-Tech-Switch-Gigabit-Ethernet-uplink/dp/B01MQHD54L/

And nobody seems to be talking about porting PCI Express x16, why is that? Connecting motherboards with a PCIe x16 tap? Insane in the membrane…

https://www.amazon.com/lilila-ree-Graphics-Extension-90-Degree/dp/B0BNBTTPKC

Oopsie poopsie, the cat is out of the bag…

LVM vs. LVM-Thin vs. ZFS vs. Ceph RBD

Differentiate Ceph Pools for SSD and HDD
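
In case you haven’t actually created the two device-class pools yet, here’s a minimal sketch from the shell (the rule names and placement-group counts are my own illustrative picks, not gospel):

ceph osd crush rule create-replicated ssd-rule default host ssd
ceph osd crush rule create-replicated hdd-rule default host hdd
ceph osd pool create ssdpool 128 128 replicated ssd-rule
ceph osd pool create hddpool 128 128 replicated hdd-rule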

Extra useful commands to resolve errors presented by the Ceph summary page:
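
First, to see exactly what the summary page is complaining about:

ceph health detail

The warning that tends to show up after creating pools by hand is POOL_APP_NOT_ENABLED, which these two lines clear up: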

ceph osd pool application enable ssdpool rbd
ceph osd pool application enable hddpool rbd

… then navigate to Datacenter → Storage → Add → RBD for each of the two newly created pools to establish our RBDs, which should now be recognized across our cluster.

Now, I don’t know if it is procedurally correct, but you can now add a CephFS for storing ISOs by clicking a given PVE → Ceph → CephFS → Create CephFS.

You can also navigate to a given PVE → Ceph → Pools → ssdpool → Edit and reduce Size from 3 to 2 to increase usable capacity at the expense of redundancy, or inversely increase it from 3 to 4 for more redundancy at the expense of capacity, depending upon your application.
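
If you prefer the shell for that, the equivalent knobs (using the ssdpool from above; the same redundancy trade-off applies):

ceph osd pool get ssdpool size
ceph osd pool set ssdpool size 2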

I’ve heard it said that 3 replicas are 99.95% reliable while 2 are 98.5% reliable, all else equal. And I’ve heard it said that 2 is underrated in non-production environments, as you can achieve a level of error correction with only 2 replicas thanks to BlueStore’s checksums -
https://www.reddit.com/r/ceph/comments/zkksud/replica_3_vs_replica_2/

Now then, from personal experience I can tell you that if you only have three nodes and one of them goes down, your containers and VMs will crawl. So while my data survived a bad memory stick, for example, I’m personally sticking with 3/2 even for non-production, not only so I don’t lose data but so my HA (Highly Available) instances run strong even when an entire node goes down.

Starting Over: Orphaned Ceph

Wiping an orphaned Ceph disk:

List disks (in the shell of the node hosting the drive):

lsblk

This also provides valuable insights:

fdisk -l

Use this to obtain the drive path for the following command:

fdisk /dev/REPLACEWITHDISK

Delete a partition:

d

and/or

Create a new partition:

n

Now you will be prompted for the partition number (probably just press Enter for the default), the first sector (Enter again), and the last sector (Enter again to use the whole disk, or a size like +100G); follow the text wizard’s directions carefully.
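
For reference, the exchange looks roughly like this (exact wording varies by fdisk version and partition table type):

Command (m for help): n
Partition number (1-128, default 1): <Enter>
First sector (2048-..., default 2048): <Enter>
Last sector, +/-sectors or +/-size{K,M,G,T,P}: <Enter>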

Save your work (PERMANENT!!):

w

Now you can navigate to Disks → Select Disk → Wipe Disk after making sure you’re in the correct node’s control panel. Don’t forget to create something with your new and cleaned partition!
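
Alternatively, Ceph ships a purpose-built scrubber for exactly this job; a one-liner sketch, with the same PERMANENT warning attached (swap in your own device path):

ceph-volume lvm zap /dev/REPLACEWITHDISK --destroy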

M.2 NVMe - Those fast little guys

List Drives

lsblk

Use this to obtain the Ceph device-mapper name you need for:

dmsetup remove ceph-REALLY-LONG-NAME-OF-OLD-CEPH-PARTITION-FROM-LSBLK
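
If that name is too unwieldy to copy out of lsblk, you can list the device-mapper entries directly:

dmsetup ls | grep ceph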

Now you are free to navigate to Disks → Select Disk → Wipe Disk. Don’t forget to do something with your shiny new drive!

Deleting a CephFS (Ceph filesystem):

*Replace NAME below with your CephFS’s name

You can try the simple removal first if you like, but it may be protected both from this command and in the UI (User Interface - the graphical website)…

ceph fs rm NAME --yes-i-really-mean-it

WARNING: DELETES THINGS, USE AT YOUR OWN RISK

… Ergo, you probably need to remove protections:

pveceph stop --service mds.NAME
ceph fs set NAME down true
ceph fs fail NAME
ceph fs set NAME joinable false
ceph fs rm NAME --yes-i-really-mean-it

A more thorough approach (also necessary if you want the underlying data deleted):

umount /mnt/pve/NAME
pveceph stop --service mds.NAME
pveceph mds destroy NAME
pveceph fs destroy NAME --remove-storages --remove-pools
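
To confirm the filesystem is actually gone before chasing leftovers in the UI:

ceph fs ls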

If applicable, navigate to Datacenter → Storage → Select the CephFS → Remove to delete it from the User Interface (UI)

If the pools are still haunting the UI, navigate to NODE → Ceph → Pools → select the corresponding NAME_data and NAME_metadata pools → Destroy

Still having problems? Have you tried turning it off and on again?

reboot

https://docs.ceph.com/en/latest/cephfs/administration/
https://pve.proxmox.com/pve-docs/chapter-pveceph.html#_destroy_cephfs

How about a more powerful version of CasaOS with local AI support? Nice…

ZIMA FOR PROXMOX:

There must be a zillion alternatives out there right now…

Is DeepSeek compromised? Probably. Is it faster than ChatGPT? Probably. Keep looking for better and better LLMs with less and less capturability, I always said…

This is THE source! https://huggingface.co/

What about Upstream DNS? Virtual routers, anyone?

Pi-hole (basic DNS and DHCP, no custom upstream DNS) - Installing Pi-Hole on Proxmox – Natural Born Coder

Cloudflared + Pi-hole to dodge that nosy ISP - https://youtu.be/OfcuP01JyOE?si=TRSEbssf6j-MdzBe
… but not that nosy Cloudflare

Local Recursive DNS? Why yes - https://www.crosstalksolutions.com/the-worlds-greatest-pi-hole-and-unbound-tutorial-2023/#Unbound_Setup

Even MOAR Privacy -

MAXIMUM Privacy - I couldn’t find a good Tor + Pi-hole + Unbound + DNSSEC DNS LXC I could trust, so I built my own


Stand-alone Router w/ Native Wireless Support: Docker support - RaspAP Documentation

LXC > Docker (Router): Install Pi-hole on Proxmox and Use OPNsense Unbound DNS as Upstream DNS

I’ll just leave this here for now…

Wonder how many Tiny Core Linux instances a decent homelab could run? Hundreds? Thousands?!

