- vegetaaaaaaa@lemmy.world to Selfhosted@lemmy.world • Does anyone self host Kiwix (offline wikipedia)? • English · 1 · 2 hours ago

Yes. This is my ansible role that deploys it.
- vegetaaaaaaa@lemmy.world to Selfhosted@lemmy.world • Ollama Server Component Recommendations • English · 14 · 3 days ago
I suggest using llama.cpp instead of ollama: you can easily squeeze +10% inference speed and other memory optimizations out of llama.cpp, and with hardware prices nowadays every % saved on resources matters. Here is a simple ansible role to set up llama.cpp; it should give you a good idea of how to deploy it.
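A minimal sketch of what such a role's tasks can look like (the repo URL is llama.cpp's real one; the paths, model file, and service name are illustrative assumptions, not the author's actual role):

```yaml
# Sketch of a llama.cpp deployment as ansible tasks. Paths and the model
# file are placeholders; adjust build flags for your GPU backend.
- name: Clone llama.cpp
  ansible.builtin.git:
    repo: https://github.com/ggerganov/llama.cpp
    dest: /opt/llama.cpp
    version: master

- name: Build llama.cpp (CPU build shown; add backend flags as needed)
  ansible.builtin.command:
    cmd: "{{ item }}"
    chdir: /opt/llama.cpp
    creates: /opt/llama.cpp/build/bin/llama-server  # skip once built
  loop:
    - cmake -B build
    - cmake --build build --config Release -j

- name: Install a systemd unit for llama-server
  ansible.builtin.copy:
    dest: /etc/systemd/system/llama-server.service
    content: |
      [Unit]
      Description=llama.cpp inference server
      After=network.target

      [Service]
      ExecStart=/opt/llama.cpp/build/bin/llama-server -m /opt/models/model.gguf --host 0.0.0.0 --port 8080
      Restart=on-failure

      [Install]
      WantedBy=multi-user.target

- name: Enable and start llama-server
  ansible.builtin.systemd:
    name: llama-server
    enabled: true
    state: started
    daemon_reload: true
```

llama-server exposes an OpenAI-compatible HTTP API on the configured port, so anything that can talk to ollama can usually be pointed at it with little more than a base-URL change.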
A dedicated inference rig is not gonna be cheap. What I did, since I needed a gaming rig anyway, was get 32GB of DDR5 (this was before the current RAMpocalypse; had I known, I would have bought 64) and an AMD 9070 (16GB VRAM; again, if I had known how crazy prices would get, I'd probably have bought a 24GB VRAM card). The home server runs the usual/non-AI stuff, and llama.cpp runs on the gaming desktop (the home server just has a proxy to it). Yeah, the gaming desktop has to be powered up when I want to run inference, but it's my main desktop, so it's powered on most of the time anyway. No big deal.
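The proxy piece is just an ordinary reverse proxy. A minimal sketch in the same ansible style, assuming nginx on the home server (hostname, desktop IP, and ports are placeholders, not the author's actual setup):

```yaml
# Hypothetical reverse proxy on the home server, forwarding to the gaming
# desktop that runs llama-server. Assumes a "reload nginx" handler exists.
- name: Proxy llama-server through the home server
  ansible.builtin.copy:
    dest: /etc/nginx/conf.d/llama.conf
    content: |
      server {
          listen 80;
          server_name llama.home.example;

          location / {
              proxy_pass http://192.168.1.50:8080;  # gaming desktop
              proxy_read_timeout 600s;              # long generations
          }
      }
  notify: reload nginx
```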
Email
Most applications/services offer mail as a notification channel. Even old-school unix utilities such as cron support sending mail (through the system MTA). I use msmtp. Then configure K-9 Mail or any decent mail client on your phone, and set up filters so that mail from your services ends up in a high-priority folder in your mailbox with notifications enabled. I want to be able to receive notifications both on mobile and desktop; this is the only reasonable option I found, and I have been running with it for > 10 years.
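For a sense of what the msmtp side involves, a minimal sketch, again as an ansible task (the relay host, account, and paths are placeholders; the directives themselves are standard msmtprc):

```yaml
# Hypothetical task installing a system-wide msmtp config so local mail
# (cron, app notifications) is relayed to a real mailbox.
- name: Configure msmtp as the system mail relay
  ansible.builtin.copy:
    dest: /etc/msmtprc
    mode: "0600"
    content: |
      defaults
      auth on
      tls on
      tls_trust_file /etc/ssl/certs/ca-certificates.crt
      logfile /var/log/msmtp.log

      account default
      host smtp.example.org
      port 587
      from homeserver@example.org
      user homeserver@example.org
      passwordeval "cat /etc/msmtp-password"

      # route local recipients (root, cron, etc.) through /etc/aliases
      aliases /etc/aliases
```

With msmtp installed as the system sendmail (e.g. the msmtp-mta package on Debian), cron only needs a `MAILTO=you@example.org` line in the crontab and job output lands in your mailbox like any other notification.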
Damn, their website has become a mess. Anyway