Not exactly self hosting but maintaining/backing it up is hard for me. So many “what if”s are coming to my mind. Like what if DB gets corrupted? What if the device breaks? If on cloud provider, what if they decide to remove the server?
I need a local server and a remote one that are synced to confidently self-host things, and setting this up is a hassle I don’t want to take on.
So my question is: how safe is your setup? Are you still enthusiastic about it?
It doesn’t have to be hard - you just need to think methodically through each of your services and assess the cost of creating/storing the backup strategy you want versus the cost (in time, effort, inconvenience, etc) if you had to rebuild it from scratch.
For me, that means my photo and video library (currently Immich) and my digital records (Paperless) are backed up using a 2N+C strategy: a copy on each of 2 NASes locally, and another copy stored in the cloud.
Ditto for backups of my important homelab data. I have some important services (like Home Assistant, Node-RED, etc) that push their configs into a personal Gitlab instance each time there’s a change. So, I simply back that Gitlab instance up using the same strategy. It’s mainly raw text in files and a small database of git metadata, so it all compresses really nicely.
For other services/data that I’m less attached to, I only back up the metadata.
Say, for example, I’m hosting a media library that might replace my personal use of services that rhyme with “GetDicks” and “Slime Video”. I won’t necessarily back up the media files themselves - that would take way more space than I’m prepared to pay for. But I do back up the databases for that service that tell me what media files I had, and even the exact name of the media files when I “found” them.
In a total loss of all local data, even though the inconvenience factor would be quite high, the cost of storing backups would far outweigh that. Using the metadata I do back up, I could theoretically just set about rebuilding the media library from there. If I were hosting something like that, that is…
Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:
Fewer Letters    More Letters
DNS              Domain Name Service/System
Git              Popular version control system, primarily for code
HA               Home Assistant automation software ~ High Availability
HTTP             Hypertext Transfer Protocol, the Web
IP               Internet Protocol
LVM              (Linux) Logical Volume Manager for filesystem mapping
LXC              Linux Containers
NAS              Network-Attached Storage
PSU              Power Supply Unit
Plex             Brand of media server package
RAID             Redundant Array of Independent Disks for mass storage
RPi              Raspberry Pi brand of SBC
SBC              Single-Board Computer
SSH              Secure Shell for remote terminal access
VPS              Virtual Private Server (opposed to shared hosting)
ZFS              Solaris/Linux filesystem focusing on data integrity
nginx            Popular HTTP server
15 acronyms in this thread; the most compressed thread commented on today has 3 acronyms.
[Thread #821 for this sub, first seen 21st Jun 2024, 17:05]
My setup is pretty safe. Every day the server copies its root file system to its RAID, into a folder named after the day of the week, so I always have 7 days of root fs backups. From there, I manually back up the RAID to a PC at my parents’ house every few days. This is started from the remote PC so that if any sort of malware infects my server, it can’t infect the backups.
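For anyone who wants to copy the day-of-the-week rotation, here is a minimal Python sketch of the idea. The mount point `/mnt/raid/rootfs`, the helper names, and the use of rsync are all assumptions for illustration; the poster did not say which tool does the copying.

```python
from datetime import date, timedelta
from pathlib import Path

# Fixed names so the folder layout does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def daily_backup_dir(backup_root: Path, day: date) -> Path:
    """Return the folder for this day's copy, named after the weekday."""
    return backup_root / WEEKDAYS[day.weekday()]


def rsync_command(dest: Path) -> list[str]:
    """Build (but do not run) an rsync call that mirrors / into dest.

    --one-file-system keeps rsync from descending into other mounts,
    which matters if the backup target is itself mounted under /.
    """
    return ["rsync", "-a", "--delete", "--one-file-system", "/", f"{dest}/"]


# Hypothetical backup location on the RAID.
root = Path("/mnt/raid/rootfs")

# Two weeks of daily runs only ever touch 7 folders: each run
# overwrites the copy made on the same weekday one week earlier.
start = date(2024, 6, 21)
folders = {daily_backup_dir(root, start + timedelta(days=i)).name
           for i in range(14)}
```

A daily cron job or systemd timer could pass the result of `rsync_command(daily_backup_dir(root, date.today()))` to `subprocess.run`. The oldest copy is always overwritten after a week, which is why retention stays at 7 days.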
I feel like there are better ways to do this but it isn’t terrible
Better than losing it all
Off-site backups that are still local is brilliant.
Pretty solid backup strategy :) I like it.
Absurdly safe.
Proxmox cluster, HA active. Ceph for live data. Truenas for long term/slow data.
About 600 pounds of batteries at the bottom of the rack to weather short power outages (up to 5 hours). 2 dedicated breakers on different phases of power.
Dual/stacked switches with LACP’d connections that must be on both switches (one switch dies? Who cares). Dual firewalls with CARP ACTIVE/ACTIVE connection…
Basically everything is as redundant as it can be aside from one power source into the house… and one internet connection into the house. My “single points of failure” are all outside of my hands… and are all mitigated/risk assessed down.
I do not use cloud anything… putting even 1/10th of my shit onto the cloud would cost thousands a month.
Absurdly safe.
[…] Ceph
For me these two things are exclusive of each other. I had nothing but trouble with Ceph.
Ceph has been FANTASTIC for me. I’ve done the dumbest shit to try and break it and have had great success recovering every time.
The key in my experience is OODLES of bandwidth. It LOVES fat pipes. In my case 2x 40Gbps link on all 5 servers.
It depends on how you set it up. Most people do it wrong.
It’s quite robust, but it looks like everything will be destroyed when your server room burns down :)
Fire extinguisher is in the garage… literal feet from the server. But that specific problem is actually being addressed soon. My dad is setting up his cluster and I fronted him about 1/2 the capacity I have. I intend to sync longterm/slow storage to his box (the truenas box is the proxmox backup server target, so also collects the backups and puts a copy offsite).
Slow process… Working on it :) Still have to maintain my normal job after all.
Edit: another possible mitigation I’ve seriously thought about for “fire” are things like these…
https://hsewatch.com/automatic-fire-extinguisher/
Or those types of modules that some 3d printer people use to automatically handle fires…
Yeah I really like the “parent backup” strategy from @hperrin@lemmy.world :) This way it costs much less.
The real fun is going to be when he’s finally up and running… I have ~250TB of data on the Truenas box. Initial sync is going to take a hot week… or 2…
Edit: 23 days at his max download speed :(
Fine… a hot month and a half.
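The numbers in this exchange can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 1 Gbps link speed is an assumption inferred from the “23 days” figure (it was not stated in the thread), and the `efficiency` parameter is a hypothetical fudge factor for protocol overhead and sharing the line with other traffic.

```python
def transfer_days(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_tb (decimal terabytes) over a link.

    efficiency is the fraction of the nominal link speed actually
    achieved (1.0 means the link is fully saturated the whole time).
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400  # seconds per day


# ~250 TB over an assumed 1 Gbps download link, fully saturated.
best_case = transfer_days(250, 1.0)

# Same transfer if only half of the link is usable on average.
realistic = transfer_days(250, 1.0, efficiency=0.5)
```

Under those assumptions the best case comes out at a little over 23 days, and running at half speed lands at roughly a month and a half, which is why seeding the first copy locally is so attractive.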
I’m doing something similar (with a lot less data), and I’m intending on syncing locally the first time to avoid this exact scenario.
Different phases of power? Did you have 3-phase run to your house or something?
You could get a Starlink for redundant internet connection. Load balancing / fail over is an interesting challenge if you like to DIY.
Nope 240. I have 2x 120v legs.
I actually had Verizon home internet (5G/LTE) to do that… but I need static addresses for some services. I’m still working that out a bit…
Couldn’t you use a VPS as the public entry point?
I could… But it would be a royal pain in the ass to find a VPS that has a clean address to use (especially for email operations).
In the US at least, most equipment (unless you get into high-end datacenter stuff) runs on 120V. We also use 240V power, but a 240V connection is actually two 120V phases 180 degrees out of sync. The main feed coming into your home is 240V, so your breaker panel splits the circuits evenly between the two phases. Running dual-phase power to a server rack is as simple as just running two 120V circuits from the panel.
My rack only receives a single 120V circuit, but it’s backed up by a dual-conversion UPS and a generator on a transfer switch. That was enough for me. For redundancy, though, dual phases, each with its own UPS, and dual-PSU servers are hard to beat.
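To see why two 120V legs that are 180 degrees apart add up to 240V, here is a small numeric sketch in Python. It assumes ideal 60 Hz sine waves and samples exactly one cycle; the function names are made up for illustration.

```python
import math


def leg_voltage(t: float, rms_volts: float = 120.0,
                freq_hz: float = 60.0, phase_deg: float = 0.0) -> float:
    """Instantaneous voltage of one leg relative to neutral."""
    peak = rms_volts * math.sqrt(2)
    return peak * math.sin(2 * math.pi * freq_hz * t + math.radians(phase_deg))


def rms(samples: list[float]) -> float:
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Sample one full 60 Hz cycle at 1000 evenly spaced points.
n = 1000
times = [i / (60.0 * n) for i in range(n)]

leg_a = [leg_voltage(t) for t in times]
leg_b = [leg_voltage(t, phase_deg=180.0) for t in times]

# Voltage measured from one leg to the other (not to neutral).
across = [a - b for a, b in zip(leg_a, leg_b)]

leg_rms = rms(leg_a)        # close to 120 V leg-to-neutral
across_rms = rms(across)    # close to 240 V leg-to-leg
```

Each leg measures about 120V to neutral, and because the second leg is the mirror image of the first, the difference between them is twice as large.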
Exactly this. 2 phase into house, batteries on each leg. While it would be exceedingly rare for just one phase to go out… I can in theory weather that storm indefinitely.
What does your internal network look like for ceph?
40 SSDs as my OSDs… 5 hosts… all nodes are all functions (monitor/manager/metadata servers). If I added more servers I would not add any more of those… (I do have 3 more servers for “parts”/spares… but could turn them on too if I really wanted to.)
2x 40Gbps networking for each server.
Since upstream internet is only 8Gbps I let some VMs use that bandwidth too… but that doesn’t eat into enough to starve Ceph at all. There’s 2x 1Gbps for all the normal internet-facing services (which also acts as an innate rate limiter for those services).
You should edit your post to make this sound simple.
“just a casual self hoster with no single point of failure”
Nah, that’d be mean. It isn’t “simple” by any stretch. It’s an aggregation of a lot of hours put into it. What’s fun is that when it gets that big you start putting tools together to do a lot of the work/diagnosing for you. A good chunk of those tools have made it into production for my companies too.
LibreNMS to tell me what died when… Wazuh to monitor most of the security aspects of it all. I have a Gitea instance with my own repos for scripts when maintenance time comes. Centralized stuff and a cron stub on the containers/VMs can mean you update all your stuff in one go.
Automate as much as possible. I rsync to both an online and home NAS for all of my hosted stuff, both at home and in the cloud. Updates for the OS and low level libraries are automated. The other updates are generally manual, that allows me to set aside time for fixing problems that updates might cause while still getting most of the critical security updates. And my update schedules are generally during the day, so that if something doesn’t restart properly, I can fix it.
Also, whenever possible I assume a fair amount of time for updates, far beyond what it should actually take. That way I won’t be rushed to fix the problem and end up having to revert to a backup and find time later to redo it. Then most of the time I have extra time for analyzing stats to see if I can improve performance or save money with optimizations.
I’ve never had a remote provider just suddenly vanish though I use fairly well known hosts. And as for local hardware, I just have to do without until I can buy a replacement. Or if it’s going to be some time, I do have old hardware that I could set up as a makeshift, temporary replacement like old desktop computers and some hardware that I use for experimenting like my Le Potato that isn’t powerful enough for much, but ok for the short term.
And finally I’ve been moving to more container-based setups that are easier to get up and running again. I’ve been experimenting with Nomad, Docker Swarm, K3s, etc., along with Traefik and some other reverse proxies so I can keep the workers air-gapped for security.
Not safe at all. I look for robustness. I prefer thinking about things that do not break easily (like ZFS and RAIDZ) instead of “what could possibly go wrong”
And I have never quite figured out how to do restores, so I neglect backups as well.
My incredible hatred and rage for not understanding things powers me on the cycle of trying and failing hundreds of times till I figure it out. Then I screw it all up somehow and the cycle begins again.
I have a rack in my garage.
My advice, keep it simple, keep it virtual.
I dumpster dove for hardware and run proxmox on hosts. Not even clustered, just simple stand alone proxmox hosts. Connect to my Synology storage device and done.
I run Nextcloud for WebDAV contacts and calendar (fuck Google); it does photo and doc storage too. The Nextcloud client is free from F-Droid for Android and works on Debian desktops like a charm.
I run a Minecraft server
I run a home automation server
I run a media server.
Proxmox backs everything up on schedule
All I need to do is get off-site backup set up for the important data on the Synology and I’m all set.
It’s really not as hard as you think if you keep it simple
Not exactly self hosting but maintaining/backing it up is hard for me. So many “what if”s are coming to my mind. Like what if DB gets corrupted? What if the device breaks? If on cloud provider, what if they decide to remove the server?
Backups. If you follow the 3-2-1 backup strategy, you don’t have to worry about anything.
Isn’t it enough to have a single offsite backup?
Yeah, that’s exactly what the 3-2-1 rule says.
- 3 copies of your data
- 2 different kinds of storage media
- 1 off-site backup
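The three rules are easy to turn into a quick checklist for your own setup. Below is a minimal Python sketch; the `Copy` class and the example inventory are hypothetical, and the live data is counted as one of the three copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str      # where the copy lives
    media: str     # kind of storage, e.g. "ssd", "hdd", "cloud"
    offsite: bool  # True if it is physically away from the original


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Check an inventory of copies (live data included) against 3-2-1."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical setup: live data, a local NAS, and a cloud bucket.
full_setup = [
    Copy("server", media="ssd", offsite=False),
    Copy("home NAS", media="hdd", offsite=False),
    Copy("cloud bucket", media="cloud", offsite=True),
]

# Live data plus a single off-site backup: only two copies in total.
single_offsite = [
    Copy("server", media="ssd", offsite=False),
    Copy("cloud bucket", media="cloud", offsite=True),
]

full_ok = satisfies_3_2_1(full_setup)
single_ok = satisfies_3_2_1(single_offsite)
```

Read this way, a single off-site backup covers the “1” but falls one copy short of the “3”, so it does not pass on its own.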
My professional experience is in systems administration, cloud architecture, and automation, with considerations for corporate disaster recovery and regular 3rd party audits.
The short answer to all of your questions boils down to two things:
1: If you’re going to maintain a system, write a script to build it, then use the script (I’ll expand this below).
2: Expect a catastrophic failure. Total loss, server gone. As such, back up all unique or user-generated data regularly, and practice restoring it.
Okay back to #1; I prefer shell scripts (pick your favorite shell, doesn’t matter which), because there are basically zero requirements. Your system will have your preferred shell installed within minutes of existing, there is no possibility that it won’t. But why shell? Because then you don’t need docker, or python, or a specific version of a specific module/plugin/library/etc.
So okay, we’re gonna write a script. “I should install by hand as I’m taking down notes” right? Hell, “I can write the script as I’m manually installing”, “why can’t that be my notes?”. All totally valid, I do that too. But don’t use the manually installed one and call it done. Set the server on fire, make a new one, run the script. If everything works, you didn’t forget that “oh right, this thing real quick” requirement. You know your script will bring you from blank OS to working server.
Once you have those, the worst case scenario is “shit, it’s gone… build new server, run script, restore backup”. The penalty for critical loss of infrastructure is some downtime. If you want to avoid that, see if you can install the app on two servers, the DB on another two (with replication), and set up a cluster. Worst case (say the whole region is deleted) is the same; make new server, run script, restore backups.
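“Practice restoring it” is the step most people skip, so here is one way a restore drill could be checked. This is a sketch in Python rather than shell, purely for illustration; the function names are made up, and comparing SHA-256 digests of every file is just one possible definition of “the restore worked”.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def restore_matches(original: Path, restored: Path) -> bool:
    """True if both trees contain the same files with the same contents."""
    return tree_digest(original) == tree_digest(restored)


# Tiny self-contained drill using throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "original"
    restored = Path(tmp) / "restored"
    for root in (original, restored):
        (root / "config").mkdir(parents=True)
        (root / "config" / "app.conf").write_text("port = 8080\n")
        (root / "data.db").write_bytes(b"\x00\x01\x02")

    ok = restore_matches(original, restored)

    # Simulate a restore that silently lost part of a file.
    (restored / "data.db").write_bytes(b"\x00\x01")
    tampered_ok = restore_matches(original, restored)
```

In a real drill, `original` would be the live data directory and `restored` the directory on the freshly built server after running the restore. Note that `read_bytes` loads each file fully into memory, so very large files would need chunked hashing.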
If you really want to get into docker or etc after that, there’s no blocker. You know how to build the system “bare metal”, all that’s left is describing it to docker. Or cloudformation, terraform, etc, etc, etc. I highly recommend doing it with shell first, because A: You learn a lot about the system and B: you’re ready to troubleshoot it (if you want to figure out why it failed and try to mitigate it before it happens again, rather than just hitting “reset” every time).
I just started my mbin instance a week or two ago. When I did, I wrote a guided install script (it’s a long story, but I ended up having to blow away the server like 7 times and re-install).
This might be overkill for your purposes, but it’s the kind of thing I have in mind.
Note1: Sorry, it’s kinda sloppy. I need to clean it up before I submit a PR to the mbin devs for possible inclusion in their documentation.
Note2: It assumes that you’re running a single-user instance, and on a single, small server, with no external requirements.
@iso@lemy.lol I think we need to accept that unless self-hosting is your full time job, things can and will break. At some point you have to accept it and let it go.
Finally I know when I die, my spouse won’t take care of my homelab and servers, all of it will go to the recycler.
First of all ignore the trends. Fuck docker, fuck nixos, fuck terraform or whatever tech stack gets shilled constantly.
Find a tech stack that is easy FOR YOU and settle on that. I haven’t changed technologies for 4 years now and feel like everything can fit in my head.
Second of all, look at the other people using commercial services and see how stressed they are. Google banned my account, youtube has ads all the time, the app for service X changed and it’s unusable and so on.
Nothing comes for free in terms of time and mental baggage
Yes, you should use something that makes sense to you but ignoring docker is likely going to cause more aggravation than not in the long term.
Yep, I went in this direction…until I gave in during a bare metal install of something…
Docker is not hassle free but usually most setup guides for apps are much much easier with docker
Docker/Podman or any containerized solution is basically the easiest way to get really nice maintenance properties like: updating one app won’t break others, won’t take down the whole system, can be moved from machine to machine.
Containers are a learning curve but I think very worth it for home setups. Compared to something like Kubernetes which I would say is less worth it unless you already know or want to learn Kubernetes.
Docker takes a lot of the management work out of the equation, as many of the containers automatically update. Manual updates are as simple as recreating a container with a new image instead of your local one. I would also suggest trying Portainer (a graphical management interface for Docker). Breaking out the various options into a GUI helped me learn the ins and outs of Docker better, plus if you end up expanding to multiple Docker hosts you can manage them all from one console. I have a desktop, a laptop, and a RPi 4b all running various containers, and having a single pane for management is such a convenience.
Not to mention the advantage of infrastructure as code. All my docker configs are just a dozen or so text files (compose). I can recreate my server apps from a bare VM in just a few minutes then copy the data over to restore a backup, revert to a previous version or migrate to another server. Massive advantages compared to bare metal.
Docker is not a shill tech stack. It is a core developer tool that is certainly not required, but is certainly not fluff
I work IT for my day job managing a datacenter and cloud infrastructure.
I host mostly Plex, Home Assistant, and Immich. Immich has its data backed up, I don’t care about Plex data. If it all dies, so be it.
I have a server coloed that houses some websites and email, plus some random other things I’ve set up and tested. It’s got backups, and downtime is fine.
If my self hosted stuff dies, it doesn’t matter. Nothing in my life ultimately relies on it.
Immutable NixOS. My entire server deployment from partitioning to config is stored in git on all my machines.
Every time I boot all runtime changes are “wiped”, which is really just BTRFS subvolume swapping.
Persistence is possible, but I’m forced to deal with it otherwise it will get wiped on boot.
I use LVM for mirrored volumes for local redundancy.
My persisted volumes are backed up automatically to Backblaze B2 using rclone. I don’t back up everything. Stuff I can download again is skipped, for example. I don’t have anything currently that requires putting a process in “maint mode”, like a database getting corrupted if I back up while it’s being written to. When I did, I’d either script gracefully shutting down the process or use any export functionality if the process supported it.
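The “maint mode” pattern (stop the service, back up, always start it again) could look roughly like this in Python. It is only a sketch: `maintenance_mode` is a made-up helper, and the stop/start/backup steps are passed in as plain callables so nothing here assumes a particular init system or backup tool.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def maintenance_mode(stop: Callable[[], None],
                     start: Callable[[], None]) -> Iterator[None]:
    """Stop a service for the duration of the block, then restart it.

    The restart sits in a finally clause, so the service comes back
    even if the backup step raises an exception.
    """
    stop()
    try:
        yield
    finally:
        start()


# Stand-in actions that just record what happened, in order.
events: list[str] = []

with maintenance_mode(lambda: events.append("stop db"),
                      lambda: events.append("start db")):
    events.append("run backup")


# Same pattern when the backup itself fails.
failed_events: list[str] = []
try:
    with maintenance_mode(lambda: failed_events.append("stop db"),
                          lambda: failed_events.append("start db")):
        raise RuntimeError("backup failed")
except RuntimeError:
    failed_events.append("backup error reported")
```

In practice the callables might wrap `subprocess.run` calls to whatever stops and starts the service and runs the upload, but those commands depend entirely on the setup.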
Others have said this, but it’s always a work in progress.
It started out as just a spare OptiPlex desktop when I needed a dedicated box for Minecraft and Valheim servers, and now I have a rack in my living room with a few key things I and others rely on. You definitely aren’t alone XD
Regular, proactive work goes a long way. I also started creating tickets for myself, each with a specific task. This way I could break things down, have reminders of what still needs attention, and track progress.
Do you host your ticketing system? I’d like to try one out. My TODO markings in my notes app don’t end up organized enough to be helpful. My experience is with JIRA, which I despise with every fiber of my being.
Try Vikunja, it might tick the box for you.
We built Vikunja with speed in mind - every interaction takes less than 100ms.
Their heads are certainly in the right place. I’ll check this out, thank you!
I have set up forgejo, which is a fork of gitea. It’s a git forge, but its ticketing system is quite good.
Oh neat, I was actually planning to set that up to store scripts and some projects I’m working on, I’ll give the tickets a try then.
Mostly I just use Nextcloud’s Deck extension. It behaves close enough to what I need as a solo operation.