Safety First. Or Maybe After?
It was a completely normal Tuesday morning. Coffee, laptop open, and then: connection failed. Timeout. Try again. Nothing. My usual ssh root@<IP> just hung in the air, as if the request was disappearing into nowhere.
At the time I was on vacation in Croatia, relaxed somewhere between the Adriatic coast and Croatian summer cuisine, and the last thing I wanted to deal with was an unreachable server. My first suspicion was the obvious one: hotel WiFi. Everyone knows the story: strange captive portals, firewalls that selectively block certain ports, some deep packet inspection shenanigans that occasionally turn a hotel network into a black box. Outbound SSH on port 22 is exactly the kind of traffic some admins block on public networks. I pushed the problem aside and enjoyed the rest of the vacation.
But then I got home. Same IP, same command, same timeout. This was no longer hotel WiFi. This was my home network. This was my usual setup. And it still didn't work.
Emergency Solution with Obstacles
What's left when SSH fails? The detour via the remote console of my hosting provider IONOS. Anyone who has worked with it knows: it's no pleasure.
The console is browser-based, the keyboard input runs through some kind of virtual machine, and on top of that the keyboard layout is American (QWERTY), which is particularly annoying with bash commands since the special characters are often in different places than you're used to. The most painful part: no pasting with Ctrl+V. Every command, every IP address, every long config path has to be typed character by character. That is still manageable for short commands, but a real test of patience for longer configuration blocks. Nevertheless, it was the only tool available to me in this situation. So I got to work.
What the Logs Were Saying
The very first look at the auth logs was sobering and revealing at the same time. systemctl status sshd already hinted that something was off, and the logs confirmed the suspicion: grep "Failed password" /var/log/auth.log spat out hundreds, if not thousands, of entries. Bots. Countless bots. Automated scripts systematically trying usernames and passwords: root, admin, ubuntu, pi, test, deploy. A classic brute-force attack on SSH.
This is not an unusual phenomenon. Any server with a public IP address running SSH on the default port 22 is typically found and probed by bots within minutes of first going online. Search engines like Shodan continuously index open ports, and botnets use this kind of data to launch login attempts automatically. My server hadn't been targeted specifically; it was simply part of the daily background noise of the internet.
The good news: my root password wasn't root123. The bad news: everything else was still pretty close to the default settings.
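To see which usernames and addresses dominate that background noise, the grep from above can be aggregated. The following is a minimal sketch, assuming the usual OpenSSH log line format found in /var/log/auth.log on Debian or Ubuntu; the function name failed_logins is made up for this example.

```shell
# failed_logins FIELD LOGFILE
# FIELD is "user" or "ip". Prints "count value" lines, most frequent first.
failed_logins() {
  grep -F "Failed password" "$2" |
    awk -v want="$1" '{
      # Lines look like: "... Failed password for [invalid user] NAME from IP port N ssh2"
      # so the name sits right before "from" and the address right after it.
      for (i = NF; i > 1; i--) {
        if ($i == "from") {
          print (want == "user" ? $(i - 1) : $(i + 1))
          break
        }
      }
    }' |
    sort | uniq -c | sort -k1,1nr -k2
}

# Usage on the server (reading the log usually requires root):
#   failed_logins user /var/log/auth.log | head
#   failed_logins ip /var/log/auth.log | head
```

The first column is the number of failed attempts, so the top of the list shows which names the bots like best.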
But back to the original problem: the timeout. While I was digging through firewall rules and log files, the actual problem was still in the dark. I had already invested several hours in troubleshooting, consulted various AI models, done forensics on the UFW rules and checked the IONOS firewall configuration multiple times. Nothing.
After about six hours an idea came to me that in hindsight was so obvious that I had to briefly laugh at myself: instead of the numerical IP address I simply entered the domain.
ssh root@nicohartmann.dev
Connected.
Immediately. Without delay.
The reason apparently lay in the way IONOS handles DNS and routing internally. Traffic to the bare IP address I had been typing was evidently treated differently from traffic to the address the DNS name resolved to, a subtle difference in the network configuration that turned out to be the actual culprit behind my timeouts. Not the bots, not the firewall, not the hotel WiFi. Simply a routing problem that only disappeared once I went through the domain name. Six hours of debugging. A fix about two dozen characters long.
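In hindsight, one quick check would have narrowed things down much earlier: comparing the address typed by hand with what the domain actually resolves to. This is only a sketch, assuming a Linux client where getent is available; the helper names are invented for this example and 203.0.113.10 is a documentation placeholder, not the real server address. It does not explain what IONOS does internally, it only shows whether both paths point at the same address.

```shell
# Print every address a hostname resolves to, one per line (IPv4 and IPv6).
resolved_addresses() {
  getent ahosts "$1" | awk '{print $1}' | sort -u
}

# Succeed if the address in $1 appears on stdin as an exact, whole-line match.
ip_in_list() {
  grep -qxF -- "$1"
}

# Usage:
#   resolved_addresses nicohartmann.dev | ip_in_list 203.0.113.10 \
#     && echo "same address" || echo "the domain points somewhere else"
```

If the addresses differ, the timeout has nothing to do with SSH itself, and running ssh -v against both targets shows where each attempt gets stuck.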
Hardening: Now More Than Ever
Now, with a working SSH connection, I could catch up on what should have been done long ago: properly securing the server. And the look at the logs had underlined the urgency once more.
Cleaning up the firewall was the first step. UFW (Uncomplicated Firewall) offers a pleasantly clear abstraction layer over iptables. I checked the existing rules, removed unnecessary open ports and restricted incoming traffic to the essentials. At the same time I also adjusted the IONOS-internal firewall, which sits as an additional layer in front of the actual server and operates at the infrastructure level. Double protection never hurts.
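Reduced to commands, such a cleanup can look roughly like this. It is a sketch rather than my exact rule set: the port list is an assumption, the function name allow_only is made up, and the UFW variable exists only so the commands can be printed instead of executed.

```shell
# Command used to talk to the firewall; set UFW=echo for a dry run.
UFW="${UFW:-ufw}"

# allow_only PORT/PROTO...
# Deny all incoming traffic by default, then open only the listed ports.
allow_only() {
  $UFW default deny incoming
  $UFW default allow outgoing
  for rule in "$@"; do
    $UFW allow "$rule"
  done
}

# Usage, as root (assumed ports; a mail server needs several more):
#   allow_only 22/tcp 80/tcp 443/tcp
#   ufw enable
#   ufw status numbered
```

The order matters: the SSH port has to be allowed before ufw enable is run, otherwise the firewall locks out the very session that configures it.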
Fail2Ban came next. The tool monitors log files and automatically blocks IP addresses that stand out due to repeated failed login attempts. After three failed SSH logins within a short time, an IP ends up in jail and can't get through for a defined period. This not only slows brute-force attempts to a crawl but also noticeably reduces the noise level in the logs. The configuration for SSH is straightforward: Fail2Ban already ships with the corresponding filters, which just need to be activated and adjusted to your own needs.
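A jail along those lines could be written like this. Only the three attempts come from my setup; the time window and the ban duration are assumed values, and write_sshd_jail is a made-up helper name.

```shell
# write_sshd_jail FILE
# Write a minimal jail definition for SSH to FILE (overwrites FILE).
write_sshd_jail() {
  cat > "$1" <<'EOF'
[sshd]
enabled = true
# Ban after three failed logins ...
maxretry = 3
# ... within 10 minutes (assumed value) ...
findtime = 600
# ... for one hour (assumed value).
bantime = 3600
EOF
}

# Usage, as root:
#   write_sshd_jail /etc/fail2ban/jail.d/sshd.local
#   systemctl restart fail2ban
#   fail2ban-client status sshd
```

Putting the override into its own .local file leaves the packaged jail.conf untouched, so updates don't overwrite the settings.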
Unattended Upgrades closed another gap that I had simply never actively addressed: automatic security updates. Unpatched packages are one of the most common entry points for attackers, and manually remembering to regularly update a server is unreliable in the long run. With unattended-upgrades the system downloads and installs security-relevant updates on its own, without my involvement, without me having to think about it at three in the morning.
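On Debian and Ubuntu the switch behind this is a two-line APT configuration file. The sketch below assumes such a system and the conventional file name 20auto-upgrades; write_auto_upgrades is a made-up helper name.

```shell
# write_auto_upgrades FILE
# Enable the daily package list refresh and the daily unattended upgrade run.
write_auto_upgrades() {
  cat > "$1" <<'EOF'
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
EOF
}

# Usage, as root:
#   apt install unattended-upgrades
#   write_auto_upgrades /etc/apt/apt.conf.d/20auto-upgrades
#   unattended-upgrade --dry-run --debug
```

The dry run at the end shows which packages would be upgraded without changing anything, which is a good sanity check before trusting the automation.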
Changing the Locks
The server was running, the connection was established, and Fail2Ban was already watching over the incoming login attempts. But anyone who has looked at the auth logs and seen how tirelessly these bots work, second by second, hour by hour, knows: reactive measures aren't enough. The actual attack surface needs to be reduced. And the biggest attack surface was still SSH itself.
The default configuration of OpenSSH is functional, but not security-oriented. Port 22 is known worldwide, password authentication is active by default, and root login is directly allowed in many distributions. That is a combination that practically invites bots.
The first step was a port change to 2222. This is not a security mechanism in the actual sense. Security through obscurity is rightly not considered a real defense. Anyone who searches specifically will find the port. But in everyday use a non-standard port filters out the vast majority of automated traffic, because most bots only try port 22. The logs get quieter, Fail2Ban has less to do, and the signal-to-noise ratio for real attack attempts improves. A pragmatic compromise.
The second and significantly more important step was disabling password authentication in favor of SSH key authentication. Passwords can be guessed, leaked or cracked through brute force. The private half of a cryptographic key pair cannot be guessed, at least not with any reasonable effort, and it never has to leave my own machine.
The process is straightforward:
ssh-keygen -t ed25519
Ed25519 is the modern standard for SSH keys: compact, fast and cryptographically far more robust than the older RSA with short key lengths. The public half of the generated key pair then needs to be transferred to the server; the private half stays on the client.
An important point I almost overlooked at first: this step needs to be carried out on every machine from which I access the server. Anyone who installs a key for only one device and then disables password auth locks themselves out on all other devices. An unpleasant experience I was able to spare myself by thinking of it in time.
Only after key authentication had been tested and confirmed working on both of my machines did I set PasswordAuthentication no in the config file and restart the SSH service. From that moment on, a login without a matching private key is simply no longer possible, no matter how many passwords a bot tries.
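On the client, ssh-copy-id is the usual tool for this transfer. What ends up happening on the server is roughly the following, shown here as a sketch with a made-up function name; it assumes the public key file contains a single line, as ssh-keygen produces it.

```shell
# add_authorized_key PUBKEY_FILE AUTH_KEYS_FILE
# Append the public key to the authorized_keys file unless it is already there.
add_authorized_key() {
  key="$(cat "$1")" || return 1
  mkdir -p "$(dirname "$2")"
  touch "$2"
  grep -qxF -- "$key" "$2" || printf '%s\n' "$key" >> "$2"
  # With StrictModes (the default), sshd rejects key files that others can write to.
  chmod 700 "$(dirname "$2")"
  chmod 600 "$2"
}

# Usage on the server, once per client machine:
#   add_authorized_key laptop.pub ~/.ssh/authorized_keys
#   add_authorized_key desktop.pub ~/.ssh/authorized_keys
# Or directly from each client:
#   ssh-copy-id -i ~/.ssh/id_ed25519.pub root@nicohartmann.dev
```

Each device gets its own line in authorized_keys, which also means a lost laptop can be locked out later by deleting exactly one line.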
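Put together, the SSH changes described above fit into a few lines of configuration. This is a sketch under assumptions: it presumes an OpenSSH setup that includes files from /etc/ssh/sshd_config.d/ (the default on current Debian and Ubuntu), a classic sshd service rather than socket activation, and the PermitRootLogin line is my suggestion here rather than something covered above. The function and file names are made up.

```shell
# write_sshd_hardening FILE
# Write the hardened SSH settings to FILE (overwrites FILE).
write_sshd_hardening() {
  cat > "$1" <<'EOF'
Port 2222
PasswordAuthentication no
# Assumption: root may still log in, but only with a key.
PermitRootLogin prohibit-password
EOF
}

# Usage, as root, with a second SSH session kept open as a safety net:
#   ufw allow 2222/tcp
#   write_sshd_hardening /etc/ssh/sshd_config.d/10-hardening.conf
#   sshd -t && systemctl restart sshd
#   sshd -T | grep -i -E '^(port|passwordauthentication|permitrootlogin) '
```

sshd -t validates the configuration before the restart, and sshd -T prints the settings that are actually in effect. The latter is worth a look because several Port lines add up: if a Port 22 line is still active elsewhere, the server keeps listening on both ports.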
Don't Forget the Deployment Process
One detail that's easy to forget when you change your SSH port: automated processes that also communicate via SSH. In my case this affects the GitHub Actions pipeline that automatically deploys my portfolio website to the server on every push to the main branch.
This pipeline connects to the server via SSH and had been doing so via port 22 up until now. After the port change to 2222 the deployment went nowhere. The solution is simple, but you have to think of it: adjust the port accordingly in the action configuration. A port: 2222 in the SSH parameters of the action, a new commit and the deployment process was running smoothly again.
The Forgotten Neighbor
Alongside the actual server, Mailcow is also running on my machine, a self-hosted mail server solution that I operate for my own domain. And while I had been intensively taking care of SSH, Mailcow had been somewhat neglected in the meantime.
The admin password was one of those passwords that I had set once but then lost track of. So: reset it and replace it with a new, memorable but secure password. Sounds trivial, but it's one of the most common security vulnerabilities in self-hosting: default passwords or forgotten, never-changed initial passwords.
Mailcow comes with its own security mechanisms, but they have to be actively enabled. Fail2Ban functionality is integrated into Mailcow as well, where it monitors failed login attempts on the web interface and the mail protocols. The setup follows the general Fail2Ban logic, but can be configured directly via the Mailcow interface.
Particularly important was setting up two-factor authentication, both on the admin account and on the actual mailbox. 2FA is by now the minimum standard for every account where a compromise can cause real damage. A mail server admin account definitely falls into this category: anyone who breaks in there can not only read mails, but control the entire mail flow, set up redirects and in the worst case gain access to linked services via password reset mechanisms.
Knowing What's Happening on the Server
With the acute security measures done, a question came up that had been simmering in the back of my mind for a while: how do I find out when something goes wrong again, before I notice it by chance?
The answer was a custom monitoring setup. And since I was already at it, I didn't just want to use the collected data internally, but make it directly accessible: as a public status page on my portfolio website. The result is the project sysstats, an overview that shows me at any time how my server is doing: CPU load, memory usage, and also a list of the IP addresses banned by Fail2Ban. The latter will presumably get significantly shorter with the new security measures, which I consider a success, even if the long list of banned IPs had its own morbid charm.
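The raw numbers behind such a page are easy to get at on Linux. The following is not the actual sysstats code, just a sketch of how the three values could be collected; it assumes the standard /proc files and the usual output format of fail2ban-client, and all function names are invented for this example.

```shell
# load_average FILE: 1-minute load average from a /proc/loadavg-style file.
load_average() {
  awk '{print $1}' "$1"
}

# memory_used_percent FILE: used memory in percent from a /proc/meminfo-style file.
memory_used_percent() {
  awk '/^MemTotal:/ {total = $2}
       /^MemAvailable:/ {avail = $2}
       END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' "$1"
}

# banned_ips: read `fail2ban-client status sshd` output on stdin,
# print the banned addresses one per line.
banned_ips() {
  sed -n 's/.*Banned IP list:[[:space:]]*//p' | tr -s ' ' '\n' | sed '/^$/d'
}

# Usage on the server (fail2ban-client needs root):
#   load_average /proc/loadavg
#   memory_used_percent /proc/meminfo
#   fail2ban-client status sshd | banned_ips
```

Taking file paths and stdin as input keeps the parsing separate from the data source, so the same functions can be tried out against saved sample output.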
Conclusion
What started as an annoying timeout on a Tuesday morning in Croatia ended with a server that is considerably more robust than before. Port change, key authentication, Fail2Ban, automatic updates, hardened mail server, 2FA and monitoring. That is not excessive effort. That is the foundation every self-hosted server should have.
But the actual lesson was a different one: how much background noise there is on the internet, how active and automated the attacks on every public server are, and how little it takes to effectively protect yourself against them. The bots won't stop. But they'll have significantly less fun with my server from now on.