A college student, a company network that blocks everything, and too many SSH tunnels to count.
An NVIDIA DGX Spark, a Grace Blackwell GB10 with 119 GiB of unified RAM, sits in a company office. It has no ethernet. Its only internet is a Windows laptop sharing its WiFi via an ICS hotspot. That company network blocks:
controlplane.tailscale.com
Upgrade: DERP header (only allows WebSocket)
Separately, the university network blocks controlplane.tailscale.com, so official Tailscale doesn't work there either. Two different hostile networks, same need for a self-hosted alternative.
Only TCP 443 reliably works. Everything is built around that single constraint.
Self-hosted Headscale: an open-source Tailscale coordination server, running in Docker on a home Ubuntu VM. It replaces controlplane.tailscale.com with headscale.arthurlin.dev, which neither the company nor the school blocks, because it looks like any other HTTPS domain. One solution for two hostile networks.
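A quick way to confirm the "only TCP 443 works" constraint from inside a given network is a plain TCP connect test. The sketch below is an illustration, not part of the original setup; the function name `tcp_reachable` and the timeout value are assumptions.

```python
import socket


def tcp_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Calling `tcp_reachable("headscale.arthurlin.dev")` from the company or university network would show whether the coordination server's port is open. A successful connect only proves the port is reachable; it says nothing about whether a firewall later strips headers or blocks a protocol upgrade.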
Custom DERP relay — Tailscale's relay protocol, self-hosted on the same home server behind nginx. The key insight: the home server has a Hinet business IP (125.229.144.147), which the company network allows. Residential IPs are blocked. One IP range difference made the whole thing possible.
The DERP relay runs with -verify-clients — it queries the local Headscale to confirm connecting nodes are legitimate. Random internet scanners get rejected.
nginx reverse proxy — the home server (jason) accepts HTTPS on port 443, terminates TLS with a Let's Encrypt cert (DNS-01 via Cloudflare API), and proxies the request through the Tailscale mesh to the DGX at 100.64.0.3:8080.
WiFi auto-reconnect watchdog: the company WiFi (iii_wireless, WPA2-Enterprise, PEAP) disconnects roughly every 3 hours. Group Policy enforces connectionMode=manual, and the setting can't be changed even with admin rights. A PowerShell watchdog runs in the interactive desktop session (the only context with access to the PEAP credentials), checks the connection every 20 seconds, and on a drop reconnects, waiting up to 60 seconds for 802.1X authentication to complete.
This took 5 attempts to get right. Scheduled tasks can't access PEAP credentials. Processes started via SSH die with the session (Windows job objects). The solution: Invoke-WmiMethod to launch from the desktop session, surviving SSH disconnects.
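The real watchdog is a PowerShell script; the Python sketch below only illustrates the check-and-reconnect loop with the 20-second interval and 60-second authentication window described above. It does not address the desktop-session launch problem. The function names, the 5-second polling step, the English-locale `netsh` output, a single WiFi interface, and a profile named the same as the SSID are all assumptions.

```python
import subprocess
import time
from typing import Callable

SSID = "iii_wireless"   # network name from the write-up
CHECK_INTERVAL = 20     # seconds between health checks
AUTH_PATIENCE = 60      # seconds allowed for 802.1X to finish
POLL = 5                # assumed polling step while waiting


def parse_connected(netsh_output: str, ssid: str) -> bool:
    """Parse `netsh wlan show interfaces` output (English locale assumed)."""
    fields = {}
    for line in netsh_output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields.get("State") == "connected" and fields.get("SSID") == ssid


def wifi_connected() -> bool:
    result = subprocess.run(["netsh", "wlan", "show", "interfaces"],
                            capture_output=True, text=True)
    return parse_connected(result.stdout, SSID)


def request_connect() -> None:
    # Assumes the saved WLAN profile has the same name as the SSID.
    subprocess.run(["netsh", "wlan", "connect", f"name={SSID}"])


def ensure_connected(is_connected: Callable[[], bool],
                     connect: Callable[[], None],
                     sleep: Callable[[float], None] = time.sleep,
                     patience: int = AUTH_PATIENCE,
                     poll: int = POLL) -> bool:
    """Reconnect if needed, then wait up to `patience` seconds for the link."""
    if is_connected():
        return True
    connect()
    waited = 0
    while waited < patience:
        sleep(poll)
        waited += poll
        if is_connected():
            return True
    return False


def watch() -> None:
    """Run forever: check, reconnect if down, then sleep until the next check."""
    while True:
        ensure_connected(wifi_connected, request_connect)
        time.sleep(CHECK_INTERVAL)
```

Passing the check, connect, and sleep functions as parameters keeps the timing logic testable without a real WiFi adapter.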
NetworkManager persistence — when the WiFi drops, the ICS hotspot hiccups and the DGX loses its connection too. NetworkManager is set to autoconnect-retries=0 (forever) so it keeps trying to rejoin the hotspot instead of giving up after one failed handshake.
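One way to apply that setting is through `nmcli`, where `connection.autoconnect-retries` set to 0 means retry forever. The sketch below builds and runs that command from Python; the helper names and the profile name `laptop-hotspot` are hypothetical, since the write-up does not name the DGX's connection profile.

```python
import subprocess


def retry_forever_args(profile: str) -> list[str]:
    """Build the nmcli command that makes a profile reconnect indefinitely."""
    return [
        "nmcli", "connection", "modify", profile,
        "connection.autoconnect", "yes",
        "connection.autoconnect-retries", "0",  # 0 = never stop retrying
    ]


def apply_retry_forever(profile: str) -> None:
    # Needs permission to modify the connection (typically root).
    subprocess.run(retry_forever_args(profile), check=True)
```

For example, `apply_retry_forever("laptop-hotspot")` would update a profile of that name on the DGX.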
The DGX is enrolled in Headscale with forced_tags: [tag:dgx] — server-enforced, can't be removed by anyone on the machine. The ACL policy:
Translation: personal devices can SSH in (port 22) and reach this web server (port 8080). The DGX has zero outbound access — can't reach the home server, can't reach the laptop, can't use the exit node, can't browse the internet through the mesh. One-way street. Multiple people share this DGX with sudo access, so isolation isn't optional.
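The one-way behavior follows from the default-deny model of Tailscale-style ACLs: traffic is allowed only if some rule accepts it, so a node that never appears as a source cannot initiate anything. The sketch below models that with a toy evaluator. The rule is a reconstruction from the description above, not the author's actual policy file, and the group name `group:personal` is hypothetical.

```python
# Toy policy in the shape of a Tailscale/Headscale ACL file (assumed content).
POLICY = {
    "acls": [
        {
            "action": "accept",
            "src": ["group:personal"],              # hypothetical group name
            "dst": ["tag:dgx:22", "tag:dgx:8080"],  # SSH and the web server
        },
    ]
}


def allowed(policy: dict, src: str, dst: str, port: int) -> bool:
    """Default deny: return True only if an accept rule matches exactly."""
    for rule in policy["acls"]:
        if rule["action"] != "accept" or src not in rule["src"]:
            continue
        for target in rule["dst"]:
            # "tag:dgx:22" splits into host "tag:dgx" and port "22".
            host, _, ports = target.rpartition(":")
            if host == dst and ports in ("*", str(port)):
                return True
    return False
```

With this policy, `allowed(POLICY, "group:personal", "tag:dgx", 22)` is true, while `allowed(POLICY, "tag:dgx", "group:personal", 22)` is false because no rule lists `tag:dgx` as a source. A real evaluator also resolves group membership, port ranges, and wildcards, which this sketch omits.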
Headscale, Tailscale, DERP, nginx, Let's Encrypt, Cloudflare DNS, Docker, NetworkManager, PowerShell, WPA2-Enterprise
Built with: Claude Code (Opus 4.6) + Codex (GPT-5.5) collaboration. Two AI models reviewing each other's work, chatting about security, and arguing about Windows scheduled tasks.