OPNSense High Availability - 1 VM, 1 IP!
- Published: 11 Jul 2024
- In this video I show how to perform OPNSense 'HA' using a single VM and 1 IP. This technique relies on a homogeneous network setup across identical nodes, so that Proxmox failover can bring the VM up on any of them.
Recommended Hardware: github.com/JamesTurland/JimsG...
Discord: / discord
Twitter: / jimsgarage_
Reddit: / jims-garage
GitHub: github.com/JamesTurland/JimsG...
00:00 - Introduction to High Availability
01:18 - Network Overview
04:05 - Proxmox Overview
11:51 - Physical Overview
13:02 - Testing and Failover
15:12 - Ping During Failover
17:26 - Speed Tests
21:47 - Testing Migration in Real Time
23:43 - Outro
Category: Science
Hey Jim, awesome material as usual. As for the hiccup: switches learn MAC addresses and assign them to a specific physical port, so when you fail over to a new physical machine there is some timeout happening on both the WAN and LAN switches. Additionally, many switches have MAC spoofing protections, so that might explain it as well. Not sure how to work around this though. I would hope managed switches would have some functionality to allow "jumping" MAC addresses.
Thanks. All the VMs and physical machines should still have the same MACs though. Suspect it could be ARP related as the ports for LAN Trunk and WAN do change. I'll do some more digging. Either way, I can deal with a few seconds of outage for the bonus of this setup.
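The two comments above describe the same situation from different angles: the firewall VM keeps its MAC, but after failover that MAC appears behind a different switch port. A minimal sketch of how a learning switch might behave in that case is below. The class, the aging time, and the MAC address are all hypothetical and not taken from the video or from any vendor's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class LearningSwitch:
    """Toy model of a switch MAC address table (illustrative only)."""

    aging_seconds: float = 300.0  # assumed aging time; real switches vary
    table: dict = field(default_factory=dict)  # mac -> (port, last_seen)

    def learn(self, mac: str, port: int, now: float) -> None:
        # A frame sent *from* `mac` and received on `port` creates,
        # refreshes, or moves the table entry.
        self.table[mac] = (port, now)

    def lookup(self, mac: str, now: float):
        # Return the port for `mac`, or None (flood) if unknown or aged out.
        entry = self.table.get(mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.aging_seconds:
            del self.table[mac]
            return None
        return port


switch = LearningSwitch()
FIREWALL_MAC = "bc:24:11:00:00:01"  # hypothetical VM MAC

# The VM is running on node A, which is plugged into port 1.
switch.learn(FIREWALL_MAC, port=1, now=0.0)

# The VM restarts on node B (port 2) but has not sent a frame yet,
# so the switch still forwards its traffic to the old port.
stale = switch.lookup(FIREWALL_MAC, now=10.0)

# The first frame from node B moves the entry to port 2.
switch.learn(FIREWALL_MAC, port=2, now=12.0)
fresh = switch.lookup(FIREWALL_MAC, now=12.5)

print(stale, fresh)
```

In this model the outage lasts until the VM transmits from its new location or the old entry ages out, which is one possible explanation for a short gap in pings; it does not model port security or ARP caches on other devices.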
Jeez that seems like a lot of moving parts and nail biting 😉
I went with Unifi shadow mode so i can sleep at night. Proxmox for everything else though. Great video as always Jim
@@substandard649 thanks, yeah I'm really glad they've finally created HA! If I had a udm I would probably go the same route, quite expensive though for me to buy 2 off the bat
@@Jims-Garage you can't put a price on a good night's sleep Jim 😀 And that hair transplant you'll inevitably need will cost way more!
Nice one. I do exactly this also. I agree with plan to eliminate the small switch and trunk the ISP VLAN to the pve hosts. That's what I do.
Awesome, thanks for sharing!
Hi Jim. Would love to see a video of you explaining how you’ve managed to keep all of your hair through this journey with the MS-01 workstations. Keep up the great work 👋
Haha! It hasn't been simple, lots of work has gone into this behind the scenes.
Stop scaring me.. Mine is on backorder.
@@emanuelpersson3168 ha, don't worry. You don't need to go mad like I have 😂
@@Jims-Garage The end game for me is to go down that route just like you. But i don't think i will ever be able to... My dream is to learn Kubernetes and to get a "Proxmox HA CEPH" cluster and in that a "K3s HA Cluster".
@@emanuelpersson3168 awesome, well hopefully I've done enough to document my trials and tribulations and help you along the way!
Outstanding content as usual Jim! Way different from a regular software/hardware installation... Thanks for sharing with enough detail to make it understandable and applicable! Keep up the good job
My pleasure! Glad it was useful 😃
I'm using basically the same setup, but with pfSense and different hardware. It has been rock solid for over a year, no outage of any kind and great performance, can recommend!
@@PlatyBZH that's reassuring to hear, thanks for commenting
F8ck strikes! Your content deserves more likes and attention
Thank you. Just need to keep plugging away
@@Jims-Garage Was it an automated strike for something like "fag packet maths"? I don't think the septics' content filters speak proper English...
Just moved to proxmox. In my previous VMware setup, I used Starwind vSan to HA pfsense. I plan on doing that again in proxmox or just use clustered ZFS and replication to make it even simpler.
Nice, that should work well.
I'm sure you've considered it, but with CARP on the various WAN/LAN segments and using OPNsense's internal HA scheme - inclusive of state via 'pfsync' - you have HA in a way that allows you to patch/reboot/put into maintenance one or the other without taking an outage. OPNsense's HA is pretty tolerant of version disparity, too - allowing you to have the "backup" instance behind / ahead of where the "prod" instance is per your preferences. If you won't want to have *2* instances taking up resources, however, it's not a fit.
@@organon69 thanks for that, it's a good suggestion and something that I considered. Ultimately I wanted to try what I believe to be the easiest option first, especially given my cluster is identical. Fortunately this seems to work well, albeit it's not perfect.
@@Jims-Garage Totally get it. Get "The Now" working, noodle on "The Next". One thing to watch if you consider an OPNsense-driven HA setup is how your ISP device allows DMZ/IP Passthrough to the firewall. Generally they allow a single IP (which would ostensibly be the CARP-based VIP) but sometimes don't like MAC-change shenanigans for the same IP. That is, CARP VIPs aren't discrete MACs - the VIP is an additional IP on the same int/MAC - so in an HA failover scenario the ARP behaviour on the ISP device needs to not freak out that the MAC behind that "DMZ" IP has changed all of a sudden. That dynamic alone may make you stick with the setup you walked through in the vid.
I've been looking for a video series where someone actually uses the MS-01s as their main homelab with Proxmox. The more videos I watch and the more I read about all this, the more I want to buy 3 myself and essentially replicate your setup. What has your power consumption been like with all of these? Do you still use a separate classic rack server for mass storage?
It's running all 3 at around 150W which is a huge improvement over my old setup. These run my workloads but I also have a TrueNAS NAS attached to the network for long term storage.
A lot of 'yeah-nah-yeah' moments in this one
@@amosgiture not sure what that means, but I did state that it was live.
Hey Jim, can you list the hardware that you have used in this video, such as the switch your ISP is connected to? Awesome video which gave me some ideas, or just blew up my network😆. Thank you.
@@WilsonVelez hey, please check out my earlier MS-01 videos, I believe it's linked on there, cannot remember off hand. To be honest any basic switch will do for that part.
@@Jims-Garage Yeah, my apologies, after writing the comment I noticed your "Recommended Hardware" link. Again, thank you for your videos.
What is the technology supporting the 10.0.0.1/29? Thunderbolt? Thank you.
Yes, it's a thunderbolt ring network.
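For readers unfamiliar with the `/29` notation in the question above, the prefix arithmetic can be checked with Python's standard `ipaddress` module. This is only a sketch of what the subnet contains; which addresses the Thunderbolt ring actually uses beyond `10.0.0.1` is not stated in the source.

```python
import ipaddress

# The interface address quoted in the comment above.
iface = ipaddress.ip_interface("10.0.0.1/29")
network = iface.network

# hosts() excludes the network and broadcast addresses.
hosts = list(network.hosts())

print(network)              # 10.0.0.0/29
print(network.netmask)      # 255.255.255.248
print(len(hosts))           # 6
print(hosts[0], hosts[-1])  # 10.0.0.1 10.0.0.6
```

A `/29` leaves 3 host bits, so 8 addresses in total and 6 usable ones, which is enough for a small point-to-point ring between a handful of nodes.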
Also why didn't you go with a Mikrotik switch that has the required SFP ports? Something like the CRS310-1G-5S-4S+IN
Since you have OPNSense, you aren't using a Dream Machine or something like that, so would it not be easier/cheaper to go Mikrotik?
The original switch was bought around 5 years ago when I also had a UDM Pro. Cheapest option I could think of was to add the USW-Agg.
Great video. Ur network seems a bit complicated. I'm working on my own atm. I've no clue how I should make some things :D especially thinking about upgrading to 10gig. I saw u had no sophos instance. Are u not using sophos anymore?
@@oli1505 no, this video is about OPNSense. Sophos is still good though
@Jims-Garage so u're using both? That would be an interesting video of how that's working.. I'd also appreciate another sophos video. 🤟 There is not much out there.
It's hard to get things done without any practice.
So general best practice videos how things should be designed/work together would also be nice 😁
@@oli1505 no, I moved completely to OPNSense. Long story but it was to do with my new internet (I explained it in a video). Long story short, I could go back to Sophos now but I'm enjoying OPNSense at the moment.
@@Jims-Garage ohh I guess I missed that one.. I'm gonna watch it 👍
Isn't Ceph running on your Thunderbolt connection? Last time you showed it, it had frequent packet loss. I would expect this to cause a performance penalty.
It is, but even with the retries it was able to hit 2.5GB/s. My understanding is that the performance I see is typical of Ceph as it's not designed with raw performance in mind.
@@Jims-Garage I think an OPNsense update is more of a 4k IOPS workload than the sequential writes that the 2.5GB/s figure seems to represent
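The distinction in this comment comes down to throughput = IOPS × block size. The sketch below reads "4k" as 4 KiB blocks, which is an assumption, and the 10,000 IOPS figure is purely hypothetical, not a measurement from the video. Only the 2.5 GB/s number comes from the thread.

```python
def throughput_mb_s(iops: float, block_size_bytes: int) -> float:
    """Throughput in MB/s (decimal) for a given IOPS rate and block size."""
    return iops * block_size_bytes / 1_000_000


def iops_needed(throughput_bytes_s: float, block_size_bytes: int) -> float:
    """Operations per second required to sustain a given throughput."""
    return throughput_bytes_s / block_size_bytes


KIB = 1024
MIB = 1024 * KIB

# Hypothetical small-block workload: 10,000 IOPS at 4 KiB.
small_block = throughput_mb_s(10_000, 4 * KIB)

# The 2.5 GB/s sequential figure, expressed as 4 MiB operations per second.
large_block_ops = iops_needed(2.5e9, 4 * MIB)

# The same 2.5 GB/s would need this many 4 KiB operations per second.
small_block_ops = iops_needed(2.5e9, 4 * KIB)

print(f"{small_block:.2f} MB/s")
print(f"{large_block_ops:.0f} ops/s at 4 MiB")
print(f"{small_block_ops:.0f} ops/s at 4 KiB")
```

The point of the arithmetic is that a high sequential number can be reached with a few hundred large operations per second, so it says little about how the same storage handles many small writes.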
@sku2007 i run ceph and guest vlans on shared 2x10Gbps LACP LAG for each host and not aware of any issues. I would think tb links would outperform, but maybe it's a driver issue?
What happened to the other HA setup you had with the 2 opnsense vms? I am still using that setup, way faster failover.
@@SharkBait_ZA I wanted to avoid double NAT and I only have a single IP.
@@Jims-Garage Sorry, I forgot about that. My setup has public IPs, so only single NAT for me. 🙂
Just a quick couple of comments. I am doing something like this, but what I do is have a small 4 port switch where I have 1 WAN in and 2 WAN out. I only have a single copy of OPNSense running, which I fail over between 2 different Proxmox machines. I can also easily live migrate between the two. One last VERY important note for MS-01 owners: the 2.5G LAN port with management abilities WILL NOT work as the LAN port in OPNSense, as DHCP does not work on it for some reason.
@russellmm My guess is that is vpro related and a workaround is likely to disable vpro in the bios.
@@johnwalshaw yes, it is related but there is no way to turn that off in the MS-01 BIOS that I am aware of. Minisforum does not have the best BIOS support.
@russellmm On my 3xLenovo P340 towers running Proxmox, in addition to the 2x10Gbps I use for primary, I use the onboard 1Gbps vPro NIC. It is configured as a Linux bridge. From memory, the vPro and host IP had to be on the native VLAN, and tagged (trunked) VLANs for everything else work fine. I also use this as a secondary path for Ceph. I have not tested PCIe passthrough of a vPro NIC. I think vPro is configured as static and not DHCP in this case. I checked my notes but I'm not sure where I documented all this. I was very happy with the serial over IP feature and recommend it.
@@russellmm It is, mine came with vPro off. I tried it, it sucks, so I turned it off again. Unfortunately no time to tell you exactly where it is, but it's there.
@@MrakCZ i'll check again, thanks
Are you avoiding using LXC containers for any particular reason? The question is unrelated to OPNsense. Also I gotta ask, is your YT guidelines strike related to your AI thumbnails?
I prefer VMs for security and simplicity, although I've covered LXCs in the past and have used them.
The strike was for Plex. Apparently that's against their policy (for me at least).
Is the reason CARP won't work that you can't specify the MAC address of the WAN CARP virtual IP, so the fibre ONT won't talk to the new MAC when it fails over?
Kind of a noob question: this doesn't put you under double NAT, right? Even with your future plan of not using the small switch?
No, there's no double NAT here.
It might be helpful for you to say why you think that this would add an extra level of NAT, as it's likely just a misunderstanding.
All this does is add a switch between the incoming WAN connection and the routers, so a packet from WAN hits the switch and whichever node is currently acting as the router receives the packet. The other 2 aren't listening for it and don't respond.
As far as the devices (both ISP on WAN side and on LAN side) using the router are concerned, this is exactly the same as having just one machine permanently acting as the router.
Tell us a little about defguard, an open-source solution with real WireGuard MFA/2FA & integrated OpenID Connect SSO.
I have a VPS with a public ("white") IP address, as well as a domain that is linked to Cloudflare.
you are really making this much more complicated than necessary and conflating things - opnsense will run fine on 50 dollar boxes - break down and make the opnsense HA setup on 2 separate boxes and leave proxmox and ceph to do their own thing - running db and load balanced applications - this way you keep things much simpler and discrete