It depends on whether the sum of features and specs justifies the price. If you manage to convince the audience that the router as a whole is worth it, then people will accept a small price increase for a better CPU. It also depends on whether we subconsciously see it as a high-end or mid-range router, and thus which other routers on the market we start comparing it with.
I am excited about your router, but I prefer the lower cost. Even very expensive consumer-grade routers have around 2GB of memory, like the Asus ROG Rapture GT-BE98 with its two 10-Gbit ports. I think 4GB of RAM is already overkill. I'm going to keep watching your videos. Thank you!
@@davidmcken WireGuard is designed to run fast on the CPU, using algorithms best suited to it. The flip side is that nobody is going to build a hardware accelerator for them, because they already run so fast on the CPU anyway.
@5:40 DING! Wrong. The GPL is very clear on this. ANY modification done to the kernel source MUST be made available to anyone to whom you've provided a binary. In short, any kernel you compile, your customers must also be able to compile. HOWEVER, external, 3rd party _modules_ can be 100% closed source. I've been here, it can be messy. And yes, giant manufacturers like Broadcom and NXP don't care about breaking the law, or forcing you to. (And there's some debate about making changes to the kernel simply to facilitate external closed modules. For example, undoing the "taint" caused by non-GPL modules, so one can use a GPL-only exported function.)
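For anyone curious how the kernel itself tracks this distinction: a module's declared license is baked into its binary, and loading a non-GPL module "taints" the running kernel. A quick way to inspect this on any Linux box (a sketch; `wireguard` here is just an example module name, and the exact taint bits vary by kernel version):

```shell
# Show the license a module declares, without loading it
# (swap in whatever module you're inspecting)
modinfo -F license wireguard

# Non-zero means the running kernel is tainted; the 'P' flag
# (lowest bit) indicates a proprietary module was loaded
cat /proc/sys/kernel/tainted

# Per-module taint flags for everything currently loaded
# (empty output means nothing loaded is tainting the kernel)
grep -H . /sys/module/*/taint 2>/dev/null
```

Note this only reports what the kernel sees at runtime; it says nothing about whether the GPL's source-distribution obligations are being met.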
Agreed, patches and modifications to the kernel are covered by the GPL and need to be public. A single distributed binary is all it takes, and even then it's still a poor argument for closed-source code within this project. If he had held NXP's feet to the fire and made them comply with the GPL in the first place, I would be more open to using their secret sauce.
@@francoisscala417 The issue is that the module must still hook into kernel APIs, and conntrack doesn't expose all the APIs needed for offloading. There is some work in progress, but so far the possible HW acceleration is very limited. These kinds of patches live mainly in the core part of the kernel, not in external modules. They also patched userspace utilities for offloading, meaning you are running even more unknown, modified software that has to hook into the kernel in a different way. I'm thinking of tcpdump, which cannot see the offloaded traffic; NXP probably has a patched version. Then there are the various VPN stacks, the ppp daemon, etc. It will become a huge mess very easily.
@@francoisscala417 They *could* release the binary module and open-source a wrapper to load it into the kernel. However, that isn't what they do in most cases, and this project isn't anywhere near big enough to convince NXP to change their ways... Without being portable to newer kernels, and without being buildable independently of NXP, this entire project is lifetime-limited e-waste. Once you no longer want to support it, or it gets too hard to keep porting the changes, that's the EOL for your product.
@@RobSchofield Linux is under the GPLv2 license, which requires distributors to also make the kernel source code available. BSD/Apache 2/MIT licenses don't force you to publish source code when you distribute a binary of your kernel; that's why Apple used the FreeBSD kernel initially.
@@MrMyFaker Apple never used the FreeBSD kernel. They use a kernel called XNU, which is open source; it uses the FreeBSD standard library and many BSD utils. But by that logic, Android is just FreeBSD.
@@redstone0234 My understanding is that the XNU kernel, based on Mach, handles hardware and process management, resource allocation, and hardware drivers (IOKit), and runs a FreeBSD compatibility layer so that the FreeBSD ports tree compiles and works under macOS, e.g. bash, htop, and all the GNU utilities.
I am concerned that relying on the hardware offload capabilities will spell an early end for this device, for example if a bug in handling certain packet types is found (e.g. with new protocols afterwards) or the limitations that this places on additional features. For example the Ubiquiti EdgeRouters can’t do QoS with the hardware offload enabled - just an example that in general you end up being locked in to the hardware capabilities.
And when that happens, it will fall back to the CPU, like on the EdgeRouter. The feature will be available, just not offloaded, so performance will be the same as not having HW offload in the first place. No HW offload on this class of CPU means gigabit speeds at best. It is much better to have it; in the edge cases where someone needs something that isn't offloaded, performance will be exactly the same as if HW offload had never been implemented. The amount of CPU power you need for 10Gbps routing depends a lot on the features you need. My router has 4x 4.1GHz Core i5 8th-gen cores and 8GB of DDR4 RAM; with IPS it can only do 7.5Gbps on BSD (pfSense) and 15.5Gbps on Linux. In this particular scenario, 1.2GHz vs 1.8GHz won't make a difference.
This is a good point I didn't think of. I've also commented that the hardware offloading has no bearing on the CPU choice; we need to know what changing the CPU WILL affect, not what it won't.
@@tomazzaman That's not how drivers work. For example, the syscalls can change from release to release. Binaries that work with BSD will not work for Linux.
Yeap. See Also: Broadcom's "Open Source" NDK. (it's a huge binary blob with a few lines of shim code.) If they have an NDA and access to SDK _source_ then they can build the driver(s) for whatever kernel versions supported - or they're willing to port to. Sadly, this is the way of the modern world... want performance better than a hamster in a wheel, you'll need hardware packet processing, and no one openly documents that stuff.
@@jfbeam I wonder how hard it would be to convince Linus Torvalds or someone in the Linux kernel community to provide open-source hardware acceleration for networking, especially if you could pay them from the router's development budget. This would effectively create open-source competition for the weird NDA garbage that you just obtained.
I don't want to wait until you release a bug fix kernel for a security issue that has already been patched months ago in existing distros' kernels. This makes the product a non-starter. If having to wait who knows how long for patched security issues in the kernel is a requirement to have decent performance with low power, I would prefer just to build a mini form factor PC with an AMD 5800X, pay the extra cost of the electricity, and do 10Gb line speed using 50% CPU.
You actually don't need such a beefy and power-hungry machine. There is the BananaPi R4, which has great mainline Linux and OpenWRT support (NAT HW flow offloading, WiFi encryption offloading (WED) working in snapshot, and even QoS offloading with TC Flower in the making), already able to route 10Gbps with 0% CPU usage, and it costs around $120 on AliExpress. I would rather support companies and chip makers that care about mainline Linux and provide upstream patches for their SoC features (MediaTek, in this example).
@@elektronick88 Yeah I was looking into the BananaPi R4 precisely for all the reasons you cited... unfortunately, there's barely any documentation (most I could find was in Chinese, as *pictures*), no information on how to put the OS together, and it's effectively not even close to a finished product. With my 25 years experience with Linux, I would still take a few weeks to build a usable router out of it. But I suppose I can wait a few months until the whole thing has matured more.
@@RuddODragonFear Not sure what documentation you need. I have a BPI-R3 and I just clone OpenWRT and build firmware for it with the things I like (adblock, tools like tcpdump, container runtimes like Podman, etc.); same story for the BPI-R4. You just have to go through menuconfig and choose what you need. There is also an active community willing to help if you have any issues or questions. I agree that some things are not finished yet or require some work and configuration, but hey... this is DIY, not a consumer product. Still, I think the result is worth the effort :)
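For reference, the build flow described above is roughly the following (a sketch; the exact target and profile names for the BPI-R4 depend on your OpenWrt revision, so treat those menuconfig hints as approximate):

```shell
# Grab OpenWrt and its package feeds
git clone https://git.openwrt.org/openwrt/openwrt.git
cd openwrt
./scripts/feeds update -a
./scripts/feeds install -a

# Pick Target System "MediaTek ARM" / subtarget "Filogic" and the
# Bananapi BPI-R4 profile, then add extras (tcpdump, adblock, podman...)
make menuconfig

# Build the firmware image; the first build bootstraps the whole
# toolchain, so it takes a while
make -j"$(nproc)"

# Flashable images land under bin/targets/mediatek/filogic/
```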
Correct me if I'm wrong but this (5:35) seems like a violation of the GPL to me. I don't think you are allowed to distribute derivative kernels under non compatible (i.e. non copyleft) licenses. That means that the proprietary software can only run in loadable modules and not patches built into the actual kernel binary. I might have misunderstood what you meant here though.
@@EraYaN Closed-source kernel patches violate the GPL anyway, and by the time someone packs and sells this thing, i.e. distributes it as part of the kernel, the recipient has every right in the world to get the code for both the kernel patches *and* the drivers. To bypass the GPL with dkms etc., you need the user to interactively install the needed pieces; if it's automated, it counts as distribution, meaning it breaks the GPL.
Umm... only the parts written by others need to be attributed, with source made available. Anything you write custom can be proprietary; that's exactly how Cisco, Oracle and tons of others do it. Patches are arguable, because someone else wrote the patch to make the modification, so one could argue that the patch itself is an original work. (Similar to claiming your artwork isn't original because you used my canvas material to make it, or Crayola's red paint to accent the sunset...)
@@jamess1787 If you change even a single bit in a GPL'd source file, YOU. MUST. PUBLISH. YOUR. CHANGES. Cisco, Oracle, "and tons of others" _do_ publish their changes to GPL'd code. Their products are built from a sizable amount of non-GPL stuff, and they obviously don't publish any of that. (It used to be easier to find those tarballs, but Cisco being Cisco makes you jump through hoops to get anything these days.)
The problem with closed source is that, say, 3 years down the line, what happens when NXP won't update their drivers to work on anything newer than a 3-year-old kernel release?
It's not closed source in that way. We get the sources as a manufacturer, we just can't share them. So in case NXP stops working on it (which is extremely unlikely), we can pick it up ourselves.
@@tomazzaman That just kicks the problem down the road. How about a test without HW offloading, so we can compare the results and make an informed decision? Currently we only have data for one side.
@@tomazzaman The most likely issue will be the end-of-life for the chip you're using, and thus, newer SDK's dropping support. Once you have the source, you will always have that code. (your license to use it can expire, but that doesn't magically erase it from your drives.)
The Linux kernel sees around 60+ CVE patches a week (2024 Linux Foundation numbers). If you have to manually provide the kernels due to proprietary binary blobs, doesn't that take the customization out of this device? In fact, there are very interesting scheduling behaviour changes in newer kernels as well.
To be fair, the Linux kernel makes a CVE entry for _every_ bug/regression that appears, simply because they can't know whether it has security implications or not. The number of CVEs that actually matter and have any impact is significantly lower than that.
@@commanderguy-rw7tj This isn't totally true. Kernel upstream assigns a CVE for anything that COULD have security implications, not every bug outright. To be fair, a vast majority of bugs in the kernel can be considered a security concern due to the ring level of the relevant code...
I'll echo the other comments that depending on how NXP is integrating code with the Linux kernel, restricting distribution of the source may be a violation of the GPL, so I'd steer clear of any solution that forces you to do that. Furthermore, I've been doing hardware long enough to know that these whiz-bang proprietary hardware solutions promise the world and seldom deliver in practice, and getting updates to fix bugs in their proprietary software and silicon, particularly in the long term, is a virtual impossibility. Went down this road with Broadcom in the early 2000's and they completely lied to us about the maturity of the silicon and proprietary APIs. We were forced to throw away over a year's worth of development and start fresh with our own solution based on a FPGA. Never again.
I'm sorry, but the third-party closed-source drivers would kill the entire product for me. If I have to rely on a chain of third parties to provide kernels, I might as well just get a proprietary solution to begin with, like a Ubiquiti or MikroTik, that gives me an entire polished software suite.
Really, even things like MikroTik have their own share of problems despite being "large". Projects like these are cool, no joke; I'd hardly be able to do even a small % of it myself. But that doesn't make it practical. Either it has unique features, unique offerings (like OSS or whatever), or at least a lower price. If it can't offer any of that, I don't know why one would go with this.
Raw forwarding throughput is cool, but I'm assuming firewall throughput is probably a more practical data point for most of the enthusiasts following this project.
Go for multiple SKUs. Faster CPU and more RAM should neither require a different PCB layout nor different SW, just different placement. The little extra cost for managing added SKU should be more than offset by larger reached customer base.
Logically, it's a no-brainer to reduce the cost of the CPU and RAM, since they're barely being utilized. However, people like extravagance, and depending on the final price of the product, it may make more sense to keep it as it is. You could also give the cheaper model the reduced specs while putting the improved specs into the aluminum-case model.
I thought the whole point was to save power, not to have processing alongside routing. Normally, if your router uses its CPU, it uses it for networking; but if the CPU is totally freed from that purpose, what purpose could you give it anyway? (Am I missing something?)
So you suggest providing only a pre-built kernel, with no option for the user to recompile or upgrade it without losing acceleration? Since it doesn't seem like you can redistribute the patches?
I've followed this project for a while; while I'm fine with your other claims, I have some questions/concerns about the NAT-P / PAT. I get that this is a consumer-level device, but how much memory is the hardware offload module going to have? Looking at the LS1046A block diagram, my question centers on the conntrack monitor module (you show a command reading data from it). What's the max number of connections this module can handle? This would also be a limiting factor for the SPI firewall. A less pressing question: with the L2 bridging, is it VLAN-aware? (Yes, I see it supports VLANs, but is the bridging aware of VLAN tags?) As much as I would love to say people will pay for the extra CPU speed, money talks, so while you are trying to get established I would not want "too expensive" to be a reason people reject your product.
I'd love this to be an awesome router first and foremost, not another do-it-all home server. So I think the lower specs are plenty with the hardware offloading capabilities you demoed.
I haven't gotten that far yet. You can configure what exactly you want to offload, so for tcpdump you'll have to turn offloading off, and then back on again once you're done debugging or whatever it was you were doing with tcpdump.
@@tomazzaman Would that require a restart of network services / interrupt the network? If yes, I think it's a big problem; if not, I think you would need to test the CPU while it's not offloading, to see if you can go for the lower CPU/HW.
tcpdump is for dumping the packets that traverse the kernel. If a packet doesn't traverse it, you will not see it. That's why, in high-end routers like Juniper or Cisco, you must tell the ASIC to send the packet through the control plane.
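On the Linux side you can observe this distinction directly: once a flow is handed to a flowtable, conntrack marks it as offloaded and tcpdump on the forwarding path stops seeing its packets. A sketch assuming the standard nftables flowtable path (interface name and port are placeholders):

```shell
# Flows that have been pushed to the flowtable carry an [OFFLOAD] flag
conntrack -L 2>/dev/null | grep -i offload

# List the flowtables currently configured
nft list flowtables

# tcpdump only sees packets that still traverse the kernel path;
# for an offloaded flow that's typically just the initial handshake
tcpdump -ni eth0 tcp port 443
```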
I hope you're aware that using closed-source patches and drivers in Linux breaks the GPLv2 license under which the Linux kernel is licensed. According to the license, if you do so, the drivers and patches are designated derivative works and are thus licensed under GPLv2. This is the license, and thus the law!
Well, it depends. For example, ZFS code is licensed under the CDDL, which is incompatible with the GPL, but the binaries are not, so you can use ZFS with Linux. I don't know how those patches are applied here, but I would presume they know what they're doing to get around the GPL. Or maybe not :).
@@darukutsu In principle, you can't ship Linux with ZFS for that reason, but some distros ignore it, and so do individual users, since OpenZFS is, after all, open source.
I'll take the lowest specs, assuming the offloading works correctly; this will make the device very, very affordable and beat everyone in the market. So have two versions: Pro and Lite.
Depending on NDA'ed, GPL-violating, proprietary junk makes this a much less interesting product. One of the appeals of a custom project like this is the ability to tinker, or just having the option to keep it alive if for whatever reason you can no longer support it, which is a real risk for a small company. Without acceleration, the router would be almost useless at 10GbE speeds. Aren't there offloading network chips available, rather than depending on NXP's SoC? I'm not opposed to proprietary blobs or a few required closed-source tools, but NXP's patcheroo craziness and hostility is too much. Doing your own FPGA-based solution would probably be too expensive, both in hardware and development cost 😞
The first packet in a new connection goes through the firewall normally and if it passes, all the subsequent ones from the same connection then bypass it.
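That "first packet through the firewall, rest bypass it" behavior is exactly how nftables flowtables work on stock Linux, for comparison. A minimal sketch (interface names are placeholders; `flags offload` additionally requires NIC/driver support, and without it you still get the software fast path):

```shell
nft -f - <<'EOF'
table inet filter {
    flowtable ft {
        hook ingress priority 0; devices = { eth0, eth1 };
        flags offload;
    }
    chain forward {
        type filter hook forward priority 0; policy drop;

        # Established flows get added to the flowtable and bypass
        # this chain from the second packet onward
        ip protocol { tcp, udp } flow add @ft

        # Only the first packet of each connection is evaluated here
        ct state established,related accept
        iifname "eth1" oifname "eth0" ct state new accept
    }
}
EOF
```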
I think it is clear from the comments, and from my own thoughts, that it would be good to expand on the applications where hardware offloading is beneficial and the ones where it is not, so people can better decide whether giving up CPU and memory is a good idea.
I'm a bit concerned that the packets handled by hardware offloading won't be available in the kernel - will the device support packet inspection or other forms of packet manipulation outside of the dedicated chip?
Simple routing with the iperf3 example is the easy part. Now check NAT and other IP functions, like DSCP marking/rewriting, or VXLAN/MPLS, using the full pipeline of the SoC to check 10Gbps performance. And with regard to CPU, things like Bird/Quagga running routing protocols like OSPF/BGP: the more prefixes, the more CPU/RAM. Maybe overkill for CPE devices.
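A sketch of the kinds of iperf3 runs that stress more than the single-flow happy path (192.0.2.1 is a placeholder address for a host behind the router):

```shell
# On a host behind the router
iperf3 -s

# From a client on the other side, through NAT
iperf3 -c 192.0.2.1 -P 8 -t 30      # 8 parallel TCP streams, 30 s
iperf3 -c 192.0.2.1 -P 8 -t 30 -R   # same, reverse direction (download)

# Small UDP datagrams stress packets-per-second rather than raw
# bandwidth, which is where routers usually fall over first
iperf3 -c 192.0.2.1 -u -b 10G -l 64
```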
@@tomazzaman How many prefix lists, ACL rules, NAT entries, etc. are possible in hardware? I remember devices which rebooted when the NAT table was full - hello bintec! :)
I would say that, at least for the initial launch, using the lower-end components to slightly reduce the cost might be the better route, especially if the system is as resource efficient as you're claiming; maybe after getting yourself established you can release a "pro" version of the router with the higher-end components. Also, is there any chance that PoE will be available to any of the ports?
Without offloading, I think you could also reach that performance using VPP graph nodes in userspace with an AF_XDP socket as the entry point. You can use dhcpd and other apps to make it a great home solution.
Great video again! Just wanted to know: did you happen to run any testing without the hardware offloading? I'm curious about the result. On the price aspect: I think cheaper is the way. But cheaper means we rely more on the 'black box' that is the binary, and on the support NXP is willing to provide. If the cheaper variant can still handle some 10G routing without hardware offloading, why not!
I would like to see 2 models on the market. Those who want the more powerful one and want to shell out for it can, those who want the cheaper version can have that.
Thanks for sharing the journey of both the hardware and software decisions. This is very interesting. Will it have a stateful IPv4 and IPv6 firewall built in? What size of ARP table and neighbor table can it support? What will be the size of the firewall session table?
Go with the lower-spec device, because it will be more affordable and more competitive. It is meant for home usage. Then, once you start selling it, make an upgraded version; maybe you'll add more ports, or PoE+, or something else...
How is performance altered with IP filtering? One feature of pfsense I use is pfBlockerNG, which blocks IP ranges used by known bad actors. In my current config there are ~16000 ranges being blocked. Is this also completely offloaded? Are there limits to the number of ranges we can block with offload?
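For scale reference, nftables handles this kind of list with interval sets, which store CIDR ranges in a tree rather than a linear rule list, so ~16000 ranges is small on the software side; whether the offload path honors such a set is exactly the right question. A sketch of the software-side equivalent (table/chain/set names are made up; addresses are documentation ranges):

```shell
nft add table inet blocklist
nft add chain inet blocklist fwd '{ type filter hook forward priority -10; }'

# An interval set holds CIDR ranges efficiently; auto-merge collapses
# overlapping entries as they are added
nft add set inet blocklist badnets '{ type ipv4_addr; flags interval; auto-merge; }'
nft add element inet blocklist badnets '{ 203.0.113.0/24, 198.51.100.0/24 }'

# One rule each direction covers the whole set
nft add rule inet blocklist fwd ip saddr @badnets drop
nft add rule inet blocklist fwd ip daddr @badnets drop
```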
The numbers you are showing are impressive, but we are only talking about pure L3 routing, no firewalling. My question to you would be: what CAN'T be offloaded, and how will it affect performance? Examples: advanced packet filtering with 1000+ rules, OpenVPN, IDS/IPS, SSL, etc. Is there any kind of encryption/decryption acceleration built into the CPU, such as AES-NI, that would benefit those features? In many cases, the CPU sizing for a firewall is directly linked to the throughput of the encrypted tunnels it can deliver (IPsec, OpenVPN, WireGuard, etc.). With that in mind, I don't think 1.4, 1.6 or 1.8 GHz will make much of a difference; the 1.8 GHz will be around 30% faster than the 1.4 GHz, which clearly won't double or triple the encrypted tunnel bandwidth. I am curious to see the performance of a single OpenVPN tunnel (OpenVPN is single-threaded) on your hardware. For the RAM and CPU, I would go for the most cost-effective option in terms of the capacity it provides (biggest bang for the buck), not the cheapest. Which CPU costs less per GHz? What amount of RAM costs less per GB? From what you say in the video, the 1.6 GHz CPU with 8 GB of RAM seems to be the most cost-effective combination. Choosing a middle ground between the most powerful and the cheapest will probably attract the highest number of early adopters, be commercially more viable initially, and accelerate your return on investment. After securing a steady flow of sales, releasing a lite (less powerful) version and a "beast" (more powerful) version would probably be possible, based on which one gets the most requests.
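On the crypto question above, the quickest way to gauge what a given CPU can do per-cipher is `openssl speed`; a sketch (this only measures the method, the numbers on a dev machine say nothing about this board):

```shell
# AES-GCM benefits from AES-NI / ARMv8 Crypto Extensions when present
openssl speed -evp aes-256-gcm -seconds 1

# ChaCha20-Poly1305 (WireGuard's cipher) is built to be fast
# in plain software, with no dedicated hardware needed
openssl speed -evp chacha20-poly1305 -seconds 1
```

Comparing the two columns side by side shows immediately whether a chip's crypto extensions are being picked up.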
Not only that, it shows only a single connection with max packet size, which is not a realistic test; at that speed and under those conditions any CPU would sit near 0% utilisation (even without any dedicated hardware acceleration). A comparable SoC (not the same, but comparable in cores and speed, with hardware acceleration, made by a reputable company whose SoCs are used in most consumer routers) in a commercial product (I will not name the company, but the name starts with A) barely passes 650Mbps in real use cases, even though in tests like this it gets the full 10Gbps.
Hi Tomaz, regular follower of your project here. I really like the openness and explanations; I do appreciate it. Maybe in future revisions, as this device will be a tinkerer's heaven, if you want to drop the cost even further, consider adding an NVMe or SATA storage option, make the eMMC optional, and maybe use SO-DIMMs too, so the customer can decide what storage + RAM configuration they want. I don't know if the CPU has enough PCIe lanes, but for a router appliance some kind of SATA-based storage would be enough, if the CPU has a controller, or perhaps you could source a SATA controller chip; I have no idea if that makes sense from a cost perspective, or how much complexity it would add to develop and revise the board. Just chiming in, throwing out some ideas and options. I love your work, keep it up!
Thank you! The CPU does indeed come with a SATA controller, but we opted to use those SerDes lanes for WiFi 6/7 instead. We'll see how the market responds to this first version and act accordingly.
It depends. Hardware offloading means the CPU and RAM will be free to do "something else", and I think it depends on what that means. It would be good to know which kinds of configurations or features break the hardware offload, and what difference a slightly better CPU makes there. Anyway, I think that in most cases a cheaper CPU with less RAM is the better option. If HW offload is disabled for a feature, it's not like 400MHz of extra CPU clock will make a night-and-day difference in throughput.
Nice video :) Personally I would strongly prefer to keep the current specs (1.6 GHz CPU and 8 GB of RAM), but I would buy it even with the lowered specs.
I liked your venture because of your open sharing and open-source numbers. I've been watching since the first videos and plan to buy a finished version; it has been fascinating to see your process. Seeing you unable to talk about certain parts due to NDAs and closed-source software is a turn-off. I don't want some closed-source module for this to run properly, and I won't be buying this if it's using closed-source OS software. In addition, if you add software for packet inspection, NAT, or VPN tunneling, does the offloading still work as well? For a modern router there is more to it than pure routing of packets.
My concern with all this is still security patching and zero day response, especially for 3rd party libraries, OS and SoC drivers etc. Performance is great but I am still not seeing how secure this product's lifecycle will be.
Tomaž, sorry to sound like a noob, but would this router be a drop-in replacement for the routers supplied by our Slovenian fiber internet providers like T-2, now that the "fiber everywhere" initiative is progressing nicely? To me, the CPU/RAM question boils down to how much extra load is anticipated for non-routing activities; otherwise you just end up with hardware waiting as fast as it can. I do like the proposal made by others to have "basic" and "pro" versions, if you can tolerate the separate SKU and BOM parts. Do you happen to know if the HW offloading allows port mirroring for analysis purposes? Future marketing slogan for the device: "Routing from Slo is Fast!" [OK, I'll go now. No need to push me into the Soča :) ]
I'm not a network guy, but I understand that the traffic handling is moved off the CPU and memory, which is very nice. But won't there be use cases where having these resources is a must-have? Such as running services for traffic analysis, ad blocking, maybe VPN tunnels, etc... I'm curious.
VPN tunnels can typically be offloaded to router hardware, though there are a bunch of tunnel types here too; where they can't (and yes, that will be the case for WireGuard, for instance), they fall back to the CPU. Traffic analysis is usually handled by a separate process, which can gobble up a mirror of whatever is being routed; if you're talking about IPS, that's a slightly more intense task, but just getting metrics on the requests is easy. Ad blocking, due to the nature of HTTPS, is usually done based on origin; it's easy enough to take a list of firewall policies and feed it into the hardware as an ACL.
I'd be interested in learning if inter-VLAN routing via tagged interfaces requires bypassing the HW offload of the firewall. Most solutions I've used in this space see a huge performance hit once you try to do this, and you usually get encouraged to separate your firewall from your inter-VLAN routing performance when doing so. I have a stupidly overcomplicated network in my house, and right now my core switch is doing inter-VLAN routing, but I'd love to have everything being done on one device.
It really depends on how many physical SFP+ and actual 10G-compatible ports you want in the final version of your product. If it's going to be a solution with up to 8 ports, you can definitely go with the smaller spec; if you go 16 and up, you'll need the additional power. You should also consider an upgrade path for PoE++ input and PoE+ output on multiple ports; today that is an essential feature for more expensive network appliances. Considering the price of your CPU alone, I suspect your solution will end up above 250€ or even close to 400€, and at that price point I expect a lot more from a device, considering what you can get on the market right now.
I wouldn't consider deploying these routers, since I'd be fully dependent on you for upgrading to future kernel versions. Even decades old industry giants mess up device lifecycles on the regular. Asking for that level of trust in a relatively small operation like this sounds unreasonable to me. On the other hand, if the entire software stack was open source I'd be more than willing to evaluate it for serious use, as the features do sound rather compelling. It would make not blatantly violating the GPL easier as well :^)
Are you still planning on this being capable of IDS/IPS inspection? I thought I remembered that being the original plan. I'm wondering if all this hardware acceleration and low power goes out the window as soon as you start doing that, though. That was a problem on a lot of earlier-gen Ubiquiti devices, where their 10Gb UniFi router went from 10Gb throughput to ~1Gb with IPS on, because it had to move the traffic to general CPU cores; newer-gen UniFi devices changed that by using different SoCs to get more IPS speed. This product doesn't need full decrypt/encrypt interception inspection with certificates on end devices, but having a standard IPS (probably Suricata running the ET Open ruleset?) with pattern and packet-signature matching to detect malware or hack attempts in encrypted traffic would be great to have. I know not everyone shares this opinion, but to me all firewalls should have that type of traffic inspection as standard nowadays, as a basic stateful firewall just isn't good enough anymore for evolving threats. I'd be fine with a 1.4GHz CPU as long as it has the same HW accel blocks, since the CPU cores don't really matter. Edit: Looking at the Layerscape CPUs, I see that the 1046A does not have a pattern-matching HW engine, so I'm guessing that if you enable IDS/IPS the speed will drop a ton, like I was thinking. I also see that the next-gen LS2084A CPU steps things up a bit (for a lot more money): it increases the security engine from 10Gb to 20Gb, adds a 20Gb data compression/decompression engine, and adds a 10Gb pattern-matching engine. So that would be the CPU to use to get 10Gbps IDS/IPS, I would think. Sadly it costs ~$285, though probably a bit less when buying by the thousand. At that point, though, I don't know if it's even worth using the next-gen CPU when you could step up to an even higher-end one, the LX2082A, which gets you 50Gbps HW-accelerated throughput for only an extra $15.
lol. Though I don't see the LX2082A having a pattern-matching engine like the LS2084A has... so maybe that's why it's not much more money for so much more throughput: it's missing the pattern-matching part that the "lower end" part has.
The whole "FreeBSD is faster at networking tasks" sounds like the same tired old arguments that that community keeps claiming based on some very old benchmarks and "Netflix is using it so it must be fast". 😄 Nobody has provided any modern benchmarks to prove that this is still the case. On the contrary, the few modern benchmarks that I've seen seem to favor modern Linux. 🙂 VPP will obviously totally smoke kernel networking though, as will hardware offloading. Those hardware offloading engines can definitely be very impressively efficient and performant. 😀
Indeed. FreeBSD was faster in the long long ago. Then Linux vastly improved. But since then, let's say "less enlightened people" have messed with linux networking - eg. removing the route cache - so I can't say either is very effective these days. Hardware routing and switching will ALWAYS be faster than software.
It's not about benchmarks, it's about the APIs and facilities to explicitly support HW acceleration in the networking stack, which Linux currently lacks.
@@l3xforever Not sure exactly what you're referring to. Mind providing some more details? 🙂 My experience is that hardware acceleration is generally better supported in Linux than in FreeBSD, but I'm sure there are some exceptions.
I think the main purpose of the router is to perform flawlessly in the workloads you demonstrated in the video, so the 1.6GHz or even the 1.4GHz CPU would not compromise the performance of the router. Similarly with 8GB vs 4GB: given that neither option significantly affects routing performance, you should go with the option that makes the product as competitive in the market as possible. Consideration should be given to what kind of UI and quality-of-life features you plan to make available in the product. But I don't think a UI, even with heavy graphical sophistication and maybe historical logging, would be affected much by a lower-clocked CPU or 4GB of RAM. My 2 cents.
I think it would make sense to keep the hardware modest. 1.5 to 1.6GHz would be fine; 2 or 4GB of RAM would be fine. But I guess the big part of this would be the shipping feature set.
I already see two versions of this router emerging, depending on what traffic can really be offloaded and what software configurations would affect (and disable) hardware offloading, because that is important on a 10G router... A cheaper version is preferred, yes, for the customers and for you guys... and that's probably your main goal. But a higher-end version might make sense too: maybe the faster CPU or something else added, maybe more connectivity options, or more RAM (for example). That might be a different product, one that completes a product line. ;)
I have to say, I've been following this project for months now and I've been extremely excited this entire time. If the device were open to preorders at the end of the last video I may have even pulled the trigger this early. The issues with the ASK have completely deflated my enthusiasm. I'm a security professional by trade, and the inability to patch my systems is a complete deal breaker for me. I sincerely hope that this has been a misunderstanding, or if not then I hope NXP pulls their head out of their rear because this is completely unsatisfactory.
Please do a bandwidth test with the slower CPU and the smaller RAM amount (less RAM might mean fewer memory channels, and that could really impact bandwidth) to be sure those don't impact hardware packet processing.
I would say the per-device unit cost is something to consider. When you are manufacturing hundreds or thousands of units, that extra 70 EUR per device adds up quickly. So once you have your final design, your challenge is to reduce the cost per unit as much as possible to keep manufacturing costs down and profits up.

I have a few thought questions: With your current CPU/hardware design, can you go faster than 10GbE? Is 25, 40, or 100GbE in this hardware's future, where you might need the CPU cores? What impact will full stateful packet inspection have on CPU usage (this is the big question)? You can move packet forwarding to an ASIC, but how well does the hardware handle layer 3 through layer 7 packet inspection? I would think there would need to be kernel intervention beyond just simple packet forwarding.

Before deciding on removing hardware, let's see a near-final prototype. With that development hardware, can you underclock it to see what 1.0 or 1.4GHz means for the system's operation? From what you have shown, there is no need for a 1.6GHz 4-core CPU; full packet inspection might change that opinion.

On the open-source front, I could see two versions (depending on the hardware dev-kit licensing). The advanced version of your board design could use the hardware acceleration kit with its licensing fees, and the "basic" version would do all-in-kernel routing and packet inspection. The former could use a lesser CPU and RAM; the latter would need the 1.6GHz and 8GB of RAM. Or you could use the same hardware, with the only difference being whether you use the closed source with the acceleration kit or open source with kernel-based packet routing. This isn't a fully formed plan, only an idea.

I realize you have limited hardware for testing, but I wonder what the results would be if you added more senders with the one receiver. The idea is to see if you can flood the 10GbE link until you start to get retransmission errors. This would show how the hardware manages too much data to forward. All in all, your hardware and your design team are doing a great job. You need to find the niche where your hardware has a market.
Aim for a €149 retail price, and go with 4GB and 1.4GHz if you need to. With offloading this good, there's no need to go up in specs; better to first create a product with mass appeal, then later launch a more expensive option. If you want to make this more than a one-product series, you need mass appeal, like the first EdgeRouter Lite, which really launched Ubiquiti.
Depends on whether I can run additional software on the machine to make use of the additional RAM and CPU. I am new here; maybe there is already an intended use for the CPU/RAM that I am not aware of. If not, I would lean towards the cheaper option.
I could see a market for undecided users who would purchase the router if it's inexpensive enough, and would later purchase a top-of-the-line one if it met expectations.
It's difficult to make a choice here: are you saying that due to the efficient traffic offloading, the reduction in fitted RAM size and CPU clock speed will make no difference to overall performance of the unit? If so - sell it cheaper. Do these reductions impact overall routing performance, or future-proofing of the design? If yes, then it has to be the more expensive spec. A long time ago you suggested variants at different price/performance points (I suggested different coloured cases to identify the variants) - is this what you are suggesting, multiple production models or a single high-perf model only? For single model only, then no choice - hi-perf. For multiple models at different spec levels, the question answers itself - but more than one variant in production is inevitably going to have knock-on effects later on (production and manufacturing complexity, support of multiple variants in differing builds of firmware, updates and bugfixes, etc. etc.). I don't have enough info yet to decide what I'd like, but if the buy-in price difference is so great, then I would suggest it's *your* prerogative to make this choice for your launch product. Given the ethos (fastest, best) it implies the expensive variant is the one. A price/performance/resource comparison of the low-price/high price variants would help here. Another one of your excellent charts in a follow up video to show the trade-offs more clearly?
My question is, what would 8GB of RAM even be used for if we kept it? Do you reckon we could set up a Home Assistant server directly on the router? Would that benefit from the extra specs that much? Personally, I am just fine with the reduced specs. Specifically, 8GB of RAM is overkill in my opinion, even for a top-of-the-line variant.
Well... speaking about performance vs price, it would be much easier to draw a conclusion after seeing the practical tests and an approximate price. Otherwise, a very interesting project. Good luck with your routers!
RE closed source patches: would you be able to hire external security reviewers to verify the safety of the patches and ensure third party verification of the safety claims? I wouldn't mind seriously considering the product if either that or disabling any proprietary binaries/source was an option at runtime.
Seeing how people respond with concerns like bugs, platform deprecation, and so on, I'd say that it would be a good idea to consider making sure that the router is capable of reaching basically the same results in terms of performance with the kernel bypass *disabled*. If you can downsize to 1.4 GHz while keeping CPU utilization under 70-80% on full load, then go for it.
With regards to cost savings, my question would be: are the CPU, RAM, and eMMC storage on sockets, so that we could easily upgrade them afterwards? If we can easily upgrade the device ourselves, then I don't see a problem with shipping lower speed/capacity/cost components. If you're going to be like Apple and solder them all onto the logic board, then I think you're going to end up needing to provide multiple versions with different capacities and speeds, much like Apple does, because that would be our only way to upgrade.
Definitely figure out how hard it is to break hardware offload, and what the performance is when you do. But, even assuming the performance is great without hardware offload, I'd still like to see what you can do without the ASK. If you're stuck with the ASK, I would probably just stay within the unifi product line and not bother with fragmenting my network stack.
I'm only slowly entering the world of custom networking gear (instead of ISP-provided gear with a couple of the cheapest APs and switches possible), so apologies for the novice questions and random thoughts. Would it make sense to choose a mix of the two, so the 1.4GHz with the 8GB of RAM? To me it seems like the 0.2GHz is much less of a downgrade than 50% less memory. I don't know what sort of services would benefit from another 4GB of RAM or, perhaps more interestingly, a 0.2GHz faster processor. Would something like WireGuard run noticeably quicker at 1.8 vs 1.4GHz, and would the 4GB of RAM be enough for it? What are some services you would usually run on a router? Nonetheless, I do think the first version should probably be the fastest, to get your name out and known for solid, if slightly overbuilt, routers.
I mean for comparison, I'm using an off the shelf router with what I believe is a 1.2 GHz CPU and 128 MB of RAM, serving as an edge router for 4 APs and 7 devices actively connected at any given time (give or take). Is the latency good? Absolutely not, but it's held up fine for several years. Pretty much the only issues I've had were some DNS outages (quad9) and some issues with the ISP modem.
@@tomazzaman Maybe hardware packet processing isn't as powerful with the slower CPU (for marketing reasons), or it requires full RAM bandwidth. Please also check the slower CPU with less RAM (which could mean fewer memory channels) for hardware-accelerated performance.
At the start you mentioned that FreeBSD is better than Linux for routing, but I've found the opposite to be true. I was testing OPNsense vs OpenWrt on an old SFF PC with an Intel i5-9500 and an X540-T2 NIC. OpenWrt was able to successfully reach 8Gbps over a single iperf connection over the internet. OPNsense struggled to even hit 3Gbps, with one CPU core pegged at 100% (which makes me think that maybe NAT or conntrack is single-threaded... not sure).
I know this would be a significant overhead for logistics, but can't we have two versions? For example, I want the best of the best no matter the cost (within reasonable ranges), while others might not need a faster CPU or more RAM.
Whether the A72 cores run at 1.4GHz, 1.6GHz, or 1.8GHz is irrelevant, because if there is an edge case where the traffic cannot be offloaded, it will just be slow. So yeah, save money on the CPU. As for the RAM, what do we need 8GB for? Most routers ship with 256MiB or 512MiB, some with more. Even if you're doing some amount of BGP, 4 gigs is plenty. But what I want to know more about are the offloading capabilities. It's probably wishful thinking, but I want good performance with WireGuard and traffic shaping (fq_codel, CAKE).
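For reference, the shaping being asked about here is a one-liner on stock Linux; a hedged sketch (the interface name and bandwidth are placeholder assumptions, not this router's actual configuration, and shaping like this runs on the CPU, so it would typically pull flows out of any hardware fast path):

```
# Sketch: CAKE shaping on the WAN interface (requires root; "eth0" and
# the 9gbit rate are example values only)
tc qdisc replace dev eth0 root cake bandwidth 9gbit ethernet

# Verify the qdisc took effect and inspect its stats
tc -s qdisc show dev eth0
```

Whether an offload engine can coexist with CAKE is exactly the open question in this comment: tc shaping needs to see every packet, which offloaded flows by definition bypass.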
I figured it out right away. That's why Windows has an advantage with well-supported hardware, and Apple makes both the hardware and the software, so there should be no issues there. (Although I have noticed on older hardware that, even though they control both, they removed very optimized drivers in favor of something that doesn't perform as well; but it's so old that people don't care.)

Drivers are what enable full use of the hardware, offloading operations to the ASICs (Application-Specific Integrated Circuits), and the software needs to call the driver functions to enable that hardware processing. The same issue described here with Linux is present in "alternative firmware" for home routers/access points, like the awful DD-WRT (and OpenWrt, for that matter) and other variants, where the developers have reverse-engineered the hardware to provide support; those reverse-engineered drivers are usually basic and process everything on the CPU.

For those wondering what I'm talking about, think of it this way: you have a pretty decent GPU in your system, but no Internet connection or bundled drivers for it. Windows enables it using the "Standard VGA Adapter" driver. It can enable a high resolution (SVGA, XGA and the like), but window composition (moving windows, drawing) is bad, and it cannot draw video frames fast or accurately enough; simple tasks take up all your CPU cores. That's what happens when you cannot offload processing to the many auxiliary processors and ASICs and utilize the hardware you own to the fullest. Pretty interesting video.
Why not have 2-4 mounting options? RAM is drop-in replaceable, same with that CPU. Two extra reels on a pick-and-place is not that much; though they will occupy a few more slots, it's feasible, since you may free up that much space just from resistor and capacitor optimizations. But when it comes to logistics there may be problems, since you will need some way to mark the variants both visually and for the software. That's simple, though: you can use a resistor/LED to visually indicate what it is, or laser-engrave a mark, or use a sticker/marker mark. As for the software knowing what it has, a `top` command should be enough, but you could also use a free IO pin to read a pull-up/down, or simply write it in flash/EEPROM.
If the routing performance is identical with lower-tier parts, I don't really see the need for the more expensive HW components. An alternative could be keeping the faster processor and larger RAM size while going above 10Gb/s speeds.
How many ports will be supported (depends on PCIe lanes)? I could see a use for 2 WAN ports and at least 2 LAN ports, plus an out-of-band management port. Any free slots for expansion? I know I'm dreaming :-)
The hardware offloading has impressive performance and low power consumption. My guess was going to be 6 watts; for it to be less than 2 watts was surprising. However, if the offloading stops users from being able to use WireGuard, OpenVPN, or rules... that's gonna be complicated, because for the price of this, combined with its 10Gb target market, it's gonna need to do those things, and fast. I fear this will be a product that is amazing for everyday users who just need a fast router, but I suspect those people have 1Gb or less internet speeds. And the enthusiasts who have 10Gb (like myself, I'm very lucky!) want the ability to run all tasks very quickly. I think you need to show some results of what routing on just the CPU can do, and what it can do when handling VPN traffic too. This is definitely important for a high-priced router.
Do you prefer a slightly slower version of the CPU if it makes the device cheaper, or do we go all in?
I'd go for the cheaper one for more possible customers.
I vote keep the CPU the same. That means those who want a pure OSS build can still get decent performance.
What about two versions?
It's a tough one, but I would benchmark Wireguard performance and opt for the cheaper alternative if the numbers are still great.
Big plus for Wireguard!
Wireguard is stateless so I would be surprised if it can't be hardware accelerated once the crypto algorithms are supported.
I don't think he can show any tests that would make NXP look bad. NDA and pre-sales agreements are active so all we get is fluff from here on out.
Wireguard doesn't run in any hardware acceleration - even on full-fat AMD or Intel CPUs... It's highly unlikely that it can be offloaded.
Oh, that's netting that no one should cut. GPL my beloved ;(
The solution is simple: provide NXP as binary modules
@@francoisscala417 They *could* release the binary module, and then open source a wrapper to load the binary module in the kernel. However this isn't what they do in most cases - and this project isn't anywhere near big enough to convince NXP to change their ways...
This just means that without being transportable to newer kernels and being able to be built without relying on NXP for everything, this entire project is lifetime-limited e-waste.
Once you don't want to support it anymore, or it gets too hard to keep porting changes - that's the EOL for your product.
Unpublished kernel patches for a published kernel binary sounds like an outright GPL2 violation.
Is that not covered by the LGPL? And why GPL-type licensing only? There are alternatives (look at what Apple and Sony do with FreeBSD under its license).
@@RobSchofield Linux is under the GPLv2 license, which requires distributors to also include (make available) the source code of the kernel.
BSD/Apache 2/MIT licenses don't require you to publish source code when you distribute a kernel binary; that's why Apple used the FreeBSD kernel initially.
@@MrMyFaker
Apple never used the FreeBSD kernel; they used a kernel called XNU, which is open source.
It uses the FreeBSD standard library and many BSD utils, but by that logic Android is just FreeBSD.
@@redstone0234 My understanding is that the XNU kernel, based on Mach, deals with hardware and process management, resource allocation, and hardware drivers (IOKit), and it runs a FreeBSD compatibility layer so that the FreeBSD ports tree compiles and works under macOS (bash, htop, and all the GNU utilities).
It sounds like NXP is out for some GPLv2 violation, or is that v3? Anyway, wouldn't that mean you'd be infringing on the GPL as well!?
I am concerned that relying on the hardware offload capabilities will spell an early end for this device, for example if a bug in handling certain packet types is found (e.g. with new protocols afterwards) or the limitations that this places on additional features. For example the Ubiquiti EdgeRouters can’t do QoS with the hardware offload enabled - just an example that in general you end up being locked in to the hardware capabilities.
There is just TCP/UDP offload so far; everything else sits above that.
And when that happens, it will fall back to the CPU, like on the EdgeRouter. The feature will be available, just not offloaded; in that case, performance will be the same as not having HW offload in the first place. No HW offload on this kind of CPU means gigabit speeds at best. It is much better to have it, and in the edge cases where someone needs something that is not offloaded, the performance will be exactly the same as if HW offload had never been implemented. The amount of CPU power you need for 10Gbps routing depends a lot on the features you need. My router has 4x 4.1GHz Core i5 8th-gen cores and 8GB DDR4 RAM; with IPS it can only do 7.5Gbps on BSD (pfSense) and 15.5Gbps on Linux. In this particular scenario, 1.2GHz vs 1.8GHz won't make a difference.
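The fall-back-to-CPU behaviour described here is also how mainline Linux's own (limited) flow offload works: only established flows added to a flowtable take the fast path, and everything else hits the normal stack. As a hedged sketch, an nftables flowtable config looks roughly like this (interface names are placeholders, and the `offload` flag only works where the NIC driver actually supports hardware flowtables):

```
# nftables ruleset fragment -- flow offload sketch, not this router's config
table inet filter {
    flowtable ft {
        hook ingress priority 0;
        devices = { eth0, eth1 };   # placeholder interface names
        flags offload;              # drop this line for software fast path only
    }
    chain forward {
        type filter hook forward priority 0; policy accept;
        # established TCP/UDP flows bypass the slow path via the flowtable;
        # the first packets of each flow still traverse the full stack
        ip protocol { tcp, udp } flow add @ft
    }
}
```

The design consequence is exactly what this thread debates: anything the flowtable can't express (VPN termination, shaping, IPS) stays on the CPU.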
@@andibiront2316 Yes, so you can't cheap out on CPU and memory, or the project might fail.
@@MelroyvandenBerg It depends. The performance difference from 200/400MHz is probably minimal. But we need benchmarks to be sure, I guess.
This is a good point I didn't think of. I've also made a comment that the hardware offloading has no bearing on the CPU choice; we need to know what changing the CPU WILL affect, not what it won't.
A closed source kernel driver from the SoC manufacturer ???
The problem is that you are stuck with a specific Linux kernel release that they support
Not really, no. We get the sources, we just can't distribute them, meaning we can patch whichever kernel version we want with them.
@@tomazzaman Hope you have some lawyers onboard, sounds like a GPL violation
@@tomazzaman That's not how drivers work. For example, the syscalls can change from release to release. Binaries that work with BSD will not work for Linux.
Yeap. See Also: Broadcom's "Open Source" NDK. (it's a huge binary blob with a few lines of shim code.) If they have an NDA and access to SDK _source_ then they can build the driver(s) for whatever kernel versions supported - or they're willing to port to. Sadly, this is the way of the modern world... want performance better than a hamster in a wheel, you'll need hardware packet processing, and no one openly documents that stuff.
@@jfbeam I wonder how hard it would be to convince Linus Torvalds or someone in the Linux kernel community to provide open-source hardware acceleration for networking.
Especially if you could pay them from the router's development budget.
This would effectively create an open source competition to the weird NDA garbage that you just obtained.
I don't want to wait until you release a bug fix kernel for a security issue that has already been patched months ago in existing distros' kernels.
This makes the product a non-starter.
If having to wait who knows how long for patched security issues in the kernel is a requirement to have decent performance with low power, I would prefer just to build a mini form factor PC with an AMD 5800X, pay the extra cost of the electricity, and do 10Gb line speed using 50% CPU.
You actually don't need that beefy and power-hungry machine. There is the BananaPi R4, which has great mainline Linux and OpenWrt support (NAT HW flow offloading; WiFi encryption offloading (WED) working in snapshot; even QoS offloading with TC flower in the making), is already able to route 10Gbps with 0% CPU usage, and costs around $120 on AliExpress. I would rather support companies and chip makers that care about mainline Linux and provide upstream patches for their SoC features (MediaTek in this example).
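For anyone curious how that OpenWrt flow offloading gets switched on, it's roughly this firewall config fragment (a sketch based on OpenWrt's documented firewall options; I haven't verified it on the R4 specifically, and the hardware variant only works where the SoC driver supports it):

```
# /etc/config/firewall fragment (OpenWrt)
config defaults
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'REJECT'
        # software fast path via the kernel's flowtable infrastructure
        option flow_offloading '1'
        # hand established flows to the SoC's packet engine (driver-dependent,
        # e.g. MediaTek targets like the BPI-R3/R4)
        option flow_offloading_hw '1'
```

The appeal of this route is that both options ride on mainline kernel infrastructure rather than a vendor's out-of-tree patch set.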
@@elektronick88 Yeah I was looking into the BananaPi R4 precisely for all the reasons you cited... unfortunately, there's barely any documentation (most I could find was in Chinese, as *pictures*), no information on how to put the OS together, and it's effectively not even close to a finished product. With my 25 years experience with Linux, I would still take a few weeks to build a usable router out of it.
But I suppose I can wait a few months until the whole thing has matured more.
@@RuddODragonFear Not sure what documentation you need. I have a BPI-R3, and I just clone OpenWrt and build firmware for it with the things I like (adblock, tools like tcpdump, container runtimes like Podman, etc.); same story for the BPI-R4. You just have to go through menuconfig and choose what you need. There is also an active community willing to help if you have any issues or questions. I agree that some things aren't finished yet, or require some work and configuration, but hey... this is DIY, not a consumer product. Still, I think the result is worth the effort :)
@@RuddODragonFear OpenWrt has support for it, albeit only on snapshot for now. Next major release should have stable support.
@@subrezon Let's hope that comes soon, and then I'll experiment with it.
Correct me if I'm wrong, but this (5:35) seems like a violation of the GPL to me. I don't think you are allowed to distribute derivative kernels under non-compatible (i.e. non-copyleft) licenses. That means the proprietary software can only run in loadable modules, not in patches built into the actual kernel binary. I might have misunderstood what you meant here, though.
Yeah, it seems like that will run afoul of the GPL. I presume NXP also knows this and has a proper solution for it (modules, DKMS, etc.).
@@EraYaN Closed-source kernel patches violate the GPL anyway, and by the time someone packages and sells this thing, that is, distributes it as in the kernel, the recipient has every right in the world to get the code for both the kernel patches *and* the drivers. To bypass the GPL with DKMS etc., you need the user to interactively install the needed pieces. If it's automated, it counts as distribution, meaning it breaks the GPL.
@@roysigurdkarlsbakk3842😮
Umm...
Only the parts that are written by others need to be attributed and have their source made available.
Anything you write custom can be proprietary; that's exactly how Cisco, Oracle, and tons of others do it. Patches are arguable, because someone else wrote the patch to make the modification, thus one could argue that the patch itself is an original work.
(Similar to claiming your artwork isn't original because you used my canvas material to make it, or Crayola's red paint to accent the sunset...)
@@jamess1787 If you change even a single bit in a GPL'd source file, YOU. MUST. PUBLISH. YOUR. CHANGES. Cisco, Oracle, "and tons of others" _do_ publish their changes to GPL'd code. Their products are built from a sizable amount of non-GPL stuff, and they obviously don't publish any of that.
(It used to be easier to find those tarballs, but Cisco being Cisco makes you jump through hoops to get anything these days.)
The problem with closed source is this: let's say, three years down the line, NXP won't update their drivers to work on anything newer than a three-year-old kernel release. What happens then?
It's not closed source in that way. We get the sources as a manufacturer, we just can't share them. So in case NXP stops working on it (which is extremely unlikely), we can pick it up ourselves.
@@tomazzaman That just kicks the problem down the road. How about a test without HW offloading, so we can compare the results and make an informed decision? Currently we only have the data for one side.
@@ss-xy2im Exactly. I can't make a decision without a benchmark without HW offloading.
@@tomazzaman Haha, maintaining and keeping C and assembler up to date against a changing kernel?
@@tomazzaman The most likely issue will be the end-of-life for the chip you're using, and thus, newer SDK's dropping support. Once you have the source, you will always have that code. (your license to use it can expire, but that doesn't magically erase it from your drives.)
The Linux kernel sees around 60+ CVE patches a week (2024 Linux Foundation numbers). If you have to manually provide the kernels due to proprietary binary blobs, doesn't that take the customization out of this device? In fact, there are very interesting scheduling behaviour changes in newer kernels as well.
To be fair, the Linux kernel makes a CVE entry for _every_ bug/regression that appears, simply because they can't know whether there are security implications or not. The number of CVEs that actually matter and have any impact is significantly lower than that.
@@commanderguy-rw7tj This isn't totally true. Kernel upstream assigns a CVE for anything that COULD have security implications, not every bug outright. To be fair, a vast majority of bugs in the kernel can be considered a security concern due to the ring level of the relevant code...
@@commanderguy-rw7tj The new maintainer had to officially make a presentation regarding the topic due to EU regulation changes.
I'll echo the other comments that depending on how NXP is integrating code with the Linux kernel, restricting distribution of the source may be a violation of the GPL, so I'd steer clear of any solution that forces you to do that. Furthermore, I've been doing hardware long enough to know that these whiz-bang proprietary hardware solutions promise the world and seldom deliver in practice, and getting updates to fix bugs in their proprietary software and silicon, particularly in the long term, is a virtual impossibility. Went down this road with Broadcom in the early 2000's and they completely lied to us about the maturity of the silicon and proprietary APIs. We were forced to throw away over a year's worth of development and start fresh with our own solution based on a FPGA. Never again.
I'm sorry, but the third-party closed-source drivers would kill the entire product for me. If I have to rely on a chain of third parties to provide kernels - I might as well just get a proprietary solution to begin with, like a Ubiquiti or MikroTik, that give me an entire polished software suite.
really, and even things like mikrotik have their own share of problems despite being "large"
Projects like these are cool, no joke; I'd hardly be able to do even a small percentage of it. But that doesn't make it practical. Either it has unique features or unique offerings (like OSS or whatever), or at the least it's cheaper. If it can't do any of that, I don't know why one would go with this.
Raw forwarding throughput is cool, but I'm assuming firewall throughput is probably a more practical data point for most of the enthusiasts following this project.
Go for multiple SKUs. Faster CPU and more RAM should neither require a different PCB layout nor different SW, just different placement. The little extra cost for managing added SKU should be more than offset by larger reached customer base.
Agreed!
Logically, it's a no-brainer to reduce the costs of the cpu and ram, since they're barely being utilized. However, people like extravagance. And, depending on the final price of the product, it may make more sense to keep it as it is. You could also make the cheaper model have those reduced specs while putting the improved specs into the aluminum case model.
I just want it to have a good price/performance ratio, so why use a more powerful and expensive chip if it won't affect performance much?
if he's paying $72-92 per CPU, price has to be at least $800 either way
Are 4GB of RAM still enough if you run a couple of Docker containers and VMs on it?
I thought the whole point was to save power, not to run processing alongside routing. Normally, if your router uses its CPU, it uses it for networking; but if the CPU is totally freed from that purpose, what other purpose would you give it anyway? (Am I missing something?)
These videos are amazing. I’ve never seen someone talk so much and say nothing. Amazing!
what?
So you suggest providing only a pre-built kernel, with no option for the user to recompile or upgrade it without losing acceleration?
Since it doesn't seem like you can redistribute the patches?
Same with the strongSwan patches.
If there's a security issue, there would be no way for an end user to upgrade until it's fixed by NXP?
Valid question, thank you - added on my list for when I talk to NXP, and I'll make a dedicated video to answer all these.
Would be interesting to see the performance with IDS/IPS running. Since, AFAIK, it needs a lot of processing power, that would determine the specs of the CPU.
I've followed this project for a while; while I'm fine with your other claims, I have some questions/concerns about the NAT-P / PAT. I get this is a consumer-level device, but how much memory is this hardware offload module going to have? Looking at the LS1046A block diagram, I guess my question centers on the conntrack monitor module (you show the command reading data from it). What's the max number of connections this module can handle? This would also be a limiting factor for the SPI firewall as well.
A less pressing question: with the L2 bridging, is it VLAN-aware? (Yes, I do see it supports VLANs, but is the bridging aware of VLAN tags?)
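As a point of comparison while waiting on NXP's answer: on the plain-Linux side, the conntrack ceiling is just a sysctl, and each entry costs a few hundred bytes of kernel memory, so RAM is rarely the limiting factor there. A sketch of how one might check (standard Linux tooling, not the offload engine's own table, which is the real unknown here):

```shell
# Upper bound on tracked connections (tunable)
sysctl net.netfilter.nf_conntrack_max

# How many flows are tracked right now
cat /proc/sys/net/netfilter/nf_conntrack_count

# Raise the ceiling for many concurrent flows; at roughly ~300 bytes
# per entry, even a million entries is only ~300 MB of kernel memory
sudo sysctl -w net.netfilter.nf_conntrack_max=1048576
```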
As much as I would love to say people will pay for the extra CPU speed, money talks, so while you're getting yourself established I wouldn't want "too expensive" to be a reason people reject your product.
I'd love this to be an awesome router first and foremost, not another do-it-all homeserver. So I think the lower specs are plenty, given the hardware offloading capabilities you demoed.
Excuse my ignorance, but if hw offloading means you cannot run a tcpdump, how would you run a tcpdump, given it's a network device?
This is a really good question - I hope one can use hw offloading for packet capture as well?
I haven't gotten that far, yet. You can configure what exactly you want to offload, so for tcp dump, you'll have to turn it off, and then back on again once you're done debugging or whatever it was you were doing with tcpdump.
@@tomazzaman Would that require a restart of network services / interrupt the network? If yes, I think it's a big problem; if not, I think you'd need to test the CPU while offloading is disabled to see if you can go for the lower CPU/hardware.
Tcpdump is for dumping the packets that traverse the kernel. If the packet doesn't traverse it, you will not see it. That's why, in high-end routers like Juniper or Cisco, you must tell the ASIC to send the packet through the control plane.
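A sketch of what that "turn it off, capture, turn it back on" workflow could look like, assuming (and this is an assumption; NXP's ASK tooling may work differently) the fast path is exposed as an nftables flowtable:

```shell
# Drop the flowtable so forwarded packets traverse the kernel again
sudo nft delete flowtable inet fastpath ft

# tcpdump now sees the forwarded traffic
sudo tcpdump -i eth0 -nn -c 100 'tcp port 443'

# Restore the fast path when done (interface names are examples)
sudo nft add flowtable inet fastpath ft \
  '{ hook ingress priority 0; devices = { eth0, eth1 }; }'
```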
You had my curiosity. Now, you have my attention.
Same. And if he manages to live up to our anticipation, then this project is a success.
Networking beast because it leaves it open for more applications
I hope you're aware that using closed-source patches and drivers in Linux breaks the GPLv2 license under which the Linux kernel is licensed. According to the license, if you do so, the drivers and patches are designated as derivative works and are thus licensed under GPLv2. This is the license, and thus the law!
Doesn't this make the company that he obtained those codes from break the law, not him?
@@hubertnnn It's kind like saying that a drug dealer is innocent because they were just re-selling the heroin - they did not make it originally.
Well, it depends. For example, ZFS code is licensed under the CDDL, which is incompatible with the GPL, but binaries are not, so you can use ZFS with Linux. I don't know how those patches are applied here, but I would presume they know what they're doing to bypass the GPL. Or maybe not :)
@@darukutsu In principle, you can't ship linux with ZFS for that reason, but some distros ignore it and so do people else too, since OpenZFS is, after all, open source.
@@hubertnnn That's correct
I'll take the lowest specs, assuming the offloading works correctly; this'll make the device very, very affordable and beat everyone in the market.
So have two versions, a Pro and a Lite.
Depending on NDA'ed, GPL-violating, proprietary junk makes this a much less interesting product.
One of the appeals of a custom project like this is the ability to tinker, or just having the option to keep it alive if you for whatever reason can no longer support it - which is a real risk for a small company. Without acceleration, the router would be almost useless for 10 GbE speeds.
Aren't there offloading network chips available, rather than depending on NXP's SoC? I'm not opposed to proprietary blobs or a few closed-source tools being required, but NXP's patcheroo craziness and hostility is too much.
Doing your own FPGA-based solution would probably be too expensive, both for hardware and development cost 😞
I think it's appropriate to do a final comparison between the final version of the router and the POS Fritz-Box.
How does the system behave when there are actual firewall rules involved? So when it isn't just fast-path?
The first packet in a new connection goes through the firewall normally and if it passes, all the subsequent ones from the same connection then bypass it.
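For reference, this is exactly the semantics of nftables flowtables in mainline Linux; whether NXP's hardware path hooks in the same way is an open question, but a software-only sketch looks like this:

```shell
sudo nft -f - <<'EOF'
table inet fastpath {
  flowtable ft {
    hook ingress priority 0
    devices = { eth0, eth1 }   # example LAN/WAN interfaces
    # add "flags offload" here for NIC hardware offload, if supported
  }
  chain forward {
    type filter hook forward priority 0; policy accept;
    # The first packet of a flow traverses this chain and conntrack
    # normally; once established, later packets bypass it via @ft.
    ct state established meta l4proto { tcp, udp } flow add @ft
  }
}
EOF
```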
I think it's clear from the comments and from my own thoughts: it would be good to expand on the applications where hardware offloading is beneficial and the ones where it is not, so people can better decide if giving up CPU and memory is a good idea.
I'm a bit concerned that the packets handled by hardware offloading won't be available in the kernel - will the device support packet inspection or other forms of packet manipulation outside of the dedicated chip?
You can turn off hardware offloading to do the inspection. It'll be slower, granted, but only until you turn it back on again.
Many of these types of SoC support port-mirroring. So it would be a matter of providing an interface to these HW functions.
Holy moly!
That is fantastic!
Great job!!!
Simple routing with the iperf3 example is the easy part. Now check NAT functions and other IP functions. Like DSCP marking/rewriting. Or VXLAN/MPLS. Using the full pipe-line of the SoC to check 10Gbps performance.
And in regard to CPU, things like Bird/Quagga to run routing protocols like OSPF/BGP.
The more prefixes the more CPU/RAM.
Maybe overkill for CPE devices.
On the way! Or should I say, under way! 💪
@@tomazzaman How many prefix lists, ACL rules, NAT entries, etc. are possible in hardware? I remember devices that rebooted when the NAT table was full - hello Bintec! :)
How's the bufferbloat score?
I would say that, at least for the initial launch, using the lower-end components to slightly reduce the cost might be the better route, especially if the system is as resource efficient as you're claiming; maybe after getting yourself established you can release a "pro" version of the router with the higher-end components.
Also, is there any chance that PoE will be available to any of the ports?
Without offloading, I think you could also push that performance using VPP graph nodes in userspace with an AF_XDP socket as the entry point. You can use dhcpd and other apps to make it a great home solution.
Yep, this was exactly the reason I came to know about XDP.
Great video again!
Just wanted to know : Did you happen to have run some testing without the hardware offloading ? I'm curious about the result.
On the price aspect: I think cheaper is the way. But cheaper means we rely more on the "black box" that is the binary, and on the support NXP is willing to provide. If the cheaper variant can still handle some 10G routing capability without hardware offloading, why not!
I would like to see 2 models on the market. Those who want the more powerful one and want to shell out for it can, those who want the cheaper version can have that.
Networking beast mode 🔥 please! Let's go! 💪🏻
🫡 roger, chief!
Thanks for sharing the journey of both your hardware and software decisions. This is very interesting. Will it have a stateful IPv4 and IPv6 firewall built in? What size of ARP table and neighbor table can it support? What will be the size of the firewall session table?
Go with the lower-spec device, because it will be more affordable and more competitive. It's meant for home usage. Then, when you start selling it, make an upgraded version. Maybe you'll add more ports, or PoE+, or something else...
How is performance altered with IP filtering?
One feature of pfsense I use is pfBlockerNG, which blocks IP ranges used by known bad actors. In my current config there are ~16000 ranges being blocked.
Is this also completely offloaded? Are there limits to the number of ranges we can block with offload?
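In plain nftables, ~16000 ranges is comfortably handled by an interval set; whether the offload engine can hold a table that size in hardware is the part only NXP can answer. A sketch with example ranges:

```shell
sudo nft -f - <<'EOF'
table inet blocklist {
  set badnets {
    type ipv4_addr
    flags interval        # permits CIDR ranges; scales to tens of thousands
    elements = { 203.0.113.0/24, 198.51.100.0/24 }   # example entries
  }
  chain prerouting {
    type filter hook prerouting priority -150; policy accept;
    ip saddr @badnets drop
  }
}
EOF
```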
The numbers you are showing are impressive, but we are only talking about pure L3 routing, no firewalling. My question to you would be: what CAN'T be offloaded, and how will running it on the CPU affect performance? Examples: advanced packet filtering with 1000+ rules, OpenVPN, IDS/IPS, SSL, etc. Is there any kind of encryption/decryption acceleration/offloading built into the CPU, such as AES-NI, that would benefit those features? In many cases, the CPU sizing for a firewall is directly linked to the resulting throughput of the encrypted tunnels it can deliver (IPsec, OpenVPN, WireGuard, etc.). With that in mind, I don't think 1.4, 1.6 or 1.8 GHz will make much of a difference; the 1.8 GHz will be around 30% faster than the 1.4 GHz, which will clearly not double or triple the encrypted tunnel bandwidth. I am curious to see the performance of a single OpenVPN tunnel (OpenVPN is single-threaded) on your hardware.
For the RAM and CPU, I would go for the most cost-effective option in terms of the capacity it provides (biggest bang for the buck), not the cheapest. Which CPU costs less per GHz? What amount of RAM costs less per GB? From what you're saying in the video, the 1.6 GHz CPU with 8 GB of RAM seems to be the most cost-effective combination. Choosing a middle ground between the most powerful and the cheapest will probably be attractive to the highest number of early adopters, be commercially more viable initially, and accelerate your return on investment. After securing a steady flow of sales, releasing a lite (less powerful) version and a "beast" version (more powerful) would probably be possible, based on which one you get the most requests for.
Not only that, it shows only a single connection with max packet size, which is not a realistic test; under those conditions any CPU would sit near 0% utilization (even without any dedicated hardware acceleration). A comparable SoC (not the same, but comparable in cores and speed, with hardware acceleration, made by a reputable company whose SoCs are used in most consumer routers) in a commercial product (I won't name the company, but the name starts with A) barely passes 650 Mbps in real use cases, even though in tests like this it gets the full 10 Gbps.
@@viaujoc Exactly, he should test with Trex and use a firewall. Everything else is just hypocrisy.
Hi Tomaz
A regular follower of your project here - I really like the openness and explanations, I really do appreciate it.
Maybe in future revisions, as this device will be a tinkerer's heaven, if you want to drop the cost even further, consider adding an NVMe or SATA storage option, leave the eMMC optional, and maybe revise to SO-DIMM options too, so the customer can decide what storage + RAM configuration they want to use in the future.
I don't know if the CPU will have enough PCIe lanes, but for a router appliance some kind of SATA-based storage would be enough, if the CPU has a controller - or perhaps source a SATA controller chip - but I have no idea if that makes sense from a cost perspective, or how much complexity it would add to develop and revise the board.
Just chiming in, throwing here some ideas, options, I love your work, keep it up!
Thank you! CPU does indeed come with a SATA controller, but we opted to use those SerDes lanes for Wifi 6/7 instead. Will see how the market responds with this first version and then act accordingly.
The faster CPU and more memory at the moment seems like a waste of resources. It's still going to be a beast either way.
It depends. Hardware offloading means the CPU and RAM will be free to do "something else"; I think it depends on what that means. It would be good to know which kinds of configurations or features break the hardware offload, and what difference a slightly better CPU makes. Anyway, I think that in most cases a cheaper CPU with less RAM is the better option. If HW offload is disabled for a feature, it's not like 400 MHz of CPU clock will amount to a night-and-day difference in throughput.
keep as is, the extra headroom could always come in handy
Nice video :) Personally I would strongly prefer to keep the current specs (1.6 GHz CPU and 8 GB of RAM), but I would buy it even with the lowered specs.
I liked your venture due to your open sharing and open source numbers. Been watching since the first videos and plan to buy a finished version. It has been fascinating to see your process. To see you not being able to talk about certain parts due to NDA and closed source software is a turn off.
I don’t want some closed source module for this to run properly. I won’t be buying this if it’s using closed source OS software.
In addition, if you add software for packet inspection, NAT, or VPN tunneling, does the offloading still work as well? For a modern router, there is more to it than just pure packet routing.
Maybe do both variants? I personally would be fine with the cheaper version as long as it works :D
How are you able to distribute a kernel binary with patches while adhering to the GPL license of the kernel?
The patches are likely GPLed, but load the closed source binary code.
Actually, that's a super valid question. I've put it on my list for when I talk to NXP.
@@tomazzaman - I love to see you being open to critical comments/questions like this!
@@tomazzaman You need to speak with an attorney in the US where NXP is located. Not their internal sales rep who's trying to sell it to you.
My concern with all this is still security patching and zero day response, especially for 3rd party libraries, OS and SoC drivers etc. Performance is great but I am still not seeing how secure this product's lifecycle will be.
Tomaž, sorry to sound like a noob, but would this router be a drop-in replacement for the routers supplied by our Slovenian fiber internet providers, like T-2, now that the "fiber everywhere" initiative is progressing nicely?
To me the CPU/RAM question boils down to how much extra load is anticipated for non routing activities, otherwise you just end up with hardware waiting as fast as it can. I do like the proposal made by others to have a "basic" and "pro" version if you can tolerate the separate SKU and BOM parts.
Do you happen to know if the HW offloading allows for port mirroring for analysis purposes?
Future marketing slogan for the device "Routing from Slo is Fast!" [OK, I'll go now. No need to push me into Soča :) ]
I'm not a network guy, but I understand the traffic component is handled outside the CPU and memory, which is very nice. But won't there be use cases where having these resources is a must-have? Such as running services for traffic analysis, ad blocking, maybe VPN tunnels, etc... I'm curious.
VPN tunnels can typically be offloaded to router hardware. There are also a bunch of types here; if they can't be (and yes, that will be the case for WireGuard, for instance), they fall back on the CPU.
Traffic analysis is usually handled by a separate process, which can gobble up a mirror of whatever is being routed. If you're talking about IPS, that's a slightly more intense task, but just getting metrics on the requests is easy.
Ad removal, due to the nature of HTTPS, is usually done based on origin. It's easy enough to get a list of firewall policies and feed that into the hardware as an ACL.
I'd be interested in learning if inter-VLAN routing via tagged interfaces requires bypassing the HW offload of the firewall. Most solutions I've used in this space see a huge performance hit once you try to do this, and you usually get encouraged to separate your firewall from your inter-VLAN routing performance when doing so. I have a stupidly overcomplicated network in my house, and right now my core switch is doing inter-VLAN routing, but I'd love to have everything being done on one device.
It really depends on how many physical SFP+ and actual 10G-compatible ports you want in the final version of your product. If it's going to be a solution with up to 8 ports, you can definitely go with a smaller requirement; if you go 16 and up, you'd need the additional power.
You should also consider an upgrade path for PoE++ input and PoE+ output on multiple ports. Today that is an essential feature for more expensive network appliances. Considering the price of your CPU alone, I suspect your solution will end up above 250€ or even close to 400€; at that price point I'd expect a lot more from a device, considering what you can get on the market right now.
Hello, any details regarding DMZ, port forwarding, multi-WAN, OpenVPN, WireGuard and other VPN protocols?
Will test all of these! stay tuned!
I wouldn't consider deploying these routers, since I'd be fully dependent on you for upgrading to future kernel versions. Even decades old industry giants mess up device lifecycles on the regular. Asking for that level of trust in a relatively small operation like this sounds unreasonable to me. On the other hand, if the entire software stack was open source I'd be more than willing to evaluate it for serious use, as the features do sound rather compelling. It would make not blatantly violating the GPL easier as well :^)
how much RAM does Suricata use??? I'm making some assumptions here that this thing will be able to run Suricata, though....
Are you still planning on this being capable of IDS/IPS inspection? I thought I remembered that was the original plan.
I am wondering if all this hardware acceleration and low power goes out the window as soon as you start doing that, though? That was a problem on a lot of earlier-gen Ubiquiti devices, where their 10gb UniFi router went from 10gb throughput to ~1gb throughput with IPS on, because it had to move the traffic to general CPU cores. Newer-gen UniFi devices changed that up by using different SoCs to get more IPS speed. This product doesn't have to do any kind of full decrypt/encrypt interception inspection with certificates on end devices, but having a standard IPS-type system (probably Suricata running the ET Open ruleset?) with pattern and packet signature matching to detect malware or hack attempts in encrypted traffic would be great to have. I know not everyone is of this opinion, but to me, all firewalls should have that type of traffic inspection as standard nowadays, as a basic stateful firewall just isn't good enough anymore for evolving threats.
I'd be fine with a 1.4 GHz CPU as long as it has the same HW accel blocks in it, since the CPU cores don't really matter.
Edit: Looking at the Layerscape CPUs, I see that the 1046A does not have a pattern-matching HW engine, so I'm guessing that if you do enable IDS/IPS, the speed will drop a ton like I was thinking. I also see that the next-gen LS2084A CPU steps things up a bit (but for a lot more money): it increases the security engine from 10gb to 20gb, adds a 20gb data compression/decompression engine, and adds a 10gb pattern-matching engine. So this would be the CPU to use in order to get 10gbps IDS/IPS, I would think. Sadly it costs ~$285, though probably a bit less when buying by the thousand. At that point, IDK if it's even worth using the next-gen CPU when you could step up to an even higher-end one, the LX2082A, which gets you 50gbps HW accel throughput for only an extra $15. lol. Though I don't see the LX2082A having a pattern-matching engine like the LS2084A has... So maybe that's why it's not much more for so much more throughput, since it's missing the pattern-match part the "lower end" part has.
The whole "FreeBSD is faster at networking tasks" sounds like the same tired old arguments that that community keeps claiming based on some very old benchmarks and "Netflix is using it so it must be fast". 😄
Nobody has provided any modern benchmarks to prove that this is still the case. On the contrary, the few modern benchmarks that I've seen seem to favor modern Linux. 🙂 VPP will obviously totally smoke kernel networking though, as will hardware offloading.
Those hardware offloading engines can definitely be very impressively efficient and performant. 😀
Indeed. FreeBSD was faster in the long long ago. Then Linux vastly improved. But since then, let's say "less enlightened people" have messed with linux networking - eg. removing the route cache - so I can't say either is very effective these days. Hardware routing and switching will ALWAYS be faster than software.
It’s not about benchmarks; it’s about the APIs and facilities to explicitly support hw acceleration in the networking stack, which Linux currently lacks.
@@l3xforever Not sure exactly what you're referring to. Mind providing some more details? 🙂 My experience is that hardware acceleration is generally better supported in Linux than in FreeBSD, but I'm sure there are some exceptions.
I think the main purpose of the router is to perform flawlessly in the workloads you demonstrated in the video, and therefore that the 1.6 GHz or even the 1.4 GHz CPU would not compromise the performance of the router. Similarly for 8 GB vs 4 GB: given that one option or the other does not significantly affect routing performance, you should go with the option that makes the product as competitive in the market as possible. Consideration should be given to what kind of UI and quality-of-life features you plan to make available in the product. But I don't think a UI, even with heavy graphical sophistication and maybe historical logging, would be affected much by a lower-clocked CPU or 4 GB of RAM. My 2 cents.
Do note that hardware off-loading can get in the way of SQM which is important for many people.
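Concretely: SQM usually means shaping just below line rate with something like the cake qdisc, and packets the hardware forwards never reach a qdisc at all, so the two are at odds unless the SoC has its own shaper. A sketch of the usual software setup (interface and bandwidth are examples):

```shell
# Shape egress slightly below the upstream rate so the queue builds
# here, where cake can manage it, rather than in the modem
sudo tc qdisc replace dev eth0 root cake bandwidth 950mbit diffserv4

# Inspect drops/backlog to confirm cake is doing the queuing
tc -s qdisc show dev eth0
```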
Just sharing my journey, mate.
ALL IN!
I think it would make sense to keep the hardware modest. 1.5 to 1.6 GHz would be fine; 2 or 4 GB of RAM would be fine. But I guess the big part of this would be the shipping feature set.
Target audience? Home? Go cheaper, but if you're able, make a limited edition of the beast. Small office? Scalability is everything; keep it as is.
I already see two versions of this router emerging, depending on what traffic can really be offloaded and which software configurations would affect (and disable) hardware offloading, because that is important on a 10G router...
A cheaper version is preferred, yes, for the customers and for you guys... And that's probably your main goal. But a higher-end version might make sense too: maybe the faster CPU or something else added, maybe more connectivity options, or more RAM (for example). That might be a different product, one that completes a product line. ;)
protip: use `ip -br a` instead of just `ip a`, or if you really want all the info, then use `ip -o a` to format it on "one line" per entry
also "btop" is much better than "htop" imho
btop allows you to e.g. hide all the other panes other than cpu graph, if you want to, for example
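For anyone who hasn't tried these flags, the difference in signal-to-noise is large; `-br` gives one line per interface and `-o` one record per line (handy for grep/awk):

```shell
# Verbose: several lines per interface
ip a

# Brief: NAME  STATE  ADDRESSES, one interface per line
ip -br a

# All details, but one record per line - easy to filter
ip -o a | grep -w inet
```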
A bit more performance won't hurt for additional services like VPN, Proxy, system monitoring tools etc.
Be sure to test for UDP out-of-order packet handling and UDP packet loss. Ubiquiti hardware offloading has serious issues with this.
I have to say, I've been following this project for months now and I've been extremely excited this entire time. If the device were open to preorders at the end of the last video I may have even pulled the trigger this early.
The issues with the ASK have completely deflated my enthusiasm. I'm a security professional by trade, and the inability to patch my systems is a complete deal breaker for me. I sincerely hope that this has been a misunderstanding, or if not then I hope NXP pulls their head out of their rear because this is completely unsatisfactory.
Just about to publish a video, and there's a surprise inside! :)
@@tomazzaman I'm looking forward to it!
Please do a bandwidth test with the slower CPU and less RAM (it might take fewer memory channels to have less RAM, and that could really impact bandwidth) to be sure those don't impact hardware packet processing.
I would say the per device unit costs are something to consider. When you are manufacturing 100s or 1000s of units that extra 70 eur per device adds up quickly. So once you have your final design then your challenge is to reduce the cost per unit as much as possible to keep manufacturing costs down and profits up.
I have a few thought questions to think about:
So I have to wonder: with your current CPU/hardware design, can you go faster than 10GbE? Is 25, 40, or 100GbE in this hardware’s future, where you might need the CPU cores?
What impact will full stateful packet inspection have on CPU usage (this is the big question)? You can move packet forwarding to an ASIC, but how well does the hardware handle layer 3 to layer 7 packet inspection? I would think there would need to be kernel intervention beyond just simple packet forwarding. Before deciding on removing hardware, let’s see a near-final prototype.
With that development hardware, can you underclock it to see what the impact of 1.0 or 1.4 GHz would be on the system’s operations? From what you have shown, there is no need for a 1.6 GHz 4-core CPU, but full packet inspection might change that opinion.
On the open source front, I could see two versions (depending on the hardware dev kit licensing). The advanced version of your board design could use the hardware acceleration kit with its licensing fees. And the “basic” version where it does all in kernel routing and packet inspection. The former could use a lesser CPU and ram. The latter one would need the 1.6Ghz and 8GB of ram. Or you can just use the same hardware and the only difference is if you use the close source with the acceleration kit or open source with kernel based packet routing. This isn’t a fully formed plan, only an idea.
I realize you have limited hardware for testing, but I wonder what the results would be if you add more senders with the one receiver. The idea is to see if you can flood the 10GbE link until you start to get retrans errors. This would see how the hardware would manage to much data to forward.
But all in all your hardware and your design team is doing a great job. You need to find the niche where your hardware has a market.
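On the underclocking question above: if the LS1046A BSP exposes cpufreq (an assumption on my part; it depends on NXP's kernel), the cheaper SKU can be emulated in software before committing to it:

```shell
# See which clocks the platform exposes
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies

# Pin all cores to 1.4 GHz to emulate the lower-spec part
for c in /sys/devices/system/cpu/cpu[0-9]*/cpufreq; do
  echo userspace | sudo tee "$c/scaling_governor" >/dev/null
  echo 1400000   | sudo tee "$c/scaling_setspeed" >/dev/null
done

# Confirm the effective clock, then rerun the throughput tests
grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
```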
Aim for a €149 retail price, and go with 4GB and 1.4 GHz if you need to. With offloading this good, there's no need to go up in specs; better to create a product with mass appeal first, then later launch a more expensive option. If you want to make this more than a one-product series, you need mass appeal, like the first EdgeRouter Lite, which really launched Ubiquiti.
I like how informative your videos are!
Depends on whether I can run additional software on the machine to make use of the additional RAM and CPU. I am new; maybe there is already an intended use for the CPU/RAM I am not aware of.
If not, I would tend more towards the cheaper option.
I could see a market for undecided users who would purchase the router if it's inexpensive enough, and would later purchase a top-of-the-line one if it met expectations.
Use a CAMM module for RAM :) Is it possible to use something like IC sockets for the CPU? Maybe you can make different hardware choices available.
It's difficult to make a choice here: are you saying that due to the efficient traffic offloading, the reduction in fitted RAM size and CPU clock speed will make no difference to overall performance of the unit? If so - sell it cheaper. Do these reductions impact overall routing performance, or future-proofing of the design? If yes, then it has to be the more expensive spec.
A long time ago you suggested variants at different price/performance points (I suggested different coloured cases to identify the variants) - is this what you are suggesting, multiple production models or a single high-perf model only? For single model only, then no choice - hi-perf. For multiple models at different spec levels, the question answers itself - but more than one variant in production is inevitably going to have knock-on effects later on (production and manufacturing complexity, support of multiple variants in differing builds of firmware, updates and bugfixes, etc. etc.).
I don't have enough info yet to decide what I'd like, but if the buy-in price difference is so great, then I would suggest it's *your* prerogative to make this choice for your launch product. Given the ethos (fastest, best) it implies the expensive variant is the one.
A price/performance/resource comparison of the low-price/high price variants would help here. Another one of your excellent charts in a follow up video to show the trade-offs more clearly?
Awesome video
My question is, what would 8GB of RAM even be used for if we kept it?
Do you reckon we could set up a Home Assistant server directly on the router? Would that benefit much from the extra specs?
Personally, I am just fine with the reduced specs. Specifically, 8GB RAM is overkill in my opinion, even for a top-of-the-line variant.
Well... speaking about performance vs price, it would be much easier to draw a conclusion after seeing the practical tests and an approximate price.
Otherwise, a pretty interesting project; good luck with your routers!
Thank you!
RE closed source patches: would you be able to hire external security reviewers to verify the safety of the patches and ensure third party verification of the safety claims? I wouldn't mind seriously considering the product if either that or disabling any proprietary binaries/source was an option at runtime.
Networking beast!
Seeing how people respond with concerns like bugs, platform deprecation, and so on, I'd say that it would be a good idea to consider making sure that the router is capable of reaching basically the same results in terms of performance with the kernel bypass *disabled*. If you can downsize to 1.4 GHz while keeping CPU utilization under 70-80% on full load, then go for it.
It would be nice if this router supported PPPoE half-bridge, so we can connect our own firewall or IDS/IPS via DHCP.
With regards to cost savings, my question would be are the CPU, RAM, and EMMC storage on sockets, so that we could easily upgrade them afterwards? If we can easily upgrade the device ourselves, then I don’t see a problem with shipping lower speed/capacity/cost components.
If you’re going to be like Apple and soldering them all onto the logic board, then I think you’re going to end up needing to provide multiple different versions with different capacities and speeds, much like Apple does. Because that would be our only way to upgrade.
Definitely figure out how hard it is to break hardware offload, and what the performance is when you do.
But, even assuming the performance is great without hardware offload, I'd still like to see what you can do without the ASK. If you're stuck with the ASK, I would probably just stay within the unifi product line and not bother with fragmenting my network stack.
Try doing iperf -c a.b.c.d -d (this should be the dual test, which will make two connections, one in each direction).
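Worth noting that `-d` is the iperf2 flag; iperf3, which the testing in the videos appears to use, gained an equivalent only in version 3.7. A sketch (server address kept as the original placeholder):

```shell
# iperf2: "dual" test, two simultaneous connections, one per direction
iperf -c a.b.c.d -d

# iperf3 >= 3.7: bidirectional test in a single run
iperf3 -c a.b.c.d --bidir

# iperf3 reverse mode, to test the other direction on its own
iperf3 -c a.b.c.d -R
```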
I'm only slowly entering the world of custom networking gear, moving away from the ISP-provided gear and a couple of the cheapest APs and switches possible, so apologies for the novice questions and random thoughts:
Would it make sense to choose a mix of the two, so the 1.4GHz CPU with the 8GB of RAM? To me, the 0.2GHz seems like much less of a downgrade than 50% less memory.
I don't know what sort of services would benefit from another 4GB of RAM or, perhaps more interestingly, a 0.2GHz faster processor. Would something like WireGuard run noticeably quicker at 1.8GHz vs 1.4GHz, and would 4GB of RAM be enough for it? What are some services that you would usually run on a router?
Nonetheless, I do think the first version should probably be the fastest, to get your name out and become known for solid, if slightly overbuilt, routers.
is the router with 1.4GHz CPU and 4GB still a networking beast?
if yes, use the cheaper components.
I believe so, but I wanted to check with my viewership whether y'all agree.
I mean for comparison, I'm using an off the shelf router with what I believe is a 1.2 GHz CPU and 128 MB of RAM, serving as an edge router for 4 APs and 7 devices actively connected at any given time (give or take). Is the latency good? Absolutely not, but it's held up fine for several years. Pretty much the only issues I've had were some DNS outages (quad9) and some issues with the ISP modem.
@@tomazzaman Maybe hardware packet processing isn't as powerful on the slower CPU (for market-segmentation reasons), or maybe it requires full RAM bandwidth. Please also check the slower CPU with less RAM (it could mean fewer memory channels) for hardware-accelerated performance.
@@graemewiebe2815 128MB of RAM sounds like DDR3 (Samsung and SK Hynix no longer produce it as of May 2024).
At the start you mentioned that FreeBSD is better than Linux for routing, but I've found the opposite to be true. I was testing OPNsense vs OpenWrt on an old SFF PC with an Intel i5-9500 and an X540-T2 NIC. OpenWrt successfully reached 8Gbps over a single iperf connection over the internet. OPNsense struggled to even hit 3Gbps, with one CPU core pegged at 100% (which makes me think that maybe NAT or conntrack is single-threaded... not sure).
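If anyone wants to verify the single-core suspicion, one rough way (assumes the sysstat package for mpstat; the server address is a placeholder) is to push parallel streams while watching per-core load on the router:

```shell
# 8 parallel TCP streams through the box under test
iperf3 -c 192.0.2.1 -P 8 -t 30 &

# Meanwhile, per-CPU utilization at 1-second intervals.
# If aggregate throughput barely beats the single-stream number and
# one core sits near 100% (often in %soft), forwarding is serialized.
mpstat -P ALL 1
```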
I know this will be a significant logistics overhead, but can't we have 2 versions?
For example, I want the best of the best no matter the cost (within reasonable ranges), while others might not need a faster CPU or more RAM.
Whether the A72 cores run at 1.4GHz, 1.6GHz or 1.8GHz is irrelevant.
This is because if there is an edge case where the traffic cannot be offloaded, it will just be slow.
So yeah, save money on the CPU.
As for the RAM, what do we need 8GB for? Most routers ship with 256MiB or 512MiB, some with more.
Even if you're doing some amount of BGP, 4 gigs is plenty.
But what I want to know more about are the offloading capabilities.
It's probably wishful thinking but I want good performance with wireguard and traffic shaping (fq_codel, cake).
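For reference, on stock Linux this is just a qdisc swap; a minimal sketch, assuming eth0 is the WAN interface and a roughly 1Gbit uplink (tune the bandwidth to your line):

```shell
# CAKE with a shaper set slightly below the contracted rate,
# so the queue builds locally where CAKE can manage it
tc qdisc replace dev eth0 root cake bandwidth 950mbit

# Or plain fq_codel (flow-queued AQM, no shaper)
tc qdisc replace dev eth0 root fq_codel
```

The open question, as this comment points out, is whether any of this still applies once packets are offloaded past the kernel.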
This seems to be a good enough product to have an industrial use case... we'll see.
I figured it out right away.
That's why Windows has an advantage with well-supported hardware, and Apple makes both hardware and software, so there should be no issues there (although I've noticed that on older hardware, even though they control both sides, they removed the highly optimized drivers in favor of something that doesn't perform as well, but it's so old people don't care about it). Drivers are what enable full use of the hardware, offloading operations to the ASICs (Application-Specific Integrated Circuits), and the software needs to call the driver functions to enable that hardware processing.
The same issue described here with Linux is present in "alternative firmware" for home routers / access points, like the awful DD-WRT (and OpenWrt, for that matter) and other variants, whose developers have reverse-engineered the hardware of those routers to provide support; but these reverse-engineered drivers are basic most of the time and process everything on the CPU.
For those wondering what I'm talking about, think of it this way: you have a pretty decent GPU in your system, but no Internet connection or drivers for that GPU. Windows enables it using the "Standard VGA Adapter" driver. It can enable a high resolution (SVGA, XGA and the like), but window composition (moving windows, drawing) is bad, and it cannot draw video frames fast or accurately enough; simple tasks take up all your CPU cores. That's what happens when you cannot offload processing to the many auxiliary processors and ASICs and utilize the hardware you own to the fullest.
Pretty interesting video.
I'm sold!
How can I get notified when this becomes available?
Why not have 2-4 mounting options? RAM is drop-in replaceable, same with that CPU. Two extra reels on a pick-and-place is not that much; though they'll occupy a few more slots, it's feasible, since you may free up that much space just from resistor and capacitor optimizations. BUT when it comes to logistics there may be problems, since you'll need some way to mark the variants, both visually and for the software. That's simple though: you can use a resistor/LED to indicate the variant visually, laser-engrave a mark, or use a sticker or marker mark. As for the software knowing what it has, a top command should be enough, but you can also use a free IO pin to read a pull-up/down, or simply write it to flash/EEPROM.
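On the software-detection point, the OS can usually tell the variants apart without any extra strap pin; a minimal sketch reading the standard Linux /proc and sysfs locations (exact sysfs path can vary per SoC):

```shell
#!/bin/sh
# Detect the hardware variant from total RAM and max CPU frequency.

# Total memory in GiB, rounded, from /proc/meminfo
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_gib=$(( (mem_kb + 524288) / 1048576 ))

# Max CPU clock in kHz from cpufreq, if the driver exposes it
freq=$(cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq 2>/dev/null || echo unknown)

echo "variant: ${mem_gib}GiB RAM, max freq ${freq} kHz"
```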
If the routing performance is identical with lower-tier parts, I don't really see the need for the more expensive HW components.
An alternative could be keeping the faster processor and larger RAM size while going above 10Gb/s speeds.
How many ports will be supported (depends on PCIe lanes)? I can see a use for 2 WAN ports and at least 2 LAN ports, plus an out-of-band management port.
Any free slots for expansion?
I know I'm dreaming :-)
The hardware offloading has impressive performance and low power consumption. My guess was going to be 6 watts; for it to be less than 2 watts was surprising. However, if the offloading stops users from being able to use WireGuard, OpenVPN or firewall rules... that's gonna be complicated, because given the price of this combined with its 10Gb target market, it's gonna need to do those things, and fast. I fear this will be a product that is amazing for everyday users who just need a fast router, but I suspect those people have 1Gb or less internet speeds.
And so the enthusiasts who have 10Gb (like myself, I'm very lucky!) would want the ability to run all tasks very quickly. I think you need to show some results of what routing on just the CPU can do, and what it can do when handling VPN traffic too. This is definitely important for a high-priced router.