Internet Throttling Woes

2023-10-22

It all started a few days ago, when the networking speed of my NAS suddenly dropped dramatically.

I immediately assumed this was the fault of my ISP, which we were already having issues with anyway. This was just the cherry on top. But, as you may have guessed by the existence of this blog post, it wasn't their fault. Read on!

Despite assuming the ISP was at fault, I went ahead and did some digging on my own, since they reported no faults on their end. My first stop was disabling my router's Quality of Service feature, which prioritises traffic based on its "type". This did actually improve things a little bit - we gained about 150Mbit/s at first, before settling at ~446Mbit/s, up from an average of 320Mbit/s. But this was still not what we were paying for. And thus the investigation continued.

To determine whether the issue only affected my NAS, I decided to run some iperf3 tests between my NAS and my desktop, which are connected to the same network switch. Theoretically, I should get the max throughput of 1Gbit/s, but to my horror I was only getting 250Mbit/s! This made me realise that indeed, the problem existed between the keyboard and the chair.
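For anyone wanting to repeat this kind of test and log the results, here's a minimal sketch of pulling the throughput figure out of iperf3's JSON output. It assumes one machine is running `iperf3 -s` and the other runs `iperf3 -c <server> --json` (the server address is whatever yours happens to be). The sample report below is trimmed and made up for illustration, not my actual output.

```python
import json


def throughput_mbits(iperf3_json: str) -> float:
    """Return the receiver-side throughput in Mbit/s from `iperf3 --json` output."""
    report = json.loads(iperf3_json)
    # For a TCP test, the end-of-run summary includes the bits per second
    # measured on the receiving side.
    bits_per_second = report["end"]["sum_received"]["bits_per_second"]
    return bits_per_second / 1_000_000


# Trimmed, illustrative report: a real one contains many more fields.
sample = '{"end": {"sum_received": {"bits_per_second": 250000000.0}}}'
print(f"{throughput_mbits(sample):.0f} Mbit/s")  # prints "250 Mbit/s"
```

Reading the receiver's number rather than the sender's is a deliberate choice here, since it reflects what actually arrived at the other end.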

Some additional context is required at this point. My partner does a lot of clubbing/raving in VR, specifically VRChat. Most events use the VRCDN network, which is a streaming service specifically designed with VR in mind and maximises compatibility with VR platforms. These streams are usually delivered using RTSP, which offers fairly low latency and processing overhead. However, it is well known that the router provided by our ISP has issues with this protocol. Thankfully, we don't use the router functionality provided by our ISP's box, but we have still encountered serious issues with that protocol. To get around this, we use a VPN to route the traffic from the CDN, disguising it from our modem and router so it doesn't get special treatment. For a while, we used the WireGuard client directly with a set of allowed IPs, but I moved the configuration to the router so we didn't need to deal with the admin-level application window in VR (it's... problematic). This is done using a feature on our Asus router called "VPN Fusion".
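To illustrate the "set of allowed IPs" idea, here's a rough sketch of the decision being made for each outgoing packet: if the destination falls inside one of the allowed ranges, it goes through the tunnel, otherwise it goes out as normal. The ranges below are reserved documentation prefixes standing in for the real ones (they are not VRCDN's addresses), and `goes_through_vpn` is a name I made up for this example. With a single VPN peer, a simple membership check like this is a fair approximation of what happens.

```python
import ipaddress

# Hypothetical allowed ranges (reserved documentation prefixes),
# standing in for the CDN addresses we actually route over the VPN.
ALLOWED_IPS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def goes_through_vpn(destination: str) -> bool:
    """Return True if the destination address falls inside an allowed range."""
    address = ipaddress.ip_address(destination)
    return any(address in network for network in ALLOWED_IPS)


print(goes_through_vpn("203.0.113.42"))  # True: inside the first range
print(goes_through_vpn("192.0.2.10"))    # False: not in any allowed range
```

Everything that returns False here never touches the VPN, which is the whole point: only the problematic streaming traffic gets the detour.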

Quick side note: While writing this post, I reread the linked blog post and disabled the SIP Passthrough on my router. Testing in VRChat, this doesn't seem to have helped at all. So... the VPN remains.

Going through the steps of removing any and all possible interruptions to a networking connection, I ran through all the options on my router. I disabled the "App Analysis" feature (which attempts to break down traffic by the type of application) and disabled the VPN on the router, which applied to all devices. I ran an iperf3 test and... 953Mbit/s.

I swiftly set up a meeting between my forehead and my desk (the two are good friends), and ran some speedtests just to be sure. Indeed, the NAS was getting the closer-to-gigabit speeds I would expect, and so were other devices on my network.

https://cdn.gabrielsimmer.com/images/speedtest-grafana.png

My theory is that the Quality of Service, App Analysis and VPN Fusion all being enabled was too much for my poor router, and it became the bottleneck in the chain. Unsurprisingly, disabling all these features also massively reduced the memory and CPU usage. I suppose the moral of the story is that when in doubt, check how much work your router is doing when you use the network - your traffic could be passing through a lot of layers before getting to its destination, each layer requiring a bit of compute that eventually compounds into a severe bottleneck.

Or at least, that's my theory. Either way, I'm satisfied with the end result, even if this retro isn't entirely blameless. I've re-enabled VPN Fusion but only for the devices that need it, and haven't seen anything going awry, so I'll just keep QoS and App Analysis turned off (I wasn't really using the latter anyways, and the former is maybe unnecessary when we pay for 1Gbit/s). At some point I'll build my own router and do more fancy networking stuff, which would probably also let me get more insight into what's going on internally, but for now we'll cruise with what we have.