Forum Discussion

Skull52
Network Novice
3 years ago

High Latency and Packet loss

For the past couple of days I have been seeing ping latencies of over 200 ms and packet loss of 85%-100% when pinging. Is anyone else experiencing this?

  • Skull52
    Network Novice

    The odd thing is that the failover to T-Mobile on the Netgate worked very well until a couple of days ago. I have been working with Netgate Support and we can’t find anything wrong with the Netgate.

  • I have the Nokia gateway, and the excessive ping latency is common to all three of the T-Mobile gateways. A gateway replacement is a waste of time, effort, and money. The problem has been reported by users pretty much coast to coast. You can do an ICMP ping over IPv4 or IPv6 and the result is the same. I have tested with Apple and Linux clients and the results are pretty poor. It is common to see 70-80% packet loss when running pings. Sure, ICMP has low priority and can be ignored, but this is a recent behavioral change.

    --- 8.8.8.8 ping statistics ---

    100 packets transmitted, 19 packets received, 81.0% packet loss

    round-trip min/avg/max/stddev = 75.896/110.618/150.142/18.738 ms
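
    For anyone who wants to log these numbers over time, here is a minimal Python sketch that pulls the loss percentage and round-trip times out of a summary like the one above. It assumes the macOS/BSD summary wording shown in this post; the function name and the returned keys are made up for illustration, and other ping implementations word the summary differently.

```python
import re


def parse_ping_summary(text):
    """Extract packet counts and round-trip times from a BSD-style ping summary."""
    counts = re.search(r"(\d+) packets transmitted, (\d+) (?:packets )?received", text)
    rtt = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", text)
    if counts is None or rtt is None:
        raise ValueError("unrecognized ping summary format")
    sent, received = int(counts.group(1)), int(counts.group(2))
    rtt_min, rtt_avg, rtt_max, rtt_dev = (float(g) for g in rtt.groups())
    return {
        "sent": sent,
        "received": received,
        # Recomputed from the counts rather than trusting the printed figure.
        "loss_pct": 100.0 * (sent - received) / sent,
        "min_ms": rtt_min,
        "avg_ms": rtt_avg,
        "max_ms": rtt_max,
        "stddev_ms": rtt_dev,
    }


# The summary lines quoted in this post.
SUMMARY = """\
100 packets transmitted, 19 packets received, 81.0% packet loss
round-trip min/avg/max/stddev = 75.896/110.618/150.142/18.738 ms
"""

stats = parse_ping_summary(SUMMARY)
print(stats["loss_pct"], stats["avg_ms"])
```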

    from netstat info:

    Input histogram:

                    echo reply: 56

                    destination unreachable: 6175

                    time exceeded: 104

    Regarding the speedtest.net operation:

    When a speedtest.net “test” is run while capturing with Wireshark, you can see it open a TCP session to the target test server. The TCP session (in my test run) is set up between TCP ports 8080 and 52830 for the packet exchange between the local client and the destination server. The local client TCP source port changes depending upon the packets. There are also some UDP exchanges from time to time between the test server and the test client.

    82.83.133.40.in-addr.arpa name = charlotte02.speedtest.windstream.net. < Target Server

    Just prior to the session setup the local client, my MacBook Pro, repeatedly sends echo requests to 40.133.83.82. Over the course of the test there are 22 of these ICMP packets, but all fail.

    Result: All fail to reach the server. (no response found) ← Reason

    Curious behavior: the traceroute hits 192.0.0.1, and then there are 4 responses from that same address:

    traceroute to 40.133.83.82 (40.133.83.82), 64 hops max, 52 byte packets

    1  www.webgui.nokiawifi.com (192.168.12.1)  1.042 ms  0.393 ms  0.330 ms

    2  192.0.0.1 (192.0.0.1)  0.533 ms  0.560 ms  0.454 ms

    3  * 192.0.0.1 (192.0.0.1)  27.806 ms *

    4  * 192.0.0.1 (192.0.0.1)  42.273 ms  30.483 ms

    5  192.0.0.1 (192.0.0.1)  27.970 ms  36.486 ms  28.959 ms

    6  * * *

    7  * * *

    8  * * *

    9  * * *

    10  * 10.164.165.59 (10.164.165.59)  495.730 ms *

    Non-authoritative answer:

    10.164.165.59.in-addr.arpa name = 59.165.164.10.man-static.vsnl.net.in.
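
    One caution about the lookup above: in-addr.arpa names list the octets in reverse order, so the name 10.164.165.59.in-addr.arpa belongs to the address 59.165.164.10, not to the hop 10.164.165.59 (which is private address space). The vsnl.net.in name may therefore have nothing to do with the hop in the traceroute. A small Python sketch using the standard ipaddress module shows the difference; the helper name is made up for illustration.

```python
import ipaddress

# The hop that answered at position 10 in the traceroute.
hop = ipaddress.ip_address("10.164.165.59")

# The name a reverse (PTR) lookup for that hop would actually query.
print(hop.reverse_pointer)
print(hop.is_private)


def address_for_ptr_name(name):
    """Recover the IPv4 address that an in-addr.arpa name refers to."""
    suffix = ".in-addr.arpa"
    name = name.rstrip(".")
    if not name.endswith(suffix):
        raise ValueError("not an in-addr.arpa name")
    octets = name[: -len(suffix)].split(".")
    return ipaddress.ip_address(".".join(reversed(octets)))


# The name shown in the lookup result quoted above.
queried = address_for_ptr_name("10.164.165.59.in-addr.arpa")
print(queried)
```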

    So it appears the traffic goes out the gateway, but the traceroute still has issues even when run against port 8080 or 443. I am not sure where 192.0.0.1 is.

    Is anyone else seeing the trace route where 192.0.0.1 is the next hop after the gateway? 

    (00:50:b6:88:1a:f8 is the MAC address associated with the 192.0.0.1 IPv4 address from the Wireshark packet cap)
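
    On the question of what 192.0.0.1 is: the address falls inside 192.0.0.0/29, which IANA reserves as the IPv4 Service Continuity Prefix (RFC 7335) for IPv4-over-IPv6 transition mechanisms. That suggests, though nothing in this thread confirms it, that the hop is a translation or tunnel endpoint rather than a routable internet host. A quick Python check of the range, with a made-up helper name:

```python
import ipaddress

# Reserved by IANA as the IPv4 Service Continuity Prefix (RFC 7335).
SERVICE_CONTINUITY_PREFIX = ipaddress.ip_network("192.0.0.0/29")


def in_service_continuity_prefix(address):
    """Return True if the address lies inside 192.0.0.0/29."""
    return ipaddress.ip_address(address) in SERVICE_CONTINUITY_PREFIX


print(in_service_continuity_prefix("192.0.0.1"))     # hop 2 in the traceroute
print(in_service_continuity_prefix("192.168.12.1"))  # the gateway's LAN address
```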

    I am still picking the packet capture apart as there are a variety of odd issues but the speed test is completed regardless of the issues with the exchange of packets. 

     

  • Skull52
    Network Novice

    Yeah, I don’t hold much hope that it will. I think you are correct: something has broken ICMP on TMHI, but they won't admit it. The tech didn’t tell me which gateways were affected, just that the Arcadyan was not, which I don’t buy. I think it is all of them: the Arcadyan and the Sagemcom are both experiencing it, and I am sure the Nokia is too.

  • Mark_h_
    Newbie Caller

    I can tell you it probably won’t fix the ping. I have the Sagemcom and see the same thing: over 90% of pings fail. Something has broken ICMP on TMHI. I would expect them to fix it eventually, since way too many people and devices use ping to monitor connectivity. It is technically not a correct gauge, but it is often the only one available.

  • Skull52
    Network Novice

    Mark,

    You are correct: ping is the only way to detect a down condition with pfSense. This was working for 2 months until just a couple of days ago. I have the Arcadyan KVD21 gateway. I contacted support today and told them about the ridiculous latency, the 1 in 20 ping replies, and the 85%-100% packet loss, and that others were complaining about the same issue. The tech said they were aware of an issue but that the Arcadyan was not affected. They went through the same old process of re-provisioning the gateway, which of course didn’t fix it, so now they are sending me a replacement Arcadyan. We will see if that fixes it.

  • Mark_h_
    Newbie Caller

    It is correct that a ping is not a reliable gauge of connectivity, but it is often used. It seems to be what the Netgate firewall Skull52 has uses.

  • JMS
    Roaming Rookie

    IMO ping to an open internet host is not a reliable gauge of connectivity.

    I have been running Tailscale to another host (also behind TMHI at another house) to determine if we are both connected. “tailscale ping <host>” runs every 5 minutes and has been working very reliably. You might consider that or a similar VPN solution to decide when to fail over.
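
    A periodic check like this could be wrapped in a few lines of Python. This is only a sketch: it assumes the tailscale CLI is on the PATH and that “tailscale ping” exits with a non-zero status when the peer does not answer, which is worth verifying on your own install. The function names and the 30-second timeout are made up for illustration, and scheduling every 5 minutes is left to cron or a similar tool.

```python
import subprocess


def build_ping_command(host):
    """Command line for a Tailscale-level ping of one peer."""
    return ["tailscale", "ping", host]


def peer_reachable(host, timeout_s=30, run=subprocess.run):
    """Return True if the peer answered, False on failure, timeout, or missing CLI.

    `run` is injectable so the logic can be exercised without Tailscale installed.
    """
    try:
        result = run(
            build_ping_command(host),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    # Assumption: exit status 0 means at least one reply was received.
    return result.returncode == 0
```

    Calling peer_reachable("other-house") from a scheduled job, with "other-house" standing in for your own peer name, gives a yes/no signal that does not depend on ICMP replies from a public host.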

  • Mark_h_
    Newbie Caller

    I tried to find out how to set the Netgate to use a different port for its ping, but had no luck. Another user mentioned the same thing yesterday. It’s funny, because when I run a speedtest, the pings seem normal. I don’t know what’s different. Either the app or www.speedtest.net still sees a normal ping response.

  • Skull52
    Network Novice

    Yeah, that’s a problem. I am using a Netgate 4100 firewall with Starlink and T-Mobile in a failover configuration, with Starlink as primary and TMHI as secondary. If Starlink goes offline, which it does in heavy rain, the Netgate switches to TMHI until Starlink comes back online, then switches back. This worked well until a couple of days ago. The problem is that the Netgate uses ping latency and packet loss to detect an offline condition, and with the excessive latency and packet loss on TMHI it thinks TMHI is offline all the time and won’t switch, so there is no internet access when Starlink goes offline.
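
    To illustrate why threshold-based monitoring misfires here, below is a rough Python sketch of the kind of decision described above. It is not the Netgate's actual algorithm: the class, the function names, and the 20% loss and 500 ms latency thresholds are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ProbeStats:
    loss_pct: float        # percentage of probes that got no reply
    avg_latency_ms: float  # mean round-trip time of the replies


def gateway_is_down(stats, max_loss_pct=20.0, max_latency_ms=500.0):
    """Treat a gateway as down when either (hypothetical) threshold is exceeded."""
    return stats.loss_pct >= max_loss_pct or stats.avg_latency_ms >= max_latency_ms


def pick_gateway(primary, secondary):
    """Prefer the primary, fall back to the secondary, else report no usable link."""
    if not gateway_is_down(primary):
        return "primary"
    if not gateway_is_down(secondary):
        return "secondary"
    return "none"


# Starlink offline in heavy rain; TMHI figures taken from the ping run in this thread.
starlink_in_rain = ProbeStats(loss_pct=100.0, avg_latency_ms=0.0)
tmhi_icmp = ProbeStats(loss_pct=81.0, avg_latency_ms=110.618)

print(pick_gateway(starlink_in_rain, tmhi_icmp))
```

    With ICMP replies being dropped, the secondary looks dead even though ordinary traffic still passes, so the sketch reports no usable link, which matches the behavior described in this post.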

  • Mark_h_
    Newbie Caller

    The short answer is yes, others have seen it. There’s another thread on here about it: a few days ago, something broke ping on TMHI. For me, around 1 in 20 pings gets a reply, and today latency is over 200 ms (usually 40 ms).

    I’m running PingInfoView from NirSoft, which lets you set the port it uses. Web servers usually respond on port 80, DNS on port 53. This lets me do a periodic ping to check my connection. My connection has been more stable since they broke the ping, so there is a bright side.
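
    The same idea, checking a TCP port instead of relying on ICMP, can be sketched in Python with only the standard library. This is an illustration rather than a description of how PingInfoView works internally; the function name and the 2-second timeout are assumptions.

```python
import socket
import time


def tcp_probe(host, port, timeout_s=2.0):
    """Time a TCP connection to host:port.

    Returns the connect time in milliseconds, or None if the connection
    fails or times out.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return None
```

    For example, tcp_probe("www.speedtest.net", 80) times a connection to a web server, and running it on a schedule gives a connectivity check that is unaffected by dropped ICMP replies.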