Guide data fetch timeouts

Is this considered normal during guide data fetches?
It appears the server doesn't retry a failed 6-hour time block and just moves on to the next 6-hour time block.

These are from different servers, not all the same one.

2023/11/01 09:34:05.847693 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-CA54023-X/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/01 09:50:12.096406 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/02 09:48:06.068745 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-CA54023-X/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/05 08:18:11.472010 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/05 08:41:12.828762 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/13 09:43:05.648422 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-CA54023-X/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/16 09:21:11.565027 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/16 09:29:12.382790 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/18 09:24:13.063696 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/18 09:32:11.765297 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/21 09:10:11.217559 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/23 09:22:11.061433 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/23 09:39:11.015982 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/24 09:42:05.727279 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/11/27 09:50:23.999273 [DVR] Error fetching guide data: unexpected end of JSON input:
2023/12/07 09:06:11.701763 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/12/07 09:32:12.681568 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/12/09 09:16:15.627030 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/12/09 09:45:12.254985 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout
2023/12/09 10:11:05.352790 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": http2: timeout awaiting response headers
2023/12/20 15:29:27.914894 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": http2: timeout awaiting response headers
2023/12/20 15:29:40.269550 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": http2: timeout awaiting response headers

Are all of these servers using the same DNS server? The error dial tcp: lookup tmsdata.fancybits.co: i/o timeout means the DNS lookup for that hostname did not complete in time.

Yes, using Cloudflare DoH (DNS over HTTPS)
In each of those updates, all the other 6-hour time blocks (63 of 64) completed. The exception is today, when 2 of 64 time block fetches failed with http2: timeout awaiting response headers. Redownloading the guide was successful.

Huh. Sounds like you're running into some sporadic packet loss between you and Cloudflare.

Yes, it's correct that we don't retry failed requests. The same time block gets fetched again in the next day's update, and any missed changes will get picked up then.

What are the timeout values for these?
dial tcp: lookup tmsdata.fancybits.co: i/o timeout
http2: timeout awaiting response headers

Seems strange I could lose packets over an https connection.

The dial timeout (for a DNS response) is 5 seconds. The timeout for response headers is 10 seconds.

You're not losing packets over the HTTPS connection itself. The packet loss is happening to the underlying TCP packets, which causes the HTTPS connection to stall and, in these cases, time out.

I do not have a lot of knowledge of DoH, but I could see how DoH would perform worse than normal DNS lookups when there is packet loss. Tunneling all of the DNS requests over a single TCP (HTTPS) connection would be subject to head-of-line blocking, which could cause these stalls. Standard DNS lookups over UDP send independent packets to each nameserver, so any single dropped packet would not impact the others.

So, just to make sure I have this right:

http2: timeout awaiting response headers
Is a timeout waiting for your server tmsdata.fancybits.co to respond with guide data (nothing to do with Cloudflare DoH)

AND

dial tcp: lookup tmsdata.fancybits.co: i/o timeout
Is a timeout waiting for Cloudflare DoH to provide a DNS lookup

Yes, though this does happen to also be proxied through Cloudflare servers.

Correct.
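If you want to tally which kind of timeout you are hitting, the mapping between the two error messages and their causes can be sketched as a small Go helper. This is only an illustration that matches on the message text seen in the logs in this thread. The function name and category labels are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// classifyGuideError maps a DVR log line to the stage that timed out,
// based only on the message text quoted in this thread.
func classifyGuideError(line string) string {
	switch {
	case strings.Contains(line, "dial tcp: lookup") && strings.Contains(line, "i/o timeout"):
		return "dns lookup timeout" // the resolver (here, DoH via the router) did not answer in time
	case strings.Contains(line, "timeout awaiting response headers"):
		return "response header timeout" // the guide server did not answer in time
	default:
		return "other"
	}
}

func main() {
	lines := []string{
		`2023/12/09 09:45:12.254985 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": dial tcp: lookup tmsdata.fancybits.co: i/o timeout`,
		`2023/12/09 10:11:05.352790 [DVR] Error fetching guide data: Get "https://tmsdata.fancybits.co/v1.1/lineups/USA-DFLTE/grid?...": http2: timeout awaiting response headers`,
		`2023/11/27 09:50:23.999273 [DVR] Error fetching guide data: unexpected end of JSON input:`,
	}
	for _, l := range lines {
		fmt.Println(classifyGuideError(l))
	}
}
```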

I guess since it's pretty infrequent, I'm not going to worry about it unless it starts happening more frequently.

Thanks for the explanations.

https://1.1.1.1/help

I may have run into rate limiting if all my servers are fetching guide updates at the same time.

My router is set as the DNS server IP for all devices on my LAN, and the router uses Cloudflare DoH as its upstream DNS server. So my router is sending the DNS lookups for all devices on my LAN.

From Network operators · Cloudflare 1.1.1.1 docs
Rate Limiting
Operators using 1.1.1.1 for typical Internet-facing applications and/or users should not encounter any rate limiting for their users. In some rare cases, security scanning use-cases or proxied traffic may be rate limited to protect our infrastructure as well as upstream DNS infrastructure from potential abuse.

Best practices include:
Avoiding tunneling or proxying all queries from a single IP address at high rates. Distributing queries across multiple public IPs will improve this without impacting cache hit rates (caches are regional).

For running so many servers from a single location, perhaps you ought to look into using something like unbound as a local recursive and caching DNS server.

It's possible, but what are you using on your local network to do DoH?

It's built into my Synology router. You just have to enable either Cloudflare or Google.

I would expect that any good DNS server would have caching built in, so that even if all of your servers are fetching at the same time, they shouldn't be making one-to-one requests to Cloudflare. Unfortunately, in my cursory digging I can't find out which DNS server software Synology actually uses on the router, so I can't dig into it any deeper.

After investigating further, I found that we were already retrying in certain situations, and that it was easy to add these timeouts (and a few similar errors) to the cases that we will retry.


I would assume the router uses https://cloudflare-dns.com/dns-query
The UI allows these choices

  • Enable DoH (DNS over HTTPS): DoH ensures that DNS queries are sent over an encrypted connection for increased security and privacy. Tick this option and select Cloudflare or Google in DoH server URL. You can also enter the URL of your preferred DoH server into the field.

Oh, you mean which DNS resolver software the router runs. I don't know.

Thanks. Just updated. Will post here if I see those timeouts again, or if it doesn't show it retried.

Just a follow-up to say it appears fixed.
Log entries from 4 of my servers show retries are being used.

Thanks

DVR Server 1
2023/12/30 16:18:56.883124 [WRN] Retrying guide data request in 20s due to timeout
2023/12/30 16:19:48.083440 [WRN] Retrying guide data request in 20s due to timeout
2023/12/30 16:24:12.161682 [WRN] Retrying guide data request in 20s due to timeout
2023/12/30 16:27:45.056822 [WRN] Retrying guide data request in 20s due to timeout

DVR Server 2
2023/12/31 09:54:09.433305 [WRN] Retrying guide data request in 20s due to timeout
2024/01/03 17:14:25.786528 [WRN] Retrying guide data request in 20s due to timeout
2024/01/03 17:14:56.683530 [WRN] Retrying guide data request in 20s due to timeout
2024/01/03 17:24:46.956927 [WRN] Retrying guide data request in 20s due to timeout
2024/01/03 17:25:32.302735 [WRN] Retrying guide data request in 20s due to timeout
2024/01/03 17:29:26.732777 [WRN] Retrying guide data request in 20s due to timeout
2024/01/03 17:30:32.500876 [WRN] Retrying guide data request in 20s due to timeout

DVR Server 3
2024/01/04 09:15:05.243650 [WRN] Retrying guide data request in 20s due to timeout

DVR Server 4
2024/01/04 09:51:05.858352 [WRN] Retrying guide data request in 20s due to timeout
