Very aggressive retry policy

If a source closes the connection on you, it will be logged as:

```
Closed connection to ###
Job ### ended prematurely
```

Right now there doesn't seem to be any limit on the retry policy. Each retry happens instantly, so if the connection is closed a billion times, the result is a billion new connection attempts.

This seems extreme. Something more reasonable might be 10 immediate retries, followed by an exponential backoff strategy, as sketched below.

From Google:

An exponential backoff retry policy is a technique that increases the wait time between retry attempts exponentially until a maximum number of retries is reached. It's a more sophisticated strategy than constant retries, which can strain a service or network.
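
Something like this is what I'm imagining: a handful of instantaneous retries, then a doubling delay with jitter up to a cap. All the names and numbers below are invented for the sketch, not taken from the app:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	immediateRetries = 10              // instantaneous retries before backing off
	maxBackoff       = 5 * time.Minute // ceiling on the wait between attempts
)

// retryDelay returns how long to wait before the given retry attempt:
// zero for the first few, then a doubling delay with jitter, capped.
func retryDelay(attempt int) time.Duration {
	if attempt < immediateRetries {
		return 0
	}
	shift := attempt - immediateRetries
	if shift > 20 {
		shift = 20 // clamp the shift so the duration cannot overflow
	}
	d := time.Second << uint(shift)
	if d > maxBackoff {
		d = maxBackoff
	}
	// Jitter spreads out reconnects so many jobs don't retry in lockstep.
	return d + time.Duration(rand.Int63n(int64(d)/2+1))
}

func main() {
	for attempt := 0; attempt < 15; attempt++ {
		fmt.Printf("attempt %2d: wait %v\n", attempt, retryDelay(attempt))
	}
}
```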

There is already exponential backoff.

Please submit diagnostics so we can examine the logs.

I think an edge case must be missing in the check. What I observe in the logs is 400 connections over the course of 4 minutes. In each case, Closed connection is registered, followed by Job ended prematurely.

There is no delay or backoff; the reconnect initiates immediately. Presumably you are not counting it as a failure because the connection to the remote source succeeds. Then the remote source closes on you, so you retry and keep going.
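
To be concrete about the edge case I suspect, here's a hypothetical reconstruction of the reconnect loop; connectToSource, runJob, and backoff are made-up names, not the actual code:

```go
package sketch

import (
	"errors"
	"net"
	"time"
)

// Hypothetical reconnect loop: backoff only fires when the connect
// itself fails, so a source that accepts and then immediately closes
// never triggers any delay.
func streamLoop(url string) {
	consecFailures := 0
	for {
		conn, err := connectToSource(url)
		if err != nil {
			consecFailures++
			time.Sleep(backoff(consecFailures)) // delay only on connect errors
			continue
		}
		consecFailures = 0 // connect succeeded, so the failure count resets...
		runJob(conn)       // ...then the source closes mid-job,
		// and the loop reconnects instantly with no backoff.
	}
}

// Stubs so the sketch compiles; the real implementations don't matter here.
func connectToSource(url string) (net.Conn, error) { return nil, errors.New("stub") }
func runJob(conn net.Conn)                         {}

func backoff(n int) time.Duration {
	if n > 8 {
		n = 8 // clamp so the shift below cannot overflow
	}
	return time.Second << uint(n)
}
```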

Can you add a counter to the job that tracks the number of times it has ended prematurely? If it ends prematurely more than X times, give up. A job that ends prematurely more than X times is unlikely to produce a good mpg file anyway.
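
Roughly like this; the threshold and the struct are placeholders for whatever the job actually looks like:

```go
package sketch

const maxPrematureEnds = 20 // the "X" above; tune to taste

type job struct {
	prematureEnds int // times the source closed the connection mid-job
}

// recordPrematureEnd bumps the counter and reports whether another
// reconnect attempt is still worthwhile.
func (j *job) recordPrematureEnd() bool {
	j.prematureEnds++
	return j.prematureEnds <= maxPrematureEnds
}
```

The reconnect loop would call recordPrematureEnd each time Job ended prematurely is logged, and stop the job once it returns false instead of reconnecting forever.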

@Barabbas I don't see the diagnostics from you. What was the ID that it gave you when you submitted them?
