Channels DVR stops responding randomly on macOS

I guess the Censys requests were a red herring and that it was just a coincidence they happened to send requests right before channels' web service last hung. From the tcpdump and channels http logs, it looks like channels was probed again several times by Censys over the last few days, but no corresponding hang occurred.

Okay, I guess I'm back to just waiting for the next hang and seeing if I can learn anything new when it happens.

We have some leads from the last diagnostic, but another will help narrow down the issue.

Alright, it happened again this morning at 4:02am PDT. I submitted logs: 5c2d196f-6a37-4919-961f-e2eb93740814 -- same curl operation timed out after 25 seconds when querying http://127.0.0.1:8089/status -- restarted with a kill -3.

I’ve been having the same issue for the past couple of months on my Intel Mac mini. The Channels server becomes unresponsive every few days and requires a restart, though recordings still happen during this time despite the web server/clients not working. Restarting makes it work again, but it keeps freezing every few days, which is particularly annoying when I want to access it remotely. I contacted support and provided logs from when this happened, but haven’t heard back in a while…

@thully as a hacky workaround on your Intel Mac mini, just until the channels guys sort this out, you could run something like this in an open terminal window:

while true; do /usr/bin/nc -z 127.0.0.1 8089 || /usr/bin/pkill '[c]hannels-dvr'; sleep 60; done

It will check every minute to make sure channels is responding, and if it's not, it'll kill it, triggering channels to restart. Several of us have done this, but with launchd so we don't need a terminal window always open.

Latest prerelease has a fix which may be related to this issue.

Interesting. Only two minutes? Some of my prior hangs definitely outlasted two minutes, though. I'll test it out. Maybe I'll update my health check to not restart until after 3 minutes, just to see if it catches any.
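For anyone who wants the same kind of delayed restart, here is a rough sketch of a health check that only restarts after several consecutive failed probes. It is not anything official: the function names, `MAX_FAILS`, and `INTERVAL` are made up for illustration, and it assumes /status returns a normal success response when the server is healthy. It probes with curl rather than nc, since the hangs reported in this thread showed up as curl timing out on /status.

```shell
#!/bin/sh
# Hypothetical watchdog sketch; all names and defaults here are illustrative.
STATUS_URL="${STATUS_URL:-http://127.0.0.1:8089/status}"
MAX_FAILS="${MAX_FAILS:-3}"   # consecutive failed probes before restarting
INTERVAL="${INTERVAL:-60}"    # seconds to sleep between probes

# Succeeds only if the server answers with a non-error HTTP status within 25 seconds.
probe() {
  /usr/bin/curl --silent --fail --max-time 25 --output /dev/null "$STATUS_URL"
}

# Prints the new failure count: 0 after a success, previous count + 1 after a failure.
next_fail_count() {
  if probe; then
    echo 0
  else
    echo $(( $1 + 1 ))
  fi
}

# Loops forever; restarts the DVR once MAX_FAILS probes in a row have failed.
watchdog() {
  fails=0
  while true; do
    fails=$(next_fail_count "$fails")
    if [ "$fails" -ge "$MAX_FAILS" ]; then
      # SIGQUIT appears to make the DVR record stack traces before it exits.
      /usr/bin/pkill -QUIT '[c]hannels-dvr'
      fails=0
    fi
    sleep "$INTERVAL"
  done
}
```

Calling `watchdog` at the end of the script starts the loop. With the defaults above, a restart happens only after roughly three minutes of continuous failures, so a stall that clears within two minutes is left alone.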

It's possible it would get repeatedly stuck for two minutes at a time. We are not fully certain. The channels-dvr process would have shown 100% cpu usage in the Activity Monitor at the time if this was the same issue.

I’ve seen the web server stay non-responsive, both intermittently and constantly, for much longer than 2 minutes. Hung. In fact, I’ve had it in that state for multiple days. At least in my case, restarting the DVR server was the only way to regain normal function. On a brighter note, updating my server to the M1 (or latest?) Channels DVR build to match my M1 Mac seemed to solve my issue. So maybe the Channels auto-updater wasn’t working, or a manual download/installation of a special and/or latest build of Channels DVR server was needed. I still haven't had that clarified.

This seems to imply that previously you were running the x86_64 build on arm64 hardware; but when you "updated" to the arm64 build the issues went away?

(Perhaps this is a hardware/golang issue, and has nothing to do with Channels, per se. …)

It does seem that this is still happening with the recent update - away from home and can’t reach the server now after it’s been up for a few days. My DVR server is an x86 Mac mini, though I won’t be home to reset it for a couple weeks.

My DVR server updated to 2022.08.04 and restarted itself, and now I can access it remotely again. Is there any logging information that would help? Or has this issue been fixed in the new version?

Submitting diagnostics could give us some insights into what happened.

Had another extended http service unresponsive event this afternoon. Logs have been submitted as 4d6d75dd-9ba6-40ea-9ac7-2f196e1c3d17 -- It lasted about 55 minutes before I restarted it. The logs received no entries during the outage. All connections to the server timed out, and all clients gave up. But, it did continue recording in the background during the outage.

I restarted the service with a kill -QUIT, so stack traces on all threads appear to have been recorded, if that's helpful.

UPDATE: I'm actually not sure how long the outage was -- it may have been up to 13h 50min. I'd unfortunately disabled my monitoring "cron" job. But I did have a script that would curl /dvr/jobs every 10 seconds or so, and it stopped appearing in channels http logs around 1am.

Mine went unresponsive again and hasn't resumed yet - is there any logging I can get before I restart? When this happens it typically stays unresponsive until I restart it or it auto-updates, but recordings continue to happen. Obviously I can't use the web interface, though I'm not sure if there's a command I can run.

I ran /usr/bin/pkill -QUIT '[c]hannels-dvr' to cause the restart. @tmm said that this records additional information -- it seems to record stack traces before shutdown. Channels seems to always be recording verbose logs, so the main thing is submitting them after you restart from the WebUI under Support > Troubleshooting, which you probably already knew and have been doing. @tmm also mentioned running curl -v http://127.0.0.1:8089/status before restarting. The output of that, in my case, just shows that curl received no data until it timed out.

In the next build (v2022.08.16.2108), I've added a second http listener on port 58089 that we can use for debugging.

It would be interesting to know if both ports stop working at the same time, or if this new port keeps working when the old one dies. The new port only listens on localhost.

So please upgrade to this build when it comes out, and next time you experience the stall see if this command still works:

curl http://127.0.0.1:58089/status

and if so, you could use this to restart the DVR:

curl -XPUT http://127.0.0.1:58089/updater/force/restart

Based on these findings, perhaps we can make the DVR detect when this is happening and restart itself.
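A small sketch of how the two-port comparison could be scripted, for anyone who wants to leave it running. The function names are hypothetical and the 10-second timeout is an arbitrary choice; only the two ports and the /status and /updater/force/restart paths come from the posts above.

```shell
#!/bin/sh
# Hypothetical sketch: report which listener still answers during a stall.

# Succeeds if the given local port returns any HTTP response within 10 seconds.
port_ok() {
  curl --silent --max-time 10 --output /dev/null "http://127.0.0.1:$1/status"
}

# Prints a one-line summary such as "main=down debug=up".
classify() {
  main=down
  debug=down
  if port_ok 8089; then main=up; fi
  if port_ok 58089; then debug=up; fi
  echo "main=$main debug=$debug"
}

# Logs the state with a timestamp, and restarts through the debug port only
# when the main port is stalled but the debug port still answers.
check_and_maybe_restart() {
  state=$(classify)
  echo "$(date) $state"
  if [ "$state" = "main=down debug=up" ]; then
    curl --silent --max-time 10 -XPUT "http://127.0.0.1:58089/updater/force/restart"
  fi
}
```

Running `check_and_maybe_restart` from a periodic job leaves a timestamped record of whether both ports died together, which is the question asked above.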

Update: Even with the latest Channels DVR server update, the web server end still freezes occasionally. Not a big deal, since my solution of killing (quit/relaunch) the server works like a charm. Nevertheless, it's annoying that this issue has not been officially fixed yet. Until then, kill it!!!

Okay, it appeared to be 711 minutes in the diagnostics, so it's helpful to know it wasn't just 55 minutes like you had mentioned.

I have a few more requests for your debug script:

find the pid of the channels-dvr process and run lsof -nPp <pid>

also grab the output of netstat -an before restarting

These as well:

netstat -nt | grep 8089
netstat -anL
netstat -s

cc @sejmann
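In case it helps anyone collecting these, a rough sketch that bundles the requested commands into one step to run before restarting. The output directory, file names, and the `capture` helper are assumptions for illustration; the commands themselves are the ones listed above, plus the curl -v check mentioned earlier in the thread.

```shell
#!/bin/sh
# Hypothetical sketch: save the requested diagnostics to files before a restart.
OUT_DIR="${OUT_DIR:-/tmp/channels-debug-$(date +%Y%m%d-%H%M%S)}"

# Runs a command and saves its stdout and stderr to $OUT_DIR/<name>.txt.
# A failing command does not abort the rest of the collection.
capture() {
  name=$1
  shift
  mkdir -p "$OUT_DIR"
  "$@" > "$OUT_DIR/$name.txt" 2>&1 || true
}

collect_diagnostics() {
  # First matching pid of the channels-dvr process, if any.
  pid=$(pgrep 'channels-dvr' | head -n 1)
  if [ -n "$pid" ]; then
    capture lsof lsof -nPp "$pid"
  fi
  capture netstat-an netstat -an
  capture netstat-8089 sh -c 'netstat -nt | grep 8089'
  capture netstat-anL netstat -anL
  capture netstat-s netstat -s
  # Same check used earlier in the thread; gives up after 25 seconds.
  capture curl-status curl -v --max-time 25 http://127.0.0.1:8089/status
  echo "Diagnostics saved in $OUT_DIR"
}
```

Call `collect_diagnostics` first, then restart the DVR as usual and attach the saved files along with the submitted logs.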

This is... probably unrelated, but I've been experiencing hangs from time to time on macOS. Today I had to force kill the process after Channels hung.

The logs don't show anything relevant to the crash, but it was associated w/ adding a show with 100+ episodes into a Virtual Channel, which seems to have maxed out the RAM on my machine. In Activity Monitor, it showed channels-dvr as using 425 threads and I think most of those were ffmpeg processes.

Update: the only error associated w/ the crash today was this line:

2022/08/21 11:36:21.298335 [TNR] Cancelling stream TVE-YouTubeTV ch6053 after no data was received for 2m0s

I'm wondering if the show that was added to the virtual channel, because it was on an SMB share, overloaded my network connection and caused recording and indexing to both hang, and that somehow triggered a loop that took up all the RAM.