I have a TSelector analysis that runs fine on PROOF-Lite, and I have been trying to run it on a PoD (PROOF on Demand) cluster with a Condor backend. Execution starts successfully and about 30-40% of my entries are processed, but then every worker starts throwing the following errors:
16:30:57 17606 Wrk-0.22 | Error in TXSocket::PickUpReady: error waiting at semaphore
16:30:57 17606 Wrk-0.22 | Error in TXProofServ::GetNextPacket: Recv() failed, returned -1
16:37:50 17606 Wrk-0.22 | Error in TXProofServ::HandleSocketInput: unknown command 1011
What could be causing this? I am a little suspicious because the errors appear roughly 30 minutes into the execution, which is the default time after which PoD is supposed to shut down idle workers. However, these workers should not be idle while they are processing packets, and pod-info -n still appears to show plenty of workers up.