If the central manager crashes, jobs that are already running will continue to run unaffected. Queued jobs will remain in the queue unharmed, but cannot begin running until the central manager is restarted and begins matchmaking again. Nothing special needs to be done after the central manager is brought back online.
Depending on how your policy is set up, Condor will track any tty on the machine for the purpose of determining if a job is to be vacated or suspended on the machine. It could be the case that after you ssh to the machine, Condor notices activity on the tty allocated to your connection and then vacates the job.
One likely error message within the collector log of the form
DaemonCore: PERMISSION DENIED to host <xxx.xxx.xxx.xxx> for command 0 (UPDATE_STARTD_AD)
indicates a permissions problem. The condor_startd daemons do not have write permission to the condor_collector daemon. This could be because you used domain names in your HOSTALLOW_WRITE and/or HOSTDENY_WRITE configuration macros, but the domain name server (DNS) is not properly configured at your site. Without the proper configuration, Condor cannot resolve the IP addresses of your machines into fully-qualified domain names (an inverse lookup). If this is the problem, then the solution takes one of two forms: fix the DNS so that inverse lookups work for your machines, or use numeric IP addresses in the macros instead of domain names. For example, if the configuration macro is set as
HOSTALLOW_WRITE = *.your.domain.com
and this does not work, use
HOSTALLOW_WRITE = 192.131.133.*, 192.131.132.*
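Before switching to numeric addresses, it can help to confirm that the inverse lookup really is what fails. The following Python sketch is one way to check this from the central manager. The function name reverse_lookup and its resolver parameter are illustrative and not part of Condor; the only real call used is the standard library's socket.gethostbyaddr.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the host name reported for ip, or None if the inverse lookup fails.

    resolver is a hypothetical hook so the function can be exercised without
    a live DNS server; by default it is the standard socket.gethostbyaddr.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return hostname
```

Calling reverse_lookup with the address of one of your execute machines (for instance an address on the 192.131.133.* subnet above) should return a fully-qualified name that matches your HOSTALLOW_WRITE setting. A result of None, or a short unqualified name, suggests the DNS problem described above.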
Alternatively, this permissions problem may be caused by being too restrictive in the setting of your HOSTALLOW_WRITE and/or HOSTDENY_WRITE configuration macros. If it is, then the solution is to change the macros, for example from
HOSTALLOW_WRITE = condor.your.domain.com
to
HOSTALLOW_WRITE = *.your.domain.com
or possibly
HOSTALLOW_WRITE = condor.your.domain.com, foo.your.domain.com, \
    bar.your.domain.com
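To see why the single-host setting is too restrictive, the effect of the three settings can be approximated in a few lines of Python. This is only a sketch: it uses shell-style globbing from the standard fnmatch module as a stand-in for Condor's wildcard handling, which it does not reproduce exactly, and the helper names parse_host_list and is_allowed are invented for the illustration.

```python
from fnmatch import fnmatch


def parse_host_list(value):
    """Split a comma-separated macro value into individual patterns."""
    return [part.strip() for part in value.split(",") if part.strip()]


def is_allowed(host, patterns):
    """Return True if host matches any pattern (approximation, not Condor's code)."""
    return any(fnmatch(host.lower(), pattern.lower()) for pattern in patterns)


restrictive = parse_host_list("condor.your.domain.com")
wildcard = parse_host_list("*.your.domain.com")

# An execute machine other than the central manager is rejected by the
# single-host setting but accepted by the wildcard setting.
print(is_allowed("foo.your.domain.com", restrictive))
print(is_allowed("foo.your.domain.com", wildcard))
```

Under these assumptions, the first call reports that foo.your.domain.com is not covered and the second reports that it is, which mirrors the denial the collector logs for a startd on that machine.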
Another likely error message within the collector log of the form
DaemonCore: PERMISSION DENIED to host <xxx.xxx.xxx.xxx> for command 5 (QUERY_STARTD_ADS)
indicates a similar problem as above, but read permission is the problem (as opposed to write permission). Use the solutions given above, applied to the HOSTALLOW_READ and/or HOSTDENY_READ configuration macros.
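When the collector log is long, a small script can pull out the denied hosts and tell the two cases apart. The sketch below assumes the log lines contain the message in exactly the form quoted above, possibly preceded by a timestamp. The names find_denials and needed_permission are illustrative, and only the two commands discussed here are mapped to a permission level.

```python
import re

# Matches the message format quoted in the text; any prefix is ignored.
DENIED = re.compile(
    r"PERMISSION DENIED to host <(?P<host>[^>]+)> "
    r"for command (?P<number>\d+) \((?P<name>\w+)\)"
)


def find_denials(lines):
    """Yield (host, command_number, command_name) for each denial message."""
    for line in lines:
        match = DENIED.search(line)
        if match:
            yield match.group("host"), int(match.group("number")), match.group("name")


def needed_permission(command_name):
    """Map the two commands from the text to the permission they require."""
    known = {"UPDATE_STARTD_AD": "write", "QUERY_STARTD_ADS": "read"}
    return known.get(command_name)  # None for commands not covered here
```

Feeding the open log file to find_denials and grouping the results by needed_permission shows which hosts need to be added to the write list and which to the read list.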