oss-sec mailing list archives

Re: backtrace_symbols() misuse by Ceph and its supposedly-safe use

From: Jacob Bachmeyer <jcb62281 () gmail com>
Date: Fri, 12 Jul 2024 18:15:07 -0500

Alexander Patrakov wrote:

[...]
What would be a good solution (as in: something that does not convert
crashes into deadlocks) here? I understand that, after memory
corruption, we are already in the UB territory, but is there anything
better possible than what is implemented?

I would suggest a monitor daemon that runs GDB to get the backtrace. The
simplest way to do this would require Ceph to have its own supervisor
(not unique; PostgreSQL has long had a "postmaster" process that manages
the worker "postgres" backend processes) and provide each daemon with a
pipe back to the supervisor; the fatal error handler need only write(2)
to the pipe from a static string and/or fixed buffer (to report a signal
number) and then enter an infinite loop; the supervisor then kills the
crashed process, possibly after attaching GDB and collecting a
backtrace.
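As a rough illustration of the daemon side of that arrangement, here is a
minimal sketch in C. It is not Ceph code: the names report_fd,
format_report, fatal_handler and install_fatal_handler, the message text,
and the choice of signals are all assumptions made for the example. The
handler restricts itself to write(2) and pause(2), both of which POSIX
lists as async-signal-safe, and formats the signal number by hand into a
fixed buffer instead of calling printf.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical: write end of a pipe inherited from the supervisor. */
static volatile sig_atomic_t report_fd = -1;

/* Build "fatal signal N\n" in buf without stdio or malloc.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t format_report(int signo, char *buf, size_t cap)
{
    static const char prefix[] = "fatal signal ";
    char digits[12];
    size_t n = 0, d = 0;
    unsigned v = signo < 0 ? 0u : (unsigned)signo;

    /* Collect decimal digits, least significant first. */
    do {
        digits[d++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    if (cap < (sizeof(prefix) - 1) + d + 1)
        return 0;
    for (size_t i = 0; i < sizeof(prefix) - 1; i++)
        buf[n++] = prefix[i];
    while (d > 0)
        buf[n++] = digits[--d];
    buf[n++] = '\n';
    return n;
}

/* Fatal error handler: report through the pipe, then wait to be killed. */
static void fatal_handler(int signo)
{
    static char buf[32];    /* fixed buffer, no allocation */
    size_t len = format_report(signo, buf, sizeof buf);

    if (report_fd >= 0 && len > 0) {
        ssize_t r = write(report_fd, buf, len);
        (void)r;            /* nothing useful can be done on failure */
    }
    for (;;)
        pause();            /* the supervisor is expected to kill us */
}

/* Install the handler for an assumed set of fatal signals.
 * Returns 0 on success, -1 if sigaction() fails. */
static int install_fatal_handler(int fd)
{
    static const int signals[] = { SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGABRT };
    struct sigaction sa;

    report_fd = fd;
    sa.sa_handler = fatal_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    for (size_t i = 0; i < sizeof signals / sizeof signals[0]; i++)
        if (sigaction(signals[i], &sa, NULL) != 0)
            return -1;
    return 0;
}
```

A daemon would call install_fatal_handler() early in startup with the
descriptor it inherited from the supervisor. Everything else (reading
the report, attaching GDB, killing the process) stays in the supervisor,
which is still in a well-defined state.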

Alternately, simply run the Ceph daemons with `ulimit -c` nonzero and
collect the core files. The core files can be analyzed using GDB after
the fact. No dedicated supervisor needed here, only kernel facilities.
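If changing the shell environment that launches the daemons is
inconvenient, the same limit can be raised from inside the process at
startup. The following is only a sketch of that variant, not something
Ceph is known to do; enable_core_dumps is a hypothetical helper name.
It raises the soft RLIMIT_CORE limit to the hard limit with
getrlimit(2)/setrlimit(2). Whether and where a core file is then
actually written still depends on system configuration (on Linux, for
example, the kernel's core_pattern setting).

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit, roughly the
 * in-process counterpart of `ulimit -c` in the launching shell.
 * Returns 0 on success, -1 on failure. */
static int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;   /* cannot exceed the hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit is itself zero, this leaves core dumps disabled, and
the limit has to be raised by whatever starts the daemon instead.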

The central problem here, as I understand it, is trying to do too much
in a process that has gone into undefined behavior. Attaching GDB or
dumping a core file both sidestep that problem.



-- Jacob
