glibc-2.11 revealed the following bug in init_afs(): The assignment
mmd->afs_pid = fork();
results in undefined behaviour because fork() returns twice and mmd->afs_pid lives
in a shared memory area, so both the parent and the child store their return value to
the same location. Depending on which process performs its store last, mmd->afs_pid
ends up being either zero or the pid of the afs child process.
With glibc-2.11, mmd->afs_pid seems to always end up being zero, which has rather
strange consequences:
First, it causes para_server to attempt to kill process 0 instead of the afs process on
exit. This fails because para_server never runs as root. However, it may result in dirty
osl tables as the afs process might access mmd after the shared memory area containing
mmd has already been destroyed.
Second, para_server fails to notice the death of the afs process, which is really bad and
may cause tons of error messages to be written to the log.
Fix this bug by temporarily storing the afs pid in a local variable and setting mmd->afs_pid
only in the server (parent) process.
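
The pattern used by the fix can be sketched in isolation as follows. This is only
an illustration, not code from para_server: struct shared_state, alloc_shared_state()
and spawn_and_record() are hypothetical names, and an anonymous shared mapping
stands in for the shared memory area that contains mmd.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the shared mmd structure. */
struct shared_state {
	pid_t child_pid;
};

/* Map one struct shared_state that stays shared across fork(). */
static struct shared_state *alloc_shared_state(void)
{
	void *p = mmap(NULL, sizeof(struct shared_state),
		PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Fork a child and record its pid in shared memory.
 *
 * Returns the child pid in the parent, or -1 if fork() failed. The
 * child exits immediately and never writes to st->child_pid.
 */
static pid_t spawn_and_record(struct shared_state *st)
{
	pid_t pid;

	assert(st != NULL);
	pid = fork(); /* local variable: each process has its own copy */
	if (pid < 0)
		return -1;
	if (pid == 0) /* child */
		_exit(EXIT_SUCCESS);
	st->child_pid = pid; /* parent only: no competing store */
	return pid;
}
```

Because the return value of fork() first lands in a local variable, the child's zero
never reaches the shared area, and the value the parent later reads from
st->child_pid no longer depends on scheduling order.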
 static int init_afs(void)
 {
 	int ret, afs_server_socket[2];
+	pid_t afs_pid;
 	ret = socketpair(PF_UNIX, SOCK_DGRAM, 0, afs_server_socket);
 	if (ret < 0)
 		exit(EXIT_FAILURE);
 	afs_socket_cookie = para_random((uint32_t)-1);
-	mmd->afs_pid = fork();
-	if (mmd->afs_pid < 0)
+	afs_pid = fork();
+	if (afs_pid < 0)
 		exit(EXIT_FAILURE);
-	if (!mmd->afs_pid) { /* child (afs) */
+	if (afs_pid == 0) { /* child (afs) */
 		close(afs_server_socket[0]);
 		afs_init(afs_socket_cookie, afs_server_socket[1]);
 	}
+	mmd->afs_pid = afs_pid;
 	close(afs_server_socket[1]);
 	ret = mark_fd_nonblocking(afs_server_socket[0]);
 	if (ret < 0)