SYS_CHECKPOINT(2)	 DragonFly System Calls Manual	     SYS_CHECKPOINT(2)

NAME

sys_checkpoint -- checkpoint or restore a process

LIBRARY

Standard C Library (libc, -lc)

SYNOPSIS

    #include <sys/types.h>
    #include <sys/checkpoint.h>

    int
    sys_checkpoint(int type, int fd, pid_t pid, int retval);

DESCRIPTION

The sys_checkpoint() system call executes a checkpoint function as specified by type. Supported types are as follows:

CKPT_FREEZE
    Generate a checkpoint file. Currently pid must be -1 or the pid of the current process. The checkpoint file will be written out to fd, and retval is unused but must be specified as -1. As a special case, if pid and fd are both specified as -1, the system will generate a checkpoint file using the system checkpoint template.

    This function returns 0 on success, -1 on error, and typically 1 on resume. The value returned on resume is controlled by the retval argument passed to sys_checkpoint() when resuming a checkpoint file. A user program which installs its own SIGCKPT signal handler and calls sys_checkpoint() manually thus has control over both termination/continuance and resumption.

CKPT_THAW
    Restore a checkpointed program. The pid must be specified as -1, and fd represents the checkpoint file. The retval specifies the value returned to the resumed program if sys_checkpoint() was called directly.

    The checkpointed program will replace the current program, similar to an exec(3) call.
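The CKPT_THAW description above can be sketched as a small helper. This is only an illustration under stated assumptions: restore_checkpoint() is a hypothetical name, opening the checkpoint file read-only is assumed to be sufficient, and the __DragonFly__ guard with an ENOSYS fallback is added so the sketch also compiles on systems without sys_checkpoint().

```c
#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifdef __DragonFly__
#include <sys/checkpoint.h>
#endif

/*
 * Resume the program stored in the checkpoint file at path.
 * On success this does not return: the checkpointed program
 * replaces the caller.  Returns -1 with errno set on failure.
 * restore_checkpoint() is a hypothetical helper name.
 */
int
restore_checkpoint(const char *path, int resume_value)
{
	int fd, ret, saved;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
#ifdef __DragonFly__
	/*
	 * pid must be -1 for CKPT_THAW.  resume_value is what the
	 * resumed program sees as the return of its own call.
	 */
	ret = sys_checkpoint(CKPT_THAW, fd, -1, resume_value);
#else
	/* Assumption: sys_checkpoint() is unavailable elsewhere. */
	(void)resume_value;
	errno = ENOSYS;
	ret = -1;
#endif
	saved = errno;		/* keep the error across close() */
	close(fd);
	errno = saved;
	return (ret);
}
```

A caller would typically pass 1 as resume_value, matching the "typically 1 on resume" convention described above, and treat any return from the helper as a failure.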

RETURN VALUES

Upon successful completion, the value 0 is typically returned. A checkpoint being resumed typically returns a positive value. Otherwise the value -1 is returned and the global variable errno is set to indicate the error.
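Because the same call site sees three different kinds of return value, it can help to name them. The following is a minimal sketch, not part of the system interface: the enum and classify_ckpt_return() are hypothetical names introduced for illustration.

```c
/* Hypothetical names for the three cases described above. */
enum ckpt_outcome {
	OUTCOME_ERROR,		/* -1: errno describes the failure */
	OUTCOME_SAVED,		/* 0: checkpoint written, run continues */
	OUTCOME_RESUMED		/* > 0: restarted from a checkpoint file */
};

/* Map a sys_checkpoint() return value to one of the cases. */
enum ckpt_outcome
classify_ckpt_return(int ret)
{
	if (ret < 0)
		return (OUTCOME_ERROR);
	if (ret == 0)
		return (OUTCOME_SAVED);
	return (OUTCOME_RESUMED);
}
```

Treating every positive value as a resume, rather than testing for exactly 1, allows for a restorer that passes a different retval.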

EXAMPLE

    /*
     * Demonstrate checkpointing.  Use control-E to checkpoint
     * the program and 'checkpt -r x.ckpt' to resume it.
     */
    #include <sys/types.h>
    #include <sys/signal.h>
    #include <sys/checkpoint.h>
    #include <stdio.h>
    #include <stdlib.h>     /* exit() */
    #include <string.h>     /* strerror() */
    #include <unistd.h>
    #include <fcntl.h>
    #include <errno.h>

    void docheckpoint(void);

    /* written by the signal handler, read by the main loop */
    volatile sig_atomic_t wantckpt;

    void
    dockpt(int sig)
    {
        wantckpt = 1;
    }

    int
    main(int argc, char** argv)
    {
        int i = 0;

        signal(SIGCKPT, dockpt);

        for (;;) {
            printf("iteration: %d\n", i);
            ++i;
            sleep(1);
            if (wantckpt) {
                wantckpt = 0;
                printf("Checkpoint requested\n");
                docheckpoint();
            }
        }
        return(0);
    }

    void
    docheckpoint(void)
    {
        int ret;
        int fd;

        fd = open("x.ckpt", O_RDWR|O_CREAT|O_TRUNC, 0666);
        if (fd < 0) {
            printf("unable to create checkpoint file: %s\n",
                strerror(errno));
            return;
        }

        ret = sys_checkpoint(CKPT_FREEZE, fd, -1, -1);
        if (ret < 0) {
            printf("unable to checkpoint: %s\n",
                strerror(errno));
        } else if (ret == 0) {
            printf("checkpoint successful, continuing\n");
        } else if (ret == 1) {
            printf("resuming from checkpoint.\n");
        } else {
            printf("unknown return value %d from sys_checkpoint\n", ret);
            exit(1);
        }
        /* note that the file descriptor is still valid on a resume */
        close(fd);
    }

ERRORS

[EBADF]     The given fd is not a valid regular file, socket descriptor, or pipe. Note that not all systems necessarily support checkpointing to sockets and pipes.

[EPERM]     The caller does not have permission to issue the checkpoint command. Checkpointing may be restricted or disabled using sysctls.

[EIO]       An I/O error occurred while reading from the file system.

[EINVAL]    An invalid parameter was specified.
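A program that reports checkpoint failures may want a hint that is more specific than strerror(3). The sketch below maps the four errors listed above to short messages. ckpt_error_hint() and its message strings are hypothetical, written for illustration only.

```c
#include <errno.h>

/*
 * Return a short hint for an errno value set by sys_checkpoint().
 * Hypothetical helper; the wording paraphrases the ERRORS list.
 */
const char *
ckpt_error_hint(int err)
{
	switch (err) {
	case EBADF:
		return ("fd is not a regular file, socket, or pipe");
	case EPERM:
		return ("not permitted; check the checkpoint sysctls");
	case EIO:
		return ("I/O error on the file system");
	case EINVAL:
		return ("invalid parameter");
	default:
		return ("unexpected error");
	}
}
```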

CHECKPOINT FEATURES

The system checkpointing code will save the process register state (including floating point registers), signal state, file descriptors representing regular files or directories (anything that can be converted into a file handle for storage), and both shared and private memory mappings. Private, writable mappings are copied to the checkpoint file while shared mappings are stored by referencing the file handle and offset. Note that the system checkpointing code does not retain references to deleted files, so mappings and open descriptors of deleted files cannot be restored. Unpredictable operation will occur if a checkpoint-unaware program is restored and some of the underlying files mapped by the program have changed.

The system checkpointing code is not able to retain the process pid, process group, user/group creds, or descriptors 0, 1, and 2. These will be inherited from whoever restores the checkpoint.

When a checkpointed program is restored, modified private mappings will be mapped from the checkpoint file itself, but major portions of the original program binary will be mapped from the original program binary. If the resumed program is checkpointed again, the system will automatically copy any mappings from the original checkpoint file to the new one, since the original is likely being replaced. The caller must not truncate the existing checkpoint file when creating a new one, or specify the existing file's file descriptor as the new one, as this will destroy the data that the checkpoint operation needs to copy to the new file. It is best to checkpoint to a new file and then rename it over the old one, or to remove(3) the old file before creating the new one so it remains valid as long as the program continues to run.

Threaded programs cannot currently be checkpointed. The program must be reduced to a single thread before it can be safely checkpointed.

MAP_VPAGETABLE mappings cannot currently be checkpointed. A program must restore such mappings manually on resumption.

Only regular file and anonymous memory mappings are checkpointed and restored. Device and other special mappings are not.

Only regular file descriptors are checkpointed and restored. Devices, pipes, sockets, and other special descriptors are not.

Memory wiring states are not checkpointed or restored.

madvise(2) states are not checkpointed or restored.

Basic mapping permissions are checkpointed and restored.
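The advice to checkpoint to a new file and then rename it over the old one could look roughly like the following. This is a sketch under assumptions: checkpoint_replace(), freeze_to(), and the ".new" suffix are hypothetical, and on systems other than DragonFly freeze_to() only writes a placeholder so the sketch can be compiled and run there.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#ifdef __DragonFly__
#include <sys/checkpoint.h>
#endif

/* Write a checkpoint of the calling process to fd. */
static int
freeze_to(int fd)
{
#ifdef __DragonFly__
	return (sys_checkpoint(CKPT_FREEZE, fd, -1, -1));
#else
	/* Stand-in (assumption) so the sketch runs elsewhere. */
	return (write(fd, "ckpt", 4) == 4 ? 0 : -1);
#endif
}

/*
 * Checkpoint to "<path>.new", then rename that file over path,
 * so the old checkpoint file is never truncated while the
 * system may still need to copy mappings out of it.
 * Returns 0 after saving, a positive value when resumed,
 * and -1 on error.
 */
int
checkpoint_replace(const char *path)
{
	char tmp[1024];
	int fd, n, ret;

	n = snprintf(tmp, sizeof(tmp), "%s.new", path);
	if (n < 0 || n >= (int)sizeof(tmp))
		return (-1);
	fd = open(tmp, O_RDWR | O_CREAT | O_TRUNC, 0666);
	if (fd < 0)
		return (-1);
	ret = freeze_to(fd);
	close(fd);	/* the descriptor is still valid on a resume */
	if (ret < 0) {
		unlink(tmp);
		return (-1);
	}
	/*
	 * Only the run that wrote the file renames it.  A resumed
	 * run (ret > 0) continues from the same point, but the
	 * rename was already done by the original run.
	 */
	if (ret == 0 && rename(tmp, path) < 0)
		return (-1);
	return (ret);
}
```

Since rename(2) replaces the old name without modifying the old file's contents, a program that was itself restored from path keeps a valid backing file until the new checkpoint is complete.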

SECURITY

The sysctl kern.ckptgroup controls which group can use system checkpointing. By default, only users in the `wheel' group are allowed to checkpoint and restore processes. To allow users in any group to have this capability (risky), set sysctl kern.ckptgroup to -1.

SIGNALS

Two signals are associated with checkpointing. SIGCKPT is delivered via the tty ckpt character, usually control-E. Its default action is to checkpoint a program and continue running it. The SIGCKPTEXIT signal can only be delivered by kill(2). Its default action is to checkpoint a program and then exit. SIGCKPTEXIT might not be implemented by the system. Both signals are defined to be greater than or equal to signal 32 and cannot be manipulated using legacy masking functions.

If a program overrides the default action for a checkpoint signal, the system will not undertake any action of its own. The program may issue the checkpoint command from the signal handler itself or simply set a reminder for later action. It is usually safest to set a reminder and do the actual checkpointing from your main loop.

SEE ALSO

checkpt(1), signal(3)

HISTORY

The sys_checkpoint() function call appeared in DragonFly 1.1.

DragonFly 3.9			 June 29, 2007			 DragonFly 3.9