Andre Noll [Sun, 1 Sep 2019 17:29:39 +0000 (19:29 +0200)]
prune: Fail gracefully if pre-rm hook fails.
In this case wait_for_remove_process() returns non-negative and we
fail to set the exit code, making the command appear to succeed even
if the rm process has not been created.
In the calculation of the length of the argv array we did not take
into account that --source-dir may be given multiple times. This can
result in an invalid write at the end of the allocated space.
Andre Noll [Thu, 7 Feb 2019 20:41:22 +0000 (21:41 +0100)]
Remove stale comment.
This comment went stale ten years ago in commit 360bcc95d588 (Clean
up snapshot removal logic) which changed the type of the return value
of find_redundant_snapshot() from int to struct snapshot *.
Andre Noll [Wed, 6 Jun 2018 13:08:24 +0000 (15:08 +0200)]
Support multiple source directories.
rsync is capable of copying multiple source directories to a single
destination, but this is currently not supported by dss. This commit
adds this functionality. The implementation is straightforward,
except that we don't want to add a trailing slash to every source
directory. The new comment in dss.c explains this in more detail.
Andre Noll [Tue, 14 Nov 2017 03:12:02 +0000 (04:12 +0100)]
run: Don't kill children twice.
When handle_signal(), the signal dispatcher of the run subcommand,
detects that SIGINT or SIGTERM was received, it calls kill_children()
to terminate any running rsync or rm processes. It then returns
negative which terminates the select loop. However, after select_loop()
returns, kill_children() is called again. Also the error message is
logged twice.
Not a biggie, but let's get rid of this redundancy by removing the
first call to kill_children().
Since handle_signal() is only called from com_run(), this patch
affects only the run subcommand.
Andre Noll [Tue, 14 Nov 2017 02:19:58 +0000 (03:19 +0100)]
Fix compute_next_snapshot_time().
The function computes the average idle time between snapshots and adds
this value to the completion time of the last snapshot to obtain the
start time for the next snapshot.
However, if the last snapshot happens to be incomplete, its completion
time is set to -1. Hence the computed next snapshot time is going to
be in the past, so we start the next snapshot immediately.
Although this is incorrect, the bug is benign because the correct next
snapshot time should also be in the past since we decided earlier to
create the snapshot which was now found incomplete.
Fix this by using the completion time of the last _complete_ snapshot
instead.
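The fixed computation might look roughly like the sketch below. The struct layout, the function name and the fallback to "now" are assumptions for illustration; only the use of -1 as the completion time of an incomplete snapshot is taken from the text above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified snapshot record (times in seconds). */
struct snap {
	int64_t creation_time;
	int64_t completion_time; /* -1 if the snapshot is incomplete */
};

/*
 * Average the idle time between consecutive complete snapshots and add
 * it to the completion time of the last complete snapshot. Incomplete
 * snapshots are skipped. Returns now if there is no idle period to
 * average over.
 */
int64_t next_snapshot_time(const struct snap *s, size_t n, int64_t now)
{
	int64_t idle = 0, gaps = 0, last_completion = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (s[i].completion_time < 0)
			continue; /* incomplete: no valid completion time */
		if (last_completion >= 0) {
			idle += s[i].creation_time - last_completion;
			gaps++;
		}
		last_completion = s[i].completion_time;
	}
	if (gaps == 0)
		return now;
	return last_completion + idle / gaps;
}
```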
Andre Noll [Tue, 14 Nov 2017 02:14:47 +0000 (03:14 +0100)]
Silence a bogus scan-build warning.
We never pass a NULL pointer to create_snapshot(), but scan-build
is unable to prove this and claims that the array access results in
a null pointer dereference. The added assertion helps the reader of
the code, and it quietens scan-build.
Andre Noll [Mon, 6 Nov 2017 00:12:41 +0000 (01:12 +0100)]
Replace license boilerplate with single line SPDX comments.
This gets rid of existing copyright templates in favor of just the
one-liner SPDX (Software Package Data Exchange) notice. All files are
licensed under the GPL-2.0, so the same tag is added to each file. No
copyright is changed by this commit.
Several files (mostly the very short ones) did not contain a license
text so far. By default all files without license information are
under the default license of this package, which is GPL version 2.
This commit adds the missing SPDX line so that now all files except
dss.css, index.html.in, INSTALL, NEWS and README have it.
We also remove author and copyright year, since the author is the same
everywhere, and the year has not been updated for at least six
years. Accurate information is available from the git log.
The COPYING file can also be removed because the license text at
https://spdx.org/licenses/GPL-2.0.html is immutable.
Andre Noll [Sat, 11 Nov 2017 05:24:47 +0000 (06:24 +0100)]
find_orphaned_snapshot(): Improve log message.
It is kind of obvious that find_orphaned_snapshot() looks for, well,
orphaned snapshots. The new message at least gives the user a vague
idea what this means.
Andre Noll [Mon, 6 Nov 2017 00:52:03 +0000 (01:52 +0100)]
gcc-compat.h: Remove _static_inline_ macro.
The only purpose of this macro is to have a way to include static
inline functions into the doxygen source code documentation (but omit
normal static functions in .c files). Since dss does not use doxygen,
the macro is pointless.
Remove the equally pointless documentation of dss_rename(), one
of the two users of _static_inline_, while converting it to plain
static inline.
Andre Noll [Mon, 13 Nov 2017 16:46:57 +0000 (17:46 +0100)]
Merge branch 'refs/heads/t/configtest'
A single patch which adds the new configtest subcommand, plus a fixup
for a formatting issue which was noticed only after the branch had
already been merged to next.
* refs/heads/t/configtest:
show_subcommand_summary(): Increase column width.
New subcommand: configtest.
Andre Noll [Sun, 15 Oct 2017 16:08:47 +0000 (18:08 +0200)]
kill: New option --wait.
Simply running "dss kill" during system shutdown to terminate the
dss process does not work as expected because the kill subcommand
exits after the signal has been sent, which might be long before the
targeted dss process terminates.
For example, the dss main process might be running its exit hook to
inform the system administrator about the fact that the dss service
is going down when the shutdown procedure has already deactivated the
network. Or the shutdown procedure kills the exit hook with SIGKILL
during its normal "killing remaining processes" phase before file
systems are unmounted.
With the --wait option, the kill subcommand will not return until the
dss process has died or the timeout expires. We hardcode the timeout
in send_signal() for the time being. It can be made configurable if
this turns out to be necessary.
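The idea of waiting with a timeout can be sketched as follows. This is not the code in send_signal(): the function name, the polling interval and the use of kill(2) with signal 0 as the liveness check are assumptions chosen for the example.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>

/*
 * Poll until the process with the given pid is gone or max_polls
 * attempts have been made. Returns 0 if the process has died, -1 on
 * timeout. kill(2) with signal 0 performs only the existence check.
 */
int wait_for_exit(pid_t pid, unsigned max_polls)
{
	const struct timespec delay = {.tv_sec = 0, .tv_nsec = 100 * 1000 * 1000};
	unsigned i;

	for (i = 0; i < max_polls; i++) {
		if (kill(pid, 0) < 0 && errno == ESRCH)
			return 0; /* no such process any more */
		nanosleep(&delay, NULL);
	}
	return -1;
}
```

A caller would send the signal first and then call wait_for_exit() with a poll count that corresponds to the desired timeout.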
Andre Noll [Sun, 15 Oct 2017 18:29:39 +0000 (20:29 +0200)]
Allow word-splitting for exit hook.
All hooks except the exit hook are run via dss_exec_cmdline_pid(),
which performs word splitting to create the argument vector for
exec(2). For the exit hook, however, we build the argument vector
manually, so the command line for the exit hook is not split.
This commit removes this inconsistency. However, we can't use
dss_exec_cmdline_pid() here because we need to append the error string
which caused dss to exit to the argument vector as a single argument,
and this string may well contain whitespace characters.
Hence we run split_args() on the argument to --exit-hook to obtain
an argument vector, append the error string as another element,
and then run dss_exec().
Andre Noll [Mon, 6 Nov 2017 02:34:39 +0000 (03:34 +0100)]
README/INSTALL: Fix typo: snaphot.
The typo in README was introduced two years ago in commit 05e75054
(README: Explain that there are no incremental backups) while the
one in INSTALL is more than five years old, see commit a0b6810b (doc:
Add a second example config file).
Andre Noll [Sun, 15 Oct 2017 19:11:46 +0000 (21:11 +0200)]
ls: Print current duration of incomplete snapshots.
Currently the duration of incomplete (and orphaned) snapshots is shown
as 0:00. It's more interesting to see how long the snapshot has
been in progress, so print the difference between the current time
and the start time instead.
Fix an overlong line and a whitespace issue while at it.
Andre Noll [Mon, 16 Oct 2017 15:36:51 +0000 (17:36 +0200)]
run: Wait for children to die.
When the select loop returns and dss is about to terminate, it sends
SIGTERM to any running rm or rsync processes and exits. It does not
wait for these processes to die, however. This is trivial to implement,
and it makes life easier for shutdown scripts which like to proceed
with unmounting file systems.
Andre Noll [Sun, 15 Oct 2017 13:03:08 +0000 (15:03 +0200)]
New subcommand: configtest.
Similar to the identically named subcommand of apache2ctl. This is
trivial to implement because we only need to describe the subcommand
in dss.suite and create a command handler which prints an OK message
and returns success. If the config file contains errors, we abort
earlier anyway.
Andre Noll [Tue, 17 Oct 2017 17:11:07 +0000 (19:11 +0200)]
Save the subcommand pointer in a global variable.
This is needed for subcommand sensitive logging which will be
introduced in a subsequent commit. For now it allows us to drop the
argument of check_config(), which is good given that handle_sighup()
already played dirty games by "knowing" it is only called from the
run subcommand.
Andre Noll [Mon, 16 Oct 2017 15:19:08 +0000 (17:19 +0200)]
New option: --mountpoint.
The new option applies to run, create, ls and prune. The feature
could be implemented as a pre-create hook, but since it is so common,
it makes sense to add it to dss proper.
As for the implementation we simply check that "." and ".." are on
different devices (or are identical).
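A sketch of such a check, assuming a helper that takes the directory as a path instead of relying on the current working directory. The function name and the fixed-size buffer are choices made for this example.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Return 1 if dir is a mountpoint, 0 if it is not, -1 on errors.
 * dir is considered a mountpoint if dir and dir/.. live on different
 * devices, or if both refer to the same inode, which is the case for
 * the root directory.
 */
int is_mountpoint(const char *dir)
{
	char parent[4096];
	struct stat s, p;
	int len = snprintf(parent, sizeof(parent), "%s/..", dir);

	if (len < 0 || len >= (int)sizeof(parent))
		return -1; /* path too long for the buffer */
	if (stat(dir, &s) < 0 || stat(parent, &p) < 0)
		return -1;
	if (s.st_dev != p.st_dev)
		return 1;
	return s.st_ino == p.st_ino;
}
```

Note that a device comparison like this cannot detect a bind mount of a directory which lives on the same file system as its parent.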
Andre Noll [Sun, 15 Oct 2017 12:51:29 +0000 (14:51 +0200)]
Merge branch 'refs/heads/t/lopsub'
Conversion to lopsub and a few other improvements on top of it.
* refs/heads/t/lopsub:
INSTALL: Explain how to use CPPFLAGS and LDFLAGS.
build: Introduce DSS_CPPFLAGS.
build: Fix cc command which creates dependencies.
build: Combine CFLAGS and DEBUG_CFLAGS.
Implement --checksum.
run: Improve error diagnostics for chdir(2) failure.
run: Improve error message if dss is already running.
run: Fix exit status in case another dss process is running.
build: Add target install and install-strip.
Convert dss to lopsub.
Remove --no-resume.
This was not a good idea because ftok(3) hashes, among other
information, the inode number of the file, and this number changes
every time the configuration file is edited.
The revert conflicted slightly with the commit which renamed
get_key_or_die() to get_key() and changed the type of the return
value to key_t, but the conflict was easy to resolve.
Andre Noll [Sun, 16 Apr 2017 10:48:58 +0000 (12:48 +0200)]
ipc: Improve error diagnostics for kill.
If dss is not running, the kill command prints "No such file or
directory" because the call to semget(2) fails with ENOENT. This
message is a bit misleading, so let's return -E_NOT_RUNNING in this
case instead.
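The translation amounts to special casing one errno value. In the sketch below the numeric value of E_NOT_RUNNING and the function name are invented; dss has its own error list.

```c
#include <errno.h>

/* Hypothetical error code, the real value is defined by dss. */
enum { E_NOT_RUNNING = 1000 };

/*
 * Translate the errno value left by a failed semget(2) into a negative
 * error code. ENOENT means that no semaphore set exists for the key,
 * which in turn means that no dss process is running.
 */
int semget_error(int err)
{
	if (err == ENOENT)
		return -E_NOT_RUNNING;
	return -err;
}
```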
Andre Noll [Sun, 30 Apr 2017 00:33:57 +0000 (02:33 +0200)]
Replace dss.dia by a shell script.
The dia command line tool misaligns the text on the dss logo, and the
dia application started to segfault on my home box after a library
upgrade.
This patch replaces the dia source file by the mklogo bash script
which runs the convert utility of ImageMagick to write the dss logo
in png format to stdout.
Andre Noll [Thu, 13 Jul 2017 17:43:04 +0000 (19:43 +0200)]
build: Introduce DSS_CPPFLAGS.
As with CFLAGS, it is good practice to leave CPPFLAGS unset in the
Makefile and append it to the cc command after our own flags, to give
the user a chance to override our settings.
This patch initializes DSS_CPPFLAGS with the VERSION_STRING define
which was part of the recipe and adds -Wunused-macros, which is a
preprocessor flag rather than a compiler flag.
DSS_CPPFLAGS and CPPFLAGS are added to the two relevant commands,
in addition to the existing DSS_CFLAGS and CFLAGS.
Andre Noll [Thu, 13 Jul 2017 17:33:52 +0000 (19:33 +0200)]
build: Fix cc command which creates dependencies.
The command to create Makefile.deps was hardcoded as gcc in Makefile.
This patch changes the command to $(CC) and adds the usual set of
flags which we use for compiling.
Andre Noll [Thu, 13 Jul 2017 17:21:11 +0000 (19:21 +0200)]
build: Combine CFLAGS and DEBUG_CFLAGS.
We needed two sets of flags for gengetopt because the C code generated
by gengetopt would not compile cleanly with our rather strict set of
flags. With lopsub this is no longer necessary.
Moreover, it is considered good practice to not set CFLAGS at all but
to append the contents of this variable to the compile command. This
way the user may set the variable to override some of the options.
This commit gets rid of CFLAGS in favor of DSS_CFLAGS, which is just
the union of the CFLAGS and the DEBUG_CFLAGS variables we had before.
Andre Noll [Sun, 16 Apr 2017 10:01:42 +0000 (12:01 +0200)]
run: Improve error diagnostics for chdir(2) failure.
In run mode, if the destination directory does not exist, dss prints
"No such file or directory" and exits, without telling the user (a)
it was a failed chdir(2) call that caused the error, and (b) the name
of the directory. This patch adds an error message containing this
information.
Since there is only one caller of dss_chdir(), let's get rid
of this public function in file.c and call chdir() directly from
change_to_dest_dir() of dss.c.
Andre Noll [Fri, 17 Jun 2016 07:29:12 +0000 (09:29 +0200)]
run: Improve error message if dss is already running.
The current error message, "child terminated unexpectedly", is not
very informative.
The most likely reason for the child to terminate is that it could not
obtain the semaphore lock because another dss process is running. This
commit adds a test to com_run() that checks this condition in the
parent before the child process is born. This way, if another process
is holding the lock, we can fail with a nice error message that also
includes the pid of the process that holds the lock.
Andre Noll [Thu, 16 Jun 2016 21:21:08 +0000 (23:21 +0200)]
run: Fix exit status in case another dss process is running.
In daemon mode, we must acquire the semaphore lock in the child process
because the child does not inherit semaphore adjustments. Currently
the parent exits successfully after the fork, so the command appears
to succeed even if the child dies immediately because it was unable
to acquire the lock because another dss process is holding the lock.
This commit introduces a mechanism which enables the parent to tell
whether the child completed its setup successfully. We create a
pipe prior to calling fork(2), and let the child write to one end
of the pipe after setup is complete and just before it enters the
main select loop. The parent reads from the other end of the pipe
and exits once the read(2) call returns. If the child dies early,
read(2) returns zero, indicating failure.
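The pipe handshake can be sketched as follows. The function is a toy: setup_ok stands in for the outcome of the real setup (such as acquiring the lock), the child exits right away instead of entering a select loop, and the parent reaps it only to keep the example tidy.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child and wait until it reports that its setup is complete.
 * Returns 0 if the child signalled success, -1 otherwise.
 */
int start_child(int setup_ok)
{
	int fds[2];
	char c = 0;
	ssize_t ret;
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) { /* child */
		close(fds[0]);
		if (!setup_ok)
			_exit(1); /* dies early, nothing is written */
		if (write(fds[1], &c, 1) != 1)
			_exit(1);
		close(fds[1]);
		/* a real daemon would enter its main loop here */
		_exit(0);
	}
	/* parent: close our copy of the write end so that read(2) sees EOF */
	close(fds[1]);
	ret = read(fds[0], &c, 1);
	close(fds[0]);
	waitpid(pid, NULL, 0); /* sketch only, a daemon is not reaped here */
	return ret == 1 ? 0 : -1;
}
```

Closing the write end in the parent is essential: otherwise read(2) would block forever if the child died without writing.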
Andre Noll [Sun, 16 Apr 2017 11:56:09 +0000 (13:56 +0200)]
build: Add target install and install-strip.
It has always been a bit clumsy to copy the executable and the
manual page to their proper locations by hand, so this commit adds
the two standard targets "install" and "install-strip" which install
both files.
The installation prefix defaults to /usr/local and can be set with
PREFIX. Moreover, there is DESTDIR which may be given to prepend
another directory (useful for "staged installs", where the
installed files are not placed directly into their expected location
but are instead copied into a temporary location).
Andre Noll [Fri, 6 May 2016 14:18:24 +0000 (16:18 +0200)]
Convert dss to lopsub.
This commit ditches gengetopt for the command line and config file
parsers in favor of the lopsub library. Hence from now on, lopsub
must be installed in order to compile dss while gengetopt is no
longer needed.
The mutually exclusive gengetopt group options --create, --prune, --ls,
--run, --kill and --reload are replaced by lopsub subcommands. However,
the --reload and --kill options have been combined into the new "kill"
subcommand which allows sending arbitrary signals to a running dss
process.
Due to the conversion, the syntax of the dss command changes
slightly. For example,
	dss --run
becomes
	dss run
while
	dss -Rdc foo
needs to be spelled as
	dss -c foo -- run -d
so that -d is regarded as an option to the "run" subcommand rather
than an option to dss.
With lopsub each subcommand has its own command line and config file
parser. Options to subcommands can be added to the configuration file
like this:
	[run]
	daemon
	logfile=/var/log/dss.log
As for the implementation, the bulk of the changes is the conversion
of dss.ggo to the new dss.suite. The necessary adjustments to the
code are relatively simple. In particular, only dss.c needs to be
changed while all other .c files don't require any modifications.
The examples in INSTALL are adjusted to the new syntax. The commit also
drops support for Mac OS and Solaris, since lopsub is not supported
on these platforms yet.
Andre Noll [Fri, 17 Feb 2017 14:40:58 +0000 (15:40 +0100)]
ipc: Prefer key_t over int for System V IPC keys.
get_key() calls ftok(3), which returns a key_t value. key_t is also
the type which semget(2), the only function which receives the key via
mutex_get(), expects. It's stupid to convert the key_t from ftok(3)
into an int, only to convert it back to key_t later.
This patch changes ipc.c to use key_t everywhere. However, in
mutex_get() we print a log message containing the value of the key,
so the format string must be adjusted accordingly. Unfortunately,
on Linux, key_t is the same as int while on FreeBSD and NetBSD it is
defined as long. To avoid a warning from the compiler we use "%lx"
in the format string and cast the value to long.
Andre Noll [Fri, 17 Feb 2017 14:29:52 +0000 (15:29 +0100)]
ipc.c: Use ftok() instead of SuperFastHash.
ftok(3) uses the identity of the named file to generate a key_t type
System V IPC key, which is easier than computing the key by hashing
the (resolved) pathname of the config file. This change allows us to
get rid of the realpath() and the super_fast_hash() implementation.
If ftok(3) fails, presumably because the underlying call to stat(2)
fails, we now simply return a phony identifier, similar to what we did
before in this case. This eliminates the only possible failure path
in get_key_or_die(), so this function is renamed to get_key().
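A sketch of a key function that cannot fail. The project identifier and the phony fallback value are invented for this example and need not match what ipc.c uses.

```c
#include <sys/ipc.h>
#include <sys/types.h>

/* Both constants are made up for this sketch. */
#define PROJECT_ID 'D'
#define PHONY_KEY ((key_t)0x0d55)

/*
 * Derive a System V IPC key from the identity of the config file.
 * ftok(3) returns (key_t)-1 on errors, for instance if the file does
 * not exist. In this case we fall back to a fixed key.
 */
key_t get_key(const char *config_file)
{
	key_t key = ftok(config_file, PROJECT_ID);

	if (key == (key_t)-1)
		return PHONY_KEY;
	return key;
}
```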
Andre Noll [Mon, 9 May 2016 09:03:45 +0000 (11:03 +0200)]
Remove --no-resume.
There is no real reason for this option. Resuming a previously
cancelled snapshot is generally a very good idea, so the option is
kind of pointless. Remove it.
Andre Noll [Fri, 17 Jun 2016 07:18:40 +0000 (09:18 +0200)]
ipc: Simplify mutex_try_lock().
There is no need to actually obtain the lock. A single semaphore
operation will do just fine. With sem_op equal to zero and IPC_NOWAIT
the semop() call returns immediately, and the return value tells
whether the semaphore value was zero.
Rename the (static) function to mutex_is_locked() to indicate that
it performs only read-only operations on the semaphore set.
Andre Noll [Thu, 16 Jun 2016 21:06:29 +0000 (23:06 +0200)]
ipc: Make pid pointer optional.
This changes get_dss_pid() to handle the case where the caller passed
a NULL pid pointer. Conversely, if pid is not NULL, we now make sure
to initialize the given address in all cases.
The single caller currently never passes NULL, so this change is just
defensive programming, protecting against future users. Be liberal
in what you accept, be strict in what you return.
Andre Noll [Tue, 7 Jun 2016 14:23:36 +0000 (16:23 +0200)]
build: Add two more warning options.
Both -Wunused-parameter and -Wshadow were added to gcc long ago. In
particular gcc-4.6.3, which ships with Ubuntu-12.04, supports them. It
should thus be safe to enable both warnings unconditionally.
Andre Noll [Tue, 7 Jun 2016 14:29:38 +0000 (16:29 +0200)]
dss.c: Add missing inclusion of <stdio.h>.
This is required for example for rename(2). Compilation succeeds without
the include only because the gengetopt header includes stdio.h as well and
we happen to include this header before fd.h.
Andre Noll [Mon, 16 May 2016 12:55:28 +0000 (14:55 +0200)]
dss: Make argument of parse_config_file() a boolean.
It is used as such, so there is no point in having an int here. Also
rename the argument from override to sighup to indicate that we
need to distinguish whether the function is called at startup or
because the dss process received SIGHUP.
Andre Noll [Fri, 17 Jun 2016 08:17:28 +0000 (10:17 +0200)]
dss: Do not shadow a global declaration.
num_complete_snapshots is a local variable in
compute_next_snapshot_time(), but also the name of a public function
declared in snap.h, causing a warning on some (old) gcc versions.
This patch avoids the ambiguity and thus the warning by renaming the
variable. It was unusually long anyway.
Andre Noll [Mon, 20 Jun 2016 14:34:37 +0000 (16:34 +0200)]
Create html version of the man page with groff.
The html post processor of groff can directly create html, which is
expected to be of higher quality than the html generated by man2html. A
brief look at the "official" web site of man2html (as mentioned in the
description of the Ubuntu-14.04 package)
Andre Noll [Mon, 20 Jun 2016 14:05:31 +0000 (16:05 +0200)]
Convert INSTALL and NEWS to markdown format.
The grutatxt project is dead, so we have to switch to something else
eventually. Fortunately, there are only three files in grutatxt format,
one of which (README) does not need any changes. The other two are
converted to markdown format in this commit. This is a rather simple
matter since only section headings, links and preformatted text need
slight adjustments.
The commands in the Makefile are modified to run markdown(1) instead
of grutatxt(1).
Andre Noll [Mon, 13 Jun 2016 15:54:37 +0000 (17:54 +0200)]
Fix rsync exit handling in create mode.
The logic in handle_rsync_exit() is horribly broken in case dss is
run in create mode and the rsync process terminates unsuccessfully.
First we claim to restart rsync, which is wrong. Next we call the
post-create hook even though the documentation says that this hook is
only run on *successful* termination. Finally, we dereference a NULL
pointer to print the path of the snapshot.
Fortunately, all three issues are easy to fix by special casing create
mode in handle_rsync_exit().
Andre Noll [Tue, 29 Dec 2015 15:28:02 +0000 (15:28 +0000)]
Always try to keep one snapshot for recycling.
Currently, if --keep-redundant is not given, we try to get rid of
outdated and redundant snapshots quickly, even if there is plenty of
free disk space available. However, as these snapshots can be used
for recycling, it seems worthwhile to keep them around as long as
fewer snapshots are available than configured.
This commit changes try_to_free_disk_space() to not remove snapshots
any more in this case. This patch should reduce disk I/O in the common
case where no snapshots need to be removed due to low disk space.
Andre Noll [Tue, 29 Dec 2015 16:10:05 +0000 (16:10 +0000)]
Allow to run in daemon mode without log file.
It's kind of silly to insist on having a log file in daemon mode.
This commit removes the dependency of --daemon on --logfile and makes
/dev/null the default log file. Consequently, running dss --daemon
--run without specifying --logfile no longer fails, and nothing will
be logged by default.
Andre Noll [Tue, 29 Dec 2015 16:42:18 +0000 (16:42 +0000)]
Improve documentation of --keep-redundant.
The help text for --keep-redundant was rather convoluted. This commit
shortens the text with no essential semantic change.
The patch also removes the sentence that encourages specifying this
option if the destination directory is only used for snapshots. After
all, most file systems allow creating an insane number of files,
so keeping snapshots around forever can result in a file system that
can no longer be checked or repaired due to the excessive number of
used inodes.
Andre Noll [Tue, 29 Dec 2015 15:52:32 +0000 (15:52 +0000)]
Improve documentation of interval-related args.
Minor rewording of the help text for the --unit-interval option and
a new sentence which explains that the total number of snapshots
doubles if --num-intervals is increased by one.
Andre Noll [Wed, 16 Dec 2015 13:52:54 +0000 (14:52 +0100)]
README: Explain that there are no incremental backups.
This was unclear to an admin who had used dss for several years! So
maybe it is a good idea to explain the idea behind hardlink-based
backups a bit more.
This commit adds two new sentences to README, one for the admin and
another one for the user.
Andre Noll [Mon, 30 Mar 2015 16:20:16 +0000 (16:20 +0000)]
daemon.c: Open /dev/null read-write.
While daemonizing we redirect stdin, stdout and stderr to /dev/null,
which is considered good practice. We should, however, open the
device in read-write mode rather than read-only, since not being
able to write to stdout/stderr might confuse rsync and the hooks.
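The redirection can be sketched as below. The helper name is an assumption, and daemon.c may well structure this differently.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Redirect the given descriptor to /dev/null. The device is opened
 * read-write so that reads return EOF and writes are discarded
 * successfully, whatever the descriptor is used for. Returns 0 on
 * success, -1 on errors.
 */
int redirect_to_devnull(int fd)
{
	int null_fd = open("/dev/null", O_RDWR);

	if (null_fd < 0)
		return -1;
	if (dup2(null_fd, fd) < 0) {
		close(null_fd);
		return -1;
	}
	if (null_fd != fd)
		close(null_fd);
	return 0;
}
```

A daemon would call this for STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO in turn.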
Andre Noll [Wed, 25 Feb 2015 10:15:33 +0000 (11:15 +0100)]
Improve signal handler.
The signal handler of dss has two issues: (a) it does not check the
return value of the write(2) call, and (b) it does not restore errno
on exit. The second issue might cause problems on systems where
write(2) sets errno also on success. Those problems would be very
hard to reproduce and debug. So it is probably a good idea to be
conservative here.
This commit fixes (a) by printing an error message and calling exit(3)
if the write to the signal pipe failed or resulted in a short write.
As for (b), we now save a copy of errno before the write(2) call,
and restore the old value on success.
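A handler along these lines is sketched below. The names are assumptions. The sketch also deviates from the description in one point: it calls _exit(2) rather than exit(3) because only the former is async-signal-safe.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Created with pipe(2) before the handler is installed. */
int signal_pipe[2];

/* Forward the signal number to the main loop through the pipe. */
void generic_signal_handler(int signum)
{
	static const char msg[] = "signal pipe write failed\n";
	int saved_errno = errno;
	ssize_t ret = write(signal_pipe[1], &signum, sizeof(signum));

	if (ret == (ssize_t)sizeof(signum)) {
		errno = saved_errno; /* interrupted code sees its old errno */
		return;
	}
	/* error or short write: report and give up */
	ret = write(STDERR_FILENO, msg, sizeof(msg) - 1);
	(void)ret;
	_exit(1);
}
```

The main loop watches signal_pipe[0] for readability and reads the signal number from there.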
Andre Noll [Fri, 12 Dec 2014 14:05:21 +0000 (15:05 +0100)]
Rework restart logic, introduce --max-errors.
It has happened several times in the past that dss made no progress
because the underlying rsync command terminated with exit code 13
(Errors with program diagnostics). Currently dss special cases this
exit code as a non-fatal error, i.e. it does not terminate but restarts
the rsync command after 60 seconds. If the problem is permanent,
no new snapshots will be created, but the exit hook is not called
either, which is unfortunate.
This commit tries to improve on this. With this patch applied, the
only non-fatal exit code from rsync is 24 (Partial transfer due
to vanished source files), which is actually considered success.
All other non-zero exit codes cause dss to restart the rsync command,
but only at most N times, where N is the argument given to the new
--max-rsync-errors option.
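The new policy can be summarized in a few lines of C. The enum, the function name and the way the error counter is passed around are assumptions, and whether a successful run resets the counter is not modelled here.

```c
enum rsync_action { RSYNC_SUCCESS, RSYNC_RESTART, RSYNC_GIVE_UP };

/*
 * Decide what to do after rsync exited with the given code. num_errors
 * counts the restarts so far and is incremented for each new restart.
 * At most max_errors restarts are granted.
 */
enum rsync_action classify_rsync_exit(int code, unsigned *num_errors,
		unsigned max_errors)
{
	if (code == 0 || code == 24) /* 24: source files vanished */
		return RSYNC_SUCCESS;
	if (*num_errors >= max_errors)
		return RSYNC_GIVE_UP;
	(*num_errors)++;
	return RSYNC_RESTART;
}
```

With max_errors set to 2, two failures lead to a restart each, while the third failure makes the caller give up.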