SLONIK FAILOVER(7) Configuration and Action commands SLONIK FAILOVER(7)
NAME
FAILOVER - Fail a broken replication set over to a backup node
SYNOPSIS
FAILOVER (options);
DESCRIPTION
The FAILOVER command causes the backup node to take over all sets that
currently originate on the failed node. slonik will contact all other
direct subscribers of the failed node to determine which node has the
highest sync status for each set. If another node has a higher sync
status than the backup node, replication is first redirected so that
the backup node catches up from that other node; only then does the
backup node assume the origin role and allow update activity.
After successful failover, all former direct subscribers of the failed
node become direct subscribers of the backup node. The failed node is
abandoned, and can and should be removed from the configuration with
SLONIK DROP NODE(7).
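As an illustration of the sequence described above, a complete slonik
script might look like the following minimal sketch. The cluster name
and connection strings are placeholders invented for this example and
must be adapted to the actual cluster; node 1 is assumed to be the
failed origin and node 2 the backup node.

```slonik
# Hypothetical cluster name and conninfo strings; replace with your own.
CLUSTER NAME = mycluster;
NODE 1 ADMIN CONNINFO = 'dbname=mydb host=db1 user=slony';
NODE 2 ADMIN CONNINFO = 'dbname=mydb host=db2 user=slony';

# Node 1 has failed; node 2 takes over all sets that originated there.
FAILOVER (ID = 1, BACKUP NODE = 2);

# After a successful failover, remove the abandoned node from the
# configuration. The event is generated on the surviving node 2.
DROP NODE (ID = 1, EVENT NODE = 2);
```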
If multiple set origin nodes have failed, then you should tell FAILOVER
about all of them in one request. This is done by passing a list like
NODE=(ID=val,BACKUP NODE=val), NODE=(ID=val2, BACKUP NODE=val2) to
FAILOVER.
Nodes that are forwarding providers can also be passed to the failover
command as a failed node. The failover process will redirect the
subscriptions from these nodes to the backup node.
ID = ival
ID of the failed node
BACKUP NODE = ival
Node ID of the node that will take over all sets originating on
the failed node
This uses failednode(integer,integer).
EXAMPLE
FAILOVER (
ID = 1,
BACKUP NODE = 2
);
# Example with multiple failed nodes
FAILOVER(
NODE=(ID=1, BACKUP NODE=2),
NODE=(ID=3, BACKUP NODE=4)
);
LOCKING BEHAVIOUR
Exclusive locks on each replicated table will be taken out on the
new origin node as replication triggers are changed. If the new origin
was not completely up to date, and replication data must be drawn from
some other node that is more up to date, the new origin will not become
usable until those updates are complete.
DANGEROUS/UNINTUITIVE BEHAVIOUR
This command abandons the failed node. The failed node cannot rejoin
the cluster afterwards without being rebuilt from scratch as a
slave. If at all possible, you would
likely prefer to use SLONIK MOVE SET(7) instead, as that does not
abandon the failed node.
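When the current origin is still reachable, a controlled switchover
could be sketched as below. The set ID and node IDs are assumptions
made for illustration (set 1 moving from node 1 to node 2, where node 2
already subscribes to the set), and the fragment presumes the usual
CLUSTER NAME and ADMIN CONNINFO preamble. See SLONIK LOCK SET(7) and
SLONIK MOVE SET(7) for the authoritative option lists.

```slonik
# Assumed: set 1 currently originates on node 1, which is still running.
# Lock the set on the old origin so no updates arrive during the move.
LOCK SET (ID = 1, ORIGIN = 1);

# Hand the origin role to node 2. Node 1 stays in the cluster as a
# subscriber instead of being abandoned.
MOVE SET (ID = 1, OLD ORIGIN = 1, NEW ORIGIN = 2);
```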
If a second failure occurs in the middle of a FAILOVER operation,
recovery might be complicated.
SLONIK EVENT CONFIRMATION BEHAVIOUR
Slonik submits the FAILOVER_EVENT without waiting, but then waits until
the most-ahead node has received confirmations of the FAILOVER_EVENT
from all nodes before completing.
VERSION INFORMATION
This command was introduced in Slony-I 1.0.
In version 2.0, the default BACKUP NODE value of 1 was removed, so it
is mandatory to provide a value for this parameter.
In version 2.2, support was added for passing multiple nodes to a single
failover command.
18 January 2015 SLONIK FAILOVER(7)