#
Time-Adaptive Self Stabilization

We study the scenario where a transient fault hit *f* of the *n*
nodes of a distributed system by corrupting their state. We consider the
basic problem of *persistent bit*, where the system is required to
maintain a value in the face of transient failures by means of replication.
We give an algorithm to recover the value quickly: the value of the bit
is recovered at all nodes in *O(f)* time units for any unknown value
of *f<n/2*. Moreover, complete state quiescence occurs in *O(Diam)*
time units, where *Diam* denotes the diameter of the network. This
means that the value persists indefinitely so long as any *f<n/2*
faults are followed by *Omega(Diam)* fault-free time units. We prove
lower bounds which show that both time bounds are asymptotically optimal.
Using the algorithm for persistent bit, we present a general transformer
which takes a distributed non-reactive, non-stabilizing protocol *P*,
and produces a self-stabilizing protocol *P'* which solves the problem
*P* solves, with the additional property that if the number of faults
that hit the system after stabilization is *f*, for any unknown *f<n/2*,
then the output of *P'* regains stability in *O(f)* time units,
and the state stabilizes in *O(Diam)* time units.

Click here
for proceedings version, and here
for the full version.