CIRNE, L. D.; CIRNE, Lilianne Dantas.
Abstract:
The intensive use of distributed systems in many kinds of applications demands greater
reliability from those applications. Such systems should be prepared to tolerate faults, i.e., a
system should not interrupt its operation when some of its components (hardware or
software) or its data become faulty. Active replication is commonly used to build such
highly available, fault-tolerant services. Some group communication systems already
offer support for developing fault-tolerant distributed applications. However, most
of those systems are not portable, and portability is a very important property in distributed systems.
In this context, the Java language has become widely used in distributed systems in
recent years, especially because of its portability and its facilities for developing distributed
applications. Nonetheless, Java provides no support for developing fault-tolerant
distributed applications, i.e., applications that continue to function properly despite component
failures.
This paper describes an approach to fault tolerance in Java that meets the
requirements of active replication. To achieve this, an extension to the iBus
package designed by Silvano Maffeis [MAF96] was developed and implemented. The
resulting system, named iBusTF (fault-tolerant iBus), adds the group communication
properties required by active replication: total order delivery and atomic membership. The
approach has the advantage of using only Java resources, maintaining full
compatibility with the iBus system.
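
To illustrate why total order delivery matters for active replication, the following is a minimal sketch in plain Java. It assumes that every message already carries a global sequence number (for example, one assigned by a sequencer), and it shows only the receiving side: a replica holds back out-of-order messages and applies them strictly in sequence. The names TotalOrderSketch, Replica, receive, and delivered are hypothetical and are not taken from iBus or iBusTF; the mechanism iBusTF actually uses is not described in this abstract.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of the iBus or iBusTF API.
public class TotalOrderSketch {

    // A deterministic replica: its state depends only on the order
    // in which operations are delivered to it.
    public static class Replica {
        // Messages that arrived early and are waiting for their turn.
        private final Map<Long, String> holdback = new HashMap<>();
        // Operations delivered so far, in delivery order.
        private final List<String> log = new ArrayList<>();
        // Sequence number of the next message allowed to be delivered.
        private long nextSeq = 0;

        // Called when a message arrives from the network, possibly out of order.
        public void receive(long seq, String operation) {
            if (seq < nextSeq) {
                return; // already delivered: ignore the duplicate
            }
            holdback.put(seq, operation);
            // Deliver every message that is now contiguous with the log.
            String next;
            while ((next = holdback.remove(nextSeq)) != null) {
                log.add(next);
                nextSeq++;
            }
        }

        public List<String> delivered() {
            return log;
        }
    }

    public static void main(String[] args) {
        Replica a = new Replica();
        Replica b = new Replica();

        // Replica a receives the messages in order.
        a.receive(0, "deposit 10");
        a.receive(1, "withdraw 5");
        a.receive(2, "deposit 7");

        // Replica b receives the same messages in a different network order.
        b.receive(2, "deposit 7");
        b.receive(0, "deposit 10");
        b.receive(1, "withdraw 5");

        // Both replicas deliver the same sequence, so their states stay identical.
        System.out.println(a.delivered().equals(b.delivered())); // prints true
    }
}
```

Because both replicas apply the same operations in the same order, either one can answer a request if the other fails. Atomic membership, the second property mentioned above, is not covered by this sketch.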