Kalaiselvi, S.; Rajaraman, V. (1999) A checkpointing algorithm for an SCI based distributed shared memory system. Microprocessors and Microsystems, 22 (9). pp. 515-522. ISSN 0141-9331
Full text not available from this repository.
Official URL: http://linkinghub.elsevier.com/retrieve/pii/S01419...
Related URL: http://dx.doi.org/10.1016/S0141-9331(98)00116-1
Abstract
Distributed Shared Memory (DSM) systems combine the ease of programming of shared memory parallel computers with the scalability of message passing multicomputers. IEEE has proposed an interface standard, known as the SCI standard, for constructing DSM systems. As the number of processors in a parallel computer increases, it becomes imperative to build in fault tolerance. This article presents an algorithm for checkpointing and rollback recovery of an SCI based DSM system using the provisions of the standard. It is shown that this checkpointing and rollback recovery procedure judiciously combines the features of both shared memory and message passing distributed memory systems.
Item Type: | Article |
---|---|
Source: | Copyright of this article belongs to Elsevier Science. |
Keywords: | Fault Tolerant Computing; Checkpointing; Rollback Recovery; SCI Standard |
ID Code: | 38370 |
Deposited On: | 29 Apr 2011 08:43 |
Last Modified: | 29 Apr 2011 08:43 |