Patent attributes
Reliably making configuration changes to distributed systems, including receiving commands for multiple configuration changes, subdividing the configuration changes into separate tasks, and performing those tasks at each node. A configuration element receives sets of configuration change commands and acknowledges them, so the user need not wait before issuing additional commands. Tasks are determined, each including consistent changes to the system configuration, and each made up of single-device tasklets. Each tasklet might be assigned to a particular single device, or to any single device in the system. The next task is performed once the tasklets of the current task are complete. If tasklets are not performed in a timely manner because nodes are relatively unresponsive, those nodes are marked “failed.” When a failed node becomes responsive again, it marks itself “recovering.” When a recovering node catches up, it marks itself “operational.” Updates by failed or recovering nodes are skipped while synchronizing with operational nodes.
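
The node lifecycle described above (operational, failed, recovering, operational again) can be illustrated with a minimal sketch. It is not the patented implementation. All names here (NodeState, Node, run_tasks, come_back, catch_up) are hypothetical. Unresponsiveness is simulated with a boolean flag rather than a real timeout. As a simplification, every tasklet is applied at every operational node, rather than being assigned to a particular device or to any single device. "Skipping updates" is read here as the coordinator not waiting on failed or recovering nodes, which is one possible interpretation.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeState(Enum):
    OPERATIONAL = "operational"
    FAILED = "failed"
    RECOVERING = "recovering"


@dataclass
class Node:
    """Hypothetical node model; 'responsive' stands in for a real timeout check."""
    name: str
    state: NodeState = NodeState.OPERATIONAL
    responsive: bool = True
    applied: list = field(default_factory=list)  # tasklets this node has performed

    def perform(self, tasklet: str) -> bool:
        """Apply one single-device tasklet; False models a missed deadline."""
        if not self.responsive:
            return False
        self.applied.append(tasklet)
        return True

    def come_back(self) -> None:
        """A failed node that responds again marks itself 'recovering'."""
        if self.state is NodeState.FAILED:
            self.responsive = True
            self.state = NodeState.RECOVERING

    def catch_up(self, history: list) -> None:
        """A recovering node replays missed tasklets, then marks itself 'operational'."""
        if self.state is NodeState.RECOVERING:
            for tasklet in history:
                if tasklet not in self.applied:
                    self.applied.append(tasklet)
            self.state = NodeState.OPERATIONAL


def run_tasks(tasks: list, nodes: list) -> list:
    """Perform tasks in order; a task is a list of tasklets.

    The next task starts only after every tasklet of the current one has been
    attempted. Returns the ordered history of tasklets, used later for catch-up.
    """
    history = []
    for task in tasks:
        for tasklet in task:
            for node in nodes:
                if node.state is not NodeState.OPERATIONAL:
                    continue  # failed or recovering nodes are skipped
                if not node.perform(tasklet):
                    node.state = NodeState.FAILED  # not timely performed
            history.append(tasklet)
    return history
```

In this sketch, a node whose responsive flag is False is marked failed on its first missed tasklet and ignored for the rest of the run. Calling come_back() and then catch_up(history) on that node returns it to the operational state with the same applied tasklets as its peers.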