Publications
Dynamically Improving Resiliency to Timing Errors for Stream Processing Workloads
Abstract
Large-scale data processing paradigms, such as stream processing, are widespread in academic and corporate workloads. These environments are commonly subject to real-time requirements, such as latency and throughput targets, and to resiliency requirements against node or network failures. These requirements have generally been approached as separate problems. Intermittent timing delays due to factors such as garbage collection can further complicate management of a stream processing workload, and insufficient resource allocations can also lead to poor performance. Currently, tuning these applications is done manually, and we show that improper configuration can greatly affect performance: even 100 ms of added latency in online sales platforms has been reported to result in lower sales. In this paper we propose Dynamo, a framework and monitor that implements a methodology for addressing both the …
- Date
- December 18, 2017
- Authors
- G.P.C. Tran, J.P. Walters, S.P. Crago
- Conference
- The 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT’17)