Best Practices for Enabling Speculative Execution on Large Scale Platforms

Apache Spark's speculative execution feature handles tasks that run slowly within a stage because of environment issues such as a slow network or a slow disk. If a task in a stage is running slowly, the Spark driver can launch a speculative copy of it on a different host. Spark then takes the result from whichever of the two attempts, the regular task or its speculative copy, completes successfully first, and kills the slower one.
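The decision to launch a speculative copy can be sketched as a small standalone function. This is a simplified model under stated assumptions, not Spark's scheduler code: the function name `speculatable_tasks` and the sample durations are hypothetical, and the `quantile` and `multiplier` arguments are only modeled on the `spark.speculation.quantile` and `spark.speculation.multiplier` settings, with illustrative values. The idea is that a running task becomes a candidate once enough of the stage has finished and the task has run noticeably longer than the median finished task.

```python
import math
from statistics import median


def speculatable_tasks(finished_durations, running_durations, total_tasks,
                       quantile=0.75, multiplier=1.5):
    """Return indices of running tasks slow enough to get a speculative copy.

    Simplified, hypothetical model of the speculation check.
    """
    # Do nothing until a minimum fraction of the stage's tasks has finished,
    # so the median is based on a meaningful sample.
    min_finished = max(math.floor(quantile * total_tasks), 1)
    if len(finished_durations) < min_finished:
        return []

    # A running task is "slow" if it has already run longer than
    # multiplier times the median duration of the finished tasks.
    threshold = multiplier * median(finished_durations)
    return [i for i, d in enumerate(running_durations) if d > threshold]


# Hypothetical stage with 10 tasks: 8 finished, 2 still running (seconds).
finished = [10.0, 11.0, 12.0, 10.5, 11.5, 12.5, 9.5, 10.0]
running = [12.0, 40.0]
print(speculatable_tasks(finished, running, total_tasks=10))  # [1]
```

In this model, raising either `quantile` or `multiplier` makes speculation more conservative, which is the kind of trade-off involved in reducing the number of speculative copies that are later killed.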

When we first enabled speculation by default for all Spark applications on a large cluster of 10K+ nodes at LinkedIn, we observed that the default values of Spark's speculation configuration parameters did not work well for LinkedIn's batch jobs. For example, the system launched too many fruitless speculation tasks (i.e. tasks that were later killed). In addition, the speculation tasks did not help shorten the shuffle stages. To reduce the number of fruitless speculation tasks, we investigated the root cause, enhanced the Spark engine, and carefully tuned the speculation parameters. We analyzed the number of speculation tasks launched, the number of fruitful versus fruitless speculation tasks, and their corresponding CPU and memory resource consumption in gigabyte-hours. We were able to reduce average job response times by 13%, decrease the standard deviation of job elapsed times by 40%, and lower total resource consumption by 24% in a heavily utilized multi-tenant environment on a large cluster. In this talk, we will share our experience of enabling speculative execution to achieve a good reduction in job elapsed time while keeping overhead minimal.
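The accounting described above, counting fruitful versus fruitless speculation tasks and their cost in gigabyte-hours, can be sketched as follows. This is a minimal illustration and not the analysis pipeline used at LinkedIn: the `SpeculativeAttempt` record, its fields, and the sample values are hypothetical, and gigabyte-hours is assumed here to mean memory held multiplied by the time the attempt ran.

```python
from dataclasses import dataclass


@dataclass
class SpeculativeAttempt:
    duration_s: float  # wall-clock seconds the speculative attempt ran
    memory_gb: float   # memory held while it ran (assumed metric)
    won: bool          # True if it finished first and its result was used


def gb_hours(attempt):
    """Memory-time cost of one attempt in gigabyte-hours."""
    return attempt.memory_gb * attempt.duration_s / 3600.0


def summarize(attempts):
    """Split attempts into fruitful (won) and fruitless (killed) and total their cost."""
    fruitful = [a for a in attempts if a.won]
    fruitless = [a for a in attempts if not a.won]
    return {
        "launched": len(attempts),
        "fruitful": len(fruitful),
        "fruitless": len(fruitless),
        "fruitful_gb_hours": sum(gb_hours(a) for a in fruitful),
        "fruitless_gb_hours": sum(gb_hours(a) for a in fruitless),
    }


# Hypothetical sample: one fruitful attempt and two fruitless ones.
attempts = [
    SpeculativeAttempt(duration_s=1800, memory_gb=4, won=True),
    SpeculativeAttempt(duration_s=3600, memory_gb=2, won=False),
    SpeculativeAttempt(duration_s=900, memory_gb=4, won=False),
]
summary = summarize(attempts)
print(summary)
```

For this sample the fruitless attempts cost 3.0 gigabyte-hours against 2.0 for the fruitful one, the sort of imbalance that parameter tuning aims to shrink.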


Databricks
