Hello, we have some features coming from JDBC sources, and some jobs failed with this error:
ExecutorLostFailure (executor 14 exited caused by one of the running tasks) Reason: Container from a bad node: container_e06_1594496268635_0006_01_000015 on host: hopsworks0.logicalclocks.com. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
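For reference, exit status 143 is the conventional "128 + signal number" encoding: the container received SIGTERM (signal 15), which is what YARN sends when it kills a container on request, typically for exceeding its memory allocation. A tiny sketch of the arithmetic:

```python
import signal

# YARN reports exit status 128 + N when a container dies from signal N.
# SIGTERM is signal 15, so a container "killed on request" exits with 143.
exit_status = 128 + int(signal.SIGTERM)
print(exit_status)  # 143
```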
Do we have to modify a parameter, or what else could be causing these errors?
Regards!
Hi @Theo
Yes, it's a Hopsworks job that reads the features from a JDBC source using the standard parameters (2048m driver, 4080m executor); the cluster is the default (12g), all in one VM server with 125 GB of RAM.
That feature job reads from 10 different tables in the same location and seems to hit the maximum at a certain step.
What I did was reset the kernel and the application and split the job in two, and it seems to work fine. But I wonder which parameter needs to be increased: the cluster, YARN, …? Because setting the driver and executor memory higher than the cluster maximum fails.
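The workaround of splitting the job in two can be generalized: instead of reading all 10 tables in one run, process them in fixed-size batches so no single run holds everything at once. A minimal sketch (the table names and batch size are made up for illustration):

```python
def batches(items, size):
    """Yield successive slices so each run only processes `size` tables."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Hypothetical table names; the real job reads 10 JDBC tables.
tables = [f"table_{n}" for n in range(10)]
groups = [list(b) for b in batches(tables, 5)]
print(len(groups))  # 2 batches of 5 tables each
```

Each batch could then be submitted as its own Spark job, which keeps the per-run memory footprint within the cluster limit.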
Maybe too much data is being stored on the driver and it runs out of memory. Can you check whether the failing container is the driver or one of the executors?
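If it does turn out to be a memory issue, the relevant knobs are the standard Spark memory settings. A sketch of what could go in `spark-defaults.conf` or the job's Spark config (the values are examples, not recommendations; they must stay within the YARN per-container maximum):

```
# Example values only; must fit under yarn.scheduler.maximum-allocation-mb
spark.driver.memory              4g
spark.executor.memory            4g
spark.executor.memoryOverhead    1g
```

The `memoryOverhead` setting matters here because YARN kills a container based on its total footprint (heap plus off-heap), not just the JVM heap.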
@Theo
I can't find what I would need. I was looking through the Spark docs and other websites but could not find how to do what I require, e.g. give the cluster more than 16 GB of RAM like you can on a cloud deployment. Would you know how I can modify the config?
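If the cluster-wide cap is what's blocking the larger driver/executor settings, the limits live on the YARN side rather than in Spark. A sketch of the relevant `yarn-site.xml` properties (values are examples sized against the 125 GB VM; the actual file location depends on the Hopsworks installation):

```xml
<!-- How much memory YARN may hand out on this node in total -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>24576</value>
</property>
<!-- The most a single container (driver or executor) may request -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>24576</value>
</property>
```

After changing these, the NodeManager and ResourceManager need to be restarted for the new limits to take effect.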
Regards