Is it unsafe to run multiple TensorFlow processes on the same GPU?


I have one GPU (Titan X Pascal, 12 GB VRAM) and would like to train multiple models, in parallel, on the same GPU.

I tried encapsulating my model in a single Python program (called model.py), and I included code in model.py to restrict VRAM usage (based on this example). I was able to run up to 3 instances of model.py concurrently on my GPU (with each instance taking a little less than 33% of the VRAM). Mysteriously, when I tried with 4 models, I received an error:

2017-09-10 13:27:43.714908: E tensorflow/stream_executor/cuda/cuda_dnn.cc:371] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2017-09-10 13:27:43.714973: E tensorflow/stream_executor/cuda/cuda_dnn.cc:338] Could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2017-09-10 13:27:43.714988: F tensorflow/core/kernels/conv_ops.cc:672] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
Aborted (core dumped)

I later observed on the TensorFlow GitHub that people seem to think it is unsafe to have more than one TensorFlow process running per GPU. Is this true, and is there an explanation for why this is the case? Why was I able to have 3 TensorFlow processes running on the same GPU, but not 4?
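
For context, here is a minimal sketch of the kind of per-process VRAM cap described above. It assumes the TensorFlow 1.x session API (tf.GPUOptions, tf.ConfigProto, tf.Session); the helper names per_process_fraction and make_session and the 10% reserve are hypothetical, chosen only for illustration, and this is not the exact code from model.py.

```python
def per_process_fraction(n_processes, reserve=0.1):
    """Split the GPU memory not held in reserve evenly across processes.

    `reserve` is an assumed safety margin (10% by default), not a value
    taken from the question or from TensorFlow itself.
    """
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    if not 0.0 <= reserve < 1.0:
        raise ValueError("reserve must be in [0, 1)")
    return (1.0 - reserve) / n_processes


def make_session(n_processes):
    """Create a TF 1.x session whose allocator is capped at its share of VRAM."""
    import tensorflow as tf  # imported lazily; assumes TensorFlow 1.x

    gpu_options = tf.GPUOptions(
        per_process_gpu_memory_fraction=per_process_fraction(n_processes)
    )
    return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
```

With three processes and the assumed 10% reserve, each instance would be capped at roughly 30% of the VRAM, which matches the "a little less than 33%" figure above; with four, the cap would drop to 22.5% each.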

