[oneDNN] Added clearing oneDNN cache per executor #32499

Merged (2 commits) on Apr 28, 2021

jczaja
Contributor

@jczaja jczaja commented Apr 23, 2021

PR types

Bug fixes

PR changes

Others

Describe

This is the second fix for #31992. The problem was that an executor (naive or regular) cleared the oneDNN cache, which is bad behaviour when multiple executors are used from multiple threads: one thread may be in the middle of execution when the cache is cleared.
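
To illustrate the idea of clearing the cache per executor, here is a minimal sketch and not the actual Paddle implementation. It assumes cached objects are grouped by the address of the executor that created them, which is what the name of platform::AttachPointerHashToMKLDNNKey in this PR's diff suggests. The names PerExecutorCache, CachedBlob, Set, Get and Clear are hypothetical.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a cached oneDNN object (primitive, memory, ...).
struct CachedBlob {
  std::string payload;
};

// Hypothetical cache: blobs are grouped by the address of the executor that
// created them, so one executor can drop its entries without touching others.
class PerExecutorCache {
 public:
  void Set(const void* executor, const std::string& key,
           std::shared_ptr<CachedBlob> blob) {
    std::lock_guard<std::mutex> lock(mutex_);
    blobs_[Id(executor)][key] = std::move(blob);
  }

  // Returns nullptr when this executor has no entry under `key`.
  std::shared_ptr<CachedBlob> Get(const void* executor,
                                  const std::string& key) const {
    std::lock_guard<std::mutex> lock(mutex_);
    auto owner = blobs_.find(Id(executor));
    if (owner == blobs_.end()) return nullptr;
    auto it = owner->second.find(key);
    return it == owner->second.end() ? nullptr : it->second;
  }

  // Clears only the entries owned by `executor`.
  void Clear(const void* executor) {
    std::lock_guard<std::mutex> lock(mutex_);
    blobs_.erase(Id(executor));
  }

 private:
  static std::uintptr_t Id(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
  }

  mutable std::mutex mutex_;
  std::unordered_map<
      std::uintptr_t,
      std::unordered_map<std::string, std::shared_ptr<CachedBlob>>>
      blobs_;
};
```

In this sketch, calling Clear for one executor leaves the entries of every other executor intact, so a thread that is halfway through execution with a different executor keeps its cached objects.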

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI result first. See Paddle CI Manual for details.

@jczaja jczaja added the Intel label Apr 23, 2021
@jczaja jczaja changed the title [oneDNN] Added clearing oneDNN per executor [oneDNN] Added clearing oneDNN cache per executor Apr 23, 2021
@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Apr 25, 2021
@PaddlePaddle PaddlePaddle unlocked this conversation Apr 25, 2021
@jczaja
Contributor Author

jczaja commented Apr 26, 2021

@luotao1 PR-CI-Coverage failed, and it reports that one of the lines is not covered. I can see that the execution of the test (test_analyzer_int8_resnet) crosses this line, so I am not sure why coverage fails to notice that. Could you please advise on this?

@@ -169,6 +169,9 @@ void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id,
                   bool force_disable_gc, bool keep_kid_scopes) {
  platform::RecordBlock b(block_id);
  if (FLAGS_use_mkldnn) EnableMKLDNN(pdesc);
#ifdef PADDLE_WITH_MKLDNN
  platform::AttachPointerHashToMKLDNNKey(this, place_);
#endif
Contributor

How about this?

#ifdef PADDLE_WITH_MKLDNN
if (FLAGS_use_mkldnn) {
  EnableMKLDNN(pdesc);
  platform::AttachPointerHashToMKLDNNKey(this, place_);
}
#endif

Contributor Author

@luotao1 We cannot do that. Setting FLAGS_use_mkldnn = true iterates over the ops and sets the attribute "use_mkldnn=true" on them, but you can also set this attribute yourself without using FLAGS_use_mkldnn. An op already has use_mkldnn=true set when, for example, we create an operator inside unit tests (Python UTs). In that case FLAGS_use_mkldnn is false, but oneDNN execution still happens.
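
The distinction between the global flag and the per-op attribute can be sketched as follows. This is a simplified illustration under stated assumptions, not Paddle code: the Op struct and the functions EnableOneDnnForAll and AnyOpUsesOneDnn are hypothetical, and attributes are modelled as plain booleans.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified op: attributes are plain booleans here.
struct Op {
  std::string type;
  std::unordered_map<std::string, bool> attrs;
};

// What the global flag does in this sketch: mark every op for oneDNN.
void EnableOneDnnForAll(std::vector<Op>* ops) {
  for (auto& op : *ops) op.attrs["use_mkldnn"] = true;
}

// Whether any op will take the oneDNN path, regardless of the global flag.
bool AnyOpUsesOneDnn(const std::vector<Op>& ops) {
  for (const auto& op : ops) {
    auto it = op.attrs.find("use_mkldnn");
    if (it != op.attrs.end() && it->second) return true;
  }
  return false;
}
```

An op whose attribute was set directly makes AnyOpUsesOneDnn return true even if EnableOneDnnForAll was never called, which is why guarding the key attachment with the global flag alone would miss such cases.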

@luotao1
Contributor

luotao1 commented Apr 27, 2021

I can see that the execution of the test (test_analyzer_int8_resnet) crosses this line, so I am not sure why coverage fails to notice that.

Could you give details on how test_analyzer_int8_resnet crosses the ClearDeviceContext line?

@jczaja
Contributor Author

jczaja commented Apr 27, 2021

@luotao1 test_analyzer_int8_resnet is run during PR-CI-Coverage, as can be seen in its log. This test is strictly a oneDNN test without any conditional paths. I ran the test locally and stopped execution at the line that coverage listed as not covered (mkldnn_quantizer.cc:415). Below is a backtrace (captured in GDB) showing that this line was executed (stack frame #0). Stack frame #8 shows the entry source file corresponding to test_analyzer_int8_resnet being executed.

#0  paddle::AnalysisPredictor::MkldnnQuantizer::ClearDeviceContext (this=0x16c4480) at /home/jczaja/Paddle/paddle/fluid/inference/api/mkldnn_quantizer.cc:415
#1  0x00007ffff0b89371 in paddle::AnalysisPredictor::MkldnnQuantizer::Quantize (this=0x16c4480) at /home/jczaja/Paddle/paddle/fluid/inference/api/mkldnn_quantizer.cc:445
#2  0x00007ffff0b5713f in paddle::AnalysisPredictor::MkldnnQuantize (this=0xb1aea0) at /home/jczaja/Paddle/paddle/fluid/inference/api/analysis_predictor.cc:715
#3  0x00007ffff0b56f38 in paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2> (config=...) at /home/jczaja/Paddle/paddle/fluid/inference/api/analysis_predictor.cc:703
#4  0x00007ffff0b5af85 in paddle::CreatePaddlePredictor<paddle::AnalysisConfig> (config=...) at /home/jczaja/Paddle/paddle/fluid/inference/api/analysis_predictor.cc:1147
#5  0x000000000043ba81 in paddle::inference::CreateTestPredictor (config=0x7fffffffd510, use_analysis=true) at /home/jczaja/Paddle/paddle/fluid/inference/tests/api/tester_helper.h:330
#6  0x000000000043d635 in paddle::inference::TestOneThreadPrediction (config=0x7fffffffd510, inputs=..., outputs=0x7fffffffd3f0, use_analysis=true, data_type=paddle::framework::proto::VarType_Type_INT8, sample_latency=0x7fffffffd3ec) at /home/jczaja/Paddle/paddle/fluid/inference/tests/api/tester_helper.h:566
#7  0x000000000043fdcb in paddle::inference::CompareQuantizedAndAnalysis (config=0x7fffffffd720, qconfig=0x7fffffffd510, inputs=..., compared_idx=1) at /home/jczaja/Paddle/paddle/fluid/inference/tests/api/tester_helper.h:805
#8  0x0000000000441c48 in paddle::inference::analysis::Analyzer_int8_image_classification_quantization_Test::TestBody (this=0xafa060) at /home/jczaja/Paddle/paddle/fluid/inference/tests/api/analyzer_int8_image_classification_tester.cc:57
#9  0x00000000004a6a4a in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
#10 0x00000000004956da in testing::Test::Run() ()
#11 0x0000000000495818 in testing::TestInfo::Run() ()
#12 0x00000000004958f5 in testing::TestCase::Run() ()
#13 0x000000000049f955 in testing::internal::UnitTestImpl::RunAllTests() ()
#14 0x000000000049fae1 in testing::UnitTest::Run() ()
#15 0x000000000047fb0a in RUN_ALL_TESTS () at /home/jczaja/Paddle/build-debug/third_party/install/gtest/include/gtest/gtest.h:2341
#16 0x000000000047f79d in main (argc=8, argv=0x7fffffffde08) at /home/jczaja/Paddle/paddle/testing/paddle_gtest_main.cc:100

If you need more data then let me know.

@luotao1
Contributor

luotao1 commented Apr 28, 2021

@jczaja PR-CI-Coverage has now been waived.

Contributor

@juncaipeng juncaipeng left a comment

LGTM

Please cherry-pick to release 2.1.

@luotao1 luotao1 merged commit ba61076 into PaddlePaddle:develop Apr 28, 2021