CMake: Add optional precompiled header support #61940

Closed
wants to merge 5 commits

Conversation

peterbell10 (Collaborator) commented Jul 20, 2021

Stack from ghstack:

This adds a `USE_PRECOMPILED_HEADERS` option to the CMake build which
precompiles `ATen.h` and also `CUDAContext.h` for the CUDA library.
After making a change in `native_functions.yaml`, this speeds up compilation
time by around 15% on my machine.

Differential Revision: D29988775

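For readers who want to try the same idea in their own projects, here is a minimal sketch of how an optional PCH can be wired up with CMake 3.16's `target_precompile_headers`. This is not the PR's actual diff; the `core`/`core_cuda` targets and source files are placeholders.

```cmake
cmake_minimum_required(VERSION 3.16)  # target_precompile_headers needs 3.16+
project(pch_demo CXX)

option(USE_PRECOMPILED_HEADERS
    "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16." OFF)

add_library(core STATIC core.cpp)            # stand-in for the CPU library
add_library(core_cuda STATIC core_cuda.cpp)  # stand-in for the CUDA library

if(USE_PRECOMPILED_HEADERS)
  # Precompile the umbrella header once per target; every translation unit
  # in the target then reuses the precompiled result.
  target_precompile_headers(core PRIVATE
      "$<$<COMPILE_LANGUAGE:CXX>:ATen/ATen.h>")
  # The CUDA library additionally precompiles the CUDA context header.
  target_precompile_headers(core_cuda PRIVATE
      "$<$<COMPILE_LANGUAGE:CXX>:ATen/cuda/CUDAContext.h>")
endif()
```

Because CMake force-includes the PCH into every source of the target, one expensive header is parsed once instead of in every translation unit; the same property is what makes the pitfall discussed later in the review possible.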

facebook-github-bot (Contributor) commented Jul 20, 2021

💊 CI failures summary and remediations

As of commit 9e33e01 (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_test (1/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jul 28 19:37:55 2021-07-28 19:37:55.947626: E t...w/compiler/xla/service/slow_operation_alarm.cc:55]
Jul 28 19:26:28   test_AvgPool3d_backward_after_cat_dim1_device_xla (__main__.TestNNDeviceTypeXLA) ... skip (0.002s)
Jul 28 19:26:28   test_BatchNorm_empty_xla (__main__.TestNNDeviceTypeXLA) ... ok (0.142s)
Jul 28 19:26:28   test_Bilinear_empty_xla (__main__.TestNNDeviceTypeXLA) ... skip (0.003s)
Jul 28 19:26:28   test_CTCLoss_cudnn_xla (__main__.TestNNDeviceTypeXLA) ... skip (0.003s)
Jul 28 19:26:29   test_CTCLoss_empty_target_xla (__main__.TestNNDeviceTypeXLA) ... ok (0.913s)
Jul 28 19:28:39   test_Conv2d_backward_depthwise_xla_float64 (__main__.TestNNDeviceTypeXLA) ... 2021-07-28 19:28:39.777472: E tensorflow/compiler/xla/service/slow_operation_alarm.cc:55] 
Jul 28 19:28:39 ********************************
Jul 28 19:28:39 Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Jul 28 19:28:39 Compiling module SyncTensorsGraph.30789
Jul 28 19:28:39 ********************************
Jul 28 19:37:55 2021-07-28 19:37:55.947626: E tensorflow/compiler/xla/service/slow_operation_alarm.cc:55] 
Jul 28 19:37:55 ********************************
Jul 28 19:37:55 Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Jul 28 19:37:55 Compiling module SyncTensorsGraph.35441
Jul 28 19:37:55 ********************************


Too long with no output (exceeded 1h30m0s): context deadline exceeded

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (2/2)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Jul 28 21:12:51 ERROR 2021-07-28T16:40:22Z: scc...eof ((socklen_t)))\n ^\n" }
Jul 28 21:12:51 ERROR 2021-07-28T16:40:14Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:332:2: error: \'struct sockaddr\' has no member named \'sa_len\'\n x.sa_len = 0;\n  ^\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:18Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:366:10: error: \'RTLD_MEMBER\' undeclared (first use in this function); did you mean \'RTLD_NEXT\'?\n   (void) RTLD_MEMBER;\n          ^~~~~~~~~~~\n          RTLD_NEXT\nconftest.c:366:10: note: each undeclared identifier is reported only once for each function it appears in\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:19Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c:361:9: error: unknown type name \'not\'\n         not a universal capable compiler\n         ^~~\nconftest.c:361:15: error: expected \'=\', \',\', \';\', \'asm\' or \'__attribute__\' before \'universal\'\n         not a universal capable compiler\n               ^~~~~~~~~\nconftest.c:361:15: error: unknown type name \'universal\'\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:19Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:367:4: error: unknown type name \'not\'; did you mean \'ino_t\'?\n    not big endian\n    ^~~\n    ino_t\nconftest.c:367:12: error: expected \'=\', \',\', \';\', \'asm\' or \'__attribute__\' before \'endian\'\n    not big endian\n            ^~~~~~\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:20Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:378:4: error: \'struct stat\' has no member named \'st_mtimespec\'; did you mean \'st_mtim\'?\n st.st_mtimespec.tv_nsec = 1;\n    ^~~~~~~~~~~~\n    st_mtim\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:22Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:402:24: error: expected expression before \')\' token\n if (sizeof ((socklen_t)))\n                        ^\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 =========== If your build fails, please take a look at the log above for possible reasons ===========
Jul 28 21:12:51 Compile requests                   12832
Jul 28 21:12:51 Compile requests executed           7789
Jul 28 21:12:51 Cache hits                          6273
Jul 28 21:12:51 Cache hits (C/C++)                  5877
Jul 28 21:12:51 Cache hits (CUDA)                    396
Jul 28 21:12:51 Cache misses                        1444
Jul 28 21:12:51 Cache misses (C/C++)                1275
Jul 28 21:12:51 Cache misses (CUDA)                  169

2 jobs timed out:

  • pytorch_xla_linux_bionic_py3_6_clang9_test
  • pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build

This comment was automatically generated by Dr. CI.

peterbell10 added a commit that referenced this pull request Jul 21, 2021

ghstack-source-id: 990b4e460433e89c5ebd6b3513e169cf495b8996
Pull Request resolved: #61940

peterbell10 added a commit that referenced this pull request Jul 22, 2021

ghstack-source-id: 21abe8f4cd9c8410c9835c42e92b7e6aa2fc4071
Pull Request resolved: #61940

peterbell10 added a commit that referenced this pull request Jul 28, 2021

ghstack-source-id: 04a91bc7298fff15e124d821e4cdf69bdc9b3bf2
Pull Request resolved: #61940

peterbell10 (Collaborator, Author):

@malfet PTAL.

malfet approved these changes Jul 29, 2021
```cmake
option(USE_PRECOMPILED_HEADERS "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16." OFF)
if(USE_PRECOMPILED_HEADERS AND (CMAKE_VERSION VERSION_LESS "3.16"))
  message(FATAL_ERROR "Precompiled headers require cmake >= 3.16")
endif()
```

malfet (Contributor):

Is there a reason not to turn it on by default?

Suggested change:

```diff
-option(USE_PRECOMPILED_HEADERS "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16." OFF)
-if(USE_PRECOMPILED_HEADERS AND (CMAKE_VERSION VERSION_LESS "3.16"))
-  message(FATAL_ERROR "Precompiled headers require cmake >= 3.16")
-endif()
+cmake_dependent_option(USE_PRECOMPILED_HEADERS "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16." ON "NOT CMAKE_VERSION VERSION_LESS 3.16" OFF)
```
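
For readers unfamiliar with `cmake_dependent_option`: the command lives in a standard module that must be included first, and its signature is `cmake_dependent_option(<option> <doc> <default> <condition> <force>)` — the option takes `<default>` while `<condition>` holds and is forced to `<force>` otherwise. A sketch of the behavior the suggestion is after (default ON on CMake >= 3.16, forced OFF below); the exact form PyTorch later adopted in #62827 may differ:

```cmake
include(CMakeDependentOption)  # provides cmake_dependent_option()

# Default ON when CMake is new enough to support PCH; forced OFF (and the
# option hidden from the cache UI) on older CMake versions.
cmake_dependent_option(USE_PRECOMPILED_HEADERS
    "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16."
    ON "NOT CMAKE_VERSION VERSION_LESS 3.16" OFF)
```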


peterbell10 (Collaborator, Author):

I was just being conservative. There is an issue where the PCH gets included everywhere, so it's possible for code to compile with PCH enabled but not without, and vice versa. It's also not necessarily a win on all machines; for example, on a 32-core machine I see no benefit in overall compile times.
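
A hedged illustration of how that mismatch can be kept in check: since CMake 3.16, individual sources can opt out of a target's PCH via the `SKIP_PRECOMPILE_HEADERS` source property, and building occasionally with `-DUSE_PRECOMPILED_HEADERS=OFF` catches files that only compile thanks to headers the PCH drags in. The file name below is invented for illustration, not taken from this PR.

```cmake
# Opt one source out of the target's precompiled header so it must compile
# with only the headers it includes itself (hypothetical file name).
set_source_files_properties(src/strict_file.cpp
    PROPERTIES SKIP_PRECOMPILE_HEADERS ON)
```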

malfet (Contributor) commented Jul 29, 2021

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor):
@malfet merged this pull request in b7ac286.

peterbell10 added a commit to peterbell10/pytorch that referenced this pull request Aug 5, 2021
This option was added in pytorch#61940 and fits with this section's theme of improving
build times.

I've also changed it to a `cmake_dependent_option` instead of `FATAL_ERROR`ing
for older CMake versions.
facebook-github-bot deleted the gh/peterbell10/99/head branch August 7, 2021 14:17
facebook-github-bot pushed a commit that referenced this pull request Aug 17, 2021
Summary:
This option was added in #61940 and fits with this section's theme of improving build times.

I've also changed it to a `cmake_dependent_option` instead of `FATAL_ERROR`ing for older CMake versions.

Pull Request resolved: #62827

Reviewed By: astaff

Differential Revision: D30342102

Pulled By: malfet

fbshipit-source-id: 3095b44b7085aee8a884ec95cba9f8998d4442e7