CMake: Add optional precompiled header support #61940

Closed
wants to merge 5 commits

Conversation

peterbell10 (Collaborator) commented Jul 20, 2021

Stack from ghstack:

This adds a `USE_PRECOMPILED_HEADERS` option to the CMake build which
precompiles `ATen.h` and also `CUDAContext.h` for the CUDA library.
After making a change in `native_functions.yaml`, this speeds up compilation
time by around 15% on my machine.

Differential Revision: D29988775

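For readers who want to try the same idea in their own projects, here is a minimal sketch of how an optional PCH can be wired up with CMake 3.16's `target_precompile_headers`. This is not the PR's actual diff; the `core`/`core_cuda` targets and source files are placeholders.

```cmake
cmake_minimum_required(VERSION 3.16)  # target_precompile_headers needs 3.16+
project(pch_demo CXX)

option(USE_PRECOMPILED_HEADERS
    "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16." OFF)

add_library(core STATIC core.cpp)            # stand-in for the CPU library
add_library(core_cuda STATIC core_cuda.cpp)  # stand-in for the CUDA library

if(USE_PRECOMPILED_HEADERS)
  # Precompile the umbrella header once per target; every translation unit
  # in the target then reuses the precompiled result.
  target_precompile_headers(core PRIVATE
      "$<$<COMPILE_LANGUAGE:CXX>:ATen/ATen.h>")
  # The CUDA library additionally precompiles the CUDA context header.
  target_precompile_headers(core_cuda PRIVATE
      "$<$<COMPILE_LANGUAGE:CXX>:ATen/cuda/CUDAContext.h>")
endif()
```

Because CMake force-includes the PCH into every source of the target, one expensive header is parsed once instead of in every translation unit; the same property is what makes the pitfall discussed later in the review possible.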

facebook-github-bot (Contributor) commented Jul 20, 2021

💊 CI failures summary and remediations

As of commit 9e33e01 (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_test (1/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jul 28 19:37:55 2021-07-28 19:37:55.947626: E t...w/compiler/xla/service/slow_operation_alarm.cc:55]
Jul 28 19:26:28   test_AvgPool3d_backward_after_cat_dim1_device_xla (__main__.TestNNDeviceTypeXLA) ... skip (0.002s)
Jul 28 19:26:28   test_BatchNorm_empty_xla (__main__.TestNNDeviceTypeXLA) ... ok (0.142s)
Jul 28 19:26:28   test_Bilinear_empty_xla (__main__.TestNNDeviceTypeXLA) ... skip (0.003s)
Jul 28 19:26:28   test_CTCLoss_cudnn_xla (__main__.TestNNDeviceTypeXLA) ... skip (0.003s)
Jul 28 19:26:29   test_CTCLoss_empty_target_xla (__main__.TestNNDeviceTypeXLA) ... ok (0.913s)
Jul 28 19:28:39   test_Conv2d_backward_depthwise_xla_float64 (__main__.TestNNDeviceTypeXLA) ... 2021-07-28 19:28:39.777472: E tensorflow/compiler/xla/service/slow_operation_alarm.cc:55] 
Jul 28 19:28:39 ********************************
Jul 28 19:28:39 Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Jul 28 19:28:39 Compiling module SyncTensorsGraph.30789
Jul 28 19:28:39 ********************************
Jul 28 19:37:55 2021-07-28 19:37:55.947626: E tensorflow/compiler/xla/service/slow_operation_alarm.cc:55] 
Jul 28 19:37:55 ********************************
Jul 28 19:37:55 Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Jul 28 19:37:55 Compiling module SyncTensorsGraph.35441
Jul 28 19:37:55 ********************************


Too long with no output (exceeded 1h30m0s): context deadline exceeded

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (2/2)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Jul 28 21:12:51 ERROR 2021-07-28T16:40:22Z: scc...eof ((socklen_t)))\n ^\n" }
Jul 28 21:12:51 ERROR 2021-07-28T16:40:14Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:332:2: error: \'struct sockaddr\' has no member named \'sa_len\'\n x.sa_len = 0;\n  ^\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:18Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:366:10: error: \'RTLD_MEMBER\' undeclared (first use in this function); did you mean \'RTLD_NEXT\'?\n   (void) RTLD_MEMBER;\n          ^~~~~~~~~~~\n          RTLD_NEXT\nconftest.c:366:10: note: each undeclared identifier is reported only once for each function it appears in\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:19Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c:361:9: error: unknown type name \'not\'\n         not a universal capable compiler\n         ^~~\nconftest.c:361:15: error: expected \'=\', \',\', \';\', \'asm\' or \'__attribute__\' before \'universal\'\n         not a universal capable compiler\n               ^~~~~~~~~\nconftest.c:361:15: error: unknown type name \'universal\'\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:19Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:367:4: error: unknown type name \'not\'; did you mean \'ino_t\'?\n    not big endian\n    ^~~\n    ino_t\nconftest.c:367:12: error: expected \'=\', \',\', \';\', \'asm\' or \'__attribute__\' before \'endian\'\n    not big endian\n            ^~~~~~\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:20Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:378:4: error: \'struct stat\' has no member named \'st_mtimespec\'; did you mean \'st_mtim\'?\n st.st_mtimespec.tv_nsec = 1;\n    ^~~~~~~~~~~~\n    st_mtim\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 ERROR 2021-07-28T16:40:22Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "conftest.c: In function \'main\':\nconftest.c:402:24: error: expected expression before \')\' token\n if (sizeof ((socklen_t)))\n                        ^\n" }
Jul 28 21:12:51 
Jul 28 21:12:51 =========== If your build fails, please take a look at the log above for possible reasons ===========
Jul 28 21:12:51 Compile requests                   12832
Jul 28 21:12:51 Compile requests executed           7789
Jul 28 21:12:51 Cache hits                          6273
Jul 28 21:12:51 Cache hits (C/C++)                  5877
Jul 28 21:12:51 Cache hits (CUDA)                    396
Jul 28 21:12:51 Cache misses                        1444
Jul 28 21:12:51 Cache misses (C/C++)                1275
Jul 28 21:12:51 Cache misses (CUDA)                  169

2 jobs timed out:

  • pytorch_xla_linux_bionic_py3_6_clang9_test
  • pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build

This comment was automatically generated by Dr. CI.

peterbell10 added a commit that referenced this pull request Jul 21, 2021

ghstack-source-id: 990b4e460433e89c5ebd6b3513e169cf495b8996
Pull Request resolved: #61940

peterbell10 added a commit that referenced this pull request Jul 22, 2021

ghstack-source-id: 21abe8f4cd9c8410c9835c42e92b7e6aa2fc4071
Pull Request resolved: #61940

peterbell10 added a commit that referenced this pull request Jul 28, 2021

ghstack-source-id: 04a91bc7298fff15e124d821e4cdf69bdc9b3bf2
Pull Request resolved: #61940

peterbell10 (Collaborator, Author):

@malfet PTAL.

malfet approved these changes Jul 29, 2021
```cmake
option(USE_PRECOMPILED_HEADERS "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16." OFF)
if(USE_PRECOMPILED_HEADERS AND (CMAKE_VERSION VERSION_LESS "3.16"))
  message(FATAL_ERROR "Precompiled headers require cmake >= 3.16")
endif()
```

malfet (Contributor):

Is there a reason not to turn it on by default?

Suggested change:

```diff
-option(USE_PRECOMPILED_HEADERS "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16." OFF)
-if(USE_PRECOMPILED_HEADERS AND (CMAKE_VERSION VERSION_LESS "3.16"))
-  message(FATAL_ERROR "Precompiled headers require cmake >= 3.16")
-endif()
+cmake_dependent_option(USE_PRECOMPILED_HEADERS "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16." ON "NOT CMAKE_VERSION VERSION_LESS 3.16" OFF)
```
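
For readers unfamiliar with `cmake_dependent_option`: the command lives in a standard module that must be included first, and its signature is `cmake_dependent_option(<option> <doc> <default> <condition> <force>)` — the option takes `<default>` while `<condition>` holds and is forced to `<force>` otherwise. A sketch of the behavior the suggestion is after (default ON on CMake >= 3.16, forced OFF below); the exact form PyTorch later adopted in #62827 may differ:

```cmake
include(CMakeDependentOption)  # provides cmake_dependent_option()

# Default ON when CMake is new enough to support PCH; forced OFF (and the
# option hidden from the cache UI) on older CMake versions.
cmake_dependent_option(USE_PRECOMPILED_HEADERS
    "Use pre-compiled headers to accelerate build. Requires cmake >= 3.16."
    ON "NOT CMAKE_VERSION VERSION_LESS 3.16" OFF)
```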


peterbell10 (Collaborator, Author):

I was just being conservative. There is an issue where the PCH gets included everywhere, so it's possible for code to compile with PCH enabled but not without, and vice versa. It's also not necessarily a win on all machines; for example, on a 32-core machine I see no benefit in overall compile times.
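
A hedged illustration of how that mismatch can be kept in check: since CMake 3.16, individual sources can opt out of a target's PCH via the `SKIP_PRECOMPILE_HEADERS` source property, and building occasionally with `-DUSE_PRECOMPILED_HEADERS=OFF` catches files that only compile thanks to headers the PCH drags in. The file name below is invented for illustration, not taken from this PR.

```cmake
# Opt one source out of the target's precompiled header so it must compile
# with only the headers it includes itself (hypothetical file name).
set_source_files_properties(src/strict_file.cpp
    PROPERTIES SKIP_PRECOMPILE_HEADERS ON)
```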

malfet (Contributor) commented Jul 29, 2021

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor):
@malfet merged this pull request in b7ac286.

peterbell10 added a commit to peterbell10/pytorch that referenced this pull request Aug 5, 2021
This option was added in pytorch#61940 and fits with this section's theme of improving
build times.

I've also changed it to a `cmake_dependent_option` instead of `FATAL_ERROR`ing
for older CMake versions.
facebook-github-bot deleted the gh/peterbell10/99/head branch August 7, 2021 14:17
facebook-github-bot pushed a commit that referenced this pull request Aug 17, 2021
Summary:
This option was added in #61940 and fits with this section's theme of improving build times.

I've also changed it to a `cmake_dependent_option` instead of `FATAL_ERROR`ing for older CMake versions.

Pull Request resolved: #62827

Reviewed By: astaff

Differential Revision: D30342102

Pulled By: malfet

fbshipit-source-id: 3095b44b7085aee8a884ec95cba9f8998d4442e7