master
Commits on Nov 7, 2021
-
[torch.fx] Fix replace pattern mechanism (#66442)
Summary: Fixes #{issue number}

The following code would not replace every occurrence of the pattern:

```python
def f(x):
    x = torch.sigmoid(x)
    x = torch.sigmoid(x)
    return torch.sigmoid(x)

def pattern(x):
    return torch.sigmoid(x)

def replacement(x):
    return torch.exp(x)

def comparison(x):
    x = torch.exp(x)
    x = torch.exp(x)
    return torch.exp(x)

traced = symbolic_trace(f)
comparison_fn = symbolic_trace(comparison)

subgraph_rewriter.replace_pattern(traced, pattern, replacement)
# Only one sigmoid gets converted.
```

This PR fixes the replacement mechanism and adds a test for it. Pull Request resolved: #66442 Reviewed By: ZolotukhinM Differential Revision: D32238424 Pulled By: ansley fbshipit-source-id: 386e777174c639baafc166d5ffbc0658a96b1ee9
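For context, a self-contained version of the snippet above (imports assumed from `torch.fx`; with this fix, all three sigmoid calls should be rewritten):

```python
import torch
from torch.fx import symbolic_trace, subgraph_rewriter

def f(x):
    x = torch.sigmoid(x)
    x = torch.sigmoid(x)
    return torch.sigmoid(x)

def pattern(x):
    return torch.sigmoid(x)

def replacement(x):
    return torch.exp(x)

traced = symbolic_trace(f)
subgraph_rewriter.replace_pattern(traced, pattern, replacement)
# After the fix, the generated code should contain three torch.exp calls
# and no torch.sigmoid.
print(traced.code)
```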
Commits on Nov 6, 2021
-
Add custom zipper script to zip python modules for torch.deploy (#67006)
Summary: Pull Request resolved: #67006 Test Plan: nervouslaugh_ Reviewed By: shunting314 Differential Revision: D31822429 fbshipit-source-id: c2efeab1446fbeb70b98d4ee766fbc670cf091b0
-
[PyTorch Edge] Update bytecode version compatibility check (#67417)
Summary: Pull Request resolved: #67417 A bytecode version is valid when it is smaller than kMaxSupported and larger than kMinSupported. ghstack-source-id: 142609392 Test Plan:

```
buck test mode/dev //caffe2/test/cpp/jit:jit -- --exact 'caffe2/test/cpp/jit:jit - LiteInterpreterTest.isCompatibleFail'
```

Reviewed By: JacobSzwejbka, iseeyuan Differential Revision: D31984839 fbshipit-source-id: 2011e77455c931c0a8a58267494d44bcf167b877
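A minimal Python sketch of the compatibility predicate described above; the constant values are hypothetical (the real `kMinSupported`/`kMaxSupported` live in the C++ lite interpreter):

```python
# Hypothetical values for illustration; the real constants are defined in C++.
K_MIN_SUPPORTED_BYTECODE_VERSION = 3
K_MAX_SUPPORTED_BYTECODE_VERSION = 8

def is_compatible(bytecode_version: int) -> bool:
    # Valid when larger than the minimum and smaller than the maximum.
    return (K_MIN_SUPPORTED_BYTECODE_VERSION < bytecode_version
            < K_MAX_SUPPORTED_BYTECODE_VERSION)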
-
[DDP] Fix some issues with code example in DDP docstring (#67883)
Summary: Pull Request resolved: #67883 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Test Plan: Imported from OSS Reviewed By: zhaojuanmao Differential Revision: D32190946 Pulled By: jamesr66a fbshipit-source-id: a376324b95cbe833ffa606ecdfc6156432880f70
-
[rpc] Switch RPC agent check to TORCH_CHECK and add more descriptive …
…error (#67882) Summary: Pull Request resolved: #67882 I ran into a hard-to-interpret error message when trying to run the following script, which was missing an `init_rpc` call:

```python
# $ torchrun --standalone --nnodes=1 --nproc_per_node=1 script.py
import os

rank = int(os.environ['LOCAL_RANK'])
world_size = int(os.environ['WORLD_SIZE'])

import torch.distributed
# !!!!!! Uncomment the following and the script succeeds
# torch.distributed.rpc.init_rpc('worker', rank=rank, world_size=world_size)

import torch.distributed as dist
dist.init_process_group(backend='gloo')

import torchvision.models as models
import torch

rn50 = models.resnet50()
rn50.train()
rn50 = torch.nn.parallel.DistributedDataParallel(rn50)

from torch.distributed.rpc import RRef
from torch.distributed.optim import DistributedOptimizer

params = []
for param in rn50.parameters():
    params.append(RRef(param))

dist_optim = DistributedOptimizer(
    torch.optim.SGD,
    params,
    lr=0.05)

loss_func = torch.nn.CrossEntropyLoss()

with torch.distributed.autograd.context() as context_id:
    pred = rn50(torch.randn(50, 3, 224, 224))
    target = torch.randn(50, 1000).softmax(dim=1)
    loss = loss_func(pred, target)
    dist.autograd.backward(context_id, [loss])
    dist_optim.step(context_id)
```

Error:

```
Traceback (most recent call last):
  File "/xxx/torchrun_exp/script.py", line 23, in <module>
    params.append(RRef(param))
RuntimeError: agentINTERNAL ASSERT FAILED at "../torch/csrc/distributed/rpc/rpc_agent.cpp":237, please report a bug to PyTorch. Current RPC agent is not set!
```

Since this is a user-facing error, I've changed `TORCH_INTERNAL_ASSERT` to `TORCH_CHECK` and added a hint about how to resolve the issue. On the other hand, the fact that this was originally `TORCH_INTERNAL_ASSERT` may suggest that the author thought this should be an internal-only error condition. If there is some other place that should be throwing an exception in this case, let me know and I can adapt the fix to change that location.

Question for reviewers:
* Is there a good test file where I can add a test for this error condition?

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D32190947 Pulled By: jamesr66a fbshipit-source-id: 3621d755329fd524db68675c55b1daf20e716d43
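As a usage note, a minimal sketch of the fix on the user side (worker name and tensor are illustrative): call `init_rpc` before constructing any `RRef`:

```python
import os
import torch
import torch.distributed.rpc as rpc
from torch.distributed.rpc import RRef

rank = int(os.environ['LOCAL_RANK'])
world_size = int(os.environ['WORLD_SIZE'])

# Initialize the RPC agent before constructing any RRef; with the new
# TORCH_CHECK, the missing call produces an actionable error instead of
# an internal assert.
rpc.init_rpc(f'worker{rank}', rank=rank, world_size=world_size)

rref = RRef(torch.randn(2, 2))  # now succeeds
rpc.shutdown()
```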
Commits on Nov 5, 2021
-
Add meta support to tensor range factories (#67032)
Summary: Pull Request resolved: #67032 This PR adds meta backend support to the `range`, `arange`, `linspace`, and `logspace` operators. Note that the original PR (#66630) was reverted due to two failing unit tests in the Bionic CI. This revision includes a fix for those tests; otherwise its content is identical to the previous PR. Original commit changeset: 2f9d8d1acbb0 ghstack-source-id: 142487306 Test Plan: Extended the existing tensor creation tests to assert meta backend support. Reviewed By: zhaojuanmao Differential Revision: D31834403 fbshipit-source-id: a489858a2a8a38a03234b14408e14d2b208a8d34
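For illustration, a short sketch of what meta-backend support enables for these factories (shape and dtype inference without allocating real storage):

```python
import torch

# With meta support, the range factories produce shape/dtype-only tensors.
a = torch.arange(0, 10, 2, device="meta")
b = torch.linspace(0, 1, steps=5, device="meta")
print(a.shape, a.device)  # torch.Size([5]) meta
print(b.shape, b.dtype)   # torch.Size([5]) torch.float32
```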
-
Revert D31932215: [pytorch][PR] Don't #define NUM_THREADS
Test Plan: revert-hammer Differential Revision: D31932215 (f70e806) Original commit changeset: ccdf11e249fb fbshipit-source-id: 4c330aebe9cfb483f02ceb1fdaf5c3b0f8fa6fa1
-
[quant][fusion] Fix the additional_fuser_method argument for fuse_fx (#67876)
Summary: Pull Request resolved: #67876 Previously we missed this argument when calling obj.convert, so it had no effect on fusion. This PR fixes that and adds a test for it. Test Plan: python test/test_quantization.py TestFuseFx Imported from OSS Reviewed By: malfet Differential Revision: D32191364 fbshipit-source-id: 566bd39461010d70a21de71f611bb929976fe01d
-
Don't #define NUM_THREADS (#67258)
Summary: PyTorch doesn't compile with the latest `main` branch of cub again. The root cause is that PyTorch defines a macro `NUM_THREADS`, and cub added code like

```C++
template<...., int NUM_THREADS, ...>
```

so the macro and the template parameter conflict with each other. Pull Request resolved: #67258 Reviewed By: albanD Differential Revision: D31932215 Pulled By: ngimel fbshipit-source-id: ccdf11e249fbc0b6f654535067a0294037ee7b96
-
(torch/elastic) fix scale down bug caused by calling rdzv_handler.shu…
…tdown() on premature agent failures (#67749) Summary: Pull Request resolved: #67749 Fixes: #67742 Test Plan: Added unittests. Validated manually:

```
# start agent 0
$ torchrun --rdzv_backend c10d --rdzv_id 123 --rdzv_endpoint localhost:29500 --nnodes 1:2 --nproc_per_node 1 --monitor_interval 1 test.py

# start agent 1
torchrun --rdzv_backend c10d --rdzv_id 123 --rdzv_endpoint localhost:29500 --nnodes 1:2 --nproc_per_node 1 --monitor_interval 1 test.py

# kill agent 0
CTRL+C (SIGINT) or kill -15 (SIGTERM)

# restart it
torchrun --rdzv_backend c10d --rdzv_id 123 --rdzv_endpoint localhost:29500 --nnodes 1:2 --nproc_per_node 1 --monitor_interval 1 test.py
```

Reviewed By: cbalioglu Differential Revision: D32129005 fbshipit-source-id: db292268250ef6f1e06f5b4c5bd67124d8dfd325
-
Refactor cuDNN Convolution memory format and Conv-Bias-Relu code (#65594
) Summary: This PR makes several changes:
- Changed the function `bool cudnn_conv_use_channels_last(...)` to `at::MemoryFormat cudnn_conv_suggest_memory_format(...)`.
- Removed `resize_` in the cuDNN convolution code. Added a new overload of `TensorDescriptor::set` that also passes the desired memory format of the tensor.
- Disabled the use of double + channels_last on cuDNN Conv-Relu and Conv-Bias-Relu. Call `.contiguous(memory_format)` before passing data to cuDNN functions.
- Disabled the use of cuDNN fused Conv-Bias-Relu on cuDNN < 8.0 due to a CUDNN_STATUS_NOT_SUPPORTED error. Instead, use the native fallback path.
- Made the Conv-Bias-Relu code respect the global `allow_tf32` flag.

Per the cuDNN documentation, double + NHWC is generally not supported.

Close #66968 Fix #55301 Pull Request resolved: #65594 Reviewed By: jbschlosser, malfet Differential Revision: D32175766 Pulled By: ngimel fbshipit-source-id: 7ba079c9f7c46fc56f8bfef05bad0854acf380d7
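As a usage note, a minimal sketch (assuming a CUDA build) of making the memory format explicit before a convolution, mirroring what the refactor does internally with `cudnn_conv_suggest_memory_format`:

```python
import torch

# channels_last is the memory format cuDNN prefers for many convolutions.
x = torch.randn(8, 3, 32, 32, device="cuda")
x = x.contiguous(memory_format=torch.channels_last)
conv = torch.nn.Conv2d(3, 16, 3).to("cuda", memory_format=torch.channels_last)
out = conv(x)
print(out.is_contiguous(memory_format=torch.channels_last))  # True
```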
-
[Foreach] Implement L1&L2 norm (#62646)
Summary: Implement the L1 & L2 norms on the fast path, referencing [nvidia/apex](https://github.com/NVIDIA/apex/blob/master/csrc/multi_tensor_l2norm_kernel.cu). When `ord` is neither 1 nor 2, the slow path is chosen. Related: #58833 cc ptrblck mcarilli ngimel Pull Request resolved: #62646 Reviewed By: malfet Differential Revision: D32173421 Pulled By: ngimel fbshipit-source-id: 14b7544601658a979b83509df351e1848ded7675
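For illustration, a hedged sketch using the private `_foreach_norm` API this PR introduces (the exact call signature is an assumption):

```python
import torch

tensors = [torch.randn(3, 3) for _ in range(4)]
# Fast fused path for ord 1 and 2; other ords take the slow per-tensor path.
l2_norms = torch._foreach_norm(tensors, 2)
l1_norms = torch._foreach_norm(tensors, 1)
print(l2_norms[0], tensors[0].norm(2))  # should match
```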
-
[nnc] Add support for dynamic shapes in TensorExprKernel (#67861)
Summary: Pull Request resolved: #67861 Previously submitted as #67197. This got reverted because its failures were hidden by the failures of another PR. Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D32178196 Pulled By: navahgar fbshipit-source-id: cc8a5c68aed360d06289e69645461cfa773e1300
-
Add ownership to more edge tests (#67859)
Summary: Fixes #66232 This should be the last immediate task. I anticipate test ownership will change over time, but this is the last big thing needed to close it out. Pull Request resolved: #67859 Reviewed By: soulitzer Differential Revision: D32210534 Pulled By: janeyx99 fbshipit-source-id: 7fd835d87d9d35d49ec49de1fcfa29b085133e99
-
remove use of THGenerateAllTypes, clean up (#67867)
Summary: Per title Pull Request resolved: #67867 Reviewed By: mruberry Differential Revision: D32191053 Pulled By: ngimel fbshipit-source-id: 84eb6c2989495fca5f7b055c4984efe5de94e812
-
autodiff fix for autocast_to_xxx (#67648)
Summary: Fixes autocast + autodiff issue where `RuntimeError: grad_inputs.size() == node->inputs().size()INTERNAL ASSERT FAILED at "../torch/csrc/jit/runtime/autodiff.cpp":426, please report a bug to PyTorch.` Pull Request resolved: #67648 Reviewed By: cpuhrsch Differential Revision: D32083227 Pulled By: davidberard98 fbshipit-source-id: edf526cff4ec21874ae35ec730d13c250073e10c
-
[PyTorch Edge] backport test (#67824)
Summary: Pull Request resolved: #67824 Testing backport of all prod models using the model test framework. Ref: [Create tests at run-time (google test)](https://stackoverflow.com/questions/19160244/create-tests-at-run-time-google-test) Breaking the list of models into 20 chunks based on a simple hash (sum of all char values). ghstack-source-id: 142398833 Test Plan:

```
buck test //xplat/pytorch/mobile/test:test_read_all_mobile_model_configs
Starting new Buck daemon...
Parsing buck files: finished in 7.6 sec
Creating action graph: finished in 0.9 sec
[RE] Metadata: Session ID=[reSessionID-66f5adfe-50d1-4599-9828-3e8115181601]
[RE] Waiting on 0 remote actions. Completed 1008 actions remotely, action cache hit rate: 43.59%.
Downloaded 26/1523 artifacts, 252.60 Kbytes, 96.6% cache miss (for updated rules)
Building: finished in 01:18.6 min (100%) 5532/5532 jobs, 770/5532 updated
Total time: 01:27.3 min
Testing: finished in 11:21.6 min (41 PASS/0 FAIL)
BUILD SUCCEEDED
RESULTS FOR //xplat/pytorch/mobile/test:test_read_all_mobile_model_configs
PASS    673.8s  41 Passed  0 Skipped  0 Failed  //xplat/pytorch/mobile/test:test_read_all_mobile_model_configs
TESTS PASSED
```

Reviewed By: dhruvbird Differential Revision: D32068955 fbshipit-source-id: d06c2434a4a69572ab52df31a684e5973b9d551c
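A minimal sketch of the chunking scheme described above; `chunk_index` is a hypothetical helper name:

```python
# Illustrates the "sum of all char values" hash used to split the model
# list into 20 test chunks.
def chunk_index(model_name: str, num_chunks: int = 20) -> int:
    return sum(ord(c) for c in model_name) % num_chunks

models = ["model_a.ptl", "model_b.ptl", "model_c.ptl"]
print({m: chunk_index(m) for m in models})
```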
-
[ONNX] Update onnx function export with comments and clean up (#66817) (
#67803) Summary: Pull Request resolved: #67803

* Addresses comments from #63589

[ONNX] remove torch::onnx::PRODUCER_VERSION (#67107)

Use constants from version.h instead. This simplifies things since we no longer have to update PRODUCER_VERSION for each release. Also add TORCH_VERSION to version.h so that a string is available for this purpose.

[ONNX] Set `ir_version` based on opset_version. (#67128)

This increases the odds that the exported ONNX model will be usable. Before this change, we were setting the IR version to a value which may be higher than what the model consumer supports.

Also some minor clean-up in the test code:
* Fix string replacement.
* Use a temporary file so as to not leave files around in the test current working directory.

Test Plan: Imported from OSS Reviewed By: msaroufim Differential Revision: D32181306 Pulled By: malfet fbshipit-source-id: 02f136d34ef8f664ade0bc1985a584f0e8c2b663 Co-authored-by: BowenBao <[email protected]> Co-authored-by: Gary Miguel <[email protected]> Co-authored-by: Nikita Shulga <[email protected]>
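As a usage note, a minimal export sketch; with this change the exporter derives `ir_version` from the requested `opset_version`:

```python
import io
import torch

model = torch.nn.Linear(3, 3)
buf = io.BytesIO()
# The exporter now picks an ONNX ir_version matching the requested
# opset_version rather than always emitting the newest IR.
torch.onnx.export(model, torch.randn(1, 3), buf, opset_version=13)
```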
-
[FSDP] Address follow up comments for CPU offload (#67813)
Summary: Pull Request resolved: #67813 Address Shen's comments in https://github.com/pytorch/pytorch/pull/67249/files ghstack-source-id: 142379312 Test Plan: CI Reviewed By: zhaojuanmao Differential Revision: D32157545 fbshipit-source-id: 3cc2df6d5fa0d3b9383ed3711e7f79729dbb1dda
-
[forward ad] Also check layout of grad matches that of self for inpla…
…ce over view (#67816) Summary: Fixes #67800 Currently, when the grad has the same layout as the base, we try to assign the same tensor to the forward grad of both the base and the view. However, when the layout of the grad differs from the layout of the view, this triggers a copy, and the tangent of the view (after the in-place op) no longer has a view relationship with the view of the base. This PR changes it so that we only do the above optimization when the grad's layout also matches the layout of self. Pull Request resolved: #67816 Reviewed By: malfet Differential Revision: D32190021 Pulled By: soulitzer fbshipit-source-id: b1b2c9b332e83f4df5695ee9686ea76447f9305b
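For illustration, a hedged sketch of the scenario (the tensor values and the use of a transposed layout are illustrative):

```python
import torch
import torch.autograd.forward_ad as fwAD

base = torch.randn(4, 4)
# Same sizes as base but a transposed (non-contiguous) layout.
tangent = torch.randn(4, 4).t().contiguous().t()

with fwAD.dual_level():
    dual = fwAD.make_dual(base, tangent)
    view = dual[0]   # a view of the dual tensor
    view.mul_(2)     # in-place op over the view
    # With the fix, the view's tangent keeps a view relationship with the
    # base's tangent even though the layouts differ.
    vt = fwAD.unpack_dual(view).tangent
    bt = fwAD.unpack_dual(dual).tangent
    print(torch.allclose(vt, bt[0]))
```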
-
Add retry logic for test_multitenancy and documentation for find_free…
…_port (#67775) Summary: Pull Request resolved: #67775 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D32142749 Pulled By: H-Huang fbshipit-source-id: 67ab4ede4f4bff96a1ffd41d55b3be0edc82b1ce
-
Fix conv_transpose3d backward with non-contiguous grad_out (#67829)
Summary: Many thanks to Forest Yang (meowmix) from the forum for reporting it with a minimal reproduction. Pull Request resolved: #67829 Reviewed By: malfet Differential Revision: D32184786 Pulled By: albanD fbshipit-source-id: b63dbd3148b5def2109deb2f4612c08f55f59dfb
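For context, a hedged sketch of the kind of reproduction this fixes (shapes are illustrative):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 2, 4, 4, 4, requires_grad=True)
w = torch.randn(2, 3, 2, 2, 2, requires_grad=True)
out = F.conv_transpose3d(x, w)  # shape (1, 3, 5, 5, 5)

# Build a gradient with the right shape but a non-contiguous layout.
grad = torch.randn(1, 3, 5, 5, 5).permute(0, 1, 4, 3, 2)
assert not grad.is_contiguous()
out.backward(grad)  # previously, this path could produce incorrect gradients
```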
-
Fix typo in LinearLR docs (#67840)
Summary: The final learning rate should be 0.05 like the lr used as the argument for the optimizer and not 0.005. Pull Request resolved: #67840 Reviewed By: jbschlosser Differential Revision: D32187091 Pulled By: albanD fbshipit-source-id: 8aff691bba3896a847d7b9d9d669a65f67a6f066
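For context, a sketch of the docstring's setup: with a base lr of 0.05, the schedule ends at 0.05, not 0.005:

```python
import torch

model = torch.nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
# Ramps the lr linearly from 0.5 * 0.05 = 0.025 up to the base lr 0.05
# over 4 iterations, then stays at 0.05.
sched = torch.optim.lr_scheduler.LinearLR(opt, start_factor=0.5, total_iters=4)
for _ in range(6):
    opt.step()
    sched.step()
print(opt.param_groups[0]['lr'])  # 0.05
```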
-
Fix warnings produced when running test_optim.py (#67756)
Summary: Fixes part of #67696 by adding calls to `optimizer.step()` in various places.

## Notes for reviewers:
- It is not entirely clear which is the right optimizer to step in each case. I have favoured the more explicit approach of creating a set of optimizers and calling step on each of them.
- At the time of writing, the only Scheduler without an `optimizer` instance variable is `ChainedScheduler`, which I need to deal with once. I use `hasattr` to do this check. Let me know if this ought to be changed.
- I am opening this PR for review while it only solves part of the issue, as I'd rather get feedback sooner. I think it is fine to fix the issue in several PRs too.

Pull Request resolved: #67756 Reviewed By: jbschlosser Differential Revision: D32187864 Pulled By: albanD fbshipit-source-id: fd0d133bcaa3a24588e5a997ad198fdf5879ff5a
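A minimal sketch of the pattern described in the first note above (scheduler and optimizer choices are illustrative):

```python
import torch

model = torch.nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
schedulers = [torch.optim.lr_scheduler.StepLR(opt, step_size=3)]

# Collect the distinct optimizers behind the schedulers and step them first,
# so scheduler.step() does not warn about running before optimizer.step().
for o in {s.optimizer for s in schedulers if hasattr(s, "optimizer")}:
    o.step()
for s in schedulers:
    s.step()
```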
-
Revert D32063662: [pytorch][PR] TST Adds device transfer into module …
…info tests Test Plan: revert-hammer Differential Revision: D32063662 (da59bd1) Original commit changeset: 0868235a0ae7 fbshipit-source-id: a4f775874faa88be0eb5272dedf3bbc8194ebde6
-
Revert D32175963: Converting hardswish to structured kernels with met…
…atensor support Test Plan: revert-hammer Differential Revision: D32175963 (57335a9) Original commit changeset: f4d749c6aeaf fbshipit-source-id: 6d68a96cf872c2d7b518c061875b9336bca0043a
-
Revert D32175960: Moving parts of the Shape Registry into a common file
Test Plan: revert-hammer Differential Revision: D32175960 (d04389e) Original commit changeset: 2e30115ca554 fbshipit-source-id: 27f9889c535e4f7c21c50b2468e1e6650e952d4f
-
Revert D32175958: Adding Custom Rules to Device Propagation
Test Plan: revert-hammer Differential Revision: D32175958 (8532984) Original commit changeset: 26a9ef41e10a fbshipit-source-id: adcc70687b5b454f358b5446bed2c06d04e61435
-
Revert D32175957: Adding custom testing based on opinfos input for op…
…s with custom rules. Test Plan: revert-hammer Differential Revision: D32175957 (b8e165e) Original commit changeset: 1cb51a7b6cbb fbshipit-source-id: 29fd0750d9981758436c55eea2de40cdaddfb9be
-
Revert D32175959: Merging the implementations of ClearProfiling
Test Plan: revert-hammer Differential Revision: D32175959 (f175431) Original commit changeset: b335dacce709 fbshipit-source-id: 23d1f75d47f15effc9806bd6e5228007d521b0b3
-
[Static Runtime] Add a comment on clients taking ownership of managed…
… output tensors (#67554) Summary: Pull Request resolved: #67554 This change adds a comment on clients taking ownership of managed output tensor to remind SR developers of how and why that matters. Test Plan: N/A Reviewed By: swolchok Differential Revision: D32013468 fbshipit-source-id: bcc13055c329c61677bdcc76411fe8db44bb2cee
-
Implement padding with slice layer (#67888)
Summary: Pull Request resolved: #67888 Implement padding with a slice layer. The steps are:
1. Reverse slice and pad with zeros: [1, 2] => [2, 1, 0 ... 0]
2. Transpose/reverse the tensor back to its original order, finishing the pre-pad: [2, 1, 0 ... 0] => [0 ... 0, 1, 2]
3. Continue with the post-pad: [0 ... 0, 1, 2] => [0 ... 0, 1, 2, 0 ... 0]

Test Plan: buck test mode/dev-nosan caffe2/test/fx2trt/converters:test_pad Reviewed By: 842974287 Differential Revision: D32160739 fbshipit-source-id: dbbc04d916e23551e3ce9be480283377e9a38b34
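For illustration, a hedged emulation of these steps in plain PyTorch (`pad_with_slice_steps` is a hypothetical helper; the real converter uses TensorRT slice layers, emulated here with `torch.flip` and `torch.cat`):

```python
import torch

def pad_with_slice_steps(x, pre, post):
    rev = torch.flip(x, dims=[0])                      # reverse: [1,2] -> [2,1]
    rev = torch.cat([rev, rev.new_zeros(pre)])         # pad zeros: [2,1,0..0]
    pre_padded = torch.flip(rev, dims=[0])             # reverse back: [0..0,1,2]
    return torch.cat([pre_padded, x.new_zeros(post)])  # post-pad: [0..0,1,2,0..0]

print(pad_with_slice_steps(torch.tensor([1., 2.]), 3, 2))
# tensor([0., 0., 0., 1., 2., 0., 0.])
```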