master
Commits on Nov 7, 2021
-
[torch.fx] Fix replace pattern mechanism (#66442)
Summary: Fixes #{issue number}

The following code would not replace every occurrence of the pattern:

```python
def f(x):
    x = torch.sigmoid(x)
    x = torch.sigmoid(x)
    return torch.sigmoid(x)

def pattern(x):
    return torch.sigmoid(x)

def replacement(x):
    return torch.exp(x)

def comparison(x):
    x = torch.exp(x)
    x = torch.exp(x)
    return torch.exp(x)

traced = symbolic_trace(f)
comparison_fn = symbolic_trace(comparison)

subgraph_rewriter.replace_pattern(traced, pattern, replacement)
# Only one sigmoid gets converted.
```

This PR fixes the replacement mechanism and adds a test for it. Pull Request resolved: #66442 Reviewed By: ZolotukhinM Differential Revision: D32238424 Pulled By: ansley fbshipit-source-id: 386e777174c639baafc166d5ffbc0658a96b1ee9
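For context, a self-contained version of the snippet above (imports assumed from `torch.fx`; with this fix, all three sigmoid calls should be rewritten):

```python
import torch
from torch.fx import symbolic_trace, subgraph_rewriter

def f(x):
    x = torch.sigmoid(x)
    x = torch.sigmoid(x)
    return torch.sigmoid(x)

def pattern(x):
    return torch.sigmoid(x)

def replacement(x):
    return torch.exp(x)

traced = symbolic_trace(f)
subgraph_rewriter.replace_pattern(traced, pattern, replacement)
# After the fix, the generated code should contain three torch.exp calls
# and no torch.sigmoid.
print(traced.code)
```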
Commits on Nov 6, 2021
-
Add custom zipper script to zip python modules for torch.deploy (#67006)
Summary: Pull Request resolved: #67006 Test Plan: nervouslaugh_ Reviewed By: shunting314 Differential Revision: D31822429 fbshipit-source-id: c2efeab1446fbeb70b98d4ee766fbc670cf091b0
-
[PyTorch Edge] Update bytecode version compatibility check (#67417)
Summary: Pull Request resolved: #67417 A bytecode version is valid when it is smaller than kMaxSupported and larger than kMinSupported. ghstack-source-id: 142609392 Test Plan:

```
buck test mode/dev //caffe2/test/cpp/jit:jit -- --exact 'caffe2/test/cpp/jit:jit - LiteInterpreterTest.isCompatibleFail'
```

Reviewed By: JacobSzwejbka, iseeyuan Differential Revision: D31984839 fbshipit-source-id: 2011e77455c931c0a8a58267494d44bcf167b877
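A minimal Python sketch of the compatibility predicate described above; the constant values are hypothetical (the real `kMinSupported`/`kMaxSupported` live in the C++ lite interpreter):

```python
# Hypothetical values for illustration; the real constants are defined in C++.
K_MIN_SUPPORTED_BYTECODE_VERSION = 3
K_MAX_SUPPORTED_BYTECODE_VERSION = 8

def is_compatible(bytecode_version: int) -> bool:
    # Valid when larger than the minimum and smaller than the maximum.
    return (K_MIN_SUPPORTED_BYTECODE_VERSION < bytecode_version
            < K_MAX_SUPPORTED_BYTECODE_VERSION)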
-
[DDP] Fix some issues with code example in DDP docstring (#67883)
Summary: Pull Request resolved: #67883 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Test Plan: Imported from OSS Reviewed By: zhaojuanmao Differential Revision: D32190946 Pulled By: jamesr66a fbshipit-source-id: a376324b95cbe833ffa606ecdfc6156432880f70
-
[rpc] Switch RPC agent check to TORCH_CHECK and add more descriptive …
…error (#67882) Summary: Pull Request resolved: #67882 I ran into a hard-to-interpret error message when trying to run the following script, which was missing an `init_rpc` call:

```python
# $ torchrun --standalone --nnodes=1 --nproc_per_node=1 script.py
import os

rank = int(os.environ['LOCAL_RANK'])
world_size = int(os.environ['WORLD_SIZE'])

import torch.distributed
# !!!!!! Uncomment the following and the script succeeds
# torch.distributed.rpc.init_rpc('worker', rank=rank, world_size=world_size)

import torch.distributed as dist
dist.init_process_group(backend='gloo')

import torchvision.models as models
import torch

rn50 = models.resnet50()
rn50.train()
rn50 = torch.nn.parallel.DistributedDataParallel(rn50)

from torch.distributed.rpc import RRef
from torch.distributed.optim import DistributedOptimizer

params = []
for param in rn50.parameters():
    params.append(RRef(param))

dist_optim = DistributedOptimizer(
    torch.optim.SGD,
    params,
    lr=0.05)

loss_func = torch.nn.CrossEntropyLoss()

with torch.distributed.autograd.context() as context_id:
    pred = rn50(torch.randn(50, 3, 224, 224))
    target = torch.randn(50, 1000).softmax(dim=1)
    loss = loss_func(pred, target)
    dist.autograd.backward(context_id, [loss])
    dist_optim.step(context_id)
```

Error:

```
Traceback (most recent call last):
  File "/xxx/torchrun_exp/script.py", line 23, in <module>
    params.append(RRef(param))
RuntimeError: agentINTERNAL ASSERT FAILED at "../torch/csrc/distributed/rpc/rpc_agent.cpp":237, please report a bug to PyTorch. Current RPC agent is not set!
```

Since this is a user-facing error, I've changed `TORCH_INTERNAL_ASSERT` to `TORCH_CHECK` and added a hint about how to resolve the issue. On the other hand, the fact that this was originally `TORCH_INTERNAL_ASSERT` may suggest that the author thought this should be an internal-only error condition. If there is some other place that should be throwing an exception in this case, let me know and I can adapt the fix to change that location.

Question for reviewers:
* Is there a good test file where I can add a test for this error condition?

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D32190947 Pulled By: jamesr66a fbshipit-source-id: 3621d755329fd524db68675c55b1daf20e716d43
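As a usage note, a minimal sketch of the fix on the user side (worker name and tensor are illustrative): call `init_rpc` before constructing any `RRef`:

```python
import os
import torch
import torch.distributed.rpc as rpc
from torch.distributed.rpc import RRef

rank = int(os.environ['LOCAL_RANK'])
world_size = int(os.environ['WORLD_SIZE'])

# Initialize the RPC agent before constructing any RRef; with the new
# TORCH_CHECK, the missing call produces an actionable error instead of
# an internal assert.
rpc.init_rpc(f'worker{rank}', rank=rank, world_size=world_size)

rref = RRef(torch.randn(2, 2))  # now succeeds
rpc.shutdown()
```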
Commits on Nov 5, 2021
-
Add meta support to tensor range factories (#67032)
Summary: Pull Request resolved: #67032 This PR adds meta backend support to the `range`, `arange`, `linspace`, and `logspace` operators. Note that the original PR (#66630) was reverted due to two failing unit tests in the Bionic CI. This revision includes a fix for those tests; otherwise its content is identical to the previous PR. Original commit changeset: 2f9d8d1acbb0 ghstack-source-id: 142487306 Test Plan: Extended the existing tensor creation tests to assert meta backend support. Reviewed By: zhaojuanmao Differential Revision: D31834403 fbshipit-source-id: a489858a2a8a38a03234b14408e14d2b208a8d34
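For illustration, a short sketch of what meta-backend support enables for these factories (shape and dtype inference without allocating real storage):

```python
import torch

# With meta support, the range factories produce shape/dtype-only tensors.
a = torch.arange(0, 10, 2, device="meta")
b = torch.linspace(0, 1, steps=5, device="meta")
print(a.shape, a.device)  # torch.Size([5]) meta
print(b.shape, b.dtype)   # torch.Size([5]) torch.float32
```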
-
Revert D31932215: [pytorch][PR] Don't #define NUM_THREADS
Test Plan: revert-hammer Differential Revision: D31932215 (f70e806) Original commit changeset: ccdf11e249fb fbshipit-source-id: 4c330aebe9cfb483f02ceb1fdaf5c3b0f8fa6fa1
-
[quant][fusion] Fix the additional_fuser_method argument for fuse_fx (#67876)
Summary: Pull Request resolved: #67876 Previously we missed this argument when calling obj.convert, so it had no effect on fusion. This PR fixes that and adds a test for it. Test Plan: python test/test_quantization.py TestFuseFx Imported from OSS Reviewed By: malfet Differential Revision: D32191364 fbshipit-source-id: 566bd39461010d70a21de71f611bb929976fe01d
-
Don't #define NUM_THREADS (#67258)
Summary: PyTorch doesn't compile with the latest `main` branch of cub again. The root cause is that PyTorch defines a macro `NUM_THREADS`, and cub added code like

```C++
template<...., int NUM_THREADS, ...>
```

so the macro and the template parameter conflict with each other. Pull Request resolved: #67258 Reviewed By: albanD Differential Revision: D31932215 Pulled By: ngimel fbshipit-source-id: ccdf11e249fbc0b6f654535067a0294037ee7b96
-
(torch/elastic) fix scale down bug caused by calling rdzv_handler.shu…
…tdown() on premature agent failures (#67749) Summary: Pull Request resolved: #67749 Fixes: #67742 Test Plan: Added unittests. Validated manually:

```
# start agent 0
$ torchrun --rdzv_backend c10d --rdzv_id 123 --rdzv_endpoint localhost:29500 --nnodes 1:2 --nproc_per_node 1 --monitor_interval 1 test.py

# start agent 1
torchrun --rdzv_backend c10d --rdzv_id 123 --rdzv_endpoint localhost:29500 --nnodes 1:2 --nproc_per_node 1 --monitor_interval 1 test.py

# kill agent 0
CTRL+C (SIGINT) or kill -15 (SIGTERM)

# restart it
torchrun --rdzv_backend c10d --rdzv_id 123 --rdzv_endpoint localhost:29500 --nnodes 1:2 --nproc_per_node 1 --monitor_interval 1 test.py
```

Reviewed By: cbalioglu Differential Revision: D32129005 fbshipit-source-id: db292268250ef6f1e06f5b4c5bd67124d8dfd325
-
Refactor cuDNN Convolution memory format and Conv-Bias-Relu code (#65594
) Summary: This PR makes several changes:
- Changed the function `bool cudnn_conv_use_channels_last(...)` to `at::MemoryFormat cudnn_conv_suggest_memory_format(...)`.
- Removed `resize_` in the cuDNN convolution code. Added a new overload of `TensorDescriptor::set` that also passes the desired memory format of the tensor.
- Disabled the use of double + channels_last on cuDNN Conv-Relu and Conv-Bias-Relu. Call `.contiguous(memory_format)` before passing data to cuDNN functions.
- Disabled the use of cuDNN fused Conv-Bias-Relu on cuDNN < 8.0 due to a CUDNN_STATUS_NOT_SUPPORTED error. Instead, use the native fallback path.
- Made the Conv-Bias-Relu code respect the global `allow_tf32` flag.

Per the cuDNN documentation, double + NHWC is generally not supported.

Close #66968 Fix #55301 Pull Request resolved: #65594 Reviewed By: jbschlosser, malfet Differential Revision: D32175766 Pulled By: ngimel fbshipit-source-id: 7ba079c9f7c46fc56f8bfef05bad0854acf380d7
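As a usage note, a minimal sketch (assuming a CUDA build) of making the memory format explicit before a convolution, mirroring what the refactor does internally with `cudnn_conv_suggest_memory_format`:

```python
import torch

# channels_last is the memory format cuDNN prefers for many convolutions.
x = torch.randn(8, 3, 32, 32, device="cuda")
x = x.contiguous(memory_format=torch.channels_last)
conv = torch.nn.Conv2d(3, 16, 3).to("cuda", memory_format=torch.channels_last)
out = conv(x)
print(out.is_contiguous(memory_format=torch.channels_last))  # True
```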
-
[Foreach] Implement L1&L2 norm (#62646)
Summary: Implement the L1 & L2 norms on the fast path, referencing [nvidia/apex](https://github.com/NVIDIA/apex/blob/master/csrc/multi_tensor_l2norm_kernel.cu). When `ord` is neither 1 nor 2, the slow path is chosen. Related: #58833 cc ptrblck mcarilli ngimel Pull Request resolved: #62646 Reviewed By: malfet Differential Revision: D32173421 Pulled By: ngimel fbshipit-source-id: 14b7544601658a979b83509df351e1848ded7675
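For illustration, a hedged sketch using the private `_foreach_norm` API this PR introduces (the exact call signature is an assumption):

```python
import torch

tensors = [torch.randn(3, 3) for _ in range(4)]
# Fast fused path for ord 1 and 2; other ords take the slow per-tensor path.
l2_norms = torch._foreach_norm(tensors, 2)
l1_norms = torch._foreach_norm(tensors, 1)
print(l2_norms[0], tensors[0].norm(2))  # should match
```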
-
[nnc] Add support for dynamic shapes in TensorExprKernel (#67861)
Summary: Pull Request resolved: #67861 Previously submitted as #67197. This got reverted because its failures were hidden by the failures of another PR. Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D32178196 Pulled By: navahgar fbshipit-source-id: cc8a5c68aed360d06289e69645461cfa773e1300
-
Add ownership to more edge tests (#67859)
Summary: Fixes #66232 This should be the last immediate task. I anticipate test ownership will change over time, but this is the last big thing needed to close it out. Pull Request resolved: #67859 Reviewed By: soulitzer Differential Revision: D32210534 Pulled By: janeyx99 fbshipit-source-id: 7fd835d87d9d35d49ec49de1fcfa29b085133e99
-
remove use of THGenerateAllTypes, clean up (#67867)
Summary: Per title Pull Request resolved: #67867 Reviewed By: mruberry Differential Revision: D32191053 Pulled By: ngimel fbshipit-source-id: 84eb6c2989495fca5f7b055c4984efe5de94e812
-
autodiff fix for autocast_to_xxx (#67648)
Summary: Fixes autocast + autodiff issue where `RuntimeError: grad_inputs.size() == node->inputs().size()INTERNAL ASSERT FAILED at "../torch/csrc/jit/runtime/autodiff.cpp":426, please report a bug to PyTorch.` Pull Request resolved: #67648 Reviewed By: cpuhrsch Differential Revision: D32083227 Pulled By: davidberard98 fbshipit-source-id: edf526cff4ec21874ae35ec730d13c250073e10c
-
[PyTorch Edge] backport test (#67824)
Summary: Pull Request resolved: #67824 Testing backport of all prod models using the model test framework. Ref: [Create tests at run-time (google test)](https://stackoverflow.com/questions/19160244/create-tests-at-run-time-google-test) Breaking the list of models into 20 chunks based on a simple hash (sum of all char values). ghstack-source-id: 142398833 Test Plan:

```
buck test //xplat/pytorch/mobile/test:test_read_all_mobile_model_configs
Starting new Buck daemon...
Parsing buck files: finished in 7.6 sec
Creating action graph: finished in 0.9 sec
[RE] Metadata: Session ID=[reSessionID-66f5adfe-50d1-4599-9828-3e8115181601]
[RE] Waiting on 0 remote actions. Completed 1008 actions remotely, action cache hit rate: 43.59%.
Downloaded 26/1523 artifacts, 252.60 Kbytes, 96.6% cache miss (for updated rules)
Building: finished in 01:18.6 min (100%) 5532/5532 jobs, 770/5532 updated
Total time: 01:27.3 min
Testing: finished in 11:21.6 min (41 PASS/0 FAIL)
BUILD SUCCEEDED
RESULTS FOR //xplat/pytorch/mobile/test:test_read_all_mobile_model_configs
PASS    673.8s  41 Passed  0 Skipped  0 Failed  //xplat/pytorch/mobile/test:test_read_all_mobile_model_configs
TESTS PASSED
```

Reviewed By: dhruvbird Differential Revision: D32068955 fbshipit-source-id: d06c2434a4a69572ab52df31a684e5973b9d551c
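A minimal sketch of the chunking scheme described above; `chunk_index` is a hypothetical helper name:

```python
# Illustrates the "sum of all char values" hash used to split the model
# list into 20 test chunks.
def chunk_index(model_name: str, num_chunks: int = 20) -> int:
    return sum(ord(c) for c in model_name) % num_chunks

models = ["model_a.ptl", "model_b.ptl", "model_c.ptl"]
print({m: chunk_index(m) for m in models})
```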
-
[ONNX] Update onnx function export with comments and clean up (#66817) (
#67803) Summary: Pull Request resolved: #67803

* Addresses comments from #63589

[ONNX] remove torch::onnx::PRODUCER_VERSION (#67107)

Use constants from version.h instead. This simplifies things since we no longer have to update PRODUCER_VERSION for each release. Also add TORCH_VERSION to version.h so that a string is available for this purpose.

[ONNX] Set `ir_version` based on opset_version. (#67128)

This increases the odds that the exported ONNX model will be usable. Before this change, we were setting the IR version to a value which may be higher than what the model consumer supports.

Also some minor clean-up in the test code:
* Fix string replacement.
* Use a temporary file so as to not leave files around in the test current working directory.

Test Plan: Imported from OSS Reviewed By: msaroufim Differential Revision: D32181306 Pulled By: malfet fbshipit-source-id: 02f136d34ef8f664ade0bc1985a584f0e8c2b663 Co-authored-by: BowenBao <[email protected]> Co-authored-by: Gary Miguel <[email protected]> Co-authored-by: Nikita Shulga <[email protected]>
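As a usage note, a minimal export sketch; with this change the exporter derives `ir_version` from the requested `opset_version`:

```python
import io
import torch

model = torch.nn.Linear(3, 3)
buf = io.BytesIO()
# The exporter now picks an ONNX ir_version matching the requested
# opset_version rather than always emitting the newest IR.
torch.onnx.export(model, torch.randn(1, 3), buf, opset_version=13)
```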
-
[FSDP] Address follow up comments for CPU offload (#67813)
Summary: Pull Request resolved: #67813 Address Shen's comments in https://github.com/pytorch/pytorch/pull/67249/files ghstack-source-id: 142379312 Test Plan: CI Reviewed By: zhaojuanmao Differential Revision: D32157545 fbshipit-source-id: 3cc2df6d5fa0d3b9383ed3711e7f79729dbb1dda
-
[forward ad] Also check layout of grad matches that of self for inpla…
…ce over view (#67816) Summary: Fixes #67800 Currently, when the grad has the same layout as the base, we try to assign the same tensor to the forward grad of both the base and the view. However, when the layout of the grad differs from the layout of the view, this triggers a copy, and the tangent of the view (after the in-place op) no longer has a view relationship with the view of the base. This PR changes it so that we only do the above optimization when the grad's layout also matches the layout of self. Pull Request resolved: #67816 Reviewed By: malfet Differential Revision: D32190021 Pulled By: soulitzer fbshipit-source-id: b1b2c9b332e83f4df5695ee9686ea76447f9305b
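For illustration, a hedged sketch of the scenario (the tensor values and the use of a transposed layout are illustrative):

```python
import torch
import torch.autograd.forward_ad as fwAD

base = torch.randn(4, 4)
# Same sizes as base but a transposed (non-contiguous) layout.
tangent = torch.randn(4, 4).t().contiguous().t()

with fwAD.dual_level():
    dual = fwAD.make_dual(base, tangent)
    view = dual[0]   # a view of the dual tensor
    view.mul_(2)     # in-place op over the view
    # With the fix, the view's tangent keeps a view relationship with the
    # base's tangent even though the layouts differ.
    vt = fwAD.unpack_dual(view).tangent
    bt = fwAD.unpack_dual(dual).tangent
    print(torch.allclose(vt, bt[0]))
```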
-
Add retry logic for test_multitenancy and documentation for find_free…
…_port (#67775) Summary: Pull Request resolved: #67775 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D32142749 Pulled By: H-Huang fbshipit-source-id: 67ab4ede4f4bff96a1ffd41d55b3be0edc82b1ce
-
Fix conv_transpose3d backward with non-contiguous grad_out (#67829)
Summary: Many thanks to Forest Yang (meowmix) from the forum for reporting it with a minimal reproduction. Pull Request resolved: #67829 Reviewed By: malfet Differential Revision: D32184786 Pulled By: albanD fbshipit-source-id: b63dbd3148b5def2109deb2f4612c08f55f59dfb
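For context, a hedged sketch of the kind of reproduction this fixes (shapes are illustrative):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 2, 4, 4, 4, requires_grad=True)
w = torch.randn(2, 3, 2, 2, 2, requires_grad=True)
out = F.conv_transpose3d(x, w)  # shape (1, 3, 5, 5, 5)

# Build a gradient with the right shape but a non-contiguous layout.
grad = torch.randn(1, 3, 5, 5, 5).permute(0, 1, 4, 3, 2)
assert not grad.is_contiguous()
out.backward(grad)  # previously, this path could produce incorrect gradients
```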
-
Fix typo in LinearLR docs (#67840)
Summary: The final learning rate should be 0.05 like the lr used as the argument for the optimizer and not 0.005. Pull Request resolved: #67840 Reviewed By: jbschlosser Differential Revision: D32187091 Pulled By: albanD fbshipit-source-id: 8aff691bba3896a847d7b9d9d669a65f67a6f066
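For context, a sketch of the docstring's setup: with a base lr of 0.05, the schedule ends at 0.05, not 0.005:

```python
import torch

model = torch.nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
# Ramps the lr linearly from 0.5 * 0.05 = 0.025 up to the base lr 0.05
# over 4 iterations, then stays at 0.05.
sched = torch.optim.lr_scheduler.LinearLR(opt, start_factor=0.5, total_iters=4)
for _ in range(6):
    opt.step()
    sched.step()
print(opt.param_groups[0]['lr'])  # 0.05
```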
-
Fix warnings produced when running test_optim.py (#67756)
Summary: Fixes part of #67696 by adding calls to `optimizer.step()` in various places.

## Notes for reviewers:
- It is not entirely clear which is the right optimizer to step in each case. I have favoured the more explicit approach of creating a set of optimizers and calling step on each of them.
- At the time of writing, the only Scheduler without an `optimizer` instance variable is `ChainedScheduler`, which I need to deal with once. I use `hasattr` to do this check. Let me know if this ought to be changed.
- I am opening this PR for review while it only solves part of the issue, as I'd rather get feedback sooner. I think it is fine to fix the issue in several PRs too.

Pull Request resolved: #67756 Reviewed By: jbschlosser Differential Revision: D32187864 Pulled By: albanD fbshipit-source-id: fd0d133bcaa3a24588e5a997ad198fdf5879ff5a
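A minimal sketch of the pattern described in the first note above (scheduler and optimizer choices are illustrative):

```python
import torch

model = torch.nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
schedulers = [torch.optim.lr_scheduler.StepLR(opt, step_size=3)]

# Collect the distinct optimizers behind the schedulers and step them first,
# so scheduler.step() does not warn about running before optimizer.step().
for o in {s.optimizer for s in schedulers if hasattr(s, "optimizer")}:
    o.step()
for s in schedulers:
    s.step()
```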
-
Revert D32063662: [pytorch][PR] TST Adds device transfer into module …
…info tests Test Plan: revert-hammer Differential Revision: D32063662 (da59bd1) Original commit changeset: 0868235a0ae7 fbshipit-source-id: a4f775874faa88be0eb5272dedf3bbc8194ebde6
-
Revert D32175963: Converting hardswish to structured kernels with met…
…atensor support Test Plan: revert-hammer Differential Revision: D32175963 (57335a9) Original commit changeset: f4d749c6aeaf fbshipit-source-id: 6d68a96cf872c2d7b518c061875b9336bca0043a
-
Revert D32175960: Moving parts of the Shape Registry into a common file
Test Plan: revert-hammer Differential Revision: D32175960 (d04389e) Original commit changeset: 2e30115ca554 fbshipit-source-id: 27f9889c535e4f7c21c50b2468e1e6650e952d4f
-
Revert D32175958: Adding Custom Rules to Device Propagation
Test Plan: revert-hammer Differential Revision: D32175958 (8532984) Original commit changeset: 26a9ef41e10a fbshipit-source-id: adcc70687b5b454f358b5446bed2c06d04e61435
-
Revert D32175957: Adding custom testing based on opinfos input for op…
…s with custom rules. Test Plan: revert-hammer Differential Revision: D32175957 (b8e165e) Original commit changeset: 1cb51a7b6cbb fbshipit-source-id: 29fd0750d9981758436c55eea2de40cdaddfb9be
-
Revert D32175959: Merging the implementations of ClearProfiling
Test Plan: revert-hammer Differential Revision: D32175959 (f175431) Original commit changeset: b335dacce709 fbshipit-source-id: 23d1f75d47f15effc9806bd6e5228007d521b0b3
-
[Static Runtime] Add a comment on clients taking ownership of managed…
… output tensors (#67554) Summary: Pull Request resolved: #67554 This change adds a comment on clients taking ownership of managed output tensor to remind SR developers of how and why that matters. Test Plan: N/A Reviewed By: swolchok Differential Revision: D32013468 fbshipit-source-id: bcc13055c329c61677bdcc76411fe8db44bb2cee
-
Implement padding with slice layer (#67888)
Summary: Pull Request resolved: #67888 Implement padding with a slice layer. The steps are:
1. Reverse slice and pad with zeros: [1, 2] => [2, 1, 0 ... 0]
2. Transpose/reverse the tensor back to its original order, finishing the pre-pad: [2, 1, 0 ... 0] => [0 ... 0, 1, 2]
3. Continue with the post-pad: [0 ... 0, 1, 2] => [0 ... 0, 1, 2, 0 ... 0]

Test Plan: buck test mode/dev-nosan caffe2/test/fx2trt/converters:test_pad Reviewed By: 842974287 Differential Revision: D32160739 fbshipit-source-id: dbbc04d916e23551e3ce9be480283377e9a38b34
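For illustration, a hedged emulation of these steps in plain PyTorch (`pad_with_slice_steps` is a hypothetical helper; the real converter uses TensorRT slice layers, emulated here with `torch.flip` and `torch.cat`):

```python
import torch

def pad_with_slice_steps(x, pre, post):
    rev = torch.flip(x, dims=[0])                      # reverse: [1,2] -> [2,1]
    rev = torch.cat([rev, rev.new_zeros(pre)])         # pad zeros: [2,1,0..0]
    pre_padded = torch.flip(rev, dims=[0])             # reverse back: [0..0,1,2]
    return torch.cat([pre_padded, x.new_zeros(post)])  # post-pad: [0..0,1,2,0..0]

print(pad_with_slice_steps(torch.tensor([1., 2.]), 3, 2))
# tensor([0., 0., 0., 1., 2., 0., 0.])
```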