Convert num_kernels to int64 before calling into CUDA GET_BLOCKS #44472

@walterddr In Python 3.x the size of int is not fixed, so it would work for 64-bit values as well.
I don't understand the issue; can you explain it with an example?

kshitij12345 · 2020-09-11T11:54:36Z

@dipanshug124 You can look at #43476 (comment) for more context.

Fowley-P · 2020-09-12T20:58:30Z

Hey, I'm looking at fixing this issue, but as someone who doesn't have a lot of experience with the size of datasets used for this, would a u32 suffice, or does it have to be i64? I'd like to avoid using the full i64 if possible, especially for something like CUDA kernels, which can't have a negative value.

Randl · 2020-09-13T23:41:26Z

I don't think u32 is enough (or it soon won't be). I'm not sure whether negative values (say, -1) are used for some purpose; if not, you could use u64.

ngimel · 2020-09-13T23:45:47Z

i64 in this case won't get passed to CUDA kernels; it is used to compute the number of blocks without overflow.
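To illustrate the point, here is a minimal host-side sketch of a block-count helper that takes a 64-bit element count and only narrows the result at the end. The name `get_blocks_sketch` and the value of `kThreadsPerBlock` are assumptions for illustration and are not taken from the PyTorch source.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical threads-per-block value; the thread does not state the real one.
constexpr int kThreadsPerBlock = 1024;

// Computes ceil(n / kThreadsPerBlock) in 64-bit arithmetic.
// Writing it as (n - 1) / t + 1 avoids the intermediate n + t - 1,
// which is the step that can overflow when done in 32 bits.
inline int get_blocks_sketch(int64_t n) {
  assert(n > 0);
  const int64_t blocks = (n - 1) / kThreadsPerBlock + 1;
  // The block count itself must still fit in a 32-bit int.
  assert(blocks <= std::numeric_limits<int>::max());
  return static_cast<int>(blocks);
}
```

With this shape, only the element count is 64-bit; the value handed to the launch configuration remains an ordinary `int`.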

Fowley-P · 2020-09-14T04:25:00Z

Yeah I mentioned the negative value aspect of it because it's specifically to compute the number of kernels needed, and you'll never have negative values for the kernels so might as well use uints instead of ints to increase the capacity for almost no cost. And with respect to the extra space, I try to avoid using unnecessary space no matter which part of the code I'm working on. I'll get to work making the changes, but it'll take me a bit to make the changes in all the places I need to. Thanks for your input.

walterddr · 2020-09-14T15:14:35Z

@Fowley-P Sorry for the late reply. This issue is about the legacy int32 computation: the goal is to prevent an overflowed value from being used to compute the number of blocks needed. So (1) it doesn't matter whether it is unsigned or signed, since there cannot be more than MAX_INT32 blocks; it will error out way sooner. (2) Don't worry about covering every single instance. Please feel free to publish a PR once you have addressed at least the 4 examples above. --Rong
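As a rough sketch of the kind of change being asked for, the element count can be formed in 64-bit before it is used anywhere else. The helper names and the four-dimension signature below are hypothetical, chosen only to show the cast; the actual call sites in the codebase may look different.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: casting the first factor to int64_t makes the whole
// product evaluate in 64-bit, so a result above INT32_MAX is represented
// exactly instead of overflowing (assuming the true product fits in 64 bits).
inline int64_t num_kernels_sketch(int batch, int channels, int height, int width) {
  return static_cast<int64_t>(batch) * channels * height * width;
}

// Reports whether a 64-bit count could have been stored in a 32-bit int.
inline bool fits_in_int32(int64_t v) {
  return v <= std::numeric_limits<int32_t>::max();
}
```

For example, a 4 x 1024 x 1024 x 1024 tensor has 2^32 elements, which the 64-bit product holds exactly while a plain `int` product would not.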


Update on "Convert num_kernels to int64 before calling into CUDA GET_BLOCKS": this addresses the example int64 issue in #44472. Differential Revision: [D23699819](https://our.internmc.facebook.com/intern/diff/D23699819) [ghstack-poisoned]

walterddr · 2020-09-15T22:47:14Z

only partially closed via #44688

walterddr mentioned this issue Sep 10, 2020

use non-overflowing divide in cuda kernel util GET_BLOCKS #44391

Closed

walterddr added module: cuda topic: 64-bit good first issue and removed good first issue labels Sep 10, 2020

colesbury added the triaged label Sep 10, 2020

walterddr added the good first issue label Sep 10, 2020

soulitzer mentioned this issue Sep 15, 2020

Convert num_kernels to int64 before calling into CUDA GET_BLOCKS #44688

Closed

facebook-github-bot closed this in 993b465 Sep 15, 2020

walterddr reopened this Sep 15, 2020

walterddr linked a pull request that will close this issue Sep 16, 2020

fix legacy GET_BLOCKS code from THCUNN/common.h #44789

Open

pytorch / pytorch

walterddr commented Sep 10, 2020 •

edited by pytorch-probot bot

dipanshug124 commented Sep 11, 2020

kshitij12345 commented Sep 11, 2020

Fowley-P commented Sep 12, 2020

Randl commented Sep 13, 2020

ngimel commented Sep 13, 2020

Fowley-P commented Sep 14, 2020

walterddr commented Sep 14, 2020

walterddr commented Sep 15, 2020
