MAINT Improve source generation using Tempita
#20481
Conversation
Tempita for WeightVector
…vector` templates
…the `dtypes` list in Tempita `.tp` files
Answering @NicolasHug's comments on #13358:
I removed the …

```cython
from ..utils._weight_vector cimport WeightVector64 as WeightVector
from ..utils._seq_dataset cimport SequentialDataset64 as SequentialDataset
```
jjerphan
Jul 15, 2021
Member
To follow up on @jeremiedbb's comment (50de266#r261996527): to me, this looks fine, as the import is done similarly for `SequentialDataset`; am I missing something?
ogrisel
Aug 5, 2021
Member
I cannot see the original comment but this looks fine to me as well.
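For context, here is a minimal sketch of the pattern under discussion, assuming the `.pxd.tp` template defines `dtypes` as (suffix, C type) pairs; the member list is abridged and illustrative, not the actual template contents:

```cython
# _weight_vector.pxd.tp -- sketch only, not the real template.
{{py:
# (name_suffix, c_type) pairs drive the code generation.
dtypes = [('64', 'double'), ('32', 'float')]
}}

{{for name_suffix, c_type in dtypes}}
cdef class WeightVector{{name_suffix}}(object):
    cdef {{c_type}} wscale
    cdef {{c_type}} dot(self, {{c_type}} *x_data_ptr,
                        int *x_ind_ptr, int xnnz) nogil
    cdef void scale(self, {{c_type}} c) nogil
{{endfor}}
```

Tempita expands the loop once per dtype, so both `WeightVector64` and `WeightVector32` end up in the generated `.pxd`, and a 64-bit-only consumer can pick one with an aliased cimport as in the diff above.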
…emplates are included.
This reverts commit 9456ba3.
Force-pushed from 89cca51 to 9452515.
Here are some final suggestions and a comment about adapting a hard-coded numerical value.
```diff
         for j in range(xnnz):
             idx = x_ind_ptr[j]
             innerprod += w_data_ptr[idx] * x_data_ptr[j]
         innerprod *= self.wscale
         return innerprod

-    cdef void scale(self, double c) nogil:
+    cdef void scale(self, {{c_type}} c) nogil:
```
jjerphan
Aug 3, 2021
Member
Shall the epsilon value for the comparison on `self.wscale` (at line 197) be adapted based on `c_type`?
ogrisel
Aug 5, 2021
Member
Indeed, it's probably a good idea to use 1e-7 or even 1e-6 when `self.wscale` has type `float32`; otherwise I am not sure this condition would ever be triggered on float32 data.

Or we could just use 1e-6 for both dtypes, but that might have a non-trivial impact on the performance of SGDClassifier and co on np.float64, which we should avoid introducing as part of this PR.

Unfortunately, it's not easy to write tests for this because this is a Cython class with `cdef` methods.

Maybe we can set epsilon to 1e-6 for `float32`, keep 1e-9 unchanged for `float64`, and write tests for the float32 case at the public Python API level only, in test_sgd.py, in a subsequent PR where we enable float32 support for linear SGD models.
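For illustration, a minimal sketch of how a dtype-dependent threshold could be wired through the Tempita template; the third tuple entry (`reset_wscale_threshold`) and its values are assumptions for this sketch, not the template's actual contents:

```cython
# Sketch only: extend each dtypes tuple with a hypothetical per-dtype
# threshold so the generated scale() compares against the right epsilon.
{{py:
dtypes = [('64', 'double', 1e-9),
          ('32', 'float', 1e-6)]
}}

{{for name_suffix, c_type, reset_wscale_threshold in dtypes}}
cdef class WeightVector{{name_suffix}}(object):
    cdef void scale(self, {{c_type}} c) nogil:
        # Scale the weight vector by a constant c.
        self.wscale *= c
        # The threshold is substituted at code-generation time, so each
        # generated class compares against an epsilon matched to its
        # dtype; reset_wscale() is the existing rescaling helper.
        if self.wscale < {{reset_wscale_threshold}}:
            self.reset_wscale()
{{endfor}}
```

A public-API test for the float32 behaviour would then live at the test_sgd.py level, as suggested above.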
Co-authored-by: Julien Jerphanion <[email protected]>
LGTM once the following comments are addressed:
```diff
         for j in range(xnnz):
             idx = x_ind_ptr[j]
             innerprod += w_data_ptr[idx] * x_data_ptr[j]
         innerprod *= self.wscale
         return innerprod

-    cdef void scale(self, double c) nogil:
+    cdef void scale(self, {{c_type}} c) nogil:
```
(Referenced code: lines 65 to 70 in 81dde3a.)
Co-authored-by: Julien Jerphanion <[email protected]>
Still LGTM, just another suggestion:
LGTM. Some small doc things; otherwise everything looks good.
Co-authored-by: Olivier Grisel <[email protected]>
Co-authored-by: Guillaume Lemaitre <[email protected]>
Merged 4b8cd88 into scikit-learn:main.
Thanks @mbatoul
Reference Issues/PRs
Relates to #11000.
Supersedes #13358.
Closes #13358.
What does this implement/fix? Explain your changes.
This work builds on top of @massich's work (see #13358):
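As a rough illustration of the overall mechanism (the file names and the use of Cython's vendored Tempita are assumptions based on this PR's approach, not an exact reproduction of the build code):

```python
# Sketch: render a Tempita .tp template into a .pyx source at build time.
# Uses the Tempita copy vendored with Cython; the paths are illustrative.
from Cython import Tempita

def generate_from_template(template_path):
    # The output drops the trailing ".tp":
    # _weight_vector.pyx.tp -> _weight_vector.pyx
    assert template_path.endswith(".tp")
    with open(template_path, "r") as f:
        template = Tempita.Template(f.read())
    with open(template_path[: -len(".tp")], "w") as f:
        f.write(template.substitute())

generate_from_template("sklearn/utils/_weight_vector.pyx.tp")
```

Because the `dtypes` list lives in the template itself (via a `{{py: ...}}` block), `substitute()` needs no extra arguments.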