
Two bugs in sklearn.metrics.roc_curve: drop_intermediate=True option #31635

Open
@gberriz-mgb

Describe the bug

The function sklearn.metrics.roc_curve contains two separate (but potentially interacting) bugs related to the drop_intermediate=True option. This report describes both.


Bug 1: Incorrect Ordering of drop_intermediate Relative to Initial Point Prepending

When drop_intermediate=True (the default), roc_curve attempts to simplify the ROC curve by removing intermediate points—those that are collinear with their neighbors and therefore do not affect the curve's shape.

However, intermediate points are dropped before the initial point (0, 0) and its threshold inf are prepended to the results. The first computed point is therefore always treated as an endpoint and kept, even when it lies on the segment between (0, 0) and the next retained point.

Example:

y_true  = numpy.array([0, 0, 0, 0, 1, 1, 1, 1])
y_score = numpy.array([0, 1, 2, 3, 4, 5, 6, 7])

In this case, a threshold of 4 perfectly separates class 0 from class 1. The expected simplified ROC curve should be:

fpr = [0., 0., 1.]
tpr = [0., 1., 1.]
thresholds = [inf, 4., 0.]

Instead, the actual output is:

fpr = [0., 0., 0., 1.]
tpr = [0., 0.25, 1., 1.]
thresholds = [inf, 7., 4., 0.]

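The point (0., 0.25) lies on the vertical segment from (0., 0.) to (0., 1.) and is therefore redundant. It is retained because the drop step runs before (0., 0.) is prepended, so at that moment it is the first point of the curve, and the first point is always kept.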

Root Cause:

# Incorrect order: intermediates dropped before prepending
fps, tps, thresholds = _binary_clf_curve(...)

if drop_intermediate:
    # identify and drop intermediates
    ...

# only afterward:
fps = numpy.r_[0, fps]
tps = numpy.r_[0, tps]
thresholds = numpy.r_[numpy.inf, thresholds]

Recommended Fix:

Reorder the operations so that the initial point is prepended before identifying intermediate points:

fps, tps, thresholds = _binary_clf_curve(...)

# Prepend start of curve
fps = numpy.r_[0, fps]
tps = numpy.r_[0, tps]
thresholds = numpy.r_[numpy.inf, thresholds]

if drop_intermediate:
    optimal_idxs = ...
    fps = fps[optimal_idxs]
    tps = tps[optimal_idxs]
    thresholds = thresholds[optimal_idxs]
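
The effect of the ordering can be checked without scikit-learn. The following is a minimal sketch in plain Python, not the library's code: `second_diff_keep` is a hypothetical helper that mimics the second-difference rule, and the `fps`/`tps` lists are the cumulative counts for the example above (thresholds 7 down to 0), written out by hand.

```python
def second_diff_keep(fps, tps):
    """Indices kept by a second-difference rule: both endpoints, plus any
    interior point whose second difference in fps or tps is nonzero.
    Assumes at least two points."""
    last = len(fps) - 1
    keep = [0]
    for i in range(1, last):
        d2_fps = fps[i - 1] - 2 * fps[i] + fps[i + 1]
        d2_tps = tps[i - 1] - 2 * tps[i] + tps[i + 1]
        if d2_fps != 0 or d2_tps != 0:
            keep.append(i)
    keep.append(last)
    return keep


# Cumulative false/true positive counts for
# y_true = [0, 0, 0, 0, 1, 1, 1, 1], y_score = [0, 1, ..., 7],
# at thresholds 7, 6, 5, 4, 3, 2, 1, 0.
fps = [0, 0, 0, 0, 1, 2, 3, 4]
tps = [1, 2, 3, 4, 4, 4, 4, 4]

# Current order: drop intermediates, then prepend the origin.
idx = second_diff_keep(fps, tps)
drop_first = ([0] + [fps[i] for i in idx], [0] + [tps[i] for i in idx])

# Proposed order: prepend the origin, then drop intermediates.
fps0, tps0 = [0] + fps, [0] + tps
idx0 = second_diff_keep(fps0, tps0)
prepend_first = ([fps0[i] for i in idx0], [tps0[i] for i in idx0])

print(drop_first)     # ([0, 0, 0, 4], [0, 1, 4, 4])
print(prepend_first)  # ([0, 0, 4], [0, 4, 4])
```

Dividing the counts by 4 gives the rates: the first result is the four-point curve reported above, the second is the expected three-point curve.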

Bug 2: Faulty Heuristic for Identifying Intermediate Points

Even with the correct ordering of operations in place, the logic for identifying intermediate points is flawed.

Currently, roc_curve uses this heuristic:

optimal_idxs = numpy.where(
    numpy.r_[True,
             numpy.logical_or(numpy.diff(fps, 2), numpy.diff(tps, 2)),
             True]
)[0]

This retains the first and last points, plus any point at which the second difference of fps or tps is nonzero. The nonzero second difference is meant to flag the corners of the curve.

However, this approach removes only intermediate points that lie exactly midway between their neighbors. It fails to remove redundant points in many valid cases.


Example 1:

y_true  = numpy.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1])
y_score = numpy.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4])

Here again, a threshold of 4 cleanly separates the positive class. The minimal correct ROC curve is therefore:

fpr = [0., 0., 1.]
tpr = [0., 1., 1.]
thresholds = [inf, 4., 0.]

Actual output:

fpr = [0., 0., 0.1, 0.3, 0.6, 1.]
tpr = [0., 1., 1., 1., 1., 1.]
thresholds = [inf, 4., 3., 2., 1., 0.]

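Several intermediate points along the horizontal segment (0, 1) to (1, 1) are incorrectly retained. The steps in fpr along that segment (0.1, 0.2, 0.3, 0.4) are unequal, so none of these points is the exact midpoint of its neighbors and none has a zero second difference.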


Example 2:

y_true  = numpy.array([0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1])
y_score = numpy.array([0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3])

Here, every score value is shared by equal numbers of class 0 and class 1 samples, so the scores have no discriminatory power. The minimal ROC should be:

fpr = [0., 1.]
tpr = [0., 1.]
thresholds = [inf, 0.]

Instead, the actual output is:

fpr = [0., 0.4, 0.7, 0.9, 1.]
tpr = [0., 0.4, 0.7, 0.9, 1.]
thresholds = [inf, 3., 2., 1., 0.]

This inflates the ROC curve with redundant points that lie on the same diagonal segment.


Cause:

The use of second-order differences (diff(..., 2)) only identifies intermediate points that lie exactly midway between neighbors.
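
To see why, note that the second difference a - 2b + c is zero exactly when b is the average of a and c. The short sketch below (plain Python, with hypothetical helper names) compares that test with a cross-product test on two collinear triples: one evenly spaced, and one taken from the first three points of Example 2, expressed as counts.

```python
def second_difference(a, b, c):
    """Second difference of three consecutive values; zero iff b == (a + c) / 2."""
    return a - 2 * b + c


def cross_product(p0, p1, p2):
    """Cross product of the edges p0->p1 and p1->p2; zero iff the points are collinear."""
    dx0, dy0 = p1[0] - p0[0], p1[1] - p0[1]
    dx1, dy1 = p2[0] - p1[0], p2[1] - p1[1]
    return dx0 * dy1 - dy0 * dx1


even = [(0, 0), (2, 2), (4, 4)]    # middle point is the exact midpoint
uneven = [(0, 0), (4, 4), (7, 7)]  # collinear, but not evenly spaced

for pts in (even, uneven):
    xs = [p[0] for p in pts]
    print(second_difference(*xs), cross_product(*pts))
# prints "0 0" for the even triple and "-1 0" for the uneven one
```

Only the cross product reports both middle points as redundant; the second difference misses the unevenly spaced one.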


Recommended Fix:

Use a geometric test for collinearity based on vector cross-products:

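import numpy

def collinear_free_mask(x, y, tolerance=1e-12):
    # Despite the name, this returns the indices of the points to keep.
    # Edge entering (dx0, dy0) and edge leaving (dx1, dy1) each interior point
    dx0 = x[1:-1] - x[:-2]
    dx1 = x[2:] - x[1:-1]
    dy0 = y[1:-1] - y[:-2]
    dy1 = y[2:] - y[1:-1]
    # A (near-)zero cross product means the two edges are parallel
    is_collinear = numpy.abs(dx0 * dy1 - dy0 * dx1) < tolerance
    # The first and last points are always kept
    return numpy.flatnonzero(numpy.r_[True, ~is_collinear, True])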

Replace:

optimal_idxs = numpy.where(...)[0]

with:

optimal_idxs = collinear_free_mask(fps, tps)

This approach correctly identifies and removes redundant points, regardless of their spacing or positioning.
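
Both proposed changes can be tried together on the three examples from this report. What follows is a dependency-free sketch, not a patch against scikit-learn: `cumulative_counts`, `collinear_free_indices` and `simplified_roc` are hypothetical names, labels are assumed to be the integers 0 and 1, and both classes are assumed to be present. Because it works on integer counts, the cross product is exact and no tolerance is needed.

```python
from math import inf


def cumulative_counts(y_true, y_score):
    """Cumulative false/true positive counts at each distinct score,
    ordered from the highest threshold to the lowest."""
    fps, tps, thresholds = [], [], []
    fp = tp = 0
    pairs = sorted(zip(y_score, y_true), reverse=True)
    for i, (score, label) in enumerate(pairs):
        tp += label
        fp += 1 - label
        # Emit one point per distinct score, after its last tied sample.
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            fps.append(fp)
            tps.append(tp)
            thresholds.append(score)
    return fps, tps, thresholds


def collinear_free_indices(x, y):
    """Indices of both endpoints and of every interior point that is not
    collinear with its two neighbours (nonzero cross product)."""
    keep = [0]
    for i in range(1, len(x) - 1):
        cross = ((x[i] - x[i - 1]) * (y[i + 1] - y[i])
                 - (y[i] - y[i - 1]) * (x[i + 1] - x[i]))
        if cross != 0:
            keep.append(i)
    keep.append(len(x) - 1)
    return keep


def simplified_roc(y_true, y_score):
    fps, tps, thresholds = cumulative_counts(y_true, y_score)
    # Prepend the origin first, then drop collinear points.
    fps, tps, thresholds = [0] + fps, [0] + tps, [inf] + thresholds
    idx = collinear_free_indices(fps, tps)
    n_neg, n_pos = fps[-1], tps[-1]  # assumed nonzero
    fpr = [fps[i] / n_neg for i in idx]
    tpr = [tps[i] / n_pos for i in idx]
    return fpr, tpr, [thresholds[i] for i in idx]


bug1 = simplified_roc([0, 0, 0, 0, 1, 1, 1, 1], [0, 1, 2, 3, 4, 5, 6, 7])
bug2_ex1 = simplified_roc([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
                          [0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4])
bug2_ex2 = simplified_roc(
    [0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3])

print(bug1)      # ([0.0, 0.0, 1.0], [0.0, 1.0, 1.0], [inf, 4, 0])
print(bug2_ex1)  # ([0.0, 0.0, 1.0], [0.0, 1.0, 1.0], [inf, 4, 0])
print(bug2_ex2)  # ([0.0, 1.0], [0.0, 1.0], [inf, 0])
```

Under these assumptions the sketch yields the minimal curves listed as expected results for all three inputs.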

Steps/Code to Reproduce

import numpy
import sklearn.metrics  # explicit submodule import, so the snippet does not depend on lazy loading

# Bug 1 example
print(sklearn.metrics.roc_curve(numpy.array([0, 0, 0, 0, 1, 1, 1, 1]), numpy.array([0, 1, 2, 3, 4, 5, 6, 7])))
# Bug 2, Example 1
print(sklearn.metrics.roc_curve(numpy.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]), numpy.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4])))
# Bug 2, Example 2
print(sklearn.metrics.roc_curve(numpy.array([0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]), numpy.array([0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3])))

Expected Results

(array([0., 0., 1.]), array([0., 1., 1.]), array([inf,  4.,  0.]))
(array([0., 0., 1.]), array([0., 1., 1.]), array([inf,  4.,  0.]))
(array([0., 1.]), array([0., 1.]), array([inf,  0.]))

Actual Results

(array([0., 0., 0., 1.]), array([0.  , 0.25, 1.  , 1.  ]), array([inf,  7.,  4.,  0.]))
(array([0. , 0. , 0.1, 0.3, 0.6, 1. ]), array([0., 1., 1., 1., 1., 1.]), array([inf,  4.,  3.,  2.,  1.,  0.]))
(array([0. , 0.4, 0.7, 0.9, 1. ]), array([0. , 0.4, 0.7, 0.9, 1. ]), array([inf,  3.,  2.,  1.,  0.]))

Versions

System:
    python: 3.11.13 (main, Jun  3 2025, 18:38:25) [Clang 16.0.0 (clang-1600.0.26.6)]
executable: <PATH_TO_VIRTUALENV>/bin/python3
   machine: macOS-14.4.1-arm64-arm-64bit

Python dependencies:
      sklearn: 1.6.1
          pip: 25.1.1
   setuptools: 67.7.2
        numpy: 1.25.2
        scipy: 1.11.1
       Cython: None
       pandas: 2.2.3
   matplotlib: 3.8.0
       joblib: 1.5.1
threadpoolctl: 3.6.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: <PATH_TO_VIRTUALENV>/lib/python3.11/site-packages/numpy/.dylibs/libopenblas64_.0.dylib
        version: 0.3.23.dev
threading_layer: pthreads
   architecture: armv8

       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: <PATH_TO_VIRTUALENV>/lib/python3.11/site-packages/scipy/.dylibs/libopenblas.0.dylib
        version: 0.3.21.dev
threading_layer: pthreads
   architecture: armv8

       user_api: openmp
   internal_api: openmp
    num_threads: 10
         prefix: libomp
       filepath: <PATH_TO_VIRTUALENV>/lib/python3.11/site-packages/sklearn/.dylibs/libomp.dylib
        version: None
