clscurves.plotter package

Submodules

clscurves.plotter.cost module

class clscurves.plotter.cost.CostPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_cost([title, cmap, log_scale, x_col, …])

Plot the “Misclassification Cost” curve.

compute_cost

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None

Initialize self. See help(type(self)) for accurate signature.

compute_cost(fn_cost_multiplier=1, fp_cost_multiplier=1, use_weighted_fn=False, use_weighted_fp=False) → None

plot_cost(title: str = 'Misclassification Cost', cmap: str = 'rainbow', log_scale: bool = False, x_col: str = 'thresh', x_label: Optional[str] = None, x_rng: Optional[List[float]] = None, y_label: str = 'Cost', y_rng: Optional[List[float]] = None, color_by: str = 'frac', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]

Plot the “Misclassification Cost” curve.

Note: compute_cost must be run first to obtain cost values.

Parameters
title

Title of plot.

cmap

Colormap string specification.

log_scale

Boolean to specify whether the x-axis should be scaled by a log10 transformation.

x_col

Name of column in metrics.curves that specifies which values to use for the x coordinates of the cost curve.

x_label

Label to apply to x-axis. Defaults for common choices of x-axis will be supplied if no x-label override is supplied here.

x_rng

Specify an x-axis range of the form [min_value, max_value] to override the default range.

y_label

Label to apply to y-axis.

y_rng

Specify a y-axis range of the form [min_value, max_value] to override the default range.

color_by

Name of column in metrics.curves that specifies which values to use when coloring points along the cost curve.

cbar_rng

Specify a color bar range of the form [min_value, max_value] to override the default range.

cbar_label

Custom label to apply to the color bar. If None is supplied, the default (“Fraction Flagged”) will be used.

grid

Whether to plot grid lines.

dpi

Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.

bootstrapped

Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.

bootstrap_alpha

Opacity of bootstrap curves.

bootstrap_color

Color of bootstrap curves.

imputed

Whether to plot imputed curves.

return_fig

If set to True, will return (fig, ax) as a tuple instead of plotting the figure.

Returns
Optional[Tuple[plt.figure, plt.axes]]

The plot’s figure and axis object.
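This page does not state the formula behind compute_cost. As a minimal sketch, assume the cost at each threshold is fn_cost_multiplier times the number of false negatives plus fp_cost_multiplier times the number of false positives, and that a case is flagged when its score is at or above the threshold. The function misclassification_cost below is hypothetical and is not part of clscurves:

```python
from typing import List, Sequence


def misclassification_cost(
    scores: Sequence[float],
    labels: Sequence[int],
    thresholds: Sequence[float],
    fn_cost_multiplier: float = 1.0,
    fp_cost_multiplier: float = 1.0,
) -> List[float]:
    """Assumed cost model: fn_mult * FN + fp_mult * FP at each threshold."""
    costs = []
    for t in thresholds:
        # Assumption: a case is flagged (predicted positive) when score >= t.
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        costs.append(fn_cost_multiplier * fn + fp_cost_multiplier * fp)
    return costs


scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
# Missing a positive is treated as five times as costly as a false alarm.
costs = misclassification_cost(scores, labels, [0.0, 0.5, 1.0], fn_cost_multiplier=5.0)
```

Under this assumed model, raising fn_cost_multiplier pushes the minimum-cost threshold lower, because missed positives dominate the total.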

clscurves.plotter.dist module

class clscurves.plotter.dist.DistPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool, reverse_thresh: bool)

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_dist([weighted, label, kind, …])

Plot the data distribution.

plot_cdf

plot_pdf

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool, reverse_thresh: bool) → None

Initialize self. See help(type(self)) for accurate signature.

plot_cdf(**kwargs) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]

plot_dist(weighted: bool = False, label: Literal['all', 0, 1, None] = 'all', kind: str = 'CDF', kernel_size: float = 10, log_scale: bool = False, title: Optional[str] = None, cmap: str = 'rainbow', color_by: str = 'recall', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, x_rng: Optional[List[float]] = None, y_rng: Optional[List[float]] = None, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]

Plot the data distribution.

This plots either the CDF (Cumulative Distribution Function) or PDF (Probability Density Function) curve.

Parameters
weighted

Specifies whether the weighted or unweighted fraction flagged should be used when computing the CDF or PDF. If unweighted, the fraction flagged is the number of cases flagged divided by the number of cases total. If weighted, it is the sum of the weights of all the cases flagged, divided by the sum of the weights of all the cases.

label

Class label whose distribution (CDF or PDF) is plotted; one of “all”, 1, 0, or None.

kind

Either “cdf” or “pdf”.

kernel_size

Used for PDF only: standard deviation of the Gaussian kernel to use when smoothing the PDF curve.

log_scale

Boolean to specify whether the x-axis should be log-scaled.

title

Title of plot.

cmap

Colormap string specification.

color_by

Name of key in metrics.curves that specifies which values to use when coloring points along the PDF or CDF curve.

cbar_rng

Specify a color bar range of the form [min_value, max_value] to override the default range.

cbar_label

Custom label to apply to the color bar. If None is supplied, a default will be selected from the cbar_dict.

grid

Whether to plot grid lines.

x_rng

Range of the horizontal axis.

y_rng

Range of the vertical axis.

dpi

Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.

bootstrapped

Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.

bootstrap_alpha

Opacity of bootstrap curves.

bootstrap_color

Color of bootstrap curves.

imputed

Whether to plot imputed curves.

return_fig

If set to True, will return (fig, ax) as a tuple instead of plotting the figure.

Returns
Optional[Tuple[plt.figure, plt.axes]]

The plot’s figure and axis object.
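The weighted and unweighted fraction flagged described under the weighted parameter can be sketched as follows. The helper fraction_flagged is hypothetical (not part of clscurves), and it assumes a case is flagged when its score is at or above the threshold; reverse_thresh may flip that direction in the library:

```python
from typing import Optional, Sequence


def fraction_flagged(
    scores: Sequence[float],
    thresh: float,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Fraction of cases (or of total weight) with score >= thresh."""
    if weights is None:
        weights = [1.0] * len(scores)  # unweighted: every case counts once
    flagged = sum(w for s, w in zip(scores, weights) if s >= thresh)
    return flagged / sum(weights)


scores = [0.2, 0.5, 0.7, 0.9]
weights = [1.0, 1.0, 2.0, 6.0]
unweighted = fraction_flagged(scores, 0.6)         # 2 of 4 cases flagged
weighted = fraction_flagged(scores, 0.6, weights)  # weight 8 of 10 flagged
```

When a few heavily weighted cases sit at high scores, as here, the weighted fraction flagged can differ considerably from the unweighted one.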

plot_pdf(**kwargs) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]
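For the PDF, kernel_size is documented as the standard deviation of a Gaussian smoothing kernel. The units are not stated on this page. The sketch below assumes the standard deviation is measured in sample positions and that the kernel is truncated at about three standard deviations; gaussian_smooth is a hypothetical helper, not the library's implementation:

```python
import math
from typing import List, Sequence


def gaussian_smooth(values: Sequence[float], kernel_size: float) -> List[float]:
    """Smooth values with a Gaussian whose std. dev. is kernel_size samples."""
    radius = max(1, int(math.ceil(3 * kernel_size)))  # truncate near 3 sigma
    smoothed = []
    for i in range(len(values)):
        total, norm = 0.0, 0.0
        lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
        for j in range(lo, hi):
            w = math.exp(-0.5 * ((j - i) / kernel_size) ** 2)
            total += w * values[j]
            norm += w
        smoothed.append(total / norm)  # renormalize so edges are not damped
    return smoothed


raw_pdf = [0.0, 0.0, 1.0, 0.0, 0.0]
smooth_pdf = gaussian_smooth(raw_pdf, kernel_size=1.0)
```

A larger kernel_size spreads each spike over more neighbours, giving a smoother but less detailed curve.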

clscurves.plotter.plotter module

class clscurves.plotter.plotter.MetricsPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)

Bases: clscurves.config.MetricsAliases

A helper class to provide methods shared by each metrics plotter.

These methods streamline the process of making a single classification curve metrics plot, making a bootstrapped plot, and adding a confidence ellipse to a specified operating point.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None

Initialize self. See help(type(self)) for accurate signature.
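How the confidence ellipse is computed is not described here. One common construction, shown as a sketch under that assumption, takes the bootstrapped (x, y) values at the operating point and draws the ellipse defined by their sample covariance. The function confidence_ellipse and its n_std parameter are hypothetical:

```python
import math
from typing import Sequence, Tuple


def confidence_ellipse(
    xs: Sequence[float], ys: Sequence[float], n_std: float = 2.0
) -> Tuple[Tuple[float, float], float, float, float]:
    """Return (center, semi_major, semi_minor, angle_radians)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)  # guard against rounding below 0
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)  # major-axis orientation
    return (mx, my), n_std * math.sqrt(lam_major), n_std * math.sqrt(lam_minor), angle


# Perfectly correlated points: the ellipse collapses onto the 45-degree line.
center, semi_major, semi_minor, angle = confidence_ellipse(
    [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]
)
```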

clscurves.plotter.pr module

class clscurves.plotter.pr.PRPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_pr([weighted, title, cmap, color_by, …])

Plot the PR (Precision & Recall) curve.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None

Initialize self. See help(type(self)) for accurate signature.

plot_pr(weighted: bool = False, title: str = 'Precision-Recall Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, f1_contour: bool = False, grid: bool = True, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]

Plot the PR (Precision & Recall) curve.

Parameters
weighted

Specifies whether the weighted or unweighted recall (i.e. TPR) should be used. For example, recall (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the PR curve, this weighting applies only to the recall axis.

title

Title of plot.

cmap

Colormap string specification.

color_by

Name of key in metrics.curves that specifies which values to use when coloring points along the PR curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.

cbar_rng

Specify a color bar range of the form [min_value, max_value] to override the default range.

cbar_label

Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.

dpi

Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.

bootstrapped

Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.

bootstrap_alpha

Opacity of bootstrap curves.

bootstrap_color

Color of bootstrap curves.

imputed

Whether to plot imputed curves.

f1_contour

Whether to include reference contours for curves of constant F1.

grid

Whether to plot grid lines.

op_value

Threshold value (operating point) at which to draw a confidence ellipse when the plot is bootstrapped.

return_fig

If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
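The reference contours enabled by f1_contour are curves of constant F1 in the precision-recall plane. Since F1 = 2PR / (P + R), precision along a contour is P = F1 * R / (2R - F1), defined only where R > F1 / 2. A minimal sketch follows; the function name f1_contour here is a hypothetical helper and not the library's drawing code:

```python
from typing import List, Tuple


def f1_contour(f1: float, n_points: int = 50) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs along the curve of constant F1."""
    points = []
    for i in range(1, n_points + 1):
        recall = i / n_points
        denom = 2 * recall - f1
        if denom <= 0:
            continue  # no valid precision yields this F1 at this recall
        precision = f1 * recall / denom
        if precision <= 1.0:
            points.append((recall, precision))
    return points


contour = f1_contour(0.5, n_points=4)
```

A PR curve that crosses a contour reaches that F1 value at the crossing point, which makes the contours a quick visual guide to the best achievable F1.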

clscurves.plotter.prg module

class clscurves.plotter.prg.PRGPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)

Bases: clscurves.plotter.plotter.MetricsPlotter

Plot the PRG (Precision-Recall-Gain) curve.

“Precision-Recall-Gain” plots defined here.

A “weighted recall” variant of Recall Gain is not discussed in the paper, and may not have any theoretical guarantees or reasonable interpretations. Because it’s unclear how a positive-example weight should affect the quantities in a PRG plot, we do not support weighted plotting at this time.
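For orientation, the unweighted quantities on a PRG plot can be sketched as below. This assumes the standard definitions from the Precision-Recall-Gain literature, where a value v (precision or recall) maps to (v - pi) / ((1 - pi) * v) and pi is the fraction of positive examples. Whether clscurves computes them exactly this way is an assumption, and gain is a hypothetical helper:

```python
def gain(value: float, pos_frac: float) -> float:
    """Map precision or recall onto the gain scale for prevalence pos_frac."""
    return (value - pos_frac) / ((1.0 - pos_frac) * value)


pos_frac = 0.2  # fraction of positive examples (pi)
precision_gain = gain(0.5, pos_frac)
recall_gain = gain(1.0, pos_frac)
baseline_gain = gain(pos_frac, pos_frac)  # the always-positive baseline maps to 0
```

The formula makes the difficulty noted above concrete: pi is a simple count-based prevalence, and it is not obvious what should replace it when positive examples carry weights.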

Methods

plot_prg([title, cmap, color_by, cbar_rng, …])

Plot the PRG (Precision-Recall-Gain) curve.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None

Initialize self. See help(type(self)) for accurate signature.

plot_prg(title: str = 'Precision-Recall-Gain Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]

Plot the PRG (Precision-Recall-Gain) curve.

Parameters
title

Title of plot.

cmap

Colormap string specification.

color_by

Name of key in metrics.curves that specifies which values to use when coloring points along the PRG curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.

cbar_rng

Specify a color bar range of the form [min_value, max_value] to override the default range.

cbar_label

Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.

grid

Whether to plot grid lines.

dpi

Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.

bootstrapped

Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.

bootstrap_alpha

Opacity of bootstrap curves.

bootstrap_color

Color of bootstrap curves.

imputed

Whether to plot imputed curves.

op_value

Threshold value (operating point) at which to draw a confidence ellipse when the plot is bootstrapped.

return_fig

If set to True, will return (fig, ax) as a tuple instead of plotting the figure.

clscurves.plotter.rf module

class clscurves.plotter.rf.RFPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_rf([weighted, scale, title, cmap, …])

Plot the RF (Recall & Fraction Flagged) curve.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None

Initialize self. See help(type(self)) for accurate signature.

plot_rf(weighted: bool = False, scale: float = 1.0, title: str = 'Recall-Fraction Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]

Plot the RF (Recall & Fraction Flagged) curve.

Parameters
weighted

Specifies whether the weighted or unweighted TPR (i.e. recall) should be used. For example, TPR (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the RF curve, this weighting applies only to the TPR axis.

scale

Value to scale the x-axis by. By choosing a lower number like 0.1, you can observe greater detail in sensitive regions of the RF curve.

title

Title of plot.

cmap

Colormap string specification.

color_by

Name of key in metrics.curves that specifies which values to use when coloring points along the RF curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.

cbar_rng

Specify a color bar range of the form [min_value, max_value] to override the default range.

cbar_label

Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.

grid

Whether to plot grid lines.

dpi

Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.

bootstrapped

Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.

bootstrap_alpha

Opacity of bootstrap curves.

bootstrap_color

Color of bootstrap curves.

imputed

Whether to plot imputed curves.

op_value

Threshold value (operating point) at which to draw a confidence ellipse when the plot is bootstrapped.

return_fig

If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
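To make the two quantities on an RF curve concrete, the sketch below computes unweighted fraction flagged and recall at each distinct score. It assumes a case is flagged when its score is at or above the threshold; rf_points is a hypothetical helper and does not use clscurves:

```python
from typing import List, Sequence, Tuple


def rf_points(
    scores: Sequence[float], labels: Sequence[int]
) -> List[Tuple[float, float, float]]:
    """Return (thresh, fraction_flagged, recall) at each distinct score."""
    n_total = len(scores)
    n_pos = sum(labels)
    points = []
    for t in sorted(set(scores), reverse=True):
        # Labels of the cases flagged at this threshold (score >= t).
        flagged = [y for s, y in zip(scores, labels) if s >= t]
        points.append((t, len(flagged) / n_total, sum(flagged) / n_pos))
    return points


points = rf_points([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Each tuple answers the operational question the RF curve is meant for: what share of positives is captured if a given share of all cases is flagged.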

clscurves.plotter.roc module

class clscurves.plotter.roc.ROCPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_roc([weighted, title, cmap, color_by, …])

Plot the ROC (Receiver Operating Characteristic) curve.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None

Initialize self. See help(type(self)) for accurate signature.

plot_roc(weighted: bool = False, title: str = 'ROC Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]

Plot the ROC (Receiver Operating Characteristic) curve.

Parameters
weighted

Specifies whether the weighted or unweighted TPR and FPR should be used. For example, TPR (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the ROC curve, this weighting applies to both the TPR and FPR axes.

title

Title of plot.

cmap

Colormap string specification.

color_by

Name of key in metrics.curves that specifies which values to use when coloring points along the ROC curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.

cbar_rng

Specify a color bar range of the form [min_value, max_value] to override the default range.

cbar_label

Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.

grid

Whether to plot grid lines.

dpi

Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.

bootstrapped

Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.

bootstrap_alpha

Opacity of bootstrap curves.

bootstrap_color

Color of bootstrap curves.

imputed

Whether to plot imputed curves.

op_value

Threshold value (operating point) at which to draw a confidence ellipse when the plot is bootstrapped.

return_fig

If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
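The weighted and unweighted rates described under the weighted parameter can be sketched at a single threshold as follows. The helper tpr_fpr is hypothetical (not part of clscurves) and assumes a case is captured when its score is at or above the threshold:

```python
from typing import Optional, Sequence, Tuple


def tpr_fpr(
    scores: Sequence[float],
    labels: Sequence[int],
    thresh: float,
    weights: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return (TPR, FPR) at thresh; each case counts by its weight if given."""
    if weights is None:
        weights = [1.0] * len(scores)  # unweighted: every case counts once
    rows = list(zip(scores, labels, weights))
    tp = sum(w for s, y, w in rows if y == 1 and s >= thresh)
    fp = sum(w for s, y, w in rows if y == 0 and s >= thresh)
    pos = sum(w for _, y, w in rows if y == 1)
    neg = sum(w for _, y, w in rows if y == 0)
    return tp / pos, fp / neg


scores = [0.9, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 0]
weights = [3.0, 1.0, 1.0, 1.0]
unweighted = tpr_fpr(scores, labels, 0.5)
weighted = tpr_fpr(scores, labels, 0.5, weights)
```

In this toy data the heavily weighted positive sits above the threshold, so the weighted TPR is higher than the unweighted TPR while the FPR is unchanged.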

Module contents