clscurves.plotter package¶

Submodules¶

clscurves.plotter.cost module¶

class clscurves.plotter.cost.CostPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_cost([title, cmap, log_scale, x_col, …])

Plot the “Misclassification Cost” curve.

compute_cost

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶: Initialize self. See help(type(self)) for accurate signature.

compute_cost(fn_cost_multiplier=1, fp_cost_multiplier=1, use_weighted_fn=False, use_weighted_fp=False) → None¶

plot_cost(title: str = 'Misclassification Cost', cmap: str = 'rainbow', log_scale: bool = False, x_col: str = 'thresh', x_label: Optional[str] = None, x_rng: Optional[List[float]] = None, y_label: str = 'Cost', y_rng: Optional[List[float]] = None, color_by: str = 'frac', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶

Plot the “Misclassification Cost” curve.

Note: compute_cost must be run first to obtain cost values.

Parameters

title: Title of plot.
cmap: Colormap string specification.
log_scale: Boolean to specify whether the x-axis should be scaled by a log10 transformation.
x_col: Name of column in metrics.curves that specifies which values to use for the x coordinates of the cost curve.
x_label: Label to apply to x-axis. If no override is given here, a default label is used for common choices of x_col.
x_rng: Specify an x-axis range of the form [min_value, max_value] to override the default range.
y_label: Label to apply to y-axis.
y_rng: Specify a y-axis range of the form [min_value, max_value] to override the default range.
color_by: Name of column in metrics.curves that specifies which values to use when coloring points along the cost curve.
cbar_rng: Specify a color bar range of the form [min_value, max_value] to override the default range.
cbar_label: Custom label to apply to the color bar. If None is supplied, the default (“Fraction Flagged”) will be used.
grid: Whether to plot grid lines.
dpi: Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
bootstrapped: Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
bootstrap_alpha: Opacity of bootstrap curves.
bootstrap_color: Color of bootstrap curves.
imputed: Whether to plot imputed curves.
return_fig: If set to True, will return (fig, ax) as a tuple instead of plotting the figure.

Returns

Optional[Tuple[plt.figure, plt.axes]]: The plot’s figure and axes objects if return_fig is True; otherwise None.
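The formula applied by compute_cost is not spelled out in this reference. As a rough sketch only, one common definition of misclassification cost at a threshold is a multiplier-weighted sum of false negatives and false positives. The helper misclassification_cost below is hypothetical and is not part of clscurves; the cost formula and the "score >= threshold means flagged" rule are assumptions made for illustration.

```python
from typing import List, Sequence


def misclassification_cost(
    labels: Sequence[int],
    scores: Sequence[float],
    thresholds: Sequence[float],
    fn_cost_multiplier: float = 1.0,
    fp_cost_multiplier: float = 1.0,
) -> List[float]:
    """Return one cost value per threshold (hypothetical helper, not clscurves API).

    Assumed definition: cost = fn_cost_multiplier * FN + fp_cost_multiplier * FP,
    where a case is flagged when its score is >= the threshold.
    """
    costs = []
    for thresh in thresholds:
        # Positives that were not flagged are false negatives.
        fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < thresh)
        # Negatives that were flagged are false positives.
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= thresh)
        costs.append(fn_cost_multiplier * fn + fp_cost_multiplier * fp)
    return costs


labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.8, 0.9]
thresholds = [0.0, 0.5, 1.0]
# Penalize a missed positive five times as heavily as a false alarm.
costs = misclassification_cost(labels, scores, thresholds, fn_cost_multiplier=5.0)
```

Under these assumptions, raising fn_cost_multiplier pushes the lowest-cost threshold downward, because missing a positive becomes more expensive than flagging a negative.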

clscurves.plotter.dist module¶

class clscurves.plotter.dist.DistPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool, reverse_thresh: bool)¶

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_dist([weighted, label, kind, …])

Plot the data distribution.

plot_cdf
plot_pdf

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool, reverse_thresh: bool) → None¶: Initialize self. See help(type(self)) for accurate signature.

plot_cdf(**kwargs) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶

plot_dist(weighted: bool = False, label: Literal['all', 0, 1, None] = 'all', kind: str = 'CDF', kernel_size: float = 10, log_scale: bool = False, title: Optional[str] = None, cmap: str = 'rainbow', color_by: str = 'recall', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, x_rng: Optional[List[float]] = None, y_rng: Optional[List[float]] = None, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶

Plot the data distribution.

This plots either the CDF (Cumulative Distribution Function) or PDF (Probability Density Function) curve.

Parameters

weighted: Specifies whether the weighted or unweighted fraction flagged should be used when computing the CDF or PDF. If unweighted, the fraction flagged is the number of cases flagged divided by the number of cases total. If weighted, it is the sum of the weights of all the cases flagged, divided by the sum of the weights of all the cases.
label: Class label to plot the distribution for; one of “all”, 1, 0, or None.
kind: Either “CDF” or “PDF”; the default is “CDF”.
kernel_size: Used for PDF only: standard deviation of the Gaussian kernel to use when smoothing the PDF curve.
log_scale: Boolean to specify whether the x-axis should be log-scaled.
title: Title of plot.
cmap: Colormap string specification.
color_by: Name of key in metrics.curves that specifies which values to use when coloring points along the PDF or CDF curve.
cbar_rng: Specify a color bar range of the form [min_value, max_value] to override the default range.
cbar_label: Custom label to apply to the color bar. If None is supplied, a default will be selected from the cbar_dict.
grid: Whether to plot grid lines.
x_rng: Range of the horizontal axis.
y_rng: Range of the vertical axis.
dpi: Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
bootstrapped: Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
bootstrap_alpha: Opacity of bootstrap curves.
bootstrap_color: Color of bootstrap curves.
imputed: Whether to plot imputed curves.
return_fig: If set to True, will return (fig, ax) as a tuple instead of plotting the figure.

Returns

Optional[Tuple[plt.figure, plt.axes]]: The plot’s figure and axes objects if return_fig is True; otherwise None.
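The weighted and unweighted definitions of fraction flagged given for the weighted parameter can be sketched as follows. The function fraction_flagged is a hypothetical helper, not part of clscurves, and it assumes that a case is flagged when its score is at or above the threshold.

```python
from typing import Optional, Sequence


def fraction_flagged(
    scores: Sequence[float],
    thresh: float,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Fraction of cases flagged at `thresh` (hypothetical helper).

    Assumes a case is flagged when its score is >= thresh. With no weights,
    every case counts equally; otherwise each case counts by its weight.
    """
    if weights is None:
        weights = [1.0] * len(scores)
    flagged = sum(w for s, w in zip(scores, weights) if s >= thresh)
    return flagged / sum(weights)


scores = [0.2, 0.4, 0.7, 0.9]
weights = [1.0, 1.0, 2.0, 6.0]
unweighted = fraction_flagged(scores, 0.5)          # 2 of 4 cases
weighted = fraction_flagged(scores, 0.5, weights)   # 8 of 10 weight units
```

The two values differ whenever the flagged cases carry more or less weight than average, which is why the choice of weighted changes the shape of the resulting curve.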

plot_pdf(**kwargs) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶
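The kernel_size parameter of plot_dist is described as the standard deviation of a Gaussian smoothing kernel. A minimal sketch of that kind of smoothing is shown below. The function gaussian_smooth is hypothetical, and it assumes that kernel_size is measured in array positions; the units used by clscurves are not stated in this reference.

```python
import math
from typing import List, Sequence


def gaussian_smooth(values: Sequence[float], kernel_size: float) -> List[float]:
    """Smooth `values` with a Gaussian kernel of standard deviation `kernel_size`.

    Hypothetical helper: `kernel_size` is measured in array positions here,
    and weights are renormalized near the edges so a constant stays constant.
    """
    radius = int(math.ceil(3 * kernel_size))  # truncate the kernel at 3 sigma
    smoothed = []
    for i in range(len(values)):
        total = 0.0
        norm = 0.0
        for j in range(max(0, i - radius), min(len(values), i + radius + 1)):
            w = math.exp(-0.5 * ((j - i) / kernel_size) ** 2)
            total += w * values[j]
            norm += w
        smoothed.append(total / norm)
    return smoothed


raw_pdf = [0.0, 0.0, 1.0, 0.0, 0.0]  # a single spike
smooth_pdf = gaussian_smooth(raw_pdf, kernel_size=1.0)
```

A larger kernel_size spreads each spike over more neighbors, giving a smoother but less detailed PDF curve.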

clscurves.plotter.plotter module¶

class clscurves.plotter.plotter.MetricsPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶

Bases: clscurves.config.MetricsAliases

A helper class to provide methods shared by each metrics plotter.

These methods streamline the process of making a single classification curve metrics plot, making a bootstrapped plot, and adding a confidence ellipse to a specified operating point.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶: Initialize self. See help(type(self)) for accurate signature.
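How clscurves builds its confidence ellipse is not documented here. One standard construction, sketched below under that caveat, derives the ellipse from the sample covariance of the bootstrapped points at an operating point. The function confidence_ellipse and its n_std parameter are hypothetical names introduced for illustration.

```python
import math
from typing import Sequence, Tuple


def confidence_ellipse(
    xs: Sequence[float], ys: Sequence[float], n_std: float = 2.0
) -> Tuple[Tuple[float, float], float, float, float]:
    """Return (center, width, height, angle_degrees) of a covariance ellipse.

    Hypothetical helper: the ellipse axes follow the eigenvectors of the 2x2
    sample covariance matrix, and each full axis length is
    2 * n_std * sqrt(eigenvalue). Requires at least two points.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    mean_var = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    lam_major = mean_var + spread
    lam_minor = max(mean_var - spread, 0.0)  # guard against rounding below zero
    # Orientation of the major axis, measured from the x-axis.
    angle = 0.5 * math.degrees(math.atan2(2 * sxy, sxx - syy))
    width = 2 * n_std * math.sqrt(lam_major)
    height = 2 * n_std * math.sqrt(lam_minor)
    return (mx, my), width, height, angle


# Bootstrapped (FPR, TPR) estimates at one operating point (made-up numbers).
fprs = [0.10, 0.12, 0.11, 0.09, 0.13]
tprs = [0.80, 0.83, 0.79, 0.78, 0.85]
center, width, height, angle = confidence_ellipse(fprs, tprs)
```

The returned values follow the argument order of matplotlib.patches.Ellipse (center, width, height, angle), so they could be passed to it directly when drawing.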

clscurves.plotter.pr module¶

class clscurves.plotter.pr.PRPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_pr([weighted, title, cmap, color_by, …])

Plot the PR (Precision & Recall) curve.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶: Initialize self. See help(type(self)) for accurate signature.

plot_pr(weighted: bool = False, title: str = 'Precision-Recall Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, f1_contour: bool = False, grid: bool = True, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶

Plot the PR (Precision & Recall) curve.

Parameters

weighted: Specifies whether the weighted or unweighted recall (i.e. TPR) should be used. For example, recall (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the PR curve, this weighting applies only to the recall axis.
title: Title of plot.
cmap: Colormap string specification.
color_by: Name of key in metrics.curves that specifies which values to use when coloring points along the PR curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.
cbar_rng: Specify a color bar range of the form [min_value, max_value] to override the default range.
cbar_label: Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.
dpi: Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
bootstrapped: Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
bootstrap_alpha: Opacity of bootstrap curves.
bootstrap_color: Color of bootstrap curves.
imputed: Whether to plot imputed curves.
f1_contour: Whether to include reference contours for curves of constant F1.
grid: Whether to plot grid lines.
op_value: Threshold value to plot a confidence ellipse for when the plot is bootstrapped.
return_fig: If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
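The reference contours enabled by f1_contour are curves of constant F1 in precision-recall space. As a sketch of the underlying arithmetic (not necessarily how clscurves draws them), such a curve can be traced by solving the F1 definition for precision. The function f1_contour below is a hypothetical standalone helper that happens to share the parameter's name.

```python
from typing import List, Tuple


def f1_contour(f1: float, n_points: int = 50) -> List[Tuple[float, float]]:
    """Return (recall, precision) points along a curve of constant F1.

    Solving F1 = 2PR / (P + R) for precision gives P = F1 * R / (2R - F1),
    which stays within [0, 1] only for recall in [F1 / (2 - F1), 1].
    Assumes 0 < f1 <= 1 and n_points >= 2.
    """
    r_min = f1 / (2 - f1)  # recall at which precision reaches 1
    points = []
    for i in range(n_points):
        recall = r_min + (1 - r_min) * i / (n_points - 1)
        precision = f1 * recall / (2 * recall - f1)
        points.append((recall, precision))
    return points


contour = f1_contour(0.5, n_points=5)
```

Each contour starts at precision 1 and ends at recall 1, so a PR curve that crosses a given contour has reached that F1 value at the crossing point.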

clscurves.plotter.prg module¶

class clscurves.plotter.prg.PRGPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶

Bases: clscurves.plotter.plotter.MetricsPlotter

Plot the PRG (Precision-Recall-Gain) curve.

“Precision-Recall-Gain” plots defined here.

A “weighted recall” variant of Recall Gain is not discussed in the paper, and may not have any theoretical guarantees or reasonable interpretations. Because it’s unclear how a positive-example weight should affect the quantities in a PRG plot, we do not support weighted plotting at this time.

Methods

plot_prg([title, cmap, color_by, cbar_rng, …])

Plot the PRG (Precision-Recall-Gain) curve.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶: Initialize self. See help(type(self)) for accurate signature.

plot_prg(title: str = 'Precision-Recall-Gain Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶

Plot the PRG (Precision-Recall-Gain) curve.

Parameters

title: Title of plot.
cmap: Colormap string specification.
color_by: Name of key in metrics.curves that specifies which values to use when coloring points along the PRG curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.
cbar_rng: Specify a color bar range of the form [min_value, max_value] to override the default range.
cbar_label: Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.
grid: Whether to plot grid lines.
dpi: Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
bootstrapped: Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
bootstrap_alpha: Opacity of bootstrap curves.
bootstrap_color: Color of bootstrap curves.
imputed: Whether to plot imputed curves.
op_value: Threshold value to plot a confidence ellipse for when the plot is bootstrapped.
return_fig: If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
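For orientation, the quantities on a PRG plot can be sketched as below. This assumes the paper mentioned above is the one that introduced Precision-Recall-Gain curves (Flach and Kull, 2015), where both gains rescale precision and recall relative to the positive-class rate. The function names are hypothetical and are not part of clscurves.

```python
def precision_gain(precision: float, pos_rate: float) -> float:
    """Precision gain: (prec - pi) / ((1 - pi) * prec), with pi the positive rate.

    Hypothetical helper. Assumes 0 < pos_rate < 1 and precision > 0.
    """
    return (precision - pos_rate) / ((1 - pos_rate) * precision)


def recall_gain(recall: float, pos_rate: float) -> float:
    """Recall gain: (rec - pi) / ((1 - pi) * rec), with pi the positive rate.

    Hypothetical helper. Assumes 0 < pos_rate < 1 and recall > 0.
    """
    return (recall - pos_rate) / ((1 - pos_rate) * recall)


pos_rate = 0.2  # fraction of positive cases in the data (made-up number)
prec_gain = precision_gain(0.5, pos_rate)
rec_gain = recall_gain(1.0, pos_rate)
```

With these definitions, a precision or recall equal to the positive rate maps to a gain of 0 and a perfect value maps to a gain of 1.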

clscurves.plotter.rf module¶

class clscurves.plotter.rf.RFPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_rf([weighted, scale, title, cmap, …])

Plot the RF (Recall & Fraction Flagged) curve.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶: Initialize self. See help(type(self)) for accurate signature.

plot_rf(weighted: bool = False, scale: float = 1.0, title: str = 'Recall-Fraction Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶

Plot the RF (Recall & Fraction Flagged) curve.

Parameters

weighted: Specifies whether the weighted or unweighted TPR (i.e. recall) should be used. For example, TPR (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the RF curve, this weighting applies only to the TPR axis.
scale: Value to scale the x-axis by. By choosing a lower number like 0.1, you can observe greater detail in sensitive regions of the RF curve.
title: Title of plot.
cmap: Colormap string specification.
color_by: Name of key in metrics.curves that specifies which values to use when coloring points along the RF curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.
cbar_rng: Specify a color bar range of the form [min_value, max_value] to override the default range.
cbar_label: Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.
grid: Whether to plot grid lines.
dpi: Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
bootstrapped: Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
bootstrap_alpha: Opacity of bootstrap curves.
bootstrap_color: Color of bootstrap curves.
imputed: Whether to plot imputed curves.
op_value: Threshold value to plot a confidence ellipse for when the plot is bootstrapped.
return_fig: If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
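A single point on an RF curve pairs the fraction of cases flagged with the recall at the same threshold. The sketch below follows the description of the weighted parameter, where weights affect only the recall axis. The function rf_point is hypothetical, and it assumes a case is flagged when its score is at or above the threshold.

```python
from typing import Optional, Sequence, Tuple


def rf_point(
    labels: Sequence[int],
    scores: Sequence[float],
    thresh: float,
    weights: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return (fraction_flagged, recall) at `thresh` (hypothetical helper).

    Weights, when given, are applied to recall only; the fraction flagged
    is always a plain count ratio.
    """
    flagged = [s >= thresh for s in scores]
    frac = sum(flagged) / len(scores)
    if weights is None:
        weights = [1.0] * len(labels)
    pos_total = sum(w for y, w in zip(labels, weights) if y == 1)
    pos_caught = sum(
        w for y, w, f in zip(labels, weights, flagged) if y == 1 and f
    )
    return frac, pos_caught / pos_total


labels = [0, 1, 0, 1]
scores = [0.2, 0.4, 0.7, 0.9]
weights = [1.0, 1.0, 1.0, 9.0]
unweighted_point = rf_point(labels, scores, 0.5)
weighted_point = rf_point(labels, scores, 0.5, weights)
```

In this made-up example, flagging half of the cases captures half of the positive cases by count but most of the positive weight, because the highest-scoring positive carries the largest weight.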

clscurves.plotter.roc module¶

class clscurves.plotter.roc.ROCPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶

Bases: clscurves.plotter.plotter.MetricsPlotter

Methods

plot_roc([weighted, title, cmap, color_by, …])

Plot the ROC (Receiver Operating Characteristic) curve.

__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶: Initialize self. See help(type(self)) for accurate signature.

plot_roc(weighted: bool = False, title: str = 'ROC Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶

Plot the ROC (Receiver Operating Characteristic) curve.

Parameters

weighted: Specifies whether the weighted or unweighted TPR and FPR should be used. For example, TPR (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the ROC curve, this weighting applies to both the TPR and FPR axes.
title: Title of plot.
cmap: Colormap string specification.
color_by: Name of key in metrics.curves that specifies which values to use when coloring points along the ROC curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.
cbar_rng: Specify a color bar range of the form [min_value, max_value] to override the default range.
cbar_label: Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.
grid: Whether to plot grid lines.
dpi: Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
bootstrapped: Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
bootstrap_alpha: Opacity of bootstrap curves.
bootstrap_color: Color of bootstrap curves.
imputed: Whether to plot imputed curves.
op_value: Threshold value to plot a confidence ellipse for when the plot is bootstrapped.
return_fig: If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
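To illustrate how weighting affects both ROC axes, the sketch below sweeps every distinct score as a threshold and records the threshold with its FPR and TPR. The function roc_points is hypothetical and is not the clscurves implementation; it assumes a case is flagged when its score is at or above the threshold.

```python
from typing import List, Optional, Sequence, Tuple


def roc_points(
    labels: Sequence[int],
    scores: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> List[Tuple[float, float, float]]:
    """Return (thresh, fpr, tpr) triples, highest threshold first.

    Hypothetical helper: weights, when given, are applied to both the
    TPR and the FPR. Requires at least one positive and one negative case.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    pos = sum(w for y, w in zip(labels, weights) if y == 1)
    neg = sum(w for y, w in zip(labels, weights) if y == 0)
    points = []
    for thresh in sorted(set(scores), reverse=True):
        tp = sum(
            w for y, s, w in zip(labels, scores, weights) if y == 1 and s >= thresh
        )
        fp = sum(
            w for y, s, w in zip(labels, scores, weights) if y == 0 and s >= thresh
        )
        points.append((thresh, fp / neg, tp / pos))
    return points


labels = [0, 1, 0, 1]
scores = [0.2, 0.4, 0.7, 0.9]
unweighted_roc = roc_points(labels, scores)
weighted_roc = roc_points(labels, scores, weights=[1.0, 1.0, 3.0, 9.0])
```

Keeping the threshold alongside each point mirrors the idea behind color_by: the same (FPR, TPR) coordinates can be colored by the threshold that produced them.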

Module contents¶