clscurves.plotter package¶
Submodules¶
clscurves.plotter.cost module¶
-
class clscurves.plotter.cost.CostPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶
Bases: clscurves.plotter.plotter.MetricsPlotter
Methods
plot_cost([title, cmap, log_scale, x_col, …]) Plot the “Misclassification Cost” curve.
compute_cost
-
__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶ Initialize self. See help(type(self)) for accurate signature.
-
compute_cost(fn_cost_multiplier=1, fp_cost_multiplier=1, use_weighted_fn=False, use_weighted_fp=False) → None¶
-
plot_cost(title: str = 'Misclassification Cost', cmap: str = 'rainbow', log_scale: bool = False, x_col: str = 'thresh', x_label: Optional[str] = None, x_rng: Optional[List[float]] = None, y_label: str = 'Cost', y_rng: Optional[List[float]] = None, color_by: str = 'frac', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶ Plot the “Misclassification Cost” curve.
Note: compute_cost must be run first to obtain cost values.
- Parameters
- title
Title of plot.
- cmap
Colormap string specification.
- log_scale
Boolean to specify whether the x-axis should be scaled by a log10 transformation.
- x_col
Name of column in metrics.curves that specifies which values to use for the x coordinates of the cost curve.
- x_label
Label to apply to x-axis. Defaults for common choices of x-axis will be supplied if no x-label override is supplied here.
- x_rng
Specify an x-axis range of the form [min_value, max_value] to override the default range.
- y_label
Label to apply to y-axis.
- y_rng
Specify a y-axis range of the form [min_value, max_value] to override the default range.
- color_by
Name of column in metrics.curves that specifies which values to use when coloring points along the cost curve.
- cbar_rng
Specify a color bar range of the form [min_value, max_value] to override the default range.
- cbar_label
Custom label to apply to the color bar. If None is supplied, the default (“Fraction Flagged”) will be used.
- grid
Whether to plot grid lines.
- dpi
Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
- bootstrapped
Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
- bootstrap_alpha
Opacity of bootstrap curves.
- bootstrap_color
Color of bootstrap curves.
- imputed
Whether to plot imputed curves.
- return_fig
If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
- Returns
- Optional[Tuple[plt.figure, plt.axes]]
The plot’s figure and axis object.
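To make the quantity on the y-axis concrete, here is a minimal sketch of one way a misclassification cost could be computed per threshold. It assumes that cost = fn_cost_multiplier × FN + fp_cost_multiplier × FP and that a case is flagged when its score is at or above the threshold. Both the formula and the helper name cost_at_thresholds are assumptions made for illustration; the documentation above does not state how compute_cost is implemented.

```python
from typing import List, Sequence, Tuple


def cost_at_thresholds(
    labels: Sequence[int],
    scores: Sequence[float],
    thresholds: Sequence[float],
    fn_cost_multiplier: float = 1.0,
    fp_cost_multiplier: float = 1.0,
) -> List[Tuple[float, float]]:
    """Return (threshold, cost) pairs for a binary classifier.

    Hypothetical helper, not part of clscurves. Assumed formula:
    cost = fn_cost_multiplier * FN + fp_cost_multiplier * FP,
    where a case is flagged when score >= threshold.
    """
    curve = []
    for thresh in thresholds:
        # False negatives: positive cases that were not flagged.
        fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < thresh)
        # False positives: negative cases that were flagged.
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= thresh)
        curve.append((thresh, fn_cost_multiplier * fn + fp_cost_multiplier * fp))
    return curve


labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]
# Missing a positive is treated as five times as costly as a false alarm.
cost_curve = cost_at_thresholds(labels, scores, [0.2, 0.5, 0.8], fn_cost_multiplier=5.0)
best_thresh, best_cost = min(cost_curve, key=lambda pair: pair[1])
```

With a high false-negative multiplier, the lowest-cost threshold in this toy data is the most permissive one, which is the kind of trade-off a cost curve is meant to expose.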
-
clscurves.plotter.dist module¶
-
class clscurves.plotter.dist.DistPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool, reverse_thresh: bool)¶
Bases: clscurves.plotter.plotter.MetricsPlotter
Methods
plot_dist([weighted, label, kind, …]) Plot the data distribution.
plot_cdf
plot_pdf
-
__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool, reverse_thresh: bool) → None¶ Initialize self. See help(type(self)) for accurate signature.
-
plot_cdf(**kwargs) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶
-
plot_dist(weighted: bool = False, label: Literal[all, 0, 1, None] = 'all', kind: str = 'CDF', kernel_size: float = 10, log_scale: bool = False, title: Optional[str] = None, cmap: str = 'rainbow', color_by: str = 'recall', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, x_rng: Optional[List[float]] = None, y_rng: Optional[List[float]] = None, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶ Plot the data distribution.
This plots either the CDF (Cumulative Distribution Function) or PDF (Probability Density Function) curve.
- Parameters
- weighted
Specifies whether the weighted or unweighted fraction flagged should be used when computing the CDF or PDF. If unweighted, the fraction flagged is the number of cases flagged divided by the number of cases total. If weighted, it is the sum of the weights of all the cases flagged, divided by the sum of the weights of all the cases.
- label
Class label to plot the CDF or PDF for; one of “all”, 1, 0, or None.
- kind
Either “cdf” or “pdf”.
- kernel_size
Used for PDF only: standard deviation of the Gaussian kernel used to smooth the PDF curve.
- log_scale
Boolean to specify whether the x-axis should be log-scaled.
- title
Title of plot.
- cmap
Colormap string specification.
- color_by
Name of key in metrics.curves that specifies which values to use when coloring points along the PDF or CDF curve.
- cbar_rng
Specify a color bar range of the form [min_value, max_value] to override the default range.
- cbar_label
Custom label to apply to the color bar. If None is supplied, a default will be selected from the cbar_dict.
- grid
Whether to plot grid lines.
- x_rng
Range of the horizontal axis.
- y_rng
Range of the vertical axis.
- dpi
Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
- bootstrapped
Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
- bootstrap_alpha
Opacity of bootstrap curves.
- bootstrap_color
Color of bootstrap curves.
- imputed
Whether to plot imputed curves.
- return_fig
If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
- Returns
- Optional[Tuple[plt.figure, plt.axes]]
The plot’s figure and axis object.
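The weighted and unweighted definitions of fraction flagged given under the weighted parameter can be sketched as follows. The helper name fraction_flagged and the convention that a case is flagged when its score is at or above the threshold are assumptions for this example, not part of the documented API.

```python
from typing import Optional, Sequence


def fraction_flagged(
    scores: Sequence[float],
    thresh: float,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Fraction of cases flagged at `thresh` (hypothetical helper).

    Unweighted: number of cases flagged / number of cases total.
    Weighted: sum of weights of flagged cases / sum of all weights.
    """
    if weights is None:
        # Unweighted is the special case where every case has weight 1.
        weights = [1.0] * len(scores)
    flagged = sum(w for s, w in zip(scores, weights) if s >= thresh)
    return flagged / sum(weights)


scores = [0.2, 0.4, 0.6, 0.8]
weights = [1.0, 1.0, 2.0, 4.0]
unweighted = fraction_flagged(scores, 0.5)        # 2 of 4 cases
weighted = fraction_flagged(scores, 0.5, weights) # 6 of 8 weight units
```

Evaluating this fraction over a sweep of thresholds gives a cumulative curve over the scores; the weighted variant shifts that curve toward the cases that carry the most weight.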
-
plot_pdf(**kwargs) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶
-
clscurves.plotter.plotter module¶
-
class clscurves.plotter.plotter.MetricsPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶
Bases: clscurves.config.MetricsAliases
A helper class to provide methods shared by each metrics plotter.
These methods streamline the process of making a single classification curve metrics plot, making a bootstrapped plot, and adding a confidence ellipse to a specified operating point.
-
__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶ Initialize self. See help(type(self)) for accurate signature.
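The class description mentions adding a confidence ellipse to an operating point but does not say how the ellipse is derived. One common approach, shown here purely as an assumption, is to treat the bootstrap replicates of the operating point as bivariate normal and take the ellipse from their sample covariance. The helper name confidence_ellipse and the sample numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def confidence_ellipse(
    xs: Sequence[float], ys: Sequence[float], level: float = 0.95
) -> Tuple[Tuple[float, float], float, float, float]:
    """Return (center, semi_major, semi_minor, angle_radians).

    Hypothetical helper: fits a bivariate-normal confidence ellipse to
    bootstrap replicates of a single operating point.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    lam_major = half_trace + radius
    lam_minor = max(half_trace - radius, 0.0)  # guard against round-off
    # The chi-square quantile with 2 degrees of freedom is -2 * ln(1 - level).
    chi2 = -2.0 * math.log(1.0 - level)
    # Orientation of the major axis relative to the x-axis.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), math.sqrt(chi2 * lam_major), math.sqrt(chi2 * lam_minor), angle


# Made-up bootstrap replicates of (recall, precision) at one threshold.
recalls = [0.70, 0.72, 0.68, 0.71, 0.69]
precisions = [0.80, 0.78, 0.82, 0.79, 0.81]
center, semi_major, semi_minor, angle = confidence_ellipse(recalls, precisions)
```

In this toy data recall and precision move in exactly opposite directions, so the ellipse collapses to a thin shape tilted at roughly -45 degrees.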
-
clscurves.plotter.pr module¶
-
class clscurves.plotter.pr.PRPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶
Bases: clscurves.plotter.plotter.MetricsPlotter
Methods
plot_pr([weighted, title, cmap, color_by, …]) Plot the PR (Precision & Recall) curve.
-
__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶ Initialize self. See help(type(self)) for accurate signature.
-
plot_pr(weighted: bool = False, title: str = 'Precision-Recall Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, f1_contour: bool = False, grid: bool = True, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶ Plot the PR (Precision & Recall) curve.
- Parameters
- weighted
Specifies whether the weighted or unweighted recall (i.e. TPR) should be used. For example, recall (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the PR curve, this weighting applies only to the recall axis.
- title
Title of plot.
- cmap
Colormap string specification.
- color_by
Name of key in metrics.curves that specifies which values to use when coloring points along the PR curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.
- cbar_rng
Specify a color bar range of the form [min_value, max_value] to override the default range.
- cbar_label
Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.
- dpi
Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
- bootstrapped
Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
- bootstrap_alpha
Opacity of bootstrap curves.
- bootstrap_color
Color of bootstrap curves.
- imputed
Whether to plot imputed curves.
- f1_contour
Whether to include reference contours for curves of constant F1.
- grid
Whether to plot grid lines.
- op_value
Threshold value to plot a confidence ellipse for when the plot is bootstrapped.
- return_fig
If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
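The description of weighted above says the weights affect only the recall axis of the PR curve. A minimal sketch of that behavior at a single threshold is given below. The helper name precision_recall, the flagging convention (score at or above the threshold), and the sample amounts are assumptions for illustration.

```python
from typing import Optional, Sequence, Tuple


def precision_recall(
    labels: Sequence[int],
    scores: Sequence[float],
    thresh: float,
    weights: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return (precision, recall) at `thresh` (hypothetical helper).

    Precision is always count-based; recall uses `weights` when given.
    Assumes at least one case is flagged and at least one positive exists.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    flagged = [s >= thresh for s in scores]
    # Precision: positives among flagged cases, by count.
    tp_count = sum(1 for y, f in zip(labels, flagged) if y == 1 and f)
    precision = tp_count / sum(flagged)
    # Recall: weight of captured positives over weight of all positives.
    captured = sum(w for y, f, w in zip(labels, flagged, weights) if y == 1 and f)
    total_pos = sum(w for y, w in zip(labels, weights) if y == 1)
    return precision, captured / total_pos


labels = [1, 0, 1, 1]
scores = [0.9, 0.8, 0.4, 0.7]
amounts = [10.0, 1.0, 30.0, 10.0]  # e.g. a monetary amount per case
p_unw, r_unw = precision_recall(labels, scores, 0.5)
p_w, r_w = precision_recall(labels, scores, 0.5, amounts)
```

Here the missed positive carries the largest amount, so weighted recall is lower than unweighted recall while precision is unchanged.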
-
clscurves.plotter.prg module¶
-
class clscurves.plotter.prg.PRGPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶
Bases: clscurves.plotter.plotter.MetricsPlotter
Plot the PRG (Precision-Recall-Gain) curve.
“Precision-Recall-Gain” plots defined here.
A “weighted recall” variant of Recall Gain is not discussed in the paper, and may not have any theoretical guarantees or reasonable interpretations. Because it’s unclear how a positive-example weight should affect the quantities in a PRG plot, we do not support weighted plotting at this time.
Methods
plot_prg([title, cmap, color_by, cbar_rng, …]) Plot the PRG (Precision-Recall-Gain) curve.
-
__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶ Initialize self. See help(type(self)) for accurate signature.
-
plot_prg(title: str = 'Precision-Recall-Gain Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶ Plot the PRG (Precision-Recall-Gain) curve.
- Parameters
- title
Title of plot.
- cmap
Colormap string specification.
- color_by
Name of key in metrics.curves that specifies which values to use when coloring points along the PRG curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.
- cbar_rng
Specify a color bar range of the form [min_value, max_value] to override the default range.
- cbar_label
Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.
- grid
Whether to plot grid lines.
- dpi
Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
- bootstrapped
Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
- bootstrap_alpha
Opacity of bootstrap curves.
- bootstrap_color
Color of bootstrap curves.
- imputed
Whether to plot imputed curves.
- op_value
Threshold value to plot a confidence ellipse for when the plot is bootstrapped.
- return_fig
If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
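This section refers to the Precision-Recall-Gain definitions without restating them. As a reference point, the commonly cited gain transform (Flach and Kull, 2015) rescales precision and recall relative to the positive-class prevalence π: gain(v) = (v − π) / ((1 − π) · v). The sketch below applies that formula; the function name gain is hypothetical, and whether clscurves computes the values in exactly this way is not stated here.

```python
def gain(value: float, pi: float) -> float:
    """Precision gain or recall gain for a positive-class prevalence `pi`.

    Hypothetical helper implementing gain = (value - pi) / ((1 - pi) * value).
    A value equal to the prevalence maps to 0; a perfect value of 1 maps to 1.
    Requires value > 0 and 0 < pi < 1.
    """
    return (value - pi) / ((1.0 - pi) * value)


pi = 0.2  # 20% of cases are positive
precision_gain = gain(0.5, pi)  # precision 0.5
recall_gain = gain(0.8, pi)     # recall 0.8
baseline = gain(pi, pi)         # the always-positive classifier
perfect = gain(1.0, pi)
```

Values of precision or recall below the prevalence produce negative gains, which is why PRG plots are usually restricted to the unit square.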
-
clscurves.plotter.rf module¶
-
class clscurves.plotter.rf.RFPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶
Bases: clscurves.plotter.plotter.MetricsPlotter
Methods
plot_rf([weighted, scale, title, cmap, …]) Plot the RF (Recall & Fraction Flagged) curve.
-
__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶ Initialize self. See help(type(self)) for accurate signature.
-
plot_rf(weighted: bool = False, scale: float = 1.0, title: str = 'Recall-Fraction Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶ Plot the RF (Recall & Fraction Flagged) curve.
- Parameters
- weighted
Specifies whether the weighted or unweighted TPR (i.e. recall) should be used. For example, TPR (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the RF curve, this weighting applies only to the TPR axis.
- scale
Value to scale the x-axis by. By choosing a lower number like 0.1, you can observe greater detail in sensitive regions of the RF curve.
- title
Title of plot.
- cmap
Colormap string specification.
- color_by
Name of key in metrics.curves that specifies which values to use when coloring points along the RF curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.
- cbar_rng
Specify a color bar range of the form [min_value, max_value] to override the default range.
- cbar_label
Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.
- grid
Whether to plot grid lines.
- dpi
Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
- bootstrapped
Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
- bootstrap_alpha
Opacity of bootstrap curves.
- bootstrap_color
Color of bootstrap curves.
- imputed
Whether to plot imputed curves.
- op_value
Threshold value to plot a confidence ellipse for when the plot is bootstrapped.
- return_fig
If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
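To illustrate what an RF curve relates, the sketch below builds unweighted (fraction flagged, recall) points by flagging cases in order of descending score. The helper name rf_curve is hypothetical, and the example assumes there are no tied scores and at least one positive case.

```python
from typing import List, Sequence, Tuple


def rf_curve(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (fraction_flagged, recall) points, flagging highest scores first.

    Hypothetical helper; assumes no tied scores and at least one positive.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    points, captured = [], 0
    for rank, i in enumerate(order, start=1):
        captured += labels[i]  # positives captured among the top `rank` cases
        points.append((rank / len(scores), captured / total_pos))
    return points


labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.1]
points = rf_curve(labels, scores)
```

Reading the result, flagging the top 60% of cases captures every positive in this toy data, while flagging the top 20% captures half of them.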
-
clscurves.plotter.roc module¶
-
class clscurves.plotter.roc.ROCPlotter(metrics: clscurves.utils.MetricsResult, score_is_probability: bool)¶
Bases: clscurves.plotter.plotter.MetricsPlotter
Methods
plot_roc([weighted, title, cmap, color_by, …]) Plot the ROC (Receiver Operating Characteristic) curve.
-
__init__(metrics: clscurves.utils.MetricsResult, score_is_probability: bool) → None¶ Initialize self. See help(type(self)) for accurate signature.
-
plot_roc(weighted: bool = False, title: str = 'ROC Curve', cmap: str = 'rainbow', color_by: str = 'thresh', cbar_rng: Optional[List[float]] = None, cbar_label: Optional[str] = None, grid: bool = True, dpi: Optional[int] = None, bootstrapped: bool = False, bootstrap_alpha: float = 0.15, bootstrap_color: str = 'black', imputed: bool = False, op_value: Optional[float] = None, return_fig: bool = False) → Optional[Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes]]¶ Plot the ROC (Receiver Operating Characteristic) curve.
- Parameters
- weighted
Specifies whether the weighted or unweighted TPR and FPR should be used. For example, TPR (= tp/pos), if unweighted, is the number of positive cases captured above a threshold, divided by the total number of positive cases. If weighted, it is the sum of weights (or “amounts”) associated with each positive case captured above a threshold, divided by the sum of weights associated with all positive cases. For the ROC curve, this weighting applies to both the TPR and FPR axes.
- title
Title of plot.
- cmap
Colormap string specification.
- color_by
Name of key in metrics.curves that specifies which values to use when coloring points along the ROC curve; this should be either “frac” for fraction of cases flagged or “thresh” for score discrimination threshold.
- cbar_rng
Specify a color bar range of the form [min_value, max_value] to override the default range.
- cbar_label
Custom label to apply to the color bar. If None is supplied, the default (“Threshold Value” or “Fraction Flagged”, depending on the color_by value) will be used.
- grid
Whether to plot grid lines.
- dpi
Resolution in “dots per inch” of resulting figure. If not specified, the Matplotlib default will be used. A good rule of thumb is 150 for good quality at normal screen resolutions and 300 for high quality that maintains sharp features after zooming in.
- bootstrapped
Specifies whether bootstrapped curves should be plotted behind the main colored performance scatter plot.
- bootstrap_alpha
Opacity of bootstrap curves.
- bootstrap_color
Color of bootstrap curves.
- imputed
Whether to plot imputed curves.
- op_value
Threshold value to plot a confidence ellipse for when the plot is bootstrapped.
- return_fig
If set to True, will return (fig, ax) as a tuple instead of plotting the figure.
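Unlike the PR and RF curves, the ROC curve applies the weights to both axes. A minimal sketch of a single ROC point under that rule follows. The helper name roc_point, the flagging convention (score at or above the threshold), and the sample weights are assumptions for illustration.

```python
from typing import Optional, Sequence, Tuple


def roc_point(
    labels: Sequence[int],
    scores: Sequence[float],
    thresh: float,
    weights: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return (fpr, tpr) at `thresh`, weighting both rates when weights are given.

    Hypothetical helper; assumes both classes are present.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    tp = fp = pos = neg = 0.0
    for y, s, w in zip(labels, scores, weights):
        flagged = s >= thresh
        if y == 1:
            pos += w
            tp += w if flagged else 0.0
        else:
            neg += w
            fp += w if flagged else 0.0
    return fp / neg, tp / pos


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
weights = [3.0, 1.0, 1.0, 3.0]
fpr_unw, tpr_unw = roc_point(labels, scores, 0.5)
fpr_w, tpr_w = roc_point(labels, scores, 0.5, weights)
```

In this example the weights favor the correctly ranked cases, so the weighted point sits closer to the top-left corner than the unweighted one.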
-