scludam.plots module

Module for helper plotting functions.

scludam.plots.color_from_proba(proba: ndarray[Any, dtype[number]], palette: str)

Create color list from palette and probabilities.

It desaturates the colors according to the probabilities.

Parameters:
  • proba (Numeric2DArray) – Membership probability array of shape (n_points, n_classes).

  • palette (str) – Name of seaborn palette.

Returns:

  • List – Color list of length n_points where each point has a color according to the class it belongs to.

  • List – Desaturated color list of length n_points where each point has a color according to the class it belongs to. The saturation is higher if the probability is closer to 1 and lower if it is closer to 1 / n_classes.

  • List – Color list of length n_classes, defining a color for each class.
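
The desaturation rule described above can be sketched with the standard-library colorsys module. This is only an illustration of the idea, not scludam's implementation: the helper name desaturate_by_proba and the linear mapping from probability to saturation are assumptions.

```python
import colorsys


def desaturate_by_proba(rgb, proba, n_classes):
    """Scale the saturation of an RGB color by a membership probability.

    Hypothetical helper: full saturation is kept when proba == 1 and the
    color becomes gray when proba == 1 / n_classes. The linear mapping in
    between is an assumption made for illustration.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    floor = 1.0 / n_classes
    # Map [1 / n_classes, 1] linearly onto [0, 1], clamping values outside.
    weight = min(max((proba - floor) / (1.0 - floor), 0.0), 1.0)
    hue, lightness, saturation = colorsys.rgb_to_hls(*rgb)
    return colorsys.hls_to_rgb(hue, lightness, saturation * weight)


red = (1.0, 0.0, 0.0)
confident = desaturate_by_proba(red, proba=1.0, n_classes=2)
uncertain = desaturate_by_proba(red, proba=0.5, n_classes=2)
```

With two classes, a probability of 1 leaves the color unchanged, while a probability of 0.5 (that is, 1 / n_classes) removes all saturation and yields a gray of the same lightness.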

scludam.plots.scatter3dprobaplot(data: ndarray[Any, dtype[number]] | DataFrame, proba: ndarray[Any, dtype[number]], cols: List[str] | None = None, x: int = 0, y: int = 1, z: int = 2, palette: str = 'viridis', desaturate: bool = True, **kwargs)

Create a 3D probability plot.

It represents the provided data in x, y and z. It passes kwargs to matplotlib scatter3D [1].

Parameters:
  • data (Union[Numeric2DArray, pd.DataFrame]) – Data to be plotted.

  • proba (Numeric2DArray) – Array of membership probabilities, of shape (n_points, n_classes).

  • cols (List[str], optional) – List of ordered column names, by default None. Used if data is provided as numpy array.

  • x (int, optional) – Index of the x variable, by default 0.

  • y (int, optional) – Index of the y variable, by default 1.

  • z (int, optional) – Index of the z variable, by default 2.

  • palette (str, optional) – Seaborn palette string, by default “viridis”

  • desaturate (bool, optional) – If True, desaturate colors according to probability, by default True.

Returns:

Plot of the clustering results.

Return type:

matplotlib.collections.PathCollection

Raises:

ValueError – If data has fewer than 3 columns.
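
How the x, y and z indices select the plotted variables can be sketched with plain lists. The helper pick_xyz is hypothetical and is not scludam's code. The fallback column names and the example column names are made up for illustration, and the error message is an assumption.

```python
def pick_xyz(rows, x=0, y=1, z=2, cols=None):
    """Return a dict mapping each selected column name to its values.

    Hypothetical helper for illustration. `rows` is a list of equal-length
    numeric rows; `cols` optionally names the columns in order.
    """
    n_cols = len(rows[0]) if rows else 0
    if n_cols < 3:
        raise ValueError("data must have at least 3 columns")
    if cols is None:
        # Fallback names when none are given (an assumption of this sketch).
        cols = [f"var{i}" for i in range(n_cols)]
    # Each index selects one column of the data by position.
    return {cols[i]: [row[i] for row in rows] for i in (x, y, z)}


rows = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
picked = pick_xyz(rows, x=0, y=2, z=3, cols=["pmra", "pmdec", "parallax", "ra"])
```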

scludam.plots.surfprobaplot(data: DataFrame | ndarray[Any, dtype[number]], proba: ndarray[Any, dtype[number]], x: int = 0, y: int = 1, palette: str = 'viridis', cols: List[str] | None = None, **kwargs)

Create surface 3D probability plot.

It represents the provided data in x and y. It passes kwargs to matplotlib plot_trisurf [2].

Parameters:
  • data (Union[pd.DataFrame, Numeric2DArray]) – Data to be plotted.

  • proba (Numeric2DArray) – Membership probability array.

  • x (int, optional) – Index of the x variable, by default 0

  • y (int, optional) – Index of the y variable, by default 1

  • palette (str, optional) – Seaborn palette string, by default “viridis”

  • cols (List[str], optional) – List of ordered column names, by default None.

Returns:

Plot of the clustering results.

Return type:

matplotlib.collections.PathCollection

Raises:
  • ValueError – If data has fewer than 2 columns.

  • ValueError – If x or y parameters are invalid.

scludam.plots.pairprobaplot(data: ndarray[Any, dtype[number]] | DataFrame, proba: ndarray[Any, dtype[number]], labels: ndarray[Any, dtype[number]], cols: List[str] | None = None, palette: str = 'viridis_r', diag_kind: str = 'kde', diag_kws: dict | None = None, plot_kws: dict | None = None, **kwargs)

Pairplot of the data and the membership probabilities.

It passes kwargs, diag_kws and plot_kws to seaborn pairplot [3] function.

Parameters:
  • data (Union[Numeric2DArray, pd.DataFrame]) – Data to be plotted.

  • proba (Numeric2DArray) – Membership probability array.

  • labels (Numeric1DArray) – Labels of the data.

  • cols (List[str], optional) – Column names, by default None

  • palette (str, optional) – Seaborn palette, by default “viridis_r”

  • diag_kind (str, optional) – Kind of plot for diagonal, by default “kde”. Valid values are “hist” and “kde”.

  • diag_kws (dict, optional) – Additional arguments for diagonal plots, by default None

  • plot_kws (dict, optional) – Additional arguments for off-diagonal plots, by default None

Returns:

Pairplot.

Return type:

seaborn.PairGrid

Raises:

ValueError – Invalid diag_kind.

scludam.plots.tsneprobaplot(data: DataFrame | ndarray[Any, dtype[number]], labels: ndarray[Any, dtype[number]], proba: ndarray[Any, dtype[number]], **kwargs)

Plot of data and membership probabilities using t-SNE projection.

It passes kwargs to seaborn scatterplot [4] function.

Parameters:
  • data (Union[pd.DataFrame, Numeric2DArray]) – Data to be plotted.

  • labels (Numeric1DArray) – Labels of the data.

  • proba (Numeric2DArray) – Membership probability array.

Returns:

T-SNE projected plot.

Return type:

matplotlib.axes.Axes

scludam.plots.heatmap2D(hist2D: ndarray[Any, dtype[number]], edges: ndarray[Any, dtype[ScalarType]] | List | Tuple, bin_shape: ndarray[Any, dtype[ScalarType]] | List | Tuple, index: ndarray[Any, dtype[ScalarType]] | List | Tuple | None = None, annot: bool = True, annot_prec: int = 2, annot_threshold: Number = 0.1, ticks: bool = True, tick_prec: int = 2, **kwargs)

Create a heatmap from a 2D histogram.

Also marks index if provided. Create ticklabels from bin centers and not from bin indices. kwargs are passed to seaborn.heatmap [5].

Parameters:
  • hist2D (NumericArray) – Histogram.

  • edges (ArrayLike) – Edges.

  • bin_shape (ArrayLike) – Bin shape of the histogram.

  • index (ArrayLike, optional) – Index to be marked, by default None

  • annot (bool, optional) – Use default annotations, by default True. If true, annotations are created taking into account the rest of annot parameters.

  • annot_prec (int, optional) – Annotation number precision, by default 2

  • annot_threshold (Number, optional) – Only annotate cells with values bigger than annot_threshold, by default 0.1

  • ticks (bool, optional) – Create ticklabels from the bin centers, by default True

  • tick_prec (int, optional) – Ticklabels number precision, by default 2

Returns:

Heatmap. To get the figure, call get_figure() on the returned axes, e.g. fig = ax.get_figure(), where ax is the value returned by heatmap2D.

Return type:

matplotlib.axes._subplots.AxesSubplot
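
The tick-label and annotation rules described for this function can be illustrated without plotting. The sketch below is not scludam's implementation: the helper names bin_center_labels and cell_annotations are hypothetical, and it assumes that edges holds the bin edges along one axis.

```python
def bin_center_labels(edges, prec=2):
    """Format the midpoint of each pair of consecutive edges as a tick label."""
    return [f"{(lo + hi) / 2:.{prec}f}" for lo, hi in zip(edges, edges[1:])]


def cell_annotations(hist, threshold=0.1, prec=2):
    """Return one label per cell, left blank unless the value exceeds threshold."""
    return [
        [f"{value:.{prec}f}" if value > threshold else "" for value in row]
        for row in hist
    ]


edges = [0.0, 0.5, 1.0]  # two bins along one axis
hist = [[0.05, 0.2], [1.0, 0.0]]  # 2 x 2 histogram
ticks = bin_center_labels(edges)
labels = cell_annotations(hist)
```

Here the ticks are the bin centers 0.25 and 0.75 rather than the bin indices 0 and 1, and only the two cells above the threshold receive an annotation.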

scludam.plots.univariate_density_plot(x: ndarray[Any, dtype[number]], y: ndarray[Any, dtype[number]], ax: Axes | None = None, figure: Figure | None = None, figsize: Tuple[int, int] = (8, 6), grid: bool = True, **kwargs)

Plot univariate density plot.

Create a filled lineplot given the densities for x. kwargs are passed to matplotlib scatter plot [6].

Parameters:
  • x (Numeric1DArray) – X linspace.

  • y (Numeric1DArray) – Densities.

  • ax (Optional[Axes], optional) – Ax to plot, by default None

  • figure (Optional[Figure], optional) – Figure to plot, by default None

  • figsize (Tuple[int, int], optional) – Figure size, by default (8, 6)

  • grid (bool, optional) – Add grid, by default True

Returns:

Axes of the plot.

Return type:

matplotlib.axes.Axes

scludam.plots.bivariate_density_plot(x: ndarray[Any, dtype[number]], y: ndarray[Any, dtype[number]], z: ndarray[Any, dtype[number]], levels: int | None = None, contour_color: str = 'black', ax: Axes | None = None, figure: Figure | None = None, figsize: Tuple[int, int] = (8, 6), colorbar: bool = True, title: str | None = None, title_size: int = 16, grid: bool = True, **kwargs)

Create a bivariate density plot.

Create a heatmap like density plot given densities in x and y. kwargs are passed to matplotlib imshow [7].

Parameters:
  • x (Numeric1DArray) – X linspace.

  • y (Numeric1DArray) – Y linspace.

  • z (Numeric1DArray) – Densities in x and y.

  • levels (int, optional) – Number of levels to draw contour, by default None

  • contour_color (str, optional) – Color to draw contour, by default “black”

  • ax (Optional[Axes], optional) – Ax to plot, by default None

  • figure (Optional[Figure], optional) – Figure to plot, by default None

  • figsize (Tuple[int, int], optional) – Figure size, by default (8, 6)

  • colorbar (bool, optional) – Add a colorbar, by default True

  • title (Optional[str], optional) – Title to set, by default None

  • title_size (int, optional) – Title size, by default 16

  • grid (bool, optional) – Add grid, by default True

Returns:

  • matplotlib.axes.Axes – Axes of the plot.

  • matplotlib.image.AxesImage – Image of the plot.

scludam.plots.scatter2dprobaplot(data: DataFrame, proba: ndarray, labels: ndarray, cols: List[str] | None = None, palette: str = 'Set1', select_labels: List[int] | int | None = None, select_1: int | None = None, bg_kws: dict = {}, fg_kws: dict = {})

Create a scatter plot with labels and probabilities.

Parameters:
  • data (pd.DataFrame) – dataframe with at least 2 columns.

  • proba (np.ndarray) – Probability array.

  • labels (np.ndarray) – Label array.

  • select_labels (Optional[Union[List[int], int]], optional) – Select labels to plot, by default None. If None, all labels are plotted.

  • select_1 (Optional[int], optional) – Used to select only one of the labels. Only plots that population and the background (noise label -1), by default None.

  • cols (Optional[List[str]], optional) – Axes labels to be used, by default None. If None, the columns of data are used.

  • palette (str, optional) – Palette to be used to choose label colors, by default “Set1”

  • bg_kws (dict, optional) – kwargs to be passed to sns.scatterplot for the background (noise label [-1]) scatter plot, by default {}.

  • fg_kws (dict, optional) – kwargs to be passed to sns.scatterplot for the foreground (labels [0, 1, …]), by default {}.

Returns:

Axes with the plot.

Return type:

Axes

Raises:
  • ValueError – If data has fewer than 2 columns.

  • ValueError – If probability and data have different number of rows.
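
The split between background (noise label -1) and foreground points, together with the effect of select_labels, can be sketched as follows. The helper split_by_label is hypothetical and works on a plain list of labels; it illustrates the parameter semantics described above and is not scludam's code.

```python
def split_by_label(labels, select_labels=None):
    """Return (background, foreground) lists of row indices.

    Hypothetical helper: label -1 is treated as noise (background). When
    `select_labels` is an int or a list of ints, only those labels are kept
    in the foreground; when it is None, every non-noise label is kept.
    """
    if isinstance(select_labels, int):
        select_labels = [select_labels]
    background = [i for i, label in enumerate(labels) if label == -1]
    foreground = [
        i
        for i, label in enumerate(labels)
        if label != -1 and (select_labels is None or label in select_labels)
    ]
    return background, foreground


labels = [-1, 0, 1, 0, -1, 2]
bg, fg = split_by_label(labels, select_labels=[0, 2])
```

In a plot built this way, the background rows would be drawn first with the bg_kws styling and the foreground rows on top with the fg_kws styling.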

scludam.plots.plot_objects(df: DataFrame, ax: Axes, cols: List[str])

Plot object dataframe in an axis.

Object dataframe refers to a pandas dataframe created from simbad Table result, translated with simbad2gaiacolnames().

Parameters:
  • df (pd.DataFrame) – Dataframe of objects. Must contain at least “MAIN_ID”, “TYPED_ID” and “OTYPE”.

  • ax (Axes) – Axis to plot on.

  • cols (list, optional) – Columns in the object dataframe to plot on the x and y axes of ax.

Returns:

Axis with plotted objects.

Return type:

Axes

Raises:

ValueError – _description_

scludam.plots.plot_kernels(ax, means, covariances, nstd=3, **kwargs)

Plot a collection of 2D Gaussians as ellipses.

Parameters:
  • ax (Axes) – ax to plot on

  • means (np.ndarray) – 2d array of kernel means.

  • covariances (np.ndarray) – 1d array of 2d covariances (3d array)

  • nstd (int, optional) – number of standard deviations to draw contour, by default 3

Returns:

Ax with plotted ellipses.

Return type:

Axes
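
Drawing a 2D Gaussian as an ellipse comes down to the eigen-decomposition of its covariance matrix. The sketch below uses the closed form for a symmetric 2x2 matrix. The helper ellipse_params is hypothetical, and whether scludam computes its ellipses this way is an assumption. The returned width, height and angle (in degrees) follow the convention of matplotlib.patches.Ellipse.

```python
import math


def ellipse_params(cov, nstd=3):
    """Return (width, height, angle_deg) of the nstd-sigma ellipse.

    Hypothetical helper. `cov` is a symmetric 2x2 covariance matrix
    [[a, b], [b, c]] given as nested sequences.
    """
    (a, b), (_, c) = cov
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    # The eigenvalues are the variances along the two principal axes.
    major_var = mean + radius
    minor_var = max(mean - radius, 0.0)  # guard against rounding below zero
    # Orientation of the major axis, counter-clockwise from the x axis.
    angle = math.degrees(0.5 * math.atan2(2.0 * b, a - c))
    # Full axis lengths: nstd standard deviations on each side of the mean.
    width = 2 * nstd * math.sqrt(major_var)
    height = 2 * nstd * math.sqrt(minor_var)
    return width, height, angle


width, height, angle = ellipse_params([[4.0, 0.0], [0.0, 1.0]], nstd=3)
```

For this axis-aligned covariance the standard deviations are 2 and 1, so the 3-sigma ellipse is 12 wide and 6 high with no rotation.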

scludam.plots.horizontal_lineplots(ys: List[ndarray], cols=[], **kwargs)

Plot a list of 1d arrays as horizontal lineplots.

Parameters:

ys (List[np.ndarray]) – List of 1d arrays to plot.

Returns:

Axis with plotted lineplots.

Return type:

Axes