Quickstart
CLI
Since version 1.0.10, a suggested analysis pipeline is available as a CLI tool to make the library easier to use. To use it, create a script like the following and run it:
from scludam import launch

if __name__ == "__main__":
    launch()
From there, you can download data from GAIA, load a file, run the analysis pipeline, plot and save results.
NOTES:
For plots to display correctly, the corresponding GUI library, such as PyQt6, must be installed in your environment.
The CLI tool is still rudimentary and will exit on some incorrect inputs, so follow the directions of the prompts carefully.
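As a quick way to act on the plotting note above, the following sketch checks whether a Qt binding can be imported in the current environment. It uses only the standard library. PyQt6 is the package named in the note; the other names in the list and the helper `installed_qt_bindings` are assumptions added for illustration, not part of scludam.

```python
from importlib.util import find_spec

# PyQt6 is the binding named in the note above. The other entries are
# assumptions: alternative Qt bindings that plotting libraries commonly accept.
QT_BINDINGS = ("PyQt6", "PySide6", "PyQt5", "PySide2")


def installed_qt_bindings(candidates=QT_BINDINGS):
    """Return the candidate packages that can be imported in this environment."""
    return [name for name in candidates if find_spec(name) is not None]


if __name__ == "__main__":
    found = installed_qt_bindings()
    if found:
        print("Qt bindings available:", ", ".join(found))
    else:
        print("No Qt binding found; consider installing one, e.g. PyQt6.")
```

If the list comes back empty, installing one of the bindings before launching the CLI is a reasonable first step when plot windows do not appear.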
Main functionality example
import matplotlib.pyplot as plt

from scludam import DEP, CountPeakDetector, HopkinsTest, Query, RipleysKTest

# search some data from GAIA
# with error and correlation
# columns for better KDE
df = (
    Query()
    .select(
        "ra",
        "dec",
        "ra_error",
        "dec_error",
        "ra_dec_corr",
        "pmra",
        "pmra_error",
        "ra_pmra_corr",
        "dec_pmra_corr",
        "pmdec",
        "pmdec_error",
        "ra_pmdec_corr",
        "dec_pmdec_corr",
        "pmra_pmdec_corr",
        "parallax",
        "parallax_error",
        "parallax_pmra_corr",
        "parallax_pmdec_corr",
        "ra_parallax_corr",
        "dec_parallax_corr",
        "phot_g_mean_mag",
    )
    # search for this identifier in simbad
    # and bring data in a circle of radius
    # 1/2 degree
    .where_in_circle("ngc2168", 0.5)
    .where(("parallax", ">", 0))
    .where(("phot_g_mean_mag", "<", 18))
    # include some common criteria
    # for data precision
    .where_arenou_criterion()
    .where_aen_criterion()
    .get()
    .to_pandas()
)

# If data already has been downloaded, you can load it from a file:
# > from astropy.table.table import Table
# > df = Table.read("path_to_my_file/ngc2168_data.fits").to_pandas()


# Build Detection-Estimation Pipeline
dep = DEP(
    # Detector configuration for the detection step
    detector=CountPeakDetector(
        bin_shape=[0.3, 0.3, 0.07],
        min_score=3,
        min_count=5,
    ),
    det_cols=["pmra", "pmdec", "parallax"],
    sample_sigma_factor=2,
    # Clusterability test configuration
    tests=[
        RipleysKTest(pvalue_threshold=0.05, max_samples=100),
        HopkinsTest(),
    ],
    test_cols=[["ra", "dec"]] * 2,
    # Membership columns to use
    mem_cols=["pmra", "pmdec", "parallax", "ra", "dec"],
).fit(df)

# plot the results
dep.scatterplot(["pmra", "pmdec"])
# zoom on interesting area
plt.axis([-1, 3, -5, 1])
plt.show()

# write results to file
dep.write("ngc2168_result.fits")
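
To give some intuition for the bin_shape argument passed to CountPeakDetector in the example above, the following standard-library sketch counts stars in fixed-width bins over (pmra, pmdec, parallax). It is a simplified illustration only, not scludam's detection algorithm. The toy data, the helper `bin_index`, and the reading of bin_shape as one bin width per entry of det_cols are assumptions made for this sketch.

```python
from collections import Counter
from math import floor

# Hypothetical toy data: one (pmra, pmdec, parallax) tuple per star.
stars = [
    (2.31, -2.95, 1.13),
    (2.25, -2.88, 1.15),
    (2.38, -2.91, 1.16),
    (-0.40, 0.75, 0.45),
    (2.29, -2.99, 1.14),
]

# Same values as bin_shape in the example; assumed here to be
# one bin width per detection column.
bin_shape = (0.3, 0.3, 0.07)


def bin_index(point, shape):
    """Return the integer index of the bin that contains the point."""
    return tuple(floor(value / width) for value, width in zip(point, shape))


# Count how many stars fall in each occupied bin.
counts = Counter(bin_index(star, bin_shape) for star in stars)

# The most populated bin is the simplest candidate for an overdensity.
densest_bin, densest_count = counts.most_common(1)[0]
print(densest_bin, densest_count)
```

In this toy sample, four stars share similar proper motion and parallax and fall in the same bin, while the fifth lands elsewhere. Narrower bins separate nearby groups more sharply but leave fewer stars in each bin.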

Documentation quick guide
Building queries for GAIA catalogues and retrieving data:
Query
Detection and membership estimation pipeline:
DEP
Detection method:
CountPeakDetector
Clusterability tests:
stat_tests
Clustering method:
SHDBSCAN
Probability Estimation:
DBME
Kernel Density Estimation with per-observation or per-dimension bandwidth, plugin or rule-of-thumb methods:
HKDE
Documentation module guide
Query building, SIMBAD object searching and data related functionality:
fetcher
Detection and membership estimation pipeline:
pipeline
Detection:
CountPeakDetector
Clusterability tests:
stat_tests
Clustering:
shdbscan
Probability estimation:
membership
Kernel Density Estimation:
hkde
Utils such as GAIA column names interpretation and one hot encoding:
utils
Custom plotting functions:
plots
Utils for R communication:
rutils
Utils for masking data:
masker
Useful distributions for data generation:
synthetic