Quickstart

Main functionality example

import matplotlib.pyplot as plt

from scludam import DEP, CountPeakDetector, HopkinsTest, Query, RipleysKTest

# query some data from GAIA, including
# the error and correlation columns,
# which allow for a better KDE
df = (
    Query()
    .select(
        "ra",
        "dec",
        "ra_error",
        "dec_error",
        "ra_dec_corr",
        "pmra",
        "pmra_error",
        "ra_pmra_corr",
        "dec_pmra_corr",
        "pmdec",
        "pmdec_error",
        "ra_pmdec_corr",
        "dec_pmdec_corr",
        "pmra_pmdec_corr",
        "parallax",
        "parallax_error",
        "parallax_pmra_corr",
        "parallax_pmdec_corr",
        "ra_parallax_corr",
        "dec_parallax_corr",
        "phot_g_mean_mag",
    )
    # resolve this identifier in SIMBAD
    # and fetch the sources within a circle
    # of radius 0.5 degrees around it
    .where_in_circle("ngc2168", 0.5)
    .where(("parallax", ">", 0))
    .where(("phot_g_mean_mag", "<", 18))
    # include some common criteria
    # for data precision
    .where_arenou_criterion()
    .where_aen_criterion()
    .get()
    .to_pandas()
)

# If the data has already been downloaded, you can load it from a file:
# > from astropy.table import Table
# > df = Table.read("path_to_my_file/ngc2168_data.fits").to_pandas()


# Build Detection-Estimation Pipeline
dep = DEP(
    # Detector configuration for the detection step
    detector=CountPeakDetector(
        bin_shape=[0.3, 0.3, 0.07],
        min_score=3,
        min_count=5,
    ),
    det_cols=["pmra", "pmdec", "parallax"],
    sample_sigma_factor=2,
    # Clusterability test configuration
    tests=[
        RipleysKTest(pvalue_threshold=0.05, max_samples=100),
        HopkinsTest(),
    ],
    test_cols=[["ra", "dec"]] * 2,
    # Membership columns to use
    mem_cols=["pmra", "pmdec", "parallax", "ra", "dec"],
).fit(df)

# plot the results
dep.scatterplot(["pmra", "pmdec"])
# zoom in on the area of interest
plt.axis([-1, 3, -5, 1])
plt.show()

# write results to file
dep.write("ngc2168_result.fits")
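
To build some intuition for the detection step configured above, the following is a minimal sketch of count-based peak detection on a binned space. It is not scludam's implementation: it only bins synthetic (pmra, pmdec, parallax)-like data with numpy and reports the fullest bin. The synthetic field and clump, and the reuse of the bin_shape values from the example, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for (pmra, pmdec, parallax):
# a broad field population plus a compact clump (all values are made up).
field = rng.normal(loc=[0.0, 0.0, 1.0], scale=[3.0, 3.0, 0.5], size=(2000, 3))
clump_center = np.array([2.3, -2.9, 1.15])
clump = rng.normal(loc=clump_center, scale=[0.1, 0.1, 0.02], size=(200, 3))
data = np.vstack([field, clump])

# Bin widths per dimension, in the same units as the columns.
bin_shape = np.array([0.3, 0.3, 0.07])

# Build bin edges of the requested width that cover the data range.
lo = data.min(axis=0)
hi = data.max(axis=0)
edges = [np.arange(l, h + w, w) for l, h, w in zip(lo, hi, bin_shape)]

# N-dimensional histogram of star counts.
counts, edges = np.histogramdd(data, bins=edges)

# Locate the fullest bin and report its center.
idx = np.unravel_index(np.argmax(counts), counts.shape)
peak_center = np.array([(e[i] + e[i + 1]) / 2 for e, i in zip(edges, idx)])
peak_count = counts[idx]
print(peak_center, peak_count)
```

A real detector also has to compare each bin against an estimate of the local background, which is presumably what parameters such as min_score and min_count control; this sketch skips that step entirely.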

Documentation quick guide

Building queries for GAIA catalogues and retrieving data: Query
Detection and membership estimation pipeline: DEP
Detection method: CountPeakDetector
Clusterability tests: stat_tests
Clustering method: SHDBSCAN
Probability Estimation: DBME
Kernel Density Estimation with per-observation or per-dimension bandwidth, plugin or rule-of-thumb methods: HKDE
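
The idea of a per-observation bandwidth, mentioned in the last entry above, can be illustrated with a small one-dimensional sketch. This is not the HKDE API; the function name kde_per_observation and the sample values are hypothetical. Each observation contributes a Gaussian kernel with its own width, so that noisier measurements are smoothed more.

```python
import numpy as np


def kde_per_observation(x_eval, obs, bandwidths):
    """Gaussian KDE in which observation i uses its own bandwidth bandwidths[i]."""
    x_eval = np.asarray(x_eval, dtype=float)[:, None]  # shape (m, 1)
    obs = np.asarray(obs, dtype=float)[None, :]        # shape (1, n)
    h = np.asarray(bandwidths, dtype=float)[None, :]   # shape (1, n)
    z = (x_eval - obs) / h
    kernels = np.exp(-0.5 * z**2) / (h * np.sqrt(2.0 * np.pi))
    # Average the n kernels so the result integrates to one.
    return kernels.mean(axis=1)


# Hypothetical observations; the last one is assumed to be much noisier.
obs = np.array([-1.0, 0.0, 0.2, 3.0])
bandwidths = np.array([0.2, 0.2, 0.2, 1.0])

grid = np.linspace(-10.0, 12.0, 4401)
density = kde_per_observation(grid, obs, bandwidths)

# Riemann-sum check that the estimate is a proper density.
area = np.sum(density) * (grid[1] - grid[0])
print(area)
```

In several dimensions the scalar bandwidth becomes a covariance matrix per observation, which is where columns such as the errors and correlations requested in the quickstart query would come in.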

Documentation module guide

Query building, SIMBAD object searching and data-related functionality: fetcher
Detection and membership estimation pipeline: pipeline
Detection: CountPeakDetector
Clusterability tests: stat_tests
Clustering: shdbscan
Probability estimation: membership
Kernel Density Estimation: hkde
Utils such as GAIA column name interpretation and one-hot encoding: utils
Custom plotting functions: plots
Utils for R communication: rutils
Utils for masking data: masker
Useful distributions for data generation: synthetic
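
The clusterability tests and synthetic data entries above can be illustrated together with a rough sketch of the Hopkins statistic, computed on made-up data with numpy only. This is not the code behind HopkinsTest or the synthetic module: the function hopkins_statistic, its parameters and the chosen convention (distances raised to the power of the dimension, values near 0.5 for uniform data and near 1 for clustered data) are assumptions, and other conventions exist.

```python
import numpy as np


def hopkins_statistic(points, n_samples=50, seed=0):
    """Hopkins statistic using brute-force nearest-neighbour distances."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    m = min(n_samples, n - 1)

    # w: distance from m sampled real points to their nearest *other* real point.
    chosen = rng.choice(n, size=m, replace=False)
    dist_real = np.linalg.norm(points[chosen, None, :] - points[None, :, :], axis=2)
    dist_real[np.arange(m), chosen] = np.inf  # ignore the self-distance
    w = dist_real.min(axis=1)

    # u: distance from m uniform points in the bounding box to the nearest real point.
    lo, hi = points.min(axis=0), points.max(axis=0)
    uniform = rng.uniform(lo, hi, size=(m, d))
    dist_unif = np.linalg.norm(uniform[:, None, :] - points[None, :, :], axis=2)
    u = dist_unif.min(axis=1)

    return np.sum(u**d) / (np.sum(u**d) + np.sum(w**d))


rng = np.random.default_rng(1)

# Synthetic data: a purely uniform field, and a sparse field with a dense clump.
uniform_field = rng.uniform(0.0, 1.0, size=(500, 2))
clustered = np.vstack([
    rng.uniform(0.0, 1.0, size=(50, 2)),
    rng.normal(loc=0.5, scale=0.02, size=(450, 2)),
])

h_uniform = hopkins_statistic(uniform_field)
h_clustered = hopkins_statistic(clustered)
print(h_uniform, h_clustered)
```

The brute-force distance matrices keep the sketch dependency-free but scale poorly; for larger samples a spatial index would be the usual choice.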