scludam.fetcher module

Module for remote catalog data fetching.

The module provides functions for searching objects and tables and for fetching data from a remote catalog. Data fetching currently supports the Gaia catalogues. Object searching uses the Simbad service.

Examples

from scludam import Query, search_object, search_table

iden = "ngc2527"

# search object to get some general information
object_info = search_object(iden)
print(f"Object found in {object_info.coords}")
object_info.table.write(
    f"examples/{iden}_object_metadata.txt", format="ascii", overwrite=True
)

default_table = Query().table

# search the default table information (gaiadr3.gaia_source by default)
tables = search_table(default_table)
first_table = tables[0]
print(f"name: {first_table.name}\n description: {first_table.description}\n")
first_table.columns.write(
    f"examples/{default_table}_columns_metadata.txt", format="ascii", overwrite=True
)

# create query around object
query = (
    Query()
    .select(
        "ra",
        "dec",
        "ra_error",
        "dec_error",
        "ra_dec_corr",
        "pmra",
        "pmra_error",
        "ra_pmra_corr",
        "dec_pmra_corr",
        "pmdec",
        "pmdec_error",
        "ra_pmdec_corr",
        "dec_pmdec_corr",
        "pmra_pmdec_corr",
        "parallax",
        "parallax_error",
        "parallax_pmra_corr",
        "parallax_pmdec_corr",
        "ra_parallax_corr",
        "dec_parallax_corr",
        "phot_g_mean_mag",
    )
    .where_in_circle(iden, 2.5)
    .where(
        [
            ("parallax", ">", 0.2),
            ("phot_g_mean_mag", "<", 18),
        ]
    )
    .where_arenou_criterion()
    # low noise_sig means "do not rely on excess noise, do not check it"
    # high noise_sig means "you should check that excess noise is small"
    .where_aen_criterion()
)

# count the number of rows that will be received if query is executed
count = query.count()
print(f'Stars found: {count["count_all"][0]}')

# execute query and save data
data = query.get()
data.write(f"examples/{iden}_data.xml", format="votable", overwrite=True)

Options for each function and class in the example are described in the documentation below.

class scludam.fetcher.Config(MAIN_GAIA_TABLE: str = 'gaiadr3.gaia_source', MAIN_GAIA_RA: str = 'ra', MAIN_GAIA_DEC: str = 'dec', ROW_LIMIT: int = -1, MAIN_GAIA_ASTROMETRIC_EXCESS_NOISE: str = 'astrometric_excess_noise', MAIN_GAIA_ASTROMETRIC_EXCESS_NOISE_SIG: str = 'astrometric_excess_noise_sig', MAIN_GAIA_BP_RP: str = 'bp_rp', MAIN_GAIA_BP_RP_EXCESS_FACTOR: str = 'phot_bp_rp_excess_factor')

Bases: object

Class to hold defaults for a query.

class scludam.fetcher.SimbadResult(coords: SkyCoord | None = None, table: Table | None = None)

Bases: object

Class to hold the result of a search_object query.

Variables:
  • table (astropy.table.Table) – The table with the results of the query.

  • coords (astropy.coordinates.SkyCoord) – The coordinates of the object in ICRS system.

scludam.fetcher.search_object(identifier: str, cols: List[str] = ['coordinates', 'parallax', 'propermotions', 'velocity', 'dimensions', 'diameter'], **kwargs)

Search an identifier in Simbad catalogues.

Parameters:
  • identifier (str) – Simbad identifier.

  • cols (List[str], optional) – Columns to be included in the result, by default ["coordinates", "parallax", "propermotions", "velocity", "dimensions", "diameter"]

Returns:

Object with the search results

Return type:

SimbadResult

class scludam.fetcher.TableInfo(name: str, description: str, columns: Table)

Bases: object

Class to hold the result of a search_table query.

Variables:
  • name (str) – Name of the table.

  • columns (astropy.table.Table) – Table with the information of the columns of the table.

  • description (str) – Description of the table.

scludam.fetcher.search_table(search_query: str | None = None, only_names: bool = False, **kwargs)

List available tables in the Gaia catalogue matching a search query.

Parameters:
  • search_query (str, optional) – query to match, by default None

  • only_names (bool, optional) – return only table names and descriptions, by default False

Returns:

List of tables found.

Return type:

List[TableInfo]

class scludam.fetcher.Query(table: str = 'gaiadr3.gaia_source', row_limit: int = -1, columns: list = _Nothing.NOTHING, extra_columns: list = _Nothing.NOTHING, conditions: List[Tuple[str, str, str, str | Number]] = _Nothing.NOTHING, orderby: str | None = None)

Bases: object

Class to hold an ADQL query to be executed.

Variables:
  • table (str) – Name of the table to be queried, by default given by Config.MAIN_GAIA_TABLE

  • row_limit (int) – Maximum number of rows to be returned, by default given by Config.ROW_LIMIT

  • conditions (List[LogicalExpression]) – List of conditions to be applied to the query, by default []

  • columns (List[str]) – List of columns to be returned, by default [], meaning all.

  • extra_columns (List[str]) – List of extra columns to be included in the query given by custom conditions, by default []

  • orderby (str) – Column to be used for ordering.

Notes

It is recommended not to set the attributes of this class manually, except for table.

select(*args: str)

Add columns to query.

Parameters:

*args (str) – Columns to be included in the query.

Returns:

instance of query.

Return type:

Query

top(row_limit: int)

Set the number of rows to be returned.

Parameters:

row_limit (int) – number of rows to be returned.

Returns:

instance of query.

Return type:

Query

where(condition: Tuple[str, str, str | Number] | List[Tuple[str, str, str | Number]])

Add a condition to the query.

Parameters:

condition (Union[Condition, List[Condition]]) – Condition or list of Conditions to be added to the query. Each Condition is a tuple of the form (expression1, operator, expression2): (str, str, Union[str, Number])

Returns:

instance of query.

Return type:

Query

where_or(condition: Tuple[str, str, str | Number] | List[Tuple[str, str, str | Number]])

Add conditions to the query in CNF.

CNF is conjunctive normal form: the given conditions are joined with OR, and the resulting group is joined to the rest of the query with AND, i.e. AND (c1 OR c2 OR …).

Parameters:

condition (Union[Condition, List[Condition]]) – Condition or list of Conditions to be added to the query.

Returns:

instance of query.

Return type:

Query
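To make the AND (c1 OR c2 OR …) shape concrete, the following is a minimal, self-contained sketch of how condition tuples could be rendered into a CNF filter string. It does not use scludam and is not the library's implementation; `render_condition` and `render_cnf` are hypothetical helper names introduced only for illustration.

```python
from numbers import Number
from typing import List, Tuple, Union

# Same tuple shape as documented for where(): (expression1, operator, expression2)
Condition = Tuple[str, str, Union[str, Number]]


def render_condition(cond: Condition) -> str:
    """Render one condition tuple as plain text (hypothetical helper)."""
    left, op, right = cond
    return f"{left} {op} {right}"


def render_cnf(groups: List[List[Condition]]) -> str:
    """Join each inner group with OR and the groups themselves with AND."""
    clauses = []
    for group in groups:
        joined = " OR ".join(render_condition(c) for c in group)
        # Parenthesise multi-condition groups so OR is evaluated before AND.
        clauses.append(f"({joined})" if len(group) > 1 else joined)
    return " AND ".join(clauses)


# One plain condition followed by one OR group
where_clause = render_cnf(
    [
        [("parallax", ">", 0.2)],
        [("phot_g_mean_mag", "<", 18), ("bp_rp", ">", 1.5)],
    ]
)
print(where_clause)
```

In this sketch, a single-condition group corresponds to what a where call would add, and a multi-condition group to what a where_or call would add.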

where_in_circle(coords_or_name: Tuple[Number, Number] | SkyCoord | str, radius: int | float | Quantity, ra_name: str = 'ra', dec_name: str = 'dec')

Add a condition to the query to select objects within a circle.

The circle is drawn in the spherical coordinate space (ra, dec). The method also adds the dist column (distance from the center) to the column list and orders the query results by that distance.

Parameters:
  • coords_or_name (Union[Coord, SkyCoord, str]) – Coordinates of the center of the circle or name of the identifier to be searched using search_object.

  • radius (Union[int, float, astropy.units.quantity.Quantity]) – Value of the radius of the circle. If int or float, it is taken as degrees.

  • ra_name (str, optional) – ra column name, by default config.MAIN_GAIA_RA

  • dec_name (str, optional) – dec column name, by default config.MAIN_GAIA_DEC

Returns:

instance of query.

Return type:

Query
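As an illustration of what a circle selection on the sphere amounts to, here is a minimal sketch in plain Python. It is an assumption-laden stand-in, not what scludam or the Gaia archive executes: the real selection runs server-side in ADQL, while `angular_separation_deg` and `select_in_circle` are hypothetical helpers that use the haversine formula on made-up positions.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, a))))


def select_in_circle(rows, center, radius_deg):
    """Keep rows within radius_deg of center, add 'dist', sort by it."""
    ra0, dec0 = center
    selected = []
    for row in rows:
        dist = angular_separation_deg(ra0, dec0, row["ra"], row["dec"])
        if dist <= radius_deg:
            selected.append({**row, "dist": dist})
    return sorted(selected, key=lambda r: r["dist"])


# Made-up positions, for illustration only
rows = [
    {"ra": 121.0, "dec": -28.0},
    {"ra": 121.3, "dec": -28.1},
    {"ra": 130.0, "dec": -20.0},
]
nearby = select_in_circle(rows, center=(121.2, -28.1), radius_deg=2.5)
for row in nearby:
    print(row)
```

The sketch mirrors the documented behaviour in two respects: a dist value is attached to every selected row, and the rows come back ordered by that distance.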

where_aen_criterion(aen_value: Number = 2, aen_sig_value: Number = 2, aen_name: str = 'astrometric_excess_noise', aen_sig_name: str = 'astrometric_excess_noise_sig')

Add astrometric excess noise rejection criterion based on Gaia criteria.

It also adds the aen and aen_sig columns to column list.

Parameters:
  • aen_value (Number, optional) – astrometric excess noise threshold value, by default 2

  • aen_sig_value (Number, optional) – astrometric excess noise score threshold value, by default 2

  • aen_name (str, optional) – column name for astrometric excess noise, by default config.MAIN_GAIA_ASTROMETRIC_EXCESS_NOISE

  • aen_sig_name (str, optional) – column name for astrometric excess noise score, by default config.MAIN_GAIA_ASTROMETRIC_EXCESS_NOISE_SIG

Returns:

instance of query

Return type:

Query

Notes

The criterion [1] excludes an object if both of the following hold:
  • astrometric_excess_noise_sig > aen_sig_value

  • astrometric_excess_noise > aen_value
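The rule in the Notes above can be written as a small predicate. This is a minimal sketch for checking individual values locally, not the ADQL condition the library generates; `passes_aen_criterion` is a hypothetical name, and the default thresholds simply repeat the documented defaults of 2.

```python
def passes_aen_criterion(aen, aen_sig, aen_value=2, aen_sig_value=2):
    """Return True if a source is kept under the rule described above.

    A source is rejected only when the excess noise is both significant
    (aen_sig above its threshold) and large (aen above its threshold).
    """
    rejected = aen_sig > aen_sig_value and aen > aen_value
    return not rejected


# Made-up values, for illustration only
print(passes_aen_criterion(aen=0.1, aen_sig=0.5))  # small, not significant
print(passes_aen_criterion(aen=3.0, aen_sig=5.0))  # large and significant
print(passes_aen_criterion(aen=3.0, aen_sig=1.0))  # large but not significant
```

Because both conditions must hold for rejection, a large excess noise with a low significance score is still kept, which matches the comment in the module example about low noise_sig values.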

References

where_arenou_criterion(bp_rp_name: str = 'bp_rp', bp_rp_ef_name: str = 'phot_bp_rp_excess_factor')

Add rejection criterion based on Arenou et al. (2018).

It also adds the bp_rp and bp_rp_ef columns to column list.

Parameters:
  • bp_rp_name (str, optional) – bp_rp column name, by default config.MAIN_GAIA_BP_RP

  • bp_rp_ef_name (str, optional) – bp_rp excess factor column name, by default config.MAIN_GAIA_BP_RP_EXCESS_FACTOR

Returns:

instance of query

Return type:

Query

Notes

The criterion [2] includes an object if 1 + 0.015(BP-RP)^2 < E < 1.3 + 0.006(BP-RP)^2, where E is the photometric BP-RP excess factor.
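The inequality in the Notes above can be checked locally with a short predicate. This is a minimal sketch, not the library's ADQL condition; `passes_arenou_criterion` is a hypothetical name, and the coefficients are copied as written in the Notes rather than taken from the library source.

```python
def passes_arenou_criterion(bp_rp, excess_factor):
    """Return True if the excess factor lies strictly between both bounds."""
    lower = 1 + 0.015 * bp_rp**2
    upper = 1.3 + 0.006 * bp_rp**2  # coefficient as stated in the Notes
    return lower < excess_factor < upper


# Made-up values, for illustration only: for bp_rp = 1.0 the bounds
# are 1.015 and 1.306.
print(passes_arenou_criterion(bp_rp=1.0, excess_factor=1.2))
print(passes_arenou_criterion(bp_rp=1.0, excess_factor=1.0))
print(passes_arenou_criterion(bp_rp=1.0, excess_factor=1.5))
```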

References

build_count()

Build the count query.

It allows previewing the count query without executing it.

Returns:

string query in ADQL.

Return type:

str

build()

Build the query.

Returns:

string query in ADQL.

Return type:

str

get(**kwargs)

Execute the query.

It launches an asynchronous gaia job. It takes some time to execute the query and parse the results. Parameters are passed through kwargs to astroquery.gaia.Gaia.launch_job_async.

Parameters:
  • dump_to_file (bool) – If True, results will be stored in a file, False by default.

  • output_file (str) – Name of the output file if dump_to_file is True.

Returns:

Table with the results if dump_to_file is False.

Return type:

astropy.table.table.Table

count(**kwargs)

Execute the count query.

It launches an asynchronous gaia job. It takes some time to execute the query and parse the results. It only returns a table with a single count_all column. Parameters are passed through kwargs to astroquery.gaia.Gaia.launch_job_async.

Parameters:
  • dump_to_file (bool) – If True, results will be stored in a file, False by default.

  • output_file (str) – Name of the output file if dump_to_file is True.

Returns:

table with the results if dump_to_file is False

Return type:

astropy.table.table.Table

scludam.fetcher.search_objects_near_data(df: DataFrame, allow_missing_values: bool = True, fields: List[str] = [], **kwargs)

Search for objects in an area defined by a dataframe.

Search all Simbad objects in the area spanned by the minimum and maximum values of the dataframe columns that use Gaia column names.

Parameters:
  • df (pd.DataFrame) – dataframe with Gaia data. Must contain ra and dec columns. May also contain "pmra", "pmdec", "parallax", "phot_g_mean_mag", "phot_rp_mean_mag", "phot_bp_mean_mag". Other fields are ignored.

  • allow_missing_values (bool, optional) – Allow simbad objects with missing value for the columns in the dataframe to appear in the result table, by default True.

  • fields (list, optional) – extra simbad fields to add, apart from ["parallax", "propermotions", "diameter", "fluxdata(G)", "fluxdata(B)", "fluxdata(R)"], by default []

Returns:

Simbad result table. RA and DEC columns are converted to decimal degrees.

Return type:

astropy.table.table.Table

Raises:

ValueError – No supported columns present in the dataframe.
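The idea of an area defined by column minima and maxima can be sketched without any dependencies. The following is an illustrative assumption, not the function's implementation: it uses a list of dicts instead of a DataFrame, `column_ranges` and `in_ranges` are hypothetical helpers, and the way missing values are handled only imitates the documented meaning of allow_missing_values.

```python
SUPPORTED = ("ra", "dec", "pmra", "pmdec", "parallax")


def column_ranges(rows, columns=SUPPORTED):
    """Compute (min, max) for each supported column present in the rows."""
    ranges = {}
    for col in columns:
        values = [row[col] for row in rows if row.get(col) is not None]
        if values:
            ranges[col] = (min(values), max(values))
    if not ranges:
        raise ValueError("No supported columns present in the data.")
    return ranges


def in_ranges(obj, ranges, allow_missing_values=True):
    """Check whether an object falls inside every computed range."""
    for col, (low, high) in ranges.items():
        value = obj.get(col)
        if value is None:
            if not allow_missing_values:
                return False
            continue  # a missing value is tolerated
        if not low <= value <= high:
            return False
    return True


# Made-up data, for illustration only
rows = [
    {"ra": 121.0, "dec": -28.3, "parallax": 1.4},
    {"ra": 121.5, "dec": -27.9, "parallax": 1.8},
]
ranges = column_ranges(rows)
candidate = {"ra": 121.2, "dec": -28.0}  # no parallax available
print(ranges)
print(in_ranges(candidate, ranges))
print(in_ranges(candidate, ranges, allow_missing_values=False))
```

With allow_missing_values left at True, the candidate without a parallax is kept; setting it to False drops the candidate.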

scludam.fetcher.simbad2gaiacolnames(table: Table)

Convert the column names of a Simbad table to Gaia column names.

Only supports translation for ‘RA’, ‘DEC’, ‘PMRA’, ‘PMDEC’, ‘PLX_VALUE’, ‘FLUX_G’, ‘FLUX_R’, ‘FLUX_B’. It also adds ‘bp_rp’ if ‘FLUX_B’ and ‘FLUX_R’ are present.

Parameters:

table (astropy.table.table.Table) – simbad table

Returns:

gaia table

Return type:

astropy.table.table.Table
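A rough sketch of this kind of renaming, applied to a single record held as a dict, is shown below. The target Gaia names in `SIMBAD_TO_GAIA` and the bp_rp computation (FLUX_B minus FLUX_R) are assumptions made for illustration; the documentation above lists only the supported Simbad names, so the library's actual mapping may differ. `rename_record` is a hypothetical helper.

```python
# Assumed mapping: the Gaia-side names are guesses based on the Gaia column
# names mentioned elsewhere in this module's documentation.
SIMBAD_TO_GAIA = {
    "RA": "ra",
    "DEC": "dec",
    "PMRA": "pmra",
    "PMDEC": "pmdec",
    "PLX_VALUE": "parallax",
    "FLUX_G": "phot_g_mean_mag",
    "FLUX_R": "phot_rp_mean_mag",
    "FLUX_B": "phot_bp_mean_mag",
}


def rename_record(record):
    """Rename supported keys; leave unsupported keys unchanged."""
    renamed = {SIMBAD_TO_GAIA.get(key, key): value for key, value in record.items()}
    # Add a colour column only when both magnitudes are available
    # (assumed here to be the B magnitude minus the R magnitude).
    if "FLUX_B" in record and "FLUX_R" in record:
        renamed["bp_rp"] = record["FLUX_B"] - record["FLUX_R"]
    return renamed


# Made-up record, for illustration only
record = {"MAIN_ID": "NGC 2527", "RA": 121.2, "DEC": -28.1, "FLUX_B": 9.5, "FLUX_R": 9.0}
converted = rename_record(record)
print(converted)
```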