Generate atlas

Generate an atlas of YSOVAR lightcurves

This module collects all procedures that are required to make the atlas. This starts with reading in the csv file from the YSOVAR2 database and includes the calculation of some fits and quantities. More specific tasks for the analysis of lightcurves can be found in YSOVAR.lightcurves, and plotting routines in YSOVAR.plot.

The basic structure for the YSOVAR analysis is the YSOVAR.atlas.YSOVAR_atlas. To initialize an atlas object pass in a numpy array with all the lightcurves:

from YSOVAR import atlas
data = atlas.dict_from_csv('/path/to/my/irac.csv', match_dist = 0.)
MyRegion = atlas.YSOVAR_atlas(lclist = data)

The YSOVAR.atlas.YSOVAR_atlas is built on top of an astropy.table.Table object. See the astropy.table documentation for the syntax to access the data or add a column.

This YSOVAR.atlas.YSOVAR_atlas auto-generates some content in the background, so we really encourage you to read the documentation (we promise it’s only a few lines because we are too lazy to type much more).

YSOVAR.atlas.IAU2radec(isoy)

convert IAU name to decimal degrees.

Parameters:

isoy : string

IAU Name

Returns:

ra, dec : float

ra and dec in decimal degrees
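The conversion can be sketched as follows. This is not the YSOVAR implementation: the function name iau_to_radec and the assumed name format Jhhmmss.ss+ddmmss.s are illustrative assumptions, since the exact format that IAU2radec expects is not stated here.

```python
import re

# Hypothetical helper, assuming names of the form 'Jhhmmss.ss+ddmmss.s'.
IAU_PATTERN = re.compile(
    r'J(\d{2})(\d{2})(\d{2}(?:\.\d+)?)([+-])(\d{2})(\d{2})(\d{2}(?:\.\d+)?)')


def iau_to_radec(name):
    """Convert an IAU-style name to (ra, dec) in decimal degrees."""
    match = IAU_PATTERN.search(name)
    if match is None:
        raise ValueError('not a recognised IAU-style name: %r' % name)
    rah, ram, ras, sign, ded, dem, des = match.groups()
    # one hour of right ascension corresponds to 15 degrees
    ra = 15.0 * (int(rah) + int(ram) / 60.0 + float(ras) / 3600.0)
    dec = int(ded) + int(dem) / 60.0 + float(des) / 3600.0
    # take the sign from the string so that -00 declinations stay negative
    if sign == '-':
        dec = -dec
    return ra, dec


ra, dec = iau_to_radec('J120000.00-003000.0')
```

Reading the sign as a character rather than from the numeric value of the degree field avoids losing it for declinations between -1 and 0 degrees.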

class YSOVAR.atlas.YSOVAR_atlas(*args, **kwargs)

The basic structure for the YSOVAR analysis is the YSOVAR_atlas. To initialize an atlas object pass in a numpy array with all the lightcurves:

from YSOVAR import atlas
data = atlas.dict_from_csv('/path/to/my/irac.csv', match_dist = 0.)
MyRegion = atlas.YSOVAR_atlas(lclist = data)

The YSOVAR_atlas is built on top of an astropy.table.Table object. See the astropy.table documentation for the syntax to access the data or add a column.

Some columns are auto-generated when they are first used. Some examples are

  • median
  • mean
  • stddev
  • min
  • max
  • mad (median absolute deviation)
  • delta (90% quantile - 10% quantile)
  • redchi2tomean
  • wmean (uncertainty weighted average).

When you ask for MyRegion['min_36'] it first checks if that column is already present. If not, it adds a new column called min_36 and calculates the minimum of the lightcurve in band 36 for each object in the atlas that has m36 and t36 entries (for magnitude and time in band 36, respectively). Data read with dict_from_csv() automatically has the required format.
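The compute-on-first-access behaviour can be illustrated with a small stand-in class. This is only a sketch of the idea and not how YSOVAR_atlas is implemented: LazyColumns and its funcs dictionary are hypothetical names, and the real atlas looks functions up in YSOVAR.registry.

```python
class LazyColumns(dict):
    """Toy stand-in for the atlas: columns are computed on first access."""

    # hypothetical mini-registry mapping a column prefix to a function
    funcs = {'min': min, 'max': max}

    def __init__(self, lclist):
        super().__init__()
        self.lclist = lclist

    def __missing__(self, colname):
        # called only if the column is not present yet;
        # split 'min_36' into the function name 'min' and the band '36'
        funcname, band = colname.rsplit('_', 1)
        func = self.funcs[funcname]
        # objects without an 'm<band>' entry get None
        values = [func(lc['m' + band]) if 'm' + band in lc else None
                  for lc in self.lclist]
        self[colname] = values  # cache: the next access is a plain lookup
        return values


region = LazyColumns([{'m36': [11.2, 11.0, 11.5], 't36': [1.0, 2.0, 3.0]},
                      {'m45': [9.8, 9.9], 't45': [1.0, 2.0]}])
first_access = region['min_36']
```

After the first access the column is stored like any other column, so later lookups do not trigger a new calculation.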

More functions may be added to this magic list later. Check:

import YSOVAR.registry
YSOVAR.registry.list_lcfuncs()

to see which functions are implemented. More functions can be added.

Also, table columns can be added to a YSOVAR_atlas object manually, giving you all the freedom to do arbitrary calculations to arrive at those values.

add_catalog_data(catalog, radius=0.0002777777777777778, names=None, **kwargs)

add information from a different Table

The tables are automatically cross matched and values are copied only for objects that have a counterpart in the current table.

Parameters:

catalog : astropy.table.Table

This is the table where the new information is provided.

radius : np.float

matching radius in degrees

names : list of strings

List of column names that should be copied. If this is None (the default), all columns are copied. Column names have to be unique. Thus, make sure that no column of the same name already exists (this will raise an exception).

All other keywords are passed to YSOVAR.atlas.makecrossids() (see there for the syntax).

add_mags(data, cross_ids, band, channel)

Add lightcurves to some list of dictionaries

Parameters:

data : astropy.table.Table or np.rec.array

data table with new mags

cross_ids : list of lists

for each element in self, cross_ids says which rows in data should be included for this object

band : list of strings

[name of mag, name of error, name of time]

channel : string

name of this channel in the lightcurve. Should be short and unique.

autocalc_newcol(name)

automatically calculate some columns on the fly

calc(name, bands, timefilter=None, data_preprocessor=None, colnames=[], colunits=[], coldescriptions=[], coltypes=[], overwrite=True, t_simul=None, **kwargs)

calculate some quantity for all sources

This is a very general interface to the catalog that allows the user to initiate calculations with any function defined in the registry. A new column that contains the result is added to the datatable. (If the column already exists, it is overwritten.)

Parameters:

name : string

name of the function in the function registry for YSOVAR (see registry)

bands : list of strings

Band identifiers. In some cases, it can be useful to calculate a quantity for the error (e.g. the mean error). In this case, just give the band as e.g. 36_error. (This only works for simple functions.)

timefilter : function or None

If not None, this function accepts a np.ndarray of observation times and should return an index array selecting those times to be included in the calculation. The default function selects all times. An example of how to use this keyword to include only certain times is shown below:

cat.calc('mean', '36', timefilter = lambda x : x < 55340)

data_preprocessor : function or None

If not None, for each source, a row from the present table is extracted as source = self[id]. This yields a YSOVAR_atlas object with one row. This row is passed to data_preprocessor, which can modify the data (e.g. smooth a lightcurve), but should keep the structure of the object intact. Here is an example for a possible data_preprocessor function:

import numpy as np

def smooth(source):
    lc = source.lclist[0]
    w = np.ones(5)  # boxcar kernel, 5 datapoints wide
    # only smooth lightcurves with more than 10 datapoints
    if 'm36' in lc and len(lc['m36']) > 10:
        lc['m36'] = np.convolve(w/w.sum(), lc['m36'], mode='valid')
        # need to keep m36 and t36 same length
        lc['t36'] = lc['t36'][0:len(lc['m36'])]
    return source

colnames : list of strings

Basenames of columns to hold the output of the calculation. If not present already, the columns are added automatically with dtype = np.float and the band appended to the basename:

cat.calc('mean', '36', colnames = ['stuff'])

would add the column stuff_36. If this is left empty, its default is set based on the function. If colnames has fewer elements than the function returns, only the first few are kept.

colunits : list of strings

Units for the output columns. If this is left empty, its default is set based on the function. If colunits has fewer elements than there are new columns, the unit of the remaining columns will be None.

coldescriptions : list of strings

Descriptions for the output columns. If this is left empty, its default is set based on the function. If coldescriptions has fewer elements than there are new columns, the description of the remaining columns will be ''.

coltypes : list

List of dtypes for autogenerated columns. If the list is empty, it is set to the default of the function called.

overwrite : bool

If True, values in existing columns are silently overwritten.

t_simul : float

max distance in days to accept datapoints in band 1 and 2 as simultaneous. In L1688 and IRAS 20050+2720 the coverage in band 1 and 2 is separated by no more than a few minutes, so a small number is sufficient to catch everything and to avoid false matches. If None is given, this defaults to self.t_simul.

All remaining keywords are passed to the function identified by name.
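The effect of timefilter on a single lightcurve can be sketched in plain Python. The helper mean_with_timefilter is a hypothetical illustration and not part of YSOVAR; unlike calc, which passes the whole np.ndarray of times to the filter, this sketch evaluates the filter once per datapoint.

```python
def mean_with_timefilter(times, mags, timefilter=None):
    """Mean magnitude of the datapoints selected by timefilter."""
    if timefilter is None:
        # default: select all times
        keep = [True] * len(times)
    else:
        keep = [bool(timefilter(t)) for t in times]
    selected = [m for m, k in zip(mags, keep) if k]
    # no datapoint left after filtering: return nan instead of failing
    return sum(selected) / len(selected) if selected else float('nan')


times = [55300.0, 55330.0, 55350.0]
mags = [10.0, 12.0, 20.0]
early_mean = mean_with_timefilter(times, mags, timefilter=lambda x: x < 55340)
full_mean = mean_with_timefilter(times, mags)
```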

calc_allstats(band)

calculate all simple statistical descriptors for a single band

This function calculates all simple statistical quantities that can be autogenerated for a certain band. The required columns are added to the data table.

This does not include periodicity, which requires certain user-selected parameters.

Parameters:

band : string

name of band for which the calculation should be performed
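For orientation, several of the simple descriptors can be written down in a few lines of standard-library Python. This is a sketch under stated assumptions, not the YSOVAR code: the helper names are hypothetical, the quantiles use linear interpolation, and redchi2tomean is assumed to be the chi-square around the mean divided by the number of datapoints minus one.

```python
import math
import statistics


def quantile(values, q):
    """Quantile with linear interpolation between sorted datapoints."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def simple_stats(mags, errs):
    """Return a dict of simple descriptors for one lightcurve."""
    med = statistics.median(mags)
    mean = statistics.mean(mags)
    weights = [1.0 / e ** 2 for e in errs]
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    return {
        'median': med,
        'mean': mean,
        # median absolute deviation around the median
        'mad': statistics.median([abs(m - med) for m in mags]),
        # 90% quantile - 10% quantile
        'delta': quantile(mags, 0.9) - quantile(mags, 0.1),
        # uncertainty weighted average
        'wmean': sum(w * m for w, m in zip(weights, mags)) / sum(weights),
        # assumed definition: chi^2 around the mean per degree of freedom
        'redchi2tomean': chi2 / (len(mags) - 1),
    }


stats = simple_stats([10.0, 11.0, 12.0, 13.0, 14.0], [0.5] * 5)
```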

classify_SED_slope(bands=['mean_36', 'mean_45', 'Kmag', '3.6mag', '4.5mag', '5.8mag', '8.0mag'], colname='IRclass')

Classify the SED slope of an object

This function calculates the SED slope for each object according to the prescription outlined by Luisa in the big data paper.

It uses all available datapoints in the IR from the bands given. If no measurement is present (e.g. missing or upper limit only), this band is ignored. The procedure performs a least-squares fit with equal weight for each band and then classifies the resulting slope into class I, flat-spectrum, II and III sources.

Parameters:

bands : list of strings

List of names for all bands to be used. Bands must be defined in YSOVAR.atlas.sed_bands.

colname : string

The classification will be placed in this column. If it exists it is overwritten.
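The fit-then-classify procedure can be sketched as below. The helper names, the class labels and the slope boundaries (0.3, -0.3 and -1.6, a commonly used convention) are assumptions of this sketch and are not taken from the YSOVAR source; the slope is measured as d log(lambda F_lambda) / d log(lambda).

```python
import math


def fit_slope(x, y):
    """Unweighted least-squares slope of y against x."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    return (sum((a - xm) * (b - ym) for a, b in zip(x, y))
            / sum((a - xm) ** 2 for a in x))


def sed_class(wavelen, flux_jy):
    """Classify from wavelengths in micron and fluxes F_nu in Jy."""
    # lambda * F_lambda equals nu * F_nu, which is proportional to F_nu / lambda
    x = [math.log10(w) for w in wavelen]
    y = [math.log10(f / w) for w, f in zip(wavelen, flux_jy)]
    alpha = fit_slope(x, y)
    # assumed class boundaries and labels, see the text above
    if alpha >= 0.3:
        return alpha, 'I'
    elif alpha >= -0.3:
        return alpha, 'F'
    elif alpha >= -1.6:
        return alpha, 'II'
    return alpha, 'III'


flat_alpha, flat_class = sed_class([2.0, 4.0, 8.0], [1.0, 2.0, 4.0])
steep_alpha, steep_class = sed_class([1.0, 2.0, 4.0], [16.0, 4.0, 1.0])
```

Constant factors in the conversion from F_nu to lambda F_lambda only shift the fitted line and do not change the slope, which is why they are dropped here.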

is_there_a_good_period(power, minper, maxper, bands=['36', '45'], FAP=False)

check if a strong periodogram peak is found

This method checks if a period exists with the required power and period in any of the bands given in bands. If the peaks in several bands fulfill the criteria, then the band with the peak of highest power is selected. Output is placed in the columns good_peak, good_FAP and good_period.

New columns that contain the result are added to the datatable. (If a column existed before, it is overwritten.)

Parameters:

power : float

minimum power or maximal FAP for “good” period

minper : float

lowest period which is considered

maxper : float

maximum period which is considered

bands : list of strings

Band identifiers, e.g. ['36', '45'], can also be a list with one entry, e.g. ['36']

FAP : boolean

If True, then power is interpreted as maximal FAP for a good period; if False then power means the minimum power a peak in the periodogram must have.
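The selection logic for a single source can be sketched as follows. The function pick_good_period and its input layout (a dictionary mapping each band to a tuple of period, peak power and FAP) are hypothetical, and treating the period limits as inclusive is an assumption.

```python
def pick_good_period(peaks, power, minper, maxper, FAP=False):
    """Return (band, period, peak power, FAP) of the best peak, or None.

    peaks maps a band identifier to a tuple (period, peak power, FAP).
    """
    best = None
    for band, (period, peak, fap) in peaks.items():
        if not (minper <= period <= maxper):
            continue
        # power is an upper limit on the FAP if FAP is True,
        # otherwise a lower limit on the periodogram power
        good = (fap <= power) if FAP else (peak >= power)
        # among all good peaks keep the one with the highest power
        if good and (best is None or peak > best[2]):
            best = (band, period, peak, fap)
    return best


peaks = {'36': (5.0, 12.0, 0.01), '45': (5.1, 20.0, 0.001)}
by_power = pick_good_period(peaks, 10.0, 2.0, 20.0)
by_fap = pick_good_period(peaks, 0.005, 2.0, 20.0, FAP=True)
too_strict = pick_good_period(peaks, 25.0, 2.0, 20.0)
```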

lclist

t_simul = 0.01

YSOVAR.atlas.check_dataset(data, min_number_of_times=5, match_dist=0.0002777777777777778)

check dataset for anomalies, cross-match problems etc.

Of course, not every problem can be detected here, but every time I find something I add a check so that next time this routine will warn me of the same problem.

Parameters:

data : list of dicts

as read in with e.g. dict_from_csv

YSOVAR.atlas.coord_CDS2RADEC(dat)

transform RA and DEC from CDS table to degrees

CDS tables have a certain format of string columns to store coordinates (RAh, RAm, RAs, DE-, DEd, DEm, DEs). This procedure parses that and calculates new values for RA and DEC in degrees. These are added to the Table as RAdeg and DEdeg.

Parameters:

dat : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table

with columns in the CDS format (e.g. from reading a CDS table with astropy.io.ascii)

YSOVAR.atlas.coord_add_RADEfromhmsdms(dat, rah, ram, ras, design, ded, dem, des)

transform RA and DEC in table from hms, dms to degrees

Parameters:

dat : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table

with columns in the CDS format (e.g. from reading a CDS table with astropy.io.ascii)

rah, ram, ras, ded, dem, des : np.ndarray

RA and DEC hms, dms values

design : +1 or -1

Sign of the DE coordinate (integer or float, not string)
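The arithmetic behind this conversion is short enough to write out. The function hmsdms_to_deg below is a hypothetical scalar sketch and not the YSOVAR routine, which works on table columns.

```python
def hmsdms_to_deg(rah, ram, ras, design, ded, dem, des):
    """Return (ra, dec) in decimal degrees; design is +1 or -1."""
    # one hour of right ascension corresponds to 15 degrees
    ra = 15.0 * (rah + ram / 60.0 + ras / 3600.0)
    # the sign multiplies the whole sum, so ded = 0 with design = -1 works
    dec = design * (abs(ded) + dem / 60.0 + des / 3600.0)
    return ra, dec


ra, dec = hmsdms_to_deg(12, 30, 0.0, -1, 0, 30, 0.0)
```

Passing the sign separately is what makes a declination such as -00:30:00 representable, because the degree value 0 carries no sign of its own.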

YSOVAR.atlas.coord_hmsdms2RADEC(dat, ra=['RAh', 'RAm', 'RAs'], dec=['DEd', 'DEm', 'DEs'])

transform RA and DEC from table to degrees

Tables where RA and DEC are encoded as three numeric columns each like hh:mm:ss and dd:mm:ss can be converted into decimal deg. This procedure parses that and calculates new values for RA and DEC in degrees. These are added to the Table as RAdeg and DEdeg.

Warning

This format is ambiguous for sources with dec=+/-00:xx:xx, because Python does not differentiate between +0 and -0.

Parameters:

dat : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table

with columns in the format given above

ra : list of three strings

names of the RA columns for hour, min, sec

dec : list of three strings

names of the DEC columns for deg, min, sec

YSOVAR.atlas.coord_strhmsdms2RADEC(dat, ra='RA', dec='DEC', delimiter=':')

transform RA and DEC from table to degrees

Tables where RA and DEC are each encoded as a string column like hh:mm:ss and dd:mm:ss can be converted into decimal deg. This procedure parses that and calculates new values for RA and DEC in degrees. These are added to the Table as RAdeg and DEdeg.

Parameters:

dat : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table

with columns in the format given above

ra : string

name of the RA column holding hour, min, sec

dec : string

name of the DEC column holding deg, min, sec

delimiter : string

delimiter between elements, e.g. : in 01:23:34.3.
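Parsing one such string can be sketched as below. The helper str_to_deg and its hours switch are hypothetical and only illustrate the idea; the YSOVAR routine operates on whole table columns.

```python
def str_to_deg(value, delimiter=':', hours=False):
    """Parse 'dd:mm:ss.s' (or 'hh:mm:ss.s' if hours is True) to degrees."""
    text = value.strip()
    # read the sign from the text: -0.0 < 0 is False in Python,
    # so a numeric test on the first field would miss '-00'
    sign = -1.0 if text.startswith('-') else 1.0
    first, minutes, seconds = text.lstrip('+-').split(delimiter)
    deg = sign * (float(first) + float(minutes) / 60.0
                  + float(seconds) / 3600.0)
    # hours are converted with 15 degrees per hour
    return 15.0 * deg if hours else deg


ra_deg = str_to_deg('01:30:00', hours=True)
dec_deg = str_to_deg('-00:30:00')
```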

YSOVAR.atlas.dict_cleanup(data, channels, min_number_of_times=0, floor_error={})

Clean up dictionaries after add_ysovar_mags

Each object in the data list can be constructed from multiple sources per band in multiple bands. This function averages the coordinates over all contributing sources, converts and sorts lists of times and magnitudes and makes multiband lightcurves.

Parameters:

data : list of dictionaries

as obtained from YSOVAR.atlas.add_ysovar_mags()

channels : dictionary

This dictionary translates the names of channels in the csv file to the names in the output structure, e.g. that for ‘IRAC1’ will be ‘m36’ (magnitudes) and ‘t36’ (times).

min_number_of_times : integer

Remove all lightcurves with less than min_number_of_times datapoints from the list

floor_error : dict

Floor errors will be added in quadrature to all error values. The keys in the dictionary should be the same as in the channels dictionary.

Returns:

data : list of dictionaries

individual dictionaries are cleaned up as described above
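Adding a floor error in quadrature means that each error e is replaced by sqrt(e**2 + floor**2). A minimal sketch with a hypothetical helper name:

```python
import math


def add_floor_error(errors, floor):
    """Add a constant floor in quadrature to each error value."""
    # math.hypot(e, floor) computes sqrt(e**2 + floor**2)
    return [math.hypot(e, floor) for e in errors]


new_errors = add_floor_error([0.0, 0.03], 0.04)
```

An error of zero becomes the floor itself, while errors much larger than the floor are left almost unchanged.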

YSOVAR.atlas.dict_from_csv(csvfile, match_dist=0.0002777777777777778, min_number_of_times=5, channels={'IRAC2': '45', 'IRAC1': '36'}, data=[], floor_error={'IRAC2': 0.007, 'IRAC1': 0.01}, mag='mag1', emag='emag1', time='hmjd', bg=None, source_name='sname', verbose=True, readra='ra', readdec='de', sourceid='ysovarid', channelcolumn='fname')

Build YSOVAR lightcurves from database csv file

Parameters:

csvfile : string or file object

input csv file

match_dist : float

maximum distance to match two positions as one source

min_number_of_times : integer

Remove all sources with less than min_number_of_times datapoints from the list

channels : dictionary

This dictionary translates the names of channels in the csv file to the names in the output structure, e.g. that for IRAC1 will be m36 (magnitudes) and t36 (times).

data : list of dicts

New entries will be added to data. It can be empty (the default).

mag : string

name of magnitude column

emag : string

name of column holding the error on the mag

time : string

name of column holding the time of observation

bg : string or None

name of column holding the bg for each observation (None indicates that the bg column is not present).

floor_error : dict

Floor errors will be added in quadrature to all error values. The keys in the dictionary should be the same as in the channels dictionary.

verbose : bool

If True, print progress status.

Returns:

data : empty list or list of dictionaries

structure to hold all the information

TBD: Still need to deal with double entries in lightcurve (and check manually...)

YSOVAR.atlas.get_sed(data, sed_bands={'Bmag': ['e_Bmag', 0.43, 4000.87], '5.8mag': ['e_5.8mag', 5.8, 115.0], 'mean_36': ['e_3.6mag', 3.6, 280.9], 'simbad_B': [None, 0.43, 4000.87], '4.5mag': ['e_4.5mag', 4.5, 179.7], 'Vmag': ['e_Vmag', 0.623, 3597.28], 'Imag': ['e_Imag', 0.798, 2587], 'Kmag': ['e_Kmag', 2.159, 666.7], 'simbad_V': [None, 0.623, 3597.28], 'Rmag': ['e_Rmag', 0.759, 3182], '3.6mag': ['e_3.6mag', 3.6, 280.9], 'mean_45': ['e_4.5mag', 4.5, 179.7], 'imag': ['e_imag', 0.763, 2515.7], 'Jmag': ['e_Jmag', 1.235, 1594], 'rmag': ['e_rmag', 0.622, 3173.3], 'Hmag': ['e_Hmag', 1.662, 1024], '8.0mag': ['e_8.0mag', 8.0, 64.13], '24mag': ['e_24mag', 24.0, 7.14], 'Umag': ['e_Umag', 0.355, 1500], 'nomad_Rmag': [None, 0.759, 3182], 'nomad_Bmag': [None, 0.43, 4000.87], 'Hamag': ['e_Hamag', 0.656, 2974.4], 'nomad_Vmag': [None, 0.623, 3597.28]}, valid=False)

make SED by collecting info from the input data

Parameters:

data : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table

input data that has arrays of magnitudes for different bands

sed_bands : dict

keys must be the name of the field that contains the magnitudes in each band, entries are lists of [name of error field, wavelength in micron, zero_magnitude_flux_freq in Jy]

valid : bool

If True, return only bands with finite flux, otherwise return all bands that exist in both data and sed_bands.

Returns:

wavelen : np.ndarray

central wavelength of bands in micron

mags : np.ndarray

magnitude in band

mags_error : np.ndarray

error on magnitude

sed : np.ndarray

flux in Jy
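The conversion from magnitude to flux uses the zero-magnitude flux F0 of each band: F = F0 * 10**(-0.4 * mag). The sketch below applies it to a single source stored as a plain dictionary; simple_sed is a hypothetical helper that only mimics the valid=True case, and the band values are copied from the default sed_bands above.

```python
import math


def simple_sed(row, sed_bands):
    """Return (wavelen, mags, sed) for bands with a finite magnitude in row."""
    wavelen, mags, sed = [], [], []
    for band, (errname, wave, zero_flux_jy) in sed_bands.items():
        mag = row.get(band)
        # skip bands that are missing or not finite
        if mag is None or not math.isfinite(mag):
            continue
        wavelen.append(wave)
        mags.append(mag)
        # flux in Jy from the zero-magnitude flux of this band
        sed.append(zero_flux_jy * 10.0 ** (-0.4 * mag))
    return wavelen, mags, sed


bands = {'Kmag': ['e_Kmag', 2.159, 666.7],
         '3.6mag': ['e_3.6mag', 3.6, 280.9],
         '4.5mag': ['e_4.5mag', 4.5, 179.7]}
row = {'Kmag': 0.0, '3.6mag': 2.5, '4.5mag': float('nan')}
wavelen, mags, sed = simple_sed(row, bands)
```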

YSOVAR.atlas.makecrossids(data1, data2, radius, ra1='RAdeg', dec1='DEdeg', ra2='ra', dec2='dec', double_match=False)

Cross-match two lists of coordinates, return closest match

This routine is not very clever and not very fast. It should be fine up to a hundred thousand entries per list.

Parameters:

data1 : astropy.table.Table or np.recarray

This is the master data, i.e. for each element in data1, the results will have one (or zero) index numbers in data2 that provide the best match to this entry in data1.

data2 : astropy.table.Table or np.recarray

This data is matched to data1.

radius : np.float or array

maximum radius to accept a match (in degrees); either a scalar or same length as data2

ra1, dec1, ra2, dec2 : string

keys to access RA and DEC (in degrees) in the data, i.e. the routine uses data1[ra1] for the RA values of data1.

double_match : bool

If true, one source in data2 could be matched to several sources in data1. This can happen, if a source in data2 lies between two sources of data1, which are both within radius. If this switch is set to False, then a strict one-on-one matching is enforced, selecting the closest pair in the situation above.

Returns:

cross_ids : np.ndarray

Will have len(data1). For each element it contains the index of data2 that provides the best match. If no match within radius is found, then the entry will be -99999.
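A brute-force closest-match search can be sketched as follows. This is an illustration only and not the YSOVAR algorithm: the function names are hypothetical, the inputs are plain lists of (ra, dec) tuples in degrees, and no one-on-one matching is enforced, so the behaviour corresponds to double_match=True.

```python
import math


def ang_dist(ra1, dec1, ra2, dec2):
    """Angular separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))


def closest_match(coords1, coords2, radius, nomatch=-99999):
    """For each (ra, dec) in coords1, index of the nearest entry in coords2."""
    cross_ids = []
    for ra1, dec1 in coords1:
        best, best_dist = nomatch, radius
        # compare with every entry of the second list
        for j, (ra2, dec2) in enumerate(coords2):
            dist = ang_dist(ra1, dec1, ra2, dec2)
            if dist <= best_dist:
                best, best_dist = j, dist
        cross_ids.append(best)
    return cross_ids


coords1 = [(10.0, 20.0), (50.0, -5.0)]
coords2 = [(10.0001, 20.0), (10.0, 20.00005), (200.0, 0.0)]
cross_ids = closest_match(coords1, coords2, radius=1.0 / 3600.0)
```

The double loop scales with len(coords1) * len(coords2), which is the reason such a simple approach is only practical for moderately sized lists.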

YSOVAR.atlas.makecrossids_all(data1, data2, radius, ra1='RAdeg', dec1='DEdeg', ra2='ra', dec2='dec', return_distances=False)

Cross-match two lists of coordinates, return all matches within radius

This routine is not very clever and not very fast. It should be fine up to a hundred thousand entries per list.

Parameters:

data1 : astropy.table.Table or np.recarray

This is the master data, i.e. for each element in data1, the results will have the index numbers in data2 that match this entry in data1.

data2 : astropy.table.Table or np.recarray

This data is matched to data1.

radius : np.float or array

maximum radius to accept a match (in degrees)

ra1, dec1, ra2, dec2 : string

keys to access RA and DEC (in degrees) in the data, i.e. the routine uses data1[ra1] for the RA values of data1.

return_distances : bool

decide if distances should be returned

Returns:

cross_ids : list of lists

Will have len(data1). For each element it contains the indices of data2 that are within radius. If no match within radius is found, then the entry will be [].

distances : list of lists

If return_distances==True this has the same format as cross_ids and contains the distance to the match in degrees.

YSOVAR.atlas.merge_lc(d, bands, t_simul=0.01)

merge lightcurves from several bands

This returns a lightcurve that contains only entries for those times, where all required bands have an entry.

Parameters:

d : dictionary

as obtained from YSOVAR.atlas.add_ysovar_mags()

bands : list of strings

labels of the spectral bands to be merged, e.g. [‘36’,‘45’]

t_simul : float

max distance in days to accept datapoints in band 1 and 2 as simultaneous. In L1688 and IRAS 20050+2720 the coverage in band 1 and 2 is separated by no more than a few minutes, so a small number is sufficient to catch everything and to avoid false matches.

Returns:

tab : astropy.table.Table

This table contains the merged lightcurve and contains times, fluxes and errors.
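The idea of keeping only simultaneous datapoints can be sketched for two bands with plain lists. The function merge_two_bands is hypothetical, and both the nearest-neighbour pairing and the use of the mean time for the merged entry are assumptions of this sketch, not documented behaviour of merge_lc.

```python
def merge_two_bands(t1, m1, t2, m2, t_simul=0.01):
    """Pair datapoints of two bands taken within t_simul days of each other.

    Returns a list of tuples (mean time, mag in band 1, mag in band 2).
    """
    merged = []
    if not t2:
        return merged
    for ta, ma in zip(t1, m1):
        # index of the closest datapoint in the second band
        j = min(range(len(t2)), key=lambda k: abs(t2[k] - ta))
        if abs(t2[j] - ta) <= t_simul:
            merged.append((0.5 * (ta + t2[j]), ma, m2[j]))
    return merged


merged = merge_two_bands([1.0, 2.0, 3.0], [10.0, 10.1, 10.2],
                         [1.002, 3.5], [9.0, 9.1])
```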

YSOVAR.atlas.phase_fold(time, period)

Phase fold a set of times on a period

Parameters:

time : np.ndarray

array of times

period : np.float
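Phase folding maps each time onto its position within one period. The sketch below returns the phase as a fraction between 0 and 1; whether phase_fold itself returns a fractional phase or the time modulo the period is not stated here, so the normalisation is an assumption, and phase_fold_sketch is a hypothetical name.

```python
def phase_fold_sketch(times, period):
    """Return the phase (0 <= phase < 1) of each time for the given period."""
    # the modulo keeps the remainder within one period;
    # dividing by the period expresses it as a fraction
    return [(t % period) / period for t in times]


phases = phase_fold_sketch([0.0, 2.5, 5.0, 7.5], 5.0)
```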

YSOVAR.atlas.radec_from_dict(data, RA='ra', DEC='dec')

return ra dec numpy array for list of dicts

Parameters:

data : list of several dict

RA, DEC : strings

keys for RA and DEC in the dictionary

Returns:

radec : np record array with RA, DEC columns

YSOVAR.atlas.sed_slope(data, sed_bands={'Bmag': ['e_Bmag', 0.43, 4000.87], '5.8mag': ['e_5.8mag', 5.8, 115.0], 'mean_36': ['e_3.6mag', 3.6, 280.9], 'simbad_B': [None, 0.43, 4000.87], '4.5mag': ['e_4.5mag', 4.5, 179.7], 'Vmag': ['e_Vmag', 0.623, 3597.28], 'Imag': ['e_Imag', 0.798, 2587], 'Kmag': ['e_Kmag', 2.159, 666.7], 'simbad_V': [None, 0.623, 3597.28], 'Rmag': ['e_Rmag', 0.759, 3182], '3.6mag': ['e_3.6mag', 3.6, 280.9], 'mean_45': ['e_4.5mag', 4.5, 179.7], 'imag': ['e_imag', 0.763, 2515.7], 'Jmag': ['e_Jmag', 1.235, 1594], 'rmag': ['e_rmag', 0.622, 3173.3], 'Hmag': ['e_Hmag', 1.662, 1024], '8.0mag': ['e_8.0mag', 8.0, 64.13], '24mag': ['e_24mag', 24.0, 7.14], 'Umag': ['e_Umag', 0.355, 1500], 'nomad_Rmag': [None, 0.759, 3182], 'nomad_Bmag': [None, 0.43, 4000.87], 'Hamag': ['e_Hamag', 0.656, 2974.4], 'nomad_Vmag': [None, 0.623, 3597.28]})

fit the SED slope to data for all bands in data and sed_bands

Parameters:

data : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table

input data that has arrays of magnitudes for different bands

sed_bands : dict

keys must be the name of the field that contains the magnitudes in each band, entries are lists of [name of error field, wavelength in micron, zero_magnitude_flux_freq in Jy]

Returns:

slope : float

slope of the SED determined with a least squares fit. Return np.nan if there is too little data.

YSOVAR.atlas.val_from_dict(data, name)

return the values of one key for a list of dicts

Parameters:

data : list of dict

name : string

key for the entry in the dictionary

Returns:

col : list of values