Generate atlas
Generate an atlas of YSOVAR lightcurves
This module collects all procedures that are required to make the
atlas. This starts with reading in the csv file from the YSOVAR2
database and includes the calculation of some fits and quantities.
More specific tasks for the analysis of the lightcurves can be found in
YSOVAR.lightcurves, and plotting routines in YSOVAR.plot.
The basic structure for the YSOVAR analysis is the
YSOVAR.atlas.YSOVAR_atlas.
To initialize an atlas object pass in a numpy array with all the lightcurves:
from YSOVAR import atlas
data = atlas.dict_from_csv('/path/to/my/irac.csv', match_dist = 0.)
MyRegion = atlas.YSOVAR_atlas(lclist = data)
The YSOVAR.atlas.YSOVAR_atlas is built on top of an astropy.table.Table
object (see the astropy documentation). See that
documentation for the syntax on how to access the data or add a column.
The YSOVAR.atlas.YSOVAR_atlas auto-generates some content in the
background, so I really encourage you to read the documentation (we promise it’s
only a few lines because we are too lazy to type much more).
-
YSOVAR.atlas.IAU2radec(isoy)
Convert an IAU name to decimal degrees.
Parameters: isoy : string
IAU Name
Returns: ra, dec : float
ra and dec in decimal degrees
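The conversion can be sketched as follows. This is only an illustration, not the code of IAU2radec: the helper name iau_name_to_radec is made up here, and it assumes names of the form Jhhmmss.s+ddmmss.s, which the text above does not spell out.

```python
import re

# Hypothetical helper, NOT YSOVAR.atlas.IAU2radec itself.
# Assumption: the name contains hhmmss(.s) followed by a sign and ddmmss(.s).
_IAU = re.compile(
    r'(\d{2})(\d{2})(\d{2}(?:\.\d*)?)([+-])(\d{2})(\d{2})(\d{2}(?:\.\d*)?)')


def iau_name_to_radec(name):
    """Return (ra, dec) in decimal degrees for a name like 'J162724.5-244103'."""
    match = _IAU.search(name)
    if match is None:
        raise ValueError('not a recognised IAU-style name: %r' % name)
    rah, ram, ras, sign, ded, dem, des = match.groups()
    # 1 hour of right ascension corresponds to 15 degrees
    ra = 15.0 * (int(rah) + int(ram) / 60.0 + float(ras) / 3600.0)
    dec = int(ded) + int(dem) / 60.0 + float(des) / 3600.0
    # take the sign from the string so that -00:30:00 is handled correctly
    if sign == '-':
        dec = -dec
    return ra, dec


ra, dec = iau_name_to_radec('J120000.0-003000.0')
```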
-
class YSOVAR.atlas.YSOVAR_atlas(*args, **kwargs)
The basic structure for the YSOVAR analysis is the YSOVAR_atlas. To initialize an atlas object pass in a numpy array with all the lightcurves:
from YSOVAR import atlas
data = atlas.dict_from_csv('/path/to/my/irac.csv', match_dist = 0.)
MyRegion = atlas.YSOVAR_atlas(lclist = data)
The YSOVAR_atlas is built on top of an astropy.table.Table object (see the astropy documentation). See that documentation for the syntax on how to access the data or add a column.
Some columns are auto-generated when they are first used. Some examples are
- median
- mean
- stddev
- min
- max
- mad (median absolute deviation)
- delta (90% quantile - 10% quantile)
- redchi2tomean
- wmean (uncertainty weighted average).
When you ask for MyRegion['min_36'] it first checks if that column is already present. If not, it adds the new column called min_36 and calculates the minimum of the lightcurve in band 36 for each object in the atlas that has m36 and t36 entries (for magnitude and time in band 36, respectively). Data read with dict_from_csv() automatically has the required format.
More functions may be added to this magic list later. Check:
import YSOVAR.registry
YSOVAR.registry.list_lcfuncs()
to see which functions are implemented. More functions can be added.
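The "calculate on first access" behaviour can be pictured with a small toy class. This is a sketch of the idea only: LazyColumns and its funcs table are invented for this example and are not how YSOVAR_atlas or YSOVAR.registry are implemented.

```python
import statistics


class LazyColumns(dict):
    """Toy stand-in for an atlas: a column is computed when first requested."""

    # hypothetical mini registry: column prefix -> function of one lightcurve
    funcs = {'min': min, 'max': max,
             'mean': statistics.mean, 'median': statistics.median}

    def __init__(self, lclist):
        super().__init__()
        self.lclist = lclist

    def __missing__(self, key):
        # called by dict only if the column does not exist yet
        funcname, band = key.rsplit('_', 1)
        func = self.funcs[funcname]
        column = []
        for lc in self.lclist:
            mags = lc.get('m' + band)
            # objects without data in this band get NaN
            column.append(func(mags) if mags else float('nan'))
        self[key] = column  # store it, so it is not recomputed next time
        return column


toy = LazyColumns([{'m36': [10.2, 10.0, 10.5], 't36': [1.0, 2.0, 3.0]},
                   {'m45': [9.0], 't45': [1.0]}])
min36 = toy['min_36']
```

After the first request the column is stored like any other column, so a second toy['min_36'] is a plain lookup.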
Also, table columns can be added to a YSOVAR_atlas object manually, giving you all the freedom to do arbitrary calculations to arrive at those values.
-
add_catalog_data(catalog, radius=0.0002777777777777778, names=None, **kwargs)
Add information from a different Table
The tables are automatically cross matched and values are copied only for objects that have a counterpart in the current table.
Parameters: catalog : astropy.table.Table
This is the table where the new information is provided.
radius : np.float
matching radius in degrees (the default corresponds to 1 arcsec)
names : list of strings
List of column names that should be copied. If this is None (the default) copy all columns. Column names have to be unique. Thus, make sure that no column of the same name already exists (this will raise an exception).
All other keywords are passed to YSOVAR.atlas.makecrossids() (see there for the syntax).
-
add_mags(data, cross_ids, band, channel)
Add lightcurves to some list of dictionaries
Parameters: data : astropy.table.Table or np.rec.array
data table with new mags
cross_ids : list of lists
for each element in self, cross_ids says which rows in data should be included for this object
band : list of strings
[name of mag, name of error, name of time]
channel : string
name of this channel in the lightcurve. Should be short and unique.
-
autocalc_newcol(name)
Automatically calculate some columns on the fly
-
calc(name, bands, timefilter=None, data_preprocessor=None, colnames=[], colunits=[], coldescriptions=[], coltypes=[], overwrite=True, t_simul=None, **kwargs)
Calculate some quantity for all sources
This is a very general interface to the catalog that allows the user to initiate calculations with any function defined in the registry. A new column is added to the datatable that contains the result. (If the column existed before, it is overwritten.)
Parameters: name : string
name of the function in the function registry for YSOVAR (see registry)
bands : list of strings
Band identifiers. In some cases, it can be useful to calculate a quantity for the error (e.g. the mean error). In this case, just give the band as e.g. 36_error. (This only works for simple functions.)
timefilter : function or None
If not None, this function accepts a np.ndarray of observation times and it should return an index array selecting those times to be included in the calculation. The default function selects all times. An example of how to use this keyword to include certain times only is shown below:
cat.calc('mean', '36', timefilter = lambda x : x < 55340)
data_preprocessor : function or None
If not None, for each source, a row from the present table is extracted as source = self[id]. This yields a YSOVAR_atlas object with one row. This row is passed to data_preprocessor, which can modify the data (e.g. smooth a lightcurve), but should keep the structure of the object intact. Here is an example for a possible data_preprocessor function:
    import numpy as np

    def smooth(source):
        lc = source.lclist[0]
        w = np.ones(5)
        # only smooth lightcurves with more than 10 points in band 36
        if 'm36' in lc and len(lc['m36']) > 10:
            lc['m36'] = np.convolve(w/w.sum(), lc['m36'], mode='valid')
            # need to keep m36 and t36 same length
            lc['t36'] = lc['t36'][0:len(lc['m36'])]
        return source
colnames : list of strings
Basenames of columns to hold the output of the calculation. Columns that are not present already are added automatically with dtype = np.float, and the band is appended to the basename:
cat.calc('mean', '36', colnames = ['stuff'])
would add the column stuff_36. If this is left empty, its default is set based on the function. If colnames has fewer elements than the function returns, only the first few are kept.
colunits : list of strings
Units for the output columns. If this is left empty, its default is set based on the function. If colunits has fewer elements than there are new columns, the unit of the remaining columns will be None.
coldescriptions : list of strings
Descriptions for the output columns. If this is left empty, its default is set based on the function. If coldescriptions has fewer elements than there are new columns, the description of the remaining columns will be ''.
coltypes : list
List of dtypes for autogenerated columns. If the list is empty, it is set to the default of the function called.
overwrite : bool
If True, values in existing columns are silently overwritten.
t_simul : float
max distance in days to accept datapoints in band 1 and 2 as simultaneous. In L1688 and IRAS 20050+2720 the distance between band 1 and 2 coverage is within a few minutes, so a small number is sufficient to catch everything and to avoid false matches. If None is given, this defaults to self.t_simul.
All remaining keywords are passed to the function identified by name.
-
calc_allstats(band)
Calculate all simple statistical descriptors for a single band
This function calculates all simple statistical quantities that can be autogenerated for a certain band. The required columns are added to the data table.
This does not include periodicity, which requires certain user-selected parameters.
Parameters: band : string
name of band for which the calculation should be performed
-
classify_SED_slope(bands=['mean_36', 'mean_45', 'Kmag', '3.6mag', '4.5mag', '5.8mag', '8.0mag'], colname='IRclass')
Classify the SED slope of an object
This function calculates the SED slope for each object according to the prescription outlined by Luisa in the big data paper.
It uses all available datapoints in the IR from the bands given. If no measurement is present (e.g. missing or upper limit only) this band is ignored. The procedure performs a least-squares fit with equal weight for each band and then classifies the resulting slope into class I, flat-spectrum, II and III sources.
Parameters: bands : list of strings
List of names for all bands to be used. Bands must be defined in YSOVAR.atlas.sed_bands.
colname : string
The classification will be placed in this column. If it exists it is overwritten.
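The last step, turning a fitted slope into a class, might look like the sketch below. The boundaries 0.3, -0.3 and -1.6 are the values commonly used in the literature for the infrared spectral index; that classify_SED_slope applies exactly these boundaries, and the class labels used here, are assumptions of this example.

```python
def classify_slope(alpha):
    """Map an SED slope to a class label.

    Assumed boundaries (not taken from the YSOVAR source):
    alpha >= 0.3 -> 'I', -0.3 <= alpha < 0.3 -> 'F' (flat spectrum),
    -1.6 <= alpha < -0.3 -> 'II', alpha < -1.6 -> 'III'.
    """
    if alpha != alpha:  # NaN: no slope could be fitted
        return ''
    if alpha >= 0.3:
        return 'I'
    if alpha >= -0.3:
        return 'F'
    if alpha >= -1.6:
        return 'II'
    return 'III'


classes = [classify_slope(a) for a in (0.5, 0.0, -1.0, -2.0)]
```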
-
is_there_a_good_period(power, minper, maxper, bands=['36', '45'], FAP=False)
Check if a strong periodogram peak is found
This method checks if a period exists with the required power and period in any of the bands given in bands. If the peaks in several bands fulfill the criteria, then the band with the peak of highest power is selected. Output is placed in the columns good_peak, good_FAP and good_period.
New columns are added to the datatable that contain the result. (If a column existed before, it is overwritten.)
Parameters: power : float
minimum power or maximal FAP for “good” period
minper : float
lowest period which is considered
maxper : float
maximum period which is considered
bands : list of strings
Band identifiers, e.g. ['36', '45'], can also be a list with one entry, e.g. ['36']
FAP : boolean
If True, then power is interpreted as maximal FAP for a good period; if False then power means the minimum power a peak in the periodogram must have.
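For a single source, the selection described above can be sketched like this. The function pick_good_period and the layout of peaks (band mapped to period, peak power and FAP) are hypothetical; treating minper and maxper as inclusive limits is also an assumption.

```python
def pick_good_period(peaks, power, minper, maxper, FAP=False):
    """Return (band, period, peak, fap) of the best acceptable peak, or None.

    peaks: dict band -> (period, peak_power, fap); hypothetical input format.
    """
    best = None
    for band, (period, peak, fap) in peaks.items():
        if not (minper <= period <= maxper):
            continue
        # FAP=True: power is the largest acceptable false alarm probability
        # FAP=False: power is the smallest acceptable peak power
        good = (fap <= power) if FAP else (peak >= power)
        # among acceptable bands keep the one with the highest peak power
        if good and (best is None or peak > best[2]):
            best = (band, period, peak, fap)
    return best


peaks = {'36': (5.0, 12.0, 0.01), '45': (5.1, 20.0, 0.001)}
best = pick_good_period(peaks, power=10.0, minper=2.0, maxper=20.0)
```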
-
lclist
-
t_simul = 0.01
-
YSOVAR.atlas.check_dataset(data, min_number_of_times=5, match_dist=0.0002777777777777778)
Check dataset for anomalies, cross-match problems etc.
Of course, not every problem can be detected here, but every time I find something I add a check so that next time this routine will warn me of the same problem.
Parameters: data : list of dicts
as read in with e.g. dict_from_csv
-
YSOVAR.atlas.coord_CDS2RADEC(dat)
Transform RA and DEC from CDS table to degrees
CDS tables have a certain format of string columns to store coordinates (RAh, RAm, RAs, DE-, DEd, DEm, DEs). This procedure parses that and calculates new values for RA and DEC in degrees. These are added to the Table as RAdeg and DEdeg.
Parameters: dat : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table
with columns in the CDS format (e.g. from reading a CDS table with astropy.io.ascii)
-
YSOVAR.atlas.coord_add_RADEfromhmsdms(dat, rah, ram, ras, design, ded, dem, des)
Transform RA and DEC in table from hms, dms to degrees
Parameters: dat : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table
with columns in the CDS format (e.g. from reading a CDS table with astropy.io.ascii)
rah, ram, ras, ded, dem, des : np.ndarray
RA and DEC hms, dms values
design : +1 or -1
Sign of the DE coordinate (integer or float, not string)
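The arithmetic behind this conversion is short. The following is a scalar sketch with a made-up helper name (hmsdms_to_deg); the real function works on table columns and adds the result to dat.

```python
def hmsdms_to_deg(rah, ram, ras, design, ded, dem, des):
    """Convert hms / dms values to (ra, dec) in decimal degrees.

    design is +1 or -1 and carries the sign of the declination, so that
    declinations between -1 and 0 degrees keep their sign.
    """
    # 24 hours of right ascension span 360 degrees: 1 h = 15 deg
    ra = 15.0 * (rah + ram / 60.0 + ras / 3600.0)
    dec = design * (abs(ded) + dem / 60.0 + des / 3600.0)
    return ra, dec


ra_deg, dec_deg = hmsdms_to_deg(1, 30, 0.0, -1, 0, 30, 0.0)
```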
-
YSOVAR.atlas.coord_hmsdms2RADEC(dat, ra=['RAh', 'RAm', 'RAs'], dec=['DEd', 'DEm', 'DEs'])
Transform RA and DEC from table to degrees
Tables where RA and DEC are encoded as three numeric columns each like hh:mm:ss and dd:mm:ss can be converted into decimal deg. This procedure parses that and calculates new values for RA and DEC in degrees. These are added to the Table as RAdeg and DEdeg.
Warning
This format is ambiguous for sources with dec=+/-00:xx:xx, because Python does not differentiate between the integers +0 and -0, so the sign is lost.
Parameters: dat : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table
with columns in the format given above
ra : list of three strings
names of the RA columns for hour, min, sec
dec : list of three strings
names of the DEC columns for deg, min, sec
-
YSOVAR.atlas.coord_strhmsdms2RADEC(dat, ra='RA', dec='DEC', delimiter=':')
Transform RA and DEC from table to degrees
Tables where RA and DEC are encoded as string columns each like hh:mm:ss dd:mm:ss can be converted into decimal deg. This procedure parses that and calculates new values for RA and DEC in degrees. These are added to the Table as RAdeg and DEdeg.
Parameters: dat : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table
with columns in the format given above
ra : string
name of the RA column (strings with hour, min, sec)
dec : string
name of the DEC column (strings with deg, min, sec)
delimiter : string
delimiter between elements, e.g. : in 01:23:34.3.
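Parsing a single string value could look like the sketch below. The helper sexagesimal_to_float is hypothetical and only handles values with exactly three elements; it reads the sign from the string itself, so a value such as -00:30:00 is not affected by the +0 / -0 problem of numeric columns.

```python
def sexagesimal_to_float(text, delimiter=':'):
    """Turn 'dd:mm:ss' or 'hh:mm:ss' into a float in the leading unit."""
    text = text.strip()
    # read the sign from the string, not from the (possibly zero) first number
    sign = -1.0 if text.startswith('-') else 1.0
    first, minutes, seconds = (float(p)
                               for p in text.lstrip('+-').split(delimiter))
    return sign * (first + minutes / 60.0 + seconds / 3600.0)


ra_deg = 15.0 * sexagesimal_to_float('01:30:00.0')  # hours -> degrees
dec_deg = sexagesimal_to_float('-00:30:00')
```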
-
YSOVAR.atlas.dict_cleanup(data, channels, min_number_of_times=0, floor_error={})
Clean up dictionaries after add_ysovar_mags
Each object in the data list can be constructed from multiple sources per band in multiple bands. This function averages the coordinates over all contributing sources, converts and sorts lists of times and magnitudes and makes multiband lightcurves.
Parameters: data : list of dictionaries
as obtained from YSOVAR.atlas.add_ysovar_mags()
channels : dictionary
This dictionary translates the names of channels in the csv file to the names in the output structure, e.g. that for ‘IRAC1’ will be ‘m36’ (magnitudes) and ‘t36’ (times).
min_number_of_times : integer
Remove all lightcurves with fewer than min_number_of_times datapoints from the list
floor_error : dict
Floor errors will be added in quadrature to all error values. The keys in the dictionary should be the same as in the channels dictionary.
Returns: data : list of dictionaries
individual dictionaries are cleaned up as described above
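"Added in quadrature" means that each error e becomes sqrt(e**2 + floor**2). A minimal sketch, with a helper name (add_floor_error) invented for this example and the floor values taken from the defaults of dict_from_csv:

```python
import math


def add_floor_error(errors, floor):
    """Add a floor error in quadrature to every entry of a list of errors."""
    return [math.sqrt(err ** 2 + floor ** 2) for err in errors]


# default floor errors of dict_from_csv, keyed like the channels dictionary
floor_error = {'IRAC1': 0.01, 'IRAC2': 0.007}
new_errors = add_floor_error([0.0, 0.03], floor_error['IRAC1'])
```

An error of zero ends up at the floor value, and large errors are almost unchanged.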
-
YSOVAR.atlas.dict_from_csv(csvfile, match_dist=0.0002777777777777778, min_number_of_times=5, channels={'IRAC2': '45', 'IRAC1': '36'}, data=[], floor_error={'IRAC2': 0.007, 'IRAC1': 0.01}, mag='mag1', emag='emag1', time='hmjd', bg=None, source_name='sname', verbose=True, readra='ra', readdec='de', sourceid='ysovarid', channelcolumn='fname')
Build YSOVAR lightcurves from database csv file
Parameters: csvfile : string or file object
input csv file
match_dist : float
maximum distance to match two positions as one source
min_number_of_times : integer
Remove all sources with fewer than min_number_of_times datapoints from the list
channels : dictionary
This dictionary translates the names of channels in the csv file to the names in the output structure, e.g. that for IRAC1 will be m36 (magnitudes) and t36 (times).
data : list of dicts
New entries will be added to data. It can be empty (the default).
mag : string
name of magnitude column
emag : string
name of column holding the error on the mag
time : string
name of column holding the time of observation
bg : string or None
name of column holding the bg for each observation (None indicates that the bg column is not present).
floor_error : dict
Floor errors will be added in quadrature to all error values. The keys in the dictionary should be the same as in the channels dictionary.
verbose : bool
If True, print progress status.
Returns: data : empty list or list of dictionaries
structure to hold all the information
TBD: Still need to deal with double entries in lightcurve (and check manually...)
-
YSOVAR.atlas.
get_sed
(data, sed_bands={'Bmag': ['e_Bmag', 0.43, 4000.87], '5.8mag': ['e_5.8mag', 5.8, 115.0], 'mean_36': ['e_3.6mag', 3.6, 280.9], 'simbad_B': [None, 0.43, 4000.87], '4.5mag': ['e_4.5mag', 4.5, 179.7], 'Vmag': ['e_Vmag', 0.623, 3597.28], 'Imag': ['e_Imag', 0.798, 2587], 'Kmag': ['e_Kmag', 2.159, 666.7], 'simbad_V': [None, 0.623, 3597.28], 'Rmag': ['e_Rmag', 0.759, 3182], '3.6mag': ['e_3.6mag', 3.6, 280.9], 'mean_45': ['e_4.5mag', 4.5, 179.7], 'imag': ['e_imag', 0.763, 2515.7], 'Jmag': ['e_Jmag', 1.235, 1594], 'rmag': ['e_rmag', 0.622, 3173.3], 'Hmag': ['e_Hmag', 1.662, 1024], '8.0mag': ['e_8.0mag', 8.0, 64.13], '24mag': ['e_24mag', 24.0, 7.14], 'Umag': ['e_Umag', 0.355, 1500], 'nomad_Rmag': [None, 0.759, 3182], 'nomad_Bmag': [None, 0.43, 4000.87], 'Hamag': ['e_Hamag', 0.656, 2974.4], 'nomad_Vmag': [None, 0.623, 3597.28]}, valid=False)¶ make SED by collecting info from the input data
Parameters: data : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table
input data that has arrays of magnitudes for different bands
sed_bands : dict
keys must be the name of the field that contains the magnitudes in each band, entries are lists of [name of error field, wavelength in micron, zero_magnitude_flux_freq in Jy]
valid : bool
If true, return only bands with finite flux, otherwise return all bands that exist in both data and sed_bands.
Returns: wavelen : np.ndarray
central wavelength of bands in micron
mags : np.ndarray
magnitude in band
mags_error : np.ndarray
error on magnitude
sed : np.ndarray
flux in Jy
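The step from magnitude to flux uses the zero-magnitude flux stored as the third entry of each sed_bands value: F = F0 * 10**(-0.4 * mag). A minimal sketch with a hypothetical helper name (mag_to_flux_jy), using the 3.6 micron entry from the default sed_bands shown in the signature:

```python
def mag_to_flux_jy(mag, zero_mag_flux_jy):
    """Convert a magnitude into a flux density in Jy."""
    return zero_mag_flux_jy * 10 ** (-0.4 * mag)


# [name of error field, wavelength in micron, zero-magnitude flux in Jy]
band_info = ['e_3.6mag', 3.6, 280.9]
flux_at_mag0 = mag_to_flux_jy(0.0, band_info[2])
flux_at_mag2p5 = mag_to_flux_jy(2.5, band_info[2])
```

A source that is 2.5 mag fainter has one tenth of the flux.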
-
YSOVAR.atlas.makecrossids(data1, data2, radius, ra1='RAdeg', dec1='DEdeg', ra2='ra', dec2='dec', double_match=False)
Cross-match two lists of coordinates, return closest match
This routine is not very clever and not very fast. It should be fine up to a hundred thousand entries per list.
Parameters: data1 : astropy.table.Table or np.recarray
This is the master data, i.e. for each element in data1, the results will have one (or zero) index numbers in data2, that provide the best match to this entry in data1.
data2 : astropy.table.Table or np.recarray
This data is matched to data1.
radius : np.float or array
maximum radius to accept a match (in degrees); either a scalar or same length as data2
ra1, dec1, ra2, dec2 : string
keys to access RA and DEC (in degrees) in the data, i.e. the routine uses data1[ra1] for the RA values of data1.
double_match : bool
If true, one source in data2 could be matched to several sources in data1. This can happen if a source in data2 lies between two sources of data1, which are both within radius. If this switch is set to False, then a strict one-on-one matching is enforced, selecting the closest pair in the situation above.
Returns: cross_ids : np.ndarray
Will have len(data1). For each element it contains the index of data2 that provides the best match. If no match within radius is found, then the entry will be -99999.
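A brute-force closest-match search can be sketched as below. This is not the code of makecrossids: the function names are made up, the haversine formula is my choice for the angular distance, only a scalar radius is handled, and the result corresponds to double_match=True because no one-on-one matching is enforced.

```python
import math


def ang_dist_deg(ra1, dec1, ra2, dec2):
    """Angular distance in degrees (haversine formula, inputs in degrees)."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def closest_match(coords1, coords2, radius, no_match=-99999):
    """For each (ra, dec) in coords1 return the index of the closest entry
    of coords2 within radius (degrees), or no_match if there is none."""
    cross_ids = []
    for ra1, dec1 in coords1:
        best, best_dist = no_match, radius
        for j, (ra2, dec2) in enumerate(coords2):
            dist = ang_dist_deg(ra1, dec1, ra2, dec2)
            if dist <= best_dist:
                best, best_dist = j, dist
        cross_ids.append(best)
    return cross_ids


master = [(10.0, 0.0), (50.0, 20.0)]
other = [(10.0001, 0.0), (10.0, 0.00005), (200.0, -5.0)]
ids = closest_match(master, other, radius=1.0 / 3600.0)
```

The double loop scales with len(data1) * len(data2), which is the kind of cost the remark about list sizes above refers to.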
-
YSOVAR.atlas.makecrossids_all(data1, data2, radius, ra1='RAdeg', dec1='DEdeg', ra2='ra', dec2='dec', return_distances=False)
Cross-match two lists of coordinates, return all matches within radius
This routine is not very clever and not very fast. It should be fine up to a hundred thousand entries per list.
Parameters: data1 : astropy.table.Table or np.recarray
This is the master data, i.e. for each element in data1, the results will have the index numbers in data2 that match this entry in data1.
data2 : astropy.table.Table or np.recarray
This data is matched to data1.
radius : np.float or array
maximum radius to accept a match (in degrees)
ra1, dec1, ra2, dec2 : string
keys to access RA and DEC (in degrees) in the data, i.e. the routine uses data1[ra1] for the RA values of data1.
return_distances : bool
decide if distances should be returned
Returns: cross_ids : list of lists
Will have len(data1). For each element it contains the indices of data2 that are within radius. If no match within radius is found, then the entry will be [].
distances : list of lists
If return_distances==True this has the same format as cross_ids and contains the distance to the match in degrees.
-
YSOVAR.atlas.merge_lc(d, bands, t_simul=0.01)
Merge lightcurves from several bands
This returns a lightcurve that contains only entries for those times, where all required bands have an entry.
Parameters: d : dictionary
as obtained from YSOVAR.atlas.add_ysovar_mags()
bands : list of strings
labels of the spectral bands to be merged, e.g. ['36', '45']
t_simul : float
max distance in days to accept datapoints in band 1 and 2 as simultaneous. In L1688 and IRAS 20050+2720 the distance between band 1 and 2 coverage is within a few minutes, so a small number is sufficient to catch everything and to avoid false matches.
Returns: tab : astropy.table.Table
This table contains the merged lightcurve and contains times, fluxes and errors.
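The role of t_simul can be illustrated for two bands. The pairing rule in this sketch (closest unused point within t_simul) and the function name simultaneous_pairs are assumptions for illustration; how merge_lc pairs the points internally is not described here.

```python
def simultaneous_pairs(t1, t2, t_simul=0.01):
    """Return index pairs (i, j) with abs(t1[i] - t2[j]) <= t_simul.

    Each point of t2 is used at most once; the closest candidate wins.
    """
    pairs = []
    used = set()
    for i, ta in enumerate(t1):
        candidates = [(abs(ta - tb), j) for j, tb in enumerate(t2)
                      if j not in used and abs(ta - tb) <= t_simul]
        if candidates:
            _, j = min(candidates)
            used.add(j)
            pairs.append((i, j))
    return pairs


t36 = [55000.000, 55001.000, 55002.000]
t45 = [55000.002, 55002.004, 55003.000]
pairs = simultaneous_pairs(t36, t45, t_simul=0.01)
```

Only the paired points would end up in a merged lightcurve; the observation at 55001.000 has no partner in the other band and is dropped.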
-
YSOVAR.atlas.phase_fold(time, period)
Phase fold a set of times on a period
Parameters: time : np.ndarray
array of times
period : np.float
period to fold on, in the same unit as time
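Phase folding maps each time to its position within one period. The sketch below returns the phase as a number between 0 and 1; that phase_fold uses this normalisation is an assumption, since the return value is not documented above, and the name fold_times is made up.

```python
def fold_times(times, period):
    """Return the phase (0 <= phase < 1) of each time for a given period."""
    return [(t % period) / period for t in times]


phases = fold_times([0.0, 2.5, 5.0, 7.5], 5.0)
```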
-
YSOVAR.atlas.radec_from_dict(data, RA='ra', DEC='dec')
Return ra dec numpy array for list of dicts
Parameters: data : list of several dict
RA, DEC : strings
keys for RA and DEC in the dictionary
Returns: radec : np record array with RA, DEC columns
-
YSOVAR.atlas.
sed_slope
(data, sed_bands={'Bmag': ['e_Bmag', 0.43, 4000.87], '5.8mag': ['e_5.8mag', 5.8, 115.0], 'mean_36': ['e_3.6mag', 3.6, 280.9], 'simbad_B': [None, 0.43, 4000.87], '4.5mag': ['e_4.5mag', 4.5, 179.7], 'Vmag': ['e_Vmag', 0.623, 3597.28], 'Imag': ['e_Imag', 0.798, 2587], 'Kmag': ['e_Kmag', 2.159, 666.7], 'simbad_V': [None, 0.623, 3597.28], 'Rmag': ['e_Rmag', 0.759, 3182], '3.6mag': ['e_3.6mag', 3.6, 280.9], 'mean_45': ['e_4.5mag', 4.5, 179.7], 'imag': ['e_imag', 0.763, 2515.7], 'Jmag': ['e_Jmag', 1.235, 1594], 'rmag': ['e_rmag', 0.622, 3173.3], 'Hmag': ['e_Hmag', 1.662, 1024], '8.0mag': ['e_8.0mag', 8.0, 64.13], '24mag': ['e_24mag', 24.0, 7.14], 'Umag': ['e_Umag', 0.355, 1500], 'nomad_Rmag': [None, 0.759, 3182], 'nomad_Bmag': [None, 0.43, 4000.87], 'Hamag': ['e_Hamag', 0.656, 2974.4], 'nomad_Vmag': [None, 0.623, 3597.28]})¶ fit the SED slope to data for all bands in
data and sed_bands
Parameters: data : YSOVAR.atlas.YSOVAR_atlas or astropy.table.Table
input data that has arrays of magnitudes for different bands
sed_bands : dict
keys must be the name of the field that contains the magnitudes in each band, entries are lists of [name of error field, wavelength in micron, zero_magnitude_flux_freq in Jy]
Returns: slope : float
slope of the SED determined with a least squares fit. Returns np.nan if there is too little data.
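An equal-weight least-squares fit of the slope can be sketched as follows. The quantity fitted here is log10(nu F_nu) against log10(wavelength), the usual definition of the infrared spectral index; that sed_slope uses exactly this definition, and the rule of returning NaN for fewer than two usable points, are assumptions of this example, as is the function name.

```python
import math


def fit_sed_slope(wavelengths_um, fluxes_jy):
    """Least-squares slope of log10(nu F_nu) versus log10(wavelength)."""
    # nu * F_nu is proportional to F_nu / wavelength; the constant factor
    # only shifts the intercept and does not change the slope
    points = [(math.log10(w), math.log10(f / w))
              for w, f in zip(wavelengths_um, fluxes_jy)
              if f > 0 and math.isfinite(f)]  # skips NaN and non-positive flux
    n = len(points)
    if n < 2:
        return float('nan')
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return float('nan')
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# F_nu falling as 1/wavelength gives nu F_nu ~ wavelength**-2
slope = fit_sed_slope([1.0, 2.0, 4.0, 8.0], [8.0, 4.0, 2.0, 1.0])
```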
-
YSOVAR.atlas.val_from_dict(data, name)
Return a list of values for one key from a list of dicts
Parameters: data : list of dict
name : string
key for the entry in the dictionary
Returns: col : list of values