Atmospheric retrieval with petitRADTRANS#

This is a tutorial for atmospheric retrievals with petitRADTRANS for which we will use the IRTF spectrum of the L3 type brown dwarf 2MASS J15065441+1321060. The spectrum covers the \(Y\) to \(L\) bands with a resolving power of \(R = 2000\).

Free retrievals are computationally expensive due to the high number of parameter dimensions and the scattering radiative transfer that is important in cloudy atmospheres (see Mollière et al. 2020). Similar to FitModel, the nested sampling supports multiprocessing so it recommended to run the retrievals on a cluster. Before starting, petitRADTRANS should be installed together with the line and continuum opacities (see installation instructions).

When running a retrieval, it is important to comment out any of the functions that access the Database because writing to the HDF5 file is not possible with multiprocessing. Therefore, the data of the planet or brown dwarf should first be added to the database (e.g. on a local machine), then AtmosphericRetrieval and run_multinest can be executed (preferably on a cluster), and finally the results can be extracted and plotted while commenting out the retrieval part or using a separate script (e.g. on a local machine).

Getting started#

We start by adding the library path of MultiNest to the DYLD_LIBRARY_PATH environment variable such that PyMultiNest can find the compiled library (see installation instructions).

[1]:
import os
os.environ['DYLD_LIBRARY_PATH'] = '/Users/tomasstolker/applications/MultiNest/lib'

Next, we import the species toolkit and initiate the workflow with an instance of the SpeciesInit class. This will create the HDF5 database and the configuration file in the working folder. If one of these files was already present, then the existing database and/or configuration will be used.

[2]:
import species
species.SpeciesInit()
Initiating species v0.4.0... [DONE]
Creating species_config.ini... [DONE]
Database: /Users/tomasstolker/applications/species/docs/tutorials/species_database.hdf5
Data folder: /Users/tomasstolker/applications/species/docs/tutorials/data
Working folder: /Users/tomasstolker/applications/species/docs/tutorials
Creating species_database.hdf5... [DONE]
Creating data folder... [DONE]
[2]:
<species.core.init.SpeciesInit at 0x110560b80>

Adding data to the database#

The database is used by species as the central storage for models, data, and results. The data in the HDF5 file is accessed through the Database class. As mentioned already, it is best to add the data beforehand and comment out the database part when executing the retrieval part on a cluster with multi-core processing. To access the HDF5 file, we start by creating an instance of Database.

[3]:
database = species.Database()

Next, we will download the IRTF spectrum of 2MASS J15065441+1321060, which is an L3 type field brown dwarf.

[4]:
import urllib.request
urllib.request.urlretrieve('http://irtfweb.ifa.hawaii.edu/~spex/IRTF_Spectral_Library/Data/L3_2MASSJ1506+1321.fits',
                           'L3_2MASSJ1506+1321.fits')
[4]:
('L3_2MASSJ1506+1321.fits', <http.client.HTTPMessage at 0x110560940>)

Now that we have a FITS file with a spectrum, we use the add_object method of Database to add the spectrum and distance of 2MASS J15065441+1321060 to the Database. The distance was calculated from the parallax of the Gaia Early Data Release 3. We also need to specify the spectral resolution, which is used for smoothing the model spectra to the instrument resolving power. For simplicity, we assume a constant resolving power from \(Y\) to \(L\) band, whereas the actual resolution in the \(L\) band is a bit larger.

[5]:
database.add_object('2MASS J15065441+1321060',
                    distance=(11.71, 0.03),
                    app_mag=None,
                    flux_density=None,
                    spectrum={'IRTF': ('L3_2MASSJ1506+1321.fits', None, 2000.)},
                    deredden=None)
Adding object: 2MASS J15065441+1321060
   - Distance (pc) = 11.71 +/- 0.03
   - Spectrum:
/Users/tomasstolker/applications/species/species/data/database.py:939: UserWarning: Transposing the data of IRTF because the first instead of the second axis has a length of 3.
  warnings.warn(
/Users/tomasstolker/applications/species/species/data/database.py:956: UserWarning: Found 804 fluxes with NaN in the data of IRTF. Removing the spectral fluxes that contain a NaN.
  warnings.warn(
      - Database tag: IRTF
      - Filename: L3_2MASSJ1506+1321.fits
      - Data shape: (5399, 3)
      - Wavelength range (um): 0.93 - 4.10
      - Mean flux (W m-2 um-1): 7.79e-15
      - Mean error (W m-2 um-1): 1.84e-16
   - Spectral resolution:
      - IRTF: 2000.0

Now that the data is present in the database, we can run the retrieval! Since running the actual retrieval takes quite some time, we will simply download the results that were stored (in the output_folder; see below) after running the retrieval on a cluster.

[6]:
import urllib.request
urllib.request.urlretrieve('https://home.strw.leidenuniv.nl/~stolker/species/retrieval.tgz',
                           'retrieval.tgz')
[6]:
('retrieval.tgz', <http.client.HTTPMessage at 0x15049b520>)

Let’s unpack this compressed TAR archive.

[7]:
import tarfile
with tarfile.open('retrieval.tgz') as tar:
    tar.extractall('./')

Atmospheric retrieval of abundances, clouds, and P-T structure#

A retrieval is started by creating an instance of AtmosphericRetrieval. Several parameters need to be provided, including the database tag to the data (i.e. object_name), the line and cloud species that should be included in the forward model (i.e. line_species and cloud_species), and if scattering should be turned on with the radiative transfer (i.e scattering). Scattering will make the calculation of the forward model much slower but is important for cloudy atmospheres.

[8]:
retrieve = species.AtmosphericRetrieval(object_name='2MASS J15065441+1321060',
                                        line_species=['CO_all_iso_HITEMP', 'H2O_HITEMP', 'CH4', 'NH3', 'CO2', 'Na_allard', 'K_allard', 'TiO_all_Exomol', 'VO_Plez', 'FeH', 'H2S'],
                                        cloud_species=['MgSiO3(c)_cd', 'Fe(c)_cd'],
                                        output_folder='multinest',
                                        wavel_range=(0.9, 4.2),
                                        scattering=True,
                                        inc_spec=True,
                                        inc_phot=False,
                                        pressure_grid='smaller',
                                        weights=None)
Object: 2MASS J15065441+1321060
Distance: 11.71
Line species:
   - CO_all_iso_HITEMP
   - H2O_HITEMP
   - CH4
   - NH3
   - CO2
   - Na_allard
   - K_allard
   - TiO_all_Exomol
   - VO_Plez
   - FeH
   - H2S
Cloud species:
   - MgSiO3(c)_cd
   - Fe(c)_cd
Line-by-line species: None
Scattering: True
Getting object: 2MASS J15065441+1321060... [DONE]
Spectroscopic data:
   - IRTF
     Wavelength range (um) = 0.93 - 4.10
     Spectral resolution = 2000.00
Initiating 180 pressure levels (bar): 1.00e-06 - 1.00e+03
Weights for the log-likelihood function:
   - IRTF = 1.00e+00

Next, we can run the actual retrieval with run_multinest. The nested sampling algorithm supports multiprocessing so make sure to use MPI when running on a cluster. For testing purpose this is not required though since the retrieval also runs without MPI.

The run_multinest method has some parameters that are used by the forward model (e.g. specifying the type of chemistry with chemistry and the P-T parametrization with pt_profile). There is also a dictionary required as argument of bounds, to set the boundaries of the priors that are either uniform or log-uniform priors. The latter is for parameters typically starting with log_. There is an additional parameter called prior, which can be used as Gaussian prior on any of the parameters (including the planet mass).

Not all the parameters that are included in bounds are mandatory so some will only be included in the model if they are provided in the dictionary. Furthermore, there are various parametrizations for the clouds available. The model that is used is partially determined by the parameters that are included. In fact, in the example below, only the surface gravity (logg) and radius (radius) are mandatory parameters.

In this case, the C/O ratio and metallicity are free parameter because we are using a chemical equilibrium model. We also retrieve the cloud mass fractions relative to the equilibrium abundances (mgsio3_fraction and fe_fraction), the sedimentation parameter (fsed; determines the vertical extent of the clouds), the eddy diffusion coefficient (log_kzz; determines the particles sizes of the clouds), and the width of the log-normal size distribution (sigma_lnorm) for the cloud particles.

With plotting=True, each model iteration would save several plots, which can be inspected for testing purpose. It is best set the argument to False when running the full retrieval. There is also the possibility to continue a retrieval that was interrupted by setting resume=True (as long as the input parameters have not changed) because MultiNest continuously updates the data in the output_folder. Since we have downloaded the completed retrieval results, we will set resume=True. In this case, the retrieval will finish directly because the posterior distribution was already fully sampled at the requested accuracy, so in principle the run_multinest could also be skipped.

[9]:
retrieve.run_multinest(bounds={'logg': (2.5, 6.0),
                               'c_o_ratio': (0.1, 1.5),
                               'metallicity': (-3., 3.),
                               'radius': (0.5, 2.),
                               'mgsio3_fraction': (-2., 1.),
                               'fe_fraction': (-2., 1.),
                               'fsed': (0., 20.),
                               'log_kzz': (2., 15.),
                               'sigma_lnorm': (1.2, 5.)},
                       chemistry='equilibrium',
                       quenching='pressure',
                       pt_profile='monotonic',
                       fit_corr=None,
                       n_live_points=1000,
                       resume=True,
                       plotting=False,
                       pt_smooth=0.3)
Importing petitRADTRANS... [DONE]
Importing chemistry module... [DONE]
Importing rebin module... [DONE]
Setting up petitRADTRANS...
  Read line opacities of CO_all_iso_HITEMP...
 Done.
  Read line opacities of H2O_HITEMP...
 Done.
  Read line opacities of CH4...
 Done.
  Read line opacities of NH3...
 Done.
  Read line opacities of CO2...
 Done.
  Read line opacities of Na_allard...
 Done.
  Read line opacities of K_allard...
 Done.
  Read line opacities of TiO_all_Exomol...
 Done.
  Read line opacities of VO_Plez...
 Done.
  Read line opacities of FeH...
 Done.
  Read line opacities of H2S...
 Done.

  Read in opacity of cloud species MgSiO3 ...
  Read in opacity of cloud species Fe ...
 Done.

  Read CIA opacities for H2-H2...
  Read CIA opacities for H2-He...
 Done.

Fitting 25 parameters:
   - logg
   - radius
   - t0
   - t1
   - t2
   - t3
   - t4
   - t5
   - t6
   - t7
   - t8
   - t9
   - t10
   - t11
   - t12
   - t13
   - t14
   - metallicity
   - c_o_ratio
   - log_p_quench
   - fsed
   - log_kzz
   - sigma_lnorm
   - mgsio3_fraction
   - fe_fraction
Number of pressure levels used with the radiative transfer: 60
Storing the model parameters: multinest/params.json
Storing the Radtrans arguments: multinest/radtrans.json
Sampling the posterior distribution with MultiNest...
/Users/tomasstolker/.pyenv/versions/3.9.6/envs/species3.9/lib/python3.9/site-packages/pymultinest/run.py:208: DeprecationWarning: inspect.getargspec() is deprecated since Python 3.0, use inspect.signature() or inspect.getfullargspec()
  nargs = len(inspect.getargspec(LogLikelihood).args) - inspect.ismethod(LogLikelihood)
 *****************************************************
 MultiNest v3.10
 Copyright Farhan Feroz & Mike Hobson
 Release Jul 2015

 no. of live points = 1000
 dimensionality =   25
 running in constant efficiency mode
 resuming from previous job
 *****************************************************
 Starting MultiNest
Acceptance Rate:                        0.099888
Replacements:                             144352
Total Samples:                           1445138
Nested Sampling ln(Z):             122069.213373
Importance Nested Sampling ln(Z):  122071.270711 +/-  0.023535
 ln(ev)=   122069.27548137963      +/-  0.36713385315822933
 Total Likelihood Evaluations:      1445138
 Sampling finished. Exiting MultiNest

As expected, the retrieval finishes directly after starting because it read the completed results from the output_folder.

Nested sampling output folder#

The output data from the nested sampling with MultiNest is stored in the output_folder. After running the retrieval, we can use the add_retrieval method of Database to store the posterior samples from the output_folder, together with the relevant attributes, in the HDF5 database. The argument of tag is used as name tag in the database and will be used later on for processing the results. It is also possible to calculate \(T_\mathrm{eff}\) for each sample but this takes a long time because each spectrum needs to be calculated over a broad wavelength range.

[10]:
database.add_retrieval(tag='2mj1506',
                       output_folder='multinest',
                       inc_teff=False)
Storing samples in the database... [DONE]

Instead of using the inc_teff parameter of add_retrieval, we use the get_retrieval_teff method to estimate \(T_\mathrm{eff}\) from a small number of samples (30 in the example below). The value is stored in the database as attribute of the group that is specified with the argument of tag.

[11]:
database.get_retrieval_teff(tag='2mj1506',
                            random=30)
Calculating Teff from 30 posterior samples...
Importing petitRADTRANS... [DONE]
  Read line opacities of CO_all_iso_HITEMP...
 Done.
  Read line opacities of H2O_HITEMP...
 Done.
  Read line opacities of CH4...
 Done.
  Read line opacities of NH3...
 Done.
  Read line opacities of CO2...
 Done.
  Read line opacities of Na_allard...
 Done.
  Read line opacities of K_allard...
 Done.
  Read line opacities of TiO_all_Exomol...
 Done.
  Read line opacities of VO_Plez...
 Done.
  Read line opacities of FeH...
 Done.
  Read line opacities of H2S...
 Done.

  Read in opacity of cloud species MgSiO3 ...
  Read in opacity of cloud species Fe ...
 Done.

  Read CIA opacities for H2-H2...
  Read CIA opacities for H2-He...
 Done.

Getting posterior spectra 30/30... [DONE]
Teff (K) = 1878.36 -0.29 +0.18
Storing Teff as attribute of results/fit/2mj1506/samples... [DONE]
[11]:
(1878.3065052723323, 0.26026231664840904)

Plotting the posterior distributions#

We can now read the posterior samples from the database and use the plot functionalities to visualize the results. Let’s first plot the marginalized posterior distributions with the plot_posterior function. This function makes use of the corner.py package. Since there are many free parameters, we will exclude the parameters for P-T profile by setting inc_pt_param=False. By setting output=None, the plot is shown instead of written to a file.

[12]:
species.plot_posterior(tag='2mj1506',
                       offset=(-0.3, -0.35),
                       vmr=False,
                       inc_luminosity=False,
                       inc_mass=False,
                       inc_pt_param=False,
                       inc_loglike=False,
                       output=None)
Median sample:
   - logg = 5.63e+00
   - radius = 9.36e-01
   - t0 = 1.27e+03
   - t1 = 1.27e+03
   - t2 = 1.27e+03
   - t3 = 1.27e+03
   - t4 = 1.27e+03
   - t5 = 1.27e+03
   - t6 = 1.28e+03
   - t7 = 1.29e+03
   - t8 = 1.42e+03
   - t9 = 1.69e+03
   - t10 = 1.97e+03
   - t11 = 2.59e+03
   - t12 = 3.54e+03
   - t13 = 5.00e+03
   - t14 = 1.21e+04
   - metallicity = 9.11e-01
   - c_o_ratio = 6.11e-01
   - log_p_quench = -4.30e+00
   - fsed = 7.56e+00
   - log_kzz = 5.42e+00
   - sigma_lnorm = 1.33e+00
   - mgsio3_fraction = -1.13e+00
   - fe_fraction = -8.35e-01
   - distance = 1.17e+01
   - pt_smooth = 3.00e-01
Plotting the posterior... [DONE]
../_images/tutorials_atmospheric_retrieval_32_1.png

ReadRadtrans and random spectra#

In order to post-process the posterior samples, we need to recreate the Radtrans object of petitRADTRANS. Todo so, we use the get_retrieval_spectra method of Database to create and instance of ReadRadtrans with the adopted parameters that were set with the retrieval. The Radtrans object is stored as an attribute of the ReadRadtrans object. The ReadRadtrans object can simply be passed to some of the species functions later on and can typically be ignored by the user. The method also returns a list of random spectra (30 in the example below) that have been recalculated at a resolving power of \(R = 2000\). Each spectrum is stored in a ModelBox together with the atmospheric parameters.

[13]:
samples, radtrans = database.get_retrieval_spectra(tag='2mj1506',
                                                   random=30,
                                                   wavel_range=(0.5, 6.),
                                                   spec_res=2000.)
Importing petitRADTRANS... [DONE]
  Read line opacities of CO_all_iso_HITEMP...
 Done.
  Read line opacities of H2O_HITEMP...
 Done.
  Read line opacities of CH4...
 Done.
  Read line opacities of NH3...
 Done.
  Read line opacities of CO2...
 Done.
  Read line opacities of Na_allard...
 Done.
  Read line opacities of K_allard...
 Done.
  Read line opacities of TiO_all_Exomol...
 Done.
  Read line opacities of VO_Plez...
 Done.
  Read line opacities of FeH...
 Done.
  Read line opacities of H2S...
 Done.

  Read in opacity of cloud species MgSiO3 ...
  Read in opacity of cloud species Fe ...
 Done.

  Read CIA opacities for H2-H2...
  Read CIA opacities for H2-He...
 Done.

Getting posterior spectra 30/30... [DONE]

Let’s have a look at the content of first ModelBox by using the open_box method.

[14]:
samples[0].open_box()
Opening ModelBox...
model = petitradtrans
type = mcmc
wavelength = [0.50018561 0.50068604 0.50118698 ... 5.98480622 5.99079402 5.99678781]
flux = [7.32114683e-17 7.22185847e-17 7.13031149e-17 ... 4.76919325e-16
 4.64840278e-16 4.87250011e-16]
parameters = {'logg': 5.6291986156652465, 'radius': 0.935744976888985, 'distance': 11.71, 't0': 1263.955805276215, 't1': 1263.960685842528, 't2': 1263.9841885576873, 't3': 1264.082788575479, 't4': 1264.376871521415, 't5': 1265.0094174003439, 't6': 1278.909878683883, 't7': 1297.3412002386997, 't8': 1424.8332095764874, 't9': 1689.2781923980383, 't10': 1965.4314811515796, 't11': 2592.336749711141, 't12': 3537.179224960164, 't13': 5005.745467842519, 't14': 12123.335414971883, 'pt_smooth': 0.3, 'c_o_ratio': 0.6106362084583902, 'metallicity': 0.9088223117627487, 'log_p_quench': -4.446186143458798, 'fsed': 7.5001695805339565, 'sigma_lnorm': 1.3241474639828645, 'log_kzz': 5.427093933818429, 'mgsio3_fraction': -1.140464620930166, 'fe_fraction': -0.8365908259110784, 'mass': 143.84252294698956}
quantity = flux
contribution = [[4.92221626e-10 4.51295881e-10 4.63108286e-09 ... 1.94790171e-06
  3.03387750e-06 1.41816204e-06]
 [6.82333682e-10 6.29572594e-10 6.77218121e-09 ... 2.75628062e-06
  4.29210049e-06 2.00645539e-06]
 [9.21978150e-10 8.58254679e-10 9.35613317e-09 ... 3.89979701e-06
  6.07117305e-06 2.83840242e-06]
 ...
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 ... 0.00000000e+00
  0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 ... 0.00000000e+00
  0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 ... 0.00000000e+00
  0.00000000e+00 0.00000000e+00]]
bol_flux = None

Plotting the P-T profiles, opacities, and clouds#

We will now create some additional plots for analyzing the retrieved atmospheric structure. First, we will use the plot_pt_profile function for selecting random samples (100 in the example below) from the posterior distribution and recalculating the P-T profiles. We set the database tag were the posterior samples and attributes are stored and additionally the extra_axis (i.e. the top \(x\) axis) that we want to use for showing the average particle radius of the MgSiO\(_3\) and Fe clouds as function of pressure. These are average particle sizes but particle smaller than 1 nm are excluded from the actual size distribution. The condensation profiles of these cloud species are also shown with colored dashed lines.

The parametrization of the P-T profiles used 15 free nodes that are monotonically increasing with pressure and shown for the median parameters. The free nodes are interpolated and smoothed with a Gaussian kernel of \(\sigma = 0.3\) dex in pressure, hence there is a slight difference between the sampled temperature nodes and the actual P-T profiles.

The P-T profile from the median sample is shown with a black solid line will the 100 random samples are shown with thin gray lines. Given the large number of spectral data points, we get a very good constraints on all retrieved parameters, so it would probably warrant the inclusion of additional parameters to give the model more freedom.

[15]:
species.plot_pt_profile(tag='2mj1506',
                        random=100,
                        xlim=(500., 6000.),
                        offset=(-0.07, -0.14),
                        output=None,
                        radtrans=radtrans,
                        extra_axis='grains',
                        rad_conv_bound=False)
Plotting the P-T profiles... [DONE]
../_images/tutorials_atmospheric_retrieval_40_1.png

Next, we use the plot_opacities function for plotting the line and continuum opacities for the atmospheric structure of the median retrieved parameters. We pass again the ReadRadtrans object as argument of radtrans, which was earlier returned by get_retrieval_spectra. The single scattering albedo is calculated as \(\omega = \frac{\kappa_\mathrm{cont,scat}}{\kappa_\mathrm{line} + \kappa_\mathrm{cont,abs} + \kappa_\mathrm{cont,scat}}\). The scattering opacity is dominated by the cloud decks of the two clouds species but the \(\lambda^{-4}\) dependence of the Rayleigh scattering form gas is also seen in the mass opacities, but the impact on the spectrum at infrared wavelengths is typically negligible.

[16]:
species.plot_opacities(tag='2mj1506',
                       offset=(-0.1, -0.14),
                       output=None,
                       radtrans=radtrans)
Plotting opacities: None... [DONE]
../_images/tutorials_atmospheric_retrieval_42_1.png

For the cloud model, we used the parametrization from Mollière et al. (2020), which determines the cloud base through the intersection of the condensation profiles of the cloud species with the P-T profile. The vertical cloud structure is controlled by the \(f_\mathrm{sed}\) parameter and the average particle sizes by the eddy diffusion coefficient, \(K_\mathrm{zz}\). Furthermore, it uses a log-normal size distribution for the cloud particle with the width, \(\sigma_\mathrm{g}\), as free parameter. We can plot the size distributions as function of pressure with the plot_clouds. Here, we need to specify one of the two cloud species that were included with the retrieval and we need to provide again the ReadRadtrans object.

[17]:
species.plot_clouds(tag='2mj1506',
                    offset=(-0.12, -0.15),
                    output=None,
                    radtrans=radtrans,
                    composition='MgSiO3')
Plotting MgSiO3 clouds... [DONE]
../_images/tutorials_atmospheric_retrieval_44_1.png

Creating a Box with the observed spectrum#

After getting an impression of the retrieved atmospheric structure, we will create a plot of the data and model spectra. To do so, we need several functions to extract the relevant data from the database. First, we create an ObjectBox with the data (only a spectrum in this case) of 2MASS J15065441+1321060 by using the get_object method of Database.

[18]:
object_box = database.get_object('2MASS J15065441+1321060')
Getting object: 2MASS J15065441+1321060... [DONE]

Let’s have a look at the content of the ObjectBox with open_box. This method can be used on any Box object.

[19]:
object_box.open_box()
Opening ObjectBox...
name = 2MASS J15065441+1321060
filters = []
mean_wavel = {}
magnitude = {}
flux = {}
distance = [11.71  0.03]
spectrum = {'IRTF': (array([[9.3171155e-01, 6.2615632e-15, 1.2361384e-15],
       [9.3198341e-01, 5.5975380e-15, 1.1189908e-15],
       [9.3225533e-01, 6.5201217e-15, 9.6139224e-16],
       ...,
       [4.0973549e+00, 1.8983273e-15, 4.6716408e-16],
       [4.0981603e+00, 1.8976329e-15, 4.6592032e-16],
       [4.0989652e+00, 1.6915271e-15, 4.6575150e-16]], dtype=float32), None, None, 2000.0)}

Best-fit spectrum and residuals#

We already drew 30 random spectra from the posterior distribution with get_retrieval_spectra, but in addition we will compute the spectrum with the best-fit parameters. Here, we adopt the median retrieved parameters as the best-fit values. We start extracting the median values with the get_median_sample method of Database.

[20]:
best = database.get_median_sample(tag='2mj1506')

The atmospheric parameters are stored in a dictionary. Some of the parameters, such as the distance and pt_smooth may not have been free parameters with the retrieval though, but are required for computing the spectrum.

Next, we will compute the petitRADTRANS spectrum for the best-fit parameters at \(R = 2000\). The get_model method of the ReadRadtrans object requires a the dictionary with parameters as argument. We also set plot_contribution=True such that a plot of the emission contribution as function of pressure wavelength is shown.

[21]:
model_box = radtrans.get_model(model_param=best,
                               spec_res=2000.,
                               wavel_resample=None,
                               plot_contribution=True)
../_images/tutorials_atmospheric_retrieval_55_0.png

With the parameter dictionary, the ObjectBox with data, and the ReadRadtrans object, we can calculate the residuals (i.e. data minus model) of the best-fit spectrum. To do so, we use the get_residuals function, which calculates the residuals (relative to the data uncertainties) for each spectrum (only one in this case) and photometric flux, together with the reduced \(\chi^2\).

[22]:
res_box = species.get_residuals(datatype='model',
                                spectrum='petitradtrans',
                                parameters=best,
                                objectbox=object_box,
                                inc_phot=True,
                                inc_spec=True,
                                radtrans=radtrans)
Calculating residuals... [DONE]
Residuals (sigma):
   - IRTF: min: -33.64, max: 47.57
Reduced chi2 = 26.52
Number of degrees of freedom = 5373

Let’s have a look at the content of the returned ResidualsBox with open_box.

[23]:
res_box.open_box()
Opening ResidualsBox...
name = 2MASS J15065441+1321060
photometry = None
spectrum = {'IRTF': array([[ 0.93171155, -1.54446644],
       [ 0.93198341, -2.2995716 ],
       [ 0.93225533, -1.30490749],
       ...,
       [ 4.09735489, -0.08044848],
       [ 4.09816027, -0.08215352],
       [ 4.09896517, -0.63053548]])}
chi2_red = 26.518987161160503

For comparison, we calculate a spectrum without the clouds to see their effect of the spectrum. This is done by simply setting the cloud mass fractions to \(10^{-100}\) and rerunning get_model with the updated parameter dictionary.

[24]:
no_cloud = best.copy()
no_cloud['mgsio3_fraction'] = -100.
no_cloud['fe_fraction'] = -100.
model_no_cloud = radtrans.get_model(no_cloud, spec_res=2000.)

Furthermore, we calculate a spectrum without accounting for scattering from (mainly) clouds to see the importance of scattering in the radiative transfer of cloudy atmospheres. We need to set the do_scat_emis and test_ck_shuffle_comp attributes of the Radtrans object (i.e. the rt_object of ReadRadtrans ) to False and rerun get_model with the best-fit parameters.

[25]:
radtrans.rt_object.do_scat_emis = False
radtrans.rt_object.test_ck_shuffle_comp = False
model_no_scat = radtrans.get_model(best, spec_res=2000.)

Plotting the SED with data and model spectra#

We are now ready to create a plot of the spectral energy distribution (SED) to compare the data with the model spectra! The plot_spectrum function requires a list of Box objects as argument of boxes and the ResidualsBox is passed as separate argument of residuals. For each box we can set the plot style, by providing a list with dictionaries as argument of plot_kwargs, in the same order as the list of boxes. Items in the list can be set to None, in which case some default values are used. Finally, there is a handful of parameters that can be adjusted for the appearance of the plot (see the API documentation of plot_spectrum for details).

[26]:
species.plot_spectrum(boxes=[samples, model_box, model_no_cloud, model_no_scat, object_box],
                      filters=None,
                      plot_kwargs=[{'ls': '-', 'lw': 0.1, 'color': 'gray'},
                                   {'ls': '-', 'lw': 0.5, 'color': 'black'},
                                   {'ls': '-', 'lw': 0.3, 'color': 'tab:orange', 'label': 'No clouds'},
                                   {'ls': '-', 'lw': 0.3, 'color': 'tab:green', 'label': 'No scattering'},
                                   {'IRTF': {'ls': '-', 'lw': 0.3, 'color': 'tab:blue', 'label': '2MASS J15065441+1321060'}}],
                      residuals=res_box,
                      xlim=(0.8, 4.2),
                      ylim=(0., 2.2e-14),
                      ylim_res=(-9., 9.),
                      scale=('linear', 'linear'),
                      offset=(-0.6, -0.05),
                      figsize=(8, 3),
                      legend=[{'loc': 'upper right', 'fontsize': 10.}, None],
                      output=None)
Plotting spectrum... [DONE]
../_images/tutorials_atmospheric_retrieval_66_1.png

We can clearly see the impact of the clouds, which dampens the molecular absorption bands. The comparison also shows that scattering is important since the spectral fluxes would be overestimated otherwise. Finally, let’s zoom in on the \(Y\) and \(J\) bands to reveal the spectral features from species such as H\(_2\)O, FeH, Na, and K more clearly.

[27]:
species.plot_spectrum(boxes=[samples, model_box, model_no_cloud, object_box],
                      filters=None,
                      plot_kwargs=[{'ls': '-', 'lw': 0.1, 'color': 'gray'},
                                   {'ls': '-', 'lw': 0.5, 'color': 'black'},
                                   {'ls': '--', 'lw': 0.3, 'color': 'tab:orange', 'label': 'No clouds'},
                                   {'IRTF': {'ls': '-', 'lw': 0.5, 'color': 'tab:blue', 'label': '2MASS J15065441+1321060'}}],
                      residuals=res_box,
                      xlim=(0.9, 1.4),
                      ylim=(0.2e-14, 2e-14),
                      ylim_res=(-9., 9.),
                      scale=('linear', 'linear'),
                      offset=(-0.6, -0.05),
                      figsize=(8, 3),
                      legend=[{'loc': 'lower right', 'fontsize': 10.}, None],
                      output=None)
Plotting spectrum... [DONE]
../_images/tutorials_atmospheric_retrieval_68_1.png