Overview of `cube_cut.py`
The `cube_cut.py` script houses a single class: `CutoutFactory`. The class reads a cube file generated by either `TicaCubeFactory` or `CubeFactory` and generates cutouts of a user-designated size, centered on a user-designated pair of coordinates. Because of the way the TESS mission operates (taking observations of large "sectors" of the sky, incrementally over time), individual TESS science products contain image data that may or may not have the target of interest centered in the field of view, and may contain data that is not useful or relevant to your science. This can make analysis difficult when working with the full CCD images of TESS observations.
The advantage of creating cutouts from these images is that it makes analysis easier by centering your target in the field of view so it is easily identifiable. The cutouts are stacked along the time axis and aligned with the WCS solutions provided by `CutoutFactory`, so flux/error values can be extracted easily to generate light curves. Users working with moving targets can also benefit from `CutoutFactory`, as it supports making cutout files for moving targets from Mission-delivered SPOC cubes. Note that TICA cutouts for moving targets were not yet available when this documentation was last updated.
The `CutoutFactory` class takes in a TESS observation cube (of product type SPOC or TICA) and outputs a cutout file compatible with a TESS target pixel file (TPF), so that pipelines written for Mission TPFs can also be applied to these cutouts. The image data is stored in the second extension (`EXT=1`) of the HDU list; see the Astrocut documentation for more information on the cutout TPF format. The cutout file contains cutouts of the FFIs from the cube stack, with the designated target centered in the frame. The main method of `CutoutFactory` is `cube_cut`, which calls all of the other methods in `CutoutFactory` to generate the cutout (a minimal usage sketch is given below). The methods are called in the following order:
1. `__init__`
2. `_parse_table_info`
3. `_get_cutout_limits`
4. `_get_cutout`
5. `_get_full_cutout_wcs`
6. `_fit_cutout_wcs`
7. `_get_cutout_wcs_dict`
8. `_build_tpf`
The functionality of each of these methods is described briefly in the sections that follow. If you have questions or comments about this wiki page, please file an issue under the Issues tab of this repository or contact us through the Archive Help Desk at [email protected].
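Before walking through the individual methods, here is a minimal sketch of how `cube_cut` is typically invoked; the file name, coordinates, and cutout size are placeholders, and the keyword arguments shown are indicative rather than exhaustive.

```python
from astrocut import CutoutFactory

my_cutter = CutoutFactory()

# Placeholder cube file produced by CubeFactory or TicaCubeFactory
cube_file = "tess-s0027-4-3-cube.fits"

# 25x25 pixel cutout centered on the given (RA, Dec), in degrees
cutout_file = my_cutter.cube_cut(
    cube_file,
    coordinates="251.51 32.36",
    cutout_size=25,
    verbose=True,
)
print(cutout_file)   # path to the resulting cutout target pixel file
```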
The `__init__` method declares an attribute called `img_kwds`, which holds the image header keywords that will be carried into the cutout TPF, along with other attributes that are used later in the class.
For the WCS alignment, the FFI in the middle of the cube stack is used as the reference image from which the WCS information is taken. In `_parse_table_info`, the middle FFI is selected, its WCS information is extracted and assigned to the `cube_wcs` attribute, and its image header keywords are stored in the `img_kwds` attribute.
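A rough sketch of this idea is shown below. It assumes the cube layout described in the Astrocut documentation, where the cube file's third HDU is a table holding the individual FFI header keywords (one row per FFI); the file name is a placeholder and this is not the exact astrocut code.

```python
from astropy.io import fits
from astropy.wcs import WCS

# Assumed cube layout: HDU 2 is a table with one row per FFI and one column
# per FFI header keyword.  The file name is a placeholder.
with fits.open("tess-s0027-4-3-cube.fits") as cube:
    info_table = cube[2].data
    middle = len(info_table) // 2            # index of the middle FFI

    # Rebuild that FFI's header from the table row, then make a WCS from it
    header = fits.Header()
    for keyword in info_table.columns.names:
        header[keyword] = info_table[keyword][middle]
    cube_wcs = WCS(header)
```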
The `_get_cutout_limits` method begins by converting the input target coordinates to pixel coordinates using the WCS information extracted from the middle FFI in `_parse_table_info`. Then, for each axis (rows and columns), it computes the limits by taking the target's center pixel coordinate, subtracting 1 to move from the 1-based FITS convention to NumPy's zero-based indexing, and then adding and subtracting half the requested cutout size for that axis (`dim`) to get the upper and lower limits, respectively.
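For instance, a minimal sketch of this limit calculation might look like the following; the center pixel and cutout sizes are made-up numbers, and the rounding is illustrative rather than the exact astrocut logic.

```python
import numpy as np

# Made-up values for illustration only; the real method derives the center
# pixel from the target coordinates and the middle FFI's WCS.
center_pixel = np.array([1024.3, 512.8])   # (x, y) in FITS 1-based convention
cutout_size = np.array([5, 7])             # requested size per axis, in pixels

limits = []
for center, dim in zip(center_pixel, cutout_size):
    center_zero_based = center - 1                      # FITS is 1-based, NumPy is 0-based
    lower = int(np.round(center_zero_based - dim / 2))  # lower limit for this axis
    upper = int(np.round(center_zero_based + dim / 2))  # upper limit for this axis
    limits.append((lower, upper))

print(limits)   # [(1021, 1026), (508, 515)]
```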
The `_get_cutout` method uses the limits calculated in `_get_cutout_limits` and applies them to an input cube to generate cutouts from all of the FFIs. The cutout limits are assigned to the variables `xmin`, `xmax`, `ymin`, and `ymax` to denote the lower and upper limits for each axis.
For SPOC products, the transposed cube from which cutouts are made has shape (nCols, nRows, nFFIs, 2), where nCols is the number of columns in a full-frame image (FFI), nRows is the number of rows in an FFI, nFFIs is the number of FFIs included in the cube, and 2 corresponds to the science and propagated-error planes of the cube. nCols and nRows are assigned to the `xmax_cube` and `ymax_cube` variables, respectively. For TICA products, the transposed cube has shape (nCols, nRows, nFFIs, 1), where 1 corresponds to the science plane only; the TICA pipeline does not propagate errors, so TICA cubes are created without an error plane. The cube is stored in this transposed orientation (spatial axes first, time axis third) so that the full time series for a small block of pixels is contiguous, which makes slicing a cutout out of every FFI at once efficient.
The cutout is taken by indexing the transposed cube with `xmin`, `xmax`, `ymin`, and `ymax`. This portion of the cube is then transposed along the longest axis, indexed at `0` for science and assigned to the variable `img_cutout`, and indexed at `1` for error and assigned to the variable `uncert_cutout`. Note that because the TICA pipeline does not propagate errors, the error axis in a TICA cutout TPF is populated with zeros and should be ignored.
The aperture array that will live in the third extension (`EXT=2`) of the cutout TPF is created in the next step and filled with `1`s.
For edge cases where `xmin` or `ymin` lands below 0, or `xmax` or `ymax` lands beyond the respective dimension of the CCD array, padding must be added to make the cutout the requested size. To do this, a zero-filled array of shape `(3, 2)` called `padding` is created. For each such case, the corresponding element of the array is populated with the appropriate padding width (i.e., how much the script needs to pad onto that edge of the cutout). If no padding is necessary, the `padding` array remains all zeros and the padding logic is skipped. If the `padding` array contains any non-zero value, the padding logic runs and the cutout is padded with NaNs in the appropriate areas using `np.pad`.
The method then returns all the new cutout arrays: the image cutout, the error cutout, and the aperture array.
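A toy version of this slicing and padding logic might look like the sketch below; the cube here is random data and the limits are chosen to trigger the lower-edge padding case, so it is illustrative rather than the actual astrocut implementation.

```python
import numpy as np

# Toy "transposed cube" of shape (nCols, nRows, nFFIs, 2)
transposed_cube = np.random.rand(100, 100, 10, 2)

# Example limits; xmin < 0 means the cutout falls off the CCD edge
xmin, xmax = -2, 3
ymin, ymax = 40, 45

# Record how much padding each edge needs, then clip to the cube
# (only the lower-edge cases are shown; the upper edges are analogous)
padding = np.zeros((3, 2), dtype=int)
if xmin < 0:
    padding[1, 0] = -xmin
    xmin = 0
if ymin < 0:
    padding[2, 0] = -ymin
    ymin = 0

cutout = transposed_cube[xmin:xmax, ymin:ymax, :, :]

# Put the time axis first, then split the science and error planes
cutout = np.transpose(cutout, (2, 0, 1, 3))
img_cutout = cutout[:, :, :, 0]
uncert_cutout = cutout[:, :, :, 1]

# Pad with NaNs where the requested cutout extended past the CCD
if padding.any():
    img_cutout = np.pad(img_cutout, padding, mode="constant", constant_values=np.nan)
    uncert_cutout = np.pad(uncert_cutout, padding, mode="constant", constant_values=np.nan)

# The aperture array (third extension of the TPF) is simply all ones
aperture = np.ones((xmax - xmin + padding[1].sum(),
                    ymax - ymin + padding[2].sum()), dtype=np.int32)
```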
The `_get_full_cutout_wcs` method takes `cube_wcs` (i.e., the WCS of the middle FFI), converts it to header format, and assigns it to the variable `wcs_header`. In this new header, the calculated lower limits in X and Y are subtracted from the `CRPIX1` and `CRPIX2` values so that the reference pixel follows the crop. The physical WCS keywords and their values are then added; note that these are a different set of keywords from the standard celestial WCS keywords, and they describe where the cutout sits within the original image. `CRVAL1P` and `CRVAL2P` represent the coordinate values of our reference point in X and Y (respectively), rather than an array location like `CRPIX1` and `CRPIX2`, so 1 is added back to these values to undo the NumPy zero-based indexing.
Once these keywords are set, `wcs_header` is returned as a WCS object.
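A rough sketch of the idea is shown below, assuming a simple TAN WCS and made-up cutout limits; this is not the exact astrocut code, and the physical WCS keywords shown are only a representative subset.

```python
from astropy.wcs import WCS

# Toy celestial WCS standing in for cube_wcs (the middle FFI's WCS)
cube_wcs = WCS(naxis=2)
cube_wcs.wcs.ctype = ["RA---TAN", "DEC--TAN"]
cube_wcs.wcs.crpix = [1024.5, 1024.5]
cube_wcs.wcs.crval = [120.0, -45.0]
cube_wcs.wcs.cdelt = [-0.0058, 0.0058]

wcs_header = cube_wcs.to_header(relax=True)

# Shift the reference pixel by the lower cutout limits so it follows the crop
xmin, ymin = 1021, 508
wcs_header["CRPIX1"] -= xmin
wcs_header["CRPIX2"] -= ymin

# Physical WCS keywords record where the cutout sits in the original FFI;
# adding 1 converts from NumPy 0-based back to the FITS 1-based convention
wcs_header["CRVAL1P"] = xmin + 1
wcs_header["CRVAL2P"] = ymin + 1
wcs_header["CRPIX1P"] = 1
wcs_header["CRPIX2P"] = 1

cutout_wcs_full = WCS(wcs_header)
```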
The `_get_cutout_wcs_dict` method runs after `_fit_cutout_wcs` (described next) has produced the fitted cutout WCS. The cutout's WCS keywords are collected into a dictionary called `cutout_wcs_dict`, which houses all of the WCS information for the cutouts. The reason a separate dictionary is built, instead of pulling the information directly from the `cutout_wcs` attribute, is that the binary table header of a TESS TPF expects very specific header keywords that are not standard for WCS objects.
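For illustration, the dictionary holds per-column table-WCS keywords of the kind sketched below; the keyword names follow the FITS table-WCS convention used by TESS TPFs, while the particular values and comments are invented.

```python
# Illustrative contents only; the real dictionary is built from the fitted
# cutout WCS and contains more keywords than shown here.
cutout_wcs_dict = {
    "1CTYP{}": ["RA---TAN-SIP", "right ascension coordinate type"],
    "2CTYP{}": ["DEC--TAN-SIP", "declination coordinate type"],
    "1CRPX{}": [3.0, "[pixel] reference pixel along image axis 1"],
    "2CRPX{}": [3.0, "[pixel] reference pixel along image axis 2"],
    "1CRVL{}": [120.0, "[deg] right ascension at reference pixel"],
    "2CRVL{}": [-45.0, "[deg] declination at reference pixel"],
}

# When the FLUX column (say, table column 4) is written, each key is
# formatted with that column number, e.g. "1CTYP{}".format(4) -> "1CTYP4"
for key, (value, comment) in cutout_wcs_dict.items():
    print(key.format(4), value, comment)
```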
In `_fit_cutout_wcs`, approximately 100 evenly spread pixels are selected across the full array of each FFI for the WCS matching process. The method begins by determining the step size used to select these pixels so that they are evenly spread across the CCD. To do this, a `while` loop incrementally increases the step size `i` until stepping across the X and Y dimensions yields a total of roughly 100 pixels to use for the WCS matching.
Once `i` is determined, the X and Y coordinates of each of these pixels are determined and assigned to the variables `xvals` and `yvals`, respectively. The `xvals` and `yvals` are built by working backwards and forwards from the midpoint of the axis in question. For example, when finding the X coordinates, the script works backwards from the midpoint (e.g., `2078//2`) down to `-1` in steps of `-i` (negative because we are stepping backwards), and then forwards from the midpoint up to the full width of the image in steps of `i` to find the remaining half of the X coordinates. The process is repeated for the Y coordinates. The grid is built outward from the midpoint, rather than simply with `range(0, x, i)`, so that it remains centered on the central grid pixel and the resulting WCS object stays centered on the target coordinates.
The script ensures that the last index in `xvals` and `yvals` is `x - 1` and `y - 1`, respectively; if not, that final index is added manually. If the first index is not 0, then 0 is also added to both lists.
Tuples of X and Y coordinates for each pixel are then created by taking the Cartesian product of the `xvals` and `yvals` lists and assigned to the variable `pix_inds`. These pixel coordinates are then converted to sky coordinates (in degrees) using the cutout WCS solution from `_get_full_cutout_wcs` and `astropy.coordinates.SkyCoord`.
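A toy sketch of this pixel-grid selection is shown below; the image dimensions are made up and this is not the exact astrocut loop.

```python
import itertools
import numpy as np

x, y = 2078, 2136   # hypothetical image dimensions

# Increase the step size i until the grid holds roughly 100 pixels
i = 1
while (x / i) * (y / i) > 100:
    i += 1

# Work backwards then forwards from each axis midpoint so the grid stays
# centered on the central pixel
xvals = sorted(set(list(range(x // 2, -1, -i)) + list(range(x // 2, x, i))))
yvals = sorted(set(list(range(y // 2, -1, -i)) + list(range(y // 2, y, i))))

# Make sure the image edges are represented
for vals, dim in ((xvals, x), (yvals, y)):
    if vals[0] != 0:
        vals.insert(0, 0)
    if vals[-1] != dim - 1:
        vals.append(dim - 1)

# Cartesian product gives the (x, y) pixel indices used for the WCS fit
pix_inds = np.array(list(itertools.product(xvals, yvals)))
print(pix_inds.shape)   # roughly (100+, 2)
```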
The best-fit linear WCS is then calculated using the X and Y values from `pix_inds` and the matched sky coordinates calculated in the previous step. This WCS object is stored as the `cutout_wcs` attribute.
Finally, the quality of the fit is determined by taking all of the pixels in the FFI, converting their coordinates to sky coordinates using both the full cutout WCS from `_get_full_cutout_wcs` and the `linear_wcs` fit from this method, and measuring the angular separation between the two sets of coordinates. A fit-quality value, sigma, is then calculated as the square root of the sum of the squared separations (i.e., `sigma = np.sqrt(sum(dists.value**2))`) and stored in the `cutout_wcs_fit` dictionary under the keyword `'WCS_SIG'`. The largest calculated separation is stored in the same dictionary under `'WCS_MSEP'`.
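A self-contained toy version of this fit-and-check idea is sketched below using `astropy.wcs.utils.fit_wcs_from_points`; the WCS values, grid, and image size are invented, and the real method works on the actual sampled FFI pixels.

```python
import numpy as np
from astropy.coordinates import SkyCoord
from astropy.wcs import WCS
from astropy.wcs.utils import fit_wcs_from_points

# Toy WCS standing in for the full cutout WCS; all values are invented
full_wcs = WCS(naxis=2)
full_wcs.wcs.ctype = ["RA---TAN", "DEC--TAN"]
full_wcs.wcs.crpix = [25.0, 25.0]
full_wcs.wcs.crval = [120.0, -45.0]
full_wcs.wcs.cdelt = [-0.0058, 0.0058]

# Sampled pixel grid standing in for pix_inds
xv, yv = np.meshgrid(np.arange(0, 50, 5), np.arange(0, 50, 5))
xv, yv = xv.ravel(), yv.ravel()
ra, dec = full_wcs.all_pix2world(xv, yv, 0)
world_pix = SkyCoord(ra, dec, unit="deg")

# Fit a simple linear (TAN) WCS to the sampled pixel/sky pairs
linear_wcs = fit_wcs_from_points((xv, yv), world_pix, proj_point="center")

# Goodness of fit: compare the two solutions over every pixel in the image
ax, ay = np.meshgrid(np.arange(50), np.arange(50))
ax, ay = ax.ravel(), ay.ravel()
truth = SkyCoord(*full_wcs.all_pix2world(ax, ay, 0), unit="deg")
fitted = SkyCoord(*linear_wcs.all_pix2world(ax, ay, 0), unit="deg")

dists = truth.separation(fitted)
sigma = np.sqrt(np.sum(dists.deg ** 2))   # analogous to WCS_SIG
max_sep = dists.deg.max()                 # analogous to WCS_MSEP
print(sigma, max_sep)
```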
The primary header update, which is the first step of `_build_tpf` (see below), brings the cutout file's primary header keywords in line with those of a standard TESS Mission-produced TPF and populates empty keywords. At this point, the TICA cutout file keywords are modified to match the SPOC cutout file keywords (and therefore the TESS TPF keywords), and the old versions of the keywords are deleted to avoid duplicates. For example, the keyword `CAMERA` is added to the TICA header to denote which camera the FFI was taken with, and the TICA analog of `CAMERA`, i.e., `CAMNUM`, is deleted.
In addition, some keywords are derived from other keywords in this step. For example, `DATE-OBS` is not available in TICA FFIs/cubes, so it is calculated by adding the `BJDREFI` keyword value to `TSTART`.
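A hedged sketch of these two keyword manipulations is shown below; the header values are invented, and the exact astrocut implementation may differ (for example in how `DATE-OBS` is formatted).

```python
from astropy.io import fits
from astropy.time import Time

# Invented header values for illustration only
primary_header = fits.Header()
primary_header["CAMNUM"] = 3          # TICA's name for the camera number
primary_header["BJDREFI"] = 2457000   # integer part of the BJD reference epoch
primary_header["TSTART"] = 1790.5     # [d] observation start, relative to BJDREFI

# Rename CAMNUM -> CAMERA and drop the TICA-only keyword to avoid duplicates
primary_header["CAMERA"] = primary_header["CAMNUM"]
del primary_header["CAMNUM"]

# TICA headers carry no DATE-OBS, so derive one from BJDREFI + TSTART
start_jd = primary_header["BJDREFI"] + primary_header["TSTART"]
primary_header["DATE-OBS"] = Time(start_jd, format="jd", scale="tdb").isot
print(primary_header["DATE-OBS"])     # e.g. '2019-11-03T00:00:00.000'
```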
Once the table HDU has been created, with columns (`FLUX`, `FLUX_ERR`, `FFI_FILE`, etc.) populated for each FFI cutout, the table HDU's header is updated with keyword descriptions for some of the keywords, and per-column WCS information is added to the table header where appropriate (via `_add_column_wcs`).
Image keywords are also added to the table header (via `_add_img_kwds`). Some keywords are skipped because they apply to a single FFI rather than to the entire cube.
In a separate step, the primary header from the cube HDU list is copied and applied to the headers of the other two extensions of the cutout file; all header keywords and their values are carried over.
The `_build_tpf` method builds the actual cutout TPF, adds header keywords, and formats it to match a TESS TPF. It begins by updating the primary header of the cutout file, then builds out the binary table in extension 1, which houses the actual data. It adds columns such as `TIME`, `FLUX`, `FLUX_ERR`, and `FFI_FILE`, with each row populated by the corresponding data for a single FFI cutout. Each column is appended to a list called `cols`, which is then turned into a binary table object called `table_hdu`, and its header is populated with WCS information (`_add_column_wcs`), relevant image keywords (`_add_img_kwds`), etc.
The aperture HDU is then built, with its data unit being a 2D array the same size as the requested cutouts. Extra keywords are added to the aperture HDU, including the goodness-of-fit values calculated by `_fit_cutout_wcs`.
Lastly, the cutout HDU list is assembled from the primary HDU, table HDU, and aperture HDU.
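To illustrate how these pieces fit together with astropy, the sketch below assembles a toy HDU list with the same three extensions; the column formats, units, and values are placeholders, not the exact keywords and columns that astrocut writes.

```python
import numpy as np
from astropy.io import fits

# Toy stand-ins for the real cutout arrays and metadata
n_ffis, ny, nx = 10, 5, 5
img_cutout = np.random.rand(n_ffis, ny, nx).astype(np.float32)
uncert_cutout = np.zeros_like(img_cutout)
aperture = np.ones((ny, nx), dtype=np.int32)

# Build the binary table columns, one row per FFI cutout
cols = [
    fits.Column(name="TIME", format="D", unit="BJD - 2457000, days",
                array=np.linspace(1790.5, 1791.5, n_ffis)),
    fits.Column(name="FLUX", format=f"{ny * nx}E", dim=f"({nx},{ny})",
                unit="e-/s", array=img_cutout),
    fits.Column(name="FLUX_ERR", format=f"{ny * nx}E", dim=f"({nx},{ny})",
                unit="e-/s", array=uncert_cutout),
    fits.Column(name="FFI_FILE", format="38A",
                array=np.array([f"ffi_{i:04d}.fits" for i in range(n_ffis)])),
]

primary_hdu = fits.PrimaryHDU()
table_hdu = fits.BinTableHDU.from_columns(cols)
aperture_hdu = fits.ImageHDU(data=aperture)

# EXT=0 primary, EXT=1 binary table, EXT=2 aperture image
cutout_hdulist = fits.HDUList([primary_hdu, table_hdu, aperture_hdu])
cutout_hdulist.writeto("toy_cutout_tpf.fits", overwrite=True)
```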