This repository contains scripts for processing, filtering, and visualizing X-ray Emission Spectroscopy (XES) data collected from SLAC National Accelerator Laboratory's LCLS facility. The pipeline enables femtosecond time-resolved observations of quantum changes in protein complexes by pairing XES data with Serial Femtosecond X-ray Crystallography (SFX) data.
The processing pipeline consists of four main components:
- Raw Data Processing (mpiVDSXESandXTCAVlaser.py) - Converts raw Epix100 XES detector data into Virtual Data Set (VDS) structures
- Laser Event Filtering (filter_laser_events.py) - Identifies and filters events based on laser on/off states
- Crystal Hit Filtering (filter_cheetah_hits.py) - Filters events based on SFX crystal hits using Cheetah analysis results
- Visualization (plot_xes.py) - Creates plots and quantifies XES spectra

Requirements:
- Python 3.9+
- Access to SLAC's HPC environment with Psana
- mpi4py for parallel processing
- Required Python packages: h5py, numpy, scipy, matplotlib, psana

git clone https://github.com/jtcarrion/TR_XES_processing.git
cd TR_XES_processing

The main processing script converts raw Epix100 XES detector data into HDF5 format with VDS structures.

mpirun -np <num_processes> python mpiVDSXESandXTCAVlaser.py -e <experiment> -r <run_number> -o <output_directory>

Required Parameters:
- -e, --exp: Experiment string (e.g., mfxls0816)
- -r, --run: Run number to process
- -o, --outdir: Output directory for processed files
Optional Parameters:
- --laser-on-code: EVR code for laser on events (default: -1, ask LCLS staff)
- --store-raw: Store raw images without pedestal/gain/geometry correction
- --max-events: Maximum number of events to process (default: all)
- --rowmin/--rowmax: Row range for XES integration (default: 0 to -1)
- --rotation: Image rotation in degrees (default: -89)
- --threshold: Pixel threshold value (default: 10)
- --eventsperfile: Events per output file (default: 10000)
- --no-xtcav: Skip XTCAV processing if detector is broken
Output:
- Creates HDF5 files with processed spectra, laser states, and timing information
- Generates a VDS file combining all processed data
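
The --threshold, --rowmin, and --rowmax options suggest a per-image reduction in which low-valued pixels are suppressed and a band of rows is summed into a 1D spectrum. The following is a minimal sketch of that idea, not the code in mpiVDSXESandXTCAVlaser.py. It assumes the energy axis runs along columns and that pixels below the threshold are set to zero; the function name image_to_spectrum and the synthetic frame are hypothetical.

```python
import numpy as np


def image_to_spectrum(image, threshold=10.0, rowmin=0, rowmax=None):
    """Zero pixels below `threshold`, then sum rows rowmin:rowmax into a 1D spectrum.

    Assumption: the energy (dispersive) axis runs along columns, so summing
    over rows (axis 0) yields one value per energy bin.
    """
    img = np.asarray(image, dtype=float)
    # Suppress readout noise: keep only pixels at or above the threshold.
    cleaned = np.where(img >= threshold, img, 0.0)
    # rowmax=None keeps every row from rowmin to the end of the image.
    band = cleaned[rowmin:rowmax]
    return band.sum(axis=0)


# Synthetic 8x16 frame: flat low-level background plus one bright line.
frame = np.full((8, 16), 3.0)
frame[3:5, 6] = 60.0   # emission line inside the integration band
frame[0, 2] = 80.0     # bright pixel outside the integration band

spectrum = image_to_spectrum(frame, threshold=10.0, rowmin=2, rowmax=6)
print(spectrum)
```

In this toy frame only column 6 survives: the background sits below the threshold and the bright pixel in row 0 falls outside the selected rows.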

filter_laser_events.py identifies laser on/off events and saves their fiducial timestamps.

python filter_laser_events.py -e <experiment> -r <run_number>

Output:
- laser_on_fiducials.h5: Fiducials for laser on events
- laser_off_fiducials.h5: Fiducials for laser off events
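
Conceptually, the laser on/off split is a partition of event fiducials by whether the laser-on EVR code fired for that event. The sketch below illustrates that partition on in-memory data, without Psana or HDF5. The function name split_by_laser_code, the fiducial values, and the event codes 40, 90, and 91 are hypothetical; the real laser-on code must come from LCLS staff.

```python
import numpy as np


def split_by_laser_code(fiducials, event_codes, laser_on_code):
    """Partition fiducials into (laser_on, laser_off) arrays.

    `event_codes` holds, for each event, the collection of EVR codes that
    fired. An event counts as laser-on when `laser_on_code` is among them.
    """
    fiducials = np.asarray(fiducials)
    is_on = np.array(
        [laser_on_code in codes for codes in event_codes], dtype=bool
    )
    return fiducials[is_on], fiducials[~is_on]


# Hypothetical data: four events with the EVR codes seen in each.
fids = [1000, 1003, 1006, 1009]
codes = [[40, 90], [40], [40, 90], [40, 91]]

on, off = split_by_laser_code(fids, codes, laser_on_code=90)
print("laser on: ", on)
print("laser off:", off)
```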

filter_cheetah_hits.py filters events based on Cheetah crystal hit analysis and creates averaged spectra.

python filter_cheetah_hits.py -f <xes_files> -c <cheetah_hits_file>

Required Parameters:
- -f, --files: Space-separated list of XES H5 files
- -c, --CheetahHits: Cheetah hits file (LST format)
Optional Parameters:
- -w, --window: Smoothing window size (default: 1, no smoothing)
- -o, --output: Output filename for plots
Output:
- Text files with averaged spectra for different event types
- PNG plots showing laser on/off comparisons
- Background-corrected spectra
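
The hit filtering, averaging, background correction, and optional smoothing described above can be sketched as follows. This is an illustration under stated assumptions, not the logic of filter_cheetah_hits.py: events are matched to Cheetah hits through a shared integer event ID, the background is the mean spectrum of non-hit events, and smoothing is a simple moving average. All function and variable names are hypothetical.

```python
import numpy as np


def moving_average(spectrum, window=1):
    """Boxcar smoothing; window <= 1 returns the spectrum unchanged."""
    spectrum = np.asarray(spectrum, dtype=float)
    if window <= 1:
        return spectrum
    kernel = np.ones(window) / window
    # mode="same" keeps the length; values near the edges are attenuated.
    return np.convolve(spectrum, kernel, mode="same")


def background_corrected(spectra, event_ids, hit_ids, window=1):
    """Mean hit spectrum minus mean non-hit spectrum, optionally smoothed.

    Assumes at least one hit and one non-hit event are present.
    """
    spectra = np.asarray(spectra, dtype=float)
    is_hit = np.isin(event_ids, hit_ids)
    hit_mean = spectra[is_hit].mean(axis=0)
    background = spectra[~is_hit].mean(axis=0)
    return moving_average(hit_mean - background, window)


# Hypothetical data: four events, two of which Cheetah flagged as hits.
spectra = [[1, 1, 5, 1], [1, 1, 1, 1], [1, 1, 7, 1], [1, 1, 1, 1]]
event_ids = [10, 11, 12, 13]
hit_ids = [10, 12]

corrected = background_corrected(spectra, event_ids, hit_ids, window=1)
print(corrected)
```

With window=1 the result is the raw difference between the hit and non-hit averages; a larger window trades energy resolution for lower noise.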

plot_xes.py creates 2D and 1D visualizations of the processed XES data.

python plot_xes.py -f <data_file> --upper_bound <upper> --lower_bound <lower>

Required Parameters:
- -f, --data: Input H5 file
- --upper_bound: Upper bound for XES spectra integration
- --lower_bound: Lower bound for XES spectra integration
Output:
- 2D_sum.png: 2D visualization of thresholded data
- 1D_xes.png: 1D XES spectrum plot
- 1D_xes.txt: Numerical data for 1D spectrum

# 1. Process raw data (using 8 MPI processes)
mpirun -np 8 python mpiVDSXESandXTCAVlaser.py -e mfxls0816 -r 123 -o ./processed_data
# 2. Filter laser events
python filter_laser_events.py -e mfxls0816 -r 123
# 3. Filter crystal hits (assuming you have Cheetah hits file)
python filter_cheetah_hits.py -f ./processed_data/VDS-r0123-epix.h5 -c cheetah_hits.lst
# 4. Create visualizations
python plot_xes.py -f ./processed_data/VDS-r0123-epix.h5 --upper_bound 100 --lower_bound 50

Input data:
- Raw XES Data: Epix100 detector data from the LCLS MFX hutch
- XTCAV Data: X-ray pulse characterization data
- EVR Data: Event receiver data for laser timing
- Cheetah Hits: Crystal hit analysis results from Cheetah software

Output data:
- VDS Files: Virtual Data Set structures containing processed spectra
- H5 Datasets: Organized data with laser states, timing, and spectra
- Averaged Spectra: Background-corrected spectra for different event types
- Visualizations: 2D and 1D plots of XES data

The pipeline improves the signal-to-noise ratio (SNR) by:
- Filtering laser on/off events to reduce background noise
- Using crystal hit filtering to focus on relevant events
- Subtracting a background estimated from non-hit events

Additional capabilities:
- Supports a full range of energy states across a wide variety of atomic species
- Precise timing synchronization between XES and SFX data
- MPI-based parallel processing for large datasets
- Efficient handling of high-throughput XFEL data
- Scalable to SLAC's HPC environment
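
To make the MPI-based parallelism concrete, the sketch below shows one common way to divide events among ranks (round-robin by event index) and to estimate how many output files a given --eventsperfile value produces. Whether mpiVDSXESandXTCAVlaser.py distributes events this way is an assumption. The helper names are hypothetical, and the ranks are simulated in plain Python so the example runs without mpi4py.

```python
def events_for_rank(n_events, rank, size, max_events=None):
    """Indices of the events a given rank would process (round-robin).

    `max_events` mimics capping the run at a subset of events for testing.
    """
    limit = n_events if max_events is None else min(n_events, max_events)
    return [i for i in range(limit) if i % size == rank]


def files_needed(n_events, events_per_file=10000):
    """Number of output files when each file holds at most events_per_file."""
    return -(-n_events // events_per_file)  # ceiling division


# Simulate 4 ranks sharing 10 events.
size = 4
assignment = {rank: events_for_rank(10, rank, size) for rank in range(size)}
for rank, indices in assignment.items():
    print(f"rank {rank}: {indices}")

print("files for 25000 events:", files_needed(25000))
```

Round-robin assignment keeps the load balanced without any communication between ranks, because each rank can decide locally which events belong to it.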

Troubleshooting:
- Missing EVR Codes: Contact LCLS staff for correct laser event codes
- XTCAV Issues: Use the --no-xtcav flag if the detector is malfunctioning
- Memory Issues: Reduce --eventsperfile for large datasets
- File Permissions: Ensure write access to output directories

Performance tips:
- Use an appropriate MPI process count for your dataset size
- Monitor memory usage with large datasets
- Consider using --max-events to test on a subset of the data