RecB_article/Methods.tex at main · DanielThedie/RecB_article · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
\section*{Materials and Methods}

\subsection*{Strain construction}
\ecoli\ MG1655 and derivatives were used in this study. A list of all strains and plasmids used are presented in Supp. Tables~\ref{SItab:strains} and~\ref{SItab:plasmids}.
MEK2623 (RecB-Halo RecA-SYFP2) was constructed by transferring a RecA-SYFP2 tandem fusion by P1 transduction, using EL2514~\cite{Wiktor2021} as a donor and MEK65~\cite{Lepore2019a} as a receiver. The clones were selected on kanamycin plates and checked by Sanger sequencing. MEK2622 (\dreca) was constructed by transferring the \emph{recA} deletion by P1 transduction, using DL654 as a donor, and MEK65 as a receiver. MEK2629 was obtained by transforming the pDT6 plasmid (a gift from Prof.\ Mark Dillingham) into the MEK65 strain by heat shock.

\subsection*{Microscopy samples preparation}
For all experiments, the cells were grown in M9 supplemented with 0.2\% (w/v) glucose, 2 mM MgSO$_4$, 0.1 mM CaCl$_2$, 1X MEM Essential and MEM non-essential amino acids (Gibco).
\subsubsection*{Cell culture}
Cells were grown overnight from a -80\celsius\ glycerol stock in 5 mL medium at 37\celsius, shaking at 150 rpm. In the morning, the cells were diluted 1:300 into 15 mL of medium and grown at 37\celsius, shaking at 150 rpm until they reached an \od\ of 0.2--0.3 (mid-exponential phase).
Alternatively, a 15 mL culture was inoculated using a -80\celsius\ glycerol stock, and placed at 37\celsius\ with stirring in a turbidostat device (OGIBio), which measured \od\ and diluted cells at 3 min intervals, to keep them at a constant \od\ of 0.2.
\subsubsection*{Halo labelling}
We used a labelling protocol similar to our previous study~\cite{Lepore2025}. A volume of cells equivalent to 1 mL at \od=0.2 was centrifuged (8000 rpm) and resuspended in 1 mL of fresh medium. 5 µL of JF549 dye (Janelia Fluor Halo-tag Ligand, Promega, resuspended in DMSO) was added (final concentration 1 µM) and the cells were further incubated for 1h at 37\celsius\ in the dark, shaking at 150 rpm.
\subsubsection*{DNA labelling}
When relevant, cellular DNA was labelled by adding Sytox Green (Invitrogen, catalog number S7020) at a concentration of 500 nM, at the same time as the JF549 dye using similar incubation conditions (1h at 37\celsius\ in the dark, shaking at 150 rpm).
\subsubsection*{Dye removal}
The same procedure was followed to remove unbound JF549 and Sytox Green dye. The cells were centrifuged for 3 min at 8000 rpm, the supernatant discarded, and the pellet resuspended in 1 mL of fresh medium and transferred to a new tube to avoid the dyes sticking to the plastic of the tube. This procedure was repeated 3 times to fully remove unbound dye. The cells were finally resuspended in 200 µL medium.
\subsubsection*{Gam overexpression}
For experiments where Gam was over-expressed, arabinose was added to the cells during the labelling step, at 1\% (w/v) final concentration. Arabinose (at the same concentration) was also added to the medium for subsequent washes, and in the agarose pad to maintain exposure during imaging.
\subsubsection*{Sample preparation}
Agarose pads were prepared by dissolving 2\% (w/v) agarose in culture medium. In experiments with ciprofloxacin, the antibiotic was added to the agarose pad at the desired final concentration. After the dye removal step, 5 µL of cells were added on the agarose pad, and left to settle for 10 min at 37\celsius\ in the dark before imaging.

\subsection*{Microscopy}
\subsubsection*{Microscope setup}
Imaging was performed using an inverted microscope (Nikon Ti-E) equipped with a 100X TIRF Nikon objective (NA 1.49, oil immersion) and a 1.5X Nikon magnification lens (pixel size = 107 nm). Fluorescence excitation was achieved using 488- and 561-nm lasers (Coherent OBIS) in Highly Inclined Laminar Optical sheet (HILO) configuration. Excitation light and fluorescence emission were separated using a dual-wavelength dichroic filter (TRF59904, Chroma), and the fluorescence signal was detected on an Electron Multiplying Charge-Coupled Device (EMCCD) camera (iXion Ultra 897, Andor). The hardware was controlled and images were saved using MetaMorph (Molecular Devices; v7.8.13.0). The HILO configuration was established using the iLas variable angle TIRF control window (Gattaca), with an angle of 57\degree. All experiments were performed at 37\celsius, using an Okolab microscope cage incubator equipped with dark panels.

\subsubsection*{Data acquisition}
For each sample, a Metamorph journal was used to acquire images on 40 different positions, each separated by 150 µm. The full camera sensor (512$\times$512 pixels) was used. Acquisition parameters are described in Supp. Table~\ref{SItab:acquisition_channels}. Brightfield images were acquired in all experiments, as well as two fluorescence channels: JF549 (RecB) and either SYFP2 (RecA) or Sytox Green (nucleoid). At each position, 50 images were acquired at 2-second intervals for RecB and RecA, or a single image was acquired for the nucleoid (Supp. Fig.~\ref{SIFig:acquisition_pattern}). The prolonged imaging of the RecB-Halo fusion was only possible thanks to the exceptional photostability of the JF549 dye, which displayed slow photobleaching and no blinking (Supp. Note~\ref{note:dye_bleaching} and Supp. Fig.~\ref{SIFig:dye_bleaching}).

\subsection*{Data analysis}
The general data analysis workflow is described in Supp. Figure~\ref{SIFig:analysis_workflow}. In brief, after acquisition, the raw microscopy images were stored in \texttt{.tif} format on a dedicated Omero server (OME). They were directly accessed by the BACMMAN ImageJ plugin~\cite{Ollion2019}, which performed image processing tasks such as denoising of fluorescence images, cell segmentation (including manual curation of incorrectly segmented cells), nucleoid segmentation, and fluorescent spot detection. Measurement tables in CSV format were exported from BACMMAN and loaded in Jupyter Notebooks using the \href{https://gitlab.com/MEKlab/pyberries}{PyBerries package} (version 0.2.25) in Python (version 3.11). The data tables were manipulated using PyBerries and the pandas library (version 2.0.2). Figures were generated in Jupyter Notebooks using the Seaborn library (version 0.13.2). Figures that contained several panels were assembled in Inkscape (version 1.3.2).

\subsubsection*{BACMMAN pipeline}
\paragraph*{Deep-learning denoising of RecB-Halo fluorescence images}
We improved a self-supervised denoising neural network by making it use the previous and next frames in the timelapse and performing deconvolution simultaneously. Our method is based on a previously developed method~\cite{Ollion2021} which uses an encoder-decoder convolutional neural network that performs denoising (DNet), as well as a fully connected neural network that recovers the noise distribution used to compute the self-supervised loss. In order to include previous and next frames, we modified the architecture of DNet so that the encoder is shared between frames (t-1, t and t+1), i.e.\ it processes each frame independently. At each contraction level, the residual tensors corresponding to each frame are combined using a 1$\times$1 convolution and fed to the decoder. The same applies for the feature tensors and the last level of the decoder. The decoder only predicts the central frame (t) and the same self-supervised loss is used. We observe a dramatic improvement of denoising performance, which implies that the proposed architecture benefits from using information from adjacent frames. It has been shown that performing denoising and deconvolution simultaneously dramatically improves deconvolution performance~\cite{Kobayashi2020}. We computed the experimental PSF of our system by averaging hundreds of bead images cropped in a 33$\times$33 pixels area around each bead, and normalised so that values sum to one. The obtained averaged images were used during training as kernel for a 2D convolution at the end of the neural network. At prediction time, convolution was removed to obtain a deconvolved image.

\paragraph{Cell segmentation}
The 16-image brightfield Z-stack was first cropped to 5 images on one side of the focus, as required by our segmentation algorithm. The 5-image stack was used as input to Talissman, a U-net-based segmentation algorithm. In brief, the U-net model predicts a Euclidean distance map, where the value of each pixel is its predicted distance to the nearest background pixel. A watershed algorithm is then applied to retrieve cell contours. This approach allowed us to accurately segment cells from bright-field images, including when they formed tight clusters. Any overlapping cells were manually removed from the analysis. Following segmentation, post-filters were applied to dilate the segmented regions slightly and to remove any cells that were in contact with the edge of the image and might, therefore, be cropped. Finally, all segmentation masks were visually inspected and curated to remove cells that were incorrectly segmented ($\sim$1\% of total cells).

\paragraph*{Detection of RecB-Halo spots}
Fluorescent spots were seg\-men\-ted using a seeded watershed algorithm on the Laplacian transform of the denoised RecB image. The quality of the segmentation was visually assessed by overlaying the segmentation mask on the raw fluorescence images. The segmentation parameters were optimised by trial and error to minimise the error rate (assessed by visual inspection of raw fluorescence images overlaid with the segmentation masks), but no manual curation was applied to avoid introducing user bias in the results.

\paragraph*{Classification of RecA structures}
We designed a U-net-based deep-learning model capable of classifying objects (in this case, cells) based on the type of RecA structure they contained (Supp. Figure~\ref{SIFig:object_class}). The model was trained on a dataset of 4,069 cells, manually labelled as containing either a RecA focus, a RecA filament, or neither of those (homogenous diffuse fluorescence). Because there were different numbers of each predicted class in the training set, the loss function (categorical cross-entropy) was weighted by the inverse class frequency (weights were 21, 101 and 133 for the diffuse, filament and focus classes, respectively). The model was trained on 128$\times$128 pixel crops of the RecA-SYFP2 fluorescence images, with the corresponding cell segmentation mask as input. Data augmentation was performed using the \href{https://github.com/jeanollion/dataset_iterator}{dataset iterator} package by applying random rotations, flips, and elastic deformations. The model was trained for 1000 epochs with a batch size of 3, using the Adam optimiser. The learning rate was initially set to 0.0004, and reduced by a factor or 10 when the validation loss did not improve for 30 consecutive epochs (to a minimum of 1$\times$10$^{-6}$). The model's predictions were evaluated against a test set of 15 manually-labelled images that were not used in training, achieving an accuracy of 84\%.

\paragraph*{Nucleoid segmentation}
Sytox Green fluorescence images were processed with a rolling-ball noise subtraction (radius 6 pixels), and nucleoids segmented by a seeded watershed algorithm. The seed threshold was determined by Otsu's method~\cite{Otsu1979}. Any segmented regions that were in contact with each other within the same cell were merged, and segmented regions smaller than 5 pixels were excluded. Regions with weak signal-to-noise ratio (SNR < 2) were also excluded. For each segmented region, the SNR was computed as follows:
\begin{equation}
    SNR = \dfrac{\overline{F}-\overline{B}}{std(B)}
\end{equation}
with $\overline{F}$ the mean fluorescence intensity in the object, $\overline{B}$ the mean background intensity, and $std(B)$ the standard deviation of background intensities.

\paragraph*{Measurements}
As the final step of the pipeline, BACMMAN performed several measurements on the objects created (cells, RecB spots and nucleoids). These measurements included cell length, cell area, raw fluorescence intensity, background-subtracted fluorescence intensity, number of RecB spots in the cell, and position of the RecB spots and nucleoids along the long and short axis of the cell. For each dataset, BACMMAN produced one csv table per object.

\subsubsection*{PyBerries}
\paragraph*{Data import and format}
The PyBerries package, combined with the pandas library, enabled easy import of the multiple CSV tables produced by BACMMAN, and performing operations such as grouping, aggregations, normalisation and curve fitting.

\paragraph*{Fitting Halo-RecB spot lifetimes}
The lifetime of individual Rec\-B spots was computed in BACMMAN.\@ The resulting lifetime histogram was fitted with a bi-exponential decay function of the type $y=a_1.e^{-k_1.t} + a_2.e^{-k_2.t}$. To improve the reliability of the fits, a bootstrapping procedure was used. For each ciprofloxacin concentration, an array of RecB spot lifetimes of the same size as the original data was drawn with replacement (i.e.\ each lifetime could potentially be drawn several times), and the resulting histogram was fitted with the bi-exponential decay model. This procedure was repeated 100 times, yielding 100 different estimates for each fit parameter. The best fit parameter was obtained by taking the median of all estimates.

Because it is not possible to differentiate whether the disappearance of a fluorescent spot was a result of photobleaching or of unbinding and returning to the pool of diffusing molecules, we considered that the fitted ``spot disappearance rates'' $k_1$ and $k_2$ were a sum of RecB's dissociation rate from DNA and the dye's bleaching rate ($k_1=k_{d1}+k_b$ and $k_2=k_{d2}+k_b$, with $k_{d1}$ and $k_{d2}$ the two RecB dissociation rates, and $k_b$ the bleaching rate). The bleaching rate was calculated for each dataset, as described in Supp. Note~\ref{note:dye_bleaching} and Supp. Figure~\ref{SIFig:dye_bleaching}, and subtracted to retrieve the true RecB dissociation rates from DNA.\@

\paragraph*{Estimation of the rate of RecB recruitment to DSBs}
For each timelapse, we quantified the total number of RecB spots that were bound to DSBs. This could not be achieved by thresholding of the lifetime histograms, as a significant number of short-lived spots may correspond to DSB-bound RecB molecules. To estimate the total number of DSB-bound RecB, we considered that all RecB spots that disappeared with a slower rate (as determined by the bi-exponential decay fit) were DSB-bound. We multiplied the total number of spots that appeared during the timelapse by the proportion of slow-dissociating spots retrieved from the lifetime histogram fits, which gave us an estimate of the number of DSB-bound RecB per timelapse, from which a number of recruitment events per cell per hour was calculated. This rate was corrected for photobleaching of the fluorescent dye, to account for non-fluorescent RecBCD complexes binding to DSBs. This was done independently for each dataset, by computing the average proportion of the remaining initial fluorescence over time. The number of RecB binding events recorded was divided by this average proportion of fluorescence to estimate the real number of RecB binding events. For example: for 2 RecB recruitments per cell per hour observed, and an average of 50\% of the initial fluorescence remaining over the timelapse, the estimated number of RecB recruitments per cell per hour would be $2/0.5 = 4$. This estimation method assumes that RecB binding events are equally likely at all points of the timelapse.

\subsection*{Code availability}
Instructions to install the BACMMAN ImageJ plugin can be found on the \href{https://github.com/jeanollion/bacmman/wiki/Installation}{BACMMAN wiki}. The deep-learning models used for cell segmentation, fluorescence denoising and RecA structure classification can be found in BACMMAN's deep-learning models library. To access the library, in BACMMAN go to the Import/Export menu and select Online DL Model library. Under Username enter ``el-karoui-lab'', leave the password field empty and click on Connect. The models can be found in the ``Thedie-et-al'' folder. BACMMAN dataset configurations can be found in the \href{https://github.com/jeanollion/bacmman/wiki/Online-Configuration-Library}{BACMMAN configuration library}. To access the library, in BACMMAN go to the Import/Export menu and select Online Configuration Library. Under Username enter ``el-karoui-lab'', leave the password field empty and click on Connect. The configurations can be found in the ``Thedie-et-al'' folder. The PyBerries package can be found on the \href{https://pypi.org/project/PyBerries/}{Python Package Index}, and its source code is available on \href{https://gitlab.com/MEKlab/pyberries}{Gitlab}. The deep-learning model used to classify cells according to their RecA-associated fluorescence is available on \href{https://gitlab.com/MEKlab/bacmman-object-classifier}{Gitlab}. The Jupyter Notebooks used to make the article figures are available on \href{https://github.com/DanielThedie/RecB_article}{Github}.