BEBI103 Fall 2019
"The dream of a cell is to become two cells"
-François Jacob

Caulobacter crescentus

Rosita Fu, Theresa Marlin, Erika Salzman

Abstract


Division of an entire bacterial colony is the canonical example of exponential growth. The growth law of an individual bacterium is less obvious: complex dynamics within the replication machinery and cytoskeletal reactions can change the density of the cell during division, and multiple positive and negative feedback dependencies on the current size could act to promote or restrict growth. Here, we study two parent Caulobacter crescentus cells undergoing repeated division, using data provided by Norbert Scherer's lab at the University of Chicago. We observe a discernible pattern of exponential growth between divisions. We discuss the image processing techniques used to extract the relevant data from the images, as well as the modeling methods used to reach this conclusion.

Experimental Methods

The Scherer Lab developed a technique to switch off cell adhesion in C. crescentus, allowing imaging at low cell density and preventing overcrowding in the field of view. Hundreds of generations of division were studied with a microfluidic device, and the image data were subsequently made available to us via Justin Bois.

Figure 1: Experimental setup. A field of C. crescentus


Image Processing


The modules in Python's skimage helped us convert our images to binary images, in which each pixel is marked either as part of the bacterium or not. To do this, a Sobel filter was applied to help us define a threshold for segmentation. The source code is provided below, along with a discussion of why we selected particular values and variables for the conversion.

Source Code: img_processing.py. img_visualization.py.
Figure 2a: Screen-recording of pre-filtered frames of bacterium 1.
Frame rate: 1 frame/minute, interpixel spacing of 0.052 µm, images acquired at 24 °C


Since noise can cause one pixel to look very different from its neighbors, we followed the Marr-Hildreth approach of smoothing before edge detection when applying the Sobel filter. Marr-Hildreth relies on the Laplacian of a Gaussian (LoG, which is not a logarithm), and operates on the principle that pixels along an edge share similar values.
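As a rough illustration of smoothing followed by edge detection, here is a minimal sketch using skimage. It is not the code in img_processing.py: the function name edge_map, the value of sigma, and the synthetic frame are all assumptions made for the example.

```python
import numpy as np
from skimage import filters


def edge_map(frame, sigma=1.0):
    """Smooth a grayscale frame, then return its Sobel gradient magnitude.

    `sigma` is a hypothetical smoothing width, not a value from the report.
    """
    frame = np.asarray(frame, dtype=float)
    smoothed = filters.gaussian(frame, sigma=sigma)  # suppress single-pixel noise
    return filters.sobel(smoothed)                   # large values mark edges


# Synthetic stand-in for one frame: a bright rectangle on a dark background.
frame = np.zeros((40, 40))
frame[10:30, 15:25] = 1.0
edges = edge_map(frame)
```

The edge map is large along the outline of the rectangle and close to zero both deep inside it and far away from it, which is what makes a single threshold on this map usable for segmentation.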

Figure 2b: Screen-recording of applied Sobel filters for bacterium 1.


Segmentation Methodology


thresh: threshold

A threshold was set manually to give a quantitative cut-off between the divided and undivided states. Biologically, we know that the bacteria drift apart as soon as separation occurs, so we want a threshold that accurately captures that moment. Watching divisions frame-by-frame, we noticed that the frame rate introduces a minor inconsistency at each division: it most likely misses the instant of actual division and captures the division at a different stage each round. Deciding what qualifies as division is therefore the researcher's choice. We chose a threshold of 0.0045 after observing the behavior of the Sobel map values around division frames.

selem: structuring element size

This structuring element was defined for closing and thresholding purposes. Since we know the daughter cell gets flushed away as soon as it finishes dividing, the "closing" transformation artificially joins the two bacteria. This is a partial fix for the inter-frame division problem (we most likely do not observe the instant of division). We watched the divisions frame-by-frame, viewing how the binary frames progressed, and settled on a closing parameter that made the initial size of the next growth cycle as representative of the 'instantaneous' size at division as possible. We used the same value for both bacteria: since both were around the same size (~2 μm²) and the camera apparatus remained the same, sharing the closing parameter should not introduce much error. If we were aiming for more precision, with many more bacteria of differing sizes, we would scale the structuring element area in proportion to each bacterium's time-averaged size.
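The two parameters above can be sketched together as thresholding followed by morphological closing. This is a hedged illustration, not the project's code: the function name binarize, the disk-shaped structuring element, its radius, and the choice to treat values above thresh as foreground are assumptions; only the threshold value 0.0045 comes from the text.

```python
import numpy as np
from skimage import morphology


def binarize(edges, thresh=0.0045, selem_radius=3):
    """Threshold an edge map, then close small gaps in the binary mask."""
    mask = np.asarray(edges) > thresh            # assumed: foreground is above thresh
    footprint = morphology.disk(selem_radius)    # hypothetical structuring element
    # Closing (dilation followed by erosion) on a 0/1 image joins nearby blobs.
    closed = morphology.closing(mask.astype(np.uint8), footprint)
    return closed.astype(bool)


# Two blobs separated by a two-pixel gap, standing in for parent and daughter.
edges = np.zeros((20, 30))
edges[8:12, 5:14] = 0.01
edges[8:12, 16:25] = 0.01
mask = binarize(edges)
```

In this toy example the middle of the gap between the two blobs is filled by the closing, which mirrors how a larger structuring element keeps parent and daughter joined for longer around division.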

Figure 2c: Screen-recording after manually thresholding frames of bacterium 1.


Segmentation Corrections


After arriving at our binary image, we assume that our bacterium is the second-largest element in the array (the first being the background; the errors arising from this assumption are dealt with below). The pixels are then counted for each frame and the areas stored in a CSV file. Before doing so, however, we noticed some segmentation errors when watching the divisions frame-by-frame. We discuss here how we treated the data:
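The "second-largest element" assumption can be sketched with connected-component labeling: the background gets label 0, and the largest remaining component is taken to be the bacterium. The function name bacterium_area and the toy mask are hypothetical, and the default 8-connectivity of measure.label is an assumption about what is appropriate here.

```python
import numpy as np
from skimage import measure


def bacterium_area(mask):
    """Pixel count of the largest connected foreground component (0 if none)."""
    labels = measure.label(mask)            # background pixels get label 0
    if labels.max() == 0:
        return 0
    counts = np.bincount(labels.ravel())    # pixels per label
    counts[0] = 0                           # ignore the background
    return int(counts.max())


# Toy binary frame: a 12-pixel "bacterium" and a 4-pixel speck of debris.
mask = np.zeros((10, 10), dtype=bool)
mask[1:4, 1:5] = True
mask[7:9, 7:9] = True
area = bacterium_area(mask)
```

Multiplying the pixel count by the square of the interpixel spacing (0.052 µm) would convert the area to µm².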

max_growth: maximum growth rate between frames

A variable max_growth was defined to limit how much the bacterium could grow between frames (i.e., within one minute). When the edge detection was unable to separate neighboring bacteria from the one of interest, the result was spurious spikes in the data. This was a significant issue with bacterium 2. We set this value manually to 0.025, i.e. a maximum growth of 2.5% per minute. Note that by eye we observed no division cycle in which growth surpassed this rate within one frame; we are not eliminating frames to smooth the plots, but to deal with errors made when the areas are counted automatically.
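One way such a cap could be applied is sketched below. This is an assumption about the logic, not the project's implementation: the function name drop_growth_spikes is hypothetical, and compounding the allowance over the frames elapsed since the last accepted frame is a design choice made for this example.

```python
import numpy as np


def drop_growth_spikes(areas, max_growth=0.025):
    """Return a boolean mask of frames whose growth stays under the cap.

    A frame is rejected when its area exceeds the last accepted area by more
    than `max_growth` per elapsed frame. Decreases (divisions) are always kept.
    """
    areas = np.asarray(areas, dtype=float)
    keep = np.ones(areas.size, dtype=bool)
    if areas.size == 0:
        return keep
    last_area, last_idx = areas[0], 0
    for i in range(1, areas.size):
        allowed = last_area * (1 + max_growth) ** (i - last_idx)
        if areas[i] > allowed:
            keep[i] = False                 # spike: likely a merged neighbor
        else:
            last_area, last_idx = areas[i], i
    return keep


areas = [1000, 1010, 1500, 1030, 1040]      # hypothetical pixel counts
keep = drop_growth_spikes(areas)
```

Here only the third frame, which jumps by almost 50% in one minute, is rejected.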

wiggle: handles massive drops in area

This variable helps segment the time series into growth cycles by setting a lower bound on the size of a drop in area that counts as a division rather than a thresholding fluctuation. We considered a drop greater than 200-250 pixels significant for these two bacteria. If we were to formalize these values, a function could be written to approximate average growth, and a bound 1-2 standard deviations away from the mean could be set.
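A minimal sketch of this kind of segmentation, assuming a division is any frame-to-frame drop larger than wiggle pixels. The function name split_cycles and the example areas are hypothetical.

```python
import numpy as np


def split_cycles(areas, wiggle=200):
    """Assign a growth-cycle index to each frame.

    A new cycle starts whenever the area drops by more than `wiggle` pixels
    from one frame to the next; smaller drops are treated as fluctuations.
    """
    areas = np.asarray(areas, dtype=float)
    if areas.size == 0:
        return np.array([], dtype=int)
    drops = np.diff(areas) < -wiggle
    return np.concatenate(([0], np.cumsum(drops)))


areas = [700, 720, 710, 760, 400, 420, 445, 200]   # hypothetical pixel counts
cycle_id = split_cycles(areas)
```

The 10-pixel dip between the second and third frames is ignored, while the two large drops each start a new cycle.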

frames_away: latency

This accounts for periods of latency in which the bacterium is preparing for division and not growing significantly. There is a hidden variable, growth, in the plucking() method; the product of these two values sets bounds on growth. If this variable is 2, and the area 4 frames away from the current frame is outside 8x in both directions, then the bacterium is not experiencing much growth and is peacefully floating about.

Results

Figure 3a: Growth cycles for first bacterium.
Colors denote alternating growth cycles.
Figure 3b: The dimensions of the second plot are difficult to render.
The link below gives a closer look at the segmentation difficulties of bacterium 2.

Bacterium 2 Zoom



Modeling Growth: Linear or Exponential?


Residuals


Residuals were calculated from the theoretical values generated by fitting each model's parameters to the empirical data, cycle by cycle. The following boxplots show the residuals of the linear and exponential models for every growth cycle of bacterium 1. Note that the y-axes are aligned for comparison, and that we are treating each growth event as a categorical variable (this is not a bar graph). Source Code: residuals() function from modeling.py
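For readers who want to reproduce the idea without opening modeling.py, the sketch below fits both models to one growth cycle by least squares and returns the squared residuals. It is an illustration under stated assumptions: the two-parameter model forms a0 + k*t and a0*exp(k*t), the use of scipy's curve_fit, and the synthetic data are ours for this example, not necessarily what residuals() does.

```python
import numpy as np
from scipy.optimize import curve_fit


def linear(t, a0, k):
    return a0 + k * t


def exponential(t, a0, k):
    return a0 * np.exp(k * t)


def squared_residuals(t, area):
    """Least-squares fit of both models to one cycle; return squared residuals."""
    t = np.asarray(t, dtype=float)
    area = np.asarray(area, dtype=float)
    lin_params, _ = curve_fit(linear, t, area, p0=(area[0], 0.0))
    # A straight-line fit to log(area) gives a starting guess for the
    # nonlinear exponential fit.
    slope, intercept = np.polyfit(t, np.log(area), 1)
    exp_params, _ = curve_fit(exponential, t, area, p0=(np.exp(intercept), slope))
    lin_sq = (area - linear(t, *lin_params)) ** 2
    exp_sq = (area - exponential(t, *exp_params)) ** 2
    return lin_sq, exp_sq


# Synthetic cycle: noise-free exponential growth over 60 one-minute frames.
t = np.arange(60.0)
area = 800.0 * np.exp(0.01 * t)
lin_sq, exp_sq = squared_residuals(t, area)
```

On this synthetic cycle the exponential residuals are essentially zero while the linear ones are not, which is the pattern the boxplots are meant to reveal on real data.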

Figure 4a: Boxplots of linear (left) and exponential (right) squared residuals
for all 20 growth cycles of bacterium 1

The 91 cycles of bacterium 2 could not be compared comfortably side-by-side on the monitor, but the first 20 cycles are shown below. The links under the caption lead to the unabridged plots, with y-axes aligned accordingly, as well as to the difference between the linear and exponential squared residuals.

Figure 4b: First 20 growth cycles' boxplots of linear (left) and exponential (right) squared residuals
Difference between lin-exp for all cycles of bacterium 2 (bottom).
linear | exponential | differences

The difference plot is bisected by the zero line. More probability mass above this line indicates that the linear model deviates more from the empirical data, so the generative exponential model fits the data relatively more closely. Furthermore, the very long whiskers of the residual boxplots for both bacteria suggest that the residuals are not normally distributed and that the error may be heteroscedastic.

Akaike information criterion (AIC)


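Without going into more detail, the AIC can be generally understood as an estimate of the gap between the true generative distribution and a model distribution. The lower the AIC value, the closer the model is likely to be to the true generative distribution, and the better it is expected to predict new data. The AIC is given below, where l denotes the log-likelihood evaluated at the maximum likelihood estimate and p denotes the number of free parameters:

$$\mathrm{AIC} = -2l + 2p$$

The weight of model i among the candidate models j is given by:

$$w_i = \frac{\exp\left[-(\mathrm{AIC}_i - \mathrm{AIC}_{\min})/2\right]}{\sum_j \exp\left[-(\mathrm{AIC}_j - \mathrm{AIC}_{\min})/2\right]}$$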

After performing these calculations, AIC weights for both models were obtained for every growth cycle of both bacteria. For bacterium 1, 16/20 growth events had a negative difference in weights between the linear and exponential models (linear minus exponential), as did 53/92 growth events for bacterium 2. Since the differences in weights were negative more often than positive for both bacteria, we conclude that the exponential model is the more likely model.

Source Code: akaike() function from modeling.py
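Separately from akaike(), here is a small sketch of how AIC values and Akaike weights could be computed from squared residuals. It assumes independent Gaussian errors with the variance set to its maximum likelihood value, and counts that variance as one extra free parameter; the function names gaussian_aic and akaike_weights and the example numbers are hypothetical.

```python
import numpy as np


def gaussian_aic(sq_residuals, n_model_params):
    """AIC = -2*l + 2*p under an assumed i.i.d. Gaussian error model."""
    sq = np.asarray(sq_residuals, dtype=float)
    n = sq.size
    sigma2 = sq.sum() / n                                  # MLE of the variance
    log_like = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    p = n_model_params + 1                                 # +1 for the variance
    return -2 * log_like + 2 * p


def akaike_weights(aics):
    """Relative weights of candidate models from their AIC values."""
    aics = np.asarray(aics, dtype=float)
    rel = np.exp(-(aics - aics.min()) / 2)                 # shift for stability
    return rel / rel.sum()


# Hypothetical AIC values for a linear and an exponential fit of one cycle.
weights = akaike_weights([104.0, 100.0])
weight_difference = weights[0] - weights[1]                # linear minus exponential
```

With these example values the difference is negative, meaning the second (exponential) model carries more weight for that cycle.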

Further Analysis


Below, we explore the behaviors of interdivisional times and fractional return in the two bacteria. The growth_data() function in 'modeling.py' was used to extract relevant data for the plots below.

Interdivision times


A question we might ask is whether the duration of growth cycles is speeding up, slowing down, remaining fairly constant, or conforming to some other distribution. One could imagine reasons for a cell to divide differently as it ages. The ECDFs below show that, within our time frame of study, these times could be modeled as normally distributed.

Bacterium 1 | Bacterium 2
Figure 5: Interdivisional times fitted to a normal distribution.
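A comparison of this kind can be sketched as an empirical CDF set against the CDF of a normal distribution with the sample mean and standard deviation. The interdivision times below are made-up numbers for illustration, and the helper name ecdf is hypothetical.

```python
import numpy as np
from scipy import stats


def ecdf(values):
    """Return sorted values and the fraction of data at or below each one."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y


# Hypothetical interdivision times in minutes (not the measured data).
times = np.array([62.0, 58.0, 65.0, 60.0, 55.0])
x, y = ecdf(times)

# Normal model with maximum likelihood estimates of location and scale.
mu, sigma = times.mean(), times.std()
model_cdf = stats.norm.cdf(x, loc=mu, scale=sigma)
```

Plotting y and model_cdf against x on the same axes gives a quick visual check of how well the normal model describes the times.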

Another way to indirectly visualize this is to plot the minimum and maximum areas of each growth cycle on the same plot. Assuming each cycle follows consistent growth laws, the corresponding peaks and troughs suggest that the cell spends a constant time on each division; the ECDFs reveal more clearly how these times might be distributed.

Figure 6: Minimum and maximum areas splined with respect to time.

As discussed in this paper, Caulobacter appears to grow biphasically: a primary growth pattern resulting in constant increments between generations, and a secondary pattern with constant growth proportional to initial area. From the modeling above, we saw a dependency on initial area within each cycle, and the minimum and maximum area plot above shows that successive generations do indeed have relatively constant growth; in other words, the difference between the maximum and minimum area in each cycle fluctuates around a constant.

Fractional Return


Another question we could ask is how the parent cell size in the next cycle varies with the maximum joint size of parent and daughter cell in the previous cycle; in other words, how much 'substance' the parent bacterium is willing to part with. For example, if the division hovers around a 50/50 split, the parent and daughter sizes are approximately equal after division, and we would expect the initial size in the next cycle to be half of the final size in the previous cycle.

Hovering over individual data points in the plot below, we can see a sort of 'preferential model': the bacterium divides above or below an equilibrium fraction, giving less "stuff" to the daughter cell (higher fraction) if it is below its equilibrium size, and more (smaller fraction) when the bacterium cannot sustain its own size. We believe that this is the "critical capacity" size the paper refers to.

Figure 7: Fractions of birth size from current cycle / division size from previous cycle
splined with respect to time.
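The fraction in the caption can be sketched as follows, assuming areas have already been split into growth cycles. Taking the first frame of a cycle as the birth size and the cycle maximum as the division size is our assumption for this example, as are the function name fractional_return and the numbers used.

```python
import numpy as np


def fractional_return(areas, cycle_id):
    """Birth size of each cycle divided by the division size of the previous one."""
    areas = np.asarray(areas, dtype=float)
    cycle_id = np.asarray(cycle_id)
    cycles = np.unique(cycle_id)
    birth = np.array([areas[cycle_id == c][0] for c in cycles])       # first frame
    division = np.array([areas[cycle_id == c].max() for c in cycles])  # peak area
    return birth[1:] / division[:-1]


# Hypothetical areas (pixels) over three cycles.
areas = [400, 600, 800, 440, 650, 820, 450]
cycle_id = [0, 0, 0, 1, 1, 1, 2]
fractions = fractional_return(areas, cycle_id)
```

A fraction above 0.5, as in this toy example, corresponds to the parent keeping more than half of the joint size at division.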

Summary


Equipped with the elegant experimental design of the Scherer Lab, we were able to observe how individual bacteria grow with respect to time. From these images, we studied their behavior within and between cycles, and determined that not only do bacterial colonies grow exponentially, but the size of individual cells also grows exponentially. We observed that the parent bacterium divides a bit selfishly, retaining more than half of its size between growth cycles. Despite the many reactions and chaotic processes involved in cell division, the underlying growth followed a predictable pattern.

A Bayesian exploration of this dataset and model assessment can be found here.

Data Downloads



The original TIFF datasets were too large to be pushed to GitHub, but they are provided in the link below. The CSV files containing all relevant area values (tidied data, segmented areas) are also provided for convenience. All figures, including plots and screen-recorded GIFs, can be downloaded (numbered correspondingly) from the GitHub repo linked below. Formal publication and further discussion by the Scherer Lab are also available in their article.

original data | csv-roi files | article | GitHub Repository

Acknowledgments


We thank Iyer-Biswas of the Scherer lab for sharing this data, the BeBi103 TAs for their time and patience, Griffin Chure for this aesthetic source code, Gonzalez and Woods for their graphical explanations of structuring elements, the general community of StackExchange, without whom most of our code and analysis would not be possible, and lastly, Justin Bois for making this course possible.