A probabilistic model for crystal growth applied to protein deposition at the microscale

A probabilistic model for crystal growth applied to protein deposition at the microscale

V. J. Bolós, R. Benítez, A. Eleta-López, J. L. Toca-Herrera

Dpto. Matemáticas para la Economía y la Empresa, Facultad de Economía,
Universidad de Valencia. Avda. Tarongers s/n, 46022 Valencia, Spain.
e-mail: vicente.bolos@uv.es  rabesua@uv.es

Self-Assembly Group, CIC nanoGUNE,
Tolosa Hiribidea 76, 20018, Donostia/San Sebastián, Spain.
e-mail: a.eleta@nanogune.eu

Department of NanoBiotechnology, Institute for Biophysics,
University of Natural Resources and Applied Life Sciences (BOKU).
Muthgasse 11, 1190 Vienna, Austria.

e-mail: jose.toca-herrera@boku.ac.at
February 2018

A probabilistic discrete model for 2D protein crystal growth is presented. This model takes into account the available space and can describe growing processes of different nature due to the versatility of its parameters which gives the model great flexibility. The accuracy of the simulation is tested against a real protein (SbpA) crystallization experiment showing high agreement between the proposed model and the actual images of the nucleation process. Finally, it is also discussed how the regularity of the interface (i.e. the curve that separates the crystal from the substrate) affects to the evolution of the simulation.

1 Introduction

A crystal is a (three dimensional) periodic arrangement of repeating ‘structural motifs’, which can be atoms, molecules, or ions [1]. Technically, the process in which a (small) crystal becomes larger is called crystal growth. Crystallization is commonly referred to the fact of creating nucleation points first, followed by crystal growth. Crystal growth is important for the understanding of biomineralization, molecular diffusion and adsorption, or fractal formation [2, 3, 4]. Theoretical crystallization models distinguish between thermodynamic and kinetic conditions [5, 6]. For problems concerning growth rate and growth from limited resources, the Avrami [7, 8] and the Gompertz functions can be used, respectively, for data modelling [9].

Two dimensional in-vivo protein crystallization is commonly utilized to study the function and structure of proteins [10]. In-situ crystal growth can also be used to test growth and molecular adsorption models. In this work we have taken advantage of a bacterial protein, SbpA from Lysinibacillussphaericus CCM2177, which forms crystalline arrays on different interfaces [11]. In particular, SbpA is able to self-assemble from solution building square lattices on silicon dioxide and self-assembly monolayers [12]. Former studies on SbpA crystallization indicate on one hand that the process consists of a transition from amorphous to crystalline state [13], and on another hand, that protein-substrate determines the properties of the crystalline domain [14].

In this work we propose an approach to describe and model 2D protein crystal growth. The structure of the paper is as follows. In Section 2 we present the probabilistic model and define all the parameters involved in it. This model takes into account the available space for growing and also reproduces different shapes and porosity (or lacunarity). In Section 3 we analyze the different parameters and how they affect the development of the simulation. All these different tunable parameters give the model a very high flexibility and allow us to successfully simulate real protein crystallization processes. Finally, in Section 4 we test the model by simulating the growth of a SbpA crystal. We conclude the paper discussing how the regularity of the interface affects the evolution of the crystallization process.

2 The model

The model we present here is a discrete-space discrete-time model for a square region in which a regular square mesh is defined. At each step , one cell is filled by the crystal, and the corresponding time is denoted by , with the initial time. In our model, is computed at each step and it determines only the growth rate of the crystal, but it does not affect to the structure, i.e. its shape.

2.1 Structure

The space filled by the crystal when the th cell is occupied is given by the occupation matrix , defined by if cell is occupied, and if it is free.

In this way, we also define a probability matrix so that the probability of the cell to be occupied at the th step is given by . Nevertheless, in the simulation procedure we shall use a relative probability matrix which gives the number of chances that has a cell to be occupied, and it is related with in this way:

For example, if we want that each cell has exactly the same probability of being occupied at the beginning, then we can set . On the other hand, if we want the cells on a particular region to be twice as likely to be occupied at the beginning, then we can set for such cells and otherwise. Moreover, we can also force to have, for example, only one crystallization nucleus taking except for one cell, as it is done in Figure 5 with a cell in the middle of the square region.

In our model, proteins will depose onto the substrate, occupying cells, following the next three structure rules:

  1. If a cell is occupied, it cannot be occupied again, i.e. if , then .

  2. If a cell is occupied, then the probability of occupation of the adjacent free cells is increased, see (1).

  3. The probability of occupation of a free cell depends also on the available area in a neighborhood of that cell, see (1) and (2).

This rules are independent of and hence they only determine the structure of the crystal, not its growth rate. Note that the first rule implies that this is a fully 2D model, i.e. no height increase is considered here.

Also note that, taking into account the second rule, the space scale is much larger than the characteristic length of the crystal structure. That is, each cell does not represent a single crystal, but a larger quantity. In fact, the scale is such that one occupied cell only increases the probability of occupation of the adjacent cells.

In our model, we propose that the chance of occupation of a free cell at the th step is given by


where is a positive constant, is the number of adjacent occupied cells, and is a function related to the available area surrounding the free cell (see (2)). The real parameter determines the importance of the adjacent occupied cells and it is usually set to (see Section 3). Note that, if a free cell has not any adjacent occupied cell, then its chance of occupation is .

Remark 2.1.

In order to speed up the algorithm, the third rule can be modified by only considering free cells adjacent to an occupied cell and so, equation (1) would only apply to free cells with . Hence, the chance of occupation of a free cell with no adjacent occupied cells (i.e. ) would be instead of . It gives practically the same results when the number of crystallization nuclei is low and the algorithm is much faster.

For determining we have to take into account that in the diagonal directions there is a weight factor of (see Figure 1). This weight factor is set in order to avoid square-shaped evolution produced by the discretization in space, where circular ones should appear.

Figure 1: (a) Influence of an occupied cell (center) on the adjacent free cells. The weight in the diagonal directions is for avoiding square-shaped evolutions. (b) For a fully surrounded free cell (center), its total number of occupied adjacent cells is .

With respect to , first we have to define some concepts. The effective radius, , of a free cell in a given direction with an opening angle is the distance from the cell to the nearest occupied cell lying in the range of this direction. Then, the corresponding effective area is the area of the circular section with radius and angle (see Figure 2). The underlying idea is that the effective area is a zone where the free cell can capture proteins, but in this model, it is very simplified.

Figure 2: Effective area of a free cell in the north direction with an opening angle .

Although could be a parameter, in our model we take because it greatly simplifies the discretized algorithm and it produces results matching the real measures (see Section 3). Moreover, we only take into account effective radii lesser than a given parameter called influence radius, , beyond which, occupied cells do not hinder the crystal growth. So, if (even ) then we take . Next, we define the maximum effective radius of a free cell at the th occupation stage as the maximum of the corresponding effective radius, but only in the eight main directions (for operational purposes): east, northeast, north, northwest, west, southwest, south and southeast.

Hence, for a free cell, we define


where , , and hence, . The parameter determines the importance of the maximum effective radius in the model and it is usually taken close to . Setting means that the probability of occupation of a cell does not depend on the maximum effective radius and then, rule 3 is not satisfied. On the other hand, we usually set because in this case is the ratio between the effective area and the maximum possible effective area (i.e. the area of the circular section of radius and angle ). For convenience, for an occupied cell. With this statement, rule 1 is a consequence of equation (1).

2.2 Kinetics

Let be the area defined by the crystal measured in occupied cells. Note that, although in this model is a step function, in this work we model the physical phenomena in which can be considered as a derivable real variable function. Nevertheless, we are going to consider the occupied area proportion


that ranges from to . In this section, we are going to set and hence . Moreover, will denote the number of expected occupied cells per time unit at . It is not too important for the kinetics in a qualitative sense because, for example, a simulation with will produce practically the same results as with (leaving all other parameters unchanged), but in half the time.

With respect to the crystal growth rate, we can fit the the Avrami function, designed for modeling crystallization and some chemical reactions [7, 8]:


where and . However, in our case, where the crystal growth is 2-dimensional, can be considered equal to . Regarding , it is given by the initial crystal growth rate, since


The Avrami function fits well in the first stages of the crystal growth, but when the free space resource becomes scarce there are bigger differences (see Figure 8) since the Avrami function does not take into account this kind of resource. To solve this problem in the last stages, we can fit the Gompertz function, designed for growth with limited resources (in our case, the free space) [9]:


where . But note that this function can only model the crystal growth rate in an advanced state, since the initial occupied area in the Gompertz model is not zero. So, it can be done a piecewise fit using the Avrami function (4) for the first stages, and the Gompertz function (6) for the rest, when the lack of free space becomes significant.

Finally, as an alternative for the piecewise fit, we can use a unique “hybrid” Avrami-Gompertz function:


where . Note that (7) can only be applied in a time interval with because, according to (7), for . Moreover, the parameter in (7) is also equal to , as in (4), but the parameter in (7) is not the same as in (6).

3 The parameters

There are two kinds of parameters: structural and kinetic. The structural parameters determine the shape of the crystal, while the kinetic parameters are responsible for the crystal growth rate. We should also mention the parameter (size of the square mesh) which is of great importance since its value affects both, the structure and the kinetics of the crystal growth.

3.1 Structural parameters

  • : the “occupation chance multiplier” for free cells adjacent to an occupied one, see (1). Mainly, it is responsible for the number of crystallization nuclei along the process (see Figure 3).

    Figure 3: Number of crystallization nuclei along the process, from the beginning () to of total occupied area (), for different values of . We represent (solid lines) the means of simulations for , and respectively, with and for all cell . The rest of the parameters are , and , but they are not determinant. It is shown (dotted lines) that the number of crystallization nuclei follows a power law of the form with , (), , (), and , (). Note that parameter represents the expected number of nuclei at the end of the process () and, from this result, it seems to hold for .
  • : the opening angle, see Figure 2. In our model and so, it is not taken as a parameter. Nevertheless, it can be changed causing effects on the crystal shape: the greater the angle is, the bigger the “fjords” are (see Figure 4).

    Figure 4: Shape at (i.e. of total occupied area, see (3)) of crystallization nucleus for opening angle (left) and (right), taking , , and . The greater the angle is, the bigger the “fjords” are.
  • : the “sponginess” parameter, see (1). Determines the effect of adjacent occupied cells on a free cell, and it is usually set to . In Figure 5 it is shown how this parameter (along with ) affects the shape of the crystal.

    Figure 5: Shape at (i.e. of total occupied area, see (3)) of crystallization nucleus for different values of and , taking . The parameter determines the “sponginess” of the crystal, and represents the “difficulty of filling”. We have taken , although it is not determinant in these figures.
  • : the “difficulty of filling” parameter, see (2). It ranges from to and determines the importance of the maximum effective radius in the model. It is also related to the rate at which void regions are filled. Setting means that the probability of occupation of a cell does not depend on the maximum effective radius. In Figure 5 it is shown how this parameter (along with ) affects the shape of the crystal.

  • : the influence radius, see Figure 2. Determines the range at which occupied cells hinder the crystal growth, and so the width of the “channels” (see Figure 6).

    Figure 6: Evolution of crystallization nuclei (for occupied area proportion from to ) with different values of , taking , and . The parameter determines the width of the “channels”.
  • : the “effective dimension” parameter, see (2). It must be positive and, as it is explained before, in our model it is set to . Also determines how the maximum effective radius affects the crystal growth.

3.2 Kinetic parameters

  • : Parameters of the Avrami function (4), that models the first stages of the process. In our case, where the crystal growth is 2-dimensional, it can be considered . On the other hand, is determined by the initial crystal growth rate (see (5)).

  • : Parameters of the Gompertz function (6), that models the last stages of the process.

4 Results

4.1 Protein recrystallization

In order to validate the model we tested it against the growth of a bacterial which crystallizes forming structures called S-layers (SbpA). This SbpA protein was recrystallized on a SiO substrate and the overall process was scanned by an AFM obtaining height images (see [12] for the specific details of the experiment). In Figure 7, the first row of images shows the crystallization process for times ranging from 10 to 110 minutes. The crystal growth continues steadily until more than the 98% of the substrate is covered 12 hours later.

Taking the images of the crystal deposition at different time stamps, the total surface occupied by the crystal was estimated by transforming the images to black and white 8-bits images and using a self-developed MATLAB code which computes the number of black pixels over the total amount of pixels of the image. The second row of images in Figure 7 depicts, as an example, the transformation into black and white images of the original images shown in the top row.

Once the images are transformed and the occupied fraction is estimated, a growth curve is fit to the data according to the growth models described in Section 2.2 (see Figure 8). It is noteworthy how the Avrami model fits the data accurately at the beginning, but the growth rate for later moments is too high because it does not consider the limitations of space. As the crystal grows, the available space is reduced and so is the crystal growth rate. On the other hand, the Gompertz model captures the slow growth rate from the first 30 minutes, but it is not suitable for the first stages of the crystallization process. In fact, since the Gompertz growth model does not vanish at , it cannot be applied there. Finally, the Avrami-Gompertz model is able to accurately fit the growth rate of the recrystallization process both at the beginning and at the final steady state moments. The nonlinear least square fits were performed using the R statistical software [15] and the Levenberg-Marquardt algorithm provided by the minpack.LM package [16].

Figure 7: SbpA crystal growth sequence at different times. After the last time stamp at minutes, the crystal was stabilized, growing very slowly until the complete filling of the substrate more than 12 hours after the beginning.
Figure 8: Estimation of the occupied fraction at different time stamps, from to minutes (black dots) together with the three different kinetic growth models: Avrami, Gompertz and Avrami-Gompertz.

In order to assess the validity of the model as a description of the recrystallization process at the micro scale, we run several simulations using as initial condition the black and white image of the crystallization at minutes. The choice of this initial condition is based on the fact that, with a 50% of occupied fraction, despite the stochastic nature of the model, the simulation would give results that could be compared to the experimental data. Figure 9 shows the evolution of the recrystallization process for one of the runs of the simulation (all of them gave results which were indistinguishable at a glance). The top row shows the black and white pictures obtained from the original AFM images while the second row depicts the simulation results at the same occupation fractions. The equivalence between the occupation fraction and the time stamp was obtained by the Avrami-Gompertz kinetic model. The initial (leftmost) pictures are the same in both rows. It is remarkable the close agreement between the simulation results and the experimental data.

Figure 9: Comparison between real images (up) and simulation (down) at different times, with initial data corresponding to the real data at 20 minutes. The values of the structural parameters taken in the simulation are , and (with ). The kinetic parameters have been set to fit the real data, according to the Avrami-Gompertz model (7).

4.2 Influence of the contour regularity

Once the growth model has proven suitable for describing the protein recrystallization process, the question of the dependence of the growth rate on the regularity of the interface contour arises. The rationale behind all this is based on the fact that every system will evolve so its free energy is minimized. In crystal growth processes, the energy is highly dependent on the surface tension, or the line tension (for a 2D interface) [17]. Therefore, the higher the length of this tension line is, the higher the free energy will be. Thus if the system seeks to minimize the free energy in the fastest possible way, the crystal should grow the most in the parts were the interface line is longer. Since the length of a curve is closely related to its regularity, we should find that a crystal with an irregular contour grows faster (under the same conditions) than a crystal with a regular one.

In order to assess how the regularity of the contour affects the growth process in our model, a simulation experiment was conducted with the following initial condition: a regular and an irregular nucleus both with the same size (measuring 10000 occupied cells) were placed in opposite corners of a square big enough so the existence of any of the two nuclei did not affect the evolution of the other (see Figure 10 left picture). We run 2000 iterations of the simulation setting the parameters to , and . We also set for cells with no adjacent occupied cells so no new nuclei could appear. Figure 10 shows how the regular nucleus (bottom left corner) grows slower than the irregular nucleus (top right corner) and how its contour becomes less and less regular as the crystal deposition continues.

For a quantitative estimation of the differences between the regular and irregular nuclei, 100 simulations were conducted as described above for and . Figure 11 shows how the growth of the regular nucleus is slower than the irregular one. For example, in the case of , from the 2000 new cells grown by the simulation, only around 800 were added to the regular nucleus, while the other 1200 were added to the irregular one. It is also remarkable how, at the beginning, the differences in the growth rates are higher, and as the contour of the regular nucleus becomes more and more irregular, the growth rates are equalized. This also becomes clear in Figure 12, in which the ratio between the probabilities of occupation of a cell adjacent to both nuclei is plotted vs. the number of new deposited cells (). It can be seen how the curves are above 1, meaning that the irregular nucleus has more occupation probability than the regular one and how the curves decrease as the crystals grow, due to the fact that the differences in regularity between the two nuclei are less marked. In the long run, the quotients are asymptotically stabilized around a value slightly greater than 1, because the irregular nucleus has grown more in the initial stages of the simulation and these differences remain over time.

Figure 10: Evolution of two nuclei (irregular and regular), initially both with occupied cells. The values of the structural parameters taken in the simulation are , and (with ). We have taken for cells with no adjacent occupied cells to prevent the formation of new nuclei.
Figure 11: Evolution of the number of occupied cells for irregular and regular nuclei. We computed the mean of 100 simulations with the same parameter values as those used in the simulation of Figure 10 with (left) and (right).
Figure 12: Evolution of the probability of occupation quotient, given by the probability of occupation of a cell adjacent to the irregular nucleus divided by the probability of occupation of a cell adjacent to the regular nucleus, with and .


  • [1] Charles Kittle. Introduction to Solid State Physics, 7th ed. Wiley India Pvt. Limited, 2007.
  • [2] Arthur Veis and Jason R. Dorvee. Biomineralization mechanisms: A new paradigm for crystal nucleation in organic matrices. Calcified Tissue International, 93(4):307–315, Oct 2013.
  • [3] Chengbin Huang, Shigang Ruan, Ting Cai, and Lian Yu. Fast surface diffusion and crystallization of amorphous griseofulvin. The Journal of Physical Chemistry B, 121(40):9463–9468, 2017. PMID: 28885850.
  • [4] N. M. Semenova, A. A. Dick, V. N. Skokov, and V. P. Koverda. Fractal structurization under crystallization of amorphous germanium in three-layer films ag-ge-ag. physica status solidi (a), 139(2):287–293, 1993.
  • [5] C. Himawan, V.M. Starov, and A.G.F. Stapley. Thermodynamic and kinetic aspects of fat crystallization. Advances in Colloid and Interface Science, 122(1):3 – 33, 2006. ECIC-XVII, XVIIth European Chemistry at Interfaces Conference, 27 June-1 July, 2005, Department of Chemical Engineering, Loughborough University.
  • [6] Deliang Zhou, Geoff G. Z. Zhang, Devalina Law, David J. W. Grant, and Eric A. Schmitt. Thermodynamics, molecular mobility and crystallization kinetics of amorphous griseofulvin. Molecular Pharmaceutics, 5(6):927–936, 2008. PMID: 18986160.
  • [7] Melvin Avrami. Kinetics of phase change. i general theory. The Journal of Chemical Physics, 7(12):1103–1112, 1939.
  • [8] Melvin Avrami. Kinetics of phase change. ii transformation-time relations for random distribution of nuclei. The Journal of Chemical Physics, 8(2):212–224, 1940.
  • [9] Anna Kane Laird. Dynamics of tumor growth. British Journal of Cancer, 18(3):490–502, sep 1964.
  • [10] Jonathan P.K. Doye and Wilson C.K. Poon. Protein crystallization in vivo. Current Opinion in Colloid and Interface Science, 11(1):40 – 46, 2006.
  • [11] Uwe B Sleytr, Margit Sára, Dietmar Pum, and Bernhard Schuster. Characterization and use of crystalline bacterial cell surface layers. Progress in Surface Science, 68(7):231 – 278, 2001.
  • [12] Aitziber Eleta López, Susana Moreno-Flores, Dietmar Pum, Uwe B. Sleytr, and José L. Toca-Herrera. Surface dependence of protein nanocrystal formation. Small, 6, 2010.
  • [13] Sungwook Chung, Seong-Ho Shin, Carolyn R. Bertozzi, and James J. De Yoreo. Self-catalyzed growth of s layers via an amorphous-to-crystalline transition limited by folding kinetics. Proceedings of the National Academy of Sciences, 107(38):16536–16541, 2010.
  • [14] Ainhoa Lejardi, Aitziber Eleta López, José R. Sarasua, U. B. Sleytr, and José L. Toca-Herrera. Making novel bio-interfaces through bacterial protein recrystallization on biocompatible polylactide derivative films. The Journal of Chemical Physics, 139(12):121903, 2013.
  • [15] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2017.
  • [16] Timur V. Elzhov, Katharine M. Mullen, Andrej-Nikolai Spiess, and Ben Bolker. minpack.lm: R Interface to the Levenberg-Marquardt Nonlinear Least-Squares Algorithm Found in MINPACK, Plus Support for Bounds, 2016. R package version 1.2-1.
  • [17] A.-L. Barabási and H. E. Stanley. Fractal Concepts in Surface Growth. Cambridge University Press, Cambridge, 1995.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description