
framework, viewing experimental observations as "correction signals" for the model and updating the posterior distribution through the prior distribution and the likelihood function. In this way, different types of data can all be placed in the same logical system and fused across multiple dimensions in a probabilistic manner (Meyer et al., 2014).

In recent years, machine learning has also begun to play a role in integrating models and experiments. Some studies use machine learning to first predict the approximate range of model parameters and then guide experiments to focus on measuring the key ones, saving a considerable amount of work. Others use neural networks as "proxy models" that first learn the complex patterns in experimental data and are then combined with mechanistic models to improve prediction accuracy. A more systematic approach is the so-called "DBTL loop" (Design-Build-Test-Learn): the model is first used to design experiments, the resulting data are fed back to update the model after the experiments are completed, and the next round of improvement follows. Researchers have used such model-experiment closed loops to ensure that a synthetic circuit maintains stable function across different hosts.

However, there are also many pitfalls in integrating data and models. Experiments are inherently noisy and models are approximations, so uncertainty in the results is almost inevitable. When integrating, researchers therefore usually evaluate confidence intervals or run posterior predictive checks to see whether the model can truly explain all the data. If some type of data never matches, one must consider whether the model is too simplified or whether a key parameter is missing. Sometimes new mechanisms also need to be added to the model, such as host growth effects or resource competition modules.

4.3 Model validation and error evaluation criteria
The model has been built and the parameters have been determined, but this does not mean it is reliable. The next step is validation: checking whether the model can actually live up to its promises. Usually, the model's predictions are first compared with independent experimental data to see whether they reproduce the real trends or numerical characteristics. For instance, is the steady-state expression level calculated by the model close to the one measured experimentally? Does the period of an oscillation model match the rhythm observed in experiments? If these do not match, the problem most likely lies in the model structure or the parameter settings, and the model needs to be revised.

Visual comparison alone is not enough; quantitative standards for the gap are needed. Commonly used indicators include the mean squared error (MSE), the mean absolute error (MAE), and the coefficient of determination (R²), which measure how far the predictions are from the experiments. For dynamic systems, methods such as Dynamic Time Warping (DTW) are also used to quantify how well the simulated curve matches the experimental curve (Dahlquist et al., 2015). Sometimes, rather than only the overall error, certain key features are examined separately, such as the deviation of peak values, amplitudes, or steady-state values. A minimal implementation of these metrics is sketched below.
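The following sketch, assuming Python with NumPy, computes the error metrics named above on a toy time course; the one-gene induction curve and the noise level are illustrative placeholders, not data from this paper.

```python
import numpy as np

def mse(y_exp, y_sim):
    """Mean squared error between measured and simulated values."""
    y_exp, y_sim = np.asarray(y_exp, float), np.asarray(y_sim, float)
    return np.mean((y_exp - y_sim) ** 2)

def mae(y_exp, y_sim):
    """Mean absolute error."""
    y_exp, y_sim = np.asarray(y_exp, float), np.asarray(y_sim, float)
    return np.mean(np.abs(y_exp - y_sim))

def r_squared(y_exp, y_sim):
    """Coefficient of determination: fraction of variance explained."""
    y_exp, y_sim = np.asarray(y_exp, float), np.asarray(y_sim, float)
    ss_res = np.sum((y_exp - y_sim) ** 2)
    ss_tot = np.sum((y_exp - np.mean(y_exp)) ** 2)
    return 1.0 - ss_res / ss_tot

def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping distance between two curves."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy example: simulated vs. "measured" expression time course
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 50)
y_sim = 1.0 - np.exp(-0.5 * t)            # model prediction
y_exp = y_sim + rng.normal(0, 0.05, t.size)  # noisy hypothetical measurement
print(f"MSE={mse(y_exp, y_sim):.4f}  MAE={mae(y_exp, y_sim):.4f}  "
      f"R2={r_squared(y_exp, y_sim):.3f}  DTW={dtw_distance(y_exp, y_sim):.3f}")
```

Unlike the pointwise errors, DTW tolerates modest time shifts between the two curves, which is why it is useful when, for example, the phase of an oscillation drifts even though its shape is reproduced well.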
Furthermore, statistical tests (such as the chi-square test or the F-test) can be used to determine whether the model's error falls within a reasonable range. However, model validation is not merely about "matching the data". A good model should remain accurate even under conditions it was not fitted to: when the inducer concentration, the temperature, or the environmental background is changed, its predictions should still be reasonable. If a model performs well only on the training data but collapses once the conditions change, it is most likely overfitted and requires the removal of redundant parameters or the addition of regularization. Sensitivity analysis can also come in handy here: if the model is overly sensitive to a certain parameter and that parameter itself is not accurately estimated, the credibility of the entire prediction is compromised. At this point, either the experimental accuracy must be increased or the model must be restructured so that it is less affected by such parameters. A simple local sensitivity check is sketched below.
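As an illustration, here is a minimal local sensitivity sketch in Python: each parameter of a hypothetical steady-state model is perturbed by a small relative step and the normalized change in the output is recorded. The Hill-type repression model and its parameter values are hypothetical placeholders chosen for the example, not taken from the paper.

```python
import numpy as np

def steady_state_output(params):
    """Hypothetical steady state of a repressed gene (Hill-type repression):
    x* = alpha / (1 + (R / K)**n)."""
    alpha, R, K, n = params["alpha"], params["R"], params["K"], params["n"]
    return alpha / (1.0 + (R / K) ** n)

def local_sensitivities(model, params, eps=0.01):
    """Normalized local sensitivity S_p = (dy/y) / (dp/p) for each parameter,
    estimated with a forward finite difference of relative size eps."""
    y0 = model(params)
    sens = {}
    for name, value in params.items():
        perturbed = dict(params)
        perturbed[name] = value * (1.0 + eps)
        sens[name] = ((model(perturbed) - y0) / y0) / eps
    return sens

params = {"alpha": 10.0, "R": 2.0, "K": 1.0, "n": 2.0}  # illustrative values
for name, s in local_sensitivities(steady_state_output, params).items():
    print(f"S_{name} = {s:+.3f}")
```

Parameters with large |S| deserve the most careful experimental estimation: if a highly sensitive parameter is poorly constrained, the model's predictions inherit that uncertainty.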

5 Simulation and Calculation Methods
5.1 Numerical simulation algorithms (Euler, Runge-Kutta, etc.)
When studying genetic circuits, numerical simulation is almost an unavoidable step. Most models are composed of a set of nonlinear differential equations for which analytical solutions are essentially impossible to write down, so the equations have to be solved numerically, step by step.
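As a concrete illustration, here is a minimal sketch, assuming Python with NumPy, that integrates a toy one-gene expression ODE, dx/dt = alpha - delta*x (constant production, first-order degradation), with both the forward Euler method and the classical fourth-order Runge-Kutta (RK4) scheme; the rate constants are illustrative placeholders.

```python
import numpy as np

def f(x, alpha=2.0, delta=0.5):
    """Toy gene-expression ODE: dx/dt = alpha - delta * x."""
    return alpha - delta * x

def euler(f, x0, dt, n_steps):
    """Forward Euler: x_{k+1} = x_k + dt * f(x_k). First-order accurate."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + dt * f(x[k])
    return x

def rk4(f, x0, dt, n_steps):
    """Classical 4th-order Runge-Kutta: four slope evaluations per step."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        k1 = f(x[k])
        k2 = f(x[k] + 0.5 * dt * k1)
        k3 = f(x[k] + 0.5 * dt * k2)
        k4 = f(x[k] + dt * k3)
        x[k + 1] = x[k] + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

# Integrate with a coarse step and compare against the exact solution
dt, n = 0.5, 20
t = np.linspace(0.0, dt * n, n + 1)
exact = 4.0 - 4.0 * np.exp(-0.5 * t)   # analytical x(t) for x0 = 0
print("max Euler error:", np.max(np.abs(euler(f, 0.0, dt, n) - exact)))
print("max RK4   error:", np.max(np.abs(rk4(f, 0.0, dt, n) - exact)))
```

With the same step size, RK4's error is orders of magnitude smaller than Euler's, which is why it is the default workhorse for non-stiff circuit models, at the cost of four derivative evaluations per step instead of one.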