Environmental variability in aquatic ecosystems: Avenues for future multifactorial experiments


challenges, and knowledge gaps to contribute to a research agenda on the effects of variability in aquatic systems. We also provide guidance on how to investigate variability effects experimentally, considering modeling and experimental design.

Abstract
The relevance of considering environmental variability for understanding and predicting biological responses to environmental changes has resulted in a recent surge in variability-focused ecological research. However, integration of findings that emerge across studies and identification of remaining knowledge gaps in aquatic ecosystems remain critical. Here, we address these aspects by: (1) summarizing relevant terms of variability research including the components (characteristics) of variability and key interactions when considering multiple environmental factors; (2) identifying conceptual frameworks for understanding the consequences of environmental variability in single and multifactorial scenarios; (3) highlighting challenges for bridging theoretical and experimental studies involving transitioning from simple to more complex scenarios; (4) proposing improved approaches to overcome current mismatches between theoretical predictions and experimental observations; and (5) providing a guide for designing integrated experiments across multiple scales, degrees of control, and complexity in light of their specific strengths and limitations.
Organisms inhabiting aquatic ecosystems regularly experience temporal variability in multiple environmental factors simultaneously. In contrast, researchers tend to disproportionately utilize temporally static models and experiments, raising concerns for accurately predicting ecological responses amidst altered environmental variability accompanying global change (Wang and Dillon 2014; Zhang et al. 2016). Thus, the consequences of variability in environmental factors, involving single and multiple factors, have received an increased emphasis in recent empirical and theoretical studies (Gunderson et al. 2016; Koussoroplis et al. 2017; Safaie et al. 2018; Kapsenberg and Cyronak 2019; Ryo et al. 2019; Kroeker et al. 2020; Jackson et al. 2021; Pansch et al. 2022). This growing body of research brings increased complexity in designing and conducting experimental studies, and in interpreting the implications of the results. Variability research currently lacks an integrated framework for explicitly connecting key questions related to environmental variability with specific experimental approaches, especially those aimed at understanding the consequences of concurrent changes in the variability of multiple environmental factors. While synthesis literature is available regarding variability in terrestrial systems (Colinet et al. 2015, 2018), fewer efforts have been made to review environmental variability knowledge and incorporate it into experimental designs for aquatic ecosystems.
Here, we synthesize relevant terms and frameworks for single and multifactorial variability research and identify major research challenges in recent experimental work. Finally, we outline a path forward by discussing the integration of theoretical and experimental design aspects in aquatic system research. Although this work mainly focuses on experimental approaches in environmental variability research, such experiments should be conducted in conjunction with theoretical and field approaches, as shown in this synthesis. We focus on the temporal variability of abiotic environmental drivers; however, as mobile organisms can experience temporal variability due to existing spatial variability, we extend the scope of this contribution to additionally consider the spatial domain. This manuscript focuses on aquatic ecosystems, especially regarding the experimental aspects, but general variability background and terrestrial experimental examples are included for comparison and to fill research gaps.

Terms and definitions
Environmental variability characterizes natural ecosystems as having either deterministic (predictable, e.g., daily and seasonal light, pH, and temperature cycles) or stochastic (unpredictable, e.g., changes in estuarine and stream salinities) fluctuations (Fujiwara and Takada 2017;Kapsenberg and Cyronak 2019;Pansch and Hiebenthal 2019;Dobry et al. 2021). Natural variability in the same factor, in turn, may be perceived by organisms in a variety of ways and scales. The performance of photosynthetic organisms in aquatic systems, for instance, can be affected by light variability, such as changes in light intensity and spectrum, as seasonal or daily sinusoidal fluctuations, or rapid fluctuations due to vertical mixing of the water column, movement of organisms, and clouds (Shatwell et al. 2012;Hintz et al. 2022;Neun et al. 2022). Some environmental factors are not only influenced by abiotic but also by biotic processes. Water pH can show relevant daily fluctuations associated with photosynthesis rates in habitats with microalgae or seagrass dominance as well as stochastic variations associated with freshwater inputs (Rivest et al. 2017;Wahl et al. 2018;Cyronak et al. 2020). Similarly, oxygen and carbon dioxide availability are affected by temperature (gas solubility and water level stratification) and by the balance of respiration and photosynthesis rates (Flanagan and McCauley 2010;Yvon-Durocher et al. 2010;Roman et al. 2019;Rasmusson et al. 2020;Wahl et al. 2021). These examples show how natural systems comprise multiple factors varying simultaneously in different manners (i.e., at different scales), generating natural complex patterns of variability.
Anthropogenic activities can also generate, exacerbate, or modify variability patterns in ecosystems, increasing stressful conditions for organisms (e.g., heatwaves, pollutant concentrations) (Striebel et al. 2016; Pansch et al. 2018; Woolway et al. 2021; Polazzo et al. 2022). Novel environmental variability produces effects at different temporal and spatial scales (Boyd et al. 2016; Kroeker et al. 2020), by directly affecting the performance of organisms, or indirectly through species interactions (e.g., affecting consumers through changes in producers) (Berger et al. 2014; Litchman et al. 2015; Harris et al. 2018). In addition, global changes caused by anthropogenic activities, such as continuous press disturbances, may affect organisms differently depending on whether they experience naturally variable or more constant environments. It has been proposed that warming effects on organisms' performance can be exacerbated by thermal fluctuations in comparison to warming effects under constant conditions (Cabrerizo et al. 2021; González-Olalla et al. 2022). However, water pH fluctuations under ocean acidification scenarios show less conclusive results, where negative, neutral, or positive effects on performance have been detected for corals and coralline algae when exposed to fluctuations under low pH (Cornwall et al. 2013, 2018; Rivest et al. 2017). Such results highlight the relevance of considering natural variability when attempting to predict biotic responses to global change and the need for additional research examining a broad range of variables.
Natural and anthropogenic-induced dynamics of variability can thus generate a wide range of environmental variability patterns described by different characteristics (i.e., components), including the magnitude, frequency, and predictability of variation (Box 1). Such components are commonly assessed in variability research, depending on the investigation scale (e.g., single or multiple events, see Ryo et al. 2019 for synthesis in temporal dynamics), but ecologists still lack a common framework. Establishing consensus on how to define variability components and how to analyze their effects in a multifactorial context is not trivial and doing so would facilitate the query for specific research directions, enhance communication among researchers, and allow the comparison of outcomes. Thus, the first step for a variability research framework necessitates producing common definitions for the sources of variability, the components describing variability, and the types of variability effects in a multifactorial context (summarized in Box 1).

How to approach biological consequences of environmental variability
Our current experimental knowledge on biological responses (e.g., traits) to environmental factors is primarily generated in constant laboratory environments with a focus on changes in mean values, or as ramping assays (gradual change) in longer-lived organisms (Rezende et al. 2014). These responses can often be modeled as nonlinear functions of the environmental gradient (Angilletta 2006; Denny 2017). In a variable environment, however, plugging the average value of the environment into the nonlinear function can approximate, underestimate, or overestimate the observed response (Niehaus et al. 2012). Such bias can result from Jensen's inequality (Jensen 1906), a mathematical property that implies different biological responses between variable and constant environments even if the two environments share the same average conditions (Hastings and Caswell 1979; Ruel and Ayres 1999). This property predicts that for convex (i.e., accelerating) regions of response functions, environmental variability increases the mean biological response relative to constant environmental regimes with similar mean conditions, whereas for concave (i.e., decelerating) regions of response functions, variability effects are reversed (Ruel and Ayres 1999). To predict how much biological responses differ, that is, the variance effect, one can use statistical methods such as those developed in the Scale Transition Theory (Chesson et al. 2005). The magnitude of the increase or decrease of the response is defined by the function's curvature and the environmental factor's statistical variance. The Scale Transition Theory is now increasingly used in ecology to formulate and test hypotheses of thermal fluctuation effects on vital rates at different levels of biological organization from performance curves obtained under constant or static conditions in experimental (Kingsolver and Woods 2016; Bernhardt et al. 2018; Gerhard et al. 2019) and theoretical (Vasseur et al. 2014; Dowd et al. 2015; Denny 2017, 2019) studies. For example, Bernhardt et al. (2018) confirmed the above predictions experimentally by comparing phytoplankton thermal performance curves under constant and variable thermal regimes, where observed performance under variable conditions matched the predicted values for the convex and concave parts of the curve measured under constant conditions (see also Morash et al. 2018 for an example on fish). Although Scale Transition Theory predictions are theoretically valid for environmental variables other than temperature (Denny and Dowd 2022) that show nonlinear performance curves (e.g., resource supply, Litchman 2000; Bestion et al. 2018), this approach has been used less often for testing performance under constant and fluctuating conditions in other factors (see Section Challenges of integrating inferences from experimental data and theory on effects of environmental variability).
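The variance effect described above can be illustrated numerically. The following is a minimal sketch under stated assumptions, not a method taken from the cited studies: the Gaussian thermal performance curve (`performance`, with invented parameters `t_opt` and `width`) and the sinusoidal daily temperature cycle are hypothetical. The sketch compares performance at the mean temperature, the mean performance across the fluctuating regime, and the second-order approximation f(mean) + 0.5 × f''(mean) × variance, which expresses the role of curvature and variance mentioned in the text.

```python
import math
import statistics


def performance(temp, t_opt=25.0, width=8.0):
    """Hypothetical Gaussian thermal performance curve (illustration only)."""
    return math.exp(-((temp - t_opt) / width) ** 2)


def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of the curvature f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


def daily_cycle(mean_temp, amplitude, n=24):
    """Sinusoidal temperatures sampled at n equally spaced times over one period."""
    return [mean_temp + amplitude * math.sin(2.0 * math.pi * k / n) for k in range(n)]


def variance_effect(f, temps):
    """Return (response at mean, mean response, second-order approximation)."""
    mean_t = statistics.fmean(temps)
    var_t = statistics.pvariance(temps)
    at_mean = f(mean_t)
    mean_response = statistics.fmean([f(t) for t in temps])
    approx = at_mean + 0.5 * second_derivative(f, mean_t) * var_t
    return at_mean, mean_response, approx


if __name__ == "__main__":
    # 15 degrees lies on the convex (accelerating) flank of this assumed curve,
    # 25 degrees on the concave region around the optimum.
    for mean_temp in (15.0, 25.0):
        at_mean, mean_response, approx = variance_effect(
            performance, daily_cycle(mean_temp, amplitude=2.0)
        )
        print(mean_temp, at_mean, mean_response, approx)
```

With this assumed curve, fluctuations around 15 °C raise mean performance above the value at the mean temperature, whereas fluctuations around the optimum lower it, in line with the convex and concave cases of Jensen's inequality. The approximation is only expected to hold while the fluctuation amplitude stays small relative to the scale over which curvature changes.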
Scale Transition Theory is a valuable tool for understanding the effects of variability and for stating qualitative hypotheses, but like any statistical method, it has limitations. First, mathematically, the Scale Transition Theory equations are exact only in the case of precisely quadratic biological response functions. In all other cases, the equations are reasonable approximations as long as environmental variability remains relatively low. Second, even when the predictions are mathematically exact, Scale Transition Theory predictions can differ from observations, indicating that other mechanisms besides nonlinearity are in play (Niehaus et al. 2012; Koussoroplis et al. 2017). Variability may induce compensatory acclimation or stress that changes patterns of retrieved performance functions (Vajedsamiei et al. 2021b). Gradual plasticity (i.e., progressive phenotypic adjustments by individual organisms or populations in response to environmental change) amidst environmental fluctuations can generate mismatches between the realized performance and expected responses caused by the delay in phenotypic adjustments relative to environmental change (Kremer et al. 2018; Fey et al. 2021). This implies that realized performance depends not only on the variance of environmental factors but also on the temporal scale at which the factors fluctuate, a phenomenon termed time-dependent effects (Rezende et al. 2014; Dowd et al. 2015; Kingsolver and Woods 2016; Koussoroplis et al. 2017; Vajedsamiei et al. 2021b). Time-dependent effects have been experimentally studied for temperature variability, demonstrating how increased duration of exposure to stressful temperatures can narrow organismal thermal tolerances (Rezende et al. 2014), and how daily fluctuations can exacerbate acute responses to high temperatures (Kingsolver et al. 2015). Thus, gradual plasticity can enhance or decrease realized performance compared to expectations for thermal variation depending on the thermal acclimation history, as has been demonstrated experimentally (Kremer et al. 2018; Fey et al. 2021; Vajedsamiei et al. 2021b). Most information for building this framework is based on temperature studies; however, studies involving other environmental variables also support these ideas. Under slow light fluctuations, phytoplankton has been shown to integrate growth rates from extreme values by adjusting to changes in light intensity, while growth under fast fluctuations reflects the average growth rates of constant conditions (Litchman 2000). Furthermore, simulations of salinity reduction caused by storms showed that not only the salinity level (i.e., magnitude) but also the length and timing of the event alter the growth of marine snail larvae (Richmond and Woodin 1996). Here, larvae exposed to later, longer, or lower salinity events were smaller after the salinity manipulations (simulated storms) ended (Richmond and Woodin 1996).

Box 1. Important terms (definitions) in multifactorial environmental variability ecology.

Factor identity
Environmental factor(s) that are considered (measured or manipulated), for example, nutrient and pollutant concentrations, temperature, light spectrum and intensity, oxygen concentration, salinity, pCO2/pH, hydrodynamic patterns. Such environmental factors have direct effects on organisms and indirect effects by altering interactions. The relevance of different factors might differ across systems and sites (e.g., within and between ponds, lakes, rivers, and marine systems).

Scale
Depending on the target organizational level and organism life span, variability is perceived differently (e.g., as continuous or fluctuating) and therefore different scales may have to be considered (Petersen et al. 2009; Boyd et al. 2016; Gunderson et al. 2016; Jackson et al. 2016; Kroeker et al. 2020).
• Temporal: From minutes to subdaily to decadal timescales. Temporal scales are related to the duration of environmental events and biological processes (e.g., acute vs. chronic responses, organism's lifespan).
• Spatial: From nm-μm to km across latitudinal and longitudinal gradients. In aquatic ecosystems the spatial scale can be described by different variables such as length, depth, area, volume, shape, and habitat heterogeneity.

Components of variability
A wide range of characteristics of environmental variability drivers is used for describing natural dynamics or novel patterns. Patterns of variability may become stressors in disturbed ecosystems. Such characteristics are frequently used in variability research depending on the investigation scale but lack a common language and definitions. Different metrics are commonly used for describing variability according to the scale of evaluation (Ryo et al. 2019):
Short scales/single events
• Magnitude: The amount of change in one factor; characterizes the intensity of one or more events (e.g., the amplitude of daily temperature fluctuations).
• Rate of change: Indicates whether the experienced variability is abrupt or gradual.
• Duration: The length of a specific event (Coble et al. 2016). For example, the time between start and end (dates) of heatwaves (Oliver et al. 2018).
Large scale
• Variance: Measures the distribution of an environmental variable around its mean (Coble et al. 2016) and provides information about deviation from mean values at different scales of interest. The amplitude is also commonly used to approximate the variance of deterministic patterns of variability (e.g., diurnal cycles).
• Frequency: The number of regular cycles or events that occur within a given time period. It can be analyzed using spectral analysis and Fourier transformation (Dillon et al. 2016; Kroeker et al. 2020). The same concept can be used for spatial variation and is referred to as the grain size of environmental patches.
• Autocorrelation: This aspect has been analyzed as the color of noise (Travis 2001; Vasseur and Yodzis 2004), where variability is defined as white noise when no autocorrelation is present and variance is constant at any scale. Red noise, in turn, has stronger autocorrelation and increasing variance with time and/or space. A higher autocorrelation usually implies higher predictability. Predictability has also been considered as the consistency in the magnitude and timing of environmental fluctuations (Kroeker et al. 2020).

Multifactorial context
When more than one factor is evaluated, there are different aspects and effects to consider.
Combined (interactive) effects:
• Additive: When the response to the combination of more than one factor equals the sum of the responses to each factor added singly.
• Nonadditive (or interactive): When the response to the combination of more than one factor differs from the sum of the responses to each factor added singly.
  Synergistic: When the response to the combined addition of more than one factor is larger than the sum of the responses to each factor added singly (Koussoroplis et al. 2017).
  Antagonistic: When the response to the combined addition of more than one factor is smaller than the sum of the responses to each factor added singly (Koussoroplis et al. 2017).
Special cases of nonadditivity:
• Cross-dependence: The influence of the level of one factor on the shape of the system response (response curve) to another factor (Fig. 2, Koussoroplis et al. 2017).
• Covariance effect: The sensitivity of the system response to the covariance of multiple factors (Koussoroplis et al. 2017). Environmental factors can show positive (in-phase) or negative (out-of-phase) covariation dynamics (Fig. 2).
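The autocorrelation component defined in Box 1 (the color of noise) can be made tangible with a small simulation. The sketch below is an illustrative assumption rather than a procedure from the cited literature: it uses a first-order autoregressive process as a simple stand-in for reddened noise, with a hypothetical parameter `phi` controlling autocorrelation (`phi = 0` gives white noise). The function names `ar1_series` and `lag1_autocorrelation` are invented for this example.

```python
import random
import statistics


def ar1_series(n, phi, sigma=1.0, seed=0):
    """Simulate x[t] = phi * x[t-1] + noise, an assumed model of colored noise.

    phi = 0 yields uncorrelated (white) noise; phi closer to 1 yields
    increasingly autocorrelated (reddened) noise.
    """
    rng = random.Random(seed)
    x = 0.0
    series = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma)
        series.append(x)
    return series


def lag1_autocorrelation(series):
    """Sample autocorrelation between consecutive values of the series."""
    mean = statistics.fmean(series)
    numerator = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    denominator = sum((a - mean) ** 2 for a in series)
    return numerator / denominator


if __name__ == "__main__":
    white = ar1_series(5000, phi=0.0)
    red = ar1_series(5000, phi=0.8)
    print(lag1_autocorrelation(white), lag1_autocorrelation(red))
```

Note that a stationary autoregressive process only approximates red noise in the sense used in Box 1, since its variance levels off rather than increasing indefinitely with the length of the series; it is used here only because it is easy to generate as a treatment sequence with a chosen degree of predictability.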

Integrating multifactorial variability
The performance of biological systems is rarely determined by a single environmental factor (Vinebrooke et al. 2004;Boyd et al. 2016). Co-occurring factors may combine and lead to different effects than those arising from the two factors in isolation, driven by synergistic or antagonistic effects (Box 1, Gunderson et al. 2016;Jackson et al. 2016). Such interactions among factors can be highly complex since the curvature of the response to each factor (when considering a gradient) as well as the joint factor effect (additive, nonadditive) can change for different conditions (cross-dependence, Sperfeld et al. 2016;Koussoroplis et al. 2017). The effect of temperature and nutrients on primary producers is a well-studied example of complex interactions. Experiments show that high temperature increases phytoplankton biomass when nutrient supply is high, but the temperature effect disappears or even turns negative under low nutrient regimes (Hennemann and Petrucio 2010;De Senerpont Domis et al. 2014;Verbeek et al. 2018). When extending the levels of each factor to gradients, nutrient availability was shown to change phytoplankton's thermal performance curves by changing the thermal optima and breadth (Thomas et al. 2017;Bestion et al. 2018;Aranguren-Gassis et al. 2019). Thus, the effect size of the thermal variance changes in accordance with how nutrients shape thermal performance curves (Koussoroplis et al. 2017;Gerhard et al. 2019). More generally, this means that if interactive effects change along the environmental gradient, sensitivity to variance might change in effect size and even direction (Sperfeld et al. 2016).
Scale Transition Theory equations can be readily expanded to multivariate response functions, and provide important insights regarding the link between the way factors jointly act upon the biological response and their variance effects (Chesson et al. 2005; Denny 2016; Koussoroplis and Wacker 2016; Koussoroplis et al. 2017). If the effect of multiple factors on a biological response is additive, the effect of their variances is also additive, and the factors can be considered separately. However, when the effect of two environmental factors on a biological system is nonadditive (presence of interactive effects), the biological response also depends on the statistical covariance between the factors, that is, the covariance effect (Denny 2016; Koussoroplis and Wacker 2016; Koussoroplis et al. 2017, 2019). Theory predicts that covariance effects are proportional to the covariance between the factors and the degree of nonadditivity (i.e., the magnitude of the response to the combination of two factors compared to the sum of the effects of each factor alone). The direction of the covariance effect changes with the nonadditivity type (antagonistic or synergistic). For example, if two factors covary positively and their effect is synergistic, the covariance effect will tend to increase the biological response, but the response will tend to decrease if the two factors act antagonistically (Koussoroplis et al. 2017). Given the high relevance of nonadditive effects on organisms' performance (Gunderson et al. 2016; Jackson et al. 2016) and the correlation of factors at temporal and spatial scales (Cyronak et al. 2020; Denny and Dowd 2022) in aquatic systems, covariance effects are likely common in nature. Experimental manipulations of positive and negative covariation of temperature and food availability have supported general theoretical expectations showing a significant covariance effect on Daphnia life-history traits (Koussoroplis and Wacker 2016). 
Negative covariation between factors showed the lowest trait values, mirroring the synergistic effects of these factors measured under constant conditions. Interestingly, in the positive covariation scenario the covariance effect on performance was higher than inferred by modeling (Koussoroplis and Wacker 2016). As for single factors, gradual phenotypic plasticity can lead to time-dependent effects in multifactorial scenarios and result in deviations of observed trait values from predictions. An example in Daphnia showed that the covariance effect (effect size and direction) of two fluctuating resource regimes on growth is mediated by fluctuation frequency and, therefore, time for acclimation as well as nutritional reserve effects.
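The way the covariance effect depends on the type of nonadditivity can be made concrete with a toy calculation. The sketch below is a hedged illustration, not the model used in the cited Daphnia experiments: it assumes a hypothetical bilinear response `x + y + c * x * y`, where the invented interaction coefficient `c` is positive for synergistic, negative for antagonistic, and zero for additive effects, and two factors that fluctuate sinusoidally either in phase or out of phase. For this particular response, the difference between the mean response and the response at mean conditions equals `c` times the covariance of the factors.

```python
import math
import statistics


def response(x, y, c):
    """Hypothetical bilinear response; c sets the sign and strength of nonadditivity."""
    return x + y + c * x * y


def sinusoid(mean, amplitude, phase, n=48):
    """Factor values sampled at n equally spaced times over one full cycle."""
    return [mean + amplitude * math.sin(2.0 * math.pi * k / n + phase) for k in range(n)]


def covariance(xs, ys):
    """Population covariance of two equally long series."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    return statistics.fmean([(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)])


def covariance_effect(xs, ys, c):
    """Mean response under joint fluctuation minus the response at mean conditions."""
    mean_response = statistics.fmean([response(x, y, c) for x, y in zip(xs, ys)])
    return mean_response - response(statistics.fmean(xs), statistics.fmean(ys), c)


if __name__ == "__main__":
    factor_a = sinusoid(10.0, 2.0, phase=0.0)
    in_phase = sinusoid(5.0, 1.0, phase=0.0)        # positive covariation
    out_of_phase = sinusoid(5.0, 1.0, phase=math.pi)  # negative covariation
    for c in (0.5, 0.0, -0.5):
        print(c, covariance_effect(factor_a, in_phase, c),
              covariance_effect(factor_a, out_of_phase, c))
```

In this toy setting, positive covariation raises the mean response when the interaction is synergistic and lowers it when the interaction is antagonistic, and the effect vanishes for an additive response, which mirrors the qualitative predictions summarized above. Real response surfaces are rarely bilinear, so curvature in each factor would add separate variance effects on top of this term.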
It is important to note that beyond the illustrative examples described here, other environmental variables also interact in complex ways altering performance of a variety of biological systems and therefore highlighting the relevance of considering such complex multifactorial scenarios in variability research. Temperature has been shown to interact with irradiance affecting phytoplankton growth rate, and the effect of this interaction is shaped by the photoperiod regime (

Variety of experimental approaches
Experimental investigations on the effects of environmental fluctuations on biological systems often show contradictory results (i.e., different sizes and directions of variability effects). Marine organisms have shown neutral, positive, and negative acclimation responses to thermal fluctuations (Morón Lugo et al. 2020; Vajedsamiei et al. 2021b), which may be attributed to differences among systems (natural variability driving adaptation), type of organisms (short- vs. long-lived), and the measured response (Jackson et al. 2021). The thermal dependence of measured traits (i.e., thermal performance curves) can vary substantially across levels of biological organization, and thus performance depends on the response variable that is measured. For example, thermal tolerance can decline when scaling up from molecular levels to populations, as shown in terrestrial ecosystems (Rezende and Bozinovic 2019). This may influence estimates of potential detrimental effects of variability on performance and species persistence at high temperatures. Mismatches can also originate from the experimental scale used to test predictions. For long-lived organisms, short-term thermal performance curves measured during a thermal ramp (e.g., increasing at hourly intervals) might differ from long-term thermal performance curves. This discrepancy can generate differences between observations and expectations under diel thermal fluctuations because different scales of environmental change are involved (Rezende et al. 2014). Such aspects are, at least in part, a consequence of the lack of common criteria for including variability components at different scales and biological levels of organization (Thompson et al. 2013; Colinet et al. 2015), which hampers attempts to generalize across systems and test predictions.

Information about different environmental factors
For environmental variables other than temperature the information is sparse, and approaches differ among variables, but some comparisons between constant and fluctuating environments are available. Variable light affects phytoplankton performance differently than constant light (Shatwell et al. 2012; Theus et al. 2022) and the effect of fluctuations depends on the average irradiance and the period of fluctuations (i.e., the part of the response curve that is analyzed and the frequency of fluctuations) (Litchman 1998, 2000; Shatwell et al. 2012). Experimental studies using salinity gradients showed that marine mussels' vital rates (filtration, respiration, among others) might have nonlinear response curves (Guzmán-Agüero et al. 2013; Peteiro et al. 2018), but the effects of salinity fluctuations have not been addressed using the performance framework. However, an experiment on the estuarine copepod Acartia tonsa showed that salinity fluctuations decreased gross growth efficiency in comparison to constant environments with the same average salinity (Martínez et al. 2020). Since changes in salinity require osmoregulation by the organisms (Evans and Kültz 2020), such effects might be associated with an increased energetic cost related to adjustments in osmoregulation (Martínez et al. 2020). It has also been shown that the growth rate of coralline microalgae declines in response to daily pH fluctuations (Cornwall et al. 2013). However, opposite patterns of growth responses to pH fluctuations have been reported in corals and coralline algae (summarized in Rivest et al. 2017), as well as for other vital rates (e.g., calcification rates, Cornwall et al. 2018). Overall, knowledge of biotic processes under environmental fluctuations is limited in comparison to constant environments, and the applied approaches also differ among variables (e.g., the performance framework has to date mostly been applied to temperature, Denny and Dowd 2022).

Mismatch between theoretical predictions and experimental results
In addition to experimental approaches, theoretical studies have increasingly addressed the effects of variability on performance, but model predictions often lack experimental tests required to extrapolate their expectations to ecologically relevant conditions (Morash et al. 2018). How accurately reaction norms established for constant or ramp treatments anticipate the performance of ecological systems experiencing environmental fluctuations has been tested in aquatic ecosystems, and many studies showed deviations between observed and predicted patterns for a variety of organisms (anuran larvae, Here, insects showed faster development when colder temperatures were more frequent (i.e., a higher proportion of cold days), possibly due to an adaptive response to seasonal shifts (Khelifa et al. 2019). This highlights that the extrapolation of conceptual (i.e., proof of concept) experimental results to natural conditions must be done with caution and underscores the need of understanding processes underlying more complex scenarios that more closely approximate natural ecosystems.

Incorporation of abiotic and biotic complexity in variability research
Few studies have assessed more complex abiotic (i.e., environmental components of variability and multiple factors) and biotic (i.e., responses at different levels of biological organization and biological interactions) scenarios. Experimental efforts addressing complex environmental dynamics (abiotic complexity), involving not only fluctuations per se, but changes in fluctuation frequency (Pansch and Hiebenthal 2019;Wang et al. 2019;Kunze et al. 2022), predictability (Fey and Wieczynski 2017;Shama 2017;Rescan et al. 2020;Burton et al. 2020), or considering multiple factors (Koussoroplis and Wacker 2016;Koussoroplis et al. 2019) are scarce and comprise a variety of approaches and types of organism addressed (phytoplankton, zooplankton, mussels, and fish). Covariance effects (i.e., including fluctuations of two factors), for example, have been evaluated connecting models and experiments, but examples are rare and mostly limited to simplified laboratory approaches (Koussoroplis and Wacker 2016;Koussoroplis et al. 2019). Empirical tests across environmental gradients and ecologically relevant scenarios (i.e., representing a wide range of levels and combination of factors that simulate natural variance and covariance effects; e.g., Wahl et al. 2021) are much needed for covering different components of variability and multiple factor effects.
When considering investigations at higher biological levels of organization beyond physiological responses, for example, population and community level responses, two important aspects are added: inter- and intra-specific competition. Specifically, fluctuation-dependent mechanisms such as the storage effect can play a key role in the maintenance of biodiversity when community responses to environmental variability allow for reduced inter-specific competition and stable coexistence of species (Chesson 2000; Descamps-Julien and Gonzalez 2005). These expectations were supported by experiments testing phytoplankton diversity responses to fluctuating vs. constant light (Litchman 1998; Flöder et al. 2002). However, experiments evaluating the effects of environmental variability on phytoplankton communities have shown that species richness can decrease (Burgmer and Hillebrand 2011) or increase (Gerhard et al. 2019) as a consequence of temperature fluctuations. Moreover, it has been proposed that there is a strong link between functional biodiversity and environmental variability (Guislain et al. 2019), suggesting that variability may have drastically different effects on diverse communities compared to the individual or population level. Although there is little experimental work evaluating how diversity mediates the effects of environmental variability on community functions, laboratory experiments showed that communities composed of higher numbers of species can buffer the negative effects of thermal variance on phytoplankton biomass (Bestion et al. 2021) and aquatic fungal leaf decomposition (Gonçalves et al. 2015). Interestingly, studies evaluating the effects of fluctuations on communities generally found that a few (likely) well-adapted species (e.g., species with wider thermal range) dominate the communities under fluctuating conditions (Burgmer and Hillebrand 2011; Rasconi et al. 2017; Gerhard et al. 2019; Zhang et al. 2019; Bestion et al. 2021; Cabrerizo et al. 2021). 
This is surprising considering that natural ecosystems are regularly exposed to variability and are composed of diverse communities. Thus, it is important to note that these experiments are often logistically limited (conducted in closed systems, simulating unrealistic variability dynamics), which influences the capacity to realistically simulate and understand the role of variability in diverse communities (and meta-communities). Sometimes, the effect of variability may also be subtle when looking at species composition, but more pronounced in other measures, such as biochemical composition (Marzetz and Wacker 2021). The presence of such complex ecological processes makes it difficult to design, conduct, and interpret experimental studies and their connection with theory.
Increasing biological complexity typically results in large divergence between model predictions and experimental outcomes. How environmental variability is dampened or amplified in communities of more than one trophic level has been primarily approached by models (Dee et al. 2020; Simon and Vasseur 2021). Conversely, many experiments (especially large mesocosm experiments) use scenario-based approaches instead of theory-based hypotheses or model assumptions. This was highlighted by Kharouba and Wolkovich (2020), who showed how theory and observations were disconnected in research on climate-change-driven phenological mismatch, since collected data failed to test theoretical assumptions (i.e., lack of pairwise per capita fitness) and to define a baseline (defining the range of natural variation in the timing of species interaction before climate change effects occur). The mismatch between theory and empirical work has recently been pointed out beyond environmental variability research, showing that empirical ecologists often conduct experiments without considering theoretical projections (Korell et al. 2020), while theoretical ecologists use models limited by their assumptions for addressing more realistic scenarios (Klausmeier et al. 2020). This suggests that the lack of a common language, exchange, and understanding between theorists and experimentalists might cause mismatches (Grainger et al. 2022). When criticizing that few experimental studies are hypothesis-driven, is that so because there is a lack of theoretical predictions for complex systems that could be used as hypotheses, or do we simply fail to translate existing theoretical predictions for our specific experimental system?
Similarly, when arguing that modeling results are often based on unrealistic assumptions, and therefore of little use for predictions, should we not also acknowledge that many experiments are performed at conditions that seem experimentally feasible or promise a strong response to act as a proof-of-concept, thus also compromising on realism?

Dealing with mismatches between theoretical predictions and experimental data
The origin of mismatches
Amending mismatches between (quantitative) models and empirical data is facilitated by first elucidating their origins (see Grainger et al. 2022 for a general synthesis). The first aspect to be considered in variability research is that models are commonly parametrized using experimental data produced under constant conditions, in which biological processes that could manifest in variable environments and drive performance cannot be detected. For example, physiological plasticity might produce experimental results not considered in the design of previous models due to local acclimation or adaptation (Sanford and Kelly 2011; Peck et al. 2014). Thus, failing to include such biological processes will lead to mismatches between theoretical predictions on one hand, and experimental results and field observations on the other hand. Yet, even if the relevant biological mechanisms are included in the model, defining the temporal and/or spatial scale at which they operate can be a challenge. Mismatches between predictions and observations can arise if, for example, the modeled rate of acclimation is too slow or too fast relative to reality (Denny and Dowd 2022). A recent conceptual framework for fluctuating environments could help design experiments that address the issue of acclimation, the temporal scales at which it operates, and its consequences for performance (Fey et al. 2021). Such experiments, although logistically challenging for organisms larger than microbes, could help in the design and parametrization of better models.
Another major cause of mismatches is the use of a model that was originally designed for a different purpose (see Klausmeier et al. 2020 for an example of careful model selection and extension). Here, multifactorial variability approaches pose a difficulty due to the interaction between factors. If a model is grounded on experiments for single-factor variability, such interactions will not a priori be included in the model and may thus yield mismatches with data from multifactorial experiments (e.g., cross-dependence and covariance effects, see Section Integrating multifactorial variability). Similarly, if a model is specified to describe the physiology of individual organisms, then using the model to infer population level variability effects will ignore interactions within the population that may, for example, give rise to density-dependent effects. This equally holds for models designed for single populations that are used to infer community responses (e.g., if potential trophic cascades of variability effects are not considered, see Hunsicker et al. 2011 for a discussion on scale-dependence of predator-prey interactions). Clearly, mixtures of all the above cases can occur, showing that agreement between model predictions and experimental data is generally challenging to achieve.

Addressing mismatches
For bridging the gaps between variability theory and experimental work, it is important to start with the main questions: What do we, as scientists interested in multidimensional variability effects, expect from theory? And which information can be provided to improve model performance and predictions? Models can be combined with empirical information with different aims: (1) to state null hypotheses that can be tested by experiments. For example, using nonlinear averaging we can predict performance responses to fluctuating environments in the absence of phenotypic plasticity, and mismatches allow for postulation of such time-dependent effects (e.g., Khelifa et al. 2019; Koussoroplis et al. 2019); (2) to improve mechanistic understanding of experimental data. To study, for example, how physiological processes are affected and interact in variable environments, parameter estimates and an iterative process between experiments and modeling are needed, which is aided by formalized model selection strategies such as AIC (this procedure is important but usually not documented in publications); or (3) to predict or extrapolate species responses to untested scenarios (e.g., of climate change). For example, having a mechanistic understanding (e.g., physiological thermal limits) improves predictions of potential expansion in alien species (Buckley et al. 2011; Wesselmann et al. 2021) or changes in marine virus-host dynamics (Demory et al. 2021). Based on these objectives, a variety of models can be used for investigating environmental variability effects on biotic systems, including mechanistic and phenomenological types.
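As a minimal sketch of the nonlinear-averaging null prediction mentioned under aim (1), the following example averages a performance curve over a fluctuating temperature series and compares it with performance at the constant mean. The Gaussian curve, its parameters (`t_opt`, `width`, `p_max`), and the sinusoidal regime are hypothetical choices made only for illustration; they are not taken from the cited studies.

```python
import numpy as np

def performance(temp, t_opt=20.0, width=5.0, p_max=1.0):
    """Hypothetical Gaussian thermal performance curve (illustrative only)."""
    return p_max * np.exp(-((temp - t_opt) ** 2) / (2.0 * width ** 2))

def nonlinear_average(temps):
    """Mean performance over the experienced temperatures.

    This is the null expectation without plasticity or other
    time-dependent effects: each temperature is mapped through the
    constant-environment curve and the results are averaged.
    """
    return float(np.mean(performance(np.asarray(temps, dtype=float))))

# One full sinusoidal cycle (assumed regime): mean 20, amplitude 4
time = np.linspace(0.0, 1.0, 1000, endpoint=False)
mean_temp, amplitude = 20.0, 4.0
temps = mean_temp + amplitude * np.sin(2.0 * np.pi * time)

perf_at_mean = float(performance(mean_temp))  # constant environment at the mean
perf_fluct = nonlinear_average(temps)         # null prediction under fluctuation

print(f"performance at constant mean: {perf_at_mean:.3f}")
print(f"nonlinear-averaging prediction: {perf_fluct:.3f}")
```

Around the optimum the assumed curve is concave, so the predicted mean performance under fluctuation is lower than at the constant mean; in the convex tail of the curve the sign reverses. An observed response that departs from this prediction would then point to processes not captured by the constant-environment curve.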
Mechanistic models propose relationships between variables such as biomass or abundance that are based on the biological processes and their assumed dependence on environmental factors and variability therein (e.g., dynamic energy budget models, Koussoroplis et al. 2019, or models based on the metabolic theory of ecology, Walters et al. 2012). Phenomenological (or statistical) models aim to best describe data sets and propose relationships between variables and sensitivities to environmental variability based on these empirical or simulated patterns (Schwager et al. 2006; Picoche and Barraquand 2020). There are also other types of models between these extremes, for example, species distribution models that can be correlative or mechanistic and can describe and predict the response of species to environmental variability patterns (Kearney and Porter 2009; Zurell et al. 2009; Dormann et al. 2012; Bocedi et al. 2021), and food web models that are sometimes less mechanistic on the physiological level, but include species interactions and how effects of environmental variability propagate through communities (Demory et al. 2021; Quévreux et al. 2021; Simon and Vasseur 2021).
Importantly, successful integration of modeling results and experimental observations is strongly facilitated by early agreement on the conditions that are simulated and parameters that are measured experimentally. An iterative process where model refinements are repeatedly checked against available data is also highly beneficial, pointing out necessary additional experiments to be performed to then feed back into the model formulation. Understanding the scope, strengths, and limitations of approaches that span theory, experiments, and field observations helps this process (Fig. 1; Table S1). Finally, a mismatch between theoretical predictions and experimental data is not necessarily to be viewed as being negative but could instead be used to elucidate where the formulation of our current understanding remains incomplete and foster avenues for further research.
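One step of such an iterative model check can be sketched with the AIC-based model selection mentioned above. The example below compares a linear and a quadratic response model fitted by least squares, using the Gaussian-error form AIC = n ln(RSS/n) + 2k (up to an additive constant). The data are synthetic placeholders generated from a hypothetical unimodal response, and the variable names are illustrative assumptions.

```python
import numpy as np

def aic_least_squares(y, y_hat, n_params):
    """AIC for a least-squares fit with Gaussian errors (up to a constant)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    rss = float(np.sum((y - y_hat) ** 2))
    return n * np.log(rss / n) + 2 * n_params

# Synthetic placeholder data: a hypothetical unimodal growth response
rng = np.random.default_rng(1)
temps = np.linspace(10.0, 30.0, 9)
growth = 1.0 - 0.01 * (temps - 20.0) ** 2 + rng.normal(0.0, 0.05, temps.size)

aic_by_degree = {}
for degree in (1, 2):  # linear vs. quadratic candidate model
    coeffs = np.polyfit(temps, growth, degree)
    fitted = np.polyval(coeffs, temps)
    # Parameters: polynomial coefficients plus the error variance
    aic_by_degree[degree] = aic_least_squares(growth, fitted, degree + 2)

best_degree = min(aic_by_degree, key=aic_by_degree.get)
print(aic_by_degree, "preferred polynomial degree:", best_degree)
```

With few treatment levels, a small-sample correction (AICc) may be more appropriate; the plain AIC is used here only to keep the sketch short.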
Overcoming experimental challenges related to environmental variability

Relevant aspects for experimental design
Fig. 1. When deciding among approaches to investigate variability and its consequences across scales, it is critical to clearly define the research question. All approaches have different advantages and limitations (see Table S1 for a detailed description). Generally, moving from controlled to natural systems gains ecological realism while compromising on mechanistic understanding (Petersen et al. 2009; Boyd et al. 2018). Integrating experiments with models requires previous agreement on the conditions that are simulated and parameters that should be measured. Biorender.com was used for creating parts of the figure.

To evaluate the feasibility of an experiment and to estimate the number of treatments needed, we must define environmental factors to be manipulated, components of variability, type of organisms (considering life span), level of biological organization, and response traits we aim to address, as well as the spatial and temporal scale, in accordance with our research question. Most experimental setups are restricted by logistics, so involving multiple factors and variability easily increases the complexity of the setup. Thus, it is critical to consider aspects that set the number of experimental units when designing such studies:

i. Experimental scale: The scale of an experiment is usually related to the number of experimental units that can be handled (Petersen et al. 2009). Small-scale experiments (e.g., using cell wells or bottles) allow for a larger number of units or replications than large-scale arenas (e.g., mesocosms) that are often limited in the number of units (Fig. 2a). In addition to size (i.e., the volume of experimental units), the complexity of the setup may also limit replication, as in the case of highly automated systems (e.g., chemostats to mesocosm infrastructure: Wahl et al. 2016; Pansch and Hiebenthal 2019; Vajedsamiei et al. 2021a). A negative correlation between the experimental scale and the number of units (Fig. 2a) restricts the type of experiment that can be conducted and often enforces scenario-based over mechanistic approaches (but see Boyd et al. 2018 for alternative hybrid approaches such as collapsed and main vector designs). It is also important to consider that spatial and temporal scales in experiments are determined by the biological level of organization (e.g., individuals, communities), type of response (short- or long-term), and type of organisms (life span, generation time) that are investigated (Ryo et al. 2019; Jackson et al. 2021; Denny and Dowd 2022). Depending on the organism's generation time and the frequency of fluctuations, the same environmental variability can be experienced as environmental fluctuations in long-lived organisms or as press disturbances in short-lived organisms, for which the period of fluctuation is longer than the generation time (Jackson et al. 2021). Since the experience of past environmental conditions might affect biological responses to current conditions (a phenomenon defined as ecological memory, Jackson et al. 2021), responses to variability might differ substantially across types of organisms and traits measured. While long-lived organisms respond to fluctuations according to previous exposure to variability (acclimation), short-lived organisms undergo continuous generations that experience different "static" environments, where the parental environment differs from that experienced by the offspring (parental effect/epigenetic plasticity; Jackson et al. 2021; Kunze et al. 2022). Thus, it is fundamental to consider biological and environmental variability scales in experimental designs and to elaborate consistent hypotheses.

ii. Theoretical approaches: Using theoretical approaches to define the expected type of responses is important for elaborating hypotheses, either when adopting a general framework or when testing model predictions (Grainger et al. 2022), as well as for establishing the treatments (factor selection, levels of factors, components of variability). More specifically, theoretical information can also be used to define the minimum number of treatment levels necessary per factor (Fig. 2b). For example, linear and nonlinear responses need at least two and three points to be detected, respectively. If the shape of the response to the target factors is unknown, small-scale pilot trials should first be conducted to inform the design of the main experiment (Morel-Journel et al. 2020).

iii. Components of variability: Choosing which component(s) of variability are experimentally manipulated and the number of levels of each treatment depends on the aim (e.g., testing a future environmental scenario or isolating the impact of one component of variability). Each component of variability (variance, frequency, and autocorrelation; Fig. 2c) requires different manipulations of the environmental variable of interest and can be manipulated simultaneously or independently, though the manipulation of one may necessarily affect another (e.g., total variance and frequency can be manipulated independently, but frequency manipulations affect autocorrelation and rates of change). Experiments might include the manipulation of different components of variability in a gradient design to evaluate potential threshold or nonlinear biological responses, but the inclusion of more than one component of variability increases the complexity of the setup. Factorial designs might be simpler for combining components of variability, but less informative about the biological responses to the combination of components. In these cases, only one environmental factor is considered (e.g., temperature) in the experimental design, but multiple explanatory variables are included (e.g., temperature variance and frequency of fluctuations, and therefore their combined effects).

iv. Multifactorial scenarios: The number of environmental factors (e.g., light, nutrients, temperature), as well as their identity and levels, determine potential nonadditive effects. Such effects can be complex when covariance (Koussoroplis and Wacker 2016; Koussoroplis et al. 2019) or interactive effects across environmental gradients (Koussoroplis et al. 2017) are present, since the direction and magnitude of effects may change and/or vary at different levels of the investigated factors. Such nonadditive effects increase the number of treatments needed for their investigation, since gradient designs and response surfaces (full factorial designs) are needed for evaluating the effects at different levels (Fig. 2d). Including multiple environmental factors implies complex experimental designs where decisions must be made about what factors, levels, and components of variability are included to capture the relevant regions of biological responses.
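The interdependence among components of variability described in item iii can be illustrated with a small sketch that builds a full factorial of amplitude and period for a single factor. The sinusoidal regimes, the mean of 18, the amplitude and period levels, and the hourly time step are hypothetical choices for illustration only.

```python
import itertools
import numpy as np

def sinusoidal_series(mean, amplitude, period, n_steps=960, dt=1.0):
    """Deterministic sinusoidal regime; time units are arbitrary (assumed hours)."""
    t = np.arange(n_steps) * dt
    return mean + amplitude * np.sin(2.0 * np.pi * t / period)

def lag1_autocorrelation(x):
    """Lag-1 autocorrelation of a series."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))

# Hypothetical treatment levels: two amplitudes crossed with two periods
amplitudes = [1.0, 3.0]
periods = [24.0, 96.0]

treatments = {}
for amp, per in itertools.product(amplitudes, periods):
    series = sinusoidal_series(mean=18.0, amplitude=amp, period=per)
    treatments[(amp, per)] = {
        "mean": float(series.mean()),
        "variance": float(series.var()),
        "lag1_autocorr": lag1_autocorrelation(series),
    }

for key, stats in treatments.items():
    print(key, {name: round(value, 4) for name, value in stats.items()})
```

In this sketch the variance depends only on the amplitude, so variance and frequency are set independently, whereas the lag-1 autocorrelation changes with the period even when the variance is held fixed. Each added component or level multiplies the number of treatment combinations, which is the design cost noted above.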

Relevant aspects for variability manipulations
There are key aspects that challenge our capacity to investigate and understand the role of variability through experimental research. Many of these aspects are related to experimental manipulations and are shared with any other experimental approach (Campbell and Stanley 1963;Petersen et al. 2009;Boyd et al. 2018), but here those are considered from a variability perspective.
The initial conditions used for experimental setups might have important effects on the responses measured depending on the organism or community type and origin. Defining aspects such as pre-experimental acclimation, control treatments, and how to manipulate environmental variability is therefore crucial, but far from trivial. Manipulating different components of variability (e.g., reducing or increasing the total variance) is logistically difficult, especially in field experiments (e.g., outdoor in situ mesocosms placed in aquatic systems or in the field, Fig. 1) where nonmanipulated environmental variation may generate idiosyncratic effects (i.e., noise) or restrict the simulation of some components of variability (e.g., the manipulation of thermal variance without changing the mean) (Hong and Shurin 2015; Fey and Wieczynski 2017). Such limitations might lead to the use of extreme variability treatments that extend beyond ecological relevance (Korell et al. 2020) or to the underrepresentation of natural variability (Ziegler et al. 2021), imposing additional bias on our understanding. Thus, the integration of highly and less controlled experiments is needed to approach different aspects (Fig. 1).

Fig. 2. […] types of response shape (e.g., linear and nonlinear) to environmental gradients can be expected and determine the minimum number of treatments needed. (c) Different components of variability can be investigated and combined when considering temporal or spatial variability, conditioning the number of treatments needed. (d) Considering multiple factors, nonadditive effects can be investigated using different approaches that differ in the treatment demand. Nonadditive effects can be analyzed by factorial designs, where the effect sizes of two or more factors are tested independently and together to identify the type of effect, or by using response surfaces, which allow for testing relevant areas (combinations of factor levels) with high effect sizes. The effects of variability might depend on the type of interaction among factors for the considered levels. Two factors can, for example, show synchrony in their dynamics (in-phase, positive covariation) or diverge over time (out-of-phase, negative covariation), with important consequences for organisms in changing environments (e.g., positive covariation may be used as a cue for anticipatory responses to change, whereas negative covariation might lead to detrimental effects). Furthermore, changes in one factor might affect the response to a second factor, causing cross-dependence and a different expected impact of its variability.
When addressing the role of environmental variability in a global change context, we can discriminate between three approaches: (1) the acceptance of natural (deterministic and stochastic) variability. This approach allows experiments to be more natural (and realistic) and is a valuable approach when testing the effects of changing mean conditions, but variability components are assumed constant over time (Fig. 3, e.g., Wahl et al. 2016;Yvon-Durocher et al. 2017;Barneche et al. 2021;Sawall et al. 2021), (2) the manipulation of variability per se (compared to constant regimes) or the manipulation of particular components of variability under highly controlled conditions where nonmanipulated factors remain constant. To explicitly test for the role of variability and its components, mechanistic approaches may be applied (Fig. 3, Colinet et al. 2015), and (3) the combination of both approaches (i.e., accepting particular aspects of variability, while manipulating others). This may be an important step in understanding the role of complex variability patterns in aquatic ecosystems (Fig. 3).
Extracting information from local environmental time series for characterizing means and variability components allows for setting appropriate experimental treatments (Cabrerizo et al. 2021; Dobry et al. 2021; Kroeker et al. 2021; Wolf et al. 2022). For multiple-driver experiments, characterizing the main drivers in a system and how they covary (e.g., Wahl et al. 2021) is vital for designing ecologically relevant manipulative experiments with a feasible number of treatments (i.e., reducing the complexity of the setting by selecting the combination of factors that show high variation or covariation in nature and combining them with future scenarios). Variability regimes identified in this way can be manipulated in highly controlled experiments (i.e., laboratory conditions simulating natural regimes) or on top of natural environmental variability (Fig. 3). The use of such approaches is commonly related to the type of experiment (laboratory, outdoor/indoor mesocosms, etc.) and its characteristics (Fig. 1; Table S1). Hence, defining the control in such different approaches is key and may range from complete elimination of variability, to reduced or manipulated variability, to the natural in situ variability, depending on the research question and hypothesis (Fig. 3).
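A minimal sketch of how means, variability components, and covariation might be extracted from local time series is given below. The two series are synthetic placeholders standing in for field records; the driver names (`temperature`, `oxygen`), their values, and the hourly sampling interval are assumptions made for illustration, and real logger data would typically need gap handling and detrending first.

```python
import numpy as np

def summarize_driver(x):
    """Mean, sample variance, and lag-1 autocorrelation of one driver."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    lag1 = float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))
    return {"mean": float(x.mean()),
            "variance": float(x.var(ddof=1)),
            "lag1_autocorr": lag1}

def dominant_period(x, dt=1.0):
    """Period of the strongest spectral peak (in units of dt)."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(x.size, d=dt)
    k = int(np.argmax(power[1:])) + 1  # skip the zero-frequency bin
    return float(1.0 / freqs[k])

# Synthetic placeholder records: 30 days of hourly values
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
daily_cycle = np.sin(2.0 * np.pi * hours / 24.0)
temperature = 15.0 + 2.0 * daily_cycle + rng.normal(0.0, 0.3, hours.size)
oxygen = 8.0 - 0.5 * daily_cycle + rng.normal(0.0, 0.1, hours.size)

temp_summary = summarize_driver(temperature)
temp_period = dominant_period(temperature, dt=1.0)
covariation = float(np.corrcoef(temperature, oxygen)[0, 1])

print(temp_summary)
print("dominant period (h):", round(temp_period, 2))
print("temperature-oxygen correlation:", round(covariation, 2))
```

Summaries of this kind can then be used to choose treatment means, amplitudes, and periods, and to decide which driver combinations covary strongly enough in nature to be worth manipulating together.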
In addition to conducting experiments at different scales, the comparison of experiments at the same scale but with different degrees of controlled variability (e.g., indoor tanks vs. outdoor enclosures) may help disentangle noise from the manipulated variability. Large coordinated mesocosm experiments across sites (Landkildehus et al. 2014; Mahdy et al. 2015), latitudes (Lenz et al. 2011, 2018), and repeated over time (Nejstgaard et al. 2006; Larsen et al. 2015) can also facilitate the generalization and therefore synthesis of patterns by ruling out idiosyncratic (noise) effects, and by including natural variability of nonmanipulated factors relevant for the investigated sites (Urrutia-Cordero et al. 2021). Such spatial and temporal extension of experiments is an important aspect since the majority of ecological studies are traditionally carried out in northern temperate systems (Martin et al. 2012; Thomsen et al. 2014), but diurnal and annual natural variability ranges differ among climatic regions and hemispheres (Wang and Dillon 2014). However, these efforts need thoughtful planning and design to generate comparable results, including aspects like the characteristics of the facilities (experimental units might not be identical) and protocols for sampling and analysis (Fraser et al. 2013; but see e.g., Lenz et al. 2011, 2018 for a globally comparable approach).

Connecting patterns to processes
Experimental approaches vary in the capacity to investigate mechanisms underlying observed patterns (Fig. 1; Table S1; Boyd et al. 2018). In the experimental manipulation of environmental variability (variance, frequency, or predictability) that is imposed across treatments, statistical tests are primarily designed to determine whether differences exist between or among treatments, and whether these differences change across time. This framing poses two related issues. First, albeit rarely addressed, when differences are not observed, this might be because: (1) the amount of environmental variation present was not sufficient to produce a measurable biological or ecological response; (2) the response variables measured are not relevant for the environmental change simulated (i.e., do not reflect how organisms respond to the manipulated environmental aspect); or (3) different environments (e.g., different types of environmental variation) produced impacts on biological processes that coincidentally yield similar values for a given response variable. For example, Fey et al. (2021) resolved environmental domains where phytoplankton species that acclimate rapidly vs. gradually can exhibit the same overall growth rates in certain variable environments, despite having different biological responses to this variation. As such, studies should explore the potential for cryptic biological processes (Strauss 2014) when no treatment differences exist. Second, when treatments differ from others (or from null models), it can be instructive to understand the underlying biological process that contributed to this occurrence.
For expanding our understanding of the processes driving observed patterns, mechanistic and scenario-based experiments can be integrated in a synergistic way. Combining experimental scales increases the number of treatments that can be tested and allows for addressing different levels of organization (i.e., individual, population, community, ecosystem) by increasing the number of units and integrating experimental scales (Figs. 1, 2). We here suggest three ways of combining experimental approaches:

i. Upscaling in complexity: Patterns and mechanisms explaining variability effects shown in simplified models and laboratory experiments can be scaled to more biologically complex scenarios. In this approach it is key to conduct hypothesis-based experiments, and to measure parameters that help improve models in an iterative process. One way is to identify ecologically relevant scales of variation by measuring natural environmental variability across temporal and spatial scales of multiple drivers, and to test realistic combinations of such factors in the laboratory to detect relevant scales of biological responses at low levels of complexity. In a second step, additional complexity (e.g., additional trophic levels) should be incorporated in mesocosms (e.g., Wahl et al. 2021). Testing the direct effects of variability on primary producers and consumers separately and together may disentangle how indirect effects are translated into the next trophic level and interact with direct effects. When upscaling experimental evaluations from simple (e.g., artificial laboratory) to more complex (e.g., mesocosm or field) studies, it is important to consider biological functions, such as the development of antipredator responses, that might not be present under simplified environmental conditions but are important in more complex systems (Nejstgaard et al. 2007).

ii. Downscaling in complexity: This approach can be used for discerning mechanisms underlying field observations and for explaining mechanisms in long-term variability patterns. Imposing ecologically relevant patterns of variability (obtained through field measurements) in mesocosm experiments, and collecting high-resolution data on species or community traits and in situ processes, may give clues regarding the underlying mechanisms driving patterns that would otherwise remain cryptic (Yvon-Durocher et al. 2010; Wahl et al. 2021). Outcomes can be used for generating hypotheses to be explicitly tested in controlled laboratory experiments that follow the mesocosm experiments (Yvon-Durocher et al. 2017).

iii. Parallel-complementary experiments: Highly controlled and replicated laboratory experiments can be conducted in parallel to large-scale mesocosm efforts to identify response curves to an environmental gradient, or to extend variability treatments to facilitate the interpretation of patterns observed in the accompanying mesocosm experiment. Vajedsamiei et al. (2021c), for example, performed repeated side incubations of marine mussels originating from large-scale and long-term mesocosm experiments, which helped to identify the physiological mechanisms driving the long-term outcomes. Such complementary experiments may be conducted as independent laboratory experiments or as bottles or enclosures placed in the mesocosms, depending on the manipulated factor.

Fig. 3. Different experimental approaches can be used for manipulating environmental variability, but there is a lack of methodological consistency in this research area. Here we conceptualize three general approaches and provide examples for them: (a) Investigating components of variability under controlled conditions: This approach aims to test specific components of variability (e.g., mean, variance, extremes) and their interactions, maintaining other aspects constant. This approach is mostly used in controlled laboratory experiments, but a variety of methods for simulating variability have been implemented, challenging comparisons among studies (e.g., variance simulated by gradual vs. rapid changes; see Colinet et al. 2015 for a detailed discussion). (b) Using natural environmental variability as background conditions: In this case, natural variability is maintained for some factors or components of their variability while others are modified. For example, natural variance can be kept while a change in mean is applied, or daily and stochastic variability can be modified by buffering the variance in the experimental units. Alternatively, extremes such as heatwaves can be simulated on top of natural variability. This approach is mostly used in mesocosm experiments (see Thompson et al. 2013 for a detailed discussion). (c) Translating natural variability measured in the field into experimental designs: Different variability components of one or more factors are manipulated according to natural levels and dynamics (e.g., covariation, order of events, rates of change), and can be compared to future-predicted scenarios or stress events. This approach can be implemented in controlled laboratory experiments or by using natural environmental variability as background, depending on the manipulated factor.

Conclusions
Aquatic ecology has advanced in incorporating environmental variability across multiple environmental factors in recent theoretical and empirical studies. However, the complexity of this research area makes it crucial to use a common framework and to reach consensus on definitions that strengthen the integration of generated knowledge and the design of comparable experimental studies for dealing with variability. A deeper integration of disciplines and the development of broader realistic approaches are needed for understanding and predicting present and future global change. Suitable experimental approaches may stem from experiments driven by theory-derived hypotheses, scaling predictions, and combining different types of experimental approaches.