Rheological identification of jetted fluid using machine learning

The understanding of flowing properties of fluids and the knowledge of the related rheological properties are crucial from both a research and industrial point of view. To determine the complex rheological properties of fluids many devices have thus been developed, the so-called rheometers.The main objective of the present proposal is to develop a new kind of rheometer usingboth large datasets and the Continuous Inkjet Printing (CIJ) process.When ejecting a fluid, the CIJ ejection process competes between several forces: inertial, viscous, surface tension and elasticity, which affect the morphology of the resulting jet.Also, under certain conditions, the morphology of the jet is unique and directly related to the rheological properties of the fluid.We want to use this uniqueness to identify the fluid among a large dataset of known fluid jet morphologies to be compared with, to obtain its rheological properties.Using a large numerically-generated dataset of Newtonian fluid jets, we show in this article that the identification of the viscosity using Neural Network (NN) is not only feasible, but proves to be very accurate with an average error of less than 1% for a large range of viscosities.

The understanding of flowing properties of fluids and the knowledge of the related rheological properties are crucial from both a research and industrial point of view.To determine the complex rheological properties of fluids many devices have thus been developed, the so-called rheometers.
The main objective of the present paper is to identify the rheological properties of a fluid jetted using Continuous Inject Printing (CIJ) process by comparing the morphology of the aforementioned jetted fluid to a dataset of known (rheologically-speaking) fluid jets morphologies.
properties of a fluid by the viscosity, the surface tension and the density of fluids using a large datasets and a Continuous Inject Printing (CIJ) process.
When ejecting a fluid, the CIJ ejection process competes between several forces: inertial, viscous, surface tension and elasticity, which affect the morphology of the resulting jet.Also, under certain conditions, the morphology of the jet is unique and directly related to the rheological properties of the fluid.
We want to use this uniqueness to identify the fluid among a large dataset of known fluid jet morphologies to be compared with, to obtain its rheological properties.
Using a large numerically-generated dataset of Newtonian fluid jets, we show in this article that the identification of the viscosity using Neural Network (NN) is not only feasible, but proves to be very accurate with an average error of less than 1% for a large range of viscosities.

I. INTRODUCTION AND MOTIVATION
Rheological characterization of liquids is of major importance in many scientific fields as these properties play a key role and help to understand the complex fluid dynamical behavior.In many industrial applications, such as paint coating, inkjet printing, combustion, etc, the rheological insight is critical for the processes themselves.Many techniques and devices have thus been developed to measure a large range of fluid properties such as viscosity, surface tension or density.
The most widely used type of rheometer, the rotational rheometer, presents the results of a tested fluid sample as flow curves which plot the the viscosity η as a function of the shear rate γ.In a controlled shear rheometer, for each shear rate, a shear stress σ is measured then the associated viscosity is derived using the simple relation η = σ / γ, and from this curve, a relatively simple mathematical model η = f ( γ) is typically fitted.
In the present article we choose an original approach where the rheological properties are no longer measured but identified, using both Continuous Ink Jet (CIJ) device and machine learning.

CIJ jetting
CIJ installation consists in an air-assisted atomization nozzle (see Fig. 1): a liquid jet is released at moderate speed through a cylindrical nozzle on which is applied a periodical disturbance, usually using a piezoelectric actuator upstream from the nozzle.The presence of a strobe light and a camera allows to observe accurately the morphology of the fluid jet.

FIG. 1: Common CIJ and visualization setup.
When the wave length of the radial disturbance is bigger than the jet circumference, the Rayleigh capillary instability is triggered 1 , resulting in the jet breakup and the generation of fluid droplets.Several parameters can be adjusted in order to control the breakup morphology such as the fluid rheology, the jet speed or the disturbance frequency and amplitude 2 .Indeed, the jet breakup of fluids is affected by the competition between several main forces: inertia, viscous forces, surface tension and, when applicable, elastic force.This competition affects the morphology (i.e.break-up lengths, satellite dynamics, drop shape, etc) of the jet and thus for a given rheological properties and under given process settings, a unique morphology is observed for which every of the above forces plays a key-role.
In industrial CIJ setup, the disturbance (also called stimulation) frequency remains fixed while the amplitude of the stimulation, i.e. the voltage applied to the piezoelectric actuator, is adjusted to obtained the desired morphology.For example, Fig. 2 illustrates two jets morphologies for stimulation amplitudes ranging from 2V to 62V (left and right of each figure, respectively).These two fluids have been extensively studied in [3][4][5] and exhibit almost identical surface tension (22.8mN/m and 24mN/m for fluid 1 and 2, respectively), density (873kg/m 3 and 861kg/m 3 , for fluid 1 and 2, respectively) and viscosity for a shear rate γ < 10 4 s −1 as depicted Fig. 3. FIG.2: Morphologies of two fluid jets exhibiting almost identical surface tension, density and viscosity (for γ ≤ 10 4 s −1 ) for a wide range of stimulation amplitudes (from 2 to 62V ) with the exact same CIJ setup.Left: Fluid 1 with a quasi-Newtonian behavior (square plot on Fig. 3).
Right: Fluid 2 with a shear thinning behavior at high shear rate (round plot on Fig. 3).
It is worth pointing out that both fluids show same jet morphologies at low stimulation amplitude, i.e. less than 20V in Fig. 2, which was expected due to the similarity of the fluids for low to moderate shear rate (i.e.≤ 10 4 s −1 ).FIG. 3: Comparison of the dynamic viscosity of the two industrial CIJ fluids 1 and 2 from 3-5 .
In the example shown Figure 2, it can be seen that two fluids of the same viscosity produce an identical jet morphology.On the other hand, when the rheological behavior of these two fluids diverges, beyond a certain shear rate for example, the same applies to the morphologies of the jets obtained.It therefore seems that the CIJ jetting can be used to discriminate between the rheological behavior of the fluids, even at shear rates up to several hundred of thousands reciprocal seconds.
It should be noted that these values are not accessible by conventional rotational rheology 6 .
In this paper, by combining the CIJ process with machine learning approaches, we thus aim to show that it is possible not only to discriminate, but also identify the viscosity of a Newtonian fluid.Indeed, when a fluid is jetted, its jet morphology proves to be unique under certain jetting conditions and our objective is to use this uniqueness in order to perform a rheological characterization.To do so, first, a large dataset is generated containing information about the morphologies of jetted fluids with well-known rheological properties.Then, by jetting an unknown fluid, one can compare its morphology to the ones present in the dataset, in order to find the fluid having, if not the same, a very close morphology: the so-called matching fluid.This step is performed using machine learning approach, and more specifically deep learning, and is referred as the identification step, with the rheological properties identified by comparison.
Note that, the dataset containing the jet morphology can be either experimentally or numerically generated.In the latter case a particular attention must be paid to the ability of the numerical simulation to agree with experimental results because the unknown (rheologically-speaking) fluid morphology will, of course, be obtained experimentally.
To assess the feasibility of the above approach, a large dataset 7,8 of numerically-generated jets of Newtonian fluids is used in the present article.First, in section II, a brief presentation of the generation of the dataset of fluid jets is made while machine learning methods and associated datasets are described in section III.Different strategies of identification are then presented in section IV and finally the results are discussed in section V.

II. DATASET OF NEWTONIAN FLUID JETS INTERFACES
The dataset of numerically-generated interfaces of Newtonian jets in CIJ regime used in this work can be found at 8 under Creative Commons Attribution license.Data contained in the dataset are thoroughly described in 7 and the reader is invited to refer to this article for more detailed information about the dataset itself.In this section we briefly recall the main features of this numerically-generated dataset that have been exclusively generated using the open-source libraries provided by the Basilisk 9 platform.

A. Governing equations and boundary conditions
The Basilisk toolbox solves unsteady incompressible Navier-Stokes equations with the interface between the fluids tracked with a Volume-Of-Fluid (VOF) method: and with u the velocity field, ρ the density of the considered phase, D the strain rate tensor such as D = [∇u + (∇u) T ]/2 and η the dynamic viscosity.
At the interface between air and liquid, the term Rheological identification of jetted fluid using machine learning is added to the right-hand side of Eq.1 to account for the surface tension effect, where σ is the (constant) surface tension, κ the interface mean curvature and f the volume fraction describing the interface.
Following many CIJ jets simulation 3,10,11 , the present numerical model is axisymmetric with z the symmetry axis as depicted Figure 4.
The problem is addressed in dimensionless form and the numerical domain is a square of dimension 512.The boundary conditions (BC) are described in details in 7 and the inlet BC are depicted in Fig. 4. The initial radius of the jet is R 0 = 1 and u 0 is the inlet velocity field imposed on the liquid phase on which a periodic amplitude disturbance is applied.This triggers and controls the Rayleigh-Plateau instability with a disturbance amplitude δ , a frequency perturbation f r and a simulation time t.In the following, the frequency is arbitrary fixed to f r = 1/7 for all jets.
An outflow boundary condition is imposed on all the remaining boundaries.As depicted on Figure 4, each contiguous liquid zone is numbered as a drop, so that the main thread (the jet before breaking up) is considered as the first drop with a numbering starting from zero.
Hereafter both air and liquid densities, ρ a and ρ l are, respectively, fixed with a ratio and, in order to preserve the numerical stability of the model, the viscosity ratio between air and liquid viscosities, η a and η l respectively, is fixed to The Reynolds number writes: For all jets, fixed values of geometry, density and inlet velocity are used, i.e. ρ l = u t=0 0 = R 0 = 1, so that the Reynolds number is equal to the inverse of the viscosity.
Finally the surface tension is fixed for all the simulations with a Weber number We thus limit the identification to Newtonian fluids having a constant interfacial tension and a Reynolds number ranging from 100 to 1000 which means that the Ohnesorge number is not constant.

B. Data generation
In order to prevent the results from being influenced by the initial condition of the simulation, we start saving interfaces after at least three drops have already been jetted.From then on, 101 output interfaces are generated and saved in one oscillation period of time, i.e. every 0.07 dimensionless time.These interfaces are formed by segments from the VOF tracking interface algorithm used by Basilisk and are written in text files under Gnuplot-style 12 format.
Jets of fluids have been computed for Reynolds number in the range [100, 1000] with a step of 5 and for six stimulation amplitudes δ ranging from 0.01 to 0.035 every 0.005.Finally, with a total of 181 Reynolds numbers and 6 stimulation's amplitudes, 1086 jets have been generated and ≈ 110, 000 interfaces saved.
In addition to the interfaces, some coefficients are computed for each drop of the jets and saved inside a .csvfile: the drop barycenter, the width, the height, the Feret diameter (ratio of width over height), the area and the volume.Hereafter, we will refer to these coefficients as drop coefficients.

III. RHEOLOGICAL IDENTIFICATION OF THE FLUID
As stated in the first section, the objective of the present work is to identify the fluid viscosity by comparing the jet morphology of an unknown fluid to a dataset of morphologies of known fluid Rheological identification of jetted fluid using machine learning jetted with the same CIJ parameters.More specifically, in the present case, we aim at linking the Reynolds number of the jetted fluid, i.e. its viscosity, to the morphology of the jet in order to identify it.This regression problem is addressed by machine learning (ML) approaches using the following two datasets: 1. Dataset A: dataset created from the drop coefficients file (see sec.II B) that describes the jets using drop descriptors such as volume, area, etc.
2. Dataset B: dataset containing a more detailed description of the drops, by fitting each drop shape h(x) using a polynomial of order 9 such as where the exponent n stands for the drop number.
For each dataset, both a linear regression model (as a basic learning method) and a Artificial Neural Network (ANN) fit -more precisely a Deep Neural Network (DNN) -are performed using Scikit-Learn 13 and Tensorflow -Keras 14,15 API, respectively.Figure 5 shows an overview of the present approach with the creation of the datasets A and B and the use of a DNN.
FIG. 5: Overview of the present approach: for fixed Reynolds number, amplitude and time, descriptors and polynomial fits are calculated and gathered in datasets A and B, respectively, for drops 0 and 1.For each dataset, the DNN predicts the Reynolds number using each instance as input layer; it has four fully connected layers (see Table I) and one singled-element output layer which is the Reynolds number.
The use of ANN algorithms on large datasets is widespread and its popularity lies in the great accuracy achieved by such methods in many research and industrial fields.The reader interested by the ANN methods and its application in fluid mechanics field is invited to refer to 16 , or more generally 17,18 and therein references.
Due to their ability in modeling complex system with accuracies, ML and more specifically ANN approaches, have been already used to predict rheological properties of fluids.Starting from 19 , who successfully predicted both density and viscosity of biofuel compounds using ANN approach, many authors used ANN to predict rheological properties from experimental or analytical data such as 20 for the scCO2-Foam, 21 for the viscosity of cosmetic oil or nanofluids 22 .
More recently, physics-based ML algorithms, the so-called Physics-Informed Neural Network 23 (PINN), have been developed to reduce to diminish the need for big data sets by including governing physical laws in the ANN framework.This approach has then been extended to rheology by the Rheology-Informed Neural Networks (RhINNs) 24,25 which enables an accurate modeling of the rheological properties of a fluid with a limited number of experiments.
The actual work falls within prediction/identification of the properties using experimental data coupled with the use of a ML algorithm.

A. Deep Neural Network
The same sequential DNN is used for both datasets A and B and is schematically represented Fig. 5.The input size of the DNN depends on the used dataset and the output is the Reynolds number.The parameters defining the DNN, the so-called hyperparameters are summed-up in the Table I, and have been arbitrarily defined.Using Keras Tuner 26 and its built-in BayesianOptimization class, these hyperparmeters have been found to be the most efficient for the datasets defined in the next section.Hereafter, these hyperparameters are fixed, and no further optimization is performed.
Note that due to the randomness of both the train/test datasets generation and weights initialization, every DNN is run 30 times to assess its prediction accuracy with a re-shuffling of the dataset before partitioning.The given MAPE is then the average MAPE obtained over the 30 runs with the standard deviation associated.

B. Datasets generation
The two datasets used to perform the rheological identification are detailed in the next subsections.It is also worth pointing out that Machine Learning terminology is used with an instance referring to a single row of data, and a feature to a single column of data.

Dataset A
All the drop coefficients files are gathered in the same dataset, that we will call hereafter dataset A, where each instance describes an interface output, with descriptors of up to 10 drops.It is worth pointing out that the first drop, numbered Drop 0, refers to the main jet itself, while the others drops are either main drops or satellites (see Fig. 4).

Dataset B
As mentioned above the dataset B, is generated using the coefficients of the polynomial fit performed on each drop shape plus the Reynolds, Amplitude and Time features.The best polynomial fit, of order at most 9 from eq.10, is performed using Polynomial class from Numpy 27 .The main fluid jet (i.e. drop 0) can be as long as 400 • R 0 , which is far too long to be accurately fitted using a polynomial.The fit is then restricted on the last part of the jet of length 7 • R 0 as illustrated with the orange tick line Figure 5.
The domain of the drop, i.e. the minimum and maximum z-coordinates, is mapped to the interval [−1, 1] and a least squares polynomial fit of the drop shape is performed, returning its sum of squared residuals.The dataset B is the aggregation of the fit coefficients, the sum of squared residuals and order as well as domain boundaries for each drop, as shown in Table III.Some randomly picked drop fits are displayed Fig. 8 where it can be seen that the fit quality depends on the drop shape: when the drop exhibits more-complex shape (top left corner or center of Figure 8), the fit is of lesser quality and the residual is higher.Conversely when the shape is smoother, its shape is perfectly fitted with low or null residual (center top or left bottom corner of Figure 8).By comparing the polynomial fit of the two first drops Figure 9, it is of lower quality for drop 0 than drop 1, with a median of residuals of 7.02 and 1.53, respectively.Furthermore, as one can expect, the quality of fit also depends on the amplitude of stimulation as depicted in Fig. 9: higher stimulation amplitudes result in more complex drop shapes and thus reduce the fit goodness.

C. Final datasets A and B
Depending on the jet parameters (Reynolds number, amplitude of stimulation and time of the output), the number of drops may vary.We arbitrary decided to use only the data from the main thread (i.e. drop 0) and the first main drop to perform the identification.The resulting datasets have thus 15 and 31 features for dataset A and B, respectively, and ≈ 110, 000 instances.
It is also worth noting that the first drop (i.e. drop 1) is not always a main drop and in almost a quarter of the samples it is a secondary droplet, also called a satellite.In these cases, the drop 1 is replaced by the next main drop generated.

IV. RESULTS
As explained before, we use two learning methods: we start with a linear regression model, section IV A, and then a DNN, section IV B. However, the prediction is too poor and another approach must be considered such as the DNN developed in the next section.

B. Deep Neural Network
In this section the DNN is used to identify the viscosity of the jet morphology following the overview shown in Fig. 5.To do this, we first use all the information contained in datasets A and B, section IV B 1.However, these datasets have been generated from numerical simulations and contain data that can be very difficult to obtain experimentally.Therefore, in order to generalize the present method to experimentally generated datasets, A and B are progressively simplified or modified and several strategies are then assessed: • the time scale is refactored, section IV B 2; • information of the main drop only is used, section IV B 3; • the stimulation amplitudes are coupled, section IV B 4; The DNN strongly increases the accuracy of the prediction with an average MAPE of 1.11 ± 0.19 and 0.87 ± 0.16 for datasets A and B, respectively.As pictured Fig. 11, this also decreases drastically the number of outliers.

Refactoring time
The time information in both datasets is based on the simulation time, which is the time elapsed since the beginning of the simulation, and does not provide any useful information for the DNN: dropping the Time feature from the datasets gives very close prediction results.
We then choose to refactor the time by setting t = 0 when the breakup occurs, i.e. when a new drop is created.For each Reynolds-Amplitude group, we then keep 100 samples with a time from t = 0 to t = 6.93, and discarding the last t = 7 which is identical to t = 0 as the CIJ jetting is periodical.Fig. 12 shows an example of the time evolution of the interface of the same jet with both the simulation time t and the corresponding refactored time t since the breakup occurred.By refactoring the time information, we observe a great improvement of the prediction with an average MAPE of 0.78 ± 0.16 and 0.72 ± 0.07 for datasets A and B, respectively.It represents a relative improvement of 42% and 21%, respectively, in comparison with results using the simulation time.

First main drop information only
Gathering information about drops from numerical results is straightforward but it can be more difficult from an experimental point of view.As stated above, the main thread, i.e. drop 0, can be as long as several hundreds of radii and measuring its width with accuracy may be burdensome.
To alleviate the experimental work, one can restrict the identification approach by using the first main drop information only and its corresponding refactored time.When dropping information about drop 0 in datasets A and B, the average MAPE remains very close to the values found in the above subsection (sec.IV B 2) with 0.79±0.18and 0.76±0.13for datasets A and B, respectively.

Dropping time and coupling amplitudes
We saw, sec.IV B 2, that the time information has a great influence.However, from an experimental point of view, acquiring the time information for each photography is neither easy nor reliable.
One route to get rid of the time without losing in prediction accuracy, lies in coupling stimulation amplitudes.Two datasets were then generated, called A' and B' from A and B, respectively, in the following way: for each Reynolds number, randomly picked instances of each amplitude are randomly concatenated without following any subsequent time frame.
The resulting datasets A' and B' have thus 100 instances for each Reynolds number and n • i features: • n the number of stimulation amplitudes The more coupled amplitudes, the better the average MAPE of the prediction, with an average 0.80 ± 0.13 and 1.24 ± 0.17 for datasets A' and B', respectively (see Fig. 13).
FIG. 13: Average MAPE as a function of the number of coupled amplitudes.
It is worth pointing out that for the first time a better MAPE is obtained using drop descriptors (drop volume, area, etc, from dataset A') than more complex information from drop shape fitting (dataset B').It is not clear why such a difference is observed between datasets A' and B' and some explanations may lie in the DNN structure itself and in the number of samples in the datasets: by coupling amplitudes the input layer is larger while the number of samples has been drastically reduced.The DNN is then too small, i.e. the hidden layers have too few neurons to address such a large input layer and/or more samples are needed.
With that in mind, the prediction from A' shows an excellent average MAPE, with less than 1% of error, using the drop descriptors of the drop 1 only and with no time scale in the dataset.
Contrarily to what have been previously observed, no outliers, i.e. predicted values with more than 5% of error, is predicted.With n = 1, the entire datasets A' an B' are selected and logically, the same MAPE as in Fig.

Sampling the viscosity space
13 is obtained Fig. 14.As one can expect, the MAPE increases with the sampling parameter n, i.e. when less Reynolds number samples are picked.However, even with n = 10 (i.e. with only 18 Reynolds numbers samples in the subset) the MAPE remains relatively low with 1.33 ± 0.13 and 2.67 ± 0.27 for the subsets from datasets A' and B', respectively.
The number of samples needed to generate the dataset may then be drastically reduced without degrading too much the prediction accuracy: by using only a fifth of the Reynolds numbers (i.e. 36 samples), the MAPE remains as low as 1%.

V. DISCUSSION AND CONCLUSION
It has been a classical challenge to accurately characterize the rheological properties of fluids using many different approaches.Conventional rheometry often involves high technicality and time-consuming experimental manipulation.
In the present study, we have developed a novel approach for accurate and efficient prediction of the Newtonian viscosity of jetted fluids.The accuracy of the present approach has been demonstrated through tests against extensive numerical data where the ground truth is exactly known: while the present DNN is not optimized, we have reached less than 1% of average MAPE in many cases and notably, using .
Unlike conventional rheometry, which needs to conduct a time-consuming process to measure rheometrical properties, this method identify a fluid in a previously generated dataset using Deep Neural Network.The dataset is generated only once and we showed that a few number of samples are needed to map a large range of viscosity.In the same way, the DNN is trained offline, once, and its prediction process only involves a limited number of simple algebraic calculations.
Furthermore, the present method relies only on some descriptors of the first main drop (volume, area, etc) at several amplitudes of stimulation to identify accurately the fluid.That information is easily obtained using CIJ visualization system and can be fully automatized as the CIJ process itself proves to be robust.
However, the present proposal is a proof of concept that addresses only the prediction of the Newtonian viscosities where both density and surface tension are known and constant.The size of the dataset is directly correlated to the number of parameters to be predicted, and more complex rheological properties will thus need larger datasets.Many strategies can thus be developed to alleviate the difficulty related to the constitution of the dataset used to identify the fluid.For instance, using several jetting nozzle geometries may strongly increase the prediction accuracy

FIG. 4 :
FIG. 4: Example of generated fluid jet, with the inlet boundary conditions and drop numbering from 7 .
For each drop, 6 drop descriptors are derived: barycenter, width, height, Feret diameter, area and volume of the drop.Tab.II shows the 15 first features and the 5 first instances: the 3 first features give the jet parameters, i.e.Reynolds number, amplitude of stimulation δ and time of the output while the next features describe two drops with 6 variables each.

Fig. 6 .
Fig.6.As one can expect, there is a strong correlation between the features of each drop, while the Reynolds number is moderately correlated with the first drop (i.e. the jet of fluid), and in a lesser extend with the features of drop 1.

FIG. 8 :
FIG.8: Random examples of drop fits with the associated residuals where the surface color is the drop and the thick line is the polynomial fit.

FIG. 9 :
FIG. 9: Distribution of the fitted residuals for drop 0 (a) and drop 1 (b) with the hue depicting the stimulation amplitudes.

Figure 10
Figure 10 shows the prediction of the linear model for both datasets A and B. Predictions of Reynolds are denoted Re.As one may expect, the dataset B, which contains more data about

1.
FIG. 11: Predicted Reynolds numbers Re as a function of the ground truth Reynolds number Re for datasets A (left) and B (right) with deep-learning approach.

FIG. 12 :
FIG. 12: Jets of fluids for a fixed Re-Amplitude pair at 6 different times: t stands for the simulation time while t stands for the refactored time starting when the breakup occurs.

• i = 7
for A' (the 6 drop descriptors -see sec III B 1 -plus the stimulation amplitude); i = 14 for B' (13 values from the fit -see III B 2 plus the stimulation amplitude) Note that coupling amplitudes into A' and B' datasets results in much less samples with ≈ 18100 instances (100 instances for 181 Reynolds numbers).
FIG. 14: Average MAPE as a function of the systematic sampling parameter n: a subset of 1/n Reynolds number is used to train/test the model.

TABLE I :
Hyperparameters of the DNN used throughout the present article.
An hold-out validation is performed with a 65% -15% -20% partition (train, validation and test, respectively) .The dataset is shuffled before partitioning.The model accuracy is assessed using the Mean Absolute Percentage Error (MAPE) which is more humanly understandable than the MSE metric used in the DNN loss function (see Table

TABLE II
: The 15 first features and the 5 first instances for Re = 100 and Amp = 0.01 used in dataset A. The Pearson's coefficient correlation matrix between the features of the dataset A is plotted

TABLE III
: The 15 first features and the 5 first instances for Re = 100 and Amp = 0.01 of the dataset B, where the exponent stands for the drop number.