Identifiability of Causal-based ML Fairness Notions

Machine learning algorithms can produce biased outcomes or predictions, typically against minorities and under-represented sub-populations. Therefore, fairness is emerging as an important requirement for the safe application of machine learning based technologies. The most commonly used fairness notions (e.g. statistical parity, equalized odds, predictive parity) are observational and rely on mere correlation between variables. These notions fail to identify bias in the presence of statistical anomalies such as Simpson's or Berkson's paradoxes. Causality-based fairness notions (e.g. counterfactual fairness, no-proxy discrimination) are immune to such anomalies and hence more reliable for assessing fairness. The problem with causality-based fairness notions, however, is that they are defined in terms of quantities (e.g. causal, counterfactual, and path-specific effects) that are not always measurable. This is known as the identifiability problem and is the topic of a large body of work in the causal inference literature. The first contribution of this paper is a compilation of the major identifiability results which are of particular relevance for machine learning fairness. To the best of our knowledge, no previous work in the field of ML fairness or causal inference provides such a systematization of knowledge. The second contribution is more general and addresses the main problem of using causality in machine learning, that is, how to extract causal knowledge from observational data in real scenarios. This paper shows how this can be achieved using identifiability.


I. INTRODUCTION
Machine learning is being used to inform decisions with critical consequences on human lives such as job hiring, college admission, loan granting, and criminal risk assessment. Unfortunately, these automated decision systems have been found to consistently discriminate against certain individuals or sub-populations, typically minorities. Because the discrimination is very often unintentional, discovering and addressing it is a challenging task. The most commonly used fairness notions are observational and rely on mere correlation between variables. For example, statistical parity [4] requires that the proportion of positive outcomes (e.g. granting loans) is the same for all sub-populations (e.g. male and female groups). Equal opportunity [7] requires that the true positive rate (TPR) is the same for all sub-populations. The main problem of correlation-based fairness notions is that they fail to detect discrimination in the presence of statistical anomalies such as Simpson's paradox [22] and Berkson's paradox [1], [9]. A famous example of Simpson's paradox is the gender bias in the 1973 Berkeley admissions [2], [11]. In that year, 44% of male applicants were admitted against only 34% of female applicants. While this looks like a bias against female candidates, when the same data was analyzed by department, acceptance rates were approximately the same.
This work was supported by the European Research Council (ERC) project HYPATIA under the European Union's Horizon 2020 research and innovation programme. Grant agreement n. 835294.
One way to address this limitation is to consider how data is generated in the first place, which leads to causal-based fairness notions. Because this new breed of fairness notions is immune to statistical paradoxes, it is now widely accepted that causality is necessary to appropriately address the problem of fairness [11]. Examples of causal-based fairness notions include total effect [14], interventional fairness [17], counterfactual fairness [10], counterfactual effects [29], and path-specific counterfactual fairness [3], [28]. These notions are defined in terms of non-observable quantities such as causal, counterfactual, and path-specific effects. As they are non-observable, these quantities cannot always be estimated based on observable data. This is known as the identifiability problem and is the topic of a large body of work in the causal inference literature. For example, the identifiability of causal effects can be decided using a set of three causal inference rules called do-calculus [13], [14]. This paper summarizes the main identifiability results as they relate to the specific problem of discrimination discovery, with an emphasis on graphical criteria. These results fall into two categories: causal effect (intervention) identifiability [6], [8], [14], [19], [21], [23]-[25] and counterfactual identifiability [18], [20], [21], [27]. Section II provides necessary background concepts. Then, instead of repeating the definition of identifiability (Definition 3.2.3 in [14]), Section III gives an intuitive explanation of the identifiability problem through the teacher firing example. Sections IV and V compile the common identifiability results of causal and counterfactual effects, respectively. Section VI concludes.

II. PRELIMINARIES AND NOTATION
Variables are denoted by capital letters. In particular, A is used for the sensitive variable (e.g., gender, race, age) and Y is used for the outcome of the automated decision system (e.g., hiring, admission, releasing on parole). Small letters denote specific values of variables (e.g., A = a, W = w). Bold capital and small letters denote a set of variables and a set of values, respectively.
A structural causal model [14] is a tuple $M = \langle U, V, F, P(U) \rangle$ where:
• U is a set of exogenous variables which cannot be observed or experimented on but constitute the background knowledge behind the model.
• V is a set of observable variables which can be experimented on.
• F is a set of structural functions where each $f_i$ maps $U \cup (V \setminus \{V_i\})$ to $V_i$ and represents the process by which variable $V_i$ changes in response to the other variables in $U \cup V$.
• $P(U)$ is a probability distribution over the unobservable variables U.
Causal assumptions between variables are captured by a causal diagram G, which is a directed acyclic graph (DAG) where nodes represent variables and directed edges represent functional relationships between the variables. Directed edges can have two interpretations. Under the probabilistic interpretation, the edge represents a dependency among the variables, and the direction of the edge is irrelevant. Under the causal interpretation, the edge represents a causal influence between the corresponding variables, and the direction of the edge matters. Unobserved variables U, which are typically not represented in the causal diagram, can be either mutually independent (Markovian model) or dependent on each other. In case the unobserved variables can be dependent and each $U_i \in U$ is used in at most two functions in F, the model is called semi-Markovian. In causal diagrams of semi-Markovian models, dependent unobservable variables (unobserved confounders) are represented by a dotted bi-directed edge between observable variables. Graphs G5 (Table I) and G16 (Table II) show causal graphs of Markovian and semi-Markovian models, respectively.
An intervention, noted $do(V = v)$, is a manipulation of the model that consists in fixing the value of a variable (or a set of variables) to a specific value regardless of the corresponding function $f_v$. Graphically, it consists in discarding all edges pointing into the node corresponding to variable V. Figure 1(c) shows the causal diagram of the manipulated model after the intervention $do(Z = z)$, denoted $M_{Z=z}$ or $M_z$ for short. The intervention $do(V = v)$ induces a different distribution on the other variables. Intuitively, while $P(Y \mid Z = z)$ reflects the population distribution of Y among individuals whose Z value is z, $P(Y \mid do(Z = z))$ reflects the population distribution of Y if everyone in the population had their Z value fixed at z. The obtained distribution $P(Y \mid do(Z = z))$ can be considered as a counterfactual distribution since the intervention forces Z to take a value that may differ from the one it would take in the actual world. Such a counterfactual variable is noted $Y_{Z=z}$ or $Y_z$ for short. $P(Y = y \mid do(Z = z)) = P(Y_{Z=z} = y) = P(Y_z = y) = P(y_z)$ is used to define the causal effect of z on Y.
The term counterfactual quantity is used for expressions that explicitly involve multiple worlds. In Figure 1(b), consider the expression $P(y_{a'} \mid Y = y, A = a) = P(y_{a'} \mid y, a)$. Such an expression involves two worlds: an observed world where A = a and Y = y, and a counterfactual world where Y = y and A = a'. It reads "the probability of Y = y had A been a', given that we observed Y = y and A = a". In the common example of job hiring, if A denotes race (a: white, a': non-white) and Y denotes the hiring decision (y: hired, y': not hired), $P(y_{a'} \mid y, a)$ reads "given that a white applicant has been hired, what is the probability that the same applicant would still have been hired had he been non-white". Nesting counterfactuals can produce complex expressions. For example, in the relatively simple model of Figure 1(b), $P(y_{a, z_{a'}} \mid y'_{a'}) = P(y(a, z(a')) \mid y'(a'))$ reads the probability of Y = y had (1) A been a and (2) Z been the value it takes when A is a', given that an intervention A = a' produced y'. This expression involves three worlds: a world where A = a, a world where $Z = z_{a'}$, and a world where A = a'. Such complex expressions are used to characterize direct, indirect, and path-specific effects.

III. EXPLAINING IDENTIFIABILITY THROUGH AN EXAMPLE
Consider the example of an automated system for deciding whether to fire a teacher at the end of the academic year. Deployed teacher evaluation systems have been suspected of bias in the past. For example, IMPACT is a teacher evaluation system used in the city of Washington, D.C., and has been found to be unfair against teachers from minority groups [12], [15], [16]. Assume that the system takes as input one feature, namely, the initial average level of the students assigned to that teacher (A). The outcome is whether to fire the teacher ($\hat{Y}$). Assume that these two variables are confounded by a third unobservable variable U which represents a socioeconomic status related to the school neighborhood.
Assume also that all three variables are binary with the following values. If the initial average level of the students assigned to the teacher is high, A = 1; otherwise (initial level is low), A = 0. Firing a teacher corresponds to $\hat{Y} = 1$, while retaining her corresponds to $\hat{Y} = 0$. If the school is located in a high-income neighborhood, U = 1; otherwise (the school is located in a low-income neighborhood), U = 0. The level of students in a given class can be influenced by several variables, but in this example, assume that it is only influenced by the socioeconomic status of the school; students in high-income neighborhoods are more advantaged and typically perform better in school.
The relationships between the variables A, U, and $\hat{Y}$ can be graphically represented using the causal directed acyclic graph (DAG) in Figure 2. Notice that the edges $U \to A$ and $U \to \hat{Y}$ are dotted because they emanate from an unobservable variable (U). Assume that the automated decision system is suspected to be biased by the level of students assigned to the teacher. That is, it is claimed that the system is more likely to fire teachers who have been assigned classes with low-level students at the beginning of the academic year, which is clearly unfair. The bias in the outcome ($\hat{Y}$) due to the sensitive variable A can be assessed by computing the total variation:

$TV_{a_1, a_0}(y) = P(y \mid a_1) - P(y \mid a_0)$

which coincides with statistical parity [4] and measures the difference between the distributions of $\hat{Y}$ when we (passively) observe A changing from $a_0$ to $a_1$ (e.g. from 0 to 1 in our example). The main limitation of TV is that it is purely statistical and may be fooled by statistical anomalies such as Simpson's and Berkson's paradoxes. Total effect (TE) [14] is the causal version of TV and is defined in terms of experimental probabilities as follows:

$TE_{a_1, a_0}(y) = P(y_{a_1}) - P(y_{a_0})$

While TV is expressed in terms of observable probabilities ($P(y \mid a_1)$ and $P(y \mid a_0)$) and hence can always be computed from observable data, TE is not. The question is: can TE be expressed in terms of observable probabilities and hence computed from observable data? If the answer is yes, TE is said to be identifiable. Otherwise, it is not identifiable. Pearl gives a formal definition of identifiability ([14], page 77, Definition 3.2.3). Intuitively, given a dataset D (which can be generated by different causal models), a quantity (e.g. $P(Y_{A=a_1} = y)$) is identifiable if it keeps the same value regardless of the causal model which generated the dataset D.
For example, in the teacher firing scenario, $P(\hat{Y}_{A=0} = 1)$ is not identifiable since it is possible to come up with two causal models, $M_1$ and $M_2$, that generate the same data, and hence agree on every observable probability, while assigning different values to $P(\hat{Y}_{A=0} = 1)$. Since the total variation TV is defined in terms of observable probabilities, it can be computed based on the observable data. The total effect TE, however, cannot be computed based on observable data as $P(\hat{Y}_{A=0} = 1)$ is not identifiable.
Notice that, in this example, both models $M_1$ and $M_2$ share the same graph structure (Figure 2). This is not always the case. That is, it is possible to have two causal models with different graph structures coinciding on the observable joint distribution. Hardt et al. [7] illustrate this case with an example. Tikka [26] presents another non-identifiable example defined using the XOR logic operator.
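The non-identifiability argument can be checked on a toy pair of models. The sketch below is an illustration only: the two models, the distribution `P_U`, and all function names are hypothetical and are not the $M_1$ and $M_2$ of the paper. Both models are compatible with the graph U → A, U → Ŷ, A → Ŷ and induce the same observable joint distribution, yet they disagree on $P(\hat{Y}_{A=0} = 1)$.

```python
# Hypothetical toy models (not the paper's M1 and M2) compatible with the
# graph U -> A, U -> Y, A -> Y, with U unobserved and all variables binary.
P_U = {0: 0.5, 1: 0.5}  # assumed distribution of the latent confounder


def f_A(u):
    """Structural function of A: fully determined by U in this toy setting."""
    return u


def f_Y_m1(a, u):
    """Model 1: the decision depends on U only (A has no causal effect)."""
    return u


def f_Y_m2(a, u):
    """Model 2: the decision depends on A only."""
    return a


def observational(f_Y):
    """Joint distribution P(A = a, Y = y), obtained by summing out U."""
    joint = {}
    for u, p in P_U.items():
        a = f_A(u)
        y = f_Y(a, u)
        joint[(a, y)] = joint.get((a, y), 0.0) + p
    return joint


def interventional(f_Y, a, y):
    """P(Y_{A=a} = y): A is forced to a while U keeps its distribution."""
    return sum(p for u, p in P_U.items() if f_Y(a, u) == y)


if __name__ == "__main__":
    print(observational(f_Y_m1), observational(f_Y_m2))
    print(interventional(f_Y_m1, 0, 1), interventional(f_Y_m2, 0, 1))
```

Under these assumptions, no amount of observational data can separate the two models, which is exactly why the interventional quantity cannot be computed from the data alone.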
Based on the causal inference literature, the next sections compile a list of identifiability criteria for the different types of non-observable quantities: causal, counterfactual, direct, indirect, and path-specific effects.

IV. IDENTIFIABILITY OF CAUSAL EFFECTS
The natural way to estimate the causal effect of a variable (the sensitive attribute A) on another (the outcome variable Y) is to carry out real experiments using a randomized controlled trial (RCT) [5]. When an RCT is possible, there is no need for identifiability at all. However, in the context of machine learning fairness, an RCT is often not an option as experiments can be too costly to implement or physically impossible to carry out (e.g. changing the gender of a job applicant).
As an alternative, intervention using the do-operator can be used to compute the causal effect. Without loss of generality, this section focuses on the identifiability of $P(Y = y \mid do(A = a)) = P(y_a)$, that is, the causal effect of the sensitive attribute A on the outcome variable Y. The computation of $P(y_a)$ uses a "surgically altered" graph in which all arrows into A are deleted and the value of A is fixed at a, but the rest of the graph remains unchanged.
Whether it is possible to express $P(y_a)$ only in terms of observable probabilities (identifiability) depends on the structure of the causal graph (which captures how data is generated).
A first important result is that any causal effect is identifiable in a Markovian model (where all unobservable variables are independent).In semi-Markovian models, however, the causal effect is not always identifiable.
Table I shows different Markovian models involving various patterns of causal relationships along with the corresponding expression in terms of observable probabilities.
Graphs G1 to G5 illustrate the simplest cases where no confounding between A and Y exists. In that case, the causal effect matches the conditional probability regardless of any mediator M as follows:

$P(y_a) = P(y \mid a)$ (5)

1) Back-door adjustment: In case there are confounders involving A and Y, the causal effect can be identified by finding a set of variables C that block all back-door paths from A to Y. This is called the back-door criterion. This criterion necessitates the existence of a set of covariates C which blocks all non-causal (back-door) paths from A to Y but leaves all causal paths open. C satisfies the back-door criterion when (1) C blocks every back-door path between A and Y, and (2) no node in C is a descendant of A. Graphs G6 to G12 illustrate examples where C (or $\{C_1, C_2\}$) meets the back-door criterion. In presence of an observable confounder C, $P(y_a)$ is identifiable by adjusting on that confounder using the back-door formula:

$P(y_a) = \sum_{c \in dom(C)} P(y \mid a, c)\, P(c)$ (6)

where the summation is over values c in the domain (sample space) of C, denoted dom(C). Note that G4 and G5 contain a collider (W). Applying Eq. 6 with the collider as the adjustment variable invalidates the equality, since conditioning on a collider might open a path between A and Y and consequently create a dependency between these two variables. Despite the fact that G11 involves two confounders $C_1$ and $C_2$, no adjustment is required because of the presence of the collider W. Hence $P(y_a)$ can be computed using Eq. 5. Alternatively, adjusting on one of the admissible sets listed for G11 in Table I is possible using Eq. 6.

TABLE I: $P(y_a)$ of some Markovian models. Entries include $\sum_{c_1} P(y \mid a, c_1)\, P(c_1)$, $\sum_{w, c_1} P(y \mid a, w, c_1)\, P(w, c_1)$, and $\sum_{w, c_2} P(y \mid a, w, c_2)\, P(w, c_2)$.
Table I shows all possible formulas that can be used to calculate $P(y_a)$ for G11. G12 presents another case with two confounders ($C_1$ and $C_2$) and two back-door paths between A and Y. The first must be blocked by either W or $C_2$ or both, while the second does not need any controlling because of the presence of the collider W. Thus, more than one set of variables is sufficient to control for confounding, and any one of the corresponding equations can be used to calculate the causal effect of A on Y. In summary, the only type of variable that has an impact on the identifiability of $P(y_a)$ in Markovian models is the confounder. To compute the causal effect in presence of confounding, adjusting using the back-door formula (Eq. 6) is required. However, adjusting on a collider variable should be avoided, since this might open a path between A and Y and hence create a dependency between them.
Mediator variables, on the other hand, have no impact on the identifiability of causal effects in Markovian models.
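The back-door formula (Eq. 6) can be illustrated numerically. The following is a minimal sketch on an assumed Markovian model C → A, C → Y, A → Y with binary variables; the probability tables and helper names are hypothetical and chosen only for the illustration. It compares the adjusted quantity with the naive conditional $P(y \mid a)$ and with the ground truth obtained by deleting the factor $P(a \mid c)$ from the factorization.

```python
# Assumed toy model: C -> A, C -> Y, A -> Y (C is an observed confounder).
VALUES = (0, 1)
P_C = {0: 0.7, 1: 0.3}                      # P(C = c)
P_A1_GIVEN_C = {0: 0.2, 1: 0.8}             # P(A = 1 | c)
P_Y1_GIVEN_AC = {(0, 0): 0.1, (0, 1): 0.5,  # P(Y = 1 | a, c), keyed by (a, c)
                 (1, 0): 0.3, (1, 1): 0.9}


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(c, a, y):
    """Observational joint P(c, a, y) from the Markovian factorization."""
    return P_C[c] * bern(P_A1_GIVEN_C[c], a) * bern(P_Y1_GIVEN_AC[(a, c)], y)


def prob(c=None, a=None, y=None):
    """Marginal of the joint; arguments left as None are summed out."""
    return sum(joint(ci, ai, yi)
               for ci in VALUES for ai in VALUES for yi in VALUES
               if c in (None, ci) and a in (None, ai) and y in (None, yi))


def conditional(y, a):
    """Naive observational quantity P(y | a)."""
    return prob(a=a, y=y) / prob(a=a)


def backdoor(y, a):
    """Eq. 6: sum over c of P(y | a, c) P(c), using observational terms only."""
    return sum(prob(c=c, a=a, y=y) / prob(c=c, a=a) * prob(c=c) for c in VALUES)


def truncated(y, a):
    """Ground truth P(y_a): the factor P(a | c) is removed by do(A = a)."""
    return sum(P_C[c] * bern(P_Y1_GIVEN_AC[(a, c)], y) for c in VALUES)


if __name__ == "__main__":
    tv = conditional(1, 1) - conditional(1, 0)  # total variation
    te = backdoor(1, 1) - backdoor(1, 0)        # total effect via Eq. 6
    print(f"TV = {tv:.4f}, TE = {te:.4f}")
```

With these assumed tables, the total variation overstates the total effect because part of the observed association flows through the confounder C.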
Causal effects are not always identifiable in semi-Markovian models. This subsection focuses on causal models where the causal effect of A on Y is identifiable. The following subsection gives a graphical criterion for causal models where the causal effect is not identifiable. In a causal model, the measurement of causal effects is assisted by interventions following a set of inference rules introduced by Pearl [14] and known as do-calculus. These rules link the interventional quantities of causal effects to simple statistical distributions based solely on observational data. As an alternative way of assessing causal effects, relevant graphical patterns are presented in the remainder of this section.
2) do-calculus inference rules: do-calculus [13], [14] is a set of three inference rules that can be used to express an interventional expression of the form $P(y_a)$ in terms of subscript-free (observable) quantities. The rules are:
• Rule 1 (Insertion/Deletion of Observations): $P(y_a \mid c, w) = P(y_a \mid c)$ provided that the set of variables C blocks all paths from W to Y after all arrows leading to A have been deleted.
• Rule 2 (Action/Observation Exchange): $P(y_a \mid c) = P(y \mid a, c)$ provided that the set of variables C blocks all back-door paths from A to Y.
• Rule 3 (Insertion/Deletion of Actions): $P(y_a) = P(y)$ provided that there are no causal paths from A to Y.
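The three rules above are stated in a simplified form. For reference, the general statement of do-calculus from the causal inference literature can be sketched as follows. The notation is an addition to the text: X, Y, Z, W are disjoint sets of variables, $G_{\overline{X}}$ is G with all edges into X removed, $G_{\underline{Z}}$ is G with all edges out of Z removed, and $Z(W)$ is the set of Z-nodes that are not ancestors of any W-node in $G_{\overline{X}}$.

```latex
\begin{align*}
\text{Rule 1:}\quad & P(y \mid do(x), z, w) = P(y \mid do(x), w)
  && \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}} \\
\text{Rule 2:}\quad & P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w)
  && \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}} \\
\text{Rule 3:}\quad & P(y \mid do(x), do(z), w) = P(y \mid do(x), w)
  && \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}
\end{align*}
```

The simplified rules are recovered by specializing these conditions, for instance Rule 2 with an empty first intervention and Z = A gives the exchange of $do(a)$ for the observation a.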
TABLE II: $P(y_a)$ of some semi-Markovian models. For G13, $P(y_a) = P(y \mid a)$.
As an example, consider the graph G18. The causal effect $P(y_a)$ can be identified as follows:

$P(y_a) = \sum_{w_1} \sum_{w_2} \sum_{w_3} P(y \mid w_1, w_3)\, P(w_2)\, P(w_1, y \mid do(a)) = \sum_{w_1} \sum_{w_2} \ldots = \sum_{w_1} \sum_{w_2} \sum_{a'} \ldots$

The term $P(w_1, y \mid do(a))$ in (8) is replaced by $P(w_1 \mid a, w_2) \sum_{a'} P(y \mid w_1, a')\, P(a' \mid w_2)$ after applying Rule 2 followed by Rule 3 of do-calculus (symbolic derivation of causal effects: Eq. 3.43 in [14]). Since $W_2$ blocks all back-door paths between A and Y, we apply the back-door formula (Eq. 6) to adjust on $W_2$ in (10).
3) C-component factorization: C-component factorization [21] aims to express the observational distribution $P(v)$ as a product of factors $P_{v \setminus s}(s)$, where each s represents the set of vertices included in a c-component. A c-component is a set of vertices in the graph such that every pair of vertices is connected by a confounding path, that is, a path composed only of bi-directed edges. The c-components are very important in measuring the causal effect of A on Y since they help in decomposing the identification problem into smaller sub-problems. In other words, variables in the graph can be partitioned into disjoint c-components in order to calculate $P(y_a)$. For example, the graph G17 is partitioned into two c-components. Note that as long as there is no confounding path connecting A to any of its direct children, $P(y_a)$ is identifiable and can be computed as in Eq. 11 [24] from $P(v)$ and $Q_A$, where $Q_A$ is the c-factor of the c-component containing A ($S_A$). $Q_A$ is computed from conditional probabilities of the form $P(v_i \mid v^{(i-1)})$, where $v^{(i-1)}$ is the set of values of all variables preceding $V_i$, assuming a valid topological order over the variables. This criterion can be slightly generalized: $P(y_a)$ is identifiable if there is no confounding path connecting A to any of its children in $G_{An(Y)}$, which is the subgraph of G composed only of ancestors of the outcome variable Y.
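For reference, the identification result of [24] is commonly stated in the following form. This is a sketch in adapted notation and may differ in presentation from Eqs. 11 and 12 of this paper: $V_1 < \dots < V_n$ is an assumed topological order, $v^{(i-1)}$ denotes the values of $V_1, \dots, V_{i-1}$, and the inner sum ranges over the values $a'$ of A with all other variables held fixed.

```latex
\begin{align*}
Q_A &= \prod_{i \,:\, V_i \in S_A} P\big(v_i \mid v^{(i-1)}\big), \\
P(y_a) &= \sum_{v \setminus \{a,\, y\}} \frac{P(v)}{Q_A} \sum_{a'} Q_A\big|_{A = a'} .
\end{align*}
```

Dividing $P(v)$ by $Q_A$ removes the factors of the c-component containing A, and the inner sum reinserts them with A marginalized out, which is what the intervention $do(A = a)$ amounts to when no confounding path links A to its children.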
To illustrate the c-component factorization property, consider the causal graph G17. Applying Eq. 11 to G17 yields an expression for $P(y_a)$ in terms of observable probabilities only.
4) Front-door adjustment: In case a bi-directed edge between the sensitive attribute A and the outcome Y exists, all the above approaches fail. However, $P(y_a)$ can still be measured using another criterion called the front-door criterion. The graph G19 satisfies this criterion. The back-door criterion cannot be used because of the unobserved confounder (which is impossible to control for). However, due to the presence of the mediator M, the front-door criterion can be applied to identify the causal effect as follows:

$P(y_a) = \sum_{m} P(m \mid a) \sum_{a'} P(y \mid m, a')\, P(a')$

More generally, the front-door adjustment can be applied if the following conditions hold:
1) all of the directed paths from A to Y pass through M,
2) there are no unblocked back-door paths from A to M,
3) all back-door paths from M to Y are blocked by A.
Back-door and front-door adjustments are the main ingredients of the do-calculus (Section IV-2).
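The front-door adjustment can also be checked numerically. The sketch below assumes a toy semi-Markovian model with a latent confounder U of A and Y and a mediator M (U → A, U → Y, A → M, M → Y); all probability tables and names are hypothetical. The front-door estimate is computed from the observable joint $P(a, m, y)$ only, and is compared with the true $P(y_a)$, which requires knowledge of U.

```python
from itertools import product

# Assumed toy model: U -> A, U -> Y (U latent), A -> M, M -> Y.
VALUES = (0, 1)
P_U = {0: 0.6, 1: 0.4}                      # P(U = u), never observed
P_A1_GIVEN_U = {0: 0.2, 1: 0.7}             # P(A = 1 | u)
P_M1_GIVEN_A = {0: 0.1, 1: 0.8}             # P(M = 1 | a)
P_Y1_GIVEN_MU = {(0, 0): 0.1, (0, 1): 0.6,  # P(Y = 1 | m, u), keyed by (m, u)
                 (1, 0): 0.4, (1, 1): 0.9}


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def full_joint(u, a, m, y):
    """Joint over all variables, including the latent U."""
    return (P_U[u] * bern(P_A1_GIVEN_U[u], a) * bern(P_M1_GIVEN_A[a], m)
            * bern(P_Y1_GIVEN_MU[(m, u)], y))


# Observable joint P(a, m, y): U is summed out, as in real data.
P_OBS = {(a, m, y): sum(full_joint(u, a, m, y) for u in VALUES)
         for a, m, y in product(VALUES, repeat=3)}


def prob(a=None, m=None, y=None):
    """Marginal of the observable joint; None arguments are summed out."""
    return sum(p for (ai, mi, yi), p in P_OBS.items()
               if a in (None, ai) and m in (None, mi) and y in (None, yi))


def front_door(y, a):
    """sum_m P(m | a) * sum_a' P(y | m, a') P(a'), observable terms only."""
    total = 0.0
    for m in VALUES:
        p_m_given_a = prob(a=a, m=m) / prob(a=a)
        inner = sum(prob(a=a2, m=m, y=y) / prob(a=a2, m=m) * prob(a=a2)
                    for a2 in VALUES)
        total += p_m_given_a * inner
    return total


def truth(y, a):
    """True P(y_a), computed with access to the latent U."""
    return sum(P_U[u] * bern(P_M1_GIVEN_A[a], m) * bern(P_Y1_GIVEN_MU[(m, u)], y)
               for u in VALUES for m in VALUES)


if __name__ == "__main__":
    print(front_door(1, 1), truth(1, 1), prob(a=1, y=1) / prob(a=1))
```

In this assumed model the front-door estimate matches the true interventional probability, while the naive conditional $P(y \mid a)$ does not, since it absorbs the confounding through U.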
V. IDENTIFICATION OF COUNTERFACTUAL EFFECTS
Note that in Markovian, as well as semi-Markovian models, if all parameters of the causal model are known (including $P(u)$), any counterfactual is identifiable and can be computed using the three steps of abduction, action, and prediction (Theorem 7.1.7 in [14]). However, this method is usually infeasible in real-world scenarios due to the lack of complete knowledge of the causal model (more specifically, knowledge of the background variables U).
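The three steps can be sketched on a fully specified toy model. Everything below is an assumption made for the illustration: the exogenous variables `U_A` and `U_Y`, their distributions, and the structural functions are hypothetical and do not come from the paper. The function computes $P(Y_{A=a'} = y' \mid A = a, Y = y)$ by enumerating the exogenous states.

```python
from itertools import product

# Assumed, fully known SCM for the firing example (illustration only).
P_UA = {0: 0.5, 1: 0.5}          # exogenous noise of A
P_UY = {0: 0.5, 1: 0.3, 2: 0.2}  # exogenous noise of Y (three response types)


def f_A(u_a):
    """Student level is set by its exogenous variable."""
    return u_a


def f_Y(a, u_y):
    """u_y = 2: always fired; u_y = 1: fired only if A = 0; u_y = 0: never."""
    return 1 if u_y == 2 or (u_y == 1 and a == 0) else 0


def counterfactual(y_cf, a_cf, a_obs, y_obs):
    """P(Y_{A=a_cf} = y_cf | A = a_obs, Y = y_obs) by exact enumeration."""
    # Step 1 (abduction): weight of each exogenous state consistent with
    # the evidence; normalizing these weights gives P(u | evidence).
    posterior = {}
    for u_a, u_y in product(P_UA, P_UY):
        a = f_A(u_a)
        if a == a_obs and f_Y(a, u_y) == y_obs:
            posterior[(u_a, u_y)] = P_UA[u_a] * P_UY[u_y]
    evidence_prob = sum(posterior.values())
    if evidence_prob == 0:
        raise ValueError("the evidence has probability zero in this model")
    # Step 2 (action): replace f_A by the constant a_cf.
    # Step 3 (prediction): recompute Y under the updated exogenous weights.
    hits = sum(p for (_, u_y), p in posterior.items() if f_Y(a_cf, u_y) == y_cf)
    return hits / evidence_prob


if __name__ == "__main__":
    # A teacher with a low-level class (A = 0) was fired (Y = 1); would she
    # still have been fired with a high-level class (A = 1)?
    print(counterfactual(y_cf=1, a_cf=1, a_obs=0, y_obs=1))
```

The enumeration is only possible because $P(u)$ and the structural functions are given, which is precisely the knowledge that is missing in practice.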
Given a causal graph G of a Markovian model and a counterfactual expression $\gamma = v_a \mid e$, with e some arbitrary set of evidence, measuring $P(\gamma)$ requires constructing a counterfactual graph which combines parallel worlds. Every world is represented by a counterfactual sub-model $M_a$. For example, Figure 3 shows a causal graph for the firing example (Figure 3(a)) along with its corresponding counterfactual graph (Figure 3(b)). Thus, Figure 3(b) combines two worlds: the actual world where the teacher actually has $A = a_0$ and the counterfactual world where the same teacher is assigned $A^* = a_1$. As shown in the figure, the two worlds share the same unobserved background variable $U_Y$, which captures the interaction between these worlds. Note that no bi-directed edges are connected to the node $A^* = a_1$. The reason is that the intervention $do(A^* = a_1)$ removes all the incoming arrows to $A^*$. Thus, in order to calculate the counterfactual expression $P(Y^*_{A^* = a_1} \mid A = a_0)$ of the simple Markovian graph in Figure 3(a), we need to construct the semi-Markovian graph in Figure 3(b). The make-cg algorithm [21] automates this procedure. Basically, the make-cg algorithm starts by combining the two causal graphs (actual and counterfactual) and makes them share the same background variable U (as shown in Figure 3(b)). Then, it discards the duplicated endogenous nodes which are not affected by $do(a)$. One typical unidentifiable counterfactual quantity is $P(y_a, y'_{a'})$, which is called the probability of necessity and sufficiency. The corresponding counterfactual graph is the W-graph, which has the same structure as Figure 3(b). This simple criterion can be generalized to the zig-zag graph (Figure 3(c)) where the counterfactual $P(y_a, w_1, w_2, z_{a'})$ is not identifiable.

A. C-component factorization
To illustrate how counterfactual quantities are measured, consider the same firing example (Figure 2), where the latent variable U is now replaced with an observable variable C. Consider the counterfactual query $P(y_{a_1} \mid a_0)$, which reads the probability of firing a teacher who is assigned a class with a low initial level of students ($a_0$) had she been assigned a class with a high initial level of students ($a_1$). Figure 4(a) shows the two parallel-worlds graph for the query, while Figure 4(b) presents the final counterfactual graph constructed using the make-cg algorithm. Note that in Figure 4(b), C and $C^*$ are merged as a single node C (by applying Lemma 24 in [21]). The main reason is that these nodes are not descendants of A. Then, C inherits all the children of both nodes C (the old node in the previous graph) and $C^*$. Finally, $U_C$ is omitted since any unobserved variable that possesses a single child should be removed [21].
Table III presents various examples of identifying the counterfactual quantities (column 3) of some causal graphs (first column) after obtaining their corresponding counterfactual graphs (column 2).
For example, G33 includes, in addition to the confounder C, a mediator M. As shown in the corresponding counterfactual graph, the nodes M and $M^*_{a_1}$ are not merged as they differ on their A-derived parents, in contrast to the node C. Similarly, in G34, the pairs of nodes M, $M^*_{a_1}$ and W, $W^*_{a_1}$ are not merged for the same reason.

VI. CONCLUSION
A typical goal of causal inference in the context of discrimination discovery is establishing the causal effect of the sensitive attribute A on the outcome Y .Unfortunately, this may not be possible due to the identifiability problem.This paper studied the problem of identifiability as it relates to discrimination discovery.We made use of the large-scale body of work on identifiability theory to summarize the main results found in the literature.Based on various graphical patterns, we discussed and assessed whether the causal effect of A on Y is identifiable.The main identifiability results fall into two types, namely the causal effect (intervention) and the counterfactual effect.Finally, we note that in the case when identification is not possible, it may still be possible to bound causal effects.The development of bounds for non-identifiable quantities is called partial identifiability.

Fig. 4: (a) Parallel-worlds graph for $P(y_{a_1} \mid a_0)$. (b) Counterfactual graph for $P(y_{a_1} \mid a_0)$.

Now, having constructed the counterfactual graph for the counterfactual expression $P(y_{a_1} \mid a_0)$, we can turn to the identifiability of this expression. Note that the obtained counterfactual graph (Figure 4(b)) has three c-components, $\{C\}$, $\{A\}$, and $\{Y, Y^*_{a_1}\}$, thus:

$P(y_{a_1} \mid a_0) = \dfrac{\sum_{y, c} Q(c)\, Q(a_0)\, Q(y, y_{a_1})}{P(a_0)}$ (15)

While causal effects (Section IV) interpret the effect of actions as downward flow, counterfactual effects require more complex reasoning. Basically, counterfactual effects measure fairness based on multiple worlds: the actual world and other hypothetical (or counterfactual) worlds. The actual world is represented by a causal model M in its actual (normal) state without any interventions, while the counterfactual worlds are represented by sub-models $M_a$ where the intervention $do(a)$ forces the actual state to change to an alternative state.