<Dl> <Dd> \[ \mathrm{HSIC_{Lasso}}: \min_{\mathbf{x}} \frac{1}{2} \sum_{k,l=1}^{n} x_k x_l \, \mathrm{HSIC}(f_k, f_l) - \sum_{k=1}^{n} x_k \, \mathrm{HSIC}(f_k, c) + \lambda \|\mathbf{x}\|_1, \quad \text{s.t.} \ x_1, \ldots, x_n \geq 0, \] </Dd> </Dl>
<P> where \( \mathrm{HSIC}(f_k, c) = \mathrm{tr}(\bar{\mathbf{K}}^{(k)} \bar{\mathbf{L}}) \) is a kernel-based independence measure called the (empirical) Hilbert-Schmidt independence criterion (HSIC), \( \mathrm{tr}(\cdot) \) denotes the trace, \( \lambda \) is the regularization parameter, \( \bar{\mathbf{K}}^{(k)} = \mathbf{\Gamma} \mathbf{K}^{(k)} \mathbf{\Gamma} \) and \( \bar{\mathbf{L}} = \mathbf{\Gamma} \mathbf{L} \mathbf{\Gamma} \) are input and output centered Gram matrices, \( K_{i,j}^{(k)} = K(u_{k,i}, u_{k,j}) \) and \( L_{i,j} = L(c_i, c_j) \) are Gram matrices, \( K(u, u') \) and \( L(c, c') \) are kernel functions, \( \mathbf{\Gamma} = \mathbf{I}_m - \frac{1}{m} \mathbf{1}_m \mathbf{1}_m^T \) is the centering matrix, \( \mathbf{I}_m \) is the m-dimensional identity matrix (m: the number of samples), \( \mathbf{1}_m \) is the m-dimensional vector with all ones, and \( \|\cdot\|_1 \) is the \( \ell_1 \)-norm. HSIC always takes a non-negative value, and is zero if and only if two random variables are statistically independent when a universal reproducing kernel such as the Gaussian kernel is used. </P>
<P> The HSIC Lasso can be written as </P>
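<P> The quantities in the objective (Gram matrices, the centering matrix, and the trace-based HSIC) can be computed directly with NumPy. The following is a minimal sketch, not a reference implementation. It assumes scalar features, a Gaussian kernel with a hand-picked bandwidth <code>sigma</code> for both K and L, and a plain projected gradient solver chosen here only for brevity. All function names (<code>gaussian_gram</code>, <code>center</code>, <code>hsic</code>, <code>hsic_lasso</code>) are hypothetical. </P>

```python
import numpy as np


def gaussian_gram(v, sigma=1.0):
    """Gram matrix of a 1-D sample vector under a Gaussian kernel (assumed kernel choice)."""
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    sq_dists = (v - v.T) ** 2
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def center(gram):
    """Return Gamma @ gram @ Gamma with Gamma = I_m - (1/m) 1_m 1_m^T."""
    m = gram.shape[0]
    gamma = np.eye(m) - np.ones((m, m)) / m
    return gamma @ gram @ gamma


def hsic(k_bar, l_bar):
    """Empirical HSIC of two centered Gram matrices: tr(K_bar L_bar)."""
    return float(np.trace(k_bar @ l_bar))


def hsic_lasso(features, target, lam=0.1, sigma=1.0, n_iter=500):
    """Approximately minimise the HSIC Lasso objective subject to x >= 0.

    features: array of shape (m, n), one column per feature f_k.
    target:   array of shape (m,), the output c.
    The solver (projected gradient descent) is an illustrative assumption.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[1]
    k_bars = [center(gaussian_gram(features[:, k], sigma)) for k in range(n)]
    l_bar = center(gaussian_gram(target, sigma))

    # Q[k, l] = HSIC(f_k, f_l) and b[k] = HSIC(f_k, c).
    Q = np.array([[hsic(ka, kb) for kb in k_bars] for ka in k_bars])
    b = np.array([hsic(ka, l_bar) for ka in k_bars])

    # On the feasible set x >= 0 the l1-norm equals sum(x), so the objective
    # reduces to the smooth quadratic 0.5 x^T Q x - (b - lam)^T x.
    step = 1.0 / max(np.linalg.eigvalsh(Q)[-1], 1e-12)
    x = np.zeros(n)
    for _ in range(n_iter):
        grad = Q @ x - b + lam
        x = np.maximum(x - step * grad, 0.0)  # gradient step, then project onto x >= 0
    return x
```

<P> Calling <code>hsic_lasso(F, c)</code> on an m-by-n feature array <code>F</code> and an output vector <code>c</code> returns the non-negative weight vector x; features whose weight is zero are the ones the penalty has discarded. Larger values of <code>lam</code> drive more weights to zero. </P>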

Which algorithm is used most often for evaluating the correlation between features of the data?