Useful Identities for Computing Gradients
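$\frac{\partial}{\partial X} f(X)^\top = \left(\frac{\partial f(X)}{\partial X}\right)^\top$
$\frac{\partial}{\partial X} \operatorname{tr}(f(X)) = \operatorname{tr}\left(\frac{\partial f(X)}{\partial X}\right)$
$\frac{\partial}{\partial X} \det(f(X)) = \det(f(X)) \operatorname{tr}\left(f(X)^{-1} \frac{\partial f(X)}{\partial X}\right)$
$\frac{\partial}{\partial X} f(X)^{-1} = -f(X)^{-1} \frac{\partial f(X)}{\partial X} f(X)^{-1}$
$\frac{\partial\, a^\top X^{-1} b}{\partial X} = -(X^{-1})^\top a b^\top (X^{-1})^\top$
$\frac{\partial\, x^\top a}{\partial x} = a^\top$
$\frac{\partial\, a^\top x}{\partial x} = a^\top$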
$\frac{\partial\, a^\top X b}{\partial X} = a b^\top$
$\frac{\partial\, x^\top B x}{\partial x} = x^\top (B^\top + B)$
$\frac{\partial}{\partial s} (x - As)^\top W (x - As) = -2 (x - As)^\top W A$ for symmetric $W$
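
Identities like these are easy to misremember (a transpose dropped, a factor of 2 lost), so it can help to check them numerically. The sketch below compares three of the identities against central finite differences: the gradient of a^T X b with respect to X, the gradient of x^T B x with respect to x, and the weighted least-squares gradient with respect to s. It uses only the Python standard library; every helper name (matvec, num_grad, max_abs_diff, and so on) and every numeric value is an illustrative assumption, not part of the original text.

```python
# Finite-difference sanity check for three gradient identities.
# Pure standard-library Python; all names and test values are illustrative.

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix M."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def transpose(M):
    return [list(col) for col in zip(*M)]

def num_grad(f, v, h=1e-5):
    """Central-difference gradient of a scalar function f at the flat list v."""
    grad = []
    for i in range(len(v)):
        up = v[:i] + [v[i] + h] + v[i + 1:]
        dn = v[:i] + [v[i] - h] + v[i + 1:]
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad

def max_abs_diff(u, v):
    return max(abs(p - q) for p, q in zip(u, v))

# --- d(a^T X b)/dX = a b^T, with X of shape 2 x 3 stored row by row ---
a = [1.0, -2.0]
b = [0.5, 3.0, -1.0]
X_flat = [1.0, 2.0, 0.0, -1.0, 0.5, 4.0]

def f_bilinear(flat):
    X = [flat[0:3], flat[3:6]]
    return dot(a, matvec(X, b))

grad_bilinear_num = num_grad(f_bilinear, X_flat)
grad_bilinear_ana = [ai * bj for ai in a for bj in b]  # entries of a b^T, row-major

# --- d(x^T B x)/dx = x^T (B^T + B), with a non-symmetric B ---
B = [[2.0, -1.0, 0.0], [3.0, 1.0, 5.0], [0.5, -2.0, 4.0]]
x3 = [1.0, 2.0, -1.0]

def f_quad(v):
    return dot(v, matvec(B, v))

grad_quad_num = num_grad(f_quad, x3)
B_sum = [[B[i][j] + B[j][i] for j in range(3)] for i in range(3)]
grad_quad_ana = matvec(B_sum, x3)  # B^T + B is symmetric, so this equals x^T (B^T + B)

# --- d/ds (x - A s)^T W (x - A s) = -2 (x - A s)^T W A, W symmetric ---
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
W = [[2.0, 0.5, 0.0], [0.5, 1.0, -1.0], [0.0, -1.0, 3.0]]
x = [1.0, -1.0, 2.0]
s = [0.5, -0.25]

def residual(sv):
    return [xi - yi for xi, yi in zip(x, matvec(A, sv))]

def f_wls(sv):
    r = residual(sv)
    return dot(r, matvec(W, r))

grad_wls_num = num_grad(f_wls, s)
Wr = matvec(W, residual(s))  # W r; equals (r^T W)^T because W is symmetric
grad_wls_ana = [-2.0 * g for g in matvec(transpose(A), Wr)]

for name, num, ana in [
    ("a^T X b", grad_bilinear_num, grad_bilinear_ana),
    ("x^T B x", grad_quad_num, grad_quad_ana),
    ("weighted least squares", grad_wls_num, grad_wls_ana),
]:
    print(name, "max abs difference:", max_abs_diff(num, ana))
```

The gradients are compared entry by entry as flat lists, so the check is about the values and not about whether a gradient is written as a row or a column. Because all three functions are at most quadratic in the differentiated variable, central differences should agree with the closed forms up to floating-point rounding.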