Overall framework of STiL. STiL encodes image-tabular data using $\phi$, decomposes modality-shared and -specific information through DCC $\psi$ (a), and outputs ...