Tensor Networks in Machine Learning: Recent Advances and Frontiers
Description
Tensor Networks (TNs) are efficient representations of high-order tensors as networks of many low-order tensors, and have long been studied in quantum physics and applied mathematics. In recent years, TNs have been increasingly investigated and applied in machine learning: for high-dimensional data analysis, for model compression and efficient computation in deep neural networks (DNNs), and for theoretical analysis of the expressive power of DNNs. This tutorial presents an overview of recent progress in TN technology for machine learning from three perspectives: TNs for data representation, for parameter modeling, and for function approximation. Specifically, we will introduce the fundamental models and algorithms of TNs, typical approaches in unsupervised learning, tensor completion, and multi-modal learning, as well as various applications to DNNs, CNNs, and RNNs. We will also discuss new frontiers and future trends in this research area.
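To make "a network of many low-order tensors" concrete, below is a minimal NumPy sketch of the TT-SVD algorithm of Oseledets [16], which factorizes a d-way array into a chain (tensor train) of 3-way cores by sequential truncated SVDs. The function name and the single `max_rank` truncation knob are our illustrative choices, not code from the tutorial.

```python
# Minimal sketch of TT-SVD [16]: factorize a d-way array into a "train"
# of 3-way cores G_k of shape (r_{k-1}, n_k, r_k) via sequential SVDs.
import numpy as np

def tt_svd(tensor, max_rank):
    dims = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r_new = min(max_rank, len(S))  # truncate the TT rank
        cores.append(U[:, :r_new].reshape(rank, dims[k], r_new))
        mat = (S[:r_new, None] * Vt[:r_new]).reshape(r_new * dims[k + 1], -1)
        rank = r_new
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

X = np.random.rand(4, 5, 6, 7)
cores = tt_svd(X, max_rank=3)
# Contract the chain left to right to get a low-rank approximation of X;
# it is exact whenever max_rank meets the true TT ranks.
approx = cores[0]
for G in cores[1:]:
    approx = np.tensordot(approx, G, axes=([-1], [0]))
approx = approx.reshape(X.shape)  # squeeze the size-1 boundary bonds
```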
Prerequisites
The audience of this tutorial is expected to have basic knowledge of multilinear algebra, tensor decomposition, machine learning, and deep neural networks.
Tutorial Outline
- Part I. Tensor Methods for Data Representation
Tensor Decomposition
Tensor Completion for Missing Values (see the completion sketch after this outline)
Latent Convex Tensor Decomposition
Tensor Network Diagram
Tensor Train and Tensor Ring Models
- Part II. Tensor Networks in Deep Learning Modeling
Model Compression of Neural Networks by Tensor Networks (see the compression sketch after this outline)
Learning Algorithms for Reparametrization by Tensor Networks
Supervised Learning with Quantum-Inspired Tensor Networks
Exponential Machines
Speedup and Compression of CNNs
Tensor Networks for Theoretical Analysis of DNNs
Multimodal Learning by Tensor Networks
Applications to RNNs, LSTMs, and Transformers
- Part III. Frontiers and Future Trends
TNs for Function Approximation in Supervised Learning
TN Representations of Probabilistic Graphical Models
Generative Modeling by TNs
Gaussian Mixture Distribution with Multi-dimensional Modes
Supervised Learning by Multi-scale TNs, 2D PEPS-type TNs, and Tree TNs
Structure Learning of Tensor Networks
Discussions
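To ground the completion topic of Part I, here is a minimal sketch of low-rank tensor completion, assuming a toy rank-3 CP model fitted by gradient descent on the observed entries only; the tensor sizes, 30% observation rate, step size, and iteration count are illustrative choices for this toy problem, not the tutorial's method.

```python
# Minimal sketch of low-rank tensor completion (Part I): fit a rank-R CP
# model to the observed entries of a 3-way tensor by gradient descent.
# Sizes, observation rate, step size, and iteration count are toy choices.
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 20, 20, 20, 3

# Ground-truth rank-3 tensor and a random mask of observed entries.
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
X = np.einsum('ir,jr,kr->ijk', A, B, C)
mask = rng.random(X.shape) < 0.3

# Factors to be learned, initialized small.
U, V, W = (0.1 * rng.standard_normal((n, R)) for n in (I, J, K))

lr = 0.005
for _ in range(5000):
    E = mask * (np.einsum('ir,jr,kr->ijk', U, V, W) - X)  # residual on observed entries only
    U -= lr * np.einsum('ijk,jr,kr->ir', E, V, W)
    V -= lr * np.einsum('ijk,ir,kr->jr', E, U, W)
    W -= lr * np.einsum('ijk,ir,jr->kr', E, U, V)

Xhat = np.einsum('ir,jr,kr->ijk', U, V, W)
print("RMSE on missing entries:", np.sqrt(np.mean((Xhat - X)[~mask] ** 2)))
```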
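And for the model-compression theme of Part II, a sketch of the parameter-count arithmetic behind TT layers in the spirit of [23], reusing the `tt_svd` routine from the earlier sketch. Note this is a simplification: we treat the reshaped weight as a plain TT tensor, whereas the TT-matrix format of [23] pairs input and output modes, but the compression arithmetic is the same idea.

```python
# Sketch of the compression arithmetic behind TT layers (Part II), reusing
# tt_svd from the earlier sketch; layer size and TT rank are illustrative.
import numpy as np

W = np.random.randn(256, 256)          # dense weight: 256 * 256 = 65,536 parameters
T = W.reshape(4, 4, 4, 4, 4, 4, 4, 4)  # tensorize into an 8-way 4 x ... x 4 array
cores = tt_svd(T, max_rank=8)          # cap every TT rank at 8
tt_params = sum(G.size for G in cores)
print(f"TT parameters: {tt_params} vs dense: {W.size}")
# A random matrix has full TT ranks, so this truncation is lossy; in practice
# the TT cores are trained end-to-end rather than obtained by truncation alone.
```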
Presenter
Qibin Zhao is the leader of the Tensor Learning Team at the RIKEN Center for Advanced Intelligence Project. He is also a visiting professor at the Saitama Institute of Technology and a visiting associate professor at the Tokyo University of Agriculture and Technology, Japan. Previously, he was a research scientist at the RIKEN Brain Science Institute from 2009 to 2017. His research interests include machine learning, tensor factorization and tensor networks, computer vision, and brain signal processing. He has published more than 120 scientific papers in international journals and conferences and two monographs on tensor networks (2016, 2017). He serves on the editorial board of the journal “Science China: Technological Sciences”, and as an Area Chair for NeurIPS 2020, ICML 2021, ICLR 2021, AISTATS 2021, IJCAI 2021, and ACML 2020. He is co-organizing two workshops on tensor networks, at NeurIPS 2020 [Link] and IJCAI 2020 [Link].
References
- [1]
Liu, X., Zhao, Q., & Walid, A. (2020). Tensor and Tensor Networks for Machine Learning: An Hourglass Architecture. International Workshop on Tensor Network Representations in Machine Learning, in conjunction with IJCAI 2020.
- [2]
Wang, M., Liu, M., Liu, J., Wang, S., Long, G., & Qian, B. (2017). Safe medicine recommendation via medical knowledge graph embedding. arXiv preprint arXiv:1710.05980.
- [3]
Vasilescu, M. A. O., & Terzopoulos, D. (2002). Multilinear analysis of image ensembles: TensorFaces. In European Conference on Computer Vision (ECCV), (pp. 447-460). Springer, Berlin, Heidelberg.
- [4]
De Lathauwer, L., De Moor, B., & Vandewalle, J. (2000). A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21(4), 1253-1278.
- [5]
Zhou, G., & Cichocki, A. (2012). Fast and unique Tucker decompositions via multiway blind source separation. Bulletin of the Polish Academy of Sciences. Technical Sciences, 60(3), 389-405.
- [6]
Anandkumar, A., Ge, R., Hsu, D., & Kakade, S. M. (2014). A tensor approach to learning mixed membership community models. The Journal of Machine Learning Research, 15(1), 2239-2312.
- [7]
Zhao, Q., Zhang, L., & Cichocki, A. (2015). Bayesian CP factorization of incomplete tensors with automatic rank determination. IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 37(9), 1751-1763.
- [8]
He, W., Yao, Q., Li, C., Yokoya, N., & Zhao, Q. (2019). Non-local meets global: An integrated paradigm for hyperspectral denoising. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (pp. 6861-6870). IEEE.
- [9]
Li, C., He, W., Yuan, L., Sun, Z., & Zhao, Q. (2019). Guaranteed Matrix Completion Under Multiple Linear Transformations. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (pp. 11128-11137). IEEE.
- [10]
Tomioka, R., Hayashi, K., & Kashima, H. (2010). Estimation of low-rank tensors via convex optimization. arXiv preprint arXiv:1010.0789.
- [11]
Tomioka, R., & Suzuki, T. (2013). Convex tensor decomposition via structured Schatten norm regularization. In Advances in Neural Information Processing Systems (NeurIPS), (pp. 1331-1339).
- [12]
Li, C., Khan, M. E., Sun, Z., Niu, G., Han, B., Xie, S., & Zhao, Q. (2020). Beyond Unfolding: Exact Recovery of Latent Convex Tensor Decomposition under Reshuffling. In Proceedings of the AAAI Conference on Artificial Intelligence.
- [13]
Imaizumi, M., Maehara, T., & Hayashi, K. (2017). On tensor train rank minimization: Statistical efficiency and scalable algorithm. In Advances in Neural Information Processing Systems (NeurIPS), (pp. 3930-3939).
- [14]
Yuan, L., Li, C., Mandic, D., Cao, J., & Zhao, Q. (2019). Tensor ring decomposition with rank minimization on latent space: An efficient approach for tensor completion. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, pp. 9151-9158).
- [15]
Orús, R. (2014). A practical introduction to tensor networks: Matrix product states and projected entangled pair states. Annals of Physics, 349, 117-158.
- [16]
Oseledets, I. V. (2011). Tensor-train decomposition. SIAM Journal on Scientific Computing, 33(5), 2295-2317.
- [17]
Zhao, Q., Sugiyama, M., Yuan, L., & Cichocki, A. (2019). Learning efficient tensor representations with ring-structured networks. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (pp. 8608-8612). IEEE.
- [18]
Cichocki, A., Lee, N., Oseledets, I., Phan, A. H., Zhao, Q., & Mandic, D. P. (2016). Tensor networks for dimensionality reduction and large-scale optimization: Part 1 low-rank tensor decompositions. Foundations and Trends in Machine Learning, 9(4-5), 249-429.
- [19]
Cichocki, A., Phan, A. H., Oseledets, I., Zhao, Q., Sugiyama, M., Lee, N., & Mandic, D. (2017). Tensor networks for dimensionality reduction and large-scale optimizations: Part 2 applications and future perspectives. Foundations and Trends in Machine Learning, 9(6), 431-673.
- [20]
Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. In Advances in neural information processing systems (NeurIPS), (pp. 1135-1143).
- [21]
Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
- [22]
Han, S., Mao, H., & Dally, W. J. (2015). Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149.
- [23]
Novikov, A., Podoprikhin, D., Osokin, A., & Vetrov, D. P. (2015). Tensorizing neural networks. In Advances in neural information processing systems (NeurIPS), (pp. 442-450).
- [24]
Stoudenmire, E., & Schwab, D. J. (2016). Supervised learning with tensor networks. In Advances in Neural Information Processing Systems (NeurIPS), (pp. 4799-4807).
- [25]
Novikov, A., Trofimov, M., & Oseledets, I. (2016). Exponential machines. arXiv preprint arXiv:1605.03795.
- [26]
Ye, J., Wang, L., Li, G., Chen, D., Zhe, S., Chu, X., & Xu, Z. (2018). Learning compact recurrent neural networks with block-term tensor decomposition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (pp. 9378-9387).
- [27]
Pan, Y., Xu, J., Wang, M., Ye, J., Wang, F., Bai, K., & Xu, Z. (2019). Compressing recurrent neural networks with tensor ring for action recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, (Vol. 33, pp. 4683-4690).
- [28]
Kossaifi, J., Lipton, Z. C., Kolbeinsson, A., Khanna, A., Furlanello, T., & Anandkumar, A. (2020). Tensor regression networks. Journal of Machine Learning Research, 21(123), 1-21.
- [29]
Lebedev, V., Ganin, Y., Rakhuba, M., Oseledets, I., & Lempitsky, V. (2015). Speeding-up convolutional neural networks using fine-tuned CP-decomposition. In 3rd International Conference on Learning Representations (ICLR).
- [30]
Tai, C., Xiao, T., Zhang, Y., Wang, X., & Weinan, E. (2016). Convolutional neural networks with low-rank regularization. In 4th International Conference on Learning Representations (ICLR).
- [31]
Kim, Y. D., Park, E., Yoo, S., Choi, T., Yang, L., & Shin, D. (2016). Compression of deep convolutional neural networks for fast and low power mobile applications. In 4th International Conference on Learning Representations (ICLR).
- [32]
Hayashi, K., Yamaguchi, T., Sugawara, Y., & Maeda, S. I. (2019). Exploring Unexplored Tensor Network Decompositions for Convolutional Neural Networks. In Advances in Neural Information Processing Systems (NeurIPS), (pp. 5552-5562).
- [33]
Li, C., Sun, Z., Yu, J., Hou, M., & Zhao, Q. (2019). Low-rank Embedding of Kernels in Convolutional Neural Networks under Random Shuffling. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (pp. 3022-3026). IEEE.
- [34]
Li, J., Sun, Y., Su, J., Suzuki, T., & Huang, F. (2020). Understanding Generalization in Deep Learning via Tensor Methods. arXiv preprint arXiv:2001.05070.
- [35]
Cohen, N., & Shashua, A. (2016). Convolutional rectifier networks as generalized tensor decompositions. In International Conference on Machine Learning (ICML), (pp. 955-963).
- [36]
Levine, Y., Yakira, D., Cohen, N., & Shashua, A. (2018). Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design. In International Conference on Learning Representations (ICLR).
- [37]
Cohen, N., Sharir, O., & Shashua, A. (2016). On the expressive power of deep learning: A tensor analysis. In Conference on learning theory (COLT), (pp. 698-728).
- [38]
Khrulkov, V., Novikov, A., & Oseledets, I. (2018). Expressive power of recurrent neural networks. In International Conference on Learning Representations (ICLR).
- [39]
Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Lawrence Zitnick, C., & Parikh, D. (2015). VQA: Visual question answering. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), (pp. 2425-2433).
- [40]
Zadeh, A., Chen, M., Poria, S., Cambria, E., & Morency, L. P. (2017). Tensor Fusion Network for Multimodal Sentiment Analysis. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, (pp. 1103-1114).
- [41]
Liu, Z., Shen, Y., Lakshminarasimhan, V. B., Liang, P. P., Zadeh, A. B., & Morency, L. P. (2018). Efficient Low-rank Multimodal Fusion With Modality-Specific Factors. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, (pp. 2247-2256).
- [42]
Ben-Younes, H., Cadene, R., Cord, M., & Thome, N. (2017). MUTAN: Multimodal Tucker fusion for visual question answering. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), (pp. 2612-2620).
- [43]
Ma, X., Zhang, P., Zhang, S., Duan, N., Hou, Y., Zhou, M., & Song, D. (2019). A tensorized transformer for language modeling. In Advances in Neural Information Processing Systems (NeurIPS), (pp. 2232-2242).
- [44]
Liang, P. P., Liu, Z., Tsai, Y. H. H., Zhao, Q., Salakhutdinov, R., & Morency, L. P. (2019). Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, (pp. 1569-1576).
- [45]
Hou, M., Tang, J., Zhang, J., Kong, W., & Zhao, Q. (2019). Deep multimodal multilinear fusion with high-order polynomial pooling. In Advances in Neural Information Processing Systems (NeurIPS), (pp. 12136-12145).
- [46]
Yu, R., Zheng, S., Anandkumar, A., & Yue, Y. (2017). Long-term forecasting using higher order tensor RNNs. arXiv preprint arXiv:1711.00073.
- [47]
Su, J., Byeon, W., Huang, F., Kautz, J., & Anandkumar, A. (2020). Convolutional Tensor-Train LSTM for Spatio-temporal Learning. In Advances in Neural Information Processing Systems (NeurIPS).
- [48]
Kargas, N., & Sidiropoulos, N. D. (2020). Nonlinear System Identification via Tensor Completion. In Proceedings of the AAAI Conference on Artificial Intelligence.
- [49]
Glasser, I., Sweke, R., Pancotti, N., Eisert, J., & Cirac, J. I. (2019). Expressive power of tensor-network factorizations for probabilistic modeling. In Advances in Neural Information Processing Systems (NeurIPS), (pp. 1498-1510).
- [50]
Robeva, E., & Seigal, A. (2019). Duality of graphical models and tensor networks. Information and Inference: A Journal of the IMA, 8(2), 273-288.
- [51]
Ran, S. J. (2019). Bayesian Tensor Network with Polynomial Complexity for Probabilistic Machine Learning. arXiv preprint.
- [52]
Han, Z. Y., Wang, J., Fan, H., Wang, L., & Zhang, P. (2018). Unsupervised generative modeling using matrix product states. Physical Review X, 8(3), 031012.
- [53]
Kuznetsov, M., Polykovskiy, D., Vetrov, D. P., & Zhebrak, A. (2019). A prior of a Googol Gaussians: a tensor ring induced prior for generative models. In Advances in Neural Information Processing Systems (NeurIPS), (pp. 4102-4112).
- [54]
Stoudenmire, E. M. (2018). Learning relevant features of data with multi-scale tensor networks. Quantum Science and Technology, 3(3), 034003.
- [55]
Selvan, R., & Dam, E. B. (2020). Tensor Networks for Medical Image Classification. In Medical Imaging with Deep Learning (MIDL).
- [56]
Glasser, I., Pancotti, N., & Cirac, J. I. (2020). From probabilistic graphical models to generalized tensor networks for supervised learning. IEEE Access, 8, 68169-68182.
- [57]
Cheng, S., Wang, L., & Zhang, P. (2020). Supervised Learning with Projected Entangled Pair States. arXiv preprint arXiv:2009.09932.
- [58]
Liu, D., Ran, S. J., Wittek, P., Peng, C., García, R. B., Su, G., & Lewenstein, M. (2019). Machine learning by unitary tensor network of hierarchical tree structure. New Journal of Physics, 21(7), 073059.
- [59]
Sheikh, H. R., & Bovik, A. C. (2006). Image information and visual quality. IEEE Transactions on Image Processing, 15(2), 430-444.
- [60]
Li, C., & Sun, S. (2020). Evolutionary topology search for tensor network decomposition. In Proceedings of the 37th International Conference on Machine Learning (ICML).