**Title:**Machine-Proving of Entropy Inequalities

**Abstract:**The entropy function plays a central role in information theory. Constraints on the entropy function in the form of inequalities, viz. entropy inequalities (often conditional on certain Markov conditions imposed by the problem under consideration), are indispensable tools for proving converse coding theorems. In this talk, I will give an overview of the development of machine-proving of entropy inequalities over the past 25 years. To start, I will present a geometrical framework for the entropy function and explain how an entropy inequality can be formulated, with or without constraints on the entropy function. Among all entropy inequalities, Shannon-type inequalities, namely those implied by the nonnegativity of Shannon's information measures, are best understood. I will focus on the proving of Shannon-type inequalities, which in fact can be formulated as a linear programming problem. I will discuss ITIP, a software package originally developed for this purpose in the mid-1990s, as well as some of its later variants. In 2014, Tian successfully characterized the rate region of a class of exact-repair regenerating codes by means of a variant of ITIP. This was the first nontrivial converse coding theorem proved by a machine. At the end of the talk, I will discuss some recent progress in speeding up the proving of entropy inequalities.
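As a toy illustration of the linear programming view mentioned in the abstract (a minimal sketch, not the actual ITIP implementation), one can check whether a candidate inequality is Shannon-type by minimizing it over the cone cut out by the elemental inequalities; a minimum of zero means the inequality is implied. For two random variables the complete elemental set has just three inequalities:

```python
from scipy.optimize import linprog

# Entropy vector for (X, Y): h = (H(X), H(Y), H(X,Y)).
# Elemental (Shannon-type) inequalities written as A_ub @ h <= 0:
A_ub = [
    [ 1,  0, -1],   # H(X) <= H(X,Y)        (monotonicity)
    [ 0,  1, -1],   # H(Y) <= H(X,Y)        (monotonicity)
    [-1, -1,  1],   # H(X,Y) <= H(X) + H(Y) (submodularity)
]
b_ub = [0, 0, 0]

# To verify that I(X;Y) = H(X) + H(Y) - H(X,Y) >= 0 is Shannon-type,
# minimize it over the cone above: a minimum of 0 means it is implied.
res = linprog(c=[1, 1, -1], A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
```

ITIP and its variants work with the full set of elemental inequalities over all subsets of the random variables, which grows quickly with the number of variables; the three constraints above are only the two-variable case.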

**Bio:**Raymond W. Yeung is the Choh-Ming Li Professor of Information Engineering at The Chinese University of Hong Kong (CUHK). He received his B.S., M.Eng., and Ph.D. degrees from Cornell University in Electrical Engineering in 1984, 1985, and 1988, respectively. Before joining CUHK in 1991, he was a Member of Technical Staff at AT&T Bell Laboratories. A co-founder of the field of network coding, he has been serving as Co-Director of the Institute of Network Coding at CUHK since 2010. He is the author of the books A First Course in Information Theory (Kluwer Academic/Plenum Publishers, 2002) and Information Theory and Network Coding (Springer, 2008), which have been adopted by over 100 institutions around the world. In spring 2014, he gave the first MOOC in the world on information theory that reached over 25,000 students. He is a recipient of the 2005 IEEE Information Theory Society Paper Award, the Friedrich Wilhelm Bessel Research Award from the Alexander von Humboldt Foundation in 2007, the 2016 IEEE Eric E. Sumner Award, the 2018 ACM SIGMOBILE Test-of-Time Paper Award, the 2021 IEEE Richard W. Hamming Medal, and the 2022 Claude E. Shannon Award. In 2015, he was named an Outstanding Overseas Chinese Information Theorist by the China Information Theory Society. He is a Fellow of the IEEE, the Hong Kong Academy of Engineering Sciences, the Hong Kong Institution of Engineers, and the US National Academy of Engineering.

#### Yi Ma

Director & Chair Professor

HKU Musketeers Foundation Institute of Data Science; Head, Department of Computer Science, HKU

**Title:**The Past, Present, and Future of Intelligence: from Artificial Intelligence to Autonomous Intelligence

**Abstract:**In this talk, we provide a more systematic and principled view of the practice of artificial intelligence, informed by the history of the study of intelligence. We argue that the most fundamental objective of intelligence is to learn a parsimonious and structured representation of the sensed world that maximizes the internal information gain. This objective naturally leads to a unifying computational framework that integrates fundamental ideas from information theory, optimization, feedback control, and game theory, and hence connects us back to the true origin of the study of intelligence 80 years ago. We contend that this new framework provides a unifying understanding and white-box explanation for almost all recent and current practices of artificial intelligence based on deep networks/learning. Perhaps most importantly, it reveals a much broader and brighter future for developing next-generation autonomous intelligent systems that truly emulate the computational mechanisms of natural intelligence.

**Bio:**Yi Ma is the inaugural director of the Data Science Institute and the head of the Computer Science Department of the University of Hong Kong. He is also a professor in the EECS Department at the University of California, Berkeley. His research interests include computer vision, high-dimensional data analysis, and integrated intelligent systems. Yi received two bachelor's degrees, in Automation and Applied Mathematics, from Tsinghua University in 1995, two master's degrees, in EECS and Mathematics, in 1997, and a Ph.D. degree in EECS from UC Berkeley in 2000. He was on the faculty of the ECE Department at UIUC from 2000 to 2011, principal researcher and manager of the Visual Computing group at Microsoft Research Asia from 2009 to 2014, and Executive Dean of the School of Information Science and Technology of ShanghaiTech University from 2014 to 2017. He joined the faculty of UC Berkeley EECS in 2018. He has published over 60 journal papers, 120 conference papers, and three textbooks on computer vision, generalized PCA, and high-dimensional data analysis. He received the NSF CAREER Award in 2004 and the ONR Young Investigator Award in 2005. He also received the David Marr Prize in computer vision at ICCV 1999 and best paper awards at ECCV 2004 and ACCV 2009. He served as Program Chair for ICCV 2013 and General Chair for ICCV 2015. He is a Fellow of the IEEE, ACM, and SIAM.

#### Masahito Hayashi

Professor, IEEE Fellow and IMS Fellow

The Chinese University of Hong Kong, Shenzhen

**Title:**Iterative Minimization Algorithm on Mixture Family

**Abstract:**Iterative minimization algorithms appear in various areas, including machine learning, neural networks, and information theory. The em algorithm is a famous iterative minimization algorithm in machine learning, and the Arimoto-Blahut algorithm is a typical one in information theory. However, these two topics had long been studied separately. In this work, we generalize an algorithm that was recently proposed in the context of the Arimoto-Blahut algorithm. We then show various convergence theorems, one of which covers the case where each iterative step is carried out only approximately. We also apply this algorithm to the target problem of the em algorithm and propose an improvement, as well as to various other problems in information theory. The paper is available at https://arxiv.org/abs/2302.06905.
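For readers unfamiliar with the Arimoto-Blahut algorithm referenced above, a minimal sketch of its classical form, computing the capacity of a discrete memoryless channel, is given below; the channel matrix and iteration count are illustrative choices, not taken from the talk:

```python
import numpy as np

def blahut_arimoto(W, iters=300):
    # W[x, y] = P(y|x); returns a capacity estimate in bits.
    n_in = W.shape[0]
    p = np.full(n_in, 1.0 / n_in)        # input distribution, start uniform
    for _ in range(iters):
        q = p @ W                        # induced output distribution
        # d[x] = D(W[x, :] || q), the per-input relative entropy
        with np.errstate(divide='ignore', invalid='ignore'):
            log_ratio = np.where(W > 0, np.log2(W / q), 0.0)
        d = (W * log_ratio).sum(axis=1)
        p = p * np.exp2(d)               # multiplicative (fixed-point) update
        p /= p.sum()
    q = p @ W
    with np.errstate(divide='ignore', invalid='ignore'):
        log_ratio = np.where(W > 0, np.log2(W / q), 0.0)
    return float(p @ (W * log_ratio).sum(axis=1))

# Binary symmetric channel with crossover probability 0.1:
W = np.array([[0.9, 0.1], [0.1, 0.9]])
cap = blahut_arimoto(W)   # ≈ 1 - h2(0.1) ≈ 0.5310 bits
```

Each iteration alternately updates the output distribution induced by the current input distribution and reweights the input distribution toward inputs with higher relative entropy to the output, which is exactly the alternating-minimization structure the abstract generalizes.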

**Bio:**Masahito Hayashi received the B.S. degree from the Faculty of Sciences, Kyoto University, Japan, in 1994, and the M.S. and Ph.D. degrees in Mathematics from Kyoto University, Japan, in 1996 and 1999, respectively. He worked at Kyoto University as a Research Fellow of the Japan Society for the Promotion of Science (JSPS) from 1998 to 2000, and at the Laboratory for Mathematical Neuroscience, Brain Science Institute, RIKEN as a Researcher from 2000 to 2003. He worked in the ERATO Quantum Computation and Information Project, Japan Science and Technology Agency (JST) as Research Head from 2003 to 2006, and in the ERATO-SORST Quantum Computation and Information Project, JST as a Group Leader from 2006 to 2007. He worked in the Graduate School of Information Sciences, Tohoku University as an Associate Professor from 2007 to 2012. In 2012, he joined the Graduate School of Mathematics, Nagoya University as a Full Professor. He worked at the Shenzhen Institute for Quantum Science and Engineering, Southern University of Science and Technology, Shenzhen, China as a Chief Research Scientist from 2020 to 2023. In 2023, he joined the School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen) as a Full Professor, and the International Quantum Academy (SIQA) as a Chief Research Scientist. In 2011, he received the IEEE Information Theory Society Paper Award for "Information-Spectrum Approach to Second-Order Coding Rate in Channel Coding". In 2016, he received the Japan Academy Medal from the Japan Academy and the JSPS Prize from the Japan Society for the Promotion of Science. In 2017, he was elected an IEEE Fellow. In 2022, he was elected an IMS Fellow and an Asia-Pacific Artificial Intelligence Association (AAIA) Fellow. His research interests include classical and quantum information theory and classical and quantum statistical inference.

**Title:**Decomposition of Multi-Variate Dependence and Application in Parameterized Inference

**Abstract:**In this talk, we report recent progress in neural-network-based modal decomposition. The overarching theme is that we should view learning as selecting feature functions in a vector space, whose geometry provides critical insights into the design of learning algorithms, including neural networks. This talk introduces neural network modules that compute projections in this vector space. With these as basic building blocks, we can decompose the dependence between more than two random variables. We argue that this technique is the counterpart of basic concepts like the chain rule and random binning in classical information theory. We demonstrate that this result can be used in problems such as designing receivers that can take parameters as side information, which is widely used in communication problems.

**Bio:**Lizhong Zheng received the B.S. and M.S. degrees, in 1994 and 1997 respectively, from the Department of Electronic Engineering, Tsinghua University, China, and the Ph.D. degree, in 2002, from the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley. Since 2002, he has been with MIT, where he is currently a Professor of Electrical Engineering. His research interests include information theory, statistical inference, communications, and network theory.

**Title:**On Information Theoretic Generalization Error Bounds for Machine Learning Algorithms

**Abstract:**Recent advances in machine learning algorithms reveal the deficiency of existing approaches in understanding the generalization behaviors of such algorithms. Information theoretic generalization error bounds have emerged as a promising approach, which allows the incorporation of both restrictions on the data distribution and the interaction between the training data and the algorithms. In this talk we shall discuss several results that we have developed using this approach. The first is a conditional-mutual-information-based generalization error bound, the second is a new chaining approach that can be used to boost existing information theoretic generalization error bounds, and the third is a matching approach to construct tight generalization error bounds for specific problems. Akin to traditional information theoretic studies in communications, the Gaussian setting plays an important role in these studies of information theoretic generalization error.

**Bio:**Dr. Chao Tian received the B.E. degree in Electronic Engineering from Tsinghua University, Beijing, China, in 2000 and the M.S. and Ph.D. degrees in Electrical and Computer Engineering from Cornell University, Ithaca, NY in 2003 and 2005, respectively. Dr. Tian was a postdoctoral researcher at Ecole Polytechnique Federale de Lausanne (EPFL) from 2005 to 2007, a Member of Technical Staff - Research at AT&T Labs-Research in New Jersey from 2007 to 2014, and an Associate Professor in the Department of Electrical Engineering and Computer Science at the University of Tennessee Knoxville from 2014 to 2017. He joined the Department of Electrical and Computer Engineering at Texas A&M University in 2017, where he is now an Associate Professor. His research interests include machine learning, data storage systems, multi-user information theory, and optimization. Dr. Tian received the Liu Memorial Award at Cornell University in 2004, and the AT&T Key Contributor Award in 2010, 2011, and 2013. His authored and co-authored papers received the 2014 IEEE Data Storage Best Paper Award, the 2017 IEEE Jack Keil Wolf ISIT Student Paper Award, and the 2020-2021 IEEE Data Storage Best Student Paper Award. He was an Associate Editor for the IEEE Signal Processing Letters during 2012-2014, an Editor for the IEEE Transactions on Communications during 2016-2021, and an Associate Editor for the IEEE Transactions on Information Theory during 2020-2023.

**Title:**On Sampling Continuous-Time AWGN Channels

**Abstract:**For a continuous-time additive white Gaussian noise (AWGN) channel, possibly with feedback, it has been shown that as sampling gets infinitesimally fine, the mutual information of the associated discrete-time channels converges to that of the original continuous-time channel. In this talk, we give more quantitative strengthenings of this result, which, among other implications, characterize how over-sampling approaches the true mutual information of a continuous-time Gaussian channel with a bandwidth limit. The assumptions in our results are relatively mild. In particular, for the non-feedback case, compared to the Shannon-Nyquist sampling theorem, a widely used tool for connecting continuous-time Gaussian channels to their discrete-time counterparts that requires band-limitedness of the channel input, our results only require some integrability conditions on the power spectral density function of the input.

**Bio:**Guangyue Han received the B.S. and M.S. degrees in mathematics from Peking University, China, in 1997 and 2000, respectively, and the Ph.D. degree in mathematics from the University of Notre Dame, USA, in 2004. After three years with the Department of Mathematics, the University of British Columbia, Canada, he joined the Department of Mathematics, the University of Hong Kong, China, in 2007. His main research areas are coding and information theory.

#### Rui Zhang

X.Q. Deng Presidential Chair Professor

Shenzhen Research Institute of Big Data; The Chinese University of Hong Kong, Shenzhen

**Title:**Intelligent Reflecting Surface (IRS) Empowered 6G: Fundamentals, Applications and Challenges

**Abstract:**In this talk, we introduce a new promising paradigm for future wireless networks (6G) by leveraging a massive number of low-cost passive elements with controllable signal reflection, named Intelligent Reflecting Surface (IRS), which can smartly reconfigure wireless channels for enhancing the communication performance. First, we present the signal and channel models of IRS by considering its hardware constraints in practice. We then illustrate the main functions and applications of IRS in achieving spectral and energy efficient wireless networks and highlight its cost and performance advantages as compared to existing wireless technologies. Next, we focus on the main design challenges for efficiently integrating IRS into future wireless networks, including passive reflection optimization, IRS channel acquisition, and IRS deployment, and overview their state-of-the-art solutions, with an emphasis on the recent progress in multi-IRS multi-reflection aided wireless networks via a new “beam routing” approach. Finally, we discuss other extensions and point out directions worthy of investigation in the future.

**Bio:**Dr. Rui Zhang (Fellow of IEEE, Fellow of the Academy of Engineering Singapore) received the Ph.D. degree from Stanford University in electrical engineering in 2007. He is currently a Principal Research Scientist at the Shenzhen Research Institute of Big Data. He is also a Principal's Diligence Chair Professor in the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen. His current research interests include wireless information and power transfer, UAV/satellite communications, intelligent reflecting surfaces, and reconfigurable MIMO. He has published over 500 papers, which have been cited more than 75,000 times, with an h-index over 120. He has been listed as a Highly Cited Researcher by Thomson Reuters / Clarivate Analytics since 2015. He was the recipient of the IEEE Communications Society Asia-Pacific Region Best Young Researcher Award in 2011, the Young Researcher Award of the National University of Singapore in 2015, the Wireless Communications Technical Committee Recognition Award in 2020, and the IEEE Signal Processing and Computing for Communications (SPCC) Technical Recognition Award in 2021. He has received 17 IEEE Best Journal Paper Awards, including the IEEE Marconi Prize Paper Award in Wireless Communications (twice), the IEEE Communications Society Heinrich Hertz Prize Paper Award (thrice), the IEEE Communications Society Stephen O. Rice Prize, and the IEEE Signal Processing Society Best Paper Award. He has served as an Editor for several IEEE journals, including TWC, TCOM, JSAC, TSP, and TGCN, and as TPC co-chair or organizing committee member for over 30 international conferences. He has served as an IEEE Distinguished Lecturer of the IEEE Communications Society and the IEEE Signal Processing Society.

**Title:**Time-varying Topological Signal Processing

**Abstract:**Graphs provide an effective framework for the analysis of multi-variate data. Research in Graph Signal Processing and Graph Neural Networks has advanced to the point of providing state-of-the-art solutions in applications ranging from traffic data analysis to meteorological data analysis. However, graphs quantify only one-to-one relationships between variables. Topological data analysis, an extension of graph data analysis, offers much greater potential for uncovering hidden relationships among multiple variables. We present the extension of basic graph signal processing to simplicial graphs and give the fundamentals of topological data analysis. This new formulation also allows us to address two important challenges of the adaptive graph signal processing problem: non-Gaussian data and time-varying graphs where not only the node values but also the branch weights change over time. We present new algorithms providing time-varying solutions robust to non-Gaussian noise and data. Finally, we present an algorithm that can estimate time-varying node and branch values simultaneously.

**Bio:**Ercan E. Kuruoğlu received MPhil and PhD degrees in information engineering from the University of Cambridge, United Kingdom, in 1995 and 1999, respectively. In 1998, he joined Xerox Research Centre Europe, Cambridge. He was an ERCIM fellow in 2000 with INRIA-Sophia Antipolis, France. In January 2002, he joined the Institute of Science and Technology of Information, CNR (Italian National Council of Research), Pisa, Italy, where he became a Chief Scientist in 2020. He has been a Full Professor at the Tsinghua-Berkeley Shenzhen Institute since 2022. He served as an Associate Editor for the IEEE Transactions on Signal Processing and the IEEE Transactions on Image Processing. He was the Editor-in-Chief of Digital Signal Processing: A Review Journal from 2011 to 2021, and is currently co-Editor-in-Chief of the Journal of the Franklin Institute. He acted as a Technical co-Chair for EUSIPCO 2006 and a Tutorials co-Chair of ICASSP 2014. He is a member of the IEEE Signal Processing Society Data Challenges Committee, the IEEE Technical Committee (TC) on Machine Learning for Signal Processing, and the TC on Image, Video and Multidimensional Signal Processing. He was a plenary speaker at DAC 2007, ISSPA 2010, IEEE SIU 2017, Entropy 2018, and MIIS 2020, and a tutorial speaker at IEEE ICSPCC 2012. He was an Alexander von Humboldt Experienced Research Fellow at the Max Planck Institute for Molecular Genetics in 2013-2015. His research interests are in the areas of statistical signal and image processing, Bayesian machine learning, and information theory, with applications in remote sensing, environmental sciences, telecommunications, and computational biology. He speaks English, Italian, German, and Turkish.

**Title:**A Machine Learning Approach for Estimating and Achieving Capacity Regions of Communication Systems

**Abstract:**In this talk we will develop a principled framework for neural estimation and optimization of information measures, specifically directed information, which is then leveraged to estimate the feedforward and feedback capacities of general channels and to design a neural-network-based decoder for polar codes. To that end, we propose a novel Directed Information Neural Estimator (DINE) that complements the Mutual Information Neural Estimation (MINE), and then develop methods for optimizing DINE and MINE over the channel input distributions. More specifically, two optimization methods are proposed, one for continuous channel input spaces and the other for discrete ones. While capacity estimation is the main application considered in this talk, we will discuss how the developed estimation and optimization techniques apply in additional scenarios where (maximized) directed information is of interest, such as probability density estimation for processes with memory, causality identification, and machine learning in general.

**Bio:**Haim Permuter received his B.Sc. (summa cum laude) from Ben-Gurion University (BGU) and his Ph.D. from Stanford University, both in Electrical Engineering, in 1997 and 2008, respectively. From 1997 to 2004, he served as a scientific research officer in an R&D unit of the Israeli Defense Forces. In summer 2002 he worked for IBM at the Almaden Research Center. He is a recipient of several awards, including the Eshkol Fellowship, the Wolf Award, the Fulbright Fellowship, the Stanford Graduate Fellowship, the U.S.-Israel Binational Science Foundation Bergmann Memorial Award, and the Allon Fellowship. Haim joined the Electrical Engineering Department at BGU in October 2008 as tenure-track faculty, and is now a Professor and the Luck-Hille Chair in Electrical Engineering. He also serves as head of the communication, cyber, and information track in his department. He served on the editorial board of the IEEE Transactions on Information Theory in 2013-2016 and has served again since January 2023.

**Title:**Certain Families of Non-convex Optimization Problems in Information Theory with Unique Local Maximizers

**Abstract:**Non-convex optimization problems arise often in information theory, especially in evaluating inner and outer bounds on the capacity regions of various multiuser settings. Multiuser settings with additive Gaussian noise are a commonly used model for communication in wireless networks. For Gaussian broadcast channels, the capacity region was established by evaluating the outer bound and demonstrating that the outer bound is optimized by Gaussian distributions. Further, using ideas inspired by Dirty Paper Coding, it was shown that the outer and inner bounds coincide. From an information theorist's perspective, the question of determining the capacity region of a broadcast channel seemed settled. However, from an algorithmic point of view, the non-convexity of the resulting optimization problem over Gaussian distributions poses a significant computational challenge in evaluating the capacity region. In this talk, we will show that this non-convex optimization problem has a unique local maximizer, thus making techniques like gradient descent feasible. The technique for proving optimality of local maximizers turns out to be a consequence of sub-additivity of the underlying functional, the same idea that led to the discovery of an alternate technique for deriving the Gaussian optimality of the auxiliaries. Finally, I will present a simple observation (conjecture) about the maximizers whose proof would lead to further decoupling of the optimization problem.

**Bio:**Chandra Nair is a Professor with the Information Engineering Department at The Chinese University of Hong Kong. He also serves as the Programme Director of the undergraduate programme on Mathematics and Information Engineering. Chandra Nair received his Bachelor's degree, B.Tech (EE), from IIT Madras (India) and his Ph.D. degree from the EE Department of Stanford University. He has been an Associate Editor for the IEEE Transactions on Information Theory and was a Distinguished Lecturer of the IEEE Information Theory Society. He is a Fellow of the IEEE and a co-recipient of the 2016 Information Theory Society Paper Award. His recent research interests and contributions are in developing ideas, tools, and techniques to tackle families of combinatorial and non-convex optimization problems arising primarily in the information sciences. More about his research can be found at http://chandra.ie.cuhk.edu.hk/pub/research-summary.pdf

**Title:**Information Theoretic Results in Semantic and Task-Oriented Communications

**Abstract:**Many emerging applications, from autonomous driving, to healthcare and Internet of things, involve connecting machines with humans and other machines, where the goal is to enable the receiver to make the right inference or to take the right action at the right time and context. This requires identifying and sending only the most relevant information for the underlying task, which can significantly reduce the communication load, and has motivated the recent resurgence of interest in semantic and task-oriented communications. In this talk, I will present new sets of performance requirements and design constraints motivated by these applications, and present the results of our recent efforts in designing semantic and pragmatic communication systems bringing together ideas and tools from classical information and coding theory with modern machine learning algorithms.

**Bio:**Deniz Gündüz received his M.S. and Ph.D. degrees in electrical engineering from NYU Tandon School of Engineering (formerly Polytechnic University) in 2004 and 2007, respectively. After his Ph.D., he served as a postdoctoral research associate at Princeton University, as a consulting assistant professor at Stanford University, and as a research associate at CTTC. In 2012, he joined the Electrical and Electronic Engineering Department of Imperial College London, UK, where he is currently a Professor of Information Processing and serves as the deputy head of the Intelligent Systems and Networks Group. He has held visiting/part-time positions at the University of Modena and Reggio Emilia (2019-2022), the University of Padova (2018-2020), and Princeton University (2009-2012). His research interests lie in the areas of communications and information theory, machine learning, and privacy. Dr. Gündüz is a Fellow of the IEEE. He is an Area Editor for the IEEE Transactions on Information Theory and the IEEE Transactions on Communications, and an Editor of the IEEE Transactions on Wireless Communications. He is the recipient of the IEEE Communications Society - Communication Theory Technical Committee (CTTC) Early Achievement Award in 2017, the Starting (2016), Consolidator (2022), and Proof-of-Concept (2023) grants of the European Research Council (ERC), and several best paper awards.

#### Ruoyu Sun

Associate Professor

Shenzhen International Center for Industrial and Applied Mathematics; Shenzhen Research Institute of Big Data; The Chinese University of Hong Kong, Shenzhen

**Title:**Converge or Diverge? A Story of Adam

**Abstract:**Adam is one of the most popular algorithms in deep learning, used in many applications including ChatGPT. Despite its popularity, the theoretical properties of Adam were largely unknown, and how to tune Adam was not clear. Reddi et al. (2018) pointed out the divergence issue of Adam, and since then many variants of Adam have been proposed. However, vanilla Adam remains exceptionally popular and works well in practice. Why is there a gap between theory and practice? We point out a mismatch between the settings of theory and practice: Reddi et al. (2018) pick the problem after picking the hyperparameters of Adam, i.e., (β1, β2), while practical applications often fix the problem first and then tune (β1, β2). We conjecture that in the latter practical setting, i.e., when tuning of hyperparameters is allowed, Adam can converge. In this talk, we present our recent findings that confirm this conjecture. More specifically, we show that when the second-order momentum parameter β2 is large enough and the first-order momentum parameter satisfies β1 < sqrt(β2) < 1, Adam converges. In general, Adam converges to a neighborhood of critical points; under an extra condition (the strong growth condition), Adam converges to critical points. These results lead to suggestions on how to tune Adam hyperparameters, which are confirmed by empirical experiments.
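For reference, the vanilla Adam update discussed in the abstract can be sketched as follows; the test function, step-size schedule, and hyperparameter values are illustrative choices (not from the talk), with β1 = 0.9 < √β2 ≈ 0.9995 < 1 satisfying the stated condition:

```python
import numpy as np

def adam(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    # Vanilla Adam (Kingma & Ba) with a 1/sqrt(t) step-size decay.
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # 1st-order momentum
        v = beta2 * v + (1 - beta2) * g * g    # 2nd-order momentum
        m_hat = m / (1 - beta1 ** t)           # bias corrections
        v_hat = v / (1 - beta2 ** t)
        x = x - (lr / np.sqrt(t)) * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = x^2, so grad(x) = 2x; beta1 = 0.9 < sqrt(0.999) < 1.
x_star = adam(lambda x: 2.0 * x, x0=5.0)
```

Note that β1 and β2 weight the first- and second-order momentum respectively; the talk's result concerns how their relative sizes govern convergence of exactly this update rule.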

**Bio:**Ruoyu Sun is currently a Senior Research Scientist at the Shenzhen Research Institute of Big Data and the Shenzhen International Center for Industrial and Applied Mathematics. He is also an associate professor (tenured) in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. From 2017 to 2022, he was a tenure-track assistant professor in the Department of ISE and ECE (affiliated) at the University of Illinois at Urbana-Champaign (UIUC). Prior to that, he was a full-time visiting research scientist at Facebook AI Research and a postdoctoral researcher at Stanford University. He obtained his Ph.D. in Electrical Engineering from the University of Minnesota and his B.S. in Mathematics from Peking University. His research interests include deep learning theory and algorithms, generative models, large-scale optimization, learning to optimize, and communication networks. He won second place in the INFORMS George Nicholson Student Paper Competition and an honorable mention in the INFORMS Optimization Society Student Paper Competition. He received the "AI2000 Most Influential Scholar Honorable Mention" (theory, 2012-2021) from AMiner in 2022. He has published more than 50 papers in machine learning conferences, information theory and communication journals, and optimization journals. He has served as an area chair of the machine learning conferences NeurIPS, ICML, ICLR, and AISTATS.

#### Amin Gohari

Vice-Chancellor Associate Professor

The Chinese University of Hong Kong

**Title:**Shannon-type Inequalities for f-Divergences

**Abstract:**Shannon's mutual information is known to satisfy a class of inequalities known as the Shannon-type inequalities. The Shannon-type inequalities are widely utilized in network information theory to prove infeasibility results. In this talk, I consider the question of whether some of these Shannon-type inequalities continue to hold for mutual f-information, which is defined using f-divergences (also known as generalized relative entropy). f-divergences have found various applications in information theory, statistics, and machine learning among other fields. We will introduce a new class of f-divergences called super-modular divergences and show that they lead to information measures satisfying certain Shannon-type inequalities. Applications of super-modular divergences in statistics and information theory are discussed. In particular, we offer new bounds on the rate-distortion function in the finite blocklength regime as well as an extended Sanov's bound for the hypothesis testing problem. This talk is based on a joint work with Saeed Masiha and Mohammad Hossein Yassaee.
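As background for the abstract above, an f-divergence is defined as D_f(P||Q) = Σ_x q(x) f(p(x)/q(x)) for a convex f with f(1) = 0; different choices of f recover familiar divergences. A tiny numerical sketch (the distributions and choices of f are illustrative, not from the talk):

```python
import numpy as np

def f_divergence(p, q, f):
    # D_f(P||Q) = sum_x q(x) * f(p(x)/q(x)); assumes q > 0 wherever p > 0.
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(q * f(p / q)))

p = [0.5, 0.5]
q = [0.9, 0.1]

# f(t) = t log t recovers KL divergence (in nats);
# f(t) = |t - 1| / 2 recovers total variation distance.
kl = f_divergence(p, q, lambda t: t * np.log(t))
tv = f_divergence(p, q, lambda t: 0.5 * np.abs(t - 1))
```

Mutual f-information replaces the relative entropy in Shannon's mutual information with such a D_f; the talk's question is which Shannon-type inequalities survive this substitution.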

**Bio:**Amin Gohari received his B.Sc. degree from the Sharif University of Technology, Tehran, Iran, in 2004 and his Ph.D. degree in Electrical Engineering from the University of California, Berkeley in 2010. From 2010 to 2011, he was a postdoc at the Institute of Network Coding, The Chinese University of Hong Kong. From 2011 to 2020, he was with the Electrical Engineering Department of Sharif University of Technology, and from 2020 to 2022 with the Tehran Institute for Advanced Studies. He joined The Chinese University of Hong Kong in 2022. Dr. Gohari received the IEEE Iran Section Young Researcher Award in 2021 and was selected as a Distinguished Lecturer by the IEEE Information Theory Society in 2019. He received the 2010 Eli Jury Award from the UC Berkeley Department of Electrical Engineering and Computer Sciences for outstanding achievement in the area of communication networks, and the 2009-2010 Bernard Friedman Memorial Prize in Applied Mathematics from the UC Berkeley Department of Mathematics for demonstrated ability to do research in applied mathematics. He also received a Gold Medal at the 41st International Mathematical Olympiad (IMO 2000) and First Prize at the 9th International Mathematical Competition for University Students (IMC 2002). He was a finalist for the best student paper award at the IEEE International Symposium on Information Theory (ISIT) in three consecutive years, 2008, 2009, and 2010. He also co-authored a paper that won the ISIT 2013 Jack Keil Wolf Student Paper Award and, as supervisor, two papers that were finalists in 2012 and 2014. He was selected as an exemplary reviewer for the IEEE Transactions on Communications in 2015 and 2016. Dr. Gohari served as an Associate Editor for the IEEE Transactions on Information Theory from 2018 to 2021.

**Title:**From Information Freshness to Semantics of Information and Goal-oriented Communications

**Abstract:**Wireless networks are evolving to cater to emerging cyber-physical and mission-critical interactive systems, such as swarm robotics, self-driving cars, and the smart Internet of Things. A fundamental shift in thinking is necessary to satisfy the requirements for real-time communication, autonomous decision-making, and efficient distributed processing. In this talk, we will focus on the freshness and value of information, and we will present some initial results and ongoing works on semantics-aware goal-oriented communication.

**Bio:**Nikolaos Pappas (Senior Member, IEEE) received a B.Sc. degree in computer science, a B.Sc. degree in mathematics, an M.Sc. degree in computer science, and a Ph.D. degree in computer science from the University of Crete, Greece, in 2005, 2012, 2007, and 2012, respectively. From 2005 to 2012, he was a Graduate Research Assistant with the Telecommunications and Networks Laboratory, Institute of Computer Science, Foundation for Research and Technology—Hellas, Heraklion, Greece, and a Visiting Scholar with the Institute of Systems Research, University of Maryland at College Park, College Park, MD, USA. From 2012 to 2014, he was a postdoctoral researcher with the Department of Telecommunications, CentraleSupélec, Gif-sur-Yvette, France. He is currently an Associate Professor at the Department of Computer and Information Science at Linköping University, Linköping, Sweden. His main research interests lie in the field of wireless communication networks, with an emphasis on semantics-aware communications, energy harvesting networks, network-level cooperation, age of information, and stochastic geometry. Dr. Pappas served as a Symposium Co-Chair of the IEEE International Conference on Communications in 2022 and of the IEEE Wireless Communications and Networking Conference in 2022. From 2013 to 2018, he was an Editor of the IEEE Communications Letters. He was a Guest Editor of the IEEE Internet of Things Journal special issue on "Age of Information and Data Semantics for Sensing, Communication and Control Co-Design in IoT". He is an Editor of the IEEE Transactions on Communications, the IEEE Transactions on Machine Learning in Communications and Networking, and the IEEE/KICS Journal of Communications and Networks, an Area Editor of the IEEE Open Journal of the Communications Society, and an Expert Editor for invited papers of the IEEE Communications Letters. He is a Guest Editor of the IEEE Network special issue on "Tactile Internet for a cyber-physical continuum".
He has co-authored a monograph on Age of Information, and he is the leading editor of the book "Age of Information: Foundations and Applications" published by Cambridge University Press. He has appeared in the Top 2% scientists list (single-year impact) in the area of Networking and Telecommunications for each of the last three years. He is the principal investigator of the project "Semantics-Empowered Communication for Networked Intelligent Systems" funded by the Swedish Research Council, and he is the main PI of five more projects related to the semantics of information funded by other national and EU sources.

**Title:**Approximate Message Passing Algorithms for High-dimensional Statistical Inference and a Study of the Type I-Type II Error Tradeoff for SLOPE

**Abstract:**Sorted L1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this talk, we study how this relatively new regularization technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion (TPP) or, equivalently, between measures of type I and type II error. Additionally, we show that on any problem instance, SLOPE with a certain regularization sequence outperforms the Lasso, in the sense of having a smaller FDP, larger TPP, and smaller L2 estimation risk simultaneously. Our proofs are based on a novel technique that reduces a variational calculus problem to a class of infinite-dimensional convex optimization problems, together with a very recent result from approximate message passing (AMP) theory. AMP refers to a class of iterative algorithms that have been successfully applied to a number of high-dimensional statistical estimation problems, such as linear regression, generalized linear models, and low-rank matrix estimation, and that are practical and useful in a variety of engineering and computer science applications, such as imaging, communications, and neural networks. AMP algorithms have two features that make them particularly attractive: they can easily be tailored to take advantage of prior information on the structure of the signal, such as sparsity, and, under suitable assumptions on the design matrix, AMP theory provides precise asymptotic guarantees for statistical procedures in the high-dimensional regime. With SLOPE being a particular example, we discuss these results in the context of a general program for systematically deriving exact expressions for the asymptotic risk of estimators that are solutions to a broad class of convex optimization problems via AMP.
Collaborators on this work include Zhiqi Bu, Jason Klusowski, and Weijie Su (https://arxiv.org/abs/1907.07502 and https://arxiv.org/abs/2105.13302) and Oliver Feng, Ramji Venkataramanan, and Richard Samworth (https://arxiv.org/abs/2105.02180).
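For readers unfamiliar with AMP, the following minimal sketch (our illustration, not the speaker's code; the threshold `theta`, the iteration count, and the problem sizes are arbitrary, untuned choices) shows the characteristic structure of an AMP iteration for sparse linear regression: a matched-filter step, a coordinatewise denoiser (here soft-thresholding), and the Onsager correction term that distinguishes AMP from naive iterative thresholding:

```python
import numpy as np

def soft_threshold(v, theta):
    """Coordinatewise soft-thresholding denoiser."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def amp_sparse(A, y, theta=0.2, iters=50):
    """Minimal AMP sketch for sparse linear regression (illustrative, untuned)."""
    n, p = A.shape
    x = np.zeros(p)
    z = y.copy()
    for _ in range(iters):
        pseudo = x + A.T @ z                   # effective observation (signal + ~Gaussian noise)
        x_new = soft_threshold(pseudo, theta)  # denoising step
        # Onsager correction: residual scaled by the average denoiser derivative
        z = y - A @ x_new + z * (np.count_nonzero(x_new) / n)
        x = x_new
    return x

# tiny demo: k-sparse signal, i.i.d. Gaussian design with variance 1/n
rng = np.random.default_rng(0)
n, p, k = 500, 250, 10
A = rng.standard_normal((n, p)) / np.sqrt(n)
x0 = np.zeros(p)
x0[rng.choice(p, k, replace=False)] = rng.choice([-1.0, 1.0], k)
y = A @ x0
x_hat = amp_sparse(A, y)
```

The soft-thresholding denoiser biases the recovered coefficients toward zero by roughly `theta`; the state-evolution analysis mentioned in the abstract is what makes such effects exactly quantifiable in the high-dimensional limit.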

**Bio:**Cynthia Rush is an Associate Professor of Statistics in the Department of Statistics at Columbia University. She received a Ph.D. and M.A. in Statistics from Yale University in 2016 and 2011, respectively, and she completed her undergraduate coursework at the University of North Carolina at Chapel Hill, where she obtained a B.S. in Mathematics in 2010. She received an NSF CRII award in 2019, was a finalist for the 2016 IEEE Jack K. Wolf ISIT Student Paper Award, was an NTT Research Fellow at the Simons Institute for the Theory of Computing for the program on Probability, Computation, and Geometry in High Dimensions in Fall 2020, and was a Google Research Fellow at the Simons Institute for the Theory of Computing for the program on Computational Complexity of Statistical Inference in Fall 2021.

**Title:**Characterizing the Geometry of Object Manifolds in Deep Neural Networks Using Statistical Physics Methods

**Abstract:**Methods from statistical physics usually describe the typical behaviour of random, unstructured systems when the number of elements becomes large. Here we present an application of those methods to real-world data, which contains both detailed structure and non-trivial correlations, through the introduction of a novel manifold generation model. In computational neuroscience, it is usually assumed that stimuli are represented in the brain by the collective population responses of sensory neurons, and an object presented under varying conditions gives rise to a collection of neural population responses called an ‘object manifold’. Then changes in the object representation along a hierarchical sensory system are associated with changes in the geometry of those manifolds. Introducing a theoretical framework based on statistical physics and an appropriate data generation model, it is possible to connect the manifolds' geometry with 'classification capacity', a quantitative measure of the ability of the neural population to support object classification. Deep neural networks trained on object classification tasks are a natural testbed for the applicability of this relation, where it is demonstrated that classification capacity improves along the hierarchies of deep neural networks with different architectures and that changes in the geometry of the associated object manifolds underlie this improved capacity. This analysis sheds light on the functional roles different levels in the hierarchy play in achieving the improvement in classification through orchestrated reduction of manifolds’ radius, dimensionality and inter-manifold correlations.
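As a toy numerical illustration of "classification capacity" (ours, not from the talk): by Cover's function-counting theorem, P random points in N dimensions with random binary labels are linearly separable with high probability when P/N is below a critical value of 2, and almost never above it. A feasibility check via a linear program makes this concrete:

```python
import numpy as np
from scipy.optimize import linprog

def separable(X, y):
    """Test whether labels y (+/-1) on points X are linearly separable
    through the origin: is there w with y_i * <w, x_i> >= 1 for all i?
    Posed as a linear-programming feasibility problem (zero objective)."""
    P, N = X.shape
    res = linprog(c=np.zeros(N),
                  A_ub=-(y[:, None] * X), b_ub=-np.ones(P),
                  bounds=[(None, None)] * N)
    return res.status == 0  # status 0: feasible solution found; 2: infeasible

rng = np.random.default_rng(1)
N = 20
X_low = rng.standard_normal((N, N))       # P/N = 1: separable (general position)
X_high = rng.standard_normal((4 * N, N))  # P/N = 4: non-separable w.h.p.
y_low = rng.choice([-1.0, 1.0], N)
y_high = rng.choice([-1.0, 1.0], 4 * N)
print(separable(X_low, y_low), separable(X_high, y_high))
```

The theoretical framework in the talk generalizes this counting argument from random points to structured object manifolds, where the manifolds' radius, dimensionality, and correlations set the effective capacity.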

**Bio:**Uri Cohen received a BSc in Mathematics and Cognitive Sciences from the Hebrew University of Jerusalem, Israel. After spending some time outside academia working as a software engineer, he completed a PhD at the Edmond & Lily Safra Center for Brain Sciences, working on population coding and manifold representation of sensory information under the supervision of Haim Sompolinsky. Currently a postdoctoral research fellow at the Computational and Biological Learning Lab, University of Cambridge, he is using computational models of long-term memory to predict the properties of memory mechanisms in the brain. He is interested in developing theories on the dynamics of learning and memory in the brain, with possible applications to machine learning.

**Title:**Rate Distortion Perspectives in Goal-Oriented Semantic Communications

**Abstract:**In this talk, the goal is to stress the usefulness of rate-distortion theory and its variants in identifying fundamental limitations in goal-oriented semantic communications. In the first part of the talk, we discuss a variant of a robust description lossy source coding problem. In particular, for the proposed setup, we demonstrate the cardinal role of multiple fidelity constraints in designing selective decoders, which in turn dictate the outcome of the reconstructed message depending on the task. We also discuss a less restrictive class of fidelity constraints, called f-separable distortions, which allow for a much richer class of distortion penalties (e.g., exponential, polynomial, logarithmic) between the transmitted signal and the received signal. An algorithm for computing the resulting rate-distortion characterizations via the alternating minimization approach is also discussed. In the second part of the talk, we consider a variant of the rate-distortion function, called the rate-distortion-perception function, and explain its utility in goal-oriented compression. Here we primarily discuss optimization viewpoints and algorithms that allow the computation of this function for finite-alphabet sources, when the distortion criterion is separable and the perception constraint belongs to the family of f-divergences. An interesting result herein is the derivation of a robust approximate iterative method with provable convergence guarantees, which is shown to converge to the globally optimal solution.
In the third part of the talk, if time permits, we will discuss a lower bound on the optimal performance theoretically attainable by causal and zero-delay codes, called the non-anticipative rate distortion function, and demonstrate its utility in constructing achievable bounds in a low-delay variable-rate lossy source coding setup that necessitates causal reconstruction of the source message, one of the key features of goal-oriented semantic communications from an information-theoretic standpoint. The utility of this bound is also demonstrated in the synthesis and performance analysis of a quantized closed-loop control network.
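To make the alternating-minimization idea concrete, here is a minimal Blahut-Arimoto sketch for the classical rate-distortion function with a separable distortion (our illustration with an arbitrarily chosen slope parameter `beta`, not the speaker's algorithm for the f-separable or perception-constrained variants). It alternates between the optimal conditional distribution for a fixed output marginal and the induced output marginal:

```python
import numpy as np

def blahut_arimoto(p_x, dist, beta, iters=200):
    """Compute one point (R, D) on the rate-distortion curve (rate in nats)
    for source p_x, distortion matrix dist[x, x_hat], and slope parameter beta."""
    p_x = np.asarray(p_x, dtype=float)
    n, m = dist.shape
    q = np.full(m, 1.0 / m)                      # output marginal q(x_hat)
    for _ in range(iters):
        # conditional w(x_hat | x) minimizing the Lagrangian for fixed q
        w = q[None, :] * np.exp(-beta * dist)
        w /= w.sum(axis=1, keepdims=True)
        q = p_x @ w                              # re-induced output marginal
    D = float(np.sum(p_x[:, None] * w * dist))
    R = float(np.sum(p_x[:, None] * w * np.log(w / q[None, :])))
    return R, D

# demo: uniform binary source with Hamming distortion
p = np.array([0.5, 0.5])
hamming = np.array([[0.0, 1.0], [1.0, 0.0]])
R, D = blahut_arimoto(p, hamming, beta=3.0)
print(R, D)  # for this source, R should match ln 2 - h(D)
```

Sweeping `beta` traces out the whole curve; for the uniform binary source the output matches the closed form R(D) = ln 2 - h(D), a useful sanity check on the iteration.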

**Bio:**Photios A. Stavrou (M'16, SM'22) received his D. Eng. degree in 2008 from the Department of Electrical and Computer Engineering (ECE) of the Faculty of Engineering at Aristotle University of Thessaloniki, Greece, and his Ph.D. in Electrical Engineering from the University of Cyprus, Cyprus, in 2016. From October 2016 to August 2022, he held postdoctoral positions at Aalborg University in Denmark, at the KTH Royal Institute of Technology in Sweden, and at EURECOM in France. Since September 2022, Dr. Stavrou has been an Assistant Professor in the Communication Systems Department at EURECOM, Sophia-Antipolis, France. His research interests span information and communication theory, networked control systems, goal-oriented semantic communications, causal estimation theory, optimization, and game theory.

**Title:**Discovering Spikes in Random Matrices Using Approximate Message Passing

**Abstract:**In modern statistics, random matrices of large dimensions often arise with a low-dimensional planted structure. Such structure may originate from an informative component that has small effective dimensions, such as sparsity or low rank. Statisticians are interested in extracting such components from large noisy matrices. To this end, understanding the spectral properties of spiked random matrices arising from various models becomes crucial. In this talk, I will present a novel proof technique using approximate message passing theory that allows us to systematically characterize spectral properties such as the location of outlier eigenvalues and the overlap between principal components and unknown parameters. The power of this approach will be demonstrated in the context of generalized linear models with general Gaussian design, for which no off-the-shelf random matrix theory results exist for the matrices of interest. This talk is based on joint work in preparation with Hong Chang Ji, Marco Mondelli, and Ramji Venkataramanan.
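A quick numerical illustration of the kind of spectral phenomenon at play (ours, not from the talk): for a rank-one spike of strength λ added to a Wigner matrix, the classical BBP-type prediction puts the outlier eigenvalue at λ + 1/λ, outside the bulk edge at 2, with squared overlap 1 - 1/λ² between the top eigenvector and the spike, once λ exceeds 1:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
lam = 2.0                                     # spike strength (> 1: above the BBP threshold)

v = rng.standard_normal(n)
v /= np.linalg.norm(v)                        # unit-norm planted direction
G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2 * n)                # Wigner noise, entry variance ~ 1/n
Y = lam * np.outer(v, v) + W                  # spiked matrix

evals, evecs = np.linalg.eigh(Y)
top_val, top_vec = evals[-1], evecs[:, -1]
overlap_sq = np.dot(top_vec, v) ** 2
# predictions: top_val ~ lam + 1/lam, overlap_sq ~ 1 - 1/lam**2
print(top_val, overlap_sq)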

**Bio:**Yihan Zhang received the B.Eng. degree in computer science and technology from Northeastern University, Shenyang, China, in June 2016, and the Ph.D. degree from the Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong, in August 2020. He was a postdoctoral researcher at the Henry and Marilyn Taub Faculty of Computer Science, Technion--Israel Institute of Technology, from October 2020 to October 2021. He has been a postdoctoral researcher at the Institute of Science and Technology Austria since October 2021. His research interests include coding theory, information theory, and theoretical statistics (in no particular order).