Communications and Signal Processing Seminar Series
  • 99 videos
  • 59,716 views
Transformer Meets Nonparametric Kernel Regression
Tan Minh Nguyen
Assistant Professor
National University of Singapore
Abstract: Pairwise dot-product self-attention is key to the success of transformers that achieve state-of-the-art performance across a variety of applications in language and vision. This dot-product self-attention computes attention weights among the input tokens using Euclidean distance, which makes the model prone to representation collapse and vulnerable to contaminated samples. In this talk, we interpret attention in transformers as a nonparametric kernel regression, which uses an isotropic Gaussian kernel for density estimation. From the non-parametric regression perspective, we show that spherical invariance in the i...
81 views
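A minimal NumPy sketch of the abstract's starting point (our illustration, not the speaker's method; function names are ours): when the keys share a common norm, softmax dot-product attention coincides with the Nadaraya-Watson kernel regression estimator under an isotropic Gaussian kernel, because the squared Euclidean distance then differs from the dot product only by terms that cancel under normalization.

```python
import numpy as np

def gaussian_kernel_regression(q, K, V, h=1.0):
    # Nadaraya-Watson estimator: weights from an isotropic Gaussian kernel
    # on the Euclidean distances ||q - k_j||.
    w = np.exp(-np.sum((K - q) ** 2, axis=1) / (2 * h ** 2))
    return (w / w.sum()) @ V

def softmax_attention(q, K, V, h=1.0):
    # Standard dot-product attention; h**2 plays the role of the temperature.
    s = K @ q / h ** 2
    w = np.exp(s - s.max())
    return (w / w.sum()) @ V

rng = np.random.default_rng(0)
K = rng.normal(size=(5, 3))
K /= np.linalg.norm(K, axis=1, keepdims=True)  # keys with a common (unit) norm
q = rng.normal(size=3)
V = rng.normal(size=(5, 2))

# ||q - k_j||^2 = ||q||^2 + 1 - 2 q.k_j, and the terms not involving q.k_j
# cancel after normalization, so the two outputs agree.
print(np.allclose(gaussian_kernel_regression(q, K, V), softmax_attention(q, K, V)))  # True
```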

Videos

Recent Advances in Average-Reward Restless Bandits
67 views · 1 day ago
Weina Wang Assistant Professor Carnegie Mellon University Abstract: We consider the infinite-horizon, average-reward restless bandit problem, where a central challenge is designing computationally efficient policies that achieve a diminishing optimality gap as the number of arms, N, grows large. Existing policies, including the renowned Whittle index policy, all rely on a global attractor prope...
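The index policies discussed in the abstract share a simple activation rule, sketched below (a generic illustration only; computing actual Whittle indices requires solving a relaxed MDP per arm, which is omitted here):

```python
import numpy as np

def index_policy(indices, budget):
    # Activate the `budget` arms with the largest current index values.
    # The Whittle index policy is the special case where `indices` are
    # Whittle indices computed from each arm's state and dynamics.
    order = np.argsort(indices)[::-1]
    active = np.zeros(len(indices), dtype=bool)
    active[order[:budget]] = True
    return active

# Four arms, budget of two activations per time step.
print(index_policy(np.array([0.2, 0.9, 0.5, 0.1]), budget=2))  # [False  True  True False]
```

The talk concerns when such rules have a diminishing optimality gap as the number of arms N grows.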
Quantum Bayesian Framework for Efficient Storage of Quantum Information
74 views · 21 days ago
Sandeep Pradhan Professor Electrical Engineering and Computer Science Abstract: Superposition, entanglement and nonlocality are the hallmarks of the quantum framework. In this talk, we consider the problem of reliable storage of quantum information with qubit rate below its von Neumann entropy with controlled loss. This requires a transformation of the quantum source state into a more entangled...
Adventures in PCA for Heterogeneous Data: Optimal Weights and Rank Estimation
61 views · 1 month ago
David Hong Assistant Professor University of Delaware Abstract: PCA is a textbook method for discovering underlying low-rank signals in noisy high-dimensional data and is ubiquitous throughout machine learning, data science, and engineering. But what happens to this workhorse technique when the data are heterogeneous? This talk presents recent progress on understanding (and improving) PCA for s...
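To make the heterogeneity issue concrete, here is a small self-contained NumPy experiment (our illustration; the talk derives provably optimal weights, whereas this sketch uses the natural inverse-variance weighting):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 50
u = rng.normal(size=(n, 1))
v = rng.normal(size=(1, d))
# Heterogeneous noise: the second half of the samples is much noisier.
sigma = np.r_[np.full(n // 2, 0.5), np.full(n // 2, 3.0)]
X = u @ v + sigma[:, None] * rng.normal(size=(n, d))

def weighted_pca(X, w, k):
    # Scale each sample by its weight before the SVD; w = 1/sigma**2 is the
    # natural inverse-variance choice (the talk studies truly optimal weights).
    _, _, Vt = np.linalg.svd(w[:, None] * X, full_matrices=False)
    return Vt[:k].T

def subspace_err(Vhat, v):
    vn = v.ravel() / np.linalg.norm(v)
    return 1 - float(Vhat.ravel() @ vn) ** 2

V_plain = weighted_pca(X, np.ones(n), 1)
V_wtd = weighted_pca(X, 1.0 / sigma ** 2, 1)
print("plain PCA error:", round(subspace_err(V_plain, v), 3),
      "| weighted PCA error:", round(subspace_err(V_wtd, v), 3))
```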
What’s in my networks? On learned proximals and testing for interpretations
206 views · 1 month ago
Jeremias Sulam Assistant Professor Johns Hopkins University, BME Abstract: Modern machine learning methods are revolutionizing what we can do with data - from tiktok video recommendations to biomarkers discovery in cancer research. Yet, the complexity of these deep models makes it harder to understand what functions these data-dependent models are computing, and which features they detect regar...
Deep Representation Interface for Engineering Problems
239 views · 1 month ago
Xiangxiang Xu Postdoctoral Associate Massachusetts Institute of Technology Abstract: Using deep neural networks (DNNs) as elements of engineering solutions can potentially enhance the system’s overall performance. However, existing black-box practices with DNNs are incompatible with the modularized design of engineering systems. To tackle this problem, we propose using feature representations a...
Label Noise: Ignorance is Bliss
169 views · 2 months ago
Clay Scott Professor Electrical Engineering and Computer Science Abstract: We establish a new theoretical framework for learning under multi-class, instance-dependent label noise. At the heart of our framework is the concept of \emph{relative signal strength} (RSS), which is a point-wise measure of noisiness. Using relative signal strength, we establish nearly matching upper and lower bounds fo...
Domain Counterfactuals for Explainability, Fairness, and Domain Generalization
136 views · 2 months ago
David I. Inouye Assistant Professor, ECE Purdue University Abstract: Although incorporating causal concepts into deep learning shows promise for increasing explainability, fairness, and robustness, existing methods require unrealistic assumptions or aim to recover the full latent causal model. This talk proposes an alternative: domain counterfactuals. Domain counterfactuals ask a more concrete ...
Approximate independence of permutation mixtures
104 views · 2 months ago
Yanjun Han Assistant Professor New York University See slides at the link below (slides cannot be seen in the video). drive.google.com/file/d/1BVF3iskR1gn5laMq-35Xl0whyDrBd7hq/view?usp=sharing Abstract: We prove bounds on statistical distances between high-dimensional exchangeable mixture distributions (which we call \emph{permutation mixtures}) and their i.i.d. counterparts. Our results are based on a no...
Understanding Distribution Learning of Diffusion Models via Low-dimensional Modeling
258 views · 2 months ago
Peng Wang Postdoc Research Fellow Electrical Engineering and Computer Science Abstract: Recent empirical studies have demonstrated that diffusion models can effectively learn the image distribution and generate new samples. Remarkably, these models can achieve this even with a small number of training samples despite a large image dimension, circumventing the curse of dimensionality. In this wo...
A Statistical Framework for Private Personalized Federated Learning and Estimation
94 views · 2 months ago
Suhas Diggavi Professor University of California, Los Angeles Abstract: In federated learning, edge nodes collaboratively build learning models from locally generated data. Federated learning (FL) introduces several unique challenges to traditional learning including (i) need for privacy guarantees on the locally residing data (ii) communication efficiency from edge devices (iii) robustness to ...
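As background for the federated setting described above, a one-round federated averaging (FedAvg-style) aggregation can be sketched as follows (an illustration only; the talk's contributions are the privacy and personalization layered on top, which are not shown):

```python
import numpy as np

def fedavg(local_params, sample_counts):
    # One server aggregation round: average the clients' model parameters,
    # weighted by their local sample counts.
    w = np.asarray(sample_counts, dtype=float)
    w /= w.sum()
    return sum(wi * p for wi, p in zip(w, local_params))

clients = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
print(fedavg(clients, sample_counts=[1, 1, 2]))  # [3.5 4.5]
```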
Computational Imaging through Atmospheric Turbulence
191 views · 3 months ago
Stanley Chan, Nick Chimitt Elmore Professor of ECE, Research Scientist Purdue University Abstract: Long-range imaging is an important task for many civilian and military applications. However, when seeing through a long distance, the random effects of the atmosphere will cause severe distortions to the images. Since Andrey Kolmogorov (40s), Valerian Tatarski (50s), and David Fried (70s), the su...
Permutation-Free Kernel Hypothesis Tests
183 views · 3 months ago
Shubhanshu Shekhar Assistant Professor Electrical Engineering and Computer Science Abstract: Kernel-based methods have become popular in recent years for solving nonparametric testing problems, such as two-sample and independence testing. However, the test statistics involved in these methods (such as kernel-MMD and HSIC) have intractable null distributions, as they are instances of degenerate ...
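For context, the statistic the abstract refers to is the kernel MMD; a minimal NumPy version of the unbiased squared-MMD estimator is below (our sketch; the talk's contribution is a permutation-free way to calibrate such degenerate statistics, which is not shown here):

```python
import numpy as np

def mmd2_unbiased(X, Y, h=1.0):
    # Unbiased estimate of the squared kernel MMD with a Gaussian kernel;
    # within-sample diagonal (i == j) terms are excluded.
    def k(A, B):
        d2 = np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-d2 / (2 * h ** 2))
    n, m = len(X), len(Y)
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2 * Kxy.mean())

rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(size=(300, 2)), rng.normal(size=(300, 2)))
diff = mmd2_unbiased(rng.normal(size=(300, 2)), 2 + rng.normal(size=(300, 2)))
print(same < diff)  # True: shifted samples give a much larger statistic
```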
Interdependent privacy in third-party apps: past, present, and future
42 views · 3 months ago
Gergely Biczok Associate Professor Budapest University of Technology and Economics Abstract: In today’s networked online environments, privacy has become a complex affair. An important aspect of this complexity stems from the prominent interconnectedness of individuals, and therefore their personal data. Given this interconnectedness, an individual’s privacy is bound to be affected by the data ...
Novel Technologies for Accelerated MRI
145 views · 6 months ago
Nick Dwork Assistant Professor University of Colorado Anschutz Abstract: MRI is a ubiquitously used gross imaging modality in clinical settings. However, its long scan time and the requirement that patients remain still dramatically limit its applications. In this talk, we will discuss several new technologies for accelerating MRI. We will discuss an application of the Fast Fourier Transform th...
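A toy NumPy example of the FFT's role in MRI (our illustration, not the talk's method): fully sampled Cartesian k-space is inverted exactly by an inverse FFT, while naive undersampling, the starting point of accelerated MRI, introduces aliasing:

```python
import numpy as np

image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0                       # toy square "phantom"
kspace = np.fft.fft2(image)                 # the scanner acquires k-space
recon = np.fft.ifft2(kspace).real           # full sampling: exact inverse FFT
print(np.allclose(recon, image))            # True

mask = np.zeros((8, 8))
mask[::2, :] = 1                            # keep every other k-space row (2x)
aliased = np.fft.ifft2(mask * kspace).real  # undersampling causes aliasing
print(np.allclose(aliased, image))          # False
```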
Data-Analytic Opportunities and Challenges in Solar Eruption Forecasting
52 views · 8 months ago
Edge and IoT-supported Intelligent Augmented Reality: Promise, Challenges, and Solutions
101 views · 8 months ago
Distributed learning on correlated data with communication constraints
126 views · 8 months ago
Image Restoration through Inversion by Direct Iteration (InDI)
472 views · 9 months ago
Formal privacy guarantees for optimization datasets in power systems
138 views · 9 months ago
Select Topics Associated with 3D Scanning by Means of Structured Light Illumination
71 views · 10 months ago
On False Positive Error
183 views · 10 months ago
Accelerated Optimization for Dynamic MRI Reconstruction with Locally Low-Rank Regularizers
295 views · 11 months ago
Quantum Signal Processing
2.3K views · 1 year ago
Wireless is back: The 6G Revolution Towards Connected Intelligence
183 views · 1 year ago
Breaking the Sample Size Barrier in Reinforcement Learning via Model-Based Algorithms
727 views · 1 year ago
Towards a Theoretical Understanding of Parameter-Efficient Fine-Tuning (and Beyond)
1.5K views · 1 year ago
Overparameterization and Global Optimality in Nonconvex Low-Rank Matrix Estimation and Optimization
717 views · 1 year ago
Sparsity for Efficient Long Sequence Generation of LLMs
996 views · 1 year ago
Theoretical Characterization of Forgetting and Generalization of Continual Learning
502 views · 1 year ago

Comments

  • @Robingreat3095 · 2 months ago

    How to disable the "edited" tag symbol on WhatsApp edited messages?


  • @guanlinchang · 4 months ago

    Haha how are you Kane.

  • @해위잉 · 8 months ago

    Thx!

  • @maktube_220 · 9 months ago

    great presentation!!

  • @iamsiddhantsahu · 11 months ago

    This is a very interesting talk -- quite helpful for my own research!

  • @dirtbikersteve · 11 months ago

    Is there an implementation of this algorithm (NTMA) anywhere for testing?

  • @arthurzhang8759 · 11 months ago

    A natural question to ask: can random sparse pruning be applied to GPT-style large models to obtain small models?

  • @Mome3600 · 11 months ago

    Thanks ! :D The LIFT idea is absolutely amazing !((:

  • @imrajsingh5561 · 11 months ago

    Thanks for the great talk. I am trying to get a more intuitive understanding. In my mind's eye, taking a low-rank approximation of a network is sort of like choosing the initialisation for the subsequent optimisation?

  • @r.d.7575 · 11 months ago

    Wonderful talk.

  • @KevinWeiss-2023 · 1 year ago

    lol🤣🤣🤣


  • @u2b83 · 1 year ago

    import torch
    from torch.optim.optimizer import Optimizer
    from torch.autograd import grad

    class SAM(Optimizer):
        def step(self, closure=None):
            loss = None
            if closure is not None:
                loss = closure()
            for group in self.param_groups:
                for p in group['params']:
                    if p.grad is None:
                        continue
                    grad_params = grad(loss, p, create_graph=True)
                    v = grad_params[0]
                    p_s = -v / torch.norm(v)
                    p.data.add_(p_s, alpha=group['lr'])
            return loss

    # Usage
    model = ...
    optimizer = SAM(model.parameters(), lr=0.01)

    ##############
    # 2-step back prop
    # Backpropagate the image loss
    img_loss = mse(generated, real_fulldata.detach())
    img_loss.backward(retain_graph=True)
    # Zero gradients manually before the first step of the SAM optimizer
    optimizer.zero_grad()
    # Perform the first step of the SAM optimizer (which is the base optimizer's step)
    optimizer.step()
    # Perform the second step of the SAM optimizer
    img_loss.backward()

  • @minsookim-ql1he · 1 year ago

    Thanks!

  • @jodbakafran2802 · 1 year ago

    Nice presentation on Robust RL. Is there a link to the slides, please?

  • @nhonth2011 · 1 year ago

    Nice talk!

  • @Bert2997 · 1 year ago

    Are the slides available somewhere? Thanks!

  • @TheCrmagic · 1 year ago

    Thank you for the talk.

  • @aminuabdulsalami4325 · 1 year ago

    Awesome!!! Never ever thought sparsity in NNs could be this interesting.

  • @TheCrmagic · 1 year ago

    These talks are a really valuable resource, thank you for sharing them.

  • @manojtaleka954 · 2 years ago

    Informative session on Secure aggregation for federated learning and privacy concerns in multi-round secure aggregation.

  • @bahaaelden6234 · 2 years ago

    So clear, really. Thanks so much!

  • @blackyogurt · 2 years ago

    😍😍😍

  • @meghaldarji598 · 2 years ago

    Was looking for a few resources to understand what inductive bias meant. This video helped me a lot. Thank you!

  • @sourabhbhattacharya9133 · 2 years ago

    Thank you, Ma'am, it was really a wonderful 70 minutes of FL.

  • @shaluyadav2036 · 2 years ago

    Thanks for giving such interesting facts. My question concerns the framework, i.e., in which framework (TensorFlow/FATE, etc.) was this work carried out? And one more question: can we use this setting for analyzing traffic in ITS, where different devices (sensors, mobile devices, GPS, on-board sensors) can participate?

  • @gary8421 · 2 years ago

    The speaker was excellent but the question askers (Joseph started around 50:00 and Vijay, the host) were terrible and even malicious.

  • @jfjfcjcjchcjcjcj9947 · 2 years ago

    nice dedication 😉

  • @minhnampham1386 · 3 years ago

    The video image quality is too poor; you need to fix it.

  • @leiwu232 · 3 years ago

    Great talk!

  • @hoangngocdiep6854 · 3 years ago

    How did you do it? Can you share it with me? Thank you.

  • @MrSaqibsaqi · 3 years ago

    Can you please share the slides?

  • @runggp · 3 years ago

    awesome talk, very sexy approach on a hot topic!

  • @yimaphd8750 · 3 years ago

    A complete version of this work can be found at: arxiv.org/pdf/2105.10446.pdf