Hi, I'm Jonathan Kahana

I am a Computer Science PhD student at the Hebrew University of Jerusalem, under the supervision of Prof. Yedid Hoshen. My research interests lie at the intersection of machine learning and computer vision. Currently, I am working on weight space learning, aiming to understand what information can be extracted from pre-trained neural networks. Previously, I focused on representation learning, specifically learning disentangled representations.


Publications

An image showing an overview of ProbeGen

Deep Linear Probe Generators for Weight Space Learning

arXiv Preprint
Jonathan Kahana, Eliahu Horwitz, Imri Shuval, Yedid Hoshen

We conduct a study of weight space analysis methods and observe that probing is a promising approach for such tasks. However, we find that a vanilla probing approach performs no better than probing a neural network with random data. To address this, we propose "Deep Linear Probe Generators" (ProbeGen), a simple and effective modification to probing-based methods of weight space analysis. ProbeGen introduces a shared generator module with a deep linear architecture, providing an inductive bias toward structured probes. ProbeGen significantly outperforms the state-of-the-art and is highly efficient, requiring 30 to 1,000 times fewer FLOPs than other leading approaches.
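
To make the probing idea concrete, here is a minimal sketch of a deep linear probe generator feeding a queried network; the dimensions, class names, and feature head are illustrative assumptions, not the exact ProbeGen implementation:

```python
# Illustrative sketch of probing with a deep linear probe generator.
# Dimensions and names are assumptions for the example, not ProbeGen's exact code.
import torch
import torch.nn as nn


class DeepLinearProbeGenerator(nn.Module):
    """Maps learned latent codes to probe inputs through stacked linear layers
    (no nonlinearities), giving an inductive bias toward structured probes."""

    def __init__(self, n_probes=16, latent_dim=64, probe_dim=3 * 32 * 32, depth=3):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_probes, latent_dim))
        dims = [latent_dim] + [256] * (depth - 1) + [probe_dim]
        self.generator = nn.Sequential(
            *[nn.Linear(dims[i], dims[i + 1]) for i in range(depth)]
        )

    def forward(self):
        return self.generator(self.latents)  # (n_probes, probe_dim)


def probe_features(queried_model, probe_gen):
    """Feed generated probes through the analyzed network and flatten its responses."""
    probes = probe_gen().view(-1, 3, 32, 32)   # assumes the analyzed model takes 32x32 RGB inputs
    responses = queried_model(probes)          # (n_probes, n_outputs)
    return responses.flatten()                 # one feature vector describing the model
```

In this sketch, the generator and a small head on top of the probe responses would be trained jointly on a set of models whose labels (e.g., which dataset they were trained on) are known.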

An image showing an overview of ProbeX

Representing Model Weights with Language using Tree Experts

arXiv Preprint
Eliahu Horwitz*, Bar Cavia*, Jonathan Kahana*, Yedid Hoshen

We identify a key property of real-world models: most public models belong to a small set of Model Trees, where all models within a tree are fine-tuned from a common ancestor (e.g., a foundation model). Importantly, we find that there is less nuisance variation between models within the same tree than between models from different trees. We introduce Probing Experts (ProbeX), a theoretically motivated, lightweight probing method. Notably, ProbeX is the first probing method designed to learn from the weights of just a single model layer. Our results show that ProbeX can effectively map the weights of large models into a shared weight-language embedding space. Furthermore, we demonstrate the impressive generalization of our method, achieving zero-shot model classification and retrieval.
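
As a rough illustration of single-layer probing, the sketch below applies learned probe vectors to one weight matrix and projects the responses into a shared embedding space that can be matched against text embeddings; the shapes and cosine-similarity matching are assumptions for the example, not the exact ProbeX architecture:

```python
# Illustrative sketch of probing a single weight matrix and mapping it to a
# shared weight-language embedding space. Shapes and the similarity-based
# matching are assumptions, not the exact ProbeX design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SingleLayerWeightProbe(nn.Module):
    def __init__(self, in_dim, out_dim, n_probes=32, embed_dim=512):
        super().__init__()
        self.probes = nn.Parameter(torch.randn(n_probes, in_dim))   # learned probe vectors
        self.project = nn.Linear(n_probes * out_dim, embed_dim)     # maps responses to the shared space

    def forward(self, layer_weight):              # layer_weight: (out_dim, in_dim)
        responses = self.probes @ layer_weight.T  # (n_probes, out_dim)
        return F.normalize(self.project(responses.flatten()), dim=-1)


def zero_shot_classify(weight_embedding, text_embeddings):
    """Pick the class whose (normalized) text embedding is closest to the weight embedding."""
    return (text_embeddings @ weight_embedding).argmax().item()
```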

Dataset Size Recovery from LoRA Weights

arXiv Preprint
Mohammad Salama, Jonathan Kahana, Eliahu Horwitz, Yedid Hoshen

We introduce the task of dataset size recovery, which aims to determine the number of samples used to train a model based on its weights. We then propose DSiRe, a method for recovering the number of images used to fine-tune a model, in the common case where fine-tuning uses LoRA. We discover that both the norm and the spectrum of the LoRA matrices are closely linked to the fine-tuning dataset size. To evaluate dataset size recovery from LoRA weights, we develop and release a new benchmark, LoRA-WiSE, consisting of over 25,000 weight snapshots.
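
A minimal sketch of the signal DSiRe builds on, assuming a simple nearest-neighbor predictor over spectral features of the LoRA update (the feature layout and predictor are illustrative, not the paper's exact pipeline):

```python
# Illustrative sketch: the norm and spectrum of a LoRA update as features
# for dataset size recovery. The 1-NN predictor is an assumption for the example.
import numpy as np


def lora_spectral_features(lora_A, lora_B):
    """Per-layer features from a LoRA update delta_W = B @ A."""
    delta_w = lora_B @ lora_A
    singular_values = np.linalg.svd(delta_w, compute_uv=False)
    return np.concatenate([[np.linalg.norm(delta_w)], singular_values])


def predict_dataset_size(query_features, train_features, train_sizes):
    """1-nearest-neighbor regression over feature vectors of models with known sizes."""
    dists = np.linalg.norm(train_features - query_features, axis=1)
    return train_sizes[np.argmin(dists)]
```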

Recovering the Pre-Fine-Tuning Weights of Generative Models

ICML 2024
Eliahu Horwitz, Jonathan Kahana, Yedid Hoshen

The dominant paradigm in generative modeling consists of two steps: i) pre-training on a large-scale but unsafe dataset, and ii) aligning the pre-trained model with human values via fine-tuning. This practice is considered safe, as no current method can recover the unsafe, pre-fine-tuning model weights. In this paper, we demonstrate that this assumption is often false. Concretely, we present Spectral DeTuning, a method that can recover the weights of the pre-fine-tuning model using a few low-rank (LoRA) fine-tuned models. In contrast to previous attacks that attempt to recover pre-fine-tuning capabilities, our method aims to recover the exact pre-fine-tuning weights. We demonstrate this new vulnerability on large-scale models such as a personalized Stable Diffusion and an aligned Mistral.
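
The sketch below illustrates the underlying idea with a simplified alternating low-rank minimization over a single weight matrix; it is an assumption-laden toy reconstruction, not the exact Spectral DeTuning algorithm:

```python
# Illustrative sketch: recover a shared pre-fine-tuning matrix from several
# LoRA fine-tuned versions of it. This is a simplified reconstruction of the
# idea, not the published Spectral DeTuning algorithm.
import numpy as np


def low_rank(m, rank):
    """Best rank-`rank` approximation of a matrix via SVD."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def recover_pretrained(finetuned_weights, rank, n_iters=100):
    """finetuned_weights: list of matrices W_i = W_pre + B_i A_i with rank-`rank` updates."""
    w_est = np.mean(finetuned_weights, axis=0)          # initial guess
    for _ in range(n_iters):
        # Given the current estimate, fit each model's low-rank residual ...
        residuals = [low_rank(w_i - w_est, rank) for w_i in finetuned_weights]
        # ... then re-estimate the shared pre-fine-tuning weights.
        w_est = np.mean([w_i - r_i for w_i, r_i in zip(finetuned_weights, residuals)], axis=0)
    return w_est
```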

Improving Zero-Shot Models with Label Distribution Priors

arXiv Preprint
Jonathan Kahana, Niv Cohen, Yedid Hoshen

We propose a new approach for zero-shot labeling of large image datasets, CLIPPR (CLIP with Priors), which adapts zero-shot models for regression and classification on unlabelled datasets. Our method does not use any annotated images. Instead, we assume a prior over the label distribution in the dataset. We then train an adapter network on top of CLIP under two competing objectives: i) minimal change of predictions from the original CLIP model, and ii) minimal distance between the predicted and prior label distributions. Our method is effective and presents a significant improvement over the original zero-shot model.
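
A minimal sketch of the two competing objectives for the classification case, assuming KL-divergence losses and a simple weighting (the adapter architecture and the exact loss forms in the paper may differ):

```python
# Illustrative sketch of the two CLIPPR objectives; the KL forms and
# loss weighting are assumptions for the example.
import torch
import torch.nn.functional as F


def clippr_loss(adapter_logits, clip_logits, prior, weight=1.0):
    """adapter_logits / clip_logits: (batch, n_classes); prior: (n_classes,) label distribution."""
    # i) stay close to the original CLIP zero-shot predictions
    consistency = F.kl_div(
        F.log_softmax(adapter_logits, dim=-1),
        F.softmax(clip_logits, dim=-1),
        reduction="batchmean",
    )
    # ii) match the batch-averaged predicted label distribution to the assumed prior
    predicted_marginal = F.softmax(adapter_logits, dim=-1).mean(dim=0)
    prior_match = F.kl_div(predicted_marginal.log(), prior, reduction="sum")
    return consistency + weight * prior_match
```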

An image showing anomalies and pseudo-anomalies from Red PANDA

Red PANDA: Disambiguating Anomaly Detection by Removing Nuisance Factors

ICLR 2023
Niv Cohen, Jonathan Kahana, Yedid Hoshen

We present a new anomaly detection method that allows operators to exclude an attribute from being considered relevant for anomaly detection. Our approach then learns representations that do not contain information about the nuisance attributes. Anomaly scoring is performed using a density-based approach. Importantly, our approach does not require specifying the attributes that are relevant for detecting anomalies, which is typically impossible in anomaly detection, only the attributes to ignore. We present an empirical investigation verifying the effectiveness of our approach.
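
For the scoring step, a standard density-based choice is k-nearest-neighbor distance in representation space; the sketch below assumes this choice for illustration and is not necessarily the exact scoring used in the paper:

```python
# Illustrative density-based anomaly scoring: samples whose (nuisance-free)
# representations are far from the normal training data get high scores.
# The kNN scoring is a standard choice assumed for the example.
import torch


def knn_anomaly_scores(train_feats, test_feats, k=2):
    """Mean distance to the k nearest normal training representations."""
    dists = torch.cdist(test_feats, train_feats)        # (n_test, n_train)
    knn_dists, _ = dists.topk(k, dim=1, largest=False)
    return knn_dists.mean(dim=1)                        # higher = more anomalous
```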

An image illustrating DCoDR representations.

A Contrastive Objective for Learning Disentangled Representations

ECCV 2022
Jonathan Kahana, Yedid Hoshen

We present a new approach for domain disentanglement, proposing a domain-wise contrastive objective for ensuring invariant representations. In an extensive evaluation, our method convincingly outperforms the state-of-the-art in terms of representation invariance, representation informativeness, and training speed. Furthermore, we find that in some cases our method achieves excellent results even without the reconstruction constraint, leading to much faster and more resource-efficient training.
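
As a rough illustration, the sketch below computes an InfoNCE-style loss separately within each domain, so that discriminating between samples cannot rely on domain identity; the positive-pair definition and temperature are assumptions, not the exact DCoDR objective:

```python
# Illustrative domain-wise contrastive loss: negatives are drawn only from the
# same domain as the anchor. Pairing and temperature are assumptions for the example.
import torch
import torch.nn.functional as F


def domain_wise_info_nce(anchors, positives, domains, temperature=0.1):
    """anchors/positives: (batch, dim) paired representations; domains: (batch,) domain ids."""
    loss, n_domains = 0.0, 0
    for d in domains.unique():
        idx = (domains == d).nonzero(as_tuple=True)[0]
        if idx.numel() < 2:
            continue                                     # need at least one negative
        a = F.normalize(anchors[idx], dim=-1)
        p = F.normalize(positives[idx], dim=-1)
        logits = a @ p.T / temperature                   # negatives come from the same domain only
        targets = torch.arange(idx.numel(), device=a.device)
        loss = loss + F.cross_entropy(logits, targets)
        n_domains += 1
    return loss / max(n_domains, 1)
```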