The paper presents a novel approach for few-shot image classification: a manifold-based meta-learning framework that leverages task-adaptive optimal transport (TA-OT) distances to measure similarity between query and support sets. The core challenge addressed is generalizing to new, unseen classes from limited labeled examples, a common problem in real-world applications where data is scarce.

The proposed methodology, termed Manifold-Regularized Few-Shot Learning with Task-Adaptive Optimal Transport (MR-FSL-TAOT), is built upon the premise that data from different classes reside on distinct, low-dimensional manifolds, and that effective few-shot learning requires learning a metric that respects the intrinsic geometry of these manifolds. Unlike traditional distance metrics (e.g., Euclidean, cosine) or fixed optimal transport applications, MR-FSL-TAOT dynamically adapts the optimal transport cost based on the characteristics of the specific few-shot task.

The framework comprises several key components:

  1. Feature Embedding Network: A standard convolutional neural network (CNN) backbone (e.g., ResNet-12) is used to extract high-dimensional feature embeddings for all input images. Let $f_\phi(\cdot)$ denote this embedding function, parameterized by $\phi$. For a given query image $x_q$ and a support set $S_k = \{(x_{k,i}, y_{k,i})\}_{i=1}^{N_s}$ for class $k$, their embeddings are $z_q = f_\phi(x_q)$ and $Z_k = \{f_\phi(x_{k,i})\}_{i=1}^{N_s}$.
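As a concrete illustration of the embedding step and the base pairwise cost it feeds into, here is a minimal NumPy sketch. The linear map `f_phi` is a hypothetical stand-in for the ResNet-12 backbone, and all dimensions are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the backbone f_phi: a fixed random linear map
# so the sketch stays self-contained (input dim 64 -> embedding dim 16).
W_embed = rng.standard_normal((64, 16))

def f_phi(x):
    """Embed a batch of flattened inputs of shape (n, 64) into (n, 16)."""
    return x @ W_embed

# One query image and a 5-shot support set for a single class k.
x_q = rng.standard_normal((1, 64))
X_k = rng.standard_normal((5, 64))

z_q = f_phi(x_q)   # (1, 16)
Z_k = f_phi(X_k)   # (5, 16)

# Base pairwise squared-Euclidean cost M_{q,k}(i, j) = ||z_q - z_{k,j}||^2.
M_qk = ((z_q[:, None, :] - Z_k[None, :, :]) ** 2).sum(-1)  # shape (1, 5)
print(M_qk.shape)
```

This base cost matrix is what the task-adaptive component of TA-OT would subsequently reweight or transform.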
  2. Task-Adaptive Optimal Transport (TA-OT) Distance: Instead of a fixed ground cost matrix for OT, MR-FSL-TAOT learns a task-specific ground cost. Given two point clouds (feature sets) $A = \{a_1, \dots, a_m\}$ and $B = \{b_1, \dots, b_n\}$ representing, for instance, a query image and a support class, the standard squared Euclidean distance $d(a_i, b_j)^2 = \|a_i - b_j\|_2^2$ is replaced by a task-adaptive version. The key insight is that different tasks (i.e., different combinations of support classes and query images) may benefit from emphasizing different feature dimensions or relationships.
The task-adaptive ground cost matrix $C$ is computed dynamically. For the distance between a query embedding $z_q$ and a support set $Z_k$, the method forms a pairwise distance matrix $M_{q,k}$ with base cost $M_{q,k}(i,j) = \|z_q - z_{k,j}\|_2^2$. TA-OT then learns a transformation or weighting of these costs conditioned on the task. While the paper describes "task-adaptive optimal transport," the adaptivity is typically realized by a small neural network or a learnable weighting of features, trained jointly with the embedding network. The optimal transport distance between two distributions $\mu_1$ and $\mu_2$ with cost matrix $C$ is defined as:
$$\text{OT}(\mu_1, \mu_2) = \min_{\mathbf{P} \in \Pi(\mu_1, \mu_2)} \sum_{i,j} P_{i,j} C_{i,j}$$
where $\Pi(\mu_1, \mu_2)$ is the set of all joint distributions whose marginals are $\mu_1$ and $\mu_2$. In the context of few-shot learning, $\mu_1$ might be a delta mass at the query embedding, and $\mu_2$ an empirical distribution over the support embeddings for a class. Here $C_{i,j}$ is the *task-adaptive* cost between the $i$-th element of the first set and the $j$-th element of the second. The paper implies that the adaptation is driven by a learnable component that weights the features or projects them into a more discriminative space for the given task.
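The OT problem above can be solved in practice with an entropy-regularized Sinkhorn iteration. The summary does not specify the paper's solver, so the solver below, the random cost matrix, and the regularization strength are all assumptions made purely for illustration; the delta-mass query vs. empirical support distribution setup mirrors the description above.

```python
import numpy as np

def sinkhorn_ot(C, mu1, mu2, eps=0.1, n_iter=200):
    """Entropy-regularized OT for cost matrix C of shape (m, n).

    mu1, mu2 are the marginal weight vectors. Returns the transport cost
    <P, C> and the plan P in Pi(mu1, mu2). Standard Sinkhorn iteration;
    not necessarily the paper's exact solver.
    """
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(mu1)
    for _ in range(n_iter):
        v = mu2 / (K.T @ u)           # match column marginals
        u = mu1 / (K @ v)             # match row marginals
    P = u[:, None] * K * v[None, :]
    return (P * C).sum(), P

rng = np.random.default_rng(0)
C = rng.random((1, 5))                # a task-adaptive cost would replace this
mu1 = np.array([1.0])                 # delta mass at the query embedding
mu2 = np.full(5, 0.2)                 # uniform over 5 support embeddings
dist, P = sinkhorn_ot(C, mu1, mu2)
print(np.allclose(P.sum(axis=0), mu2))
```

With a single-point query, the only feasible plan is $\mu_2$ itself, so the OT distance reduces to the $\mu_2$-weighted average cost; with multiple query elements the coupling becomes nontrivial.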

  3. Manifold Regularization: To enforce that learned embeddings preserve the intrinsic geometry of the data, the framework incorporates a manifold-aware regularization term. This term encourages feature embeddings of nearby samples (in the original data space) to remain close in the embedding space, while allowing distant samples to be appropriately separated. This is often achieved with a graph Laplacian regularizer. Given a graph $G=(V,E)$ constructed on the embedded features, with affinity matrix $W$ (e.g., from k-NN or $\epsilon$-ball graphs), the manifold regularization term $R_{man}$ is defined as:
$$R_{man}(\phi) = \frac{1}{2} \sum_{i,j} W_{i,j} \|f_\phi(x_i) - f_\phi(x_j)\|_2^2 = \operatorname{Tr}(F^\top L F)$$
where $F$ is the matrix of embedded features and $L = D - W$ is the graph Laplacian, with $D$ the diagonal degree matrix. This term ensures that the embedding respects the local neighborhood structure, preventing collapse and promoting discriminative feature representations.
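The two forms of $R_{man}$ above (pairwise sum vs. trace) are equal for any symmetric affinity matrix, which the following sketch verifies numerically. The random symmetric $W$ stands in for a k-NN affinity graph; sizes are illustrative.

```python
import numpy as np

def manifold_reg(F, W):
    """R_man = Tr(F^T L F) with graph Laplacian L = D - W."""
    D = np.diag(W.sum(axis=1))        # diagonal degree matrix
    L = D - W
    return np.trace(F.T @ L @ F)

rng = np.random.default_rng(0)
F = rng.standard_normal((6, 4))       # 6 embedded samples, embedding dim 4
A = rng.random((6, 6))
W = (A + A.T) / 2                     # symmetric affinity (k-NN-style stand-in)
np.fill_diagonal(W, 0.0)

# Equivalent pairwise form: 0.5 * sum_ij W_ij ||f_i - f_j||^2.
pairwise = 0.5 * sum(
    W[i, j] * np.sum((F[i] - F[j]) ** 2)
    for i in range(6) for j in range(6)
)
print(np.isclose(manifold_reg(F, W), pairwise))
```

Since $L$ is positive semidefinite for nonnegative symmetric $W$, the regularizer is always nonnegative, so it can be added to the loss without sign bookkeeping.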

  4. Overall Objective Function: The training objective combines the classification loss (based on the TA-OT distances) with the manifold regularization. During meta-training, each episode presents the model with a batch of few-shot tasks. For a given query $x_q$ with true label $y_q$, the probability that $x_q$ belongs to class $k$ decreases with its TA-OT distance to the support set of class $k$; concretely, a softmax over negative distances is typically used:
$$P(y_q = k \mid x_q, S) = \frac{\exp(-\lambda \cdot \text{TA-OT}(f_\phi(x_q), Z_k))}{\sum_{c=1}^{C} \exp(-\lambda \cdot \text{TA-OT}(f_\phi(x_q), Z_c))}$$
where $\lambda$ is a learnable scaling factor. The primary loss is the negative log-likelihood of the true class.
The final episodic meta-learning objective function for training is:
$$\mathcal{L}(\phi) = \mathcal{L}_{cls}(\phi) + \alpha \cdot R_{man}(\phi)$$
where $\mathcal{L}_{cls}$ is the classification loss and $\alpha$ is a hyperparameter balancing the two terms. The manifold regularization term $R_{man}$ is applied to all samples in the training episode.
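Assuming the TA-OT distances for an episode have already been computed, the combined objective $\mathcal{L}_{cls} + \alpha R_{man}$ can be sketched as below. The distance values, $\lambda$, $\alpha$, and the precomputed regularizer value are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def episode_loss(dists, y_q, R_man, lam=1.0, alpha=0.1):
    """NLL of a softmax over negative lam-scaled distances, plus alpha * R_man.

    dists: (n_query, n_classes) TA-OT distances; y_q: true class indices.
    """
    logits = -lam * dists
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -log_p[np.arange(len(y_q)), y_q].mean()
    return nll + alpha * R_man

dists = np.array([[0.2, 1.5, 2.0],    # query 0 is closest to class 0
                  [1.8, 0.3, 2.2]])   # query 1 is closest to class 1
y_q = np.array([0, 1])
print(episode_loss(dists, y_q, R_man=0.5) > 0)
```

Note that smaller distances yield larger logits, so correctly labeled queries near their support sets drive the loss down, as intended.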

During meta-training, the model is exposed to a large number of few-shot tasks sampled from the base classes. During meta-testing, the trained feature embedding network and TA-OT distance are applied directly to new, unseen classes: each query image is assigned to the class whose support set has the minimum TA-OT distance.
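The meta-test decision rule is thus a nearest-class assignment under the TA-OT distance. A minimal sketch, with a hypothetical precomputed query-by-class distance matrix:

```python
import numpy as np

def classify_queries(dists):
    """Assign each query the class with minimum TA-OT distance.

    dists: (n_query, n_classes) matrix of precomputed TA-OT distances.
    """
    return dists.argmin(axis=1)

dists = np.array([[0.2, 1.5],
                  [1.8, 0.3]])
print(classify_queries(dists))  # -> [0 1]
```

No gradient steps are taken at meta-test time; only the learned embedding and distance are reused.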

Experimental results on standard few-shot benchmarks like MiniImageNet and tieredImageNet demonstrate that MR-FSL-TAOT achieves superior performance compared to state-of-the-art baselines. This improvement is attributed to the synergistic benefits of learning task-adaptive distance metrics that are geometrically consistent with the underlying data manifolds, leading to more robust and discriminative feature representations for few-shot scenarios. The ablation studies confirm the importance of both the task-adaptive optimal transport and the manifold regularization components to the overall performance.