Notes: Few-shot learning for point cloud semantic segmentation


Date: 2023-10-26 00:38:08



Notes: Few-shot learning for tackling open-set generalization



Paper 1: Few-shot 3D Point Cloud Semantic Segmentation


Existing methods rely on large amounts of labeled training data, which are time-consuming and expensive to collect. They also follow the closed-set assumption (the training set and the test set are drawn from the same label space), so they generalize poorly.


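Proposed approach: a multi-prototype transductive inference method.
Transductive inference: predicts specific test samples directly from the specific samples that were observed. It reasons from the particular to the particular, which suits few-shot settings. Inductive inference, by contrast, first learns general rules from the training samples and then applies those rules to the test samples.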


embedding network:

three properties: 1. local geometric features; 2. global geometric features; 3. adaptation to different few-shot tasks.
DGCNN: the backbone of the feature extractor (local).
SAN (self-attention network): generates semantic features (global).
MLP: adapts to different few-shot tasks.

multi-prototype generation:

It samples a subset of n seed points from the set of support points of one class, using farthest point sampling in the embedding space (for each class in the support set, farthest point sampling draws n seed points). The farthest points represent different perspectives of one class (farthest point sampling ensures a sufficient receptive field).
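
The sampling step can be sketched as follows. This is a minimal NumPy illustration, not the paper's code: the function name `farthest_point_sampling`, the `start` index, and the use of Euclidean distance in the embedding space are assumptions made for the example.

```python
import numpy as np

def farthest_point_sampling(embeddings, n_seeds, start=0):
    """Return indices of n_seeds rows that are mutually far apart.

    embeddings: (num_points, dim) embeddings of one class's support points.
    start: index of the first seed (hypothetical parameter; any point works).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    num_points = embeddings.shape[0]
    if not 0 < n_seeds <= num_points:
        raise ValueError("n_seeds must be between 1 and num_points")
    selected = [start]
    # Distance from every point to its nearest already-selected seed.
    min_dist = np.linalg.norm(embeddings - embeddings[start], axis=1)
    for _ in range(n_seeds - 1):
        # The next seed is the point farthest from all current seeds.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return np.array(selected)

points = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [0.5, 0.0]])
seed_indices = farthest_point_sampling(points, n_seeds=3)
```

Each new seed maximizes its distance to the seeds already chosen, which is why the seeds spread over the whole class rather than clustering in one dense region.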

transductive inference:

Construct a graph over the labeled multi-prototypes and the unlabeled query points (the graph is built with k-NN), then apply transductive label propagation on it.

label propagation
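
A rough sketch of label propagation on a k-NN graph is given below. It assumes the standard closed-form formulation F = (I - alpha S)^-1 Y with a Gaussian affinity and symmetric normalization; the notes do not state which variant the paper uses, and the names `k`, `alpha`, and `sigma` are illustrative parameters.

```python
import numpy as np

def label_propagation(features, labels, num_classes, k=3, alpha=0.9, sigma=1.0):
    """Propagate labels from labeled nodes to unlabeled nodes.

    labels: class id for labeled nodes (prototypes), -1 for unlabeled nodes
    (query points). Returns one predicted class id per node.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n = features.shape[0]
    # Pairwise squared distances turned into Gaussian affinities.
    sq = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=2)
    affinity = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    # Keep each node's k strongest edges, then symmetrize the graph.
    nearest = np.argsort(-affinity, axis=1)[:, :k]
    knn_mask = np.zeros((n, n), dtype=bool)
    knn_mask[np.arange(n)[:, None], nearest] = True
    W = np.where(knn_mask | knn_mask.T, affinity, 0.0)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    # One-hot rows for labeled nodes, all-zero rows for unlabeled nodes.
    Y = np.zeros((n, num_classes))
    labeled = labels >= 0
    Y[labeled, labels[labeled]] = 1.0
    # Closed-form solution of the propagation.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)

feats = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0],
                  [5.0, 0.0], [5.1, 0.0], [5.2, 0.0]])
node_labels = np.array([0, -1, -1, 1, -1, -1])
predicted = label_propagation(feats, node_labels, num_classes=2, k=2)
```

In the toy input, one labeled node per cluster plays the role of a prototype and the remaining nodes play the role of query points.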

cross-entropy loss function:

Compute the cross-entropy loss between the predictions and the ground-truth labels.
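
For reference, a minimal sketch of a cross-entropy loss over per-point class scores. Treating the propagated scores as logits that pass through a softmax is an assumption of this example, and `cross_entropy` is a hypothetical helper name.

```python
import numpy as np

def cross_entropy(scores, targets):
    """Mean cross-entropy between softmax(scores) and integer class targets."""
    scores = np.asarray(scores, dtype=float)
    targets = np.asarray(targets)
    # Subtract the row maximum for numerical stability before the log-softmax.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

uniform_loss = cross_entropy([[0.0, 0.0], [0.0, 0.0]], [0, 1])
confident_loss = cross_entropy([[5.0, 0.0], [0.0, 5.0]], [0, 1])
```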

Paper 2: What Makes for Effective Few-shot Point Cloud Classification?


Existing methods require extensive data collection and retraining when dealing with novel classes never seen before. It is hard to transfer existing 2D methods to the 3D domain. Point clouds are more complex and have an unordered structure in Euclidean space.

3D point cloud classification

projection-based: first converts the irregular points into a regular representation such as voxels or pillars, and then applies a typical 2D or 3D CNN to extract features.
point-based: learns point-wise features with a multilayer perceptron (MLP) and aggregates a global feature with a symmetric function implemented by a max-pooling layer.
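
The point-based idea (a shared per-point MLP followed by a symmetric max-pooling) can be illustrated with a small NumPy sketch. The two-layer MLP, its sizes, and the random weights are assumptions for the example only.

```python
import numpy as np

def global_feature(points, w1, w2):
    """Shared per-point MLP followed by max-pooling over the points."""
    hidden = np.maximum(points @ w1, 0.0)      # same weights for every point
    per_point = np.maximum(hidden @ w2, 0.0)   # point-wise features
    return per_point.max(axis=0)               # symmetric aggregation

rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 16))
w2 = rng.normal(size=(16, 32))
cloud = rng.normal(size=(128, 3))              # 128 points with xyz coordinates
shuffled = cloud[rng.permutation(128)]         # same points, different order

feat_original = global_feature(cloud, w1, w2)
feat_shuffled = global_feature(shuffled, w1, w2)
```

Because the maximum over points does not depend on their order, both calls give the same global feature, which is what makes this design suitable for unordered point sets.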

2D few-shot learning

Metric-based: focuses on learning an embedding space where similar sample pairs are closer, or on designing a metric function to compare the feature similarity of samples.
Optimization-based: regards meta-learning as an optimization process.

State-of-the-art 2D FSL on Point Cloud

The paper compares metric-based and optimization-based methods and concludes that metric-based methods outperform optimization-based methods in the point cloud scenario.

Influence of Backbone Architecture on FSL

The paper selects three types of current state-of-the-art 3D point-based networks: pointwise-based, convolution-based, and graph-based (DGCNN). The graph-based network DGCNN achieves higher classification accuracy than the other networks on the two datasets.

Cross Instance Adaption (CIA) module

CIA can be inserted into existing backbones and learning frameworks to learn more discriminative representations for the support set and query set.

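The embedding module takes the support set and the query set as input and extracts features from each to obtain their prototypes. The CIA module then updates the support and query features. Next, the Euclidean distance between each class prototype and the query examples is computed in feature space, which yields the loss function to optimize.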
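
The prototype and distance steps can be sketched as below, leaving out the CIA update. Averaging support features into a class prototype and turning negative squared Euclidean distances into probabilities with a softmax are assumptions borrowed from common metric-based few-shot practice; the notes only say that Euclidean distances are computed.

```python
import numpy as np

def class_prototypes(support_feats, support_labels, num_classes):
    """Average the support features of each class into one prototype."""
    support_feats = np.asarray(support_feats, dtype=float)
    support_labels = np.asarray(support_labels)
    return np.stack([support_feats[support_labels == c].mean(axis=0)
                     for c in range(num_classes)])

def query_probabilities(query_feats, prototypes):
    """Softmax over negative squared Euclidean distances to the prototypes."""
    query_feats = np.asarray(query_feats, dtype=float)
    d2 = ((query_feats[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

support = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
protos = class_prototypes(support, [0, 0, 1, 1], num_classes=2)
probs = query_probabilities([[0.5, 1.0], [3.5, 1.0]], protos)
```

The training loss would then be the negative log-probability of each query's true class.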

Self-Channel Interaction Module: addresses the issue of subtle inter-class differences.

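Starting from the embedding space, two linear maps φ and γ produce the q vector and the k vector. The CIM's bilinear transformation then yields a channel-wise relation score map R, and a softmax over R gives the weight matrix R'. The updated vector v is the weighted sum of the original feature vector under R'. A larger v_i indicates that the channel carries more information, which helps distinguish subtle differences between classes.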
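
A loose sketch of these steps follows. Several details are assumptions, since the notes do not give them: the bilinear form is modeled as the outer product of q and k, the softmax is taken over each row of R, and the weight matrices `w_phi` and `w_gamma` are random stand-ins for the learned maps φ and γ.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_channel_interaction(feature, w_phi, w_gamma):
    """Re-weight the channels of one feature vector by channel-wise relations."""
    q = w_phi @ feature                    # q vector from linear map phi
    k = w_gamma @ feature                  # k vector from linear map gamma
    relation = np.outer(q, k)              # R: (d, d) channel-wise score map
    weights = softmax(relation, axis=1)    # R': each row sums to 1
    updated = weights @ feature            # v: weighted sum of the channels
    return updated, weights

rng = np.random.default_rng(0)
d = 8
feature = rng.normal(size=d)
v, r_prime = self_channel_interaction(
    feature, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
```

Under these assumptions each v_i is a convex combination of the original channels, so the module redistributes emphasis across channels rather than creating values outside their range.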

Cross-Instance Fusion Module: addresses the issue of high intra-class variance.

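First, the support feature and the query feature are concatenated to obtain Z. Two convolution layers then decode the concatenated feature to obtain W. A softmax over W gives a weight matrix, which is multiplied element-wise with Z to update the support feature and the query feature.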
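
The fusion steps might look roughly like the sketch below. It rests on several assumptions not stated in the notes: one support feature and one query feature are stacked along an instance axis, the two convolution layers are modeled as kernel-size-1 layers (plain matrix products with a ReLU in between), and the softmax runs over the instance axis. The weights `w1` and `w2` are hypothetical.

```python
import numpy as np

def cross_instance_fusion(support_feat, query_feat, w1, w2):
    """Fuse a support feature and a query feature through shared weights."""
    z = np.stack([support_feat, query_feat])       # Z: (2, d)
    hidden = np.maximum(w1 @ z, 0.0)               # first layer, (h, d)
    w = w2 @ hidden                                # second layer gives W, (2, d)
    w = w - w.max(axis=0, keepdims=True)           # numerical stability
    weights = np.exp(w) / np.exp(w).sum(axis=0, keepdims=True)
    updated = weights * z                          # element-wise re-weighting of Z
    return updated[0], updated[1], weights

rng = np.random.default_rng(0)
d, h = 8, 4
support_feat = rng.normal(size=d)
query_feat = rng.normal(size=d)
new_support, new_query, fusion_weights = cross_instance_fusion(
    support_feat, query_feat, rng.normal(size=(h, 2)), rng.normal(size=(2, h)))
```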

The paper also provides two datasets suited to 3D FSL: ModelNet40-FS and ShapeNet70-FS.
