January 18, 2025

Key Idea

Learning from only a few examples is called few-shot learning. If each task has n classes and k labeled examples per class, we call it “n-way k-shot learning”. The basic idea of few-shot learning is to learn how to learn. Let’s look at it in detail. Generally speaking, two types of datasets are prepared for this learning process: the support set and the query set. Both are sampled from a dataset, and we use them to learn how to learn models. First, we sample a support set from the dataset and use it to optimize the model. Second, we sample a query set from different data points and use it to test the optimized model.
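
The sampling of a support set and a query set can be sketched as follows. This is only a minimal illustration, assuming the dataset is given as an array of integer class labels; the function name `sample_episode` and the parameters `n_way`, `k_shot`, and `n_query` are hypothetical names chosen for this example, not part of any library.

```python
import numpy as np

def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Sample index sets for one episode: n_way classes, k_shot support
    and n_query query examples per class (hypothetical helper)."""
    # Pick the classes that make up this task.
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    support_idx, query_idx = [], []
    for c in classes:
        # Shuffle the examples of this class, then split them so that
        # the support and query sets never share a data point.
        idx = rng.permutation(np.flatnonzero(labels == c))
        support_idx.extend(idx[:k_shot])
        query_idx.extend(idx[k_shot:k_shot + n_query])
    return np.array(support_idx), np.array(query_idx), classes

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 20)  # toy dataset: 10 classes, 20 examples each
support_idx, query_idx, classes = sample_episode(
    labels, n_way=5, k_shot=1, n_query=3, rng=rng
)
```

Here the support set holds one example for each of five classes, and the query set holds three other examples for each of the same classes.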

There are three main types of meta-learning:

  • Learning the metric space
    • These methods learn an appropriate metric space for a dataset. In other words, they learn how data points of different classes differ from each other. Siamese networks, triplet networks, prototypical networks, and relation networks are good examples. They learn similarities or distances that are used to compare data points across classes.
  • Learning initializers
    • These methods find good initial parameters for machine learning models. In the first (inner) loop, they optimize a separate set of weights for each of several tasks on its support set. In the second (outer) loop, they update the initial parameters using the task-specific weights obtained in the first loop. By starting from these learned parameters instead of initializing the weights randomly, we can reach convergence faster. MAML, Reptile, and Meta-SGD belong to this group.
  • Learning optimizers
    • These methods learn the optimizer itself by treating it as a deep neural network. The optimizer (which is actually a neural network such as an RNN) outputs the parameter updates for the model we actually want to train, called the optimizee. To optimize the optimizer, we use gradient descent on the loss of the optimizee. This approach is complicated and difficult to use in practice.
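
To make the two loops of the "learning initializers" approach more concrete, here is a rough Reptile-style sketch on a toy problem. Everything in it is an assumption made for illustration: the tasks are 1-D linear regressions with a random slope, the model is a single weight `w`, and each outer step uses one task instead of a batch of tasks. It is not the reference implementation of Reptile or MAML.

```python
import numpy as np

def sample_task(rng):
    """Hypothetical toy task: fit y = a * x, with a task-specific slope a."""
    a = rng.uniform(2.0, 4.0)
    x = rng.uniform(-1.0, 1.0, size=10)
    return x, a * x

def inner_loop(w, x, y, lr=0.1, steps=5):
    """First loop: adapt the weight to one task with a few gradient steps."""
    for _ in range(steps):
        grad = 2.0 * np.mean((w * x - y) * x)  # gradient of the mean squared error
        w = w - lr * grad
    return w

def reptile(meta_steps=2000, meta_lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    w_init = 0.0  # the initial parameter we want to learn
    for _ in range(meta_steps):
        x, y = sample_task(rng)
        w_task = inner_loop(w_init, x, y)
        # Second loop: move the initial parameter toward the adapted weight.
        w_init += meta_lr * (w_task - w_init)
    return w_init

w_init = reptile()
```

Because the slopes are drawn from the range 2 to 4, the learned `w_init` ends up inside that range, so a new task from the same distribution needs only a few gradient steps to fit.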