# SetFit v1.0.0 Migration Guide

To update your code to work with v1.0.0, the following changes must be made:

## General Migration Guide

1. `keep_body_frozen` from `SetFitModel.unfreeze` has been deprecated. Instead, pass `"head"` or `"body"` to unfreeze one component, or pass no arguments to unfreeze both.
2. `SupConLoss` has been moved from `setfit.modeling` to `setfit.losses`. If you import it with `from setfit.modeling import SupConLoss`, use `from setfit import SupConLoss` instead.
3. `use_auth_token` has been renamed to `token` in `SetFitModel.from_pretrained()`. `use_auth_token` will keep working until the next major version, but with a warning.
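The `use_auth_token` rename is mechanical, so it can be applied to keyword arguments before they reach `SetFitModel.from_pretrained()`. The following is a minimal sketch, assuming the arguments are collected in a dictionary first; `migrate_from_pretrained_kwargs` is a hypothetical helper and not part of SetFit, and the token and device values are placeholders.

```python
from typing import Any, Dict


def migrate_from_pretrained_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `kwargs` with `use_auth_token` renamed to `token`.

    Hypothetical helper for illustration; it is not part of SetFit.
    """
    migrated = dict(kwargs)  # copy, so the caller's dict is left untouched
    if "use_auth_token" in migrated:
        value = migrated.pop("use_auth_token")
        # If both names were passed, keep the explicitly passed `token`.
        migrated.setdefault("token", value)
    return migrated


old_kwargs = {"use_auth_token": "hf_example", "device": "cpu"}  # placeholder values
new_kwargs = migrate_from_pretrained_kwargs(old_kwargs)
print(new_kwargs)

# With SetFit installed, the result could then be passed on as:
#   from setfit import SetFitModel
#   model = SetFitModel.from_pretrained("<model id>", **new_kwargs)
```

Returning a copy keeps the old dictionary usable while both spellings are still accepted.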

## Training Migration Guide

1. Replace all uses of `SetFitTrainer` with [Trainer](/docs/setfit/main/en/reference/trainer#setfit.Trainer), and all uses of `DistillationSetFitTrainer` with [DistillationTrainer](/docs/setfit/main/en/reference/trainer#setfit.DistillationTrainer).
2. Remove `num_iterations`, `num_epochs`, `learning_rate`, `batch_size`, `seed`, `use_amp`, `warmup_proportion`, `distance_metric`, `margin`, `samples_per_label` and `loss_class` from a `Trainer` initialization, and move them to a `TrainingArguments` initialization instead. This instance should then be passed to the trainer via the `args` argument.

    * `num_iterations` has been deprecated. The number of training steps should now be controlled via `num_epochs`, `max_steps` or [`EarlyStoppingCallback`](https://huggingface.co/docs/transformers/main_classes/callback#transformers.EarlyStoppingCallback).
    * `learning_rate` has been split up into `body_learning_rate` and `head_learning_rate`.
    * `loss_class` has been renamed to `loss`.

3. Stop providing training arguments like `num_epochs` directly to `Trainer.train`: pass a `TrainingArguments` instance via the `args` argument instead.
4. Refactor the multiple `trainer.train()`, `trainer.freeze()` and `trainer.unfreeze()` calls that were previously necessary to train the differentiable head into a single `trainer.train()` call by setting `batch_size` and `num_epochs` on the `TrainingArguments` dataclass to tuples. The first value in each tuple is used for training the embeddings, and the second for training the classifier.
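The steps above can be sketched as a small helper that splits old `SetFitTrainer` keyword arguments into those that stay on `Trainer` and those that move to `TrainingArguments`. This is an illustration under stated assumptions, not an official migration tool: `split_trainer_kwargs`, `build_trainer` and `PlaceholderLoss` are hypothetical names, mapping the old `learning_rate` to `body_learning_rate` is an assumption, and the tuple values are arbitrary.

```python
from typing import Any, Dict, Tuple

# Arguments that move to `TrainingArguments` without being renamed.
MOVED_UNCHANGED = {
    "num_epochs", "batch_size", "seed", "use_amp", "warmup_proportion",
    "distance_metric", "margin", "samples_per_label",
}


def split_trainer_kwargs(old: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split old `SetFitTrainer` kwargs into (Trainer kwargs, TrainingArguments kwargs)."""
    trainer_kwargs: Dict[str, Any] = {}
    args_kwargs: Dict[str, Any] = {}
    for name, value in old.items():
        if name == "loss_class":
            args_kwargs["loss"] = value  # renamed to `loss`
        elif name == "learning_rate":
            # Assumption: the old single rate is reused for the embedding body.
            args_kwargs["body_learning_rate"] = value
        elif name == "num_iterations":
            continue  # deprecated: set `num_epochs` or `max_steps` instead
        elif name in MOVED_UNCHANGED:
            args_kwargs[name] = value
        else:
            trainer_kwargs[name] = value  # e.g. model, train_dataset, metric
    return trainer_kwargs, args_kwargs


def build_trainer(old: Dict[str, Any]):
    """Build a v1.0.0 `Trainer`. Requires SetFit, so the import is kept local."""
    from setfit import Trainer, TrainingArguments

    trainer_kwargs, args_kwargs = split_trainer_kwargs(old)
    return Trainer(args=TrainingArguments(**args_kwargs), **trainer_kwargs)


class PlaceholderLoss:
    """Stand-in for a real loss class, used only to demonstrate the rename."""


old = {"metric": "accuracy", "loss_class": PlaceholderLoss,
       "learning_rate": 2e-5, "num_iterations": 20, "batch_size": 16}
trainer_kwargs, args_kwargs = split_trainer_kwargs(old)

# Differentiable head: one `train()` call, with tuples read as
# (embedding phase, classifier phase). The numbers are illustrative only.
head_args = {"batch_size": (16, 2), "num_epochs": (1, 16)}
```

Note that dropping `num_iterations` changes how long training runs, so `num_epochs` or `max_steps` should be chosen deliberately rather than left at their defaults.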

## Hard deprecations

* `SetFitBaseModel`, `SKLearnWrapper` and `SetFitPipeline` have been removed. These can no longer be used starting from v1.0.0.

## v1.0.0 Changelog

This list contains new functionality that can be used starting from v1.0.0.

* `SetFitModel.from_pretrained()` now accepts new arguments:
    * `device`: Specifies the device on which to load the SetFit model.
    * `labels`: Specify labels corresponding to the training labels - useful if the training labels are integers ranging from `0` to `num_classes - 1`. These are automatically applied on calling [SetFitModel.predict()](/docs/setfit/main/en/reference/main#setfit.SetFitModel.predict).
    * `model_card_data`: Provide a [SetFitModelCardData](/docs/setfit/main/en/reference/main#setfit.SetFitModelCardData) instance storing data such as model language, license, dataset name, etc. to be used in the automatically generated model cards.
* Certain SetFit configuration options, such as the new `labels` argument from `SetFitModel.from_pretrained()`, now get saved in `config_setfit.json` files when a model is saved. This allows `labels` to be automatically fetched when a model is loaded.
* [SetFitModel.predict()](/docs/setfit/main/en/reference/main#setfit.SetFitModel.predict) now accepts new arguments:
    * `batch_size` (defaults to `32`): The batch size to use in encoding the sentences to embeddings. Higher often means faster processing but higher memory usage.
    * `use_labels` (defaults to `True`): Whether to use the `SetFitModel.labels` to convert integer labels to string labels. Not used if the training labels are already strings.
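To make the interaction between `labels` and `use_labels` concrete, the sketch below mimics the documented behaviour in plain Python: integer class indices are mapped to label names unless `use_labels` is disabled. `apply_labels` and `predict_with_setfit` are hypothetical helpers, and the label names are made up; only the `labels`, `batch_size` and `use_labels` arguments come from the list above.

```python
from typing import List, Sequence, Union


def apply_labels(predictions: Sequence[int], labels: Sequence[str],
                 use_labels: bool = True) -> List[Union[int, str]]:
    """Mimic the documented effect of `use_labels` on integer predictions."""
    if not use_labels:
        return list(predictions)  # keep the raw class indices
    return [labels[index] for index in predictions]


labels = ["negative", "positive"]  # index 0 -> "negative", index 1 -> "positive"
raw_predictions = [1, 0, 1]
print(apply_labels(raw_predictions, labels))
print(apply_labels(raw_predictions, labels, use_labels=False))


def predict_with_setfit(model_id: str, sentences: List[str]):
    """The same idea with the real API. Requires SetFit, so the import is local."""
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained(model_id, labels=labels)
    return model.predict(sentences, batch_size=32, use_labels=True)
```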
* [SetFitModel.encode()](/docs/setfit/main/en/reference/main#setfit.SetFitModel.encode) has been introduced to convert input sentences to embeddings using the `SentenceTransformer` body.
* [SetFitModel.device](/docs/setfit/main/en/reference/main#setfit.AspectModel.device) has been introduced to determine the device of the model.
* [AbsaTrainer](/docs/setfit/main/en/reference/trainer#setfit.AbsaTrainer) and [AbsaModel](/docs/setfit/main/en/reference/main#setfit.AbsaModel) have been introduced for applying [SetFit for Aspect Based Sentiment Analysis](absa).
* [Trainer](/docs/setfit/main/en/reference/trainer#setfit.Trainer) now supports a `callbacks` argument for a list of [`transformers` `TrainerCallback` instances](https://huggingface.co/docs/transformers/main/en/main_classes/callback).
    * By default, all installed callbacks integrated with `transformers` are supported, including [`TensorBoardCallback`](https://huggingface.co/docs/transformers/main/en/main_classes/callback#transformers.integrations.TensorBoardCallback) and [`WandbCallback`](https://huggingface.co/docs/transformers/main/en/main_classes/callback#transformers.integrations.WandbCallback), which send training logs to [TensorBoard](https://www.tensorflow.org/tensorboard) and [W&B](https://wandb.ai), respectively.
    * The [Trainer](/docs/setfit/main/en/reference/trainer#setfit.Trainer) will now print `embedding_loss` in the terminal, as well as `eval_embedding_loss` if `eval_strategy` is set to `"epoch"` or `"steps"` in [TrainingArguments](/docs/setfit/main/en/reference/trainer#setfit.TrainingArguments).
* [Trainer.evaluate()](/docs/setfit/main/en/reference/trainer#setfit.Trainer.evaluate) now works with string labels.
* An updated contrastive pair sampler increases the variety of training pairs.
* [TrainingArguments](/docs/setfit/main/en/reference/trainer#setfit.TrainingArguments) supports various new arguments:
    * `output_dir`: The output directory where the model predictions and checkpoints will be written.
    * `max_steps`: If set to a positive number, the total number of training steps to perform. Overrides `num_epochs`. The training may stop before reaching the set number of steps when all data is exhausted.
    * `sampling_strategy`: The sampling strategy of how to draw pairs in training. Possible values are:

        * `"oversampling"`: Draws an even number of positive/negative sentence pairs until every sentence pair has been drawn.
        * `"undersampling"`: Draws the minimum number of positive/negative sentence pairs until every sentence pair in the minority class has been drawn.
        * `"unique"`: Draws every sentence pair combination (likely resulting in an unbalanced number of positive/negative sentence pairs).

    The default is set to `"oversampling"`, ensuring all sentence pairs are drawn at least once. Alternatively, setting `num_iterations` will override this argument and determine the number of generated sentence pairs.
    * `report_to`: The list of integrations to report the results and logs to. Supported platforms are `"azure_ml"`, `"comet_ml"`, `"mlflow"`, `"neptune"`, `"tensorboard"`, `"clearml"` and `"wandb"`. Use `"all"` to report to all integrations installed, `"none"` for no integrations.
    * `run_name`: A descriptor for the run. Typically used for [wandb](https://wandb.ai/) and [mlflow](https://www.mlflow.org/) logging.
    * `logging_strategy`: The logging strategy to adopt during training. Possible values are:

        - `"no"`: No logging is done during training.
        - `"epoch"`: Logging is done at the end of each epoch.
        - `"steps"`: Logging is done every `logging_steps`.

    * `logging_first_step`: Whether to log and evaluate the first `global_step` or not.
    * `logging_steps`: Number of update steps between two logs if `logging_strategy="steps"`.
    * `eval_strategy`: The evaluation strategy to adopt during training. Possible values are:

        - `"no"`: No evaluation is done during training.
        - `"steps"`: Evaluation is done (and logged) every `eval_steps`.
        - `"epoch"`: Evaluation is done at the end of each epoch.

    * `eval_steps`: Number of update steps between two evaluations if `eval_strategy="steps"`. Will default to the same as `logging_steps` if not set.
    * `eval_delay`: Number of epochs or steps to wait for before the first evaluation can be performed, depending on the `eval_strategy`.
    * `eval_max_steps`: If set to a positive number, the total number of evaluation steps to perform. The evaluation may stop before reaching the set number of steps when all data is exhausted.
    * `save_strategy`: The checkpoint save strategy to adopt during training. Possible values are:

        - `"no"`: No save is done during training.
        - `"epoch"`: Save is done at the end of each epoch.
        - `"steps"`: Save is done every `save_steps`.

    * `save_steps`: Number of update steps between two checkpoint saves if `save_strategy="steps"`.
    * `save_total_limit`: If a value is passed, limits the total number of checkpoints by deleting the older checkpoints in `output_dir`. Note that the best model is always preserved if `eval_strategy` is not `"no"`.
    * `load_best_model_at_end`: Whether or not to load the best model found during training at the end of training.

    

    When `load_best_model_at_end` is set to `True`, `save_strategy` must be the same as `eval_strategy`, and if it is `"steps"`, `save_steps` must be a round multiple of `eval_steps`.
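The constraint on `load_best_model_at_end` can be checked before any arguments are constructed. The function below is a minimal sketch of that rule only; `best_model_problems` is a hypothetical helper, not part of SetFit, and it skips the multiple check when either step count is left unset.

```python
from typing import List, Optional


def best_model_problems(eval_strategy: str, save_strategy: str,
                        eval_steps: Optional[int] = None,
                        save_steps: Optional[int] = None) -> List[str]:
    """List conflicts with `load_best_model_at_end=True` (empty if none)."""
    problems: List[str] = []
    if save_strategy != eval_strategy:
        problems.append(
            f"save_strategy={save_strategy!r} must match eval_strategy={eval_strategy!r}"
        )
    elif eval_strategy == "steps" and eval_steps and save_steps:
        # Checkpoints must coincide with evaluations to pick the best one.
        if save_steps % eval_steps != 0:
            problems.append(
                f"save_steps={save_steps} must be a multiple of eval_steps={eval_steps}"
            )
    return problems


print(best_model_problems("steps", "steps", eval_steps=50, save_steps=100))
print(best_model_problems("steps", "steps", eval_steps=50, save_steps=120))
print(best_model_problems("epoch", "steps"))
```

Returning a list of messages rather than raising makes it easy to report every conflict at once.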

    
* Pushing SetFit or SetFitABSA models to the Hub with `SetFitModel.push_to_hub()` or [AbsaModel.push_to_hub()](/docs/setfit/main/en/reference/main#setfit.AbsaModel.push_to_hub) now results in a detailed model card. As an example, see [this SetFitModel](https://huggingface.co/tomaarsen/setfit-paraphrase-mpnet-base-v2-sst2-8-shot) or [this SetFitABSA polarity model](https://huggingface.co/tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-polarity).
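To give a feel for the three `sampling_strategy` values listed under `TrainingArguments` above, the sketch below counts how many positive and negative pairs each strategy would draw for a toy label list. It is a simplified model, assuming pairs are unordered combinations of two distinct sentences; SetFit's actual sampler may count pairs differently, and `count_pairs` and `pairs_drawn` are hypothetical helpers.

```python
from collections import Counter
from typing import Sequence, Tuple


def count_pairs(labels: Sequence[int]) -> Tuple[int, int]:
    """Count unique (positive, negative) pairs of distinct sentences."""
    total = len(labels)
    # Positive pairs share a label: n choose 2 within each class.
    positive = sum(n * (n - 1) // 2 for n in Counter(labels).values())
    negative = total * (total - 1) // 2 - positive
    return positive, negative


def pairs_drawn(labels: Sequence[int], sampling_strategy: str = "oversampling") -> Tuple[int, int]:
    """Return (positive, negative) pair counts drawn under a strategy."""
    positive, negative = count_pairs(labels)
    if sampling_strategy == "unique":
        return positive, negative  # every combination once, possibly unbalanced
    if sampling_strategy == "oversampling":
        size = max(positive, negative)  # repeat the smaller side to balance
    elif sampling_strategy == "undersampling":
        size = min(positive, negative)  # stop once the smaller side is exhausted
    else:
        raise ValueError(f"unknown sampling_strategy: {sampling_strategy!r}")
    return size, size


toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 4 sentences per class
for strategy in ("oversampling", "undersampling", "unique"):
    print(strategy, pairs_drawn(toy_labels, strategy))
```

Under this model the toy data has 12 positive and 16 negative unique pairs, so `"oversampling"` draws 16 of each, `"undersampling"` draws 12 of each, and `"unique"` draws 12 and 16.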