Write a Model Adapter
=====================

The interface between your model and our metrics and datasets is ``vqa_benchmarking_backend.datasets.dataset.DatasetModelAdapter``.
An adapter wraps a model and must return a probability distribution from its ``_forward()`` function.
During metric calculation, ``_forward()`` receives a list of ``DataSample`` objects that need to be transformed to fit your model's expected input.
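
The difference between raw model scores (logits) and a probability distribution can be illustrated with a minimal sketch. Plain Python is used here so that it runs without PyTorch; the helper names ``softmax`` and ``is_probability_distribution`` and the example scores are assumptions made for illustration and are not part of the library.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores (logits) into a probability distribution."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_probability_distribution(values: List[float], tol: float = 1e-6) -> bool:
    """Check that all entries lie in [0, 1] and sum to 1 (within tol)."""
    in_range = all(0.0 <= v <= 1.0 for v in values)
    return in_range and abs(sum(values) - 1.0) <= tol


logits = [2.0, -1.0, 0.5]  # hypothetical raw scores for three answers
probs = softmax(logits)

print(is_probability_distribution(logits))  # raw logits fail the check
print(is_probability_distribution(probs))   # softmax output passes the check
```

In an adapter built on PyTorch, the same conversion is typically a single call such as ``logits.softmax(dim=-1)``.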

Some general getter functions are required: ``get_name()`` returns the adapter's name, ``get_output_size()`` the output size, and ``get_torch_module()`` the model itself.
The functions ``get_question_embedding()`` and ``get_image_embedding()`` should fill the ``question_features``
and ``image_features`` properties of a ``DataSample`` object, respectively, to enable caching and applying noise to the feature representations.
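
The caching behaviour described above can be sketched with a toy example. ``ToySample`` and ``toy_embed`` are hypothetical stand-ins (not library classes) that only mimic the ``question_features`` field; the point is that features are computed once, and that a later modification of the stored features is what the getter returns.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToySample:
    """Hypothetical stand-in for DataSample with only the fields needed here."""
    question_tokenized: List[str]
    question_features: Optional[List[float]] = None


embed_calls = 0  # counts how often the (expensive) embedding is computed


def toy_embed(tokens: List[str]) -> List[float]:
    """Placeholder embedding: one number per token (its length)."""
    global embed_calls
    embed_calls += 1
    return [float(len(token)) for token in tokens]


def get_question_embedding(sample: ToySample) -> List[float]:
    # Compute the features only if nothing has been stored on the sample yet.
    if sample.question_features is None:
        sample.question_features = toy_embed(sample.question_tokenized)
    return sample.question_features


sample = ToySample(question_tokenized=["what", "color", "is", "it"])
first = get_question_embedding(sample)   # computes and stores the features

# A feature-space method could now perturb the stored features.
sample.question_features = [value + 0.5 for value in sample.question_features]

second = get_question_embedding(sample)  # returns the perturbed features
print(first, second, embed_calls)
```

Because the getter returns whatever is stored on the sample, a forward pass that goes through the getter sees the perturbed features rather than freshly computed ones.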

Once you have created a ``DatasetModelAdapter`` such as the example below, you can start evaluating (see :ref:`Evaluate Metrics`).

.. code-block:: python

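    from typing import List

    import torch
    from torch.nn.utils.rnn import pad_sequence

    from vqa_benchmarking_backend.datasets.dataset import DataSample, DatasetModelAdapter

    # Vocabulary, myModel, setup and extract_feat_in_memory are specific to this example
    # and have to be imported from your own model and feature extractor code.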

    class MyModelAdapter(DatasetModelAdapter):
        """
        NOTE: when inheriting from this class, make sure to

            * move the model to the intended device
            * move the data to the intended device inside the _forward method
        """
        def __init__(self,
                     device: str,
                     vocab: Vocabulary,
                     ckpt_file: str,  # no default: a parameter with a default may not precede required ones
                     name: str,
                     n_classes: int) -> None:

            self.device = device
            self.vocab = vocab
            self.name = name
            self.n_classes = n_classes
            self.vqa_model = myModel().to(device) # the pytorch instance of the VQA model
            self.vqa_model.load_state_dict(torch.load(ckpt_file, map_location=device)['state_dict'])

            gpu_id = int(device.split(':')[1]) # cuda:ID -> ID
            self.img_feat_extractor, self.img_feat_cfg = setup("bottomupattention/configs/bua-caffe/extract-bua-caffe-r101.yaml", 10, 100, gpu_id)  # in this example, we load an external image feature extractor


        def get_name(self) -> str:
            # needed for file caching, has to be overridden
            return self.name

        def get_output_size(self) -> int:
            # number of classes in prediction, has to be overridden
            return self.n_classes

        def get_torch_module(self) -> torch.nn.Module:
            # return the PyTorch VQA model, has to be overridden
            return self.vqa_model

        def question_token_ids(self, question_tokenized: List[str]) -> torch.LongTensor:
            # helper function to get token ids as input to our VQA model, custom to this example
            return torch.tensor([self.vocab.stoi(token) if self.vocab.exists(token) else self.vocab.stoi('UNK') for token in question_tokenized], dtype=torch.long)

        def get_question_embedding(self, sample: DataSample) -> torch.FloatTensor:
            # embed questions without a full model forward pass, has to be overridden
            if sample.question_features is None:
                sample.question_features = self.vqa_model.embedding(self.question_token_ids(sample.question_tokenized).to(self.device)).cpu()
            return sample.question_features

        def get_image_embedding(self, sample: DataSample) -> torch.FloatTensor:
            # embed images without a full model forward pass, has to be overridden
            # in this example, the feature extractor is external
            if sample.image_features is None:
                sample.image_features = extract_feat_in_memory(self.img_feat_extractor, sample._image_path, self.img_feat_cfg)['x'].cpu()
            return sample.image_features

        def _forward(self, samples: List[DataSample]) -> torch.FloatTensor:
            """
            Override this function to run a forward pass on a list of samples using your model.
            IMPORTANT:
                * Make sure that the outputs are probabilities, not logits!
                * Make sure to use each sample's stored question embedding, if assigned, instead of recalculating it (feature-space methods may have modified it)
                * Make sure that the data is moved to the intended device here
            """
            q_feats = pad_sequence(sequences=[self.get_question_embedding(sample).to(self.device) for sample in samples], batch_first=True) # extract question features
            img_feats = pad_sequence(sequences=[self.get_image_embedding(sample).to(self.device) for sample in samples], batch_first=True) # extract image features

            logits = self.vqa_model(img_feats, q_feats) # run forward pass of our VQA model
            probs = logits.softmax(dim=-1) # convert model outputs to probability distribution across answer space

            return probs
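
The batching step in ``_forward`` relies on padding, since questions (and image region sets) can differ in length between samples. The following plain-Python sketch mirrors what ``pad_sequence(..., batch_first=True)`` does with its default padding value of zero; ``pad_batch`` and the example values are hypothetical and only serve to show the resulting batch layout.

```python
from typing import List

Vector = List[float]


def pad_batch(sequences: List[List[Vector]], pad_value: float = 0.0) -> List[List[Vector]]:
    """Pad every sequence with constant vectors up to the longest length in the batch."""
    max_len = max(len(seq) for seq in sequences)
    dim = len(sequences[0][0])  # feature size, assumed equal for all entries
    padded = []
    for seq in sequences:
        filler = [[pad_value] * dim for _ in range(max_len - len(seq))]
        padded.append(seq + filler)
    return padded


# Two hypothetical question embeddings with 3 and 1 tokens, feature size 2.
batch = pad_batch([
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.7, 0.8]],
])
shape = (len(batch), len(batch[0]), len(batch[0][0]))
print(shape)  # (batch size, longest sequence, feature size)
```

If your model is sensitive to padded positions, it may additionally need a mask or the original sequence lengths; whether that applies depends on your architecture.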