vqa_benchmarking_backend.datasets.dataset

Module Contents

Classes

DataSample

Superclass for data samples

DiagnosticDataset

Superclass for custom datasets, inheriting from the original PyTorch dataset class.

DatasetModelAdapter

Superclass for model adapters.

class vqa_benchmarking_backend.datasets.dataset.DataSample(question_id: str, question: str, answers: Dict[str, float], image_id: str, image_path: str, image_feat_path: str, image_transform=None)

Superclass for data samples

property question_id(self) str

Returns: Question id as a string.

property question(self) str

Returns: Original question as string.

property answers(self) Dict[str, float]

Returns: A mapping from each ground-truth answer (as a string) to its respective score. E.g. if you have only one answer per question, the score is 1.0 and the return value might look like this:

{'yellow': 1.0}

But if you have multiple possible answers per question, some answers might be better (e.g. higher inter-annotator agreement, like in VQA2):

{'yellow': 0.9, 'gold': 0.1}
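As a rough illustration of how such a mapping could be built, the sketch below scores each distinct answer by the fraction of annotators who gave it. The helper name `answers_from_annotations` and the relative-frequency scoring are assumptions made for this example, not part of the backend; the dataset you wrap may define its scores differently.

```python
from collections import Counter
from typing import Dict, List


def answers_from_annotations(annotations: List[str]) -> Dict[str, float]:
    """Score each distinct answer by the fraction of annotators who gave it.

    Hypothetical helper: relative frequency is only one possible scoring rule.
    """
    counts = Counter(annotations)
    total = len(annotations)
    return {answer: count / total for answer, count in counts.items()}


# Nine annotators said "yellow", one said "gold".
scores = answers_from_annotations(["yellow"] * 9 + ["gold"])
```

The resulting dictionary has the shape expected from the `answers` property: answer strings as keys and float scores as values.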

property image_id(self) str

Returns: Image identifier as string. Necessary for loading the corresponding image for this question-image pair later.

property image(self) numpy.ndarray

Returns: Image data as a numpy array, e.g. loaded with the Pillow library or OpenCV. NOTE: This property has to be overridden to load your custom image data.

property image_features(self) Union[torch.FloatTensor, None]

Returns: Processed image features as a float tensor. Assigning them in the vqa_benchmarking_backend.datasets.dataset.DatasetModelAdapter.get_image_embedding method is recommended.

property question_tokenized(self) List[str]

Returns: Tokenized version of the original question.

property question_features(self) Union[None, torch.FloatTensor]

Returns: Processed question features / embedding as a float tensor. Assigning them in the vqa_benchmarking_backend.datasets.dataset.DatasetModelAdapter.get_question_embedding method is recommended.

class vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset

Bases: torch.utils.data.dataset.Dataset

Superclass for custom datasets, inheriting from the original PyTorch dataset class. As with any PyTorch dataset that uses index-based access, __len__ and __getitem__ have to be overridden. Check the PyTorch documentation for more details.

abstract __len__(self)

Returns size of dataset (how many samples). NOTE: has to be overridden!

abstract __getitem__(self, index) DataSample

Returns a single data sample. NOTE: has to be overridden!

abstract get_name(self) str

Required for file caching, e.g. naming databases and displaying corresponding entries in the webapp. NOTE: has to be overridden!

abstract class_idx_to_answer(self, class_idx: int) str

Returns: Natural language answer from a class index. NOTE: has to be overridden!
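The four methods above can be sketched as follows. To keep the example runnable without the backend or PyTorch installed, it uses hypothetical stand-ins (`ToySample`, `ToyDataset`) instead of the real `DataSample` and `DiagnosticDataset`; in real code you would inherit from `DiagnosticDataset` and return `DataSample` objects. The answer vocabulary list is also an assumption of this sketch.

```python
from typing import Dict, List, NamedTuple


class ToySample(NamedTuple):
    """Hypothetical stand-in for DataSample with only three of its fields."""
    question_id: str
    question: str
    answers: Dict[str, float]


class ToyDataset:
    """Mirrors the four methods a DiagnosticDataset subclass has to override."""

    def __init__(self, samples: List[ToySample], answer_vocab: List[str]):
        self._samples = samples
        self._answer_vocab = answer_vocab  # list position = class index

    def __len__(self) -> int:
        # Number of samples in the dataset.
        return len(self._samples)

    def __getitem__(self, index: int) -> ToySample:
        # Index-based access to a single sample.
        return self._samples[index]

    def get_name(self) -> str:
        # Name used for caching and display.
        return "toy_dataset"

    def class_idx_to_answer(self, class_idx: int) -> str:
        # Map a class index back to its natural language answer.
        return self._answer_vocab[class_idx]


dataset = ToyDataset(
    samples=[ToySample("q1", "What color is the banana?", {"yellow": 1.0})],
    answer_vocab=["yellow", "gold"],
)
```

Keeping the answer vocabulary as an ordered list is one simple way to make `class_idx_to_answer` consistent with the class indices your model outputs.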

class vqa_benchmarking_backend.datasets.dataset.DatasetModelAdapter

Superclass for model adapters. When inheriting from this class, make sure to

  • move the model to the intended device

  • move the data to the intended device inside the _forward method

abstract get_name(self) str

Required for file caching, e.g. naming databases and displaying corresponding entries in the webapp. NOTE: has to be overridden!

abstract get_output_size(self) int

Size of answer space. NOTE: has to be overridden!

abstract get_torch_module(self) torch.nn.Module

Returns the PyTorch VQA model. NOTE: has to be overridden!

train(self)

Set VQA model to train mode (for MC Uncertainty)

eval(self)

Set VQA model to eval mode

abstract get_question_embedding(self, sample: DataSample) torch.FloatTensor

Embed questions without full model forward-pass. NOTE: has to be overridden!

abstract get_image_embedding(self, sample: DataSample) torch.FloatTensor

Embed image without full model forward-pass. NOTE: has to be overridden!

abstract _forward(self, samples: List[DataSample]) torch.FloatTensor

Overwrite this function to transform a list of samples to fit your VQA model’s required input format. IMPORTANT:

  • Make sure that the outputs are probabilities, not logits!

  • Make sure that the model uses the question embedding field of each sample if it is assigned, instead of recalculating the embedding (feature space methods may have modified it)

  • Make sure that the data samples are moved to the intended device here
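The first two points can be sketched in plain Python. Everything here is a hypothetical stand-in (`ToySample`, `ToyAdapter`, the two-class scoring rule and the list-based outputs); a real adapter would subclass `DatasetModelAdapter`, work with tensors, and could use `torch.softmax(logits, dim=-1)` in place of the hand-written `softmax`. Device handling is omitted because the sketch does not use PyTorch.

```python
import math
from typing import List, Optional, Sequence


class ToySample:
    """Hypothetical stand-in for DataSample; only the fields used below."""

    def __init__(self, question: str,
                 question_features: Optional[List[float]] = None):
        self.question = question
        self.question_features = question_features


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyAdapter:
    """Hypothetical adapter around a made-up two-class 'model'."""

    def get_question_embedding(self, sample: ToySample) -> List[float]:
        # Placeholder embedding: character count and word count.
        return [float(len(sample.question)),
                float(len(sample.question.split()))]

    def _forward(self, samples: List[ToySample]) -> List[List[float]]:
        outputs = []
        for sample in samples:
            # Reuse an assigned embedding; compute one only if none is set.
            features = sample.question_features
            if features is None:
                features = self.get_question_embedding(sample)
            logits = [features[0] - features[1], features[1]]  # toy scoring
            outputs.append(softmax(logits))  # probabilities, not logits
        return outputs


adapter = ToyAdapter()
probs = adapter._forward([
    ToySample("what color?"),
    ToySample("ignored text", question_features=[0.0, 0.0]),
])
```

The second sample shows why the assigned field takes priority: its output depends only on the pre-assigned features, not on the question text.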

forward(self, samples: List[DataSample]) torch.FloatTensor

Returns the results as a samples x classes tensor of probabilities (not logits).