vqa_benchmarking_backend.datasets.GQADataset
Module Contents
Classes
GQADataSample | Class describing one data sample of the GQA dataset
GQADataset | Class describing the GQA dataset
Functions
preprocess_question | Remove punctuation and make everything lower case
load_img | Load an image using the cv2 module
load_img_feats | Load a numpy array containing image features
- vqa_benchmarking_backend.datasets.GQADataset.preprocess_question(question: str) → List[str]
Removes punctuation and lowercases the question, returning the result as a list of tokens.
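The preprocessing described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name `preprocess_question_sketch` is hypothetical, and it assumes that "punctuation" means ASCII punctuation and that tokens are split on whitespace.

```python
import string
from typing import List

# Hypothetical re-implementation for illustration only; the library's actual
# punctuation handling and tokenization may differ.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess_question_sketch(question: str) -> List[str]:
    """Lowercase the question, drop ASCII punctuation, split on whitespace."""
    return question.lower().translate(_PUNCT_TABLE).split()


tokens = preprocess_question_sketch("Is the dog's bowl empty?")
print(tokens)  # ['is', 'the', 'dogs', 'bowl', 'empty']
```

Note that dropping punctuation outright merges "dog's" into "dogs"; a real tokenizer might treat apostrophes differently.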
- vqa_benchmarking_backend.datasets.GQADataset.load_img(path: str, transform=None) → numpy.ndarray
Loads an image using the cv2 module.
- vqa_benchmarking_backend.datasets.GQADataset.load_img_feats(path: str) → torch.FloatTensor
Loads a numpy array containing image features and returns it as a torch.FloatTensor.
- class vqa_benchmarking_backend.datasets.GQADataset.GQADataSample(question_id: str, question: str, answers: Dict[str, float], image_id: str, image_path: str, image_feat_path: str, image_transform=None)
Bases: vqa_benchmarking_backend.datasets.dataset.DataSample
Class describing one data sample of the GQA dataset. Inherits from DataSample.
- property image(self) → numpy.ndarray
Returns the image; if it has not been loaded yet, it is loaded from self._image_path.
- property question_tokenized(self) → List[str]
Returns the tokenized question.
- __str__(self)
Returns a string representation of the object.
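The load-on-first-access behaviour of the image property can be illustrated with a small stand-in class. This is a sketch under assumptions: `LazyImageSample`, its `loader` argument, and `fake_loader` are hypothetical names used only here, and whether the real GQADataSample caches the image after loading is assumed rather than documented.

```python
from typing import Callable, List, Optional


class LazyImageSample:
    """Hypothetical stand-in for a data sample; not the library's class."""

    def __init__(self, image_path: str, loader: Callable[[str], object]):
        self._image_path = image_path
        self._loader = loader  # assumed: a function such as load_img
        self._image: Optional[object] = None  # nothing loaded yet

    @property
    def image(self) -> object:
        # Load on first access only, then reuse the cached object.
        if self._image is None:
            self._image = self._loader(self._image_path)
        return self._image


calls: List[str] = []


def fake_loader(path: str) -> str:
    """Record each call so the caching behaviour is observable."""
    calls.append(path)
    return f"pixels of {path}"


sample = LazyImageSample("images/1.jpg", fake_loader)
first = sample.image   # triggers the loader
second = sample.image  # served from the cached attribute
```

Deferring the load keeps construction of a large dataset cheap, since image files are only read for samples that are actually accessed.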
- class vqa_benchmarking_backend.datasets.GQADataset.GQADataset(question_file: str, img_dir, img_feat_dir, idx2ans, name, transform=None, load_img_features=False)
Bases: vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset
Class describing the GQA dataset. Inherits from DiagnosticDataset.
- _load_data(self, question_file: str) → Tuple[List[vqa_benchmarking_backend.datasets.dataset.DataSample], Dict[str, vqa_benchmarking_backend.datasets.dataset.DataSample], vqa_benchmarking_backend.utils.vocab.Vocabulary, vqa_benchmarking_backend.utils.vocab.Vocabulary]
Loads data from GQA json files.
Returns:
data: list of GQADataSample
qid_to_sample: mapping of question id to data sample
question_vocab: Vocabulary of all unique words occurring in the data
answer_vocab: Vocabulary of all unique answers
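The four return values can be pictured with a simplified sketch. Everything here is an assumption for illustration: the function name `load_gqa_like` is hypothetical, plain dicts and sets stand in for GQADataSample and Vocabulary, the input is assumed to map each question id to an entry with "question", "answer" and "imageId" fields, and the answer score of 1.0 is a placeholder.

```python
import json
import string
from typing import Dict, List, Set, Tuple


def load_gqa_like(raw_json: str) -> Tuple[List[dict], Dict[str, dict], Set[str], Set[str]]:
    """Build sample list, id lookup, and word/answer sets from GQA-style JSON."""
    # Assumed layout: {question_id: {"question": ..., "answer": ..., "imageId": ...}}
    questions = json.loads(raw_json)
    strip_punct = str.maketrans("", "", string.punctuation)

    data: List[dict] = []
    qid_to_sample: Dict[str, dict] = {}
    question_vocab: Set[str] = set()  # stand-in for Vocabulary
    answer_vocab: Set[str] = set()    # stand-in for Vocabulary

    for qid, entry in questions.items():
        tokens = entry["question"].lower().translate(strip_punct).split()
        sample = {
            "question_id": qid,
            "tokens": tokens,
            "answers": {entry["answer"]: 1.0},  # placeholder score
            "image_id": entry["imageId"],
        }
        data.append(sample)
        qid_to_sample[qid] = sample
        question_vocab.update(tokens)
        answer_vocab.add(entry["answer"])

    return data, qid_to_sample, question_vocab, answer_vocab


raw = json.dumps({
    "q1": {"question": "Is the cat black?", "answer": "yes", "imageId": "img1"},
    "q2": {"question": "What color is the cat?", "answer": "black", "imageId": "img1"},
})
data, qid_to_sample, question_vocab, answer_vocab = load_gqa_like(raw)
```

The list and the dictionary share the same sample objects, so a sample can be reached either by position or by question id without duplicating data.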
- __getitem__(self, index) → vqa_benchmarking_backend.datasets.dataset.DataSample
Returns the data sample at the given index.
- label_from_class(self, class_index: int) → str
Gets the answer string of a given class index.
- word_in_vocab(self, word: str) → bool
Checks whether a word occurs in the Vocabulary derived from all questions.
- __len__(self)
Returns the length of the GQADataset, i.e. the number of samples in self.data.
- get_name(self) → str
Returns the name of the dataset; required for file caching.
- index_to_question_id(self, index) → str
Gets the question id of the sample at a given index.
- class_idx_to_answer(self, class_idx: int) → Union[str, None]
Gets the answer string for a given class index from the self.idx2ans dictionary.
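A lookup that returns either a string or None, as the signature above indicates, might look like the sketch below. The function name `class_idx_to_answer_sketch` is hypothetical, and the assumption that the dictionary keys are strings (as they would be after loading from a JSON file) is not confirmed by the documentation.

```python
from typing import Dict, Optional


def class_idx_to_answer_sketch(idx2ans: Dict[str, str], class_idx: int) -> Optional[str]:
    """Return the answer for a class index, or None if the index is unknown."""
    # Assumption: keys are strings, as they would be after loading from JSON.
    return idx2ans.get(str(class_idx))


idx2ans = {"0": "yes", "1": "no", "2": "black"}
known = class_idx_to_answer_sketch(idx2ans, 2)
unknown = class_idx_to_answer_sketch(idx2ans, 99)
```

Using `dict.get` avoids a KeyError for indices outside the answer set, which matches a return type that allows None.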