vqa_benchmarking_backend.datasets.TextVQADataset
Module Contents
Classes
TextVQADataSample | Class describing one data sample of the TextVQA dataset |
TextVQADataset | Class describing the TextVQA dataset |
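Functions
preprocess_question(question: str) -> List[str] | Removes punctuation and converts everything to lowercase |
load_img(path: str, transform=None) -> numpy.ndarray | Loads an image using the cv2 module |
load_img_feats(path: str) -> torch.FloatTensor | Loads a numpy array containing image features |
answer_score(num_humans) -> float | Calculates the VQA score in [0, 1] from the number of humans who gave the same answer |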
- vqa_benchmarking_backend.datasets.TextVQADataset.preprocess_question(question: str) -> List[str]
Removes punctuation and converts everything to lowercase
- vqa_benchmarking_backend.datasets.TextVQADataset.load_img(path: str, transform=None) -> numpy.ndarray
Loads an image using the cv2 module
- vqa_benchmarking_backend.datasets.TextVQADataset.load_img_feats(path: str) -> torch.FloatTensor
Loads a numpy array containing image features from path and returns it as a torch.FloatTensor
- vqa_benchmarking_backend.datasets.TextVQADataset.answer_score(num_humans) -> float
Calculates the VQA score in [0, 1] from the number of humans who gave the same answer
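The two text-level helpers above can be approximated in a few lines. This is a minimal sketch, not the module's implementation: the names `preprocess_question_sketch` and `answer_score_sketch` are hypothetical, whitespace tokenization is an assumption, and the rule min(1, n/3) is one common VQA scoring convention that may differ from the rule used here.

```python
import string
from typing import List


def preprocess_question_sketch(question: str) -> List[str]:
    """Lowercase the question, drop ASCII punctuation, split on whitespace.

    Assumption: tokens are separated by whitespace; the real function may
    tokenize differently.
    """
    table = str.maketrans("", "", string.punctuation)
    return question.lower().translate(table).split()


def answer_score_sketch(num_humans: int) -> float:
    """Map the number of agreeing annotators to a score in [0, 1].

    Assumption: uses the common convention min(1, n / 3); the module's
    actual mapping is not shown in this reference.
    """
    return min(1.0, num_humans / 3.0)
```

Under these assumptions, a question such as "What does the sign say?" becomes five lowercase tokens, and any answer given by three or more annotators receives the full score.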
- class vqa_benchmarking_backend.datasets.TextVQADataset.TextVQADataSample(question_id: str, question: str, answers: Dict[str, float], image_id: str, image_path: str, image_feat_path: str, image_transform=None)
Bases: vqa_benchmarking_backend.datasets.dataset.DataSample
Class describing one data sample of the TextVQA dataset. Inherits from DataSample
- property image(self)
Returns the image; if it has not been loaded yet, it is loaded from self._image_path
- property question_tokenized(self) -> List[str]
Returns the tokenized question
- property question(self) -> str
Returns the full question
- __str__(self)
Returns a string representation of the object
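The lazy-loading behaviour described for the image property can be illustrated with a small stand-in class. This is a sketch under stated assumptions: `LazySample`, its `loader` argument, and the `_img` cache attribute are hypothetical and are not part of TextVQADataSample; only the attribute name `_image_path` comes from the description above.

```python
from typing import Any, Callable, Optional


class LazySample:
    """Hypothetical stand-in showing a property that loads on first access."""

    def __init__(self, question: str, image_path: str,
                 loader: Callable[[str], Any]):
        self._question = question
        self._image_path = image_path
        self._loader = loader            # assumption: injected for illustration
        self._img: Optional[Any] = None  # nothing is loaded at construction

    @property
    def image(self) -> Any:
        # Load from disk only on first access, then reuse the cached value.
        if self._img is None:
            self._img = self._loader(self._image_path)
        return self._img

    @property
    def question(self) -> str:
        return self._question

    def __str__(self) -> str:
        return f"LazySample(question={self._question!r}, image={self._image_path!r})"
```

Deferring the load keeps construction of a large dataset cheap, since image data is read only for the samples that are actually accessed.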
- class vqa_benchmarking_backend.datasets.TextVQADataset.TextVQADataset(question_file: str, img_dir, img_feat_dir, idx2ans, transform=None, load_img_features=False)
Bases: vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset
Class describing the TextVQA dataset. Inherits from DiagnosticDataset
- _load_data(self, question_file: str) -> Tuple[List[vqa_benchmarking_backend.datasets.dataset.DataSample], Dict[str, vqa_benchmarking_backend.datasets.dataset.DataSample], vqa_benchmarking_backend.utils.vocab.Vocabulary, vqa_benchmarking_backend.utils.vocab.Vocabulary]
Loads data from the TextVQA JSON files
Returns:
data: list of TextVQADataSample
qid_to_sample: mapping of question id to data sample
question_vocab: Vocabulary of all unique words occurring in the data
answer_vocab: Vocabulary of all unique answers
- __getitem__(self, index) -> vqa_benchmarking_backend.datasets.dataset.DataSample
Returns the data sample at the given index
- label_from_class(self, class_index: int) -> str
Gets the answer string for a given class index
- word_in_vocab(self, word: str) -> bool
Checks whether a word occurs in the Vocabulary derived from all questions
- __len__(self)
Returns the length of the TextVQADataset, i.e. the number of samples in self.data
- index_to_question_id(self, index) -> str
Gets the question id for a given index
- get_name(self) -> str
Returns the name of the dataset, required for file caching
- class_idx_to_answer(self, class_idx: int) -> Union[str, None]
Gets the answer string for a given class index from the self.idx2ans dictionary
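The index and class lookups listed for this class can be pictured with a toy container. This is a sketch only: `TinyDataset` is hypothetical, it stores question ids directly in `self.data` rather than full data samples, and it assumes `idx2ans` is keyed by integers (a dictionary loaded from JSON would have string keys instead).

```python
from typing import Dict, List, Optional


class TinyDataset:
    """Hypothetical stand-in mirroring the lookup methods described above."""

    def __init__(self, question_ids: List[str], idx2ans: Dict[int, str]):
        self.data = list(question_ids)  # assumption: one entry per sample
        self.idx2ans = idx2ans          # assumption: int class index -> answer

    def __len__(self) -> int:
        return len(self.data)

    def index_to_question_id(self, index: int) -> str:
        # Positional index in, question id out.
        return self.data[index]

    def class_idx_to_answer(self, class_idx: int) -> Optional[str]:
        # dict.get returns None for an unknown class index, which matches
        # the Union[str, None] return type in the signature.
        return self.idx2ans.get(class_idx)


ds = TinyDataset(["q7", "q9"], {0: "stop", 1: "exit"})
print(len(ds), ds.index_to_question_id(1), ds.class_idx_to_answer(5))
```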