vqa_benchmarking_backend.metrics.sear
Module Contents
Functions
- _apply_SEAR_1(question_postagged): SEAR 1: WP VBZ -> WP’s
- _apply_SEAR_2(question_postagged): SEAR 2: What NOUN -> Which NOUN
- _apply_SEAR_3(question_tokenized): SEAR 3: color -> colour
- _apply_SEAR_4(question_postagged): SEAR 4: ADV VBZ -> ADV’s
- inputs_for_question_sears(current_sample): Creates inputs where semantically equivalent changes are applied to the input questions
- eval_sears(dataset, sear_inputs, sear_predictions, original_class_prediction): Evaluate predictions generated with inputs_for_question_sears.
- vqa_benchmarking_backend.metrics.sear._apply_SEAR_1(question_postagged: List[Tuple[str, str]])
SEAR 1: WP VBZ -> WP’s
- vqa_benchmarking_backend.metrics.sear._apply_SEAR_2(question_postagged: List[Tuple[str, str]])
SEAR 2: What NOUN -> Which NOUN
- vqa_benchmarking_backend.metrics.sear._apply_SEAR_3(question_tokenized: List[str])
SEAR 3: color -> colour
- vqa_benchmarking_backend.metrics.sear._apply_SEAR_4(question_postagged: List[Tuple[str, str]])
SEAR 4: ADV VBZ -> ADV’s
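The four rules above can be sketched as follows. These are illustrative re-implementations, not the module’s actual code; they assume Penn Treebank POS tags (WP = wh-pronoun, VBZ = 3rd-person singular present verb, and WRB, the wh-adverb tag, standing in for “ADV”) and return the rewritten question string, or None when a rule is not applicable.

```python
from typing import List, Optional, Tuple

def apply_sear_1(tagged: List[Tuple[str, str]]) -> Optional[str]:
    """SEAR 1: WP VBZ -> WP's, e.g. "What is" -> "What's" (sketch)."""
    for i in range(len(tagged) - 1):
        (tok, tag), (nxt_tok, nxt_tag) = tagged[i], tagged[i + 1]
        if tag == "WP" and nxt_tag == "VBZ" and nxt_tok.lower() == "is":
            toks = [t for t, _ in tagged]
            toks[i:i + 2] = [tok + "'s"]  # contract "What is" -> "What's"
            return " ".join(toks)
    return None  # rule not applicable

def apply_sear_2(tagged: List[Tuple[str, str]]) -> Optional[str]:
    """SEAR 2: What NOUN -> Which NOUN (sketch)."""
    for i in range(len(tagged) - 1):
        if tagged[i][0].lower() == "what" and tagged[i + 1][1].startswith("NN"):
            toks = [t for t, _ in tagged]
            # preserve the original capitalization of "what"
            toks[i] = "Which" if toks[i][0].isupper() else "which"
            return " ".join(toks)
    return None

def apply_sear_3(tokens: List[str]) -> Optional[str]:
    """SEAR 3: color -> colour (American -> British spelling, sketch)."""
    if any(t.lower() == "color" for t in tokens):
        return " ".join("colour" if t.lower() == "color" else t for t in tokens)
    return None

def apply_sear_4(tagged: List[Tuple[str, str]]) -> Optional[str]:
    """SEAR 4: ADV VBZ -> ADV's, e.g. "Where is" -> "Where's" (sketch)."""
    for i in range(len(tagged) - 1):
        (tok, tag), (nxt_tok, nxt_tag) = tagged[i], tagged[i + 1]
        if tag == "WRB" and nxt_tag == "VBZ" and nxt_tok.lower() == "is":
            toks = [t for t, _ in tagged]
            toks[i:i + 2] = [tok + "'s"]  # contract "Where is" -> "Where's"
            return " ".join(toks)
    return None
```

For example, `apply_sear_1([("What", "WP"), ("is", "VBZ"), ("this", "DT")])` yields `"What's this"`, while a question with no WP VBZ bigram yields None.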
- vqa_benchmarking_backend.metrics.sear.inputs_for_question_sears(current_sample: vqa_benchmarking_backend.datasets.dataset.DataSample) Tuple[Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None]]
Creates inputs where semantically equivalent changes are applied to the input questions
- Returns:
A tuple with 4 entries, each either a DataSample or None. The 1st entry corresponds to SEAR 1, the 2nd to SEAR 2, and so on. Note: if the entry for SEAR i is None, SEAR i was not applicable.
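A minimal sketch of how the returned 4-tuple is interpreted. `FakeSample` is a hypothetical stand-in for `vqa_benchmarking_backend.datasets.dataset.DataSample`, which is not importable here; only the positional None convention matters.

```python
from typing import Optional, Tuple

class FakeSample:
    """Hypothetical stand-in for DataSample (only the question is kept)."""
    def __init__(self, question: str):
        self.question = question

SearTuple = Tuple[Optional[FakeSample], Optional[FakeSample],
                  Optional[FakeSample], Optional[FakeSample]]

def applied_sears(sears: SearTuple) -> list:
    """Return the 1-based SEAR numbers that produced a rewritten sample.
    A None entry at tuple index i means SEAR i+1 was not applicable."""
    return [i + 1 for i, s in enumerate(sears) if s is not None]

# Example: only SEAR 1 and SEAR 3 fired for this question.
outputs: SearTuple = (FakeSample("What's the color?"), None,
                      FakeSample("What colour is it?"), None)
```

Here `applied_sears(outputs)` returns `[1, 3]`; an all-None tuple returns an empty list.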
- vqa_benchmarking_backend.metrics.sear.eval_sears(dataset: vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset, sear_inputs: Tuple[Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None]], sear_predictions: Tuple[Union[torch.FloatTensor, None]], original_class_prediction: str) Dict[str, dict]
Evaluate predictions generated with inputs_for_question_sears.
- Args:
sear_inputs: the 4 outputs generated by inputs_for_question_sears
sear_predictions: list of length 4 with entries of shape (1 x answer space): model predictions (probabilities) for the SEAR questions, or None where the corresponding SEAR input was None
- Returns:
A dictionary with information per SEAR, e.g.
sear_4: {'predicted_class': 10, 'flipped': False, 'applied': True}
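The per-SEAR entry can be sketched as below. This is an illustrative helper, not the module’s implementation: the real eval_sears receives torch.FloatTensor rows and a string class prediction, whereas a plain list of answer-space probabilities and an integer class index stand in here (assumptions).

```python
from typing import Dict, List, Optional

def eval_one_sear(probabilities: Optional[List[float]],
                  original_class: int) -> Dict[str, object]:
    """Build one per-SEAR result dict in the documented shape:
    {'predicted_class': ..., 'flipped': ..., 'applied': ...}."""
    if probabilities is None:
        # The SEAR input was None, so this rule was never applied.
        return {"predicted_class": None, "flipped": False, "applied": False}
    # argmax over the answer-space probability distribution
    predicted = max(range(len(probabilities)), key=probabilities.__getitem__)
    return {
        "predicted_class": predicted,
        # 'flipped' flags a prediction change vs. the original question
        "flipped": predicted != original_class,
        "applied": True,
    }
```

With `probabilities = [0.1, 0.7, 0.2]` and an original class of 1, the result reports `'flipped': False`; with an original class of 0 it reports `'flipped': True`.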