vqa_benchmarking_backend.metrics.sear

Module Contents

Functions

_apply_SEAR_1(question_postagged: List[Tuple[str, str]])

SEAR 1: WP VBZ -> WP’s

_apply_SEAR_2(question_postagged: List[Tuple[str, str]])

SEAR 2: What NOUN -> Which NOUN

_apply_SEAR_3(question_tokenized: List[str])

SEAR 3: color -> colour

_apply_SEAR_4(question_postagged: List[Tuple[str, str]])

SEAR 4: ADV VBZ -> ADV’s

inputs_for_question_sears(current_sample: vqa_benchmarking_backend.datasets.dataset.DataSample) → Tuple[Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None]]

Creates inputs where semantically equivalent changes (SEARs) are applied to the input question

eval_sears(dataset: vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset, sear_inputs: Tuple[Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None]], sear_predictions: Tuple[Union[torch.FloatTensor, None]], original_class_prediction: str) → Dict[str, dict]

Evaluate predictions generated with inputs_for_question_sears.

vqa_benchmarking_backend.metrics.sear._apply_SEAR_1(question_postagged: List[Tuple[str, str]])

SEAR 1: WP VBZ -> WP’s

vqa_benchmarking_backend.metrics.sear._apply_SEAR_2(question_postagged: List[Tuple[str, str]])

SEAR 2: What NOUN -> Which NOUN

vqa_benchmarking_backend.metrics.sear._apply_SEAR_3(question_tokenized: List[str])

SEAR 3: color -> colour

vqa_benchmarking_backend.metrics.sear._apply_SEAR_4(question_postagged: List[Tuple[str, str]])

SEAR 4: ADV VBZ -> ADV’s
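The four `_apply_SEAR_*` helpers are only documented by their one-line rewrite rules above. A minimal, self-contained sketch of how such rules could operate on (token, POS-tag) pairs follows; it assumes Penn Treebank tags (`WP`, `VBZ`, `NN*`, `RB`/`WRB` for the "ADV" rule) and the function names and details here are illustrative, not the library's actual implementation:

```python
from typing import List, Optional, Tuple

Tagged = List[Tuple[str, str]]  # (token, POS tag) pairs

def apply_sear_1(tagged: Tagged) -> Optional[List[str]]:
    """SEAR 1: WP VBZ -> WP's (e.g. 'What is' -> 'What's')."""
    for i in range(len(tagged) - 1):
        (tok, tag), (_, nxt_tag) = tagged[i], tagged[i + 1]
        if tag == 'WP' and nxt_tag == 'VBZ':
            tokens = [t for t, _ in tagged]
            return tokens[:i] + [tok + "'s"] + tokens[i + 2:]
    return None  # rule not applicable -> mirrors the None tuple entries

def apply_sear_2(tagged: Tagged) -> Optional[List[str]]:
    """SEAR 2: What NOUN -> Which NOUN."""
    for i in range(len(tagged) - 1):
        (tok, _), (_, nxt_tag) = tagged[i], tagged[i + 1]
        if tok.lower() == 'what' and nxt_tag.startswith('NN'):
            tokens = [t for t, _ in tagged]
            tokens[i] = 'Which' if tok[0].isupper() else 'which'
            return tokens
    return None

def apply_sear_3(tokens: List[str]) -> Optional[List[str]]:
    """SEAR 3: color -> colour (operates on plain tokens)."""
    if any(t.lower() == 'color' for t in tokens):
        return [t.replace('color', 'colour').replace('Color', 'Colour')
                for t in tokens]
    return None

def apply_sear_4(tagged: Tagged) -> Optional[List[str]]:
    """SEAR 4: ADV VBZ -> ADV's (e.g. 'Where is' -> 'Where's')."""
    for i in range(len(tagged) - 1):
        (tok, tag), (_, nxt_tag) = tagged[i], tagged[i + 1]
        if tag in ('RB', 'WRB') and nxt_tag == 'VBZ':
            tokens = [t for t, _ in tagged]
            return tokens[:i] + [tok + "'s"] + tokens[i + 2:]
    return None
```

Each helper returns None when its rule does not match, which is what allows the downstream tuple to carry None for inapplicable SEARs.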

vqa_benchmarking_backend.metrics.sear.inputs_for_question_sears(current_sample: vqa_benchmarking_backend.datasets.dataset.DataSample) → Tuple[Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None]]

Creates inputs where semantically equivalent changes (SEARs) are applied to the input question

Returns:

A tuple with 4 entries, each either of type DataSample or None. The 1st entry corresponds to SEAR 1, the 2nd entry to SEAR 2, and so on. Note: if the value at position i in the tuple is None, SEAR i was not applicable.
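A typical consumer of this 4-tuple skips the None entries while keeping track of which SEAR each position corresponds to. A small illustrative helper (not part of the library; plain strings stand in for DataSample objects):

```python
def applicable_sears(sear_samples):
    """Return (1-based SEAR index, sample) pairs for the rules that applied.

    sear_samples is the 4-tuple returned by inputs_for_question_sears,
    where None at position i means SEAR i was not applicable.
    """
    return [(i, s) for i, s in enumerate(sear_samples, start=1)
            if s is not None]
```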

vqa_benchmarking_backend.metrics.sear.eval_sears(dataset: vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset, sear_inputs: Tuple[Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None], Union[vqa_benchmarking_backend.datasets.dataset.DataSample, None]], sear_predictions: Tuple[Union[torch.FloatTensor, None]], original_class_prediction: str) → Dict[str, dict]

Evaluate predictions generated with inputs_for_question_sears.

Args:

sear_inputs: the 4 outputs generated by inputs_for_question_sears

sear_predictions: List[(1 x answer space)] of length 4: model predictions (probabilities) for the SEAR questions, or None where the corresponding SEAR input was None

Returns:

A dictionary with information per SEAR, e.g.

sear_4: {
    'predicted_class': 10, 'flipped': False, 'applied': True
}
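Given a result dictionary of this shape, a common summary statistic is the flip rate over the SEARs that actually applied. A hedged sketch (the `sear_flip_rate` helper is hypothetical, not part of the library; it assumes each per-SEAR entry carries the 'flipped' and 'applied' keys shown above):

```python
def sear_flip_rate(sear_results: dict) -> float:
    """Fraction of applied SEARs whose predicted class flipped.

    Lower values indicate a model more robust to semantically
    equivalent rephrasings of the question.
    """
    applied = [r for r in sear_results.values() if r['applied']]
    if not applied:
        return 0.0  # no SEAR applied to this question
    return sum(r['flipped'] for r in applied) / len(applied)

# Example result in the format returned by eval_sears:
results = {
    'sear_1': {'predicted_class': 10, 'flipped': False, 'applied': True},
    'sear_2': {'predicted_class': None, 'flipped': False, 'applied': False},
    'sear_4': {'predicted_class': 3, 'flipped': True, 'applied': True},
}
```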