:mod:`vqa_benchmarking_backend.metrics.bias`
============================================

.. py:module:: vqa_benchmarking_backend.metrics.bias


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   vqa_benchmarking_backend.metrics.bias.inputs_for_question_bias_featurespace
   vqa_benchmarking_backend.metrics.bias.inputs_for_question_bias_imagespace
   vqa_benchmarking_backend.metrics.bias.inputs_for_image_bias_featurespace
   vqa_benchmarking_backend.metrics.bias._extract_subjects_and_objects_from_text
   vqa_benchmarking_backend.metrics.bias._questions_different
   vqa_benchmarking_backend.metrics.bias.inputs_for_image_bias_wordspace
   vqa_benchmarking_backend.metrics.bias.eval_bias


Attributes
~~~~~~~~~~

.. autoapisummary::

   vqa_benchmarking_backend.metrics.bias.nlp


.. data:: nlp


.. function:: inputs_for_question_bias_featurespace(current_sample: vqa_benchmarking_backend.datasets.dataset.DataSample, min_img_feat_val: torch.FloatTensor, max_img_feat_val: torch.FloatTensor, min_img_feats: int = 10, max_img_feats: int = 100, trials: int = 15) -> List[vqa_benchmarking_backend.datasets.dataset.DataSample]

   Creates inputs for measuring bias towards questions by generating random image features.

   Args:
       min_img_feat_val (img_feat_dim): vector containing the minimum value per feature dimension
       max_img_feat_val (img_feat_dim): vector containing the maximum value per feature dimension

   Returns:
       trials x [min_img_feats..max_img_feats] x img_feat_dim: tensor of randomly generated
       feature inputs in the range [min_img_feat_val, max_img_feat_val]. The number of drawn
       features (dim=1) is drawn randomly from [min_img_feats, max_img_feats].


.. function:: inputs_for_question_bias_imagespace(current_sample: vqa_benchmarking_backend.datasets.dataset.DataSample, dataset: vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset, trials: int = 15) -> List[vqa_benchmarking_backend.datasets.dataset.DataSample]

   Creates inputs for measuring bias towards questions by replacing the current sample's image
   with images drawn randomly from the dataset.
   Also checks that the labels of the current sample and the drawn samples don't overlap.


.. function:: inputs_for_image_bias_featurespace(current_sample: vqa_benchmarking_backend.datasets.dataset.DataSample, min_question_feat_val: torch.FloatTensor, max_question_feat_val: torch.FloatTensor, min_tokens: int, max_tokens: int, trials: int = 15) -> List[vqa_benchmarking_backend.datasets.dataset.DataSample]

   Creates inputs for measuring bias towards images by generating random question features.


.. function:: _extract_subjects_and_objects_from_text(text: str) -> Tuple[Set[str], Set[str]]


.. function:: _questions_different(q_a: str, q_b: str) -> bool

   Simple comparison for the semantic equality of two questions:
   tests whether the subjects and objects in the questions are the same.


.. function:: inputs_for_image_bias_wordspace(current_sample: vqa_benchmarking_backend.datasets.dataset.DataSample, dataset: vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset, trials: int = 15) -> List[vqa_benchmarking_backend.datasets.dataset.DataSample]

   Creates inputs for measuring bias towards images by replacing the current sample's question
   with questions drawn randomly from the dataset.
   Also checks that the questions don't overlap.
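The four generators above produce perturbed `DataSample` lists that are later scored with
`eval_bias` (documented below). The following minimal sketch shows the feature-space
question-bias probe; it is an illustration, not part of this module: `current_sample`, the
feature dimensionality, and the placeholder bounds are hypothetical stand-ins, and in a real
run the per-dimension bounds would be collected from the dataset's image features.

.. code-block:: python

   import torch

   from vqa_benchmarking_backend.metrics.bias import (
       inputs_for_question_bias_featurespace,
   )

   # Hypothetical setup: bounds would normally be computed over the
   # image features of the dataset under diagnosis.
   img_feat_dim = 2048                   # assumed feature dimensionality
   min_vals = torch.zeros(img_feat_dim)  # placeholder per-dimension minima
   max_vals = torch.ones(img_feat_dim)   # placeholder per-dimension maxima

   # `current_sample` is assumed to be a DataSample from a DiagnosticDataset.
   probes = inputs_for_question_bias_featurespace(
       current_sample,
       min_img_feat_val=min_vals,
       max_img_feat_val=max_vals,
       min_img_feats=10,
       max_img_feats=100,
       trials=15,
   )
   # Each returned DataSample keeps the original question but carries random
   # image features; a model that still predicts the original answer on these
   # probes is likely biased towards the question.
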
.. function:: eval_bias(dataset: vqa_benchmarking_backend.datasets.dataset.DiagnosticDataset, original_class_prediction: str, predictions: torch.FloatTensor) -> Tuple[Dict[int, float], float]

   Evaluate predictions generated with `inputs_for_question_bias_featurespace`,
   `inputs_for_question_bias_imagespace`, `inputs_for_image_bias_featurespace`
   or `inputs_for_image_bias_wordspace`.

   Args:
       predictions (trials x answer space): model predictions (probabilities)

   Returns:
       * Mapping from best prediction class -> fraction of total predictions
       * Normalized bias score (float), where 0 means no bias and 1 means 100% bias
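
As a usage illustration, the snippet below wires one of the input generators into
`eval_bias`. It is a hedged sketch: `model`, its `predict` / `predict_class` methods,
`dataset`, and `current_sample` are hypothetical stand-ins for a concrete VQA model wrapper
and a `DiagnosticDataset`, neither of which is defined by this module.

.. code-block:: python

   import torch

   from vqa_benchmarking_backend.metrics.bias import (
       eval_bias,
       inputs_for_question_bias_imagespace,
   )

   # Generate probes that keep the question fixed but swap in random images.
   probes = inputs_for_question_bias_imagespace(current_sample, dataset, trials=15)

   # Assumed model interface: `predict` returns a probability vector over the
   # answer space, `predict_class` the answer string for the original sample.
   predictions = torch.stack([model.predict(sample) for sample in probes])

   class_fractions, bias_score = eval_bias(
       dataset,
       original_class_prediction=model.predict_class(current_sample),
       predictions=predictions,  # trials x answer space
   )
   # bias_score close to 0: predictions change with the image (no question bias);
   # close to 1: the model answers the same way regardless of the image.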