2.4. Number of Floors Detector

On a randomly selected set of in-the-wild building images from Bergen, Middlesex, and Morris Counties in New Jersey, the model attains an F1-score of 86%. Here, in-the-wild building images are street-level photos that may contain multiple buildings and are captured with arbitrary camera properties. Fig. 2.4.1 shows the confusion matrix of the model inferences on this in-the-wild test set.

Fig. 2.4.1 Confusion matrix of the pretrained model on the in-the-wild test set
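For readers who want to relate a confusion matrix to a single F1-score, the following is a minimal sketch, not the evaluation code used for the results above. The source does not state which averaging scheme (macro, weighted, or other) produced the reported figures, so both common variants are shown. The function names and the three-class `example` matrix are hypothetical, and the counts are made up for illustration; they are not taken from Fig. 2.4.1.

```python
def per_class_f1(confusion):
    """Return the F1-score of each class from a square confusion matrix.

    Assumed convention: confusion[i][j] counts samples whose true class
    is i and whose predicted class is j (rows = truth, columns = prediction).
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]                               # correct predictions for class k
        fn = sum(confusion[k]) - tp                        # class k samples predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other classes predicted as class k
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(confusion):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(confusion)
    return sum(scores) / len(scores)


def weighted_f1(confusion):
    """Mean of the per-class F1-scores weighted by the number of true samples per class."""
    scores = per_class_f1(confusion)
    support = [sum(row) for row in confusion]
    return sum(s * w for s, w in zip(scores, support)) / sum(support)


# Hypothetical counts for a three-class problem (made up for illustration).
example = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]

if __name__ == "__main__":
    print("per-class F1:", per_class_f1(example))
    print("macro F1:    ", macro_f1(example))
    print("weighted F1: ", weighted_f1(example))
```

When every class has the same number of test samples, as in this made-up example, the macro and weighted averages coincide; they diverge when the class distribution is imbalanced.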

If the test images are constrained such that each image contains a single building, the building is viewed with minimal obstructions, and the image plane is nearly parallel to the frontal plane of the building facade, the F1-score of the model is 94.7%. Fig. 2.4.2 shows the confusion matrix for the pretrained model on a test set generated according to these constraints.

Fig. 2.4.2 Confusion matrix of the pretrained model on the dataset containing lightly distorted/obstructed images of individual buildings

Table 2.4.1 shows a sample of the images removed from the in-the-wild test set because they only weakly displayed the visual cues necessary for a valid number-of-floors prediction.

Table 2.4.1 In-the-wild street-level imagery removed as part of dataset cleaning
Fig. 2.4.3 Heavily occluded building facade	Fig. 2.4.4 Closely spaced buildings: obscure prediction target
Fig. 2.4.5 Significant perspective distortions	Fig. 2.4.6 Heavily occluded building facade