The statistical analyses were conducted offline on the data downloaded from the Umeng analytics server. Prior to analysis, the data were preprocessed to remove image property tags, which describe color (e.g., red, pink), action (e.g., diving, riding), object properties (e.g., tall, round), and image quality (e.g., blur, fog). These property tags were not of interest to this study.
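As a rough illustration, this preprocessing step amounts to filtering each image's tag list against a list of known property tags. The sketch below assumes a simple in-memory representation; the property tags shown are only the examples mentioned above, not the full exclusion list used in the study.

```python
# Minimal sketch of the property-tag filtering step (assumed data format).
# PROPERTY_TAGS contains only the example tags from the text, not the full list.
PROPERTY_TAGS = {"red", "pink", "diving", "riding", "tall", "round", "blur", "fog"}

def strip_property_tags(tags):
    """Return only the object tags, dropping color/action/property/quality tags."""
    return [t for t in tags if t.lower() not in PROPERTY_TAGS]

# Example: tags returned for one image
print(strip_property_tags(["water bottle", "red", "blur"]))  # ['water bottle']
```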
Because Azure's Computer Vision can recognize thousands of object types, the objects needed to be grouped into categories to gain a high-level understanding of the visual targets viewed through the video magnifier. Based on observation of the object tag data, 18 categories were initially used to group the objects. After the initial tallying was completed, the list was condensed to 11 categories by merging related and minor groups, making the data easier to interpret.
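In practice, this grouping can be represented as a lookup from recognized object tags to high-level categories. The mapping below is a hypothetical sketch; the category names and tag assignments are placeholders, and the actual 11 categories are those described in the table.

```python
# Hypothetical tag-to-category mapping; category names are placeholders only.
TAG_TO_CATEGORY = {
    "water bottle": "container",
    "apple": "food",
    "sandwich": "food",
    "text": "text",
}

def categorize(tags):
    """Map object tags to high-level categories, deduplicated per image."""
    return {TAG_TO_CATEGORY[t] for t in tags if t in TAG_TO_CATEGORY}
```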
The table lists the categories used in the final analysis and their descriptions. More than one object in an image may belong to a given category; in our analyses, a category is counted only once per image. The object categories were tallied in two ways: the number of presences (NP) and the weighted presences (WP). If a category is present in the interpretation of an image, it is tallied once toward NP, regardless of the other categories in the image. For WP, the weight of a category in an image is defined as 1 divided by the number of categories in that image. Normalizing the NP and WP category sums by the total tally gives the overall percentage of each category.
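A minimal sketch of the NP/WP tally, assuming each image is represented as a set of categories (each category counted once per image, as described above); the example images and categories at the end are hypothetical.

```python
from collections import Counter

def tally(images):
    """Tally number of presences (NP) and weighted presences (WP) per category."""
    np_counts, wp_counts = Counter(), Counter()
    for categories in images:
        weight = 1.0 / len(categories)   # WP weight: 1 / (categories in the image)
        for cat in categories:
            np_counts[cat] += 1           # NP: once per image, per category
            wp_counts[cat] += weight
    return np_counts, wp_counts

def to_percentages(counts):
    """Normalize a tally by its total to get each category's overall percentage."""
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}

# Hypothetical example with two images:
np_counts, wp_counts = tally([{"food", "text"}, {"food"}])
# NP: food=2, text=1  -> food ~66.7%, text ~33.3%
# WP: food=1.5, text=0.5 -> food 75%, text 25%
```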
One object may be associated with multiple tags. For example, a water bottle with a textual label on it may be labeled with two tags, “water bottle” and “text.” Thus, category combinations (e.g., “food_text”) were also tallied for a separate analysis of the image scenes. Images with one category or a two-category combination were investigated in this study; images with more than two categories were rare and were excluded from this analysis.
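The combination tally can be sketched as below, where a sorted, underscore-joined key is built for images with one or two categories and images with more are skipped; the key format and example categories are assumptions for illustration.

```python
def combination_key(categories):
    """Build a combination key (e.g., 'food_text') for images with 1 or 2
    categories; return None for images with more than two categories."""
    if len(categories) > 2:
        return None
    return "_".join(sorted(categories))

# Hypothetical examples
print(combination_key({"food", "text"}))            # 'food_text'
print(combination_key({"text"}))                    # 'text'
print(combination_key({"food", "text", "plant"}))   # None (excluded)
```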