MapTCHA, the open source CAPTCHA that improves OpenStreetMap
Day 1 | 10:30 | 00:25 | AW1.120 | Anna Zanchetta
Bots and spam are challenges for online platforms. Traditional CAPTCHAs help block bots, but often involve improving proprietary maps and software, while exposing user information to third-party CAPTCHA providers. OpenStreetMap (OSM) has many objects remaining to be mapped, but the quality of AI-generated objects is not high enough for direct inclusion. We introduce “MapTCHA”, a CAPTCHA that leverages the uncertainty of interpreting imagery with computer vision, and provides human verification for AI predictions: users are asked to identify images containing correctly interpreted objects, e.g. building outlines.
We separate known positive cases, where both the AI prediction and OSM contain an object, from unknown cases, where objects appear only in the prediction. We also generate known negatives from areas where objects are neither in OSM nor in the prediction. We show a mix of these images without telling the user which are which. Users are validated as human when they correctly identify the known positives and negatives, and we determine the truth of the unknown images by aggregating users’ responses through voting. When the voting indicates high confidence that an object exists, we suggest the location for OSM mapping.
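The logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the MapTCHA implementation: the names (TileKind, classify_tile, passes_challenge, record_votes, should_suggest) are hypothetical, the thresholds min_votes and min_agreement are assumed values, and the choice to count only votes from users who passed the known tiles is likewise an assumption.

```python
from enum import Enum
from typing import Dict, List, Optional, Set


class TileKind(Enum):
    KNOWN_POSITIVE = "known_positive"  # object in both the AI prediction and OSM
    KNOWN_NEGATIVE = "known_negative"  # object in neither
    UNKNOWN = "unknown"                # object only in the AI prediction


def classify_tile(in_prediction: bool, in_osm: bool) -> Optional[TileKind]:
    """Assign an image tile to one of the three groups from the abstract."""
    if in_prediction and in_osm:
        return TileKind.KNOWN_POSITIVE
    if in_prediction:
        return TileKind.UNKNOWN
    if not in_osm:
        return TileKind.KNOWN_NEGATIVE
    # In OSM but not predicted: the abstract does not cover this case,
    # so this sketch simply leaves such tiles out.
    return None


def passes_challenge(kinds: Dict[str, TileKind], selected: Set[str]) -> bool:
    """A user passes by selecting every known positive and no known negative."""
    for tile_id, kind in kinds.items():
        if kind is TileKind.KNOWN_POSITIVE and tile_id not in selected:
            return False
        if kind is TileKind.KNOWN_NEGATIVE and tile_id in selected:
            return False
    return True


def record_votes(
    kinds: Dict[str, TileKind],
    selected: Set[str],
    votes: Dict[str, List[bool]],
) -> bool:
    """Store votes on unknown tiles, but only for users who passed (assumption)."""
    if not passes_challenge(kinds, selected):
        return False
    for tile_id, kind in kinds.items():
        if kind is TileKind.UNKNOWN:
            votes.setdefault(tile_id, []).append(tile_id in selected)
    return True


def should_suggest(
    tile_votes: List[bool],
    min_votes: int = 5,          # assumed threshold
    min_agreement: float = 0.8,  # assumed threshold
) -> bool:
    """Suggest a location for OSM mapping once enough users agree it has an object."""
    if len(tile_votes) < min_votes:
        return False
    return sum(tile_votes) / len(tile_votes) >= min_agreement
```

In this sketch a challenge is a dictionary from tile identifier to TileKind, and a response is the set of tile identifiers the user selected. Requiring a minimum number of votes before suggesting a location is one simple way to express "high confidence"; a real system might weight votes or use a statistical test instead.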
Our prototype identifies buildings using aerial imagery with high enough resolution to visualise individual buildings and medium-sized objects. Image recognition is provided by fAIr, an open-source AI-assisted mapping system developed by the Humanitarian OpenStreetMap Team (HOT). It allows the training and fine-tuning of pre-trained machine learning models to segment building footprints.
Future plans include expanding to more objects and types of imagery; refining AI models; integrating MapTCHA into various login systems; and enhancing the user interface.
In this session, we will talk about how we are building this solution, how it might enhance mapping efforts in OSM that support projects on the ground, and some of the challenges of working with AI-derived data.