Data Entry in 2026

The Task

Describe 5,000 heritage photographs in a spreadsheet with 96 fields per image covering subject, dates, and contents.

The Problem

  • AI gave generic, unusable descriptions because the photos are highly non-standard and require a heritage specialist
  • The researcher needed half an hour per image (roughly 10 minutes of that was just entering data into the spreadsheet)
  • The process was exhausting because the researcher had to navigate a massive table and jump between multiple tabs repeatedly
  • The resulting spreadsheet had many gaps and inconsistent values

The Solution

My first instinct was to try automating the entire pipeline (research + linking + table filling) with AI, but after seeing the researcher's reaction to a few test runs, I knew there was no hope of automating the research step 🤷‍♂️

(He was actually enjoying the manual research 😎🙄)

For the data entry part, I was inspired by a neat feature from Todoist:

The app extracts date, time, recurrence, location, and priority from free-form text… with zero AI!

So I built a simple interface that shows images one by one with a single text input box beside each (instead of 96 fields).

The researcher describes the image in free-form text without navigating a huge spreadsheet, and the script captures the necessary information and fills the fields behind the scenes, consistently and reliably.
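As a rough illustration of this kind of rule-based extraction (not the actual script), here is a minimal Python sketch. The field names (`date_from`, `date_to`, `place`, `tags`), the `#tag` convention, and the "in Place" rule are all assumptions made up for the example; the real project fills 96 fields and will need many more rules.

```python
import re

# Hypothetical rules for illustration only; each one feeds a spreadsheet column.
DECADE_RE = re.compile(r"\b(1[89]\d0|20\d0)s\b")                 # e.g. "1930s"
YEAR_RE = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")                # e.g. "1935"
PLACE_RE = re.compile(r"\bin ([A-Z][\w-]*(?: [A-Z][\w-]*)*)")    # e.g. "in Old Harbour"
TAG_RE = re.compile(r"#(\w+)")                                   # e.g. "#boats"


def parse_description(text: str) -> dict:
    """Pull structured fields out of one free-form description."""
    # Every field starts empty, so gaps stay explicit instead of silently missing.
    fields = {"date_from": None, "date_to": None, "place": None, "tags": []}

    decade = DECADE_RE.search(text)
    year = YEAR_RE.search(text)
    if decade:
        # A decade such as "1930s" becomes the range 1930 to 1939.
        start = int(decade.group(1))
        fields["date_from"], fields["date_to"] = start, start + 9
    elif year:
        # A single year becomes a one-year range.
        fields["date_from"] = fields["date_to"] = int(year.group(1))

    place = PLACE_RE.search(text)
    if place:
        fields["place"] = place.group(1)

    fields["tags"] = TAG_RE.findall(text)
    return fields


print(parse_description("Fishermen mending nets in Old Harbour, 1930s #boats #work"))
```

Because the rules are plain patterns, the same sentence always produces the same values, which is where the consistency comes from.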

The researcher sees the final result and confirms.
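The confirm step could look something like the sketch below: show the parsed values, flag anything missing, and write the row only after an explicit yes. The column names and the `preview` and `commit` helpers are hypothetical, and a CSV stream stands in for the real spreadsheet.

```python
import csv
import io

# Hypothetical subset of the 96 columns, for illustration only.
COLUMNS = ["image_id", "date_from", "date_to", "place", "tags"]


def preview(row: dict) -> str:
    """Render the parsed fields for review, marking empty ones clearly."""
    lines = []
    for col in COLUMNS:
        value = row.get(col)
        shown = value if value not in (None, "", []) else "(missing)"
        lines.append(f"{col}: {shown}")
    return "\n".join(lines)


def commit(row: dict, confirmed: bool, out) -> bool:
    """Append the row to the output only if the researcher confirmed it."""
    if not confirmed:
        return False
    writer = csv.DictWriter(out, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writerow({col: row.get(col, "") for col in COLUMNS})
    return True


row = {"image_id": "IMG_0001", "date_from": 1930, "date_to": 1939,
       "place": "", "tags": "boats;work"}
out = io.StringIO()
print(preview(row))           # the researcher reads this before confirming
commit(row, confirmed=True, out=out)
```

Keeping the write behind a single function means nobody edits the spreadsheet by hand, so a rejected or half-filled row never reaches it.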

The Real Win

Great… but the win isn't complete if the researcher types slowly, so I added a speech-to-text tool. That's when we hit the magic result:

The researcher describes the image aloud and the filled table appears (data entry went from 10 minutes down to 30 seconds per image).
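To put that in perspective across the whole collection, here is a quick back-of-the-envelope calculation. It assumes the per-image figures above hold for all 5,000 photographs, which is an extrapolation rather than a measured result.

```python
IMAGES = 5_000
ENTRY_BEFORE_MIN = 10    # data entry per image, typing into the spreadsheet
ENTRY_AFTER_MIN = 0.5    # data entry per image, voice plus automatic filling


def total_hours(minutes_per_image: float, images: int = IMAGES) -> float:
    """Total data-entry time for the whole collection, in hours."""
    return minutes_per_image * images / 60


before = total_hours(ENTRY_BEFORE_MIN)
after = total_hours(ENTRY_AFTER_MIN)
print(f"before: {before:.1f} h, after: {after:.1f} h, saved: {before - after:.1f} h")
# prints "before: 833.3 h, after: 41.7 h, saved: 791.7 h"
```

The research itself (the other 20 or so minutes per image) is untouched; only the data entry shrinks.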

Bonus Wins 🍒🍰

  • This pattern shows up elsewhere too: in Automated Weekly Reports from WhatsApp, I used the same “meet people where they are” principle to turn messy daily messages into clean reporting.
  • Working through the UI is safer than working directly with the spreadsheet and reduces errors
  • The process became smooth enough to onboard volunteer researchers without giving them all access to the giant spreadsheet
  • Volunteers are more likely to stay and be productive rather than being demotivated and intimidated by the ugly spreadsheet
  • Results became more consistent and complete

What's Next

Once a good-sized set of images has been described this way, we may be able to train an AI model to continue the process with the same style and quality, leaving the researcher to just review and approve.

Fingers crossed.