Thoughtful Digitization: Why Archivists Look Beyond the Database
It might be counterintuitive, but archival theory teaches us to save the originals after we digitize documents.
Before the pandemic, I was researching a sculpture on campus for the archive where I worked. The sculpture was made of Corten steel. If you know anything about Corten, you know that it is supposed to rust. That rusted look is the entire point of the material. The rust forms a protective layer and gives the sculpture its distinctive color and texture.
The university’s facilities department had been painting the sculpture every two years. They were locked in a never-ending battle with the rust.
At the same time, the advancement department asked me to look into the sculpture's history. They wanted to know if the university could remove it and what the agreement with the donor actually said. Was there language about maintaining the sculpture permanently? Were there restrictions on altering it or moving it?
Advancement had its files professionally digitized in the early 2000s. The department had sent its boxes of files away to a vendor, where everything was scanned using some kind of feeder scanner. The output came back as large PDFs, each one representing an entire donor file that might contain hundreds of pages.
Around the same time, the university had adopted a CRM system. The information from the files had been migrated into that system so that development officers could quickly reference donor information.
On paper, this sounded like a well-organized system. The files were digitized. The information was in the CRM. Everything should have been easily searchable.
But systems like CRMs are designed to capture structured information. They rely on predefined fields. Names, dates, gift amounts, campaign names. Those fields work well for tracking fundraising activity, but they do not capture everything in a document.
Margin notes rarely make it into those fields. Handwritten annotations are almost never transferred over. Contextual notes that explain why something happened or what someone meant often disappear because they do not fit neatly into the database structure.
And in this case, no one had bothered to enter the information from the margins into the notes field. So instead of relying on the CRM, I went back to the original PDFs created during the digitization project. I began scrolling through the scanned pages of the file.
In the margins of one of the documents, someone had written a single word: CORTEN.
I googled it, and suddenly everything made sense. The sculpture was supposed to rust. Facilities had been fighting against the very thing the artist intended the material to do.
The advancement department had asked me to answer a very specific question about the donor agreement, but the note in the margin answered an entirely different question that no one had even thought to ask. It explained why facilities had been trapped in a cycle of painting the sculpture every two years and why the sculpture kept “failing”: it was rusting, which it was never designed to prevent.
The answer had been sitting quietly in the margins of the file the entire time.
Digitization Captures Pages, Not Always Meaning
My experience with this sculpture illustrates something that archivists think about constantly when designing digitization projects. Digitization captures images of documents. It creates access and preserves the content of pages. But it does not automatically capture the meaning embedded within those pages.
When records are migrated into structured systems like CRMs or databases, the information that survives the transition is usually limited to the fields that the system expects. Anything outside those fields becomes background material that may never be reviewed again.
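For readers who think in code, here is a minimal sketch of how that loss can happen. It is an illustration only: the field names, the sample record, and the file name are hypothetical, not the university's actual CRM or migration process. The first function keeps only the fields the system expects. The second carries everything else into a notes field and points back to the scan.

```python
# Hypothetical set of fields a CRM might expect; a real schema will differ.
CRM_FIELDS = {"donor_name", "gift_date", "gift_amount", "campaign"}


def migrate_fields_only(record):
    """Keep only the fields the CRM expects; everything else is dropped."""
    return {key: value for key, value in record.items() if key in CRM_FIELDS}


def migrate_with_context(record, scan_path):
    """Keep the expected fields, carry leftovers into a notes field,
    and record where the scanned file lives."""
    migrated = migrate_fields_only(record)
    leftovers = {k: v for k, v in record.items() if k not in CRM_FIELDS}
    migrated["notes"] = "; ".join(f"{k}: {v}" for k, v in sorted(leftovers.items()))
    migrated["source_scan"] = scan_path
    return migrated


# Made-up sample record; only the margin note comes from the story above.
file_record = {
    "donor_name": "Example Donor",
    "campaign": "Example Campaign",
    "margin_note": "CORTEN",
}

print(migrate_fields_only(file_record))
print(migrate_with_context(file_record, "donor_file.pdf"))
```

In the first result the margin note is simply gone. In the second it survives in the notes field, and the record still points to the scanned pages where the note can be seen in context.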
The handwritten notes in the margins of documents often carry important context. They can explain decisions, clarify materials, or provide the kinds of small details that make a file understandable decades later.
Those details are easy to overlook during digitization and migration projects because they are not part of the official record structure. But archivists know that they often contain the most useful information.
Thoughtful Digitization Means Thinking Beyond the System
This is why archivists approach digitization differently than many technology-driven projects do. The goal of digitization should not simply be to create images of documents or to extract certain pieces of information into a database. The goal should be to preserve the context of the records and make sure the full story remains accessible.
That means thinking carefully about how records will be used in the future. If the only surviving information is the structured data inside a CRM, researchers may never encounter the marginal notes, annotations, or contextual clues that explain how decisions were made. A system might store the official donor agreement, but it will not necessarily capture the conversations, clarifications, and explanations written in the margins of the file.
In this case, the digitization project had done one thing very well: it preserved the scanned pages themselves. Because those images still existed, the margin note was still visible.
Without those scanned pages, the explanation for the sculpture’s behavior might have disappeared entirely.
Information Lives in the Details
Archivists read documents differently from how most systems process them. We read the entire page. We notice annotations, handwriting, and the placement of documents within a file.
Sometimes, the most important piece of information in a file is not the official document at all. It is the note someone scribbled in the margin while reviewing it.
Thoughtful digitization means recognizing that information does not always live in the places where databases expect to find it. It lives in the details, in the context, and sometimes in a single word written in the margin of a document.
In this case, that single word was CORTEN.