Join the Crowd(sourcing): Turning to Our Readers for Metadata Help

We’re searching for the five W’s: Who, What, Where, When, Why?

One of the most exciting trends in digital collections of late has been the emergence of “crowdsourcing.” The idea is simple: post some images about which you know nothing (or very little) and turn to the collective knowledge of a user group – say, a Facebook page or Twitter followers – for help. Using the power of the crowd, we can fill gaps in metadata content or other information that would take a single researcher or cataloger far longer to track down on their own.

We’re taking an excursion into the world of crowdsourcing with a small pilot project presented via our Flickr collection (www.flickr.com/baylordigitalcollections). Just follow the link at the end of this post to see a set of six images where we need a little help filling out our cataloging information. Some feature groups of people about whom we know little; others take place in front of buildings we can’t identify; and still others are lacking a specific date, or even a particular year.

If you’d like to try your hand at some metadata sleuthing, just click below and break out your magnifying glasses. If you spot something you’d like to tell us about, send us an email at rdc@baylor.edu and we’ll investigate your tips; if they’re accurate, we’ll add them to our digital collections – and give you the credit!

Baylor University Libraries Digital Collections Crowdsourcing Pilot Project

Special thanks to The Texas Collection and the Baylor University Libraries Athletics Archive for providing images for this project.

“So We Can Throw These Out Now, Right?”: What We Learned From Microfilming Newspapers and How It Shapes Our Digitization Strategy

Pictured: Scanner Fuel, from the stacks at Baylor University.

Recently, I attended a workshop for a topic mostly unrelated to my work in digital collections. At introduction time, I gave a nutshell view of what I do by saying my group digitizes Baylor’s special collections and makes them available online. Despite the whole thing taking about 15 seconds and being intentionally generic, I’ve done this intro enough times by now to know what was going to happen next.

An older gentleman sitting on the front row got what I can only describe as the “ah-ha!” look on his face, and at the first break, he approached me and asked a question I get more often than not when I talk to people about what we do at the Digitization Projects Group.

“I work at a small museum, and we’re being told to digitize our collections. Once we do, we can just throw those old papers out, right? And is a DVD a good storage solution?”

My answer to him was simple, but it probably wasn’t what he expected to hear.

“Do you remember microfilm?” I asked him. “And when was the last time you used it and thought ‘Gosh, I wish I could get my hands on the original just to compare it to what I’m looking at’ only to find it’s been decades since anyone saw a paper copy? That’s why you can’t just throw things out once they’re scanned.”

“Also,” I added, “DVDs are terrible.”

***

Okay, so I wasn’t quite that blunt on the DVD answer, but the effect was the same: a stunned look of disbelief. In some ways, I don’t blame him. There’s a lot of misinformation (and outright falsehoods) out there about digitization, data preservation, and care of digitized materials, and the more channels it has to filter through to reach people at smaller institutions, the more distorted it can get.

If you haven’t done so, I encourage you to check out a book by Nicholson Baker called Double Fold: Libraries and the Assault on Paper. Baker’s central premise is that during the microfilming heyday of the 1980s and 1990s, libraries and other institutions put too much faith in the technology of microfilming and weren’t always diligent about properly preserving and storing the newspapers that had been filmed. It is a polemical, biased, uncomfortable book to read, and it is less than popular among librarians. But that was exactly Baker’s point.

Baker wanted to draw attention to the notion that just because a technology had come along that promised better access and a smaller storage footprint didn’t mean professionals could become lax about enforcing good practices of physical archival storage. While much of Baker’s criticism has been ably (and thoroughly) countered by library professionals in the decade since Double Fold’s publication in 2001, it remains a stirring think piece on the dangers of over-reliance on a “silver bullet” solution at the expense of long-term viability.

At the heart of Baker’s issues with microfilm was the prevailing attitude that, once a run of newspapers had been filmed, it was perfectly acceptable for the originals to be tossed, as the filmed versions were thought to be a reasonable substitute that preserved both the look and content of the papers at a fraction of the space required to store them. But what happens if the film is bad and no one noticed until the originals were long gone? Or what if a page was skipped, or an entire volume? Or what if the film falls prey to “vinegarization” – an inherent agent of deterioration wherein the film’s layers begin to break down and disintegrate, producing a distinctive vinegary, “salad dressing” smell – and now cannot be viewed?

If the originals are gone, the answer is clear: there’s nothing you can do.

***

Which brings me back to my fellow workshop attendee’s question: once things are scanned, they’re safe to pitch, right? The problems outlined in Baker’s book could just as easily apply to the process of digitizing archival materials. We believe the technology behind digitization is reliable, replicable, and sustainable, and we’ve learned a great deal about how to approach digitizing materials thanks to the lessons revealed by the great microfilm boom of the last century. As such, we’ve got processes and technologies in place to monitor our digital files, keeping them secure and accessible for decades to come.

But what about the things we can’t predict? What if the next generation of computers is so different from what we’re used to today that the very idea of digital files changes completely? What if a specialized virus destroys every TIFF file in creation? What if the Mayans were right, and civilization as we know it craters at the end of the year, rendering all our painstaking efforts profoundly moot?

The best answer is to do what people have done since 200 BC: go back to the paper versions.

That’s why we counsel our partners to use the process of digitizing materials as a catalyst for rehousing them in archival storage if they’re not stored that way already. That’s why we urge conservation of fragile materials before they arrive at our center. That’s why we never tell them it’s safe to throw something away just because it’s been scanned, cataloged and placed in a digital collection.

That’s why I told the man from the workshop that the answer to his question is a very simple, “No.”

And the DVD question? Think about this: when was the last time you popped a CD into your car’s stereo that you hadn’t listened to in a while, only to find that your favorite song was skipping like a hyperactive preschooler thanks to a series of almost-imperceptible scratches? It’s happened to all of us, and the same thing can happen to a supposed “100 year, archival” gold DVD.

For years, though, digitizers at institutions large and small were told that backing up files to a DVD and putting it on a shelf was a reliable backup, to the point where many early digitization outfits didn’t keep any other versions of files around once they were burned to disc. We found pretty quickly that those discs weren’t reliable enough to be a sole backup source, so now we keep multiple copies on spinning disks, on magnetic tape, and in the cloud, both on- and off-site, to ensure the long-term stability of our digital assets.
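Keeping multiple copies only helps if you can tell when one of them has quietly gone bad. One common way to check is to compare checksums: compute a fingerprint of the master file and confirm that every backup copy produces the same one. The following is a minimal Python sketch of that idea, not a description of any particular institution's workflow; the function names and the file paths in the usage note are hypothetical, chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large scans never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_copies(master, copies):
    """Return the copies that are missing or whose contents differ from the master."""
    expected = sha256_of(master)
    bad = []
    for copy in copies:
        copy = Path(copy)
        # A copy fails the check if it is absent or its checksum has changed.
        if not copy.is_file() or sha256_of(copy) != expected:
            bad.append(copy)
    return bad
```

A call such as `find_mismatched_copies("master/scan_0001.tif", ["disk/scan_0001.tif", "tape_restore/scan_0001.tif"])` (hypothetical paths) returns an empty list when every copy still matches. Anything in the returned list is a copy to replace from a good one. Note that the check only confirms the copies agree with the current master, so in practice the master's checksum is usually recorded at scan time and rechecked on a schedule.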

***

All of this makes good sense, but if professionals at big institutions like the Library of Congress, the National Archives and even Baylor’s own DPG have to keep constant watch on evolving technology trends just to stay up to speed, how can we expect staffers at small to mid-size institutions to keep up?

Ultimately, it comes down to education and using a common sense approach to digitization projects. Education means large institutions like the Library of Congress, the Texas Historical Commission, and, at a local/regional level, our own staff teaching people at small institutions the basics of digitization and file management. Workshops, webinars, websites and more offer basic information about how to scan documents, how to manage the data that results, and what to do to keep it safe, and more access to this kind of information can do great good to counteract some of the old misconceptions that are still out there.

And common sense? That’s something Baker’s Double Fold should give us reason to trust in spades. If something is important enough to scan and put online, isn’t it common sense to think that it’s important enough to preserve physically? If an archival collection was kept safely stored for decades in the right environment, does it make sense to throw it out now that it’s been scanned? And if we know that paper-based items can last for centuries when properly stored, doesn’t it make sense to hold onto them as long as we can, just in case?

***

Is digitization an important undertaking for libraries, museums, and archives of all sizes? Undeniably.

Should we take steps to ensure our cultural heritage – digital and physical – is properly stored, displayed, and accessed? Without a doubt.

Does either of those facts mean it’s safe to discard a decade’s worth of 19th century American newspapers once they’ve been scanned, as happened with microfilmed newspapers in the 1990s?

If anyone’s reading this post in 3012, do me a favor: look me up and let me know.

Guest Post: Sierra Wilson, Our 2012 Summer Intern

Welcome to our first guest post here on the BU Libraries Digital Collections blog! We’re excited to welcome Sierra Wilson, a graduate student from the University of Illinois at Urbana-Champaign studying Library and Information Science. Sierra has been with us this summer working as an intern. Her assignment: the sprawling Baylor University News Releases project, outlined in a previous blog post. Take it away, Sierra!

My name is Sierra, and I was an intern this summer at Baylor’s Riley Digitization Center.  My last day is on Friday, and I’m sad to be leaving the RDC behind to return to school.  I am not new to Baylor; I grew up in Waco and graduated from Baylor in 2008.  Last year, I started graduate school at the University of Illinois at Urbana-Champaign, where I study Library and Information Science.   My main interest in grad school has been in archives and special collections, and how these materials can be made more accessible by the use of technology.   My internship this summer was the perfect opportunity to learn more about the equipment and techniques libraries use to achieve this goal.

Although I have been lucky to work on many different projects this summer, most of my time here has been spent working on the Baylor Press Release project that Eric posted about earlier this summer.  After we sorted thousands of press releases into chronological order (no small feat!), the next step was actually digitizing them.   To do this, we load the press releases into binders and scan them with a machine called the Kirtas, which turns the pages of books to speed up the scanning process.  This is the part of the process with which I have been the most involved.  Back in June, I started scanning in 1960 (earlier press releases were scanned on a flatbed scanner); as of this week, my co-workers and I have scanned a decade and a half of press releases!

I will admit that there have been times that I never wanted to see another press release again, but I’m sad that I’ll be leaving this project before its completion.  Seeing this task go from a massive, daunting heap of boxes to an organized, streamlined system has been extremely satisfying.  It’s been an important part of my learning experience this summer to see the digitization center’s staffers tackle such a hefty problem.

One of the most interesting parts of the project has been the opportunity to learn more about the history of Baylor and Waco.  I read about the changing landscape of campus, with the addition of buildings like Moody Library and the Hooper-Schaefer Fine Arts Center in the late 1960s and early 1970s, as well as the continual growth of the student body.  Of course, sometimes Baylor history repeats itself: the Noze brotherhood was banned from campus in 1965.

Over the time I’ve spent working on this project, I started keeping a list of the most unusual press releases I came across.  I found myself surprised (and often amused) by the nationally known figures that came to Baylor to speak or perform.  Baylor folk often talk about the “Baylor Bubble,” that invisible barrier that sometimes seems to shield the campus from the outside world, but these press releases prove that Baylor has always played an important, active role in the world around it.  Some of Baylor’s visitors were prestigious, and some were just downright unusual, and I would never have imagined before this project that any of them would have come to Baylor.

Sierra’s Top Five Unusual Press Releases

October 20, 1972
Jon Voight comes to Baylor to campaign for George McGovern

There’s something strange in the idea that a big movie star like Jon Voight would come to Baylor to campaign for McGovern.  That’s like Brad Pitt coming to campaign for John Kerry in 2004: hard to imagine.  But he did, not that it made much of a difference for McGovern’s campaign for president.

April 30, 1965
Nina Simone performs at Baylor May Day festivities

Nina Simone was a well-known singer-songwriter, pianist, and civil rights activist, and I was shocked that Baylor would have brought in someone as famous as Simone to be their featured May Day performer.  May Day seems to have been the predecessor to Diadeloso.

March 23, 1973
Lenore Romney speaks at Chapel

In March of 1973, Lenore Romney, the wife of George Romney and mother of current presidential candidate Mitt Romney, was a speaker at Chapel.  She herself had recently lost a race for U.S. Senator in the state of Michigan, and spoke about her experience as a woman running for office.  Who knew?

September 28, 1974
Erich von Daniken lectures at Baylor

If you’ve ever come across the History Channel’s “Ancient Aliens” program, then you are familiar with Erich von Daniken’s ideas about alien contact with ancient civilizations.  At the time of this speaking engagement at Baylor, von Daniken had recently published Chariots of the Gods?, which details his unusual (and frequently discredited) theory that the development of human civilization could have been aided by extraterrestrial contact.  I wonder what the Baylor community thought about him?

May 28, 1965
President Lyndon Baines Johnson speaks at Baylor commencement

I bet you didn’t know that President Lyndon Baines Johnson had family ties to Baylor, did you?  It turns out his maternal great-grandfather was the president of Baylor from 1861 to 1862.  LBJ wasn’t the first sitting president to speak at Baylor, either; he was preceded by both Eisenhower and Truman.

The Baylor University News Release collection is currently being scanned and processed. Images above are for illustrative purposes and are not yet available via the Baylor University Libraries Digital Collections. We’ll post an update to let users know when they can access this impressive collection!