Project Notes - 2009 version

Note: the project notes that follow are from the 2009 version of the Voices of the Holocaust site. Some of the information below is no longer current, and is presented here for reference purposes only. For an updated version of these project notes, please visit this page.

Information on how the content in this collection was created, digitized, indexed, and delivered is provided below. The Voices of the Holocaust project welcomes inquiry on any aspect of the process; use the Contact link for more information.

Boder's Methodology

Dr. Boder interviewed displaced persons in France, Switzerland, Italy, and Germany. The interviewees represented all economic levels, many religions, and various nationalities and language groups from across Europe. His usual approach was to point out that most Americans had a limited knowledge of Nazi atrocities, and that by telling his or her story, the interviewee could contribute to both the education of the American public and the historical record. Boder would often begin by asking the person's name, age, and where they were when the war started, and would then allow them to speak at will, without the constraints of preplanned interview questions.

After returning to Chicago, Boder transcribed and translated seventy of the interviews into English between 1947 and 1957, with the help of grants from the National Institute of Mental Health. In 1949, edited versions of eight of the interviews were published in I Did Not Interview the Dead (Urbana: University of Illinois Press, 1949). Boder eventually published all seventy of his translations in a set of self-published volumes, under the title Topical Autobiographies of Displaced People Recorded Verbatim in Displaced Persons Camps, With a Psychological and Anthropological Analysis. Boder sent copies of these volumes to dozens of libraries before his death in 1961, though fewer than thirty sets survive today.

Scope

This collection includes transcriptions and audio of one hundred eighteen interviews involving one hundred twenty-one interviewees. While Boder's main focus during his time in Europe was interviewing displaced persons, he also recorded other events, including religious services, speeches, and songs sung by choirs or persons in the DP camps. This material is not currently available in this collection; however, many of the songs Boder recorded are available elsewhere, including World ORT's Music During the Holocaust and USHMM's Music of the Holocaust exhibits. Boder's wire recordings also contain a number of "aborted" interviews, lasting only a minute or two, which may have been intentionally terminated or accidentally recorded over. These are also not included here, but may be added at some point in the future.

Advisory Committee & Critical Content

To inform the many decisions necessary in developing this collection, and to ensure that the resulting site would meet the needs of the academic community, the project enlisted the assistance of Holocaust scholars, historians, researchers, and other librarians and archivists working with collections of survivor testimony. In addition to providing invaluable direction in almost every aspect of the project, members of the advisory committee also contributed the majority of the critical content for the site, including glossary term definitions, camp and ghetto descriptions, biographical material, and commentary and footnotes for individual interviews. Participation in the committee was completely voluntary, and the project would like to express its sincere gratitude for the donations of time and scholarship offered by these contributors.

Audio Restoration

The whereabouts of the original wire spools recorded in Europe in 1946 are currently unknown. However, Boder created a copy set of wire spools that was deposited with the National Institute of Mental Health, and is now held by the Library of Congress. Unfortunately, this set is incomplete: approximately eleven spools are missing (9-14, 9-33, 9-39, 9-44, 9-47, 9-58, 9-59, 9-69, 9-88, 9-159, and 209). As part of its preservation program, the Library of Congress made a transfer copy of the spools to open reel-to-reel tape, although the date of this transfer is uncertain (and these reels may have later been transferred to yet another set of open reels). The reels were later digitized at 44.1 kHz/16-bit via a Sony PCM 1630 encoder and recorded on U-Matic or VHS tape, most likely some time during the mid-1980s. In 1999, the Paul V. Galvin Library requested and obtained copies of the recordings, which were delivered on Digital Audio Tape (DAT) at 48 kHz/16-bit. These DAT copies were finally transferred to WAV files in 2007-2008; all audio material available on this site is derived from these files.

In order to provide the best possible listening experience, a number of steps have been taken to modify the files for their presentation in this collection. These steps have always been carried out with a deep concern for the historical integrity and aural character of the recordings; the changes are minor, and no material has been deleted in any way. First, longer interviews that were recorded on multiple spools have been edited together into a single program, and conversely, spools that contained more than one interview have been edited into separate programs. Second, the files have been digitally remastered and restored to improve fidelity and to address the sonic defects that are the unfortunate by-products inherent in the deterioration of original wire recordings and the multiple transfers that have taken place over the years. Such defects include hum, distortion, static, clipping, whine, buzzing, cut-outs, and uneven volume levels. These problems have been treated using bandwidth limiting, impulse removal, tonal disturbance filtering, broadband noise reduction, and other audio processing tools. Despite the use of modern digital restoration techniques, problems with the audio persist, and in some cases have significantly hampered the transcription of the material.
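As a rough illustration of what impulse removal means in practice, the following is a minimal sketch of one common approach, a median-based click filter. It is not the tool or the settings used by the project, which are not documented here; the function name, the window size, and the threshold are all assumptions chosen for the example.

```python
from statistics import median


def remove_impulses(samples, window=3, threshold=2000):
    """Replace isolated spikes with the local median.

    samples   -- list of integer PCM sample values (16-bit range assumed)
    window    -- odd number of neighbouring samples to compare against
    threshold -- hypothetical limit; a sample further than this from the
                 local median is treated as a click
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    cleaned = list(samples)
    # The first and last `half` samples have no full window and are left as-is.
    for i in range(half, len(samples) - half):
        local = median(samples[i - half:i + half + 1])
        if abs(samples[i] - local) > threshold:
            cleaned[i] = int(local)
    return cleaned


# A smooth ramp with a single click at index 3.
signal = [100, 120, 140, 30000, 180, 200, 220]
print(remove_impulses(signal))  # [100, 120, 140, 180, 180, 200, 220]
```

Only samples that differ sharply from their neighbours are changed, which is in keeping with the goal stated above of making minor changes without deleting material.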

Transcriptions & Translations

Boder's own transcription method was unorthodox: He would listen to the original wire recording and then dictate an English translation onto a second wire recorder or reel-to-reel tape machine. His dictation was then transcribed and typed by his assistants, and later edited by Boder himself. Hence, no transcription of the interviews was ever made in their original languages during his lifetime, except for the few interviews conducted in English.

Many of the translation texts available in this collection were re-typed verbatim from Boder's Topical Autobiographies for the first version of the site in 1999. Some have now been edited to match the actual recordings more closely. Original-language transcriptions of the seventy interviews translated by Boder, as well as transcriptions and translations of forty-eight interviews that Boder was unable to process before he died in 1961, were created in 2008-2009 using the restored audio files. Unfortunately, due to the sonic defects described above, some interviews contain unintelligible passages which cannot be accurately transcribed. Whenever possible, the project has attempted to use transcribers and translators who are native speakers of the language and who also have a background in the history of the Holocaust. However, commercial transcription and translation services were also used when efficient alternatives were unavailable.

Text Encoding

The interview texts have been encoded in XML using the schema developed by the Text Encoding Initiative, version P5. The TEI schema offers a robust data model which is used to encode not only the text itself, but also the biographical, historical, and geographical metadata related to the transcriptions, interviewees, and content; scholarly commentary and footnotes; and time-code information from the audio files to facilitate text-audio synchronization. Text encoding was performed using <oXygen/> XML Editor, which provides built-in support for the TEI schema. The Glossary of Terms, Glossary of Camps & Ghettos, and GIS data are also stored in TEI XML format; information from these files is included within the interview files using XInclude.
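The XInclude mechanism described above can be sketched as follows. This is a minimal illustration using Python's standard library, not the project's own tooling; the file name glossary.xml, the element layout, and the in-memory FILES table are hypothetical stand-ins for the real TEI files.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical stand-ins for files on disk, keyed by the href used to include them.
FILES = {
    "glossary.xml": (
        f'<list xmlns="{TEI_NS}" type="gloss">'
        "<label>DP</label><item>Displaced person</item>"
        "</list>"
    ),
}

# Hypothetical interview file that pulls the glossary in by reference.
INTERVIEW = f"""
<TEI xmlns="{TEI_NS}" xmlns:xi="http://www.w3.org/2001/XInclude">
  <text>
    <body><u>Sample utterance.</u></body>
    <back><xi:include href="glossary.xml"/></back>
  </text>
</TEI>
"""


def load_from_memory(href, parse, encoding=None):
    """Loader for ElementInclude that reads from FILES instead of the file system."""
    if parse != "xml":
        raise ValueError("only XML includes are handled in this sketch")
    return ET.fromstring(FILES[href])


def resolve_includes(xml_text):
    """Parse a document and replace each xi:include with the referenced content."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=load_from_memory)
    return root


root = resolve_includes(INTERVIEW)
print([el.text for el in root.iter(f"{{{TEI_NS}}}label")])  # ['DP']
```

Keeping shared material such as glossary entries in one file and including it by reference means a definition is corrected in a single place rather than in every interview file.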

Click here to view a sample TEI XML interview file.

Data Standards

Geographic data, including location names and coordinates, was taken from the Getty Thesaurus of Geographic Names whenever possible. Other sources for geographic data included Wikipedia, Maplandia.com, and Google Earth. The ISO 3166 standard was used for country names and code elements. Language designations were formatted according to ISO 639-1 alpha-2 names and codes.
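A simple consistency check against these standards might look like the sketch below. The two tables are small illustrative excerpts rather than the project's actual lists, the use of two-letter (alpha-2) country codes is an assumption, and the function name check_codes is hypothetical.

```python
# Small excerpts of the two standards; a real project would load the full lists.
ISO_639_1 = {
    "de": "German", "en": "English", "fr": "French",
    "pl": "Polish", "ru": "Russian", "yi": "Yiddish",
}
ISO_3166_ALPHA2 = {
    "CH": "Switzerland", "DE": "Germany", "FR": "France", "IT": "Italy",
}


def check_codes(language, country):
    """Return a list of problems found in one (language, country) pair."""
    problems = []
    if language not in ISO_639_1:
        problems.append(f"unknown ISO 639-1 language code: {language!r}")
    if country not in ISO_3166_ALPHA2:
        problems.append(f"unknown ISO 3166 alpha-2 country code: {country!r}")
    return problems


print(check_codes("yi", "DE"))        # []
print(check_codes("ger", "Germany"))  # two problems reported
```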

Audio Synchronization

Synchronizing the text and audio files for simultaneous presentation involves entering time-code information in milliseconds into the XML files as the value of an attribute of the element containing the text content for the corresponding utterance. The TEI schema contains provisions for encoding the start and end times of utterance elements; for this project only the start times were encoded. Time-code information was derived using Transcriber, an open source software tool for segmenting and labeling audio files, and then imported into the XML files.
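Because only start times are encoded, a player can treat each utterance as lasting until the next one begins. The following minimal sketch shows one way to look up the current utterance for a given playback position. The attribute name "start", the simplified markup, and the function names are assumptions made for illustration, not the project's actual encoding.

```python
import bisect
import xml.etree.ElementTree as ET

# Hypothetical markup: the attribute name "start" and the element layout are
# assumptions for illustration only. Times are in milliseconds.
SAMPLE = """
<body>
  <u start="0">Question one</u>
  <u start="4200">Answer one</u>
  <u start="9750">Question two</u>
</body>
"""


def load_cues(xml_text):
    """Return (start_times_ms, texts), sorted by start time."""
    root = ET.fromstring(xml_text)
    cues = sorted((int(u.get("start")), u.text or "") for u in root.iter("u"))
    starts = [start for start, _ in cues]
    texts = [text for _, text in cues]
    return starts, texts


def utterance_at(position_ms, starts, texts):
    """Return the utterance whose start time most recently passed, or None."""
    index = bisect.bisect_right(starts, position_ms) - 1
    return texts[index] if index >= 0 else None


starts, texts = load_cues(SAMPLE)
print(utterance_at(5000, starts, texts))  # Answer one
```

A binary search keeps the lookup fast even for long interviews, since the player may ask for the current utterance many times per second.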

Search Functionality

Search functionality, including keyword searching, browsing, and faceted search results refinement, is accomplished using Solr, an open source enterprise search server based on the Lucene Java search library. The TEI XML interview files are transformed into a Solr-ingestible XML format using XSLT prior to indexing. Copies of the project's Solr config files are available upon request; please use the Contact link for more information.
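The shape of this transformation can be sketched as follows. The project performs it with XSLT; the example below uses Python's standard library instead, purely to show the target format, which is Solr's XML update message (an add element containing doc and field elements). The field names id, title, and utterance, the sample document, and the function name are hypothetical and do not reflect the project's Solr schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical, heavily simplified TEI interview file.
SAMPLE_TEI = f"""
<TEI xmlns="{TEI_NS}">
  <teiHeader>
    <fileDesc><titleStmt><title>Sample interview</title></titleStmt></fileDesc>
  </teiHeader>
  <text><body>
    <u>First utterance.</u>
    <u>Second utterance.</u>
  </body></text>
</TEI>
"""


def tei_to_solr_add(tei_text, doc_id):
    """Build a Solr <add> update message from one TEI document."""
    tei = ET.fromstring(tei_text)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    def field(name, value):
        el = ET.SubElement(doc, "field", name=name)
        el.text = value

    field("id", doc_id)
    field("title", tei.findtext(".//tei:titleStmt/tei:title", default="", namespaces=NS))
    # One field per utterance; this assumes a multi-valued field in the schema.
    for u in tei.iterfind(".//tei:u", NS):
        field("utterance", "".join(u.itertext()).strip())
    return ET.tostring(add, encoding="unicode")


print(tei_to_solr_add(SAMPLE_TEI, "sample-1"))
```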

Presentation

The interview texts are displayed from the source XML files using PHP. Adobe Flash is used for the audio players, including the synchronized text-audio player. The interactive maps are created using OpenLayers. The project is happy to share source code upon request; please use the Contact link for more information.