California Digital Newspaper Collection


The California Digital Newspaper Collection (CDNC) is a freely available archive of digitized California newspapers; it is accessible online. The collection contains 433,033 issues comprising 4,976,984 pages and 32,437,924 articles. The project is part of the Center for Bibliographical Studies and Research at the University of California, Riverside.

History

The Center for Bibliographical Studies and Research (CBSR) was one of six initial participants in the National Digital Newspaper Program (NDNP), a newspaper digitization project established through a partnership between the Library of Congress and the National Endowment for the Humanities. Between 2005 and 2011, the CBSR received three two-year grants and contributed around 300,000 pages to Chronicling America, the public face of the NDNP. Newspaper titles submitted include the San Francisco Call, the Amador Ledger, and the Imperial Valley Press. In 2015, the CBSR received a fourth grant from the National Digital Newspaper Program. Between 2015 and 2017, the project contributed another 100,000 pages from the Gold Rush era, as well as foreign-language newspapers.
The California Digital Newspaper Collection was officially launched in 2007 and contained the initial 100,000 pages produced for the National Digital Newspaper Program from 2005 to 2007. Another 50,000 pages were created with support from the Institute of Museum and Library Services under the provisions of the Library Services and Technology Act (LSTA), administered in California by the State Librarian. All content contributed to NDNP is also hosted in the CDNC, with important differences, noted below in Digitization. Between 2007 and 2013, the CDNC digitized roughly 300,000 pages through the LSTA program, administered by the California State Library. In 2014, the project announced an initiative, supported by LSTA, to digitize one title per county through 1923.
In 2010, the CDNC initiated the Born Digital Project, with the goal of collecting and hosting contemporary PDFs from newspaper publishers. Roughly a dozen publishers have participated or currently participate in the project.

Digitization

The California Digital Newspaper Collection follows standards established by the National Digital Newspaper Program. Microfilm or newsprint is scanned to create TIFF images; whenever possible, master negative film is used. The CBSR manages an archive of approximately 100,000 reels of negative film, which are stored and maintained by the California Newspaper Microfilm Archive. When negative film is not available, positive film can be used, but image quality and OCR accuracy will not be as good.
The TIFF images are then processed, or "digitized", to create derivative files, including a JP2, a PDF, and a METS/ALTO XML file for each page.
Unlike NDNP, the CDNC has traditionally digitized to article level rather than just page level. Individual "segments" on a page (articles, illustrations, advertisements, etc.) are identified during digitization and can be retrieved by the researcher. For an illustration of the difference between page-level and article-level digitization, compare the San Francisco Call in the CDNC to the same title in Chronicling America.
More recently, the CDNC has begun digitizing some titles to page level, but most are still digitized to article level. The main advantage of page-level digitization is lower cost, since it can be done in an automated fashion, without human input.
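As a rough illustration of what a METS/ALTO derivative contains, the following Python sketch reads the OCR text of each text block from a small ALTO fragment. The fragment, the block IDs, and the helper names are hypothetical and written for this example only; actual CDNC files are not shown here and will differ in namespace version, coordinates, and structure. How blocks are grouped into article-level segments is not covered by this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed ALTO fragment (assumption: not taken from CDNC data).
# Real files also carry position and size attributes and may use another namespace version.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="SAN"/><SP/><String CONTENT="FRANCISCO"/>
          </TextLine>
          <TextLine>
            <String CONTENT="CALL"/>
          </TextLine>
        </TextBlock>
        <TextBlock ID="TB2">
          <TextLine>
            <String CONTENT="Weather"/><SP/><String CONTENT="fair"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Drop the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def text_blocks(alto_xml):
    """Return a dict mapping each TextBlock ID to its OCR text, one line per TextLine."""
    root = ET.fromstring(alto_xml)
    blocks = {}
    for block in root.iter():
        if local_name(block.tag) != "TextBlock":
            continue
        lines = []
        for line in block.iter():
            if local_name(line.tag) != "TextLine":
                continue
            # Each recognized word is a String element with its text in CONTENT.
            words = [
                s.get("CONTENT", "")
                for s in line.iter()
                if local_name(s.tag) == "String"
            ]
            lines.append(" ".join(words))
        blocks[block.get("ID")] = "\n".join(lines)
    return blocks


if __name__ == "__main__":
    for block_id, text in text_blocks(SAMPLE_ALTO).items():
        print(block_id, repr(text))
```

A page-level workflow can stop once the page text has been extracted in this way, whereas an article-level workflow additionally needs to record which blocks belong to the same article, illustration, or advertisement.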

Papers covered