Have you ever searched for an ancestor’s name on Newspapers.com™ but not gotten a hit, even though you suspect a match should be there?
There are a few reasons why this can happen: Maybe the person wasn’t mentioned in the paper by the name or spelling you’re searching for. Or maybe our site doesn’t currently have the paper where their name appears. But another possible reason has to do with OCR.
What is OCR?
All the newspaper pages on Newspapers.com have been indexed using Optical Character Recognition (OCR). This means that a computer has tried to identify the words on each page and produce a digital version to search. When the newspaper image is clean and in good condition, this process is very accurate and can make searching papers easy. For older papers or other papers where the image is less clear, the OCR processing may miss words or read them incorrectly.
In most cases, Newspapers.com Search correctly locates the names and words you’re searching for, thanks to OCR. However, OCR has its limits, especially given the condition of some historical newspapers. After all, it’s trying to decipher text from papers dating back as far as the 1600s!
So is it a lost cause if one of your Newspapers.com searches doesn’t return the matches you want due to the idiosyncrasies of OCR?
No! We’ve gathered our top 5 strategies for those occasional times when the OCR can’t identify the name you’re searching for.
1. Substitute letters (and numbers) with similar shapes.
Depending on the newspaper’s typeface and the quality of the image, the OCR might mistake certain characters for others that look similar. The trick here is to make an educated guess about what the OCR thinks it’s seeing and then adjust your search terms to reflect that. This often means substituting letters (and numbers) that have a similar shape.
Common substitutes to try include y & g, b & h, c & e, t & f, l & 1, and s & 8.
Keep in mind that this might extend to multi-letter combinations as well, such as rn & m and rr & n.
So while your ancestor’s name may have been C-a-r-r-i-e Smith, the OCR might read it as C-a-n-i-e Smith in some instances. Searching for both variations will likely return more matches in your search results.
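If you’d like a checklist of spellings to try, the substitutions above can be sketched in a few lines of Python. This is only an illustration, not a Newspapers.com tool: the function names are made up for this example, the pairs are just the ones listed above, and it only makes one swap at a time to keep the list short.

```python
# Look-alike pairs from the list above; each pair is tried in both directions.
CONFUSABLE_PAIRS = [
    ("y", "g"), ("b", "h"), ("c", "e"), ("t", "f"),
    ("l", "1"), ("s", "8"), ("rn", "m"), ("rr", "n"),
]


def build_swaps(pairs):
    """Map each character (or letter pair) to the look-alikes it may be read as."""
    swaps = {}
    for a, b in pairs:
        swaps.setdefault(a, []).append(b)
        swaps.setdefault(b, []).append(a)
    return swaps


def ocr_variants(name, pairs=CONFUSABLE_PAIRS):
    """Return lowercase spellings of `name` with exactly one look-alike swap."""
    swaps = build_swaps(pairs)
    lowered = name.lower()
    variants = set()
    for i in range(len(lowered)):
        for source, replacements in swaps.items():
            # Only swap when the look-alike starts at this position.
            if lowered.startswith(source, i):
                for replacement in replacements:
                    variants.add(lowered[:i] + replacement + lowered[i + len(source):])
    variants.discard(lowered)
    return sorted(variants)


print(ocr_variants("Carrie"))  # ['canie', 'carric', 'earrie']
print(ocr_variants("Smith"))   # ['8mith', 'smifh', 'smitb', 'srnith']
```

Each spelling in the output is a candidate search term to try alongside the original name.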
2. Search with wildcards.
In addition to searching with similarly shaped letters and numbers, you can also try using a “wildcard” to replace a letter (or multiple letters) that the OCR might be misreading.
Two wildcards you can use on Newspapers.com are the question mark [?] and asterisk [*].
Use a question mark to replace a single letter. For example, if the person you’re searching for has the surname “Johnson” but you think the OCR might be misinterpreting the lowercase h as another letter, you can search [Jo?nson] to return a wider set of results.
Use an asterisk to replace multiple letters. If you think there’s a possibility the OCR may be misreading a multi-letter combination, try searching with an asterisk. For instance, with our earlier “Carrie Smith” example, you can try searching for [Ca*ie Smith] to account for a possible OCR issue.
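To get a feel for what each wildcard lets through, here is a small Python sketch. It borrows the wildcard rules of Python’s standard fnmatch module, where [?] stands for exactly one character and [*] for any run of characters, as a stand-in. The exact matching rules on Newspapers.com may differ, and the word list is invented for the example.

```python
from fnmatch import fnmatchcase


def matching_words(pattern, words):
    """Return the words that fit a wildcard pattern, ignoring upper/lower case."""
    pattern = pattern.lower()
    return [word for word in words if fnmatchcase(word.lower(), pattern)]


# Hypothetical words as OCR might have read them from a page.
ocr_words = ["Johnson", "Jobnson", "Jonson", "Carrie", "Canie", "Cassie", "Carter"]

# "?" fills in for exactly one character, so "Jonson" (one letter short) is skipped.
print(matching_words("Jo?nson", ocr_words))  # ['Johnson', 'Jobnson']

# "*" fills in for any run of characters between "Ca" and "ie".
print(matching_words("Ca*ie", ocr_words))    # ['Carrie', 'Canie', 'Cassie']
```

Notice that the asterisk also lets in unrelated names like “Cassie.” A wider net means more results to sift through, so pair wildcards with a surname or a date filter when you can.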
3. Search without quotation marks.
Although placing quotation marks around multi-word search terms is useful in many cases, it can be less helpful when dealing with a potential OCR issue.
Using quotation marks around your search terms forces Search to return results for that exact phrase. But if the OCR has misread one (or more) of the letters in the name on the page, an exact-phrase search will miss that match. Removing the quotation marks gives the search more flexibility, increasing the chances of finding what you’re looking for.
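The difference can be pictured with a simplified Python sketch. It assumes that a search without quotation marks accepts a page containing any one of the search words. That is an assumption made for illustration, not a description of how Newspapers.com Search actually works, and the sample text is invented.

```python
def phrase_match(page_text, query):
    """Exact-phrase search: the whole query must appear as written."""
    return query.lower() in page_text.lower()


def any_term_match(page_text, query):
    """Looser search (assumed behavior): any single query word is enough."""
    page_words = set(page_text.lower().split())
    return any(term in page_words for term in query.lower().split())


# Hypothetical OCR text in which "Carrie" was misread as "Canie".
page = "Mrs. Canie Smith visited her sister on Tuesday"

print(phrase_match(page, "Carrie Smith"))    # False: the phrase never appears intact
print(any_term_match(page, "Carrie Smith"))  # True: "Smith" was still read correctly
```

The looser search finds the page because the surname survived the OCR, even though the first name did not.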
4. Search for other names or for related terms.
Sometimes when OCR is the issue, you may need to use a different search strategy altogether. If the OCR isn’t picking up a name you’re searching for, instead try searching for the name of someone else who is likely to have appeared in the same newspaper article. For example, when trying to find someone’s obituary, try searching for the name of their spouse, parents, children, siblings, or another relative. While the OCR might have missed the name you were originally trying to find, the odds are good that it will still pick up a different name on the same page.
A similar approach is to search for a word or phrase that might be used in an article about the person, such as the name of an organization, club, or church they belonged to. Then use the Newspapers.com search filters to narrow your results by time and location until you find a likely match.
5. Browse the paper.
As convenient and accurate as Search is on Newspapers.com, in rare cases you might need to browse a newspaper instead. So if a search isn’t locating a match you suspect should be there, determine which newspaper editions seem like the best candidates and then go through them page by page until you find what you’re looking for.
Newspapers.com makes browsing easy as well. Select the “Browse” tab at the top of the site, then narrow by location, paper, and date until you find the issue you want to browse. Once you’ve started reading, use the arrows or film strip at the bottom of the Viewer to move to the next page of the paper.
We hope you find these strategies and tips useful! If you have any to add, share them with us in the comments!