Millions of historic photos, drawings soon on Flickr

1 September 2014, 10:56 pm EDT By Menchie Mendoza Tech Times
History buffs will soon be able to view historic images over the Internet. Visitors to the site may copy and use the pictures freely. (Internet Archive/Facebook)

The images are the first batch of "The Commons," a new Internet Archive collection made up of pictures drawn from more than 600 million book pages that the organization has digitally scanned. The pages amount to more than 19 petabytes of data, and more than 14 million images are expected to be accessible online.

So far, Kalev Leetaru has uploaded 2.6 million images to Flickr, where automatically added tags make them searchable. Until now, the images are said to have been difficult to access.

According to Leetaru, digitization projects have so far emphasized words and ignored pictures.

"For all these years all the libraries have been digitizing their books, but they have been putting them up as PDFs or text searchable works," says Leetaru. "They have been focusing on the books as a collection of words. This inverts that."

The most impressive feature of the Internet Archive's project is the amount of detail attached to each image. Apart from the Flickr description, the Internet Archive adds details such as the title of the book the image came from, its publisher and year of publication, its author and, where applicable, its subject.

Users searching for a particular image will receive a hyperlink to the page where the image appeared, viewable on the Internet Archive's website. They will also get links to the book's description and to the other images the Internet Archive has scanned from the same title.

Where available, the Internet Archive will also list any text that accompanies the image.

"The latter is especially powerful, as it allows to keyword search 500 years of images, instantly accessing particular topics or themes," stated Flickr in its blog.

Leetaru started working on the project while researching communications technology at Georgetown University in Washington, D.C. The project is part of a fellowship sponsored by Yahoo, which owns the photo-sharing site Flickr.

The Internet Archive used optical character recognition (OCR) software to analyze each of the 600 million scanned pages and convert the image of each word into searchable text. In the process, the software recognized which parts of a page were pictures and discarded them. Leetaru's process recovers those discarded regions and saves each one as a separate file in JPEG format. Each JPEG, along with its associated text, is then posted to a new Flickr page.
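
For readers curious how such an extraction step might look, here is a minimal sketch in Python. It is not Leetaru's actual code. It assumes, purely for illustration, that a scanned page is a grid of grayscale pixel values and that the OCR pass reports each picture it skipped as a bounding box. The names `Region`, `crop` and `extract_images` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: a page is a list of pixel rows, each pixel a grayscale value.
Page = List[List[int]]


@dataclass
class Region:
    """Hypothetical bounding box for an area the OCR pass marked as a picture."""
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive
    caption: str = ""  # text associated with the picture, if any


def crop(page: Page, region: Region) -> Page:
    """Copy the pixels inside the region out of the full page scan."""
    rows = page[region.top:region.bottom]
    return [row[region.left:region.right] for row in rows]


def extract_images(page: Page, regions: List[Region]) -> List[Tuple[Page, str]]:
    """Return each picture region as its own image, paired with its text."""
    return [(crop(page, r), r.caption) for r in regions]


# Tiny example: a 4-row by 6-column "page" with one picture region.
page = [[row * 10 + col for col in range(6)] for row in range(4)]
regions = [Region(left=1, top=1, right=4, bottom=3, caption="Fig. 1")]
images = extract_images(page, regions)
```

In a real pipeline, each cropped region would be written out as a JPEG and uploaded together with its text. Those steps are omitted here because they depend on tools the article does not name.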

© 2017 Tech Times, All rights reserved. Do not reproduce without permission.
