TinEye


TinEye is a reverse image search engine developed and offered by Idée, Inc., a company based in Toronto, Ontario, Canada. It is the first image search engine on the web to use image identification technology rather than keywords, metadata or watermarks. Users search by submitting an image instead of typing keywords. Upon receiving an image, TinEye creates a "unique and compact digital signature or fingerprint" of the image and matches it against the fingerprints of other indexed images. This procedure can match even heavily edited versions of the submitted image, but it will not usually return images that are merely similar.

History

Idée, Inc. was founded by Leila Boujnane and Paul Bloore in 1999. Idée launched the service on May 6, 2008, and it went into open beta in August of that year.
While computer vision and image identification research projects began as early as the 1980s, the company claims that TinEye is the first web-based image search engine to use image identification technology. The service was created with copyright owners and brand marketers as the intended user base: copyright owners could look up unauthorized use of their images, and brand marketers could track where their brands were appearing.
In June 2014, TinEye claimed to have indexed more than five billion images for comparisons. However, this is a relatively small proportion of the total number of images available on the World Wide Web.
As of August 2018, TinEye's search results pages stated that over 30.8 billion images had been indexed for comparison.

Technology

A user uploads an image to the search engine or provides a URL for an image or for a page containing the image. The search engine looks up other uses of the image on the Internet, including modified images based upon that image, and reports the date and time at which they were posted. TinEye does not recognize outlines of objects or perform facial recognition; it recognizes the entire image and some altered versions of that image, including smaller, larger, and cropped versions. TinEye has shown itself capable of retrieving different images from its database of the same subject, such as famous landmarks.
TinEye can search for images in JPEG, GIF, or PNG format. Other formats that contain images online, such as Adobe Flash, are not searchable.
Results generated by TinEye include the total number of matches found in its database for the submitted image, a preview image and URL for each match, and a function called Compare Images. Compare Images provides a window where the user can switch back and forth between the original image and the search result. TinEye can sort results by best match, worst match, biggest image, or smallest image.
User registration is optional and offers storage of the user's previous queries. Other features include embeddable widgets and bookmarklets. TinEye has also released a commercial API.

Algorithm

Although TinEye doesn't disclose the exact algorithms it uses, there are techniques that fit the company's "how it works" description and achieve the same goal of matching images. One such technique is perceptual hashing, which creates a hash from a sample image. Here is an example of a basic average-hash algorithm, which is similar to but simpler than a perceptual hash, as described by Dr. Neal Krawetz:
  1. Reduce size. In pictures, high frequencies give detail while low frequencies show structure; we want the latter. The fastest way to remove high frequencies and detail is to shrink the image. In this case, shrink it to 8x8 so that there are 64 total pixels. Don't bother keeping the aspect ratio, just crush it down to fit an 8x8 square. This way, the hash will match any variation of the image, regardless of scale or aspect ratio.
  2. Reduce color. Convert the tiny 8x8 picture to grayscale, so that the 64 pixels are represented by 64 gray values instead of separate red, green, and blue values.
  3. Average the colors. Compute the mean value of the 64 colors. (The DCT-based perceptual hash differs at this point: to get the lowest frequencies in the image, it keeps only a part of the transformed image. For example, if the DCT is 32x32, it keeps just the top-left 8x8.)
  4. Compute the bits. Each bit is simply set based on whether the color value is above or below the mean.
  5. Construct the hash. Set the 64 bits into a 64-bit integer. The order does not matter, just as long as you are consistent. Your end result hash will look something like this: 8f373714acfcf4d0
The resulting hash won't change if the image is scaled or the aspect ratio changes. Increasing or decreasing the brightness or contrast, or even altering the colors won't dramatically change the hash value.
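The average-hash steps can be sketched in a few lines of Python. This is only an illustration of the published average-hash idea, not TinEye's algorithm, which is undisclosed. The function name average_hash, the input format (a list of rows of (r, g, b) tuples), the nearest-neighbour shrinking, and the plain channel average used for grayscale are all assumptions made to keep the example free of external libraries.

```python
def average_hash(pixels, hash_size=8):
    """Return a 64-bit average hash (as an int) of an RGB image.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    This input format is an assumption made for the sketch.
    """
    height, width = len(pixels), len(pixels[0])

    # Step 1: shrink to hash_size x hash_size, ignoring the aspect ratio.
    # Nearest-neighbour sampling is used here for simplicity; a real
    # implementation would normally use a smoothing resize filter.
    small = [
        [pixels[y * height // hash_size][x * width // hash_size]
         for x in range(hash_size)]
        for y in range(hash_size)
    ]

    # Step 2: reduce color by converting each pixel to a gray value
    # (plain average of the three channels, an assumed simplification).
    gray = [(r + g + b) / 3 for row in small for (r, g, b) in row]

    # Step 3: compute the mean of the 64 gray values.
    mean = sum(gray) / len(gray)

    # Steps 4 and 5: one bit per pixel (1 if above the mean), packed
    # row by row into a single integer.
    bits = 0
    for value in gray:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def two_tone(width, height, dark=10, bright=200):
    """Hypothetical test image: dark left half, bright right half."""
    return [
        [(bright,) * 3 if x >= width // 2 else (dark,) * 3
         for x in range(width)]
        for y in range(height)
    ]


original = average_hash(two_tone(16, 16))
rescaled = average_hash(two_tone(64, 32))          # different size and aspect ratio
brighter = average_hash(two_tone(16, 16, 30, 220))  # brightness raised by 20
print(format(original, "016x"))
print(original == rescaled == brighter)
```

For this synthetic two-tone image, the rescaled and brightened variants produce the same hash as the original, which is the behaviour described above.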
To compare two images, construct the hash from each image and count the number of bit positions that are different. This is a Hamming distance. A distance of zero indicates that it is likely a very similar picture or a variation of the same picture. A distance of 5 means a few things may be different, but they are probably still close enough to be similar. A distance of 10 or more is a probable indication that the images are different.
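The comparison can be sketched as follows, assuming each hash is held as a Python integer. The function names and the "inconclusive" label for distances of 6 to 9 are assumptions; the text above only gives guidance for distances of 0, about 5, and 10 or more.

```python
def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def interpret(distance):
    """Map a distance to the rough guidance given in the text.

    The 6-9 range is not covered by the text, so it is labelled
    "inconclusive" here as an assumption.
    """
    if distance == 0:
        return "likely the same picture or a variation of it"
    if distance <= 5:
        return "probably similar"
    if distance >= 10:
        return "probably different"
    return "inconclusive"


a = 0x8f373714acfcf4d0
b = a ^ 0b101            # same hash with two bits flipped
d = hamming_distance(a, b)
print(d, interpret(d))   # 2 probably similar
```

Because the comparison is a single XOR followed by a bit count, it is cheap enough to apply to very large numbers of stored hashes.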

Usage

TinEye's ability to search the web for specific images makes it a potential tool for copyright holders of visual works to locate infringements of their copyright. It also creates a possible avenue for people who want to use orphan works to find the copyright holders of that imagery. Because orphan works can be defined as "copyrighted works whose owners are difficult or impossible to identify and/or locate," the use of TinEye could potentially remove the orphan work status from online images that can be found in its database.