OCR of multi-page TIFFs not supported

Category: project oxford


Mark Murphy Invu on Mon, 07 Mar 2016 12:40:36


 I tried OCRing a multi-page TIFF using the code below, but only the first page was returned by GetRetrieveText.

 I notice there is no claim of TIFF support in https://dev.projectoxford.ai/docs/services/54ef139a49c3f70a50e79b7d/operations/5527970549c3f723cc5363e4 or in https://msdn.microsoft.com/en-us/library/mt634739.aspx, and I can see that multi-page support would work against the pricing policy.

 Are there plans to support multi-page TIFFs? I understand I could break the TIFF up into separate images and OCR each one.

using Microsoft.ProjectOxford.Vision.Contract;


                OcrResults ocr = vision.RecognizeText(imagefilename);
                ocrText = vision.GetRetrieveText(ocr);




cthrash99 on Mon, 07 Mar 2016 20:41:02

The API is designed to take one image per call, and rate limiting is applied per call, so no, there are no plans to change that. Fortunately, making multiple calls is relatively easy with .NET:

using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;
using Microsoft.ProjectOxford.Vision;

namespace TestApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            var client = new VisionServiceClient("{your-key}");

            using (var tiff = new Bitmap(@"{your-tiff-file}"))
            {
                var pages = tiff.GetFrameCount(FrameDimension.Page);
                for (int page = 0; page < pages; page++)
                {
                    // Make this page the active frame so Save writes only this page.
                    tiff.SelectActiveFrame(FrameDimension.Page, page);

                    using (var stream = new MemoryStream())
                    {
                        // Re-encode the page as a single-image PNG.
                        tiff.Save(stream, ImageFormat.Png);
                        stream.Position = 0;

                        // One OCR call per page.
                        var result = client.RecognizeTextAsync(stream).Result;

                        var words = result.Regions.SelectMany(region => region.Lines.SelectMany(line => line.Words.Select(word => word.Text)));

                        Console.WriteLine(string.Join(" ", words));
                    }
                }
            }
        }
    }
}
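
Since rate limiting is applied per call, a TIFF with many pages may need its requests spaced out. The following is only a rough sketch of one way to do that, awaiting each call instead of blocking on .Result and pausing between pages. The names PagedOcr, OcrEachPageAsync, and pauseBetweenCalls are hypothetical and not part of the SDK, and a suitable pause length depends on your subscription, which this thread does not specify.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.ProjectOxford.Vision;

static class PagedOcr
{
    // pauseBetweenCalls is an illustrative knob (an assumption, not an SDK setting).
    public static async Task OcrEachPageAsync(
        VisionServiceClient client, string tiffPath, TimeSpan pauseBetweenCalls)
    {
        using (var tiff = new Bitmap(tiffPath))
        {
            int pages = tiff.GetFrameCount(FrameDimension.Page);
            for (int page = 0; page < pages; page++)
            {
                tiff.SelectActiveFrame(FrameDimension.Page, page);

                using (var stream = new MemoryStream())
                {
                    tiff.Save(stream, ImageFormat.Png);
                    stream.Position = 0;

                    // One request per page, awaited rather than blocked on.
                    var result = await client.RecognizeTextAsync(stream);

                    var words = result.Regions.SelectMany(
                        r => r.Lines.SelectMany(l => l.Words.Select(w => w.Text)));
                    Console.WriteLine("Page {0}: {1}", page + 1, string.Join(" ", words));
                }

                // Space out requests; skip the pause after the last page.
                if (page < pages - 1)
                {
                    await Task.Delay(pauseBetweenCalls);
                }
            }
        }
    }
}
```

From a console Main it could be invoked as PagedOcr.OcrEachPageAsync(client, path, TimeSpan.FromSeconds(3)).Wait(); the three-second pause is an arbitrary placeholder, not a documented limit.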