Unable to recognize letter labels in an image using the ocr() function.

Hi, I tried using the ocr() function to recognize the letter labels (inputs and outputs) in the image below, which contains a logic circuit. The image is in grayscale uint8 format, so I converted it to RGB and then called ocr() on it, but the function does not recognize the labels. Is this due to noise in the image? Can someone please suggest a way to make it work?
If ocr() can't be used, please suggest an alternative function that can be used on such images.
I tried pattern recognition and it works, but I am looking for a function that can locate the label letters in the image and draw a circle around them.

Accepted Answer

Birju Patel on 19 Mar 2015
Hi Jack,
You'll need to do a bit of pre-processing to get OCR to work on your image. Here's what I did to get it working:
img = imread('C.gif');
% Make the image a bit bigger to help OCR
img = imresize(img, 3);
imshow(img)
% Binarize the image (in R2016a and later, imbinarize is the recommended replacement for im2bw)
lvl = graythresh(img);
BWOrig = im2bw(img, lvl);
figure, imshow(BWOrig)
% First remove the circuit using connected component analysis.
BWComplement = ~BWOrig;
CC = bwconncomp(BWComplement);
numPixels = cellfun(@numel, CC.PixelIdxList);
[~, idx] = max(numPixels); % index of the largest component, i.e. the circuit
BWComplement(CC.PixelIdxList{idx}) = 0;
figure, imshow(BWComplement)
% Next, because the text does not have a layout typical to a document, you
% need to provide ROIs around the text for OCR. Use regionprops for this.
BW = imdilate(BWComplement, strel('disk',3)); % grow the text a bit to get a bigger ROI around them.
CC = bwconncomp(BW);
% Use regionprops to get the bounding boxes around the text
s = regionprops(CC,'BoundingBox');
roi = vertcat(s(:).BoundingBox);
% Apply OCR
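% Thin the letters a bit, to help OCR deal with the blocky letters.
% Note: a 1-by-1 square leaves the image unchanged; use a larger element
% (for example strel('square',2)) if the letters actually need thinning.
BW1 = imerode(BWComplement, strel('square',1));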
% Set text layout to 'Word' because the layout is nothing like a document.
% Set character set to be A to Z, to limit mistakes.
results = ocr(BW1, roi, 'TextLayout', 'Word','CharacterSet','A':'Z');
% remove whitespace in the results
c = cell(1,numel(results));
for i = 1:numel(results)
c{i} = deblank(results(i).Text);
end
% insert recognized text into image
final = insertObjectAnnotation(im2uint8(BWOrig), 'Rectangle', roi, c);
figure
imshow(final)
The final image has a labeled rectangle around each piece of recognized text.
Hope that helps, Birju
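
The original question also asked for circles around the labels. A minimal sketch of one way to get them from the bounding boxes, assuming roi is an N-by-4 matrix of [xLeft, yTop, width, height] rows as produced above. The sample roi values, the blank canvas, and the choice of half the box diagonal as the radius are illustrative assumptions, not part of the original answer:

```matlab
% Example bounding boxes in [xLeft, yTop, width, height] format (made-up values).
roi = [ 10 20 30 40;
       100 50 12 16];

% Center of each box, and a radius large enough to enclose the whole box
% (half the box diagonal).
centers = roi(:, 1:2) + roi(:, 3:4) / 2;
radii   = hypot(roi(:, 3), roi(:, 4)) / 2;

% Draw the circles on a blank canvas; substitute your own image for canvas.
% insertShape requires the Computer Vision Toolbox.
canvas  = 255 * ones(200, 200, 'uint8');
circled = insertShape(canvas, 'Circle', [centers radii], 'LineWidth', 2);
figure, imshow(circled)
```

In the pipeline above, passing im2uint8(BWOrig) in place of canvas should draw the circles on the binarized circuit image.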

More Answers (4)

Image Analyst on 19 Mar 2015
First of all, remove blobs with areas larger than the number of pixels in a letter, like 200 or so. See my Image segmentation tutorial to learn how to do that. http://www.mathworks.com/matlabcentral/fileexchange/?term=authorid%3A31862 That will get rid of the circuitry and leave only the letters.
Then the OCR in the Computer Vision System Toolbox should work. http://www.mathworks.com/help/vision/ref/ocr.html#bt548t1-2_1
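
The area-based filtering suggested here can be sketched with bwareafilt from the Image Processing Toolbox. This is only an illustration on a synthetic image: the blob sizes and the 200-pixel threshold are assumptions, and the threshold would need to grow if the image is enlarged first (resizing by 3 multiplies areas by roughly 9):

```matlab
% Synthetic binary image: one large blob (stand-in for the circuit)
% and one small blob (stand-in for a letter).
BW = false(100, 100);
BW(10:59, 10:59) = true;   % 2500 pixels
BW(80:84, 80:84) = true;   % 25 pixels

% Keep only connected components whose area lies between 1 and maxLetterArea.
maxLetterArea = 200;       % assumed threshold; tune for your image and scale
lettersOnly = bwareafilt(BW, [1 maxLetterArea]);
figure, imshow(lettersOnly)
```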
  2 Comments
Jack Smith on 19 Mar 2015
Hi, thanks for your answer. I tried editing the image in MS Paint to remove all the circuitry, then tried the ocr() function again. It still doesn't work. So I suspect ocr() may not work even after segmentation produces an image with all the circuitry removed and only the letters left. If that is really the case, please suggest a good alternative to ocr() that can be used to recognize the letters in the image.
Image Analyst on 30 Oct 2021
@Jack Smith, saying ocr() doesn't work is a very strange thing to say since the answer you accepted does in fact use the ocr() function.
If you read the ocr() documentation, it says the characters must be at least 20 pixels high. Are yours that high or higher?
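
One way to check the character height is to measure the bounding boxes of the letter blobs. A minimal sketch on a synthetic image, taking the 20-pixel figure from the comment above as the target; the blob size and the variable names are assumptions for illustration:

```matlab
% Synthetic binary image with a single 8-pixel-tall "letter".
BW = false(60, 60);
BW(10:17, 5:9) = true;

% Measure the height of every connected component.
stats   = regionprops(BW, 'BoundingBox');
boxes   = vertcat(stats.BoundingBox);   % rows of [xLeft, yTop, width, height]
heights = boxes(:, 4);

% If the shortest character is below the target height, compute an
% integer scale factor and enlarge the image before running ocr().
minHeight = 20;                         % target height from the comment above
scale = 1;
if min(heights) < minHeight
    scale = ceil(minHeight / min(heights));
end
BWbig = imresize(BW, scale, 'nearest'); % 'nearest' keeps the image binary
```

This is the same idea as the imresize(img, 3) step at the start of the accepted answer, with the factor derived from the measured heights instead of chosen by hand.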
Also, you can see published papers on analyzing circuit diagrams and engineering drawings here:

azmi haider on 13 Feb 2018
Amazing work. Thanks

ali saren on 9 Jan 2019
Hi,
Many thanks for your amazing code.
Is there a simple way to delete these characters from the picture?
We have their positions, but I want to replace these words in the picture with white space.
  4 Comments
Image Analyst on 9 Jan 2019
Did you look at roi? It holds the bounding boxes, one per row, in the format [xLeft, yTop, width, height].
% grayImage must be the same size as the image the ROIs were computed on
% (in the accepted answer, the resized img).
for row = 1 : size(roi, 1)
    thisROI = roi(row, :); % Extract [xLeft, yTop, width, height]
    row1 = ceil(thisROI(2)); % yTop
    row2 = row1 + thisROI(4) - 1; % yBottom = yTop + height - 1
    col1 = ceil(thisROI(1)); % xLeft
    col2 = col1 + thisROI(3) - 1; % xRight = xLeft + width - 1
    grayImage(row1:row2, col1:col2) = 255; % Whiten this rectangle.
end
ali saren on 9 Jan 2019
Yes, I looked at roi and it was a little confusing to me, but now with your explanation it's crystal clear.
Thank you so much for your time.

Nikhil Challa on 30 Oct 2021
Amazing Code!
