Limiting false positives #237

Open
iFlash opened this issue Nov 1, 2017 · 15 comments


@iFlash

iFlash commented Nov 1, 2017

During extensive testing, I found that about 5 to 10 percent of my reads were false positives. I also found out (and I do not think this is documented) that if you check the error values in decodedCodes, rejecting codes above a certain error margin can bring your hit rate up to 100 percent.

This is the code I use. I take the average error margin. If it is below 0.1, it is fair to assume the code was detected correctly.

So far I have had no false positives at all, and detection is still very fast.

var countDecodedCodes=0, err=0;
$.each(result.codeResult.decodedCodes, function(id,error){
    if (error.error!=undefined) {
        countDecodedCodes++;
        err+=parseFloat(error.error);
    }
});
if (err/countDecodedCodes < 0.1) {
    // correct code detected
} else {
    // probably wrong code
}

Hope this helps!

@braindigitalis

braindigitalis commented Nov 17, 2017

Hi,

I too am struggling with error rates. I have tried your code and unfortunately still get errors. The longer the code, the more likely there will be an error.

For example, the attached barcode sometimes scans correctly for me, and at other times returns garbage within the string, such as "6e&n Connery":

[Image: sean-connery-barcode]

My own sanity check on the barcodes is to enforce a valid format for the returned string, e.g. reject any results that contain non-alphanumeric characters such as &, #, @, etc. This may work for you, but if you actually want to accept strings containing these characters, this workaround is not a valid solution.

I've combined your solution with my own, which has reduced my error rate to zero in my use case:

         Quagga.onDetected(function(result) {
                var code = result.codeResult.code;

                if (App.lastResult !== code) {
                        App.lastResult = code;

                        var countDecodedCodes=0, err=0;
                        $.each(result.codeResult.decodedCodes, function(id,error) {
                                if (error.error!=undefined) {
                                        countDecodedCodes++;
                                        err += parseFloat(error.error);
                                }
                        });
                        if (err / countDecodedCodes < 0.1 && sanityCheck(code)) {
                                Quagga.stop();
                                $("#scanModal").modal("hide");
                                $(linked_input).val(code);
                                border_pulse(linked_input);
                        }
                }
        });
        function sanityCheck(s) {
                return s.toUpperCase().match(/^[0-9A-Z\s\-\.\/]+$/);
        }

Hope this helps!

@iFlash
Author

iFlash commented Nov 17, 2017

My routine was done for UPC and EAN-13 codes. It might not work with other codes as nicely.

You should inspect the object codeResult.decodedCodes closely, especially its error field. Have them logged in the console to see the range and adjust your threshold accordingly.

OR: Change the line

if (err/countDecodedCodes < 0.1)

to a lower value like 0.08. The lower the value, the higher the chance the code was interpreted correctly, but it might also take longer to correctly identify the code.
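To make the logging suggestion above concrete, here is a minimal sketch that summarizes the error values of one read so the range is easier to judge. The helper name summarizeErrors is made up for illustration and is not part of Quagga; the sample array only imitates the shape of codeResult.decodedCodes, where some entries carry no error field.

```javascript
// Hypothetical helper (not a Quagga API): summarize the error values of one read.
function summarizeErrors(decodedCodes) {
    // Entries without an error field are skipped, as in the snippets above.
    const errors = decodedCodes
        .filter((c) => c.error !== undefined)
        .map((c) => Number(c.error));
    if (errors.length === 0) {
        return null; // nothing to summarize
    }
    const sum = errors.reduce((a, b) => a + b, 0);
    return {
        count: errors.length,
        min: Math.min(...errors),
        max: Math.max(...errors),
        mean: sum / errors.length,
    };
}

// Made-up sample, shaped like result.codeResult.decodedCodes.
const sample = [{ code: 105 }, { error: 0.05 }, { error: 0.07 }, { error: 0.3 }];
console.log(summarizeErrors(sample));
```

Inside an onDetected handler you could log summarizeErrors(result.codeResult.decodedCodes) next to result.codeResult.code for a batch of known-good and known-bad scans, then pick a threshold that separates them.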

@agusdutra

agusdutra commented Dec 21, 2017

@serratus Could we add this feature to the source by adding a property to the reader, say errorLimit?: number, and checking this error limit before publishing to onDetected?

It's a nice validation, since a lot of codes have an error > 0.1 and are false positives.

I'm not quite sure where the right place to add this validation would be, but I would be eager to do it if I could get some guidance.
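Until something like that exists in the library, the same effect can be approximated in user code. The following is only a sketch: withErrorLimit and its errorLimit parameter are hypothetical names, not a Quagga option, and it assumes the mean of the decodedCodes errors is the value to compare. Dropping reads that have no error values at all is also an assumption.

```javascript
// Hypothetical wrapper (not a Quagga API): only forwards reads whose
// mean error is below errorLimit.
function withErrorLimit(errorLimit, handler) {
    return function (result) {
        const errors = result.codeResult.decodedCodes
            .filter((c) => c.error !== undefined)
            .map((c) => Number(c.error));
        if (errors.length === 0) {
            return; // assumption: no error data means the read cannot be judged
        }
        const mean = errors.reduce((a, b) => a + b, 0) / errors.length;
        if (mean < errorLimit) {
            handler(result);
        }
    };
}

// Usage with Quagga loaded on the page:
// Quagga.onDetected(withErrorLimit(0.1, function (result) { /* use result */ }));
```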

@sam-lex

sam-lex commented May 16, 2018

Based on @iFlash's answer, I implemented it using the median instead of the average.

private _getMedian(arr: number[]): number {
  arr.sort((a, b) => a - b);
  const half = Math.floor( arr.length / 2 );
  if (arr.length % 2 === 1) // Odd length
    return arr[ half ];
  return (arr[half - 1] + arr[half]) / 2.0;
}

// Initializers
private _initDetectedHandler() {
  this.onDetectedHandler = (result) => {
    const errors: number[] = result.codeResult.decodedCodes
      .filter(_ => _.error !== undefined)
      .map(_ => _.error);
    const median = this._getMedian( errors );
    if (median < 0.10) {
      // probably correct
    } else {
      // probably wrong
    }
  };

  Quagga.onDetected( this.onDetectedHandler );
}

During my tests (built-in webcam), I noticed that it often reads correctly, but the averages were all above 0.1 because some of the errors have a much higher value, like 0.3 to 0.4, while others are 0.05 to 0.07, thus "pulling" the average up.

The median represents most of the dataset a bit better. That being said, I still get false positives occasionally. The hit rate is probably 7-8 out of 10, but with a faster match than with averages.
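A small worked example of that effect, with made-up error values in the ranges described above (they are not taken from a real scan):

```javascript
// Arithmetic mean of a list of numbers.
function mean(values) {
    return values.reduce((a, b) => a + b, 0) / values.length;
}

// Median of a list of numbers; sorts a copy so the input is left untouched.
function median(values) {
    if (values.length === 0) {
        return NaN;
    }
    const sorted = [...values].sort((a, b) => a - b);
    const half = Math.floor(sorted.length / 2);
    if (sorted.length % 2 === 1) {
        return sorted[half];
    }
    return (sorted[half - 1] + sorted[half]) / 2;
}

// Illustrative values: three low errors and two outliers.
const errors = [0.05, 0.06, 0.07, 0.35, 0.4];

// The two outliers pull the mean to roughly 0.186, above a 0.1 threshold,
// while the median stays at the middle value, 0.07.
console.log('mean', mean(errors));
console.log('median', median(errors));
```

With a 0.1 threshold, this read would be rejected by the average but accepted by the median.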

@fffx

fffx commented Aug 21, 2020

If you are using this for ISBN, you can make use of the check digit.
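A minimal sketch of such a check for ISBN-13, which is printed as an EAN-13 barcode. The digits are weighted 1, 3, 1, 3, ... from the left, and the total, including the check digit, must be a multiple of 10. The function name isValidIsbn13 is made up for illustration.

```javascript
// Returns true when code is 13 digits and its check digit is consistent.
function isValidIsbn13(code) {
    if (!/^\d{13}$/.test(code)) {
        return false;
    }
    let sum = 0;
    for (let i = 0; i < 13; i++) {
        const digit = code.charCodeAt(i) - 48; // '0' is char code 48
        // Even indexes (1st, 3rd, ... digit) have weight 1, odd indexes weight 3.
        sum += i % 2 === 0 ? digit : digit * 3;
    }
    return sum % 10 === 0;
}

console.log(isValidIsbn13('9780306406157')); // true
console.log(isValidIsbn13('9780306406158')); // false
```

In an onDetected handler, a read could be ignored whenever isValidIsbn13(result.codeResult.code) is false.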

@ericblade
Collaborator

fwiw, check digits alone don't seem to work well in practice. At least with UPC, it seems quite possible to get a significantly erroneous read, and still have the check digit come out matching the erroneous read. As well, I've run into quite a few UPCs that are stamped on actual product packages, but don't pass check digit validation. Mostly older stuff, though, it's probably improved quite a bit in the last several years on newer items.

I've validated this with both my own validation library, as well as other online validation resources, just to make sure that my library worked correctly.

So, a strategy that I'm wanting to put together soon is something along the lines of "if check digit validates and error rate < 50%, -or- if error rate < 20%" or something like that. tweak the numbers some to see what works.. but allow both to pass through to my app.
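A rough sketch of that two-tier rule for UPC-A. Everything here is an assumption for illustration: the function names are made up, "error rate" is taken to mean a single number such as the mean or median of the decodedCodes errors, and the 0.2 and 0.5 limits are just the placeholder numbers from the paragraph above.

```javascript
// UPC-A check: 12 digits, weights 3, 1, 3, 1, ... from the left,
// and the total including the check digit must be a multiple of 10.
function hasValidUpcACheckDigit(code) {
    if (!/^\d{12}$/.test(code)) {
        return false;
    }
    let sum = 0;
    for (let i = 0; i < 12; i++) {
        const digit = code.charCodeAt(i) - 48; // '0' is char code 48
        sum += i % 2 === 0 ? digit * 3 : digit;
    }
    return sum % 10 === 0;
}

// Accept a read if its error rate is low on its own, or if it is moderate
// and the check digit also validates. Both limits are placeholders to tune.
function shouldAccept(code, errorRate, strictLimit = 0.2, relaxedLimit = 0.5) {
    if (errorRate < strictLimit) {
        return true;
    }
    return errorRate < relaxedLimit && hasValidUpcACheckDigit(code);
}

console.log(shouldAccept('036000291452', 0.3)); // true: moderate error, check digit ok
console.log(shouldAccept('036000291453', 0.3)); // false: moderate error, bad check digit
```

Letting low-error reads through without the check digit keeps the packages with misprinted check digits scannable, which is the reason for having two tiers at all.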

@fffx

fffx commented Aug 24, 2020

@ericblade Thanks. I realized the barcode (ISBN) was too small, so I set a zoom value (#307), and the accuracy improved dramatically.

@ericblade
Collaborator

Yeah, that is also something that I think I'd like to start playing with, I just noticed it my last run through of the source code (it's amazing how many times you can go through all this stuff, and not notice certain things), and it does seem like something that could be useful.

@reon777

reon777 commented Oct 27, 2020

This works well for me: only accept a code once the last three reads are identical.

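let codes = []
function _onDetected(result) {
	codes.push(result.codeResult.code)
	// wait until three reads have been collected
	if (codes.length < 3) return
	const is_same_all = codes.every(v => v === codes[0])
	if (!is_same_all) {
		// reads disagree: drop the oldest one and keep scanning
		codes.shift()
		return
	}
	// the last three reads agree: handle codes[0] as the detected code here
}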

@julienboulay

@ericblade
Based on @sam-lex's answer, I get good results (near 100%) by validating the errors against two thresholds: the median and the max value.

function isValid(result) {
  const errors: number[] = result.codeResult.decodedCodes
    .filter(_ => _.error !== undefined)
    .map(_ => _.error);

  const median = this._getMedian(errors);

  // Good result for code_128 : median <= 0.08 and maxError <= 0.1
  return !(median > 0.08 || errors.some(err => err > 0.1))
}

@ericblade
Collaborator

0.08 :O that seems very low. out of curiosity, do you have any idea how long it takes you to get a result with that, normally?

that's usually part of the tradeoffs -- taking long to get a good result, and the battery lifetime involved in doing so

@ericblade
Collaborator

I am now playing with something resembling this

function getMedian(arr) {
    // sort a copy so the caller's array is not reordered
    const sorted = [...arr].sort((a, b) => a - b);
    const half = Math.floor(sorted.length / 2);
    if (sorted.length % 2 === 1) {
        return sorted[half];
    }
    return (sorted[half - 1] + sorted[half]) / 2;
}

function getMedianOfCodeErrors(decodedCodes) {
    const errors = decodedCodes.filter((x) => x.error !== undefined).map((y) => y.error); // TODO: use reduce
    const median = getMedian(errors);
    return { probablyValid: !(median > 0.10 || errors.some((err) => err > 0.1)), median };
}

...
        const err = getMedianOfCodeErrors(result.codeResult.decodedCodes);
        const validated = barcodeValidator(result.codeResult.code);
        console.warn('* errorCheck', result.codeResult.code, err, validated);
        if (err.probablyValid || (err.median < 0.25 && validated.valid === true && validated.type === 'upc')) {
            onDetected(result);
        }

barcodeValidator is from https://github.com/ericblade/barcode-validator

@primeKal

primeKal commented Sep 1, 2021

You can try this: collect the scanned codes in an array and pick the mode (the most frequent value).

 function mode(array){
            if(array.length == 0)
                return null;
            var modeMap = {};
            var maxEl = array[0], maxCount = 1;
            for(var i = 0; i < array.length; i++)
            {
                var el = array[i];
                if(modeMap[el] == null)
                    modeMap[el] = 1;
                else
                    modeMap[el]++;
                if(modeMap[el] > maxCount)
                {
                    maxEl = el;
                    maxCount = modeMap[el];
                }
            }
            return maxEl;
        }


        var last_result = [];

        Quagga.onDetected(function (result) {
            var last_code = result.codeResult.code;
            last_result.push(last_code);
            if (last_result.length > 20) {
                console.log(last_result);
                // once enough reads are collected, take the most repeated one as the correct code
                var code = mode(last_result);
                console.log(code + " Is the most valid one");
            }
        });

@dansleboby

Great list of regexes for each barcode type:
https://www.neodynamic.com/Products/Help/BarcodeWinControl2.5/working_barcode_symbologies.htm

@ericblade
Collaborator

Great list of regexes for each barcode type: https://www.neodynamic.com/Products/Help/BarcodeWinControl2.5/working_barcode_symbologies.htm

Perhaps useful, but it's worth noting that regex comparisons only tell you that a code might be valid, not that it is valid. I doubt a decoder is going to return something that wouldn't pass a regex, but due to read errors it can easily return something that doesn't actually pass a checksum test.
