User interface #1

Open
drdhaval2785 opened this issue Oct 30, 2014 · 4 comments

Comments

@drdhaval2785

@rakeshvar

Wouldn't it be a good idea to have a front-end for your code, so that people without much technical knowledge (like me) can test it?

Maybe appspot for a trial?

@rakeshvar
Collaborator

Dr Dhaval,
This currently works only for non-contiguous scripts such as the South Indian ones, Gujarati, Oriya, etc. The framework does not work for scripts with a shirorekha!
I am busy with the Machine Learning part (which is almost ready). I need a volunteer to build the front-end, as I am not an expert on such things. I also need a website and back-end infrastructure.

- Rakeshvar

@drdhaval2785
Author

@rakeshvar
Any progress in this regard?
At least a bundle of installation files would be a welcome step.
Right now I am not able to test the OCR.

If you could make a video showing how to use this tool, it would be a great help.

I read an article about the shirorekha problem in OCR and how it can be circumvented.
The crux of the idea was something like this.

  1. Let's presume the height of the shirorekha is x.
  2. For each point along the X-axis, we total the black dots on the Y-axis above it and plot that total against the X-axis.
  3. If the total is > x, we retain the black dots at that point.
  4. If the total is <= x, we remove them (thereby removing the shirorekha and getting separate letters).

Let me explain it with an example.
[image: capture]

In this image:
cross section 'a' has value 3x (3 black intersections).
cross section 'b' has value x (only the shirorekha intersection).
cross section 'c' has value 4x (4 black intersections).

So we remove the black points wherever the value is only 'x', or maybe less than that.
After that we get separate letters, without the shirorekha.
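
The steps above could be sketched roughly as follows. This is only my reading of the idea as described in this comment, not the method from the paper and not part of this project's code. The names `column_totals` and `remove_thin_columns`, and the toy `image`, are made up for illustration. It assumes a binary image stored as a list of rows, with 1 for a black dot and 0 for white, and it counts black dots per column rather than intersections.

```python
def column_totals(image):
    """Total the black dots (1s) in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def remove_thin_columns(image, x):
    """Blank every column whose total is <= x (x = assumed shirorekha height)."""
    totals = column_totals(image)
    return [
        [pixel if totals[c] > x else 0 for c, pixel in enumerate(row)]
        for row in image
    ]


# Toy example: a 1-pixel-high shirorekha joining two vertical strokes.
image = [
    [1, 1, 1, 1, 1, 1, 1],  # shirorekha
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
]

print(column_totals(image))           # [1, 4, 1, 1, 1, 4, 1]
for row in remove_thin_columns(image, x=1):
    print(row)                        # every row: [0, 1, 0, 0, 0, 1, 0]
```

In the toy image, the columns that contain only the shirorekha total 1 (that is, x) and are blanked, so the two strokes come out as separate pieces. Note that a sketch like this blanks whole columns, so any part of a letter that is thin at that position would be lost too.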

If I get access to the full paper, I will share it.
Hope this was useful.

@drdhaval2785
Author

http://research.ijcaonline.org/volume39/number6/pxc3877076.pdf

This is the paper I was referring to.

@rakeshvar
Collaborator

Please check your Gmail.
