gaxler/dataset_agnostic_segmentation

TensorFlow implementation of a segmentation system for document images.
Dataset Agnostic Word Segmentation

TensorFlow implementation of Towards A Dataset Agnostic Word Segmentation

Usage

Train

python main.py --name train_example --train-hmap --train-regression --iters 100000 --dataset iamdb  --data_type train --gpu_id 0

Evaluation

python main.py --name train_example --eval-run --iters 500 --dataset iamdb  --data_type val1 --gpu_id 0

Expected output

Original document (Bentham/iclef) and its predicted heatmap (images available in the repository).

Dependencies

  • Python 2.7
  • TensorFlow 1.3
  • OpenCV
  • CUDA

Also see requirements.txt

License

Code is released under the MIT License (refer to the LICENSE file for details).

Data

You can load several datasets automatically by passing the following command-line flags:

  1. dataset
  2. data_type
  3. data_type_prob

Possible values for the dataset flag are 'iamdb', 'iclef', 'botany', 'icdar' and 'from-pkl-folder' (see below for further details). Available data_type values are 'train', 'val1', 'val2' and 'test'. You can mix several data types by passing a string in which each type is separated by '#'. Multiple data types are sampled uniformly unless you pass a string of '#'-separated probabilities to data_type_prob (NOTE: the probability string must contain one value per data type).

For example:

python main.py ... --dataset iamdb  --data_type #train#val1#val2 --data_type_prob #0.75#0.125#0.125

You can change the above by playing with the code in input_stream.py
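As a rough illustration of the mixing scheme described above, the '#'-separated data_type and data_type_prob strings could be parsed and sampled like this. This is a hypothetical sketch only; the actual parsing logic lives in input_stream.py and may differ, and the function names here (parse_mix, sample_type) are invented for the example.

```python
# Hypothetical sketch of parsing '#'-separated data_type / data_type_prob
# strings and sampling a data type. Not the repository's actual code.
import random


def parse_mix(data_type, data_type_prob=None):
    """Split '#'-separated type and probability strings into parallel lists."""
    types = [t for t in data_type.split('#') if t]
    if data_type_prob:
        probs = [float(p) for p in data_type_prob.split('#') if p]
    else:
        # With no probabilities given, sample uniformly.
        probs = [1.0 / len(types)] * len(types)
    # One probability per data type is required.
    assert len(probs) == len(types)
    return types, probs


def sample_type(types, probs):
    """Draw one data type according to the given weights."""
    return random.choices(types, weights=probs, k=1)[0]


types, probs = parse_mix('#train#val1#val2', '#0.75#0.125#0.125')
print(types)  # ['train', 'val1', 'val2']
print(probs)  # [0.75, 0.125, 0.125]
```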

Assumed directory structure for the available datasets

   Root
   └── datasets
       ├── icdar
       ├── iclef
       ├── iamdb
       └── botany

You probably want to use symlinks rather than keep the data in the repo directory:

cd datasets
ln -s /path/to/icdar icdar

ICDAR2013

  • Available for download from here (You need the images and word ground truth data)

  • Extract the rar files into the following directory structure:

      datasets/icdar
      ├── gt_words_test
      ├── gt_words_train
      ├── images_test
      └── images_train
    
  • Command-line flags: pass 'icdar' to the dataset flag; available values for data_type are {train, val1, test}

  • Binary ground truth data converted to bounding box coordinates is available as a pkl from here

You can get dataset splits and cached Numpy arrays of bounding box data from here
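If you download the cached bounding-box pkl, it can be inspected with a few lines of Python. The layout assumed below (a dict mapping image names to Nx4 arrays of (x1, y1, x2, y2) coordinates) is an illustration only; inspect the actual file to confirm its structure.

```python
# Hypothetical sketch of reading a cached bounding-box pickle.
# The dict-of-arrays layout is an assumption, not the confirmed format.
import pickle

import numpy as np


def load_box_cache(pkl_path):
    """Return the unpickled bounding-box cache."""
    with open(pkl_path, 'rb') as f:
        return pickle.load(f)


# Round-trip demo with a fake cache file:
fake = {'page_001.png': np.array([[10, 20, 110, 60],
                                  [15, 70, 140, 115]])}
with open('/tmp/bbox_cache.pkl', 'wb') as f:
    pickle.dump(fake, f)

cache = load_box_cache('/tmp/bbox_cache.pkl')
print(cache['page_001.png'].shape)  # (2, 4): two boxes, four coordinates each
```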

IAM Database

  • Available for download from here (registration is required)

  • Make sure you have the following directory structure:

     datasets/iamdb
      ├── ascii
      ├── forms
      ├── lines
      ├── sentences
      ├── words
      └── xml
    
  • Command-line flags: pass 'iamdb' to the dataset flag; available values for data_type are {train, val1, val2, test}

You can get dataset splits from here

Bentham dataset (iclef)

  • Available for download from here

  • Make sure you have the following directory structure:

     datasets/iclef
      ├── bboxs_train_for_query-by-example.txt
      ├── pages_devel_jpg
      ├── pages_devel_xml
      ├── pages_test_jpg
      ├── pages_test_xml
      ├── pages_train_jpg
      └── pages_train_xml
    
  • Command-line flags: pass 'iclef' to the dataset flag; available values for data_type are {train, val1, val2, test}

You can get dataset splits from here

Botany

  • Available for download from here

  • Make sure you have the following directory structure:

     datasets/botany
     ├── Botany_Train_III_PageImages
     ├── Botany_Train_II_PageImages
     ├── Botany_Train_I_PageImages
     ├── Botany_Test_PageImages
     └── xml
         ├── Botany_Train_III_WL.xml
         ├── Botany_Train_II_WL.xml
         ├── Botany_Train_I_WL.xml
         └── Botany_Test_GT_SegFree_QbS.xml
    
