@DS4SD

IBM Deep Search

Developer tools for IBM Deep Search

Welcome to IBM Deep Search

Deep Search extracts and structures data from documents in four steps: Parse, Interpret, Index, and Integrate. You can try the first steps on our public system, which offers a live PDF-to-JSON inspector. The inspector shows how your (programmatic) PDF documents are converted into JSON.

Deep Search also provides programmatic access to the service, for easy integration with other tools or for bulk conversion. Our Python toolkit provides this functionality both as a client and as a library. Our examples repository is a good place to get started.
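
To make the four steps (Parse, Interpret, Index, Integrate) more concrete, here is a minimal, purely illustrative sketch using only the Python standard library. It does not use the Deep Search toolkit or reflect its actual API or JSON schema: the function names, the uppercase-means-title heuristic, and the output layout are all assumptions made for this example, and plain text stands in for a PDF.

```python
import json
from collections import defaultdict


def parse(raw_text):
    """Parse: split raw text into non-empty lines (toy stand-in for PDF parsing)."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


def interpret(lines):
    """Interpret: label each line as 'title' or 'paragraph' (assumed heuristic)."""
    return [
        {"type": "title" if line.isupper() else "paragraph", "text": line}
        for line in lines
    ]


def index(items):
    """Index: map each lowercase word to the positions of the items containing it."""
    lookup = defaultdict(list)
    for pos, item in enumerate(items):
        words = {w.strip(".,;:!?") for w in item["text"].lower().split()}
        words.discard("")
        for word in words:
            lookup[word].append(pos)
    return dict(lookup)


def integrate(items, lookup):
    """Integrate: serialize everything to JSON so other tools can consume it."""
    return json.dumps({"items": items, "index": lookup}, sort_keys=True)


def run_pipeline(raw_text):
    """Run the four hypothetical steps in order and return a JSON string."""
    items = interpret(parse(raw_text))
    return integrate(items, index(items))


if __name__ == "__main__":
    sample = "INTRODUCTION\nDeep learning for documents.\n\nMETHODS\nWe parse documents."
    print(run_pipeline(sample))
```

For real conversions, the toolkit and examples repositories listed below are the places to look; this sketch only shows how the stages hand data to one another.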


Publications

See our extensive list of publications.

Gallery

  • Image extraction
  • Table understanding
  • List resolution
  • Math formula
  • Complex layout
  • Colored layout

Pinned

  1. docling Public

    Get your documents ready for gen AI

    Python · 9.2k stars · 434 forks

  2. deepsearch-toolkit Public

    Interact with the Deep Search platform for new knowledge explorations and discoveries

    Python · 135 stars · 19 forks

  3. deepsearch-examples Public

    Examples using the Deep Search functionalities

    Python · 46 stars · 14 forks

  4. DocLayNet Public

    DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis

    269 stars · 15 forks

Repositories

Showing 10 of 23 repositories
  • docling Public

    Get your documents ready for gen AI

    Python · 9,167 stars · MIT · 434 forks · 50 issues (3 issues need help) · 11 pull requests · Updated Nov 15, 2024
  • docling-ibm-models

    Python · 39 stars · MIT · 9 forks · 5 issues · 3 pull requests · Updated Nov 15, 2024
  • docling-parse Public

    Simple package to extract text with coordinates from programmatic PDFs

    C++ · 26 stars · MIT · 8 forks · 3 issues · 2 pull requests · Updated Nov 14, 2024
  • docling-core Public

    A Python library to define and validate data types in Docling.

    Python · 29 stars · MIT · 7 forks · 2 issues · 3 pull requests · Updated Nov 14, 2024
  • deepsearch-glm Public

    Create fast graph language models from converted PDF documents for knowledge extraction and Q&A.

    C++ · 21 stars · MIT · 3 forks · 2 issues · 1 pull request · Updated Oct 23, 2024
  • deepsearch-toolkit Public

    Interact with the Deep Search platform for new knowledge explorations and discoveries

    Python · 135 stars · MIT · 19 forks · 8 issues · 11 pull requests · Updated Oct 17, 2024
  • docling-serve Public

    Running Docling as an API service

    Makefile · 15 stars · MIT · 3 forks · 0 issues · 1 pull request · Updated Oct 11, 2024
  • MolGrapher Public

    MolGrapher: Graph-based Visual Recognition of Chemical Structures

    Python · 47 stars · MIT · 3 forks · 0 issues · 1 pull request · Updated Oct 9, 2024
  • DS4SD.github.io

    CSS · 10 stars · MIT · 1 fork · 0 issues · 0 pull requests · Updated Oct 8, 2024
  • quackling Public archive

    Build document-native LLM applications

    Python · 50 stars · MIT · 1 fork · 0 issues · 0 pull requests · Updated Sep 11, 2024
