Knowledge Graph Embeddings (KGE) for RAG-LLMs. Our goal was to compare the mathematical differences between Traditional Static Multimodal Vector Embeddings (TVE) from Word2Vec and CLIP encoders for {text:image} datasets, and Knowledge Graph Embeddings (KGE) generated with REBEL triplets trained on PyKeen.

dsgiitr/kge-clip

Upgrading from Vectors to Graphs: Knowledge Graph Embeddings and Graph-RAG

Knowledge Graph

You can access the full project documentation at Gitbook Link!

Readme Sections:

  1. Access the Dataset
  2. Creating Traditional Vector Embeddings
  3. Embeddings Visualization in 3D
  4. Generating Knowledge Graphs
  5. PyKeen Knowledge Graph Embedding Training
  6. Storing Embeddings in FAISS index
  7. Running the KG visualiser web-app
  8. RAG_VLM

Directory Structure

.
├── 1_Traditional_Vector_Embeddings   # Traditional text and image embeddings using Word2Vec and CLIP
├── 2_Knowledge_Graphs                # Code and resources for generating Knowledge Graphs and extracting triplets
├── 3_KG_Embeddings                   # Knowledge Graph Embeddings (KGE) training using PyKeen and dimensionality reduction
├── 4_Deployment_dev                  # Scripts for deploying and testing embedding models
├── 6_FAISS_embeddings                # FAISS-based search for efficient embedding retrieval and comparisons
└── README.md                         # Project documentation

Browse these directories for the source code and assets for the image and text datasets.

Additional Directories

📂 /src:
Core code for training Knowledge Graph Embeddings (KGE) using PyKeen, including scripts, configs, and data utilities.

📂 /assets:
Contains embedding results, visualizations, and key outputs from the models.

📑 /notebooks:
Jupyter notebooks for visualizing and comparing traditional and Knowledge Graph Embeddings (KGE).

Setup Guide and Results

1. Access the dataset

  • The dataset, a reduced 1k subset of the COYO700M dataset, can be found here.

2. Creating Traditional Vector Embeddings

Four methods were used to create text embeddings, and one CLIP notebook is provided for image embeddings.

  1. CLIP Embeddings
  2. InferSent Embeddings
  3. Universal Sentence Encoder
  4. BERT
  5. CLIP for Image Embeddings

To generate embeddings with one of these notebooks, e.g. CLIP_Embeddings.ipynb:

  1. Open the notebook.
  2. Run all the cells to load the CLIP model and generate embeddings.
  3. Follow the instructions in the notebook to input your data and obtain embeddings.

Requirements

  • Python 3.x
  • Required libraries (list them here)

How to Install

  1. Clone the repository.
    git clone https://github.com/dsgiitr/kge-clip.git
    cd kge-clip/1_Traditional_Vector_Embeddings
  2. Install the required libraries using pip install -r requirements.txt.
  3. Open the Jupyter notebooks and follow the instructions.

3. Embeddings Visualization in 3D

To visualize text and image embeddings, use the following notebooks:

  1. Text Embeddings Visualizer
  2. Image Embeddings Visualizer

Each embedding and cluster will be saved in metadata.tsv.

To launch TensorBoard from a notebook, first load the extension with %load_ext tensorboard, then run:

%tensorboard --logdir /path/to/logs/embedding
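
As a rough illustration of the metadata.tsv layout, the sketch below writes a vectors file and a matching metadata file using only the standard library. The function name, the file name vectors.tsv, and the column names "label" and "cluster" are assumptions made for this example; the notebooks may organise their output differently.

```python
import csv
import os
import tempfile


def write_projector_files(log_dir, vectors, labels, clusters):
    """Write vectors.tsv and metadata.tsv as tab-separated files."""
    if not (len(vectors) == len(labels) == len(clusters)):
        raise ValueError("vectors, labels and clusters must have the same length")
    os.makedirs(log_dir, exist_ok=True)
    vectors_path = os.path.join(log_dir, "vectors.tsv")
    metadata_path = os.path.join(log_dir, "metadata.tsv")

    # One embedding per line, components separated by tabs, no header row.
    with open(vectors_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        for vec in vectors:
            writer.writerow([f"{x:.6f}" for x in vec])

    # Multi-column metadata starts with a header row; row i describes vector i.
    with open(metadata_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(["label", "cluster"])
        for label, cluster in zip(labels, clusters):
            writer.writerow([label, cluster])

    return vectors_path, metadata_path


# Toy data (made up): three 3-D embeddings with captions and cluster ids.
demo_dir = tempfile.mkdtemp()
vec_path, meta_path = write_projector_files(
    demo_dir,
    vectors=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],
    labels=["a dog on a beach", "a red car", "a bowl of fruit"],
    clusters=[0, 1, 2],
)
```

Keeping the row order identical in both files is what lets the projector attach each label and cluster id to the right point.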

4. Generating Knowledge Graphs

Knowledge graphs for both {text:image} pairs were generated using the following steps:

  1. Triplet Extraction
    Run the Rebel_extraction.ipynb notebook to extract triplets using the Babelscape REBEL-large model. You can find the notebook here.

  2. Knowledge Graph Generation and Visualization
    Use the KG.ipynb notebook to generate knowledge graphs and visualize them using Neo4J, NetworkX, and Plotly. Access the notebook here.
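
For orientation, REBEL emits triplets as a single linearized string that has to be parsed before a graph can be built. The sketch below is a simplified parser, assuming the `<triplet> head <subj> tail <obj> relation` layout described in the REBEL model card; the notebook's own parsing code may differ. It produces dictionaries with 'head', 'type' and 'tail' keys.

```python
def extract_triplets(text):
    """Parse REBEL's linearized output into head/type/tail dictionaries."""
    triplets = []
    head, tail, relation = [], [], []
    current = None  # the list that the next plain token is appended to

    def flush():
        # Emit a triplet only when all three parts have been collected.
        if head and tail and relation:
            triplets.append(
                {"head": " ".join(head), "type": " ".join(relation), "tail": " ".join(tail)}
            )

    # Drop the tokenizer's special tokens before splitting on whitespace.
    for special in ("<s>", "</s>", "<pad>"):
        text = text.replace(special, " ")

    for token in text.split():
        if token == "<triplet>":    # starts a new head entity
            flush()
            head, tail, relation = [], [], []
            current = head
        elif token == "<subj>":     # head (or previous relation) is done, a tail starts
            flush()
            tail, relation = [], []
            current = tail
        elif token == "<obj>":      # tail is done, the relation label starts
            relation = []
            current = relation
        elif current is not None:
            current.append(token)
    flush()
    return triplets


decoded = "<s><triplet> Punta Cana <subj> Dominican Republic <obj> country</s>"
triplets = extract_triplets(decoded)
```

A head entity can be followed by several `<subj> ... <obj> ...` groups, in which case each group yields its own triplet with the same head.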

Running Neo4J Database Instance

To run a local Neo4J instance and visualize the knowledge graph:

  1. Install Neo4J
    Download and install Neo4J from the official site.

  2. Start Neo4J
    After creating an account and a database instance, run the following snippet to connect to it remotely and load the triplets.

from neo4j import GraphDatabase

# Connection details: replace the placeholders with your own instance values.
# Never commit real credentials to the repository.
uri = "neo4j+s://<your-instance-id>.databases.neo4j.io"
username = "neo4j"
password = "<your-password>"
driver = GraphDatabase.driver(uri, auth=(username, password))

def create_nodes_and_relationships(tx, head, type_, tail):
    # MERGE creates each node/relationship only if it does not already exist.
    query = (
        "MERGE (a:head {name: $head}) "
        "MERGE (b:tail {name: $tail}) "
        "MERGE (a)-[r:Relation {type: $type}]->(b)"
    )
    tx.run(query, head=head, type=type_, tail=tail)

# `triplets` is expected to be a list of dicts with 'head', 'type' and 'tail' keys,
# e.g. built from df_rebel['triplet'].tolist()
# Open a session and add data
with driver.session() as session:
    for row in triplets:
        # execute_write is the neo4j driver 5.x name; on 4.x use write_transaction.
        session.execute_write(create_nodes_and_relationships, row['head'], row['type'], row['tail'])

print("Knowledge graph created successfully!")

driver.close()

  3. Run the following Cypher query on the Neo4J database instance to display every node pair and the relationship between them:
MATCH (n)-[r]->(m)
RETURN n, r, m

5. PyKeen Knowledge Graph Embedding Training

The PyKeen model is trained on Text and Image KG triplets extracted using Babelscape REBEL-large.

PyKeen Model Configuration

from pykeen.pipeline import pipeline

result = pipeline(
    model='TransE',  # Choose a graph embedding technique
    loss="softplus",
    training=training_triples_factory,
    testing=testing_triples_factory,
    model_kwargs=dict(embedding_dim=3),  # Set embedding dimensions
    optimizer_kwargs=dict(lr=0.1),  # Set learning rate
    training_kwargs=dict(num_epochs=100, use_tqdm_batch=False),  # Set number of epochs
)
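
For intuition about what the TransE model above learns, the following sketch scores a triple as the negative distance between head + relation and tail. It is an illustration in plain Python, not PyKeen code: the L2 norm is an assumption (the norm is configurable in practice) and the toy vectors are made up.

```python
import math


def transe_score(head, relation, tail):
    """Return -||h + r - t||_2; values closer to 0 mean a more plausible triple."""
    if not (len(head) == len(relation) == len(tail)):
        raise ValueError("all embeddings must share the same dimension")
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Toy 3-D embeddings (made up for illustration, matching embedding_dim=3).
girl = [0.2, 0.1, 0.0]
wears = [0.3, 0.4, 0.1]
scarf = [0.5, 0.5, 0.1]   # equals girl + wears, so this triple fits perfectly
car = [-0.9, 0.8, 0.7]    # far from girl + wears, so this triple fits poorly

good = transe_score(girl, wears, scarf)
bad = transe_score(girl, wears, car)
```

Training pushes the scores of observed triples towards 0 and the scores of corrupted (negative) triples further below it.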

The trained KGEs for both text and images are then reduced to 3D space using PCA, UMAP, and t-SNE. The resulting embeddings and media can be found in the assets folder here.


6. Storing Embeddings in FAISS index

A FAISS index was used to store the {text:image} vector and Knowledge Graph embeddings for later use with RAG-LLMs.

Access the FAISS index notebook here. Set the dimension to match the embedding size required by your model.

import faiss

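dimension = 512  # must equal the length of each embedding vector
index = faiss.IndexFlatL2(dimension)  # exact (brute-force) L2 index

# Both arrays must be float32 with shape (n, dimension)
index.add(embeddings_img_array)   # add the image embeddings
index.add(embeddings_text_array)  # add the text embeddings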

faiss.write_index(index, 'faiss_traditional_vector_embedding.index')
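
Because image and text embeddings go into the same flat index, FAISS assigns them sequential ids in insertion order: images first, then text. The sketch below mimics that behaviour in plain Python (no FAISS required) to show how a returned id can be mapped back to its modality. The helper names and toy vectors are hypothetical. Like IndexFlatL2, the search ranks by squared L2 distance, smallest first.

```python
def flat_l2_search(stored, query, k):
    """Brute-force nearest neighbours by squared L2 distance, smallest first."""
    scored = []
    for idx, vec in enumerate(stored):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


def modality_of(idx, n_images):
    """Map a sequential id back to ('image', row) or ('text', row)."""
    if idx < n_images:
        return "image", idx
    return "text", idx - n_images


# Toy 2-D stand-ins for the 512-D embeddings (values are made up).
image_vectors = [[0.0, 0.0], [1.0, 1.0]]
text_vectors = [[0.9, 1.1], [5.0, 5.0]]
stored = image_vectors + text_vectors   # same order as the two index.add calls

distances, ids = flat_l2_search(stored, query=[1.0, 1.0], k=2)
hits = [modality_of(i, len(image_vectors)) for i in ids]
```

Recording the number of image rows alongside the saved index is enough to recover this mapping later.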

7. Running the KG visualiser web-app

This repository contains a Flask-based web app that supports:

  • Text-Based Knowledge Graph Generation
  • Image-Based Knowledge Graph Generation
  • Text & Image Vector Embedding and Knowledge Graph Embedding with TensorBoard

The app utilizes Python libraries, the REBEL model, and Graphviz for advanced graph visualization.

Follow these steps to set up and run the web app.

Prerequisites

Ensure your environment meets the following requirements:

  1. Python 3.7 or higher
  2. pip (Python package installer)
  3. Graphviz for advanced graph visualization
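
The pip package graphviz is only a wrapper: it needs the Graphviz executables (such as dot) installed on the system and available on PATH. A quick way to check the prerequisites before installing anything is sketched below. The function name and the result keys are made up for this example.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 7)):
    """Report whether Python, pip and the Graphviz `dot` binary are available."""
    return {
        "python_ok": sys.version_info >= min_python,
        "pip_ok": shutil.which("pip") is not None or shutil.which("pip3") is not None,
        "graphviz_dot_ok": shutil.which("dot") is not None,
    }


status = check_prerequisites()
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```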

Installation

  1. Clone the Repository

Fork the project and clone it to your local machine:

git clone https://github.com/dsgiitr/kge-clip.git
cd kge-clip/4_Deployment_dev

Set Up the Virtual Environment

Create and activate a virtual environment to manage dependencies:

  • On Windows:
python -m venv venv
venv\Scripts\activate
  • On macOS/Linux:
python3 -m venv venv
source venv/bin/activate

Install Dependencies

Install the required Python packages:

pip install flask transformers torch pandas networkx matplotlib plotly graphviz

Running the Flask App

Activate the virtual environment and point Flask at the app:

  • On Windows:
venv\Scripts\activate
set FLASK_APP=app.py
  • On macOS/Linux:
source venv/bin/activate
export FLASK_APP=app.py

Run the Flask app with:

flask run

Open your web browser and navigate to http://127.0.0.1:5000/ to start using the app.


8. RAG_VLM

This module demonstrates how FAISS-based Knowledge Graph Embeddings (KGE) and Traditional Vector Embeddings (TVE) are utilized in conjunction with a Vision-Language Model (VLM) for image inference. The VLM (LLaVA) leverages CLIP embeddings for processing the test image.

  • CLIP Embeddings: CLIP provides a shared latent space for images and text, enabling multimodal embeddings that are used for cross-modal retrieval.
  • FAISS Index: Both the KGE (Knowledge Graph Embeddings) and TVE (Traditional Vector Embeddings) are stored in FAISS, facilitating fast similarity searches.
  • VLM (LLaVA): This model was utilized to generate text descriptions from images, and the embeddings generated by the CLIP processor are used for retrieving the most similar images from FAISS indices.

Workflow:

  1. Image Captioning with VLM (LLaVA):

    • The VLM model generated the following caption for the test image:
      • ['A young girl is smiling and showing her teeth', 'She is wearing a colorful shirt and a brown scarf'].
  2. CLIP Embeddings Generation:

    • CLIP processor was used to create image embeddings for the test image.
  3. FAISS Index Loading:

    • Loaded FAISS KGE (Knowledge Graph Embeddings) and TVE (Traditional Vector Embeddings), trained on PyKeen with REBEL triplets and image embeddings.
  4. Similarity Search:

    • A similarity search was performed on the test image embedding across both FAISS indices (KGE and TVE).
  5. Ranking of Similar Images:

    • The top-ranked images were retrieved based on the highest similarity scores in both FAISS indices.
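    # get_features_from_image_path is assumed to be defined in the notebook and to return the CLIP image embedding
    image_path = ["/content/RAG_test_image.jpeg"]
    image_search_embedding = get_features_from_image_path(image_path)
    # Retrieve the 2 nearest neighbours from the TVE index
    distances, indices = index_tve.search(image_search_embedding.reshape(1, -1), 2)
    distances = distances[0]
    indices = indices[0]
    indices_distances = list(zip(indices, distances))
    # Sort by score, largest first. Note: for an L2 index such as IndexFlatL2 the
    # score is a squared distance, so the smallest value is the closest match.
    indices_distances.sort(key=lambda x: x[1], reverse=True)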
  6. Results:

    • TVE Similarity: [(73, 81.27001), (149, 77.19481)]
    • KGE Similarity: [(2406, 121.6897), (163, 121.454765)]
  7. Image Relevance:

    • The retrieved images from both FAISS indices were visually compared for relevance to the original test image.
  8. Factors the KGE FAISS results depend on:

    • More fine-tuned Triplet Extraction
    • PyKeen Training methods for Embedding generation
    • Combining Entity and Relation Embeddings.
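
How the (index, score) pairs from the similarity search should be ordered depends on the metric of the FAISS index. The sketch below is one way to make that explicit. The helper name and the metric labels are assumptions, and the numbers are simply the TVE values reported above.

```python
def rank_hits(indices, distances, metric="l2"):
    """Pair ids with scores and order them best match first."""
    if metric not in ("l2", "ip"):
        raise ValueError("metric must be 'l2' or 'ip'")
    pairs = list(zip(indices, distances))
    # L2: smaller distance is better. Inner product: larger score is better.
    pairs.sort(key=lambda p: p[1], reverse=(metric == "ip"))
    return pairs


# Values copied from the TVE result above; which metric applies is an assumption.
tve_ids, tve_scores = [73, 149], [81.27001, 77.19481]

as_l2 = rank_hits(tve_ids, tve_scores, metric="l2")
as_ip = rank_hits(tve_ids, tve_scores, metric="ip")
```

Read as L2 distances, id 149 is the closer match. Read as inner-product scores, id 73 ranks first. It is therefore worth confirming the index type before comparing the TVE and KGE rankings.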

Results and Comparisons

Note

Detailed results and descriptions are available in the DSG Gitbook.

The results were divided into:

  1. 3D-reduced visualisation of traditional vector embeddings using TensorBoard. 📂 Results Folder
  2. Similarity scores of reduced embeddings from different text encoders. 📂 Results Folder
  3. Comparison of image and text vector embedding disparity and contextual drawbacks. 📂 Results Folder
  4. Scene Graph Generation of {text:image} pairs using VLM & Relationformer. 📂 Results Folder
  5. KG visualisation with Neo4j, NetworkX, Plotly and Graphviz. 📂 Results Folder
  6. KG and traditional vector embeddings as .csv files. 📂 Results Folder

Core Contributors

The core contributors to this repository are (listed alphabetically):


Contributions 🚀

We welcome contributions to improve this project! To contribute:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Commit your changes with clear and descriptive messages.
  4. Push the changes to your fork and submit a pull request.

Important

Please ensure your contributions align with the project's coding standards and include relevant documentation or tests. For major changes, consider opening an issue to discuss your approach first.
