Skip to content

apurba-pp/RL-based-Maze-Solver

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 

Repository files navigation

RL-based Maze Solver

Open Projects 2021

Input Maze Animated Solution
fig1 Input Maze Animated Solution

Input Maze Solution
fig2 Input Maze Solution

Random Maze Solution
fig3 Random Maze Solution

Abstract

The aim of this project is to solve a given 2D maze (given as an image input) using RL (Q-Learning) algorithm. The program first processes the image input of the maze and converts it into a matrix. The Q-Learning algorithm then uses this matrix to generate a q-matrix, which is finally used to get the shortest path. The program can also generate random mazes and then solve it.

Motivation

Reinforcement learning in robotics has been a challenging domain in the field of AI for the past few years. The ability to equip a robot with a powerful enough tool to allow an autonomous discovery of an optimal behavior through trial-and-error interactions with its environment has paved the path to many deep research projects.

Our project inspires us to solve many problems, like Autonomous Mobile Robot Obstacle Prevention using Q-learning.

Image Processing
fig4 Image Processing

Software Aspect of the Design

Tech Stack

  • Python - It is an interpreted high-level object-oriented programming language designed by Guido van Rossum. It emphasizes code readability with significant use of indentation. It has tons of third-party open-source libraries (which can be installed using its own package manager pip) to assist all types of programs.

  • Jupyter Notebook - It is a product developed under Project Jupyter (spun off from IPython in 2014 by Fernando Perez) and is used to create documents containing live code, equations, visualizations and narrative text. It is used in data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, etc.

  • NumPy - It is a Python library providing fundamental package for scientific computing. It is used for working with arrays in the domain of linear algebra, fourier transform, matrices, etc. Numpy arrays are 50x faster than Python lists.

  • Matplotlib - It is a Python library for creating static, animated, and interactive visualizations and plots and was introduced by John D. Hunter in 2002.

  • Python Imaging Library (PIL) - It is a Python library providing support for opening, manipulating, and saving many different image file formats. It supports Python version 1.5.2 to 2.7. A subsequent fork of the PIL repository named Pillow added Python 3.x support.

Working

  • Terminologies

    • Episode - In a given episode the Agent solves the maze by making some random moves (exploration) and some moves by looking up at the q-matrix (exploitation).

    • Iteration - Every move in an episode is called as an iteration.

    • State - Every coordinate point of the Maze can be defined as a unique state.

    • Q-Matrix - A Q-matrix is defined as a numpy matrix of size (no. of different moves available, i.e. 4) × (No. of states) with each entry as Zero defined initially.

    • Bellman Equation - It is used to update the Q-Matrix.

    Bellman Equation
    fig5 Bellman Equation

  • Image Processing

    Image Processing Workflow
    fig6 Image Processing Workflow

    Click Here for detailed explanation

    Click here to access the Image Processing Code

  • Maze Solving

    Maze Solving Workflow
    fig7 Maze Solving Workflow

    Click here to access the Maze Solving Code

Cost Structure

Software (Components) Cost
Python Open-Source/None
Jupyter Notebook Open-Source/None
NumPy Open-Source/None
Matplotlib Open-Source/None
PIL Open-Source/None

Applications

This project can be used to develop a navigation system with the ability to learn to adapt to unknown environments. Such game-playing AIs are designed in a way that their solutions are relevant in many practical applications :

  • Industrial Applications - Mobile robots have been increasingly used over the last two decades in various industries to move goods from one place to another with optimal movement policy.

  • Restaurants of the Future (ROTF) - ROTF is one of the greatest physical application of the project. Since such a RL based implementation in a mobile robot will help us to give and take orders from the customers, in a faster and more efficient way.

  • Navigation - Maze solving can further be extended for autonomous navigation in an occupancy grid to get to the nearest landmark like an EV charging station or a petrol pump.

Limitations

At the current state, our project has a few limitations :

  • Before running, the program requires total number of episodes and number of random episodes to be given as a manual input. These can be developed as a function of the maze size to get a higher degree of automation.

  • The reduceMatrix(), trim() and sharpen() functions also require manual passing of parameters.
  • The Image Processing part of our project is not strong enough to process very low-resolution maze images.

  • The simple reinforcement learning algorithm would collapse when dealing with complex mazes.

Future Improvements

In future, we plan to :

  • generate an animated maze solution.

  • implement Multi-Objective target search, wherein the agent of Q-Learning must visit intermediate flag positions before going to the end of the maze.

  • extend our project to 3-D Mazes.

Team Members

Mentors

References

About

Open Projects 2021

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 98.5%
  • Python 1.5%