A personal Python 3 project that generates a directed graph from data retrieved from .docx files. The .docx files are course outlines, which have interdependencies with other courses.
Basically, there is a folder called courses containing 7 specializations. Each specialization has its own folder, and each folder contains several course outlines in .docx format. The code looks into these folders for course files and searches them for specific keywords, such as Course Code, Course Title, and Prerequisites. The data found for these keywords is used to generate a file in Graphviz DOT format. This file can then be used to generate the graph shown above.
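The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the code in genDotFileVisualizer.py: the helper names (`read_docx_lines`, `extract_fields`, `to_dot`) are hypothetical, and it assumes each keyword sits in its own paragraph as `Keyword: value` and that every prerequisite entry starts with its course code.

```python
import re

# Assumption: each field is a paragraph of the form "Keyword: value".
FIELD_RE = re.compile(
    r"^\s*(Course Code|Course Title|Prerequisites?)\s*:\s*(.*)$", re.IGNORECASE
)


def read_docx_lines(path):
    """Return the paragraph texts of a .docx file (requires python-docx)."""
    from docx import Document  # imported lazily; only needed for real files
    return [p.text for p in Document(path).paragraphs]


def extract_fields(lines):
    """Collect the keyword fields found in an iterable of text lines."""
    fields = {}
    for line in lines:
        match = FIELD_RE.match(line)
        if not match:
            continue
        key = match.group(1).lower()
        value = match.group(2).strip()
        if key.startswith("prerequisite"):
            # Assumption: entries are comma separated and begin with the code.
            fields["prerequisites"] = [
                entry.split()[0] for entry in value.split(",") if entry.strip()
            ]
        else:
            fields[key] = value
    return fields


def to_dot(courses):
    """Build DOT text with one node per course and one edge per prerequisite."""
    out = ["digraph courses {"]
    for course in courses:
        code = course["course code"]
        title = course.get("course title", "")
        out.append(f'    "{code}" [label="{code}\\n{title}"];')
        for pre in course.get("prerequisites", []):
            out.append(f'    "{pre}" -> "{code}";')
    out.append("}")
    return "\n".join(out)
```

For a real file, something like `to_dot([extract_fields(read_docx_lines(path))])` would produce the DOT text. Note that outlines which keep these fields inside tables would need the table cells read as well, which this sketch does not do.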
-
You will definitely need Python 3.
-
Install the third-party python-docx module, not the docx module. I use pip:
sudo pip install python-docx
-
You may then copy your course outlines into the folders. This of course assumes that your outlines use keywords similar or identical to those in our course outlines. You can fork and modify the code to fit your requirements if you wish. Please include an acknowledgement.
-
Run the code:
python genDotFileVisualizer.py > outputfile.txt
- Visualizing
Copy and paste the DOT text from outputfile.txt into an online Graphviz visualizer such as Graphviz Online or Webgraphviz.
- The regex used to extract data needs to be more robust. The documents are written by people, who sometimes forget commas ",", and prerequisites sometimes do not include the course title, which causes errors in the generated DOT file.
- The specialization should be derived automatically from the folder structure; right now it's hardcoded.
- Other useful information, such as credit points and required courses, is not handled yet. I intend to support it somehow in the future.
- A built-in visualizer is needed instead of relying on the online ones. This will take time, so right now I just generate the DOT file.
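
As a rough idea of how the first two items might be approached, here is a small sketch. It is not part of the current code: the function names are hypothetical, and the course code shape (2 to 4 letters, an optional space, then 3 or 4 digits) is an assumption that would need adjusting to the real outlines.

```python
import re
from pathlib import Path

# Assumed code shape, e.g. "CSC101" or "MAT 2013". Adjust to the real format.
CODE_RE = re.compile(r"\b([A-Za-z]{2,4})\s?(\d{3,4})\b")


def prerequisite_codes(raw):
    """Pick out course codes wherever they occur in the prerequisites text.

    Scanning for code-shaped tokens instead of splitting on commas means a
    missing comma or a missing course title no longer breaks the result.
    """
    codes = []
    for letters, digits in CODE_RE.findall(raw):
        code = letters.upper() + digits
        if code not in codes:  # keep first occurrence, drop repeats
            codes.append(code)
    return codes


def specialization_of(docx_path, root="courses"):
    """Use the first folder below `root` as the specialization name."""
    relative = Path(docx_path).relative_to(root)
    return relative.parts[0] if len(relative.parts) > 1 else None
```

With this approach, text such as `CSC101 Intro to Programming MAT 110` yields both codes even though the comma is missing. The trade-off is that any ordinary word followed by a 3 or 4 digit number would also be picked up as a code, so the pattern should be kept as strict as the real course codes allow.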