Decoding Urban Industrial Complexity:
Enhancing Knowledge-Driven Insights via IndustryScopeGPT

✏️ Hightlights

🎯 Top Open-Source Urban Knowledge Graph Dataset

IndustryScopeKG is the first and largest open-source, multimodal large-scale knowledge graph dataset. It integrates geospatial data such as street view images, points of interest, and crowd activities, along with socio-economic data including company information, real estate prices, and population statistics. The dataset contains 2,232,037 entities and 51,684,939 triples.

🎯 First Application of LLM in Industrial Park Planning and Operations

The IndustryScopeGPT framework was developed to enhance the planning, action, and reasoning capabilities of large language models (LLMs) in industrial park planning and operations. This framework represents the first integration of large language models with spatial computing and dynamic reasoning on graph databases that incorporate external geospatial data, marking the first exploration of such applications in industrial park planning and operations.

📰 Updates

2024/10/28 We’re thrilled to share that our work, Decoding Urban Industrial Complexity: Enhancing Knowledge-Driven Insights via IndustryScopeGPT, has been published by the ACM MM 2024 conference. We are profoundly grateful for the contributions of all the contributors to this work！
2024/07/23 We are honored to share and present our work at the MIT Media Lab-City Science Lab Community Meeting. For more details, please visit the City Science Lab website: Design and Data-Driven Hybrid Community Building.

🔗 IndustryscopeKG Data

The raw data is available for download from Kaggle or Baidu Netdisk.

industryscopekg.dump

This file contains the standalone Knowledge Graph data, designed for users who want to work directly with the raw data or already have a Neo4j setup and wish to directly import the dataset.

industryscopekgdocker.zip

If you downloaded the file from Kaggle as industryscopekgdocker.sip, please rename the file extension from .sip to .zip and unzip the contents. If you obtained the file from Baidu Netdisk, it will already be named industryscopekgdocker.zip, and no further renaming is required. Simply unzip the contents to proceed.

After unzipping the .zip file, you will get the industryscopekgneo4j.tar file.

For seamless integration, we recommend using Neo4j to analyze and interact with the knowledge graph.

🛠️ Getting Started with Neo4j

We provide a Neo4j Docker Image to simplify the process of setting up and working with the IndustryScopeKG dataset.

Step 1: Install Docker

Ensure you have Docker installed on your system. For installation instructions, visit Docker's official website.

Step 2: Load the Docker Image

Use the following command to load the industryscopekgneo4j.tar into your local Docker environment:

docker load -i industryscopekgneo4j.tar

After the image is loaded, Docker will display the name and tag of the image, typically neo4jv_backup:latest.

Step 3: Start Neo4j with IndustryScopeKG

Run the following command to start a Neo4j container with the dataset preloaded:

docker run \
  -itd \
  --name industryscope \
  -p 7474:7474 -p 7687:7687 \
  neo4jv_backup:latest

Default credentials:

Username: neo4j
Password: bitnami1

Note: We use bitnami/neo4j:5.13.0 as our base image, and we compiled spatial plugin manually with maven which already exists in our docker image provided.

Step 4: Access Neo4j

Open your browser and navigate to: http://localhost:7474.
Login with the credentials (neoo4j/bitnami1).

🌰 Example Queries

Here are some useful Cypher queries you can play with:

Question 1: 静安区的所有产业园区都有哪些？

MATCH (p:park) 
WHERE p.所属区县 = "静安区"
RETURN p.名称 AS 园区名称

Question 2: 上海张江高科技园内的网格根据零售服务POI数量的排序是什么？

MATCH (p:park {{名称: "上海张江高科技园区"}})-[:IS_IN]-(f:fishnet)
RETURN p.fishnet AS FishnetID, f.零售服务POI数量 AS RetailServicesCount
ORDER BY RetailServicesCount DESC

Question 3: 朕天总部经济工业园里有没有便利店？

MATCH (poi:poi_retail)-[:IS_IN]->(:park {{名称: '朕天总部经济工业园'}})
WHERE poi.类别 CONTAINS '便利店' OR poi.类别 CONTAINS '零售' OR poi.类别 CONTAINS '便民商店'
RETURN COUNT(poi) AS 便利店数量

Question 4: 张江高新区杨浦园里有几家华润？

MATCH (poi)-[:IS_IN]->(:park {{名称: '张江高新区杨浦园'}})
WHERE poi.名称 CONTAINS '华润'
RETURN COUNT(poi) AS 华润数量

Question 5: 张江高新区杨浦园里的华润名称是什么？

MATCH (poi)-[:IS_IN]->(:park{{名称: '张江高新区杨浦园'}})
WHERE poi.名称 CONTAINS '华润'
RETURN poi.名称 AS 华润名称

📈 Key Statistics

Table 1: The Statistics of Entities in IndustryScopeKG

Basic Statistics	Industrial Park	Grid	Grid Dominant Function	POI	Enterprise	Total
Count	264	128,866	15	112,931	1,058,656	1,300,732
(Leading) Industries	Primary	Secondary	Tertiary	Scope of Operations	Planned
Industrial Park	202	258	261	261	70	1,052
Grid	1,142	6,270	10,281	20,246	/	37,939
Enterprise	18	90	392	891,814	/	892,314

Table 2: The Statistics of Triples in IndustryScopeKG

Relation	Head & Tail Entity	Triple Records
Locate in	(POI, Grid), (Enterprise, Grid), (POI, Industrial Park), (Enterprise, Industrial Park), (Grid, Industrial Park)	2,516,160
Adjacent to	(Grid, Grid), (Industrial Park, Industrial Park)	488,401
Similar to	(Industrial Park, Industrial Park)	3,765
Related to	(Industrial Park, Industrial Park)	10,687
Has	E.g., (Industrial Park, Planned Industries), (Grid, Leading Scope of Operations), (Grid, Dominant Functions)	4,252,341
Attribution	(Industrial Park, Value) with 111 attributions, (Grid, Value) with 82 attributions, (POI, Value) including 15 attributions, (Enterprise, Value) with 36 attributions	44,413,585

🌟 Usage Notes

The dataset can be used standalone or integrated into your projects. It works particularly well with graph-based reasoning frameworks. For LLM-based planning and operations, the dataset complements the IndustryScopeGPT framework, providing a dynamic and insightful environment for urban and industrial research.

🤖 IndustryscopeGPT

Model Inference

IndustryscopeGPT scripts will be available soon under the folder scripts.

🤗 Citation

If our work sparks your interest or helps your research, a little cite would mean a lot!

@inproceedings{10.1145/3664647.3681705,
author = {Wang, Siqi and Liang, Chao and Gao, Yunfan and Liu, Yang and Li, Jing and Wang, Haofen},
title = {Decoding Urban Industrial Complexity: Enhancing Knowledge-Driven Insights via IndustryScopeGPT},
year = {2024},
isbn = {9798400706868},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3664647.3681705},
doi = {10.1145/3664647.3681705},
abstract = {Industrial parks are critical to urban economic growth. Yet, their development often encounters challenges stemming from imbalances between industrial requirements and urban services, underscoring the need for strategic planning and operations. This paper introduces IndustryScopeKG, a pioneering large-scale multi-modal, multi-level industrial park knowledge graph, which integrates diverse urban data including street views, corporate, socio-economic, and geospatial information, capturing the complex relationships and semantics within industrial parks. Alongside this, we present the IndustryScopeGPT framework, which leverages Large Language Models (LLMs) with Monte Carlo Tree Search to enhance tool-augmented reasoning and decision-making in Industrial Park Planning and Operation (IPPO). Our work significantly improves site recommendation and functional planning, demonstrating the potential of combining LLMs with structured datasets to advance industrial park management. This approach sets a new benchmark for intelligent IPPO research and lays a robust foundation for advancing urban industrial development. The dataset and related code are available at https://github.com/Tongji-KGLLM/IndustryScope.},
booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia},
pages = {4757–4765},
numpages = {9},
keywords = {industrial park planning and operation, large language model agent, urban design and planning, urban knowledge graph},
location = {Melbourne VIC, Australia},
series = {MM '24}
}

Name		Name	Last commit message	Last commit date
Latest commit History 117 Commits
assets		assets
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Decoding Urban Industrial Complexity:
Enhancing Knowledge-Driven Insights via IndustryScopeGPT

✏️ Hightlights

🎯 Top Open-Source Urban Knowledge Graph Dataset

🎯 First Application of LLM in Industrial Park Planning and Operations

📰 Updates

🔗 IndustryscopeKG Data

🛠️ Getting Started with Neo4j

Step 1: Install Docker

Step 2: Load the Docker Image

Step 3: Start Neo4j with IndustryScopeKG

Step 4: Access Neo4j

🌰 Example Queries

📈 Key Statistics

🌟 Usage Notes

🤖 IndustryscopeGPT

Model Inference

🤗 Citation

About

Releases

Packages

Contributors 3

Tongji-KGLLM/IndustryScope

Folders and files

Latest commit

History

Repository files navigation

Decoding Urban Industrial Complexity:Enhancing Knowledge-Driven Insights via IndustryScopeGPT

✏️ Hightlights

🎯 Top Open-Source Urban Knowledge Graph Dataset

🎯 First Application of LLM in Industrial Park Planning and Operations

📰 Updates

🔗 IndustryscopeKG Data

🛠️ Getting Started with Neo4j

Step 1: Install Docker

Step 2: Load the Docker Image

Step 3: Start Neo4j with IndustryScopeKG

Step 4: Access Neo4j

🌰 Example Queries

📈 Key Statistics

🌟 Usage Notes

🤖 IndustryscopeGPT

Model Inference

🤗 Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Decoding Urban Industrial Complexity:
Enhancing Knowledge-Driven Insights via IndustryScopeGPT

Packages