Berlin Traffic Speed Prediction Using Random Forest

This project applies a Random Forest Regressor to predict average vehicle speed in Berlin from hourly traffic count data. By modeling the relationship between vehicle counts (total, cars, trucks) and average speed, it offers insight into traffic behavior and evaluates how well the model performs.

Project Description

Accurate prediction of average vehicle speed helps in understanding traffic patterns and improving transportation systems. This project:

Trains a Random Forest Regressor using traffic density data.
Evaluates the model with metrics like Mean Squared Error (MSE) and R-squared (R²).
Visualizes model results and feature importance for interpretability.
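
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it runs on synthetic stand-in data rather than the Berlin dataset, and the hyperparameters (`n_estimators=100`, a 20% test split, `random_state=42`) are assumptions chosen for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

FEATURE_NAMES = ["vehicle_count_per_hour", "car_count_per_hour", "truck_count_per_hour"]

# Synthetic stand-in data (NOT the Berlin dataset): speed drops as volume rises.
rng = np.random.default_rng(0)
cars = rng.integers(50, 500, size=400)
trucks = rng.integers(0, 80, size=400)
total = cars + trucks
speed = 60.0 - 0.05 * total + rng.normal(0.0, 3.0, size=400)
X = np.column_stack([total, cars, trucks])

# Hold out 20% of the rows for evaluation (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, speed, test_size=0.2, random_state=42
)

# Fit the forest on the training rows and predict the held-out rows.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate with the two metrics named above.
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Impurity-based importances, one per input column; they sum to 1.
importances = dict(zip(FEATURE_NAMES, model.feature_importances_))

print(f"MSE: {mse:.2f}  R2: {r2:.2f}")
for name, weight in importances.items():
    print(f"{name}: {weight:.3f}")
```

Because the synthetic speeds are generated from the counts with little noise, the scores printed here say nothing about how the model performs on the real data.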

Dataset

The dataset contains hourly traffic data from Berlin:

Features:
- vehicle_count_per_hour: Total vehicles per hour.
- car_count_per_hour: Total cars per hour.
- truck_count_per_hour: Total trucks per hour.

Target:
- avg_speed_all_vehicles_kmh: Average speed of all vehicles (km/h).

The dataset is stored in a CSV file and uses a semicolon (;) as the delimiter.
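
Since the file is semicolon-delimited, pandas needs an explicit `sep=";"` when reading it. The sketch below shows one way to load the data and split it into features and target. It assumes the CSV header uses exactly the column names listed above; the three sample rows are made up for illustration, and `load_traffic` is a hypothetical helper name.

```python
import io

import pandas as pd

FEATURES = ["vehicle_count_per_hour", "car_count_per_hour", "truck_count_per_hour"]
TARGET = "avg_speed_all_vehicles_kmh"

# Made-up rows in the assumed layout, so the example runs without the real file.
SAMPLE_CSV = """vehicle_count_per_hour;car_count_per_hour;truck_count_per_hour;avg_speed_all_vehicles_kmh
120;100;20;48.5
300;270;30;35.0
80;70;10;52.0
"""


def load_traffic(source):
    """Read a semicolon-delimited traffic CSV and return (features, target)."""
    df = pd.read_csv(source, sep=";")
    return df[FEATURES], df[TARGET]


X, y = load_traffic(io.StringIO(SAMPLE_CSV))
print(X.shape)
print(y.mean())
```

To read the real file, pass its path instead of the in-memory sample, for example `load_traffic("berlin_traffic_data.csv")`.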

Installation

Clone the repository:
    git clone https://github.com/busrayatlav/Berlin-Traffic-Random-Forest.git
    cd Berlin-Traffic-Random-Forest
Set up a Python virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
Install dependencies:
    pip install -r requirements.txt

Usage

Load the dataset:
Ensure the dataset (berlin_traffic_data.csv) is in the same directory as the script.
```
/path/to/berlin_traffic_data.csv
```
Run the script: Execute the script to train the model and generate outputs.
```
python berlin_traffic_random_forest.py
```
Outputs:

Model performance metrics (MSE, R²) will be displayed in the terminal.
Visualizations will either be saved or displayed directly.

Results

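- Mean Squared Error (MSE): 188.14
- R-squared (R²): 0.27
- An MSE of 188.14 corresponds to a root mean squared error of about 13.7 km/h, and an R² of 0.27 means the model explains roughly a quarter of the variance in average speed. Predictive accuracy is therefore limited, but the fitted model still highlights which features most influence average speed.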
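
To make the two metrics concrete, here is a small pure-Python sketch of how MSE and R² are computed, plus the conversion of the reported MSE into an error in km/h. The `actual` and `predicted` lists are made-up values for illustration only; just the 188.14 figure comes from the results above.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up speeds in km/h, not taken from the dataset.
actual = [50.0, 40.0, 30.0, 60.0]
predicted = [45.0, 42.0, 35.0, 55.0]

example_mse = mse(actual, predicted)
example_r2 = r_squared(actual, predicted)
print(example_mse, example_r2)

# The square root of the reported MSE gives a typical error in km/h.
reported_rmse = math.sqrt(188.14)
print(round(reported_rmse, 1))
```

R² compares the model against always predicting the mean speed: a value of 0 means no better than that baseline, and 1 means a perfect fit.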

Visualizations

Actual vs Predicted Speeds: A scatter plot comparing actual traffic speeds to model predictions.
Feature Importance: A bar chart showing the relative importance of input features.
Residual Plot: A scatter plot of residuals to evaluate prediction errors.
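
A rough sketch of how these three plots could be produced with Matplotlib is shown below. It is not the project's plotting code: `plot_diagnostics` is a hypothetical helper, the speeds are randomly generated, and the importance values are placeholders rather than results from the trained model.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_diagnostics(y_true, y_pred, feature_names, importances):
    """Draw actual-vs-predicted, feature-importance, and residual plots side by side."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    fig, (ax_fit, ax_imp, ax_res) = plt.subplots(1, 3, figsize=(15, 4))

    # Actual vs predicted: points on the dashed diagonal are perfect predictions.
    ax_fit.scatter(y_true, y_pred, alpha=0.6)
    lo, hi = y_true.min(), y_true.max()
    ax_fit.plot([lo, hi], [lo, hi], linestyle="--", color="gray")
    ax_fit.set_xlabel("Actual speed (km/h)")
    ax_fit.set_ylabel("Predicted speed (km/h)")
    ax_fit.set_title("Actual vs Predicted Speeds")

    # Feature importance: one bar per input feature.
    ax_imp.bar(feature_names, importances)
    ax_imp.tick_params(axis="x", rotation=20)
    ax_imp.set_ylabel("Importance")
    ax_imp.set_title("Feature Importance")

    # Residuals: errors should scatter evenly around the zero line.
    ax_res.scatter(y_pred, residuals, alpha=0.6)
    ax_res.axhline(0.0, linestyle="--", color="gray")
    ax_res.set_xlabel("Predicted speed (km/h)")
    ax_res.set_ylabel("Residual (km/h)")
    ax_res.set_title("Residual Plot")

    fig.tight_layout()
    return fig, residuals


# Randomly generated stand-in values, not model output.
rng = np.random.default_rng(1)
y_true = rng.uniform(20.0, 60.0, size=50)
y_pred = y_true + rng.normal(0.0, 5.0, size=50)
names = ["vehicle_count_per_hour", "car_count_per_hour", "truck_count_per_hour"]
placeholder_importances = [0.5, 0.3, 0.2]  # placeholders for illustration only

fig, residuals = plot_diagnostics(y_true, y_pred, names, placeholder_importances)
```

Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk. For interactive viewing, drop the `matplotlib.use("Agg")` line and call `plt.show()` instead.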

Technologies Used

Python: Programming language.
Pandas: Data manipulation.
scikit-learn: Machine learning library.
Matplotlib: Data visualization.

License

This project is licensed under the MIT License.

