Travel time matrices for assigning activities to zones #20

Open
Hussein-Mahfouz opened this issue Apr 26, 2024 · 6 comments
Labels
enhancement New feature or request Task 2 assigning activities to geographic locations

Comments

@Hussein-Mahfouz
Collaborator

The NTS data we are using only assigns individual activities to regions (e.g. "West Yorkshire", "North West"). We only have the home location (from the SPC) but we need to be able to determine feasible activity locations.

For example, to determine the location of an education facility, we use home location (spc), mode of travel (nts), reported travel time (nts), and travel time matrices by mode to identify which zones the education facility could be in. (current function here)
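A minimal sketch of that lookup, assuming each mode's travel time matrix is held as a nested dictionary of minutes keyed by origin and destination zone. The function name `feasible_zones`, the `tolerance` parameter, and the zone IDs are hypothetical and are not taken from the project code.

```python
from typing import Dict, List

# Hypothetical structure: matrices[mode][origin][destination] = minutes.
Matrix = Dict[str, Dict[str, float]]


def feasible_zones(
    matrices: Dict[str, Matrix],
    home_zone: str,
    mode: str,
    reported_minutes: float,
    tolerance: float = 5.0,
) -> List[str]:
    """Return zones whose travel time from home lies within +/- tolerance
    minutes of the reported travel time for the given mode."""
    row = matrices[mode].get(home_zone, {})
    low, high = reported_minutes - tolerance, reported_minutes + tolerance
    return sorted(dest for dest, minutes in row.items() if low <= minutes <= high)


# Toy data for illustration only (not real OA codes or travel times).
matrices = {
    "walk": {"A": {"A": 0.0, "B": 12.0, "C": 31.0}},
    "car": {"A": {"A": 0.0, "B": 4.0, "C": 11.0}},
}

print(feasible_zones(matrices, "A", "walk", reported_minutes=10))  # ['B']
print(feasible_zones(matrices, "A", "car", reported_minutes=10))   # ['C']
```

The same reported time maps to different candidate zones depending on the mode, which is why a separate matrix per mode is needed.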

I am using travel time matrices (at OA level) that I have calculated from another project, but it would be useful to have a pipeline to create these matrices for any study area.

Related to https://github.com/alan-turing-institute/uatk-admin/issues/9

@Hussein-Mahfouz Hussein-Mahfouz added enhancement New feature or request Task 2 assigning activities to geographic locations labels Apr 26, 2024
@dabreegster
Collaborator

Some questions...

  • What does the performance of the current travel time matrices approach look like, either for building the matrices or querying them? Does either feel like a limiting factor?
  • How detailed do modes of travel get -- just "cycling" or "cycling with e-bike so hills don't matter, and confident about stressful roads"? For PT, limits on money for tickets?
  • What's your source of network and destination data right now -- both OSM?

@dabreegster
Collaborator

If you're using travel time matrices, there are about 180k OAs, so that's around 32 billion entries per mode. Very conservatively assuming 8 bytes per entry, that's around 241GB for one mode's matrix. Seems quite extreme and wasteful, given so many OAs don't interact.
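The arithmetic behind that estimate, as a quick check. The OA count and bytes per entry come from the comment above. The result matches the 241 figure when the size is read in binary gigabytes (GiB).

```python
# Back-of-envelope size of a dense OA-to-OA matrix for one mode.
n_oas = 180_000          # approximate number of OAs, from the comment above
bytes_per_entry = 8      # e.g. one 64-bit float per origin-destination pair

entries = n_oas ** 2
total_bytes = entries * bytes_per_entry

print(f"entries: {entries:,}")                  # entries: 32,400,000,000
print(f"size: {total_bytes / 1e9:.1f} GB")      # size: 259.2 GB
print(f"size: {total_bytes / 2**30:.1f} GiB")   # size: 241.4 GiB
```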

How high does travel time usually go -- not often over 2 hours, hopefully? Do you want to find all destinations within 2 hours of a start point, or stop when you find the closest one, or make some randomized decision about whether to keep searching as you encounter each one? Or since the travel time is from a survey, maybe ignore destinations closer and really insist on some that're about X minutes away?

@Hussein-Mahfouz
Collaborator Author

  • What does the performance of the current travel time matrices approach look like, either for building the matrices or querying them? Does either feel like a limiting factor?

I am using r5r and building a matrix for a specific city, in my case Leeds (~2600 OAs). I created one matrix each for car, walk, and cycle, and 6 matrices for bus (morning_wkday, afternoon_wkday, evening_wkday, night_wkday, morning_wkend, night_wkend). I'm doing this on my laptop. It is very fast for PT (maybe 30 seconds per matrix) but very slow for car trips (could take 30 minutes for the same matrix). I think the difference in performance is because r5 was built for PT routing. The routing engine takes another 30 seconds to start running.
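One way to organise those runs is a small scenario table that a routing wrapper can loop over. This is only a sketch: the bus slice names are the ones listed above, but the dates, departure times, mode labels, and the `build_runs` helper are placeholder assumptions, not the values or code used in the project.

```python
from datetime import datetime

# Placeholder dates (assumptions): one weekday and one weekend day.
WEEKDAY = "2024-04-24"   # a Wednesday
WEEKEND = "2024-04-27"   # a Saturday

# Bus slice names from the comment above; departure times are assumed.
BUS_DEPARTURES = {
    "morning_wkday": f"{WEEKDAY} 08:00",
    "afternoon_wkday": f"{WEEKDAY} 14:00",
    "evening_wkday": f"{WEEKDAY} 19:00",
    "night_wkday": f"{WEEKDAY} 23:00",
    "morning_wkend": f"{WEEKEND} 08:00",
    "night_wkend": f"{WEEKEND} 23:00",
}


def build_runs():
    """Return one dict per matrix to compute: name, modes, departure."""
    # Car, walk, and cycle do not depend on a timetable in this sketch.
    runs = [
        {"name": "car", "modes": ["CAR"], "departure": None},
        {"name": "walk", "modes": ["WALK"], "departure": None},
        {"name": "cycle", "modes": ["BICYCLE"], "departure": None},
    ]
    for slice_name, stamp in BUS_DEPARTURES.items():
        runs.append({
            "name": f"bus_{slice_name}",
            "modes": ["WALK", "BUS"],  # walk access/egress plus bus (assumed)
            "departure": datetime.strptime(stamp, "%Y-%m-%d %H:%M"),
        })
    return runs


runs = build_runs()
for run in runs:
    print(run["name"], run["modes"], run["departure"])
```

Each entry would then be passed to whichever routing call is used, one call per matrix.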

The code for the routing wrappers is here and the code for running r5r is here

  • How detailed do modes of travel get -- just "cycling" or "cycling with e-bike so hills don't matter, and confident about stressful roads"? For PT, limits on money for tickets?

These are all the options you can pass (r5r::travel_time_matrix()). For hills, you can add an elevation file in the setup. If it's an e-bike, you can ignore the elevation and/or change the bike_speed parameter. For PT, you can add monetary limits through max_fare, but I haven't done that since you would need to add a fare_structure file to your GTFS feed. See this vignette for more details.

  • What's your source of network and destination data right now -- both OSM?

@Hussein-Mahfouz
Collaborator Author

If you're using travel time matrices, there are about 180k OAs, so that's around 32 billion entries per mode. Very conservatively assuming 8 bytes per entry, that's around 241GB for one mode's matrix. Seems quite extreme and wasteful, given so many OAs don't interact.

Yeah I'm definitely not running this on a national level. I'm currently constraining it to OAs within a specific city, and limiting the travel time to 2 hours.

How high does travel time usually go -- not often over 2 hours, hopefully?

I need to check the NTS to see the travel time distribution. One option could be a design decision to only include intracity trips and limit the travel time to 2 hours.

Do you want to find all destinations within 2 hours of a start point, or stop when you find the closest one, or make some randomized decision about whether to keep searching as you encounter each one? Or since the travel time is from a survey, maybe ignore destinations closer and really insist on some that're about X minutes away?

There are normally a bunch of different people in each OA, and each one will have a different travel distance from the NTS, so I don't think we could insist on a travel time in the routing phase. It makes sense to me to create the matrix, and for each individual, use the matrix to determine the zones they can reach given the specified travel time from the NTS. This is what I was doing in this function
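A sketch of the "build once, query per individual" idea, assuming the matrix is available as long-format `(origin, destination, minutes)` records. Rows are sorted by travel time once so that each person's band can be found with a binary search. The helper names, the 5-minute tolerance, and the toy data are all hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_index(records):
    """Group (origin, destination, minutes) records by origin and sort
    each origin's destinations by travel time."""
    grouped = defaultdict(list)
    for origin, destination, minutes in records:
        grouped[origin].append((minutes, destination))
    index = {}
    for origin, pairs in grouped.items():
        pairs.sort()
        index[origin] = ([m for m, _ in pairs], [d for _, d in pairs])
    return index


def zones_in_band(index, origin, low, high):
    """Destinations whose travel time from origin is in [low, high]."""
    times, destinations = index.get(origin, ([], []))
    return destinations[bisect_left(times, low):bisect_right(times, high)]


# Toy matrix and individuals, for illustration only.
records = [("A", "A", 0), ("A", "B", 12), ("A", "C", 18), ("A", "D", 40)]
people = [
    {"id": "p1", "home": "A", "reported": 15},
    {"id": "p2", "home": "A", "reported": 40},
    {"id": "p3", "home": "A", "reported": 60},
]

index = build_index(records)
tolerance = 5  # minutes either side of the reported time (assumed)
candidates = {
    p["id"]: zones_in_band(index, p["home"], p["reported"] - tolerance,
                           p["reported"] + tolerance)
    for p in people
}
print(candidates)  # {'p1': ['B', 'C'], 'p2': ['D'], 'p3': []}
```

People who share a home OA reuse the same sorted row, and an empty result (as for `p3`) flags a reported time that no zone in the matrix can satisfy.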

@Hussein-Mahfouz
Collaborator Author

@sgreenbury this is the workflow for creating travel time matrices:

1. Getting the data

  • OSM Road Network:
    • Option 1: download manually through Geofabrik
    • Option 2 (preferred): download using a script/CLI (e.g. using pyrosm). Sam's message here is a good starting point for that. It would also be useful for POI data, as shown in the issue where the message is from
  • GTFS feeds:
    • GTFS data for the UK can be found under timetables here. You need to create an account to download
  • Zoning layer: we are calculating travel times for an OD matrix. I normally use zone centroids as Origins and Destinations. We are currently using the OA21CD boundary layer for the UK. Ideally the code should be agnostic to the layer you provide it (OAs, MSOAs, custom zoning layer)
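A rough sketch of a layer-agnostic centroid step, assuming each zone is a dictionary holding an ID field and a single outer ring of `(x, y)` coordinates. The caller names the ID field, so the same code works for OAs, MSOAs, or a custom layer. The function names and the input structure are hypothetical, holes and multipart zones are ignored, and the planar centroid is only an approximation if the coordinates are longitude/latitude. In practice a geospatial library with a projected CRS would be the safer choice.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring if needed
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross                 # twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3 * area2), cy / (3 * area2)


def zones_to_points(zones, id_field):
    """Turn any zoning layer into id/lon/lat points for routing."""
    points = []
    for zone in zones:
        lon, lat = polygon_centroid(zone["ring"])
        points.append({"id": zone[id_field], "lon": lon, "lat": lat})
    return points


# Toy layers with different ID fields, for illustration only.
oa_layer = [{"OA21CD": "oa_1", "ring": [(0, 0), (2, 0), (2, 2), (0, 2)]}]
custom_layer = [{"zone_name": "z_1", "ring": [(0, 0), (4, 0), (4, 2), (0, 2)]}]

print(zones_to_points(oa_layer, "OA21CD"))
print(zones_to_points(custom_layer, "zone_name"))
```

The output uses `id`, `lon`, and `lat` keys because that is the point format r5r expects for origins and destinations.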

2. Preprocessing the data:

3. Routing:

  • Routing: I have done this in r5r (we may want to use r5py for better integration with the rest of the code). r5 can create a travel time matrix for a specified combination of modes (see the travel_time_matrix API). If you want to calculate for different mode scenarios, you need to run the function multiple times. I did the following:

@Hussein-Mahfouz
Collaborator Author

2 participants