Enhancing GTFS Schedule and Realtime with original_trip_id #534
The original_trip_id is added to both GTFS Schedule and GTFS Realtime. This field allows the association of trips across different realtime and schedule standards, e.g., NeTEx and SIRI. It also allows matching between schedule and realtime.
You need to update the .proto file for the realtime field. Please use a larger number and avoid the field numbers I am proposing in #504, as I intend to produce it as soon as I can for integration with other systems such as Darwin (my static GTFS has this field already).
Thank you @miklcct, I missed that. I looked at #504 and it seems that 8 is available as the field number for original_trip_id (within TripDescriptor, underneath optional ModifiedTripSelector modified_trip = 7;). This is also the number we currently use in our implementation. Would that interfere with your work?
That's great, so I can continue to use 5 and 6 for trip_headsign and trip_short_name respectively. Are you using 5 or 6 for something else?
No, we only add I'll push that now.
According to discussions in google#534 to reflect the documentation in reference.md
Critical question: why is this important for passenger information? I read the interoperability argument, but not how that is used.
It allows consumers to match the GTFS data with external data from other sources.
Concrete examples please. The GTFS ecosystem is that GTFS can be matched with GTFS-RT. In what situation is it valuable to have other trip identifiers? I can think of some, but those should be written down.
I am using the field to match upstream data from systems of Network Rail, where their IDs are only unique on a single day.
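A minimal sketch of that kind of matching, assuming the upstream ID is unique only within one service date, so the lookup key has to combine the ID with the date. The record layout, the sample values and the names `UpstreamKey` and `build_index` are hypothetical; they are not part of GTFS or of any Network Rail feed:

```python
from typing import NamedTuple


class UpstreamKey(NamedTuple):
    """Hypothetical composite key: the upstream ID is only unique per day."""
    original_trip_id: str
    service_date: str  # YYYYMMDD, the date format used in GTFS


def build_index(trips):
    """Map (original_trip_id, service_date) to the GTFS trip_id.

    `trips` is an iterable of dicts with the illustrative keys
    trip_id, original_trip_id and service_date.
    """
    index = {}
    for trip in trips:
        key = UpstreamKey(trip["original_trip_id"], trip["service_date"])
        index[key] = trip["trip_id"]
    return index


# Made-up sample data: the same upstream ID is reused on two days.
trips = [
    {"trip_id": "T1", "original_trip_id": "X100", "service_date": "20250101"},
    {"trip_id": "T2", "original_trip_id": "X100", "service_date": "20250102"},
]
index = build_index(trips)
```

Looking up `UpstreamKey("X100", "20250102")` in `index` then resolves to the GTFS trip for that day only.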
Are you asking why this matters to the rider?
Exactly.
We give an example in the introductory text under "Implementation": "In one case our consumers use the original_trip_id in GTFS Realtime to match with timetable data in the (proprietary) HRDF format." Our consumers use both GTFS and HRDF. However, HRDF can better reflect certain services in Switzerland, such as linked trips, thanks to its more comprehensive and complex data structure. This approach lets us keep the efficient structure of GTFS (which is one of the reasons for its popularity with our consumers) while providing the full range of our available customer information by combining it with our other formats.
I would also like to point out this statement by @miklcct, which would be another motivation for this field. This is also true for our SJYID.
Could you say a little more about why your consumers want to use HRDF together with GTFS rather than going fully GTFS/GTFS-RT, HRDF or NeTEx/SIRI? If you're already dealing with the complexities of linked trips in HRDF, would the extra complexity of (say) SIRI-ET make a difference? To summarise: I'm a bit sceptical of changing the GTFS specification to accommodate non-GTFS workflows.
I share that skepticism. But there is a fundamental thing both GTFS and NeTEx are overlooking. Every time a property changes in GTFS or NeTEx, a new identifier must be introduced. You could argue "this makes a lot of sense", but for some organisations (and even implementers) it does not. They instead manage the validity of properties at different levels. That is why virtually everything is in conflict once HRDF is mentioned. This is the absolute root cause.
The very reason why this field is needed is that a consumer with local knowledge can use it to reference other passenger-facing systems outside the GTFS world using the
It is not just HRDF. As Stefan mentions, there is a core problem in NeTEx and GTFS: uniqueness of the trip in the file. Traditional public transport has uniqueness of the ServiceJourney per operating day. Even when the trip is slightly different, e.g. on Wednesday, it is still the trip that starts at 08:01 to Zürich. This can be expressed by the global ID. We have to split the trip into different ones for GTFS, but for many other use cases (and systems) it is still useful, even crucial, to know that this is indeed the same one.
So what the PR really does is accommodate both "worlds".
Can you not do the same "split" when converting to GTFS-RT? BTW, I don't doubt that it would be useful to your consumers, but I doubt that it's GTFS's responsibility to deal with other representations of public transport.
From a GTFS standpoint it can also be interesting, for example for aggregating all the "truly unique trips". It is a very specific use case, so I hope some more examples can be provided.
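As an illustration of that aggregation, here is a rough sketch that groups the rows of a trips.txt by original_trip_id. The sample rows are made up, the function name `group_by_original_trip_id` is hypothetical, and the fallback to trip_id assumes the column is optional and may be empty:

```python
import csv
import io
from collections import defaultdict

# Made-up trips.txt excerpt; original_trip_id is the column proposed in this PR.
TRIPS_TXT = """route_id,service_id,trip_id,original_trip_id
R1,MON_TUE,t-0801-a,journey-0801
R1,WED,t-0801-b,journey-0801
R1,MON_FRI,t-0901,journey-0901
"""


def group_by_original_trip_id(trips_file):
    """Return {original_trip_id: [trip_id, ...]} for a trips.txt file object."""
    groups = defaultdict(list)
    for row in csv.DictReader(trips_file):
        # Assumption: fall back to trip_id when the column is empty or absent.
        key = row.get("original_trip_id") or row["trip_id"]
        groups[key].append(row["trip_id"])
    return dict(groups)


groups = group_by_original_trip_id(io.StringIO(TRIPS_TXT))
```

With this sample, the two variants of the 08:01 trip end up under one key, which is the "truly unique trip" view.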
Non sequitur, @leonardehrenfried: by that argument we could just as well ask why we should produce GTFS at all. HRDF and VDV 454 contain all the necessary information. It can't be the responsibility of Switzerland to facilitate work for others? We believe this PR is a simple way to simplify interactions between the different formats. In an ideal world one can use one realtime stream and a timetable stream of one's choice.
I'll grant you that: it's not a complicated proposal.
Also, there is currently no facility in GTFS (unlike NeTEx) to specify that the modified Wednesday 08:01 is the same trip as the regular Wednesday 08:01. If the timetable has been modified from the base timetable, a client with a saved pre-planned journey currently cannot know that the timetable has changed; it will just fail to find the trip in the updated timetable. I have a use case where a traveller can plan journeys up to months beforehand and save them on the user's device. If the timetable is changed, requiring a new ID in the GTFS, it is impossible for the client to find the new ID (however, I think this is worth another PR to associate a trip to a calendar exception, as it is not the purpose of Roughly speaking, it can be described in the following way: In such a case, a trip running on a modified timetable on Easter can associate with the original Mon-Fri timetable by specifying another new field (NOT
I honestly think NeTEx did not standardise "global trip id" either. And yes, there is PrivateCode, but that is not the "concept" that we mean here?
Indeed, but this is not a one-way street. Would you argue that the other standards (such as NeTEx/SIRI) should not facilitate interoperability with standards such as GTFS as well? Also, and maybe we need to amend the description to clarify this:
Such systems can be as broad as (in Switzerland) the national identification of journeys across systems, standards, and operators.
Please don't get me wrong. I have doubts but I'm not the Guardian of GTFS Purity. :) I would probably not vote at all.
OpenGeo has been producing this information as Not asking here to change the name in this proposal. But it has been added since 2014.
In NeTEx some still think that the id attribute will suffice. They are wrong. In the profiles we will have to make clear that the content of the original_trip_id must be somewhere (either in a KeyList with a given key or in the privateCodes with a given key).
This pull request is related to issue #462
Context: In Switzerland we've introduced the Swiss Journey ID (documentation only in DE/FR/IT: https://www.oev-info.ch/de/datenmanagement/sid4pt-swiss-id-public-transport/swiss-journey-identification-sjyid).
This ID is valid for one operating day and across different days of a scheduled year. It therefore maps to one or more trip_ids.
Proposal: Based on the suggestion by @miklcct (in the referenced issue) we propose to use the original_trip_id (as defined in https://developers.google.com/transit/gtfs/reference?hl=en) in GTFS Schedule and GTFS Realtime to represent constructs such as our SJYID. With this it is possible to combine trips from GTFS Schedule and GTFS Realtime with other standards such as SIRI or NeTEx, which have a similar concept.
Implementation: Since 12.12.2024 we have offered the original_trip_id (filled with our SJYID) as part of GTFS Schedule (doc: https://opentransportdata.swiss/en/cookbook/gtfs/#tripstxt) and GTFS Realtime (doc: https://opentransportdata.swiss/en/cookbook/gtfs-rt/#Trip_updates). In one case our consumers use the original_trip_id in GTFS Realtime to match with timetable data in the (proprietary) HRDF format.
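A minimal sketch of how a consumer might do such a match, assuming the realtime feed has already been decoded into plain dicts and the external timetable is available as a dict keyed by the journey identifier. The function name `match_realtime_to_external`, the dict layout and the sample values are hypothetical; this is not our implementation, and no HRDF parsing is shown:

```python
def match_realtime_to_external(trip_updates, external_journeys):
    """Pair each realtime update with an external timetable record.

    trip_updates: iterable of dicts whose "trip" dict may carry
        "original_trip_id" (a plain-dict stand-in for a TripDescriptor).
    external_journeys: dict keyed by the external journey identifier.
    Returns (matched, unmatched).
    """
    matched, unmatched = [], []
    for update in trip_updates:
        original_id = update["trip"].get("original_trip_id")
        journey = external_journeys.get(original_id) if original_id else None
        if journey is None:
            # No identifier, or the identifier is unknown to the other source.
            unmatched.append(update)
        else:
            matched.append((update, journey))
    return matched, unmatched


# Made-up sample data for illustration only.
updates = [
    {"trip": {"trip_id": "t-0801-b", "original_trip_id": "journey-0801"}, "delay": 120},
    {"trip": {"trip_id": "t-0901"}, "delay": 0},
]
external = {"journey-0801": {"linked_to": "journey-0815"}}
matched, unmatched = match_realtime_to_external(updates, external)
```

Updates without an original_trip_id are kept in `unmatched` rather than dropped, so a consumer can still fall back to plain trip_id matching against GTFS Schedule.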
Generalizability: Based on early discussions with other public transport providers, we think this enhancement can benefit many other producers and consumers and increase the inter-operability of GTFS with other standards.