Mapping from porto.h5 to trj.h5 #7

Open
ZrrrKIT opened this issue Aug 30, 2019 · 3 comments

Comments


ZrrrKIT commented Aug 30, 2019

Dear @boathit,

I am currently trying to find out how the raw GPS dataset porto.h5 can be mapped to the vector representation trj.h5.

porto.h5 contains 1 704 759 raw GPS trajectories, whereas trj.t stores only 101 000 sequences of hot cells. The vector representation in trj.h5 also holds 101 000 entries, which leads me to believe they are the embeddings of the sequences in trj.t.

Is there some way to find out which trajectories from porto.h5 correspond to the ones in trj.t? (For example: the first 101 000 from the 1 704 759.)

Thank you in advance,
ZrrrKIT

Owner

boathit commented Aug 30, 2019

That is easy. You can call the trip2seq function to transform all trips in porto.h5 into sequences and save them to trj.t, just like this line.

Author

ZrrrKIT commented Aug 31, 2019

This works perfectly, thank you!

The only thing still unclear to me is how to interpret the values stored in the trj.t file itself. They range from 1 to 18866, so I assume they are the indices of the hot cells (a.k.a. the vocabulary IDs). What I am not sure about is how to infer their positions on the grid. For normal cell :: Int values we just take the cell value mod the grid width to obtain the x coordinate and div by the grid width to obtain the y coordinate.

The problem here is that we are dealing with vocab_IDs. Is there a way to infer the 2D location of the hot cells from their IDs?
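
To make the mod/div arithmetic above concrete, here is a minimal Python sketch. It assumes 0-based, row-major cell IDs; the names `cell_to_xy`, `xy_to_cell`, and `num_x` are hypothetical and are not taken from the repository.

```python
from typing import Tuple


def cell_to_xy(cell: int, num_x: int) -> Tuple[int, int]:
    """Decode a row-major cell ID into (x, y) grid offsets.

    Assumption: cell IDs start at 0 and num_x is the grid width in cells.
    """
    y, x = divmod(cell, num_x)  # y = cell div num_x, x = cell mod num_x
    return x, y


def xy_to_cell(x: int, y: int, num_x: int) -> int:
    """Inverse of cell_to_xy under the same assumptions."""
    return y * num_x + x
```

This only works for plain cell IDs. A vocabulary ID is an index into the set of hot cells, so it has to be translated back to a cell ID before this arithmetic applies.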

Owner

boathit commented Sep 2, 2019

You can use either cell2gps to get the centroid GPS location of a cell, or seq2trip to transform a sequence of cells into their GPS locations.
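
For readers who want to see the idea behind such a lookup, the following Python sketch shows one way a vocabulary ID could be resolved to a cell centroid. It is an illustration under stated assumptions, not the repository's implementation: the helper names, the `first_id` offset, and the grid parameters (`num_x`, `min_x`, `min_y`, `x_step`, `y_step`) are all hypothetical, and it assumes the vocabulary-to-cell mapping was kept when the vocabulary was built.

```python
from typing import Dict, List, Tuple


def build_vocab(hot_cells: List[int], first_id: int = 1) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Assign consecutive vocabulary IDs to hot cells (hypothetical scheme)."""
    cell_to_vocab = {cell: first_id + i for i, cell in enumerate(hot_cells)}
    vocab_to_cell = {v: c for c, v in cell_to_vocab.items()}
    return cell_to_vocab, vocab_to_cell


def cell_centroid(cell: int, num_x: int, min_x: float, min_y: float,
                  x_step: float, y_step: float) -> Tuple[float, float]:
    """Centroid of a 0-based, row-major cell in the grid's coordinate units."""
    y_off, x_off = divmod(cell, num_x)
    return (min_x + (x_off + 0.5) * x_step, min_y + (y_off + 0.5) * y_step)


def vocab_to_centroid(vocab_id: int, vocab_to_cell: Dict[int, int], num_x: int,
                      min_x: float, min_y: float,
                      x_step: float, y_step: float) -> Tuple[float, float]:
    """Resolve a vocabulary ID to its cell, then to that cell's centroid."""
    cell = vocab_to_cell[vocab_id]  # the ID alone does not encode a position
    return cell_centroid(cell, num_x, min_x, min_y, x_step, y_step)


# Usage with made-up values: three hot cells on a grid 5 cells wide, 100 units per cell.
_, vocab_to_cell = build_vocab([7, 12, 3])
centroid = vocab_to_centroid(1, vocab_to_cell, num_x=5,
                             min_x=0.0, min_y=0.0, x_step=100.0, y_step=100.0)
```

The key point is that the reverse dictionary is required: without the stored mapping from vocabulary IDs to hot cells, the 2D location cannot be recovered from the ID.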
