db support -- sqlite: crashing on a feed with utf-8 route names w/python 2.7. Load also taking a long time (forever?) with trimet.org's gtfs data. #18

Open
XavierPrudent opened this issue Mar 27, 2017 · 7 comments

Comments

@XavierPrudent

Dear authors,

I have run the given example:
bin/gtfsdb-load --database_url sqlite:///gtfs.db http://developer.trimet.org/schedule/gtfs.zip

and got a long list of DEBUG output, which then hangs indefinitely at this point:

16:41:00,630 DEBUG [gtfsdb.model.route] Route.load (0 seconds)
16:41:00,646 DEBUG [gtfsdb.model.route] RouteDirection.load (0 seconds)
16:41:01,324 DEBUG [gtfsdb.model.stop] Stop.load (1 seconds)
****16:41:02,624 DEBUG [gtfsdb.model.stop_feature] StopFeature.load (1 seconds)
16:41:02,759 DEBUG [gtfsdb.model.transfer] Transfer.load (0 seconds)
***************************************************************************************************************16:42:02,480 DEBUG [gtfsdb.model.shape] Shape.load (60 seconds)
/Library/Python/2.7/site-packages/sqlalchemy/sql/sqltypes.py:596: SAWarning: Dialect sqlite+pysqlite does not support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.
'storage.' % (dialect.name, dialect.driver))
16:42:04,079 DEBUG [gtfsdb.model.shape] Pattern.load (2 seconds)
****16:42:07,540 DEBUG [gtfsdb.model.trip] Trip.load (3 seconds)

Is there a way to stop it without corrupting the resulting DB?

Besides that, when I feed it the following open-source GTFS feed, it crashes with the traceback below:
https://www.donneesquebec.ca/recherche/dataset/e82b9141-09d8-4f85-af37-d84937bc2503/resource/b7f43b2a-2557-4e3b-ba12-5a5c6d4de5b1/download/gtfssherbrooke.zip

Traceback (most recent call last):
  File "bin/gtfsdb-load", line 13, in <module>
    sys.exit(gtfsdb.scripts.gtfsdb_load())
  File "/Users/lavieestuntoucan/Documents/projets_perso/Start-up/Civilia/tech/STS/GTFS-rt/gtfsdb/gtfsdb/scripts.py", line 10, in gtfsdb_load
    database_load(args.file, **kwargs)
  File "/Users/lavieestuntoucan/Documents/projets_perso/Start-up/Civilia/tech/STS/GTFS-rt/gtfsdb/gtfsdb/api.py", line 20, in database_load
    gtfs.load(db, **kwargs)
  File "/Users/lavieestuntoucan/Documents/projets_perso/Start-up/Civilia/tech/STS/GTFS-rt/gtfsdb/gtfsdb/model/gtfs.py", line 34, in load
    cls.load(db, **load_kwargs)
  File "/Users/lavieestuntoucan/Documents/projets_perso/Start-up/Civilia/tech/STS/GTFS-rt/gtfsdb/gtfsdb/model/base.py", line 141, in load
    db.engine.execute(table.insert(), records)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 2055, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 945, in execute
    return meth(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
    exc_info
  File "/Library/Python/2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. [SQL: u'INSERT INTO agency (agency_id, agency_name, agency_url, agency_timezone, agency_lang, agency_phone) VALUES (?, ?, ?, ?, ?, ?)'] [parameters: ('0', 'Soci\xc3\xa9t\xc3\xa9 de Transport de Sherbrooke', 'http://www.sts.qc.ca/', 'America/Montreal', 'FR', '819-564-2687')]

I am quite confused by the "You must not use 8-bit bytestrings" message. Can it not handle any text file?

Thanks in advance,
regards,
Xavier

@fpurcell fpurcell changed the title from "gtfsdb hanging for the example, and crashing for a standard static gtfs" to "db support: sqlite3 gtfsdb hanging for the example, and crashing on a gtfs feed with utf-8 route names" on Dec 4, 2017
@fpurcell
Member

fpurcell commented Dec 4, 2017

I just tried gtfsdb with the Sherbrooke GTFS feed, and I do see both problems with SQLite. Things work fine with Postgres. (The UTF-8 issues look similar to issues with MS SQL Server ... maybe Python 3 will magically help.)

@fpurcell fpurcell changed the title from "db support: sqlite3 gtfsdb hanging for the example, and crashing on a gtfs feed with utf-8 route names" to "db support: crashing on a feed with utf-8 route names. Load also taking a long time (forever?) with trimet.org's gtfs data." on Dec 4, 2017
@fpurcell fpurcell changed the title from "db support: crashing on a feed with utf-8 route names. Load also taking a long time (forever?) with trimet.org's gtfs data." to "db support -- sqlite: crashing on a feed with utf-8 route names. Load also taking a long time (forever?) with trimet.org's gtfs data." on Dec 4, 2017
@fpurcell fpurcell changed the title from "db support -- sqlite: crashing on a feed with utf-8 route names. Load also taking a long time (forever?) with trimet.org's gtfs data." to "db support -- sqlite: crashing on a feed with utf-8 route names w/python 2.7. Load also taking a long time (forever?) with trimet.org's gtfs data." on Dec 4, 2017
@XavierPrudent
Author

XavierPrudent commented Dec 5, 2017 via email

@fpurcell
Member

fpurcell commented Dec 5, 2017

I used the link above, Xavier.

bin/gtfsdb-load --database_url sqlite:///gtfs.db https://www.donneesquebec.ca/recherche/dataset/e82b9141-09d8-4f85-af37-d84937bc2503/resource/b7f43b2a-2557-4e3b-ba12-5a5c6d4de5b1/download/gtfssherbrooke.zip

sqlalchemy.exc.ProgrammingError: (pysqlite2.dbapi2.ProgrammingError) You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. [SQL: u'INSERT INTO agency (agency_id, agency_name, agency_url, agency_timezone, agency_lang, agency_phone) VALUES (?, ?, ?, ?, ?, ?)'] [parameters: ('0', 'Soci\xc3\xa9t\xc3\xa9 de Transport de Sherbrooke', 'http://www.sts.qc.ca/', 'America/Montreal', 'FR', '819-564-2687')]
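For anyone hitting the same error: the parameter list shows agency_name arriving as a UTF-8 encoded byte string ('Soci\xc3\xa9t\xc3\xa9 ...'), which the Python 2.7 sqlite driver refuses to bind. The following is only a minimal sketch of the usual workaround, decoding byte strings to Unicode text before the insert. The helper to_text and the two-column table are hypothetical, made up for illustration; this is not necessarily how gtfsdb itself handles it.

```python
import sqlite3

# Raw bytes exactly as they appear in the traceback's parameter list.
raw_name = b'Soci\xc3\xa9t\xc3\xa9 de Transport de Sherbrooke'


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: decode byte strings to Unicode text,
    and pass every other value through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# In-memory database with a cut-down, illustrative agency table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE agency (agency_id TEXT, agency_name TEXT)')

record = ('0', raw_name)
conn.execute(
    'INSERT INTO agency (agency_id, agency_name) VALUES (?, ?)',
    tuple(to_text(v) for v in record),  # bind text, never raw bytes
)
conn.commit()

# Read the value back to confirm it round-trips as Unicode text.
stored_name = conn.execute('SELECT agency_name FROM agency').fetchone()[0]
```

The sketch assumes the feed is UTF-8 encoded, which matches the \xc3\xa9 byte pairs in the error message; a feed in another encoding would need a different encoding argument.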

@XavierPrudent
Author

XavierPrudent commented Dec 5, 2017 via email

@fpurcell
Member

fpurcell commented Dec 5, 2017

RE: trimet.org GTFS not working with SQLite:

Use the --ignore_blocks flag (it skips creating a roll-up view of trips assigned to a block).

bin/gtfsdb-load --database_url sqlite:///gtfs.db --ignore_blocks http://developer.trimet.org/schedule/gtfs.zip
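On the earlier question of stopping a hung load without corrupting the database: SQLite writes are transactional, so changes that were not yet committed when the process stopped are rolled back the next time the file is opened. Which tables survive depends on when gtfsdb commits, which this thread does not establish. As a rough sketch, the function below (check_sqlite_db is a made-up name, not part of gtfsdb) runs SQLite's own integrity check and counts the rows in each table, so you can see what an interrupted load left behind.

```python
import sqlite3


def check_sqlite_db(path):
    """Hypothetical helper: return (integrity_messages, row_counts)
    for the SQLite database file at `path`."""
    conn = sqlite3.connect(path)
    try:
        # PRAGMA integrity_check yields a single row 'ok' for a healthy file,
        # or one row per problem found.
        integrity = [row[0] for row in conn.execute('PRAGMA integrity_check')]

        # List the tables, then count the rows in each one.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in tables:
            quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier
            counts[name] = conn.execute(
                'SELECT COUNT(*) FROM ' + quoted).fetchone()[0]
        return integrity, counts
    finally:
        conn.close()
```

For the file produced by the command above, that would be check_sqlite_db('gtfs.db'). An integrity result other than ['ok'] suggests the file should be deleted and the load rerun.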

@XavierPrudent
Author

XavierPrudent commented Dec 6, 2017 via email

@XavierPrudent
Author

XavierPrudent commented Dec 11, 2017 via email

kardaj added a commit to kardaj/gtfsdb that referenced this issue Jun 11, 2018
fpurcell added a commit that referenced this issue Dec 27, 2018
fix issue #18 -- utf8 support with sqlite
merging ... reported that this closes #14