I'm currently working through IBM's Coursera notebooks, and there appear to be some errors in the .ipynb files for certain transformations. Specifically:
"claimed/component-library/transform/spark-csv-to-parquet.ipynb": the destination path and Parquet filename are stored in a variable "output_data_parquet" (third code cell). In code cell 5, data_dir + data_parquet fails to run because data_parquet is not defined. I think this should be output_data_parquet, as it appears in the eighth code cell.
"claimed/component-library/transform/spark-sql.ipynb": in cell 4, where the environment variables are defined, "data_dir" is defined twice. The first occurrence appears to be correct based on its comment. The second occurrence appears to be incorrect, as its comment suggests it should be a SQL query. As a result, in cell 7, the variable "sql" is not defined. I think that the second occurrence of data_dir should really be a line along the lines of: "sql = os.environ.get('sql_query', 'select * from df')"
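For reference, here is a minimal sketch of what I think the two corrected parameter sections would look like, without the Spark calls. The default values and the environment variable names other than data_dir are my assumptions for illustration, not what the notebooks necessarily use.

```python
import os

# Parameters read from environment variables with fallbacks.
# NOTE: the default values and the 'sql_query' variable name are
# assumptions for illustration; the notebooks may use different ones.
data_dir = os.environ.get('data_dir', '../../data/')
output_data_parquet = os.environ.get('output_data_parquet', 'data.parquet')

# spark-sql.ipynb, cell 4: the second data_dir assignment replaced by sql,
# so that cell 7 finds the variable it expects.
sql = os.environ.get('sql_query', 'select * from df')

# spark-csv-to-parquet.ipynb, code cell 5: use output_data_parquet
# (defined in the third code cell) instead of the undefined data_parquet.
parquet_path = data_dir + output_data_parquet

print(parquet_path)
print(sql)
```

With these names in place, both variables exist before the later cells reference them, so the NameError on data_parquet and on sql should no longer occur.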