Hi, I encountered the error below. I have azureml-opendatasets==1.55.0 installed.
Calling to_spark_dataframe()
Traceback (most recent call last):
File "/mnt/c/users/cruiseli/OneDrive - Microsoft/Desktop/workspace/SynapseML-Utils/test_aml.py", line 30, in <module>
nyc_tlc_df2 = nyc_tlc.to_spark_dataframe()
File "/home/cruise/mambaforge/lib/python3.10/site-packages/azureml/opendatasets/accessories/_loggerfactory.py", line 139, in wrapper
return func(*args, **kwargs)
File "/home/cruise/mambaforge/lib/python3.10/site-packages/azureml/opendatasets/accessories/open_dataset_base.py", line 164, in to_spark_dataframe
return self._to_spark_dataframe()
File "/home/cruise/mambaforge/lib/python3.10/site-packages/azureml/opendatasets/accessories/open_dataset_base.py", line 305, in _to_spark_dataframe
return self._blob_accessor.get_spark_dataframe(
File "/home/cruise/mambaforge/lib/python3.10/site-packages/azureml/opendatasets/dataaccess/_blob_accessor.py", line 303, in get_spark_dataframe
paths = [wasab_format % (self._blob_container_name, self._blob_account_name,
File "/home/cruise/mambaforge/lib/python3.10/site-packages/azureml/opendatasets/dataaccess/_blob_accessor.py", line 304, in <listcomp>
self._get_relative_path(path)) for path in target_paths]
File "/home/cruise/mambaforge/lib/python3.10/site-packages/azureml/opendatasets/dataaccess/_blob_accessor.py", line 470, in _get_relative_path
if "blob.core.windows.net" in url:
TypeError: argument of type 'azureml.dataprep.rslex.StreamInfo' is not iterable
Code to reproduce:
import azureml.core
from azureml.core import Datastore, Dataset
from azureml.core.workspace import Workspace
from azureml.core.experiment import Experiment
from azureml.core.authentication import InteractiveLoginAuthentication
import logging
import pandas as pd
import time
import sys
print("Testing opendatasets -- start")
from azureml.opendatasets import NycTlcYellow
from datetime import datetime
from dateutil import parser
end_date = parser.parse('2018-05-30')
start_date = parser.parse('2018-05-28')
nyc_tlc = NycTlcYellow(start_date=start_date, end_date=end_date)
print("Calling to_pandas_dataframe()")
ts = time.time()
nyc_tlc_df = nyc_tlc.to_pandas_dataframe()
te = time.time()
print("Time taken to perform to_pandas_dataframe():" + str(te-ts))
print("Calling to_spark_dataframe()")
ts2 = time.time()
nyc_tlc_df2 = nyc_tlc.to_spark_dataframe()
te2 = time.time()
nyc_tlc_df2.show(2, truncate=False)
print("Time taken to perform to_spark_dataframe():" + str(te2-ts2))
print("Testing opendatasets -- end")
lhrotk changed the title from "'azureml.dataprep.rslex.StreamInfo' is not iterable" to "TypeError: argument of type 'azureml.dataprep.rslex.StreamInfo' is not iterable" on Feb 19, 2024
Thank you for reporting this issue. I have investigated it and found the underlying bug. It is now fixed, and the fix will be released in the next update to the open-datasets package.
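For anyone reading the traceback: the failing line applies the `in` operator to a value that is not a string. Python raises this kind of `TypeError` whenever the right-hand operand of `in` supports neither membership tests nor iteration. The sketch below reproduces that mechanism without Azure installed. `FakeStreamInfo` and its `location` attribute are hypothetical stand-ins, not the real `azureml.dataprep.rslex.StreamInfo` API, and the string conversion at the end only illustrates why a string operand works; it is not necessarily the fix that was shipped.

```python
class FakeStreamInfo:
    """Hypothetical stand-in for StreamInfo; NOT the real azureml class."""

    def __init__(self, location):
        # 'location' is an illustrative attribute name, assumed for this sketch.
        self.location = location

    def __str__(self):
        return self.location


def contains_blob_host(url):
    # Same membership check as the failing line in the traceback.
    return "blob.core.windows.net" in url


path = FakeStreamInfo("https://example.blob.core.windows.net/container/file.parquet")

try:
    contains_blob_host(path)
    error_message = None
except TypeError as exc:
    # The object defines neither __contains__ nor __iter__, so 'in' fails.
    error_message = str(exc)

print(error_message)

# With a plain string on the right-hand side, the same check is valid.
result = contains_blob_host(str(path))
print(result)
```

The first call fails because the object cannot be searched; the second succeeds because a substring test on a `str` is well defined.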