prep.py Error #30

Open
nyirock opened this issue Jan 28, 2018 · 8 comments
Comments

@nyirock

nyirock commented Jan 28, 2018

It seems that Google has updated its API, so running prep.py now raises a remote error:
raise RemoteDataError('Unable to read URL: {0}'.format(url))
pandas_datareader._utils.RemoteDataError: Unable to read URL: http://www.google.com/finance/historical?q=usb&startdate=Jan+27%2C+2017&enddate=Jan+27%2C+2018&output=csv
Is there a way that offline versions of the JSON files could be made available?

@isunli

isunli commented Apr 5, 2018

Same here. Can anyone fix this problem?

@mdtdev

mdtdev commented Dec 4, 2018

Clearly the data source is no longer supported. Does anyone know an alternate source to use for the data? Or equivalent data to download?

@mrocklin
Collaborator

mrocklin commented Dec 4, 2018

I don't personally know of a good place to download this data, but I wouldn't be surprised if one exists.

The dask repository now includes a dask.datasets.timeseries function that generates entirely fake data that might fit in, though would be less interesting. If someone wants to do this I suspect it would be welcome.

@cdeil

cdeil commented Jan 23, 2019

I also wanted to try this tutorial, but couldn't get the data:

(parallel) hfm-1804a:parallel-tutorial deil$ python prep.py
Traceback (most recent call last):
  File "prep.py", line 21, in <module>
    dask.set_options(get=dask.multiprocessing.get)
  File "/Users/deil/software/anaconda3/envs/parallel/lib/python3.6/site-packages/dask/context.py", line 18, in set_options
    raise TypeError("The dask.set_options function has been deprecated.\n"
TypeError: The dask.set_options function has been deprecated.
Please use dask.config.set instead

  Before: with dask.set_options(foo='bar'):
              ...
  After:  with dask.config.set(foo='bar'):
              ...
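The deprecation message above points at dask.config.set. Below is a minimal sketch of how a call like the one on line 21 of prep.py might be updated, assuming the intent was to select the multiprocessing scheduler. The helper names slow_square and sum_of_squares are hypothetical and exist only for illustration; they are not part of prep.py.

```python
import dask
from dask import delayed


def slow_square(x):
    """Hypothetical stand-in for real work done per task."""
    return x * x


def sum_of_squares(values, scheduler="processes"):
    """Sum the squares of values, running the tasks on the chosen scheduler."""
    tasks = [delayed(slow_square)(v) for v in values]
    total = delayed(sum)(tasks)
    # Old style: dask.set_options(get=dask.multiprocessing.get)
    # New style: pick the scheduler by name through dask.config.set
    with dask.config.set(scheduler=scheduler):
        return total.compute()
```

When using the "processes" scheduler from a script, calling sum_of_squares under an `if __name__ == "__main__":` guard is the safer choice, since worker processes may re-import the main module.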

> I don't personally know of a good place to download this data, but I wouldn't be surprised if one exists.

How big is the data that prep.py downloaded? If it's less than 1 GB, maybe you could just put a copy in this GitHub repo?

It would be great to have this tutorial working.

@mrocklin
Collaborator

I agree that putting the data into the repository is possible. Unfortunately I no longer know how to obtain the data. My recommendation that someone rework the examples to use the dask.datasets.timeseries function is, I think, still the best approach I can think of personally. Alternate solutions would be welcome if people want to implement them.

@cdeil

cdeil commented Jan 23, 2019

> I agree that putting the data into the repository is possible. Unfortunately I no longer know how to obtain the data.

@minrk - maybe you still have a copy of the files around?

> My recommendation that someone rework the examples to use the dask.datasets.timeseries function is, I think, still the best approach I can think of personally.

I could try tomorrow. But to me, bundling example data in the tutorial repo seems like the better solution if it's small, to increase chances of it working in the future.

@mrocklin
Collaborator

dask.datasets.timeseries produces random data using the numpy.random module. It's definitely as robust as packaging data, and has the benefit of working over conference wifi.

I think it's ok to have a few megabytes of data here, but we need to expect this tutorial to be run over very poor internet connections. Anything over a few tens of megabytes is unpleasant.

@jjbankert

To even get to the Google error, I set dask=0.20.2 and pandas=0.22 in the environment.yml file. Dask ran into the same issue that @cdeil reported, and pandas raised the following exception:

(parallel) [parallel-tutorial]$ python prep.py
Traceback (most recent call last):
  File "prep.py", line 44, in <module>
    write_stock(symbol)
  File "prep.py", line 37, in write_stock
    data_source='google')
  File "/opt/anaconda3/envs/parallel/lib/python3.6/site-packages/dask/dataframe/io/demo.py", line 202, in daily_stock
    from pandas_datareader import data
  File "/opt/anaconda3/envs/parallel/lib/python3.6/site-packages/pandas_datareader/__init__.py", line 2, in <module>
    from .data import (DataReader, Options, get_components_yahoo,
  File "/opt/anaconda3/envs/parallel/lib/python3.6/site-packages/pandas_datareader/data.py", line 14, in <module>
    from pandas_datareader.fred import FredReader
  File "/opt/anaconda3/envs/parallel/lib/python3.6/site-packages/pandas_datareader/fred.py", line 1, in <module>
    from pandas.core.common import is_list_like
ImportError: cannot import name 'is_list_like'
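
The traceback shows pandas_datareader's fred.py importing is_list_like from pandas.core.common, where the installed pandas does not provide it. The same function is available publicly as pandas.api.types.is_list_like. One workaround, sketched here as an assumption rather than a fix confirmed in this thread, is to alias the public function into pandas.core.common before pandas_datareader is imported. A newer pandas_datareader release may make this unnecessary.

```python
import pandas.core.common as pd_common
from pandas.api.types import is_list_like

# Shim: expose is_list_like at the old location if this pandas lacks it there.
if not hasattr(pd_common, "is_list_like"):
    pd_common.is_list_like = is_list_like

# Import pandas_datareader only after the shim is in place, for example:
# from pandas_datareader import data
```

This only gets past the import error. It would not restore the Google Finance endpoint, which is the underlying problem reported above.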
