When Zipline Gets Stuck

I stumbled across Sabir Jana's Analytics Vidhya article, "How to Import Indian equities Data to zipline on your local machine?", and after more than two years I was tempted to give Zipline a shot once again. Jana's post specifically addresses the roadblock that prevented most Indians from using the library. Zipline, after all, is the retail industry standard when it comes to backtesting in Python. Entire books are devoted to it, and all other backtesters are considered a distant second.

Jana's article is well written, and the code is not just easy to follow but carefully put together too. (There is a problem with tabs in the Medium post, and the indentation isn't clear because of the font used, so I used this version instead.)

However, as soon as I was done fetching BSE data from Quandl and was ready to feed it to Zipline, I ran into a new hurdle: Zipline absolutely refused to install! The error message overflowed the terminal's buffer and I had to route the output to a log file, only to find that a dozen-odd dependencies weren't satisfied, and that trying to install them manually was futile. They simply wouldn't install, for various reasons.

Note: Installing Zipline is slightly more involved than the average Python package. See the full Zipline Install Documentation for detailed instructions.
(Understatement of the year, from Zipline's PyPI page.)

 

Helpful souls on GitHub issue pages and Stack Overflow suggest that Anaconda should be used for installing Zipline, but I am wary of its pythonic (double irony) death grip, so I searched for, and found, this Miniconda route to installing Zipline.

One last time I sabotaged my night by thinking that I could one-up the HowTo: even though it clearly says "You must explicitly set the python version as shown below" (which was Python 3.5), I had read on Zipline's PyPI page that it supports Python 3.6, and I wanted that extra 0.1 upgrade in my installation. So I went for it. And sure enough, Zipline was successfully installed, or so I thought.

The HowTo was nice enough to include steps for ingesting Quandl data to test the Zipline installation. Sure enough, my Python 3.6 dare didn't go down well, and this time I didn't dare fight the errors. I gave up. I deleted the new environment and, like a good boy with his collar button buttoned, restarted the HowTo from the first step, this time dotting all the i's and crossing all the t's exactly as the instructions on that page say.

I write this blog post while Zipline ingests Quandl data, so that you don't have to waste three-odd hours in the AM like I had to. Just follow the damn manual(s).



PS: In case the dual_moving_average.py test script gives a "no benchmark specified" error, I found that adding from zipline.api import set_benchmark to the imports and calling set_benchmark(False) inside the initialize() definition works.

Addendum: If you get an AssertionError when running the test strategy from Jana's article, add import pandas as pd to the imports, and use start_date = pd.Timestamp(datetime(2015, 1, 1, tzinfo=pytz.UTC)) and end_date = pd.Timestamp(datetime(2020, 5, 20, tzinfo=pytz.UTC)) to convert the datetimes into pandas Timestamps.
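
For anyone who wants to check the date conversion on its own before touching the strategy file, here is a minimal sketch. It only covers the Timestamp wrapping described in the addendum and assumes pandas and pytz are available in the environment (as they were in mine once Zipline was installed); it doesn't call Zipline at all, and the print at the end is just my own sanity check, not something from Jana's article.

```python
from datetime import datetime

import pandas as pd
import pytz

# Same dates as in the addendum. The timezone-aware datetimes are wrapped
# in pd.Timestamp, which is the conversion that made the AssertionError go away.
start_date = pd.Timestamp(datetime(2015, 1, 1, tzinfo=pytz.UTC))
end_date = pd.Timestamp(datetime(2020, 5, 20, tzinfo=pytz.UTC))

# Sanity check (my addition): both values should now be timezone-aware Timestamps.
for name, value in (("start_date", start_date), ("end_date", end_date)):
    print(name, type(value).__name__, value.tzinfo)
```

These two variables then take the place of the plain datetime values that the strategy's start and end dates were built from.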
