jdunn's posterous

Armed to the teeth with epithets. 

GARCH(1,1) Volatility

Been messing around with volatility estimation in R using a GARCH(1,1) model. After installing the "fGarch" package (from Rmetrics), I can run the following code:
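As a rough illustration only, here is the conditional-variance recursion that a GARCH(1,1) model is built on, written in base R. This is not the fGarch script mentioned above: the returns are simulated, and the parameters omega, alpha and beta are assumed values rather than fitted ones (with fGarch installed, the fitting step would normally be handled by its garchFit function).

```r
# GARCH(1,1) conditional variance recursion (parameters assumed, not fitted):
#   sigma2[i] = omega + alpha * r[i-1]^2 + beta * sigma2[i-1]
garch11_vol <- function(r, omega, alpha, beta) {
  n <- length(r)
  sigma2 <- numeric(n)
  sigma2[1] <- var(r)                      # seed with the sample variance
  for (i in 2:n) {
    sigma2[i] <- omega + alpha * r[i - 1]^2 + beta * sigma2[i - 1]
  }
  sqrt(sigma2)                             # daily volatility
}

set.seed(1)
r <- rnorm(252, sd = 0.01)                 # simulated daily returns (stand-in data)
vol <- garch11_vol(r, omega = 1e-6, alpha = 0.10, beta = 0.85)
ann_vol <- vol * sqrt(252)                 # annualized, assuming 252 trading days
```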

Here is a plot of the daily VIX closing prices for the past 252 days:

[Figure: Vix_vol, daily VIX closing prices for the past 252 days]

And here is a plot of my GARCH(1,1) volatility:

[Figure: Garch_vol, GARCH(1,1) volatility]

Nifty, huh? Next I used the Rmetrics "fOptions" package to price some options using my generated vols. That, however, will be left as an exercise for the reader.

Quant Resources

Recently someone started a thread on quant.ly listing "resources for the budding quant". I decided to recreate the list here and also add a few book suggestions.

Forums

Blogs

Other Resource Lists

Books

  • Trading and Exchanges by Larry Harris
  • Empirical Market Microstructure by Joel Hasbrouck

Pairs Trading and R, part I

"Honestly, I don't understand the markets very well—to me it all seems like a very large tower of cards." Thus a friend confided to me last week. We were talking about mathematical (or algorithmic) models, and how people come up with them. My friend found it hard to believe that anyone could study the markets, build an algorithmic model, and confidently invest money in it. My explanation involved the example of pairs trading. (Pairs trading is sometimes called spread trading, though this term is often used specifically in the context of futures trading.)

Pairs trading, which falls under the umbrella of practices known as statistical arbitrage, is a trading strategy which theoretically allows a trader to profit in rising, falling, and sideways markets. In a pairs trade, two correlated instruments experience a temporary divergence. A trader then buys one instrument and sells the other in hopes that the instruments will converge. Once the trader is satisfied that convergence has (sufficiently) occurred, he closes his positions.

Let's look at an example. We'll take two companies, WD-40 Company (symbol WDFC) and Sherwin-Williams (symbol SHW), and assume that we've already determined that their stocks are correlated. That is, if WDFC is at $30 and moves to $32, a 6.7% increase, and SHW is at $64, then SHW will also increase about 6.7%, to around $68.27. The following chart compares the past year of price activity for the two companies:

[Figure: Shw3, WDFC vs. SHW price activity over the past year]

We can see that on September 28th, 2009, the pair experienced what looks to be a divergence, followed by gradual convergence. Let's assume that we lacked the prescience to enter a trade at this point, as we hadn't seen most of the chart yet. Instead, we noticed the divergence around June 1st, 2010, and, convinced that the pair would soon converge, we entered a trade. We bought WDFC at 32.25 and sold SHW short at 76.01, and to keep things simple, both our trades were for 100 shares. Three weeks later, on June 24th, the prices neared each other on our chart, and in a flurry of victorious profit-taking, we closed our trades. We would have sold WDFC at 32.30 for a paltry 5 cents profit per share, but covered SHW at 71.28, meaning $4.73 profit per share. The completed trade would have netted $478, minus commissions: a 4.4% return on investment!
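The arithmetic behind that trade can be checked in a few lines of R. The prices come from the paragraph above; treating the "investment" as the combined entry value of both legs is my assumption about how the 4.4% figure is reached.

```r
# Worked arithmetic for the example trade (prices from the text above).
shares <- 100

wdfc_buy  <- 32.25; wdfc_sell <- 32.30    # long leg: buy, later sell
shw_short <- 76.01; shw_cover <- 71.28    # short leg: sell, later buy back

long_pnl  <- (wdfc_sell - wdfc_buy) * shares     # about $5
short_pnl <- (shw_short - shw_cover) * shares    # about $473
total_pnl <- long_pnl + short_pnl                # about $478

# Assumption: "investment" means the gross value of both legs at entry.
capital <- (wdfc_buy + shw_short) * shares       # $10,826
roi_pct <- 100 * total_pnl / capital             # roughly 4.4

cat(sprintf("P&L: $%.2f on $%.2f committed (%.1f%%)\n",
            total_pnl, capital, roi_pct))
```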

As I previously noted, pairs trade opportunities present themselves when strongly correlated financial instruments temporarily become out of sync. Thus it is important that the supposed correlation between two candidates for a pairs trade have strong support; otherwise, a pairs trader might open a trade, then watch in dismay as the prices of his chosen pair move further and further apart, never to converge. In fact, all pairs trades exchange a normal trade's directional risk (i.e. that the price will go down instead of up after the trader has bet that it will rise) for the cygnusian risk that the correlation between the two instruments might break down after the trade has been opened.

An eager pairs trader, having been advised of this risk, might use a computer to scan a list of securities for pairs to watch. A linear model could be used to establish an idea of the normal relationship between two members of a pair, and would include the pair's beta coefficient, or how much one member of a pair tends to move per unit movement of the other member. Given the pair's beta, the trader now has a way to price one instrument as a function of the other (and also a way to balance the two legs of his trade so that if maximum convergence occurs exclusively in one side of the pair, he breaks even). If this pricing is generally accurate and the trader is reasonably confident that the instruments are correlated, then when one of the prices suddenly departs from its calculated value (that is, when the spread widens), the trader knows it is time to enter a pair of trades.
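Here is a minimal sketch of that idea in base R, on simulated prices rather than real ones. The true ratio of 0.8, the regression through the origin, and the two-standard-deviation entry rule are all assumptions made for illustration, not a recommendation.

```r
set.seed(42)
n <- 500
x <- 50 + cumsum(rnorm(n, sd = 0.5))    # simulated price of one leg (random walk)
y <- 0.8 * x + rnorm(n, sd = 1)         # other leg: 0.8 * x plus stationary noise

# Linear model through the origin; its single coefficient is the pair's beta.
fit  <- lm(y ~ x + 0)
beta <- unname(coef(fit)[1])

# The spread is what is left of y after pricing it as beta * x.
spread <- y - beta * x
z <- (spread - mean(spread)) / sd(spread)

# Hypothetical entry rule: act when the spread is more than 2 sd from its mean.
signal <- ifelse(z > 2, "short y / long x",
          ifelse(z < -2, "long y / short x", "no trade"))
print(table(signal))

# Balancing the legs: for 100 shares of y, hold about beta * 100 shares of x.
shares_y <- 100
shares_x <- round(beta * shares_y)
```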

I like to play around with pairs trading ideas in R, and have put together a few simple functions to help out. Some of my code is adapted from Paul Teetor's article, Using R to Test Pairs of Securities for Cointegration. Speaking of cointegration, it means loosely what we've been using the word "correlation" to describe. As Wikipedia puts it:

Cointegration is a statistical property of time series variables. Two or more time series are cointegrated if they each share a common type of stochastic drift: that is, to a limited degree they share a certain type of behaviour in terms of their long-term fluctuations, but they do not necessarily move together and may be otherwise unrelated.

The man with the master plan in my code is the checkPair function. Passed two symbols (and an optional xts-format date filter), it will download the daily historical data for each from Yahoo, build a linear model, determine the beta, construct the spread, and test whether the spread is mean-reverting (that is, whether the pair appears to be cointegrated). Phew! The code can't empirically know that the pair is cointegrated, but it can provide a level of confidence in an assumption of cointegration by means of the p-value, which Wikipedia, as usual, has a lot to say about. Suffice it to say that the lower the p-value, the smaller the chance that we would have observed a spread this mean-reverting if the spread were not, in fact, mean-reverting.
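The steps just described can be sketched on simulated data with nothing but base R. This is not the actual checkPair from meanrev.R: there is no download step, the two series are made up, the unit-root test is the PP.test function from the stats package, and the 0.05 cutoff is an assumption.

```r
set.seed(7)
n <- 500
x <- 40 + cumsum(rnorm(n))              # simulated non-stationary price series
y <- 0.6 * x + rnorm(n)                 # second series tied to the first

# Hedge ratio from a regression through the origin, then the spread.
beta   <- unname(coef(lm(y ~ x + 0))[1])
spread <- y - beta * x

# Phillips-Perron test; the null hypothesis is that the spread has a unit root.
pp <- PP.test(spread)

cat("Assumed hedge ratio is", beta, "\n")
cat("PP p-value is", pp$p.value, "\n")
if (pp$p.value <= 0.05) {               # assumed significance threshold
  cat("The spread is likely mean-reverting.\n")
} else {
  cat("The spread is not mean-reverting.\n")
}
```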

Here's an example:

> source('meanrev.R')
Loading required package: zoo
Loading required package: Defaults
Loading required package: TTR
> checkPair('WDFC','SHW')
Date range is 2007-01-03 to 2010-09-27
Assumed hedge ratio is 0.5233585
PP p-value is 0.06177883
The spread is not mean-reverting.

Hmm...maybe it's a good idea we didn't actually make that example trade. I have an idea: let's try some energy stocks against XLU, the Utilities SPDR ETF:

> checkPair('XLU', 'CXO')
Date range is 2007-08-03 to 2010-09-27
Assumed hedge ratio is 0.8269768
PP p-value is 0.01
The spread is likely mean-reverting.
> checkPair('XLU', 'KMR')
Date range is 2007-01-03 to 2010-09-21
Assumed hedge ratio is 0.6670936
PP p-value is 0.04503542
The spread is likely mean-reverting.

Cool! Let's take a look at these two symbols plotted against each other, and XLU. Here is a one-month chart:

[Figure: Xlu1, one-month chart of CXO, KMR and XLU]

Based on this chart, an interesting trade, right now, might be to buy KMR and short XLU. But can we pick trades by simply eyeballing the chart of a cointegrated pair? Also, can we use R to easily help us find good pairs to trade, and let us know when it's a good time to get in or out of a trade? I'll cover these questions, and more, in the next part.

(R code here.)

Carl Herold, a brushmaker, who was seventy-one years old, made doubly sure of death yesterday by shooting himself in the head and breast in the rear tenement at 12 Stanton Street while frantic with anger at being thwarted in an endeavor to brain the housekeeper, Louisa Roth, by Carl Haas, a little, but plucky, shoemaker, who is her brother-in-law.

NYT, April 7, 1893, Page 9

I love the way that sentence is put together. Ran across it while Googling something unrelated.

My Typical R Script Workflow

Here's how I might write a script that I can have open in my editor and use in R at the same time:

After I start an R session (by running 'R'), I then type:

> source('chart_watch.R')

And my script is loaded into the R session. I can run the central function of the script by typing:

> chartMyStuff()

I've also got the file open in Vim, and after modifying that function (or adding another function, or whatever), to load the updated code into R, I just run:

> s()

...which you can see is a function I've defined in my file, which re-sources the file.
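For illustration, a file set up this way might look roughly like the sketch below. The contents are hypothetical: the body of chartMyStuff is a placeholder, and s() assumes R was started in the directory that holds chart_watch.R.

```r
# chart_watch.R -- hypothetical contents, for illustration only

# Reload this file after editing it.
# Assumes the R session was started in the same directory as the file.
s <- function() source("chart_watch.R")

# Placeholder for the script's central function: plots a toy price series.
chartMyStuff <- function() {
  prices <- 100 + cumsum(c(0, 1.2, -0.7, 0.4, 1.1, -0.3))
  plot(prices, type = "l", main = "chartMyStuff() placeholder")
  invisible(prices)
}
```

Because s() is itself redefined every time the file is sourced, it keeps working as the file changes.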

Those are the basics of it. Another thing I'll do is build a library of functions into a file — for instance, I have a collection of about ten mean reversion / stat-arb-related functions — and then use succinct scripts to call those functions. Example:

...where "meanrev.R" is my collection of functions. You'll notice that the file I pasted just now, however, has no function defined. That's because I use it with "Rscript" from the command line. It takes the arguments passed to it and passes them to a function defined in the file that it imports.
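A driver script of that kind could be sketched as follows. The file name check_pair.R and the usage message are assumptions; the only things taken from the text are that it sources meanrev.R and hands its command-line arguments to a function defined there (checkPair is used here as the example).

```r
#!/usr/bin/env Rscript
# check_pair.R (hypothetical name): run as `Rscript check_pair.R WDFC SHW`

# Everything after the script name on the command line.
args <- commandArgs(trailingOnly = TRUE)

if (file.exists("meanrev.R") && length(args) >= 2) {
  source("meanrev.R")                   # the collection of functions
  checkPair(args[1], args[2])           # pass the arguments straight through
} else {
  message("usage: Rscript check_pair.R SYMBOL1 SYMBOL2 (run next to meanrev.R)")
}
```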

Whois Fun with Ruby

This Ruby snippet checks whether a huge list of words ending in -us are available for registration as .us domains, using this excellent whois library.

R Is (Not) the Next Big Thing

There's been some talk in the R community about a blog post by Dr. AnnMaria De Mars, entitled, "The Next Big Thing," in which the author writes:

Contrary to what some people seem to think, R is definitely not the next big thing, either. [...]

I know that R is free and I am actually a Unix fan and think Open Source software is a great idea. However, for me personally and for most users, both individual and organizational, the much greater cost of software is the time it takes to install it, maintain it, learn it and document it. On that, R is an epic fail. It does NOT fit with the way the vast majority of people in the world use computers. The vast majority of people are NOT programmers. They are used to looking at things and clicking on things.

As I'm a newbie to R and its community myself, I wanted to succinctly touch on these points with a personal observation. I've been a Linux user for the past decade, and I see some correlation between Linux's history and what's happening with R. It seems to me that R is in the "Linux, circa 1998" stage. It's unpolished, often confusing, has a huge initial learning curve for the uninitiated, and can help you do Really Neat Stuff™ if you stick with it. I will not be surprised if in ten years R is the standard for statistical data analysis, much as Linux has supplanted commercial UNIX and gone on to explore territory that its predecessor never touched (look at Ubuntu).

R may not be the next big thing, but R is certainly a big thing that is forthcoming.

Back from Lima

We got back from Lima late last night (read: this morning at 2 AM). Highlights of the trip:

  • Finding a better place to stay the next time we're in Lima.
  • Getting the last of my residency paperwork squared away.
  • Working all day yesterday at Starbucks.
  • Improvising a way of knocking people off Starbucks' unencrypted 802.11 because latency was so bad that my SSH sessions wouldn't stay open for more than a minute. I saw three other people with computers, but nmap reported 14 hosts up. I knew I had to take action.
  • Watching Tooth Fairy (currently sporting a 4.0/10 on IMDB), in which Dwayne 'The Rock' Johnson becomes, well, what the title of the movie suggests.
