Welcome to the first part of what we hope will be a long series of articles exploring QuantLib, Open Source Risk Engine (ORE), libbitcoin and other interesting finance-related FOSS (Free and Open Source Software) projects. However, I'm afraid this will be a bit of a dull first post, as we need to clarify our objectives before we can jump into the fray.
Even though I've been a software engineer in the financial sector for over fifteen years, I've always felt I lacked a deep understanding of the foundational concepts that make up the domain. As a self-confessed reductionist, I find this state of affairs extremely uncomfortable, akin to hearing a continuous loop of David Hilbert's words: "wir müssen wissen, wir werden wissen" [1]. The situation had to be remedied, somehow, and the material you are now reading is the proposed fix for my ailments. As to the methodology: given I've had some success in applying the Feynman Technique [2] to other complex domains, it seemed only natural to try to use it for this endeavour as well. Past experience has also demonstrated that writing is an adequate replacement for in vivo communication, which is just as well in this brave new world of social distancing.
So that's that for the why and the how. But, just what exactly are we researching?
These posts shall largely amble where our fancy takes us, within the porous boundaries of finance. Alas, we can hardly keep calling our target domain "trading, accounting, crypto and a bit of quantitative finance, when viewed through the lens of FOSS" - descriptive though that might sound. We are Software Engineers after all, and if there is one thing we do, it is to name things, especially when we lack the competence to do so [3]. In this vein, I decided to call this motley domain of ours "Computational Finance". Should the name have merit, I'm afraid I have little claim to it, as it was shamelessly stolen from this passage in Wolfram's writings:
Doctors, lawyers, teachers, farmers, whatever. The future of all these professions will be full of computational thinking. Whether it’s sensor-based medicine, computational contracts, education analytics or computational agriculture - success is going to rely on being able to do computational thinking well.
I’ve noticed an interesting trend. Pick any field X, from archeology to zoology. There either is now a “computational X” or there soon will be. And it’s widely viewed as the future of the field.
These seemed like inspiring words for anyone embarking on a long and uncertain journey, so we made them our own and, in turn, they gave us a name to rally around. But what of its boundaries? One of the biggest challenges facing the reductionist is that, in the limit, everything is interconnected with everything else, for there is no natural halting function. Thus, if you are not careful, all paths will eventually lead you into the realm of quarks and particle physics, regardless of your starting point. Now, that would not be an entirely useful outcome. I have never found a completely satisfactory answer to this question in any of my previous meanderings, but in general I tend to follow an empirical approach and let taste be my guide [4]. Granted, it's probably not the scientific solution you were expecting, but it seems that there are "intuitive" boundaries in subjects, and when we hit one of those we shall stop [5]. As an example, for our purposes we need not look in detail at legal frameworks when trying to understand financial concepts, though the two disciplines are deeply intertwined.
An issue closely related to the previous one is how to strike a balance between computational exploration and domain definitions. Too much exploration and you proceed full steam ahead without knowing the meaning of things; too many boring definitions and they become just words, shedding no light on the subject under scrutiny. The sweet spot lies somewhere in the middle.
Our approach can be described as follows. We shall try to progress very slowly and methodically through the concepts in the domain, building them up as we climb the abstraction ladder but without making them too dense and technical. We'll make extensive use of Wikipedia definitions, where possible, keeping these focused on the point at hand rather than exploring the myriad possibilities around a theme.
Finally, we shall try to marry domain concepts with our chosen implementations - the computational experiments part - in order to illustrate their purpose and get a better understanding of what it is that they are trying to do. So, each post will be focused on one fairly narrow subject area, start with a bunch of definitions which are hopefully self-explanatory, and then proceed to explore the available implementations on that topic, or code that we write ourselves.
The target audience for this material is the fabled homo developus, that non-existent "standard developer" - in this particular case, one moderately competent in C++ but unfamiliar with computational finance. On the "finance" part, if you are already familiar with the domain, you will no doubt find the content very slow going. I'm afraid this is by design: the objective is to try to build the concepts on a solid foundation for those not in the know, so slowness is unavoidable [6].
With regard to the computational part: the astute reader will likely point out that there are a great many tutorials on QuantLib, ORE and many other libraries of a similar ilk, and many books have been written on quantitative finance. One could be forgiven for wondering if there is a need to pile more literature onto a seemingly already crowded subject.
In our defence, we are yet to find work that directly targets "plain" software developers and provides them with a sufficiently broad view of the domain. In addition, most of the existing material is aimed at either those with strong mathematical abilities but no domain knowledge, or its converse, leaving many gaps in the understanding. What we are instead aiming for is to target those with strong programming abilities but no particular knowledge of either computational finance or mathematics. And this leads us nicely to our next topic.
Our assumption is that you, dear reader, are not able to attain deep levels of understanding by staring at page after page of complex mathematical formulae. I, for one, certainly cannot. Unfortunately, non-trivial mathematics is difficult to avoid when covering a subject matter of this nature so, as a counterweight, we shall strive to use it sparingly and only from a software engineering application perspective. Note that this approach is clearly not suitable for the mathematically savvy amongst us, as they will find it unnecessarily laboured; then again, our focus lies elsewhere.
Our core belief is that an average reader (like me) should be able to attain a software engineer's intuition of how things work just by fooling around with software models of formulae. The reason why I am very confident in this regard is that this is how developers learn: by coding and seeing what happens. In fact, it is this very tight feedback loop between having an idea and experimenting with it that got many of us hooked on programming in the first place, so it's a very powerful tool in the motivational arsenal. And, as it turns out, these ideas are related to Wolfram's concept of Experimental Mathematics. Ultimately, our aspiration is to copy the approach taken by Klein in Coding the Matrix, though perhaps that sets the bar a tad too high. Well, at least you get the spirit of the approach.
Another rather peculiar idea we pursued is the use of cryptocurrencies throughout, to the exclusion of everything else. Whilst very popular in the media, where they are known as cryptos, in truth cryptocurrencies still have a limited presence in the "real" world of finance, and nowhere more so than in derivatives - i.e., the bulk of our analysis. So at first blush, this is a most puzzling choice. We have decided to do so for three reasons.
Firstly, just because I wanted to learn more about cryptos. Secondly, because there is a need to bridge the knowledge gap between these two distinct worlds of finance; to blend the old with the new, if you will. Personally, I think it will be interesting to see what the proliferation of derivatives will do to cryptos - but for that we need to disseminate financial knowledge. Finally, and most important of all, because in order to properly illustrate all of the concepts we shall cover, and to drive the open source libraries to their potential, one needs vast amounts of data of the right kind. Let's elaborate further on this point.
One of the biggest problems with any material in quantitative finance is in obtaining data sets which are sufficiently rich to cover all of the concepts being explained. This, in my opinion, is one of the key shortcomings of most tutorials: they either assume users can source the data themselves, or provide a small data set to prove a very specific point but which is insufficient for wider exploration [7]. This document takes a slightly different approach. We will base ourselves on a simulated world - a parallel reality if you like, thinly anchored to our reality by freely available data taken from the crypto markets. We shall then generate all of the remaining data, to the level of precision, richness and consistency required both to drive the code samples and to allow for "immersive" exploration. In fact, the very processes for data generation will be used as a pathway for domain exploration.
Of course, generated data is not perfect - i.e., realistic it is not, by definition - but our point is to understand the concepts, not to create new quant models that trade in the real world, so it is adequate for our needs. In addition, the data sets and code samples, as well as the means used to derive them, shall be part of a git repository under an open source licence, so they can be extended and improved over time.
If you are not familiar with cryptos, don't worry. For starters, we can largely take the intricate mechanistic details for granted - the blockchain and so forth - and introduce key concepts as required. We need not concern ourselves with these details because there is plenty of freely available material covering them in excruciating detail, and designed specifically for software engineers. Instead, we shall treat cryptos as if they were regular currencies, except where they are just too different - in which case we'll point out the differences. It's a bit of a strange approach, but hopefully it will produce the desired results.
If you are trying to learn how to trade, this is not the material for you. Even when we discuss trading strategies and other similar topics, our focus is always on trying to understand how the machinery works rather than on how to make money with it. Similarly, if you are a quant or are trying to become one, you are probably better off reading the traditional books such as Hull or Wilmott rather than these posts, as our treatment of mathematics will be far too basic for your requirements. However, if you are an expert in this subject area, or if you find any mistakes, please do point them out.
As with anything to do with finance, we need to set out the standard disclaimers. To make sure these are seen, we shall add them to each post.
All of the content, including source code, is either written by the author of the posts, or obtained from freely available sites on the internet, with suitable software licences. All content sources shall be clearly identified at the point of use. No proprietary information of any kind - including, but not limited to, source code, text, market data or mathematical models - shall be used within this material.
All of the views expressed here represent exclusively myself and are not those of any corporation I may be engaged in commercial activities with.
The information available in these blog posts is for your general information and use and is not intended to address your particular requirements. In particular, the information does not constitute any form of financial advice or recommendation and is not intended to be relied upon by users in making (or refraining from making) any investment decisions. [8]
All software written by the author for these posts is licensed under the GNU GPL v3. As per the licence, it is "distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details."
With all of the preliminaries out of the way, we can move on to the meat of the subject. In Part II we shall discuss our first real topic, and it could not be much more fundamental: Money.
[1] "We must know, we will know". As per Wikipedia:
The epitaph on his tombstone in Göttingen consists of the famous lines he spoke at the conclusion of his retirement address to the Society of German Scientists and Physicians on 8 September 1930. The words were given in response to the Latin maxim: "Ignoramus et ignorabimus" or "We do not know, we shall not know".
[2] The Feynman Technique is a well-established learning methodology. For more details, see Richard Feynman: The Difference Between Knowing the Name of Something and Knowing Something.
[3] There are no circumstances under which I have seen software developers lacking confidence. I feel that the motto of our profession should be the Latin translation of "Make up in confidence what you lack in competence".
[4] An idea that was most likely inspired by Linus' views on good taste. For details see Applying the Linus Torvalds “Good Taste” Coding Requirement.
[5] Of course, your intuition is not my intuition. I'm afraid you will have to take my taste as a given, even where you disagree. Feel free to make your views heard though.
[6] As they say in my home country of Angola, malembe malembe. The expression can be loosely translated into English as "slowly but surely", or "slowly does it".
[7] As an example, the latter approach is taken by a library I respect very much, the Open Source Risk Engine (ORE).