Last week, I watched the documentary "The world of Thinking" on Vimeo. I chose to watch it because it portrays the Institute for Advanced Study (IAS for short), which is a fascinating place to me. It is located in Princeton, New Jersey, USA, and has been the scientific home of many great and well-known mathematicians and theoretical physicists. It does harbor scientists of other disciplines as well, but those are of less interest to me. "The Institute", as it is often called, is special in that it is the closest real-world implementation of the Ivory Tower of Science. Its purpose is to allow a select few scientists to focus as fully as possible on expanding the boundaries of human knowledge.
If I could write a blog from scratch, what would I write about? Why would I want to write a blog in the first place? I have had a blog in one form or another for many years, but I never got into the habit of posting to it continually. The obvious answer to the question of what to write about is: the things that occupy my mind, that I find interesting, or that move me. I guess the only way to learn what I want to write about is to just write about something for a while. Many people advocate this technique, and I have tried it before, but the attempt died of perfectionism. Some of the articles I published in my last run were simply too ambitious and took too much time to research and prepare. This time I should try to write simpler and more frequent blog posts.
After I started working with reward-modulated STDP in spiking neural networks, I got curious about the research it is based on. This led me to the book "Reinforcement Learning" by Richard Sutton and Andrew Barto. The book is from 1998 and it's freely readable on the internet! In the book's Introduction they cover the example of an agent learning to beat a given (imperfect) opponent in the game of Tic Tac Toe. Two remarks have to be made. First, the opponent has to be imperfect because a perfect Tic Tac Toe player can never be beaten: the best you can get against it is a draw. Second, the agent does not learn how to play Tic Tac Toe. Knowledge of the rules is assumed. What it learns is a value map for its policy, that is, an estimate for each board position of how likely it is to win from there.
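To make the idea of a learned value map a bit more concrete, here is a minimal sketch in Python. It is my own illustration and not code from the book. The assumptions: the agent plays X and always moves first, a purely random player stands in for the imperfect opponent, every unknown position starts with a value of 0.5, wins count as 1, losses and draws count as 0, and the values are nudged towards the value of the following position with a step size `alpha`. All names (`play_episode`, `backup`, `train`) and the settings for `alpha` and `epsilon` are made up for this example.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def free_cells(board):
    return [i for i, cell in enumerate(board) if cell == ' ']

def place(board, cell, mark):
    return board[:cell] + mark + board[cell + 1:]

def value(table, board):
    """Estimated chance that X wins from this board; filled in on first visit."""
    if board not in table:
        if winner(board) == 'X':
            table[board] = 1.0   # win
        elif winner(board) == 'O' or not free_cells(board):
            table[board] = 0.0   # loss or draw (assumed equally bad)
        else:
            table[board] = 0.5   # unknown position: initial guess
    return table[board]

def backup(table, state, next_state, alpha):
    """Move the value of `state` a small step towards the value of `next_state`."""
    old = value(table, state)
    table[state] = old + alpha * (value(table, next_state) - old)

def play_episode(table, rng, alpha=0.1, epsilon=0.1):
    """Play one game as X against a random O; return 'X', 'O' or 'draw'."""
    board = ' ' * 9
    prev = None  # position right after X's previous move
    while True:
        moves = free_cells(board)
        explore = rng.random() < epsilon
        if explore:
            cell = rng.choice(moves)  # exploratory move, no learning from it
        else:
            cell = max(moves, key=lambda c: value(table, place(board, c, 'X')))
        board = place(board, cell, 'X')
        if prev is not None and not explore:
            backup(table, prev, board, alpha)
        prev = board
        if winner(board) or not free_cells(board):
            break
        board = place(board, rng.choice(free_cells(board)), 'O')
        if winner(board) == 'O':
            backup(table, prev, board, alpha)  # learn from the loss
            break
    return winner(board) or 'draw'

def train(episodes=5000, seed=0):
    rng = random.Random(seed)
    table = {}
    results = [play_episode(table, rng) for _ in range(episodes)]
    return table, results

if __name__ == "__main__":
    table, results = train()
    last = results[-1000:]
    print("wins in the last 1000 games:", last.count('X'))
    print("positions in the value map:", len(table))
```

Note that the rules of the game (`winner`, `free_cells`, `place`) are hard-coded. The only thing that changes during training is the dictionary `table`, which is exactly the value map mentioned above. The policy is simply "pick the move that leads to the position with the highest value, and try something random every now and then".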