Since I have not been in town the past week, I have not been able to work on the main project or post on the blog as frequently. Instead, I decided to do a mini week-long project that fits into my limited time and still relates to exoplanets and their discovery.
The Problem: One of the ways to detect an exoplanet is by looking at the light curve of a star (a plot of the star's brightness vs. time). Theoretically, the light curve of a perfect, glowing sphere with no imperfections would be a flat line (the brightness would always be constant). If there were a planet orbiting the star, the light curve would dip each time the planet passed between the star and the Earth, because the planet blocks some of the light and causes the brightness to decrease. In reality, it is much more difficult. Stars, being huge balls of fusing hydrogen, are not the most stable objects and can have natural fluctuations in their brightness. Also, planets are so small compared to their stars that they do not cause a very big dip in the brightness. With these two problems, it becomes difficult to determine whether a fluctuation in brightness is caused by natural processes taking place in the star or by a transiting planet.
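To make the problem a bit more concrete, here is a minimal sketch of a fake light curve with a transit dip buried in noise. None of this is my real data: the function name, the box-shaped dip, and every number except the 3197 measurements are made up for illustration. The 1% depth is roughly what a Jupiter-sized planet would cause in front of a Sun-like star, since the dip is about the square of the planet-to-star radius ratio.

```python
import numpy as np

def synthetic_light_curve(n_points=3197, depth=0.01, period=800,
                          duration=40, noise=0.002, seed=0):
    """Return a fake normalized light curve with box-shaped transit dips.

    All parameters are hypothetical; only n_points matches the post.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_points)
    flux = np.ones(n_points)                  # a "perfect" star: flat line at 1.0
    in_transit = (t % period) < duration      # True while the planet is in front
    flux[in_transit] -= depth                 # the planet blocks a bit of light
    flux += rng.normal(0.0, noise, n_points)  # stand-in for stellar fluctuations
    return flux, in_transit

flux, in_transit = synthetic_light_curve()
print("mean brightness in transit:    ", flux[in_transit].mean())
print("mean brightness out of transit:", flux[~in_transit].mean())
```

Averaged over many points the dip is easy to see, but any single measurement inside the dip can look a lot like ordinary noise, which is the whole difficulty.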
The Solution: The approach I am currently trying is called a neural network, which is a type of machine learning model loosely modeled on how the brain and its neurons work. I think the best way to understand what it does is to have a visual (see below).
So, to break the visual down, there are four main components: the input layer, the hidden layer, the output layer, and the connections (weights). The input layer is simply the starting data. In my case, I have 3197 brightness measurements, so instead of 3 input neurons, I have 3197. The input data is then sent through the weights (shown as the arrows in the visual that connect each node of one layer with each node of the next layer). All a weight does is multiply the number it receives by another number and pass the result on to the next node. When the data gets to the hidden layer, each node adds up the numbers it receives (one from each node in the previous layer). It then applies a function* which spits out another number. This number is then passed on to the next set of weights, which have the same purpose as before. Finally, the data reaches the output layer, which does the same thing as the hidden layer, except this time the number that is produced is the final result. In my case, an output close to 1 means that there is an exoplanet, and an output close to 0 means that there is no planet.
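For anyone who prefers code to diagrams, the steps above can be sketched in a few lines of Python with NumPy. This is only an illustration, not my actual code: the hidden layer size of 4, the random weights, and the random input are placeholders, and the function names are made up. The only numbers taken from my setup are the 3197 inputs and the single output.

```python
import numpy as np

def sigmoid(x):
    """Squash any number into the range between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w_hidden, w_out):
    """Send one light curve through the network and return its output."""
    # Each hidden node sums (input * weight) over all inputs, then applies sigmoid.
    hidden = sigmoid(x @ w_hidden)
    # The output node does the same thing with the hidden layer's numbers.
    return sigmoid(hidden @ w_out)

rng = np.random.default_rng(0)
n_inputs, n_hidden = 3197, 4                       # n_hidden is a placeholder
w_hidden = rng.normal(0.0, 0.01, (n_inputs, n_hidden))  # random starting weights
w_out = rng.normal(0.0, 0.01, (n_hidden, 1))
x = rng.normal(0.0, 1.0, n_inputs)                 # placeholder for a light curve

y = forward(x, w_hidden, w_out)
print(y)  # one number between 0 and 1
```

The matrix multiplication `x @ w_hidden` is just a compact way of doing "multiply each input by its weight and add everything up" for every hidden node at once.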
The interesting part of the neural network comes in the training, or ‘learning’. At the very beginning, the model starts with random weights. This means that if I ran a test light curve through the model, it would essentially give a random answer. The real learning happens in the weight optimization. I give the network a set of training data that contains the light curve and the correct answer for 5087 stars. It then uses this data to go back and tune the initial weights until its answers match the correct answers as closely as possible. In doing so, it is essentially ‘learning’ how to look at a light curve and spit out whether or not there is a planet. One of the common ways to optimize these weights involves calculating an error function (the difference between the right answer and the calculated answer) and then minimizing that error (if you are curious about this, I would google backpropagation, or I can try my best to explain it). Once the weights have been tuned, you can give the network a brand new star that it has never seen before, and it should be able to tell you whether or not there is a planet with fairly good accuracy.
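Here is a rough sketch of what that tuning loop can look like on a toy problem. To be clear about the assumptions: this is not my real network or data. The toy curves have only 20 points, the dip is exaggerated, the error function is a mean squared error, and the hidden size, learning rate, and number of passes are arbitrary choices for illustration. The weight updates use plain gradient descent with backpropagation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_toy_data(n_curves=200, n_points=20, seed=0):
    """Toy stand-in for light curves: curves labeled 1 get a dip in the middle."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 0.1, (n_curves, n_points))
    y = (np.arange(n_curves) % 2).reshape(-1, 1).astype(float)
    X[y[:, 0] == 1, 8:12] -= 1.0          # exaggerated "transit" for label 1
    return X, y

def train(X, y, n_hidden=8, lr=0.5, epochs=2000, seed=0):
    """Tune the weights by gradient descent; all settings are hypothetical."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))   # random starting weights
    w2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    losses = []
    for _ in range(epochs):
        # Forward pass: compute the network's current answers.
        h = sigmoid(X @ w1)
        out = sigmoid(h @ w2)
        err = out - y                      # calculated answer minus right answer
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: how much each weight contributed to the error.
        d_out = err * out * (1 - out)
        d_h = (d_out @ w2.T) * h * (1 - h)
        # Nudge every weight a little in the direction that lowers the error.
        w2 -= lr * (h.T @ d_out) / len(X)
        w1 -= lr * (X.T @ d_h) / len(X)
    return w1, w2, losses

X, y = make_toy_data()
w1, w2, losses = train(X, y)
predictions = sigmoid(sigmoid(X @ w1) @ w2) > 0.5
print("first error:", losses[0], "last error:", losses[-1])
print("training accuracy:", np.mean(predictions == (y == 1)))
```

The real problem works the same way in spirit, just with far more weights and much subtler dips, which is why the training takes so much longer.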
I would comment on the accuracy of the neural network I created, but it is still in the training stage (optimizing and changing the weights). This stage takes quite a bit of time, especially with large data sets like mine. For this particular problem, I have roughly 5 million weights that have to be tuned (it’s been running for 18 hours and still isn’t done). Once it does finish, however, I will make another post that details whether or not it works.
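To give a feel for where a number like 5 million comes from, here is a quick count of the weights in a network with one hidden layer. The hidden layer size of 1564 below is purely hypothetical, picked only because it lands near 5 million with 3197 inputs and 1 output; it is not a statement of my actual architecture. The count covers only the weights (the arrows in the visual), nothing else.

```python
def count_weights(n_inputs, n_hidden, n_outputs=1):
    """Number of arrows in a fully connected network with one hidden layer."""
    return n_inputs * n_hidden + n_hidden * n_outputs

n_inputs = 3197   # brightness measurements per star (from the post)
n_hidden = 1564   # hypothetical hidden layer size, for illustration only
print(count_weights(n_inputs, n_hidden))  # 5001672
```

Every one of those weights gets adjusted on every pass through the training data, so the run time adds up quickly.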
Sorry for a bit of a long post, but I have not had as much time to document each step, so I had to put them all into this one. This is also a confusing topic that people spend years learning and understanding. Because of this, I really do encourage questions, because I’m sure that I can explain things in more detail than I did in this quick overview.
*The function is a sigmoid function, which outputs a continuous value between 0 and 1.