Is it really March already? It feels like yesterday it was February and I was just starting at Avirtek. It’s been an eventful month. I went from knowing one programming language to about five, and I had to more or less teach the other four to myself. I now know Java (thank you Mr. B), Python (thank you Google/Stack Overflow/the Python documentation), SQL (thank you Codecademy), the command line (thank you again Codecademy), and XML (thank you W3Schools). A month ago I only knew Java, which thankfully gave me an incredible foundation for learning the others. If I hadn’t had that, the rest would have been impossible.
Progress update on the XML parser: I finally (after a day or so of nearly tearing my hair out in frustration) connected my program to the local Avirtek database, so every time I run the parser it uploads all the XML file data. I also had to teach myself regular expressions (regex) in Python to extract one of the features, which made for another day of wanting to pull my hair out. After I had successfully completed both of those tasks, I spent a solid five minutes celebrating because I had a working program and a full head of hair. That, my friends, is a small miracle.
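For anyone curious what "parse the XML, pull a feature out with regex, upload it to a database" looks like in Python, here is a minimal sketch. It is not the Avirtek parser: the sample XML, the `host` feature, the `features` table, and the function names `extract_features` and `upload` are all made up for illustration, and it uses an in-memory SQLite database from the standard library as a stand-in for the real one.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real input files are not shown in this post.
SAMPLE_XML = """<catalog>
  <item id="a1" href="http://example.com/page?id=42">first</item>
  <item id="b2" href="https://example.org/">second</item>
</catalog>"""

# Hypothetical feature: the host name inside each href attribute.
# The pattern matches "scheme://" and captures everything up to the next /, ? or #.
HOST_RE = re.compile(r"^[a-z][a-z0-9+.-]*://([^/?#]+)", re.IGNORECASE)


def extract_features(xml_text):
    """Return one (item_id, attribute_count, host) tuple per <item> element."""
    # Note: the Python docs warn that ElementTree is not hardened against
    # maliciously constructed XML, so this is only suitable for trusted input.
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter("item"):
        match = HOST_RE.match(elem.get("href", ""))
        host = match.group(1) if match else None
        rows.append((elem.get("id"), len(elem.attrib), host))
    return rows


def upload(conn, rows):
    """Store the extracted rows in an assumed 'features' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features "
        "(item_id TEXT, attr_count INTEGER, host TEXT)"
    )
    # Parameterized inserts keep the extracted values out of the SQL string itself.
    conn.executemany("INSERT INTO features VALUES (?, ?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a real local database
    upload(conn, extract_features(SAMPLE_XML))
    for row in conn.execute("SELECT * FROM features"):
        print(row)
```

The `?` placeholders in the insert matter even in a toy like this: values pulled out of a file you don’t trust should never be pasted directly into a SQL statement.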
In terms of the data I had to extract from XML, I’m almost done with extracting all the different features. I have about two more to go before that part of the project is finished. I have also discovered that the feature extraction is only the first step in this process. After that, the real cyber security work begins with the data analysis.
I’ve had to do a decent amount of research (apart from Googling “how to do x in Python” every few seconds) as of late. Dr. Hariri gave me a paper that outlines what Avirtek was founded on and how they determine if files are malicious, and has been generous enough to go over it with me and help me understand it. He also gave me a textbook on threat modeling, which is basically a book on how to go about creating secure software. Finally, he gave me a list of common XML attacks. The original syllabus I had for this project went out the window a couple weeks ago, but I have plenty of research material to work with and the portions of the XML parser I can actually disclose will be my final product.
XML is the next step in what Avirtek is working on, which means I’m being phased in. I was actually mentioned in the meeting today. They’re moving along with the project, and the meetings are beginning to make sense. The more time passes, the clearer my picture of what’s actually being done.
In terms of my own project, once I finish with all the features I’ll move on to the data analysis and malicious file detection, which is the core of cyber security. Pulling attributes from XML was just the beginning. If you’re wondering why I’ve been talking about XML so openly, it’s because I’ve discovered that it’s not secure information. Most of what I’ve done so far can be found in the research paper Dr. Hariri gave me (which is in the public domain; it’s actually an Israeli paper) or in the threat modeling textbook. There are portions I’ve had to omit, however, and for obvious reasons I can’t tell you which portions.
Finally, I’m going to be off next week. I’ll be in Montana attending the IEEE Aerospace Conference with my dad, and I’ll be presenting my own paper on XML at the Junior Conference. Did I mention the conference is held at a ski resort? It’s held at a ski resort. This is also my last year at the Junior Conference, so wish me luck!
To-do list for when I get back: refine the parser (e.g. reduce runtime, consolidate functions, delete unnecessary variables, shift to object-oriented programming, and make sure all the necessary aspects of each feature are detected and stored) and begin working on the data analysis.
See you in two weeks!