Day 26 (Final Presentations)

This morning, we had our final presentations for the internship. All of the presentations seemed very good, even though I had already watched them in practice. My presentation was third, and I felt pretty good about how it went. After the presentations, we received our letters of recommendation and our certificates. I talked to Dr. Kanan and some of the other interns, then I showed my family the lab upstairs. Today was a positive end to an excellent summer experience.

Day 25

This post is about yesterday (August 16th). First, we had our last morning meeting. Joe talked to us about the internship program and asked whether we had any suggestions or comments about it. I didn't really have any, but others said they would have liked a tour of the different labs on the first day of the program. It was also my last day in the kLab. I began by tying up some loose ends: I uploaded my data and other necessary files to Google Drive so that Ron could access them after I left. I also continued to run my feature extractor experiment. It appears that I will not finish all of its tests in time, but at least Ron will have the first part of the data. I also kept trying to fix the CRF (Conditional Random Field) postprocessor with the grad student who was working on postprocessors, but that code took a long time to run before it even reached the error. After lunch, I did not have much to do because my programs were still running, so Ron had me work on …

Day 24

We began in the auditorium this morning to practice our presentations. Overall, everyone's presentations seem pretty good. I gave mine third because I will be going third on Thursday. I'm glad I am near the beginning of the schedule, because I think it would be difficult for me to focus on the other presentations if I still had to give mine. My presentation today seemed to go well. I was just over ten minutes (the amount of time each individual presentation is supposed to take before questions), and I only had to make a few minor corrections to my slides (such as adding an acknowledgements page). After all of the presentations were done, it was about 12:00, so Ryan and I ate lunch before going to the lab. I had not been able to run my feature extractor experiment overnight, so I started it when I got to the lab, and it is still running now. Hopefully, it will continue to run and the results will be ready tomorrow. Since I already gave my practice presentation in the auditorium, …

Day 23

At the morning meeting today, we turned in our presentations to Joe. I knew I would have to make changes to mine later, though. Soon after getting to the lab, Ryan and I practiced our presentations in front of those in the lab. I felt like my presentation had improved since last time. After that, I worked on the revisions to my presentation that Ron and the others had suggested. I also ran some of the tests for my feature extractor experiment. I will try to run more of these tests overnight, but the software we use in the kLab to do so is difficult, and I have previously tried and failed to get it to work. A little while after lunch, Ron had Ryan and me give our presentations again so we could have more practice. This time there were more people in the lab than before. After that, I made a few small edits, then gave the updated version of my slides to Joe. Tomorrow, we interns will give our presentations in the auditorium to prepare for Thursday.

Day 22

Today I continued working on the feature extractor experiment. Ron and Ryan needed all of the GPUs to run their experiments, so I had to figure out how to run mine on a CPU. Ryan and I also did practice presentations in the lab today. I went first because I was before Ryan on the presentation schedule draft Joe handed out at the morning meeting. I think my rehearsal was good for a first try, although I did get stuck in one spot trying to explain the SVM-RBF classifier. After I was done, Ryan gave his presentation, which I thought was good. He seemed like he fully understood his research. After the rehearsal, Ron talked to us individually about our slides, and I made corrections to my presentation. This weekend I hope to be able to run my feature extractor experiment, as it will likely take a long time.
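One common way to move an experiment off the GPUs, assuming it runs on a CUDA-backed framework such as TensorFlow (which I can't confirm applies to every part of our pipeline), is to hide the GPUs from the process before the framework loads. A minimal sketch:

```python
import os

# Hiding every GPU makes CUDA-backed frameworks fall back to the CPU.
# This must be set before the framework is first imported.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import tensorflow as tf  # sees no GPUs, so all ops run on the CPU
```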

Day 21

I finished my postprocessor experiment today. Fortunately, that experiment was much faster than the others so far. The last postprocessor, CRF (Conditional Random Field), took a much longer time to run than MRF (Markov Random Field) and also turned out to be less accurate. Ron thought this was suspicious, so he had me send the results and data to another grad student who is working on postprocessors. I also finished up my slides for my presentation (except for the slides that I might add later about the experiment I am doing now). I added the new results and planned the points I will talk about. Ryan and I are doing presentation rehearsals in the lab tomorrow to see how well prepared we are. Also today, I started another experiment, which was to evaluate feature extractors. I worked on setting up the code to run that experiment, and I still have a few issues that I need to fix. I should have the experiment running tomorrow.
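To give an idea of the shape of the feature extractor experiment: I haven't spelled out here which extractors are being compared, but a typical setup is to use pretrained CNNs as fixed feature extractors and score the same classifier on each one's output. A hypothetical sketch with Keras models and the SVM-RBF classifier (the model choices and image shapes are placeholders, not our actual code):

```python
from tensorflow.keras.applications import VGG16, ResNet50
from tensorflow.keras.applications import vgg16, resnet50
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical extractors: pretrained CNNs with the classification head
# removed, each paired with its own input-preprocessing function.
extractors = {
    "vgg16": (VGG16(weights="imagenet", include_top=False, pooling="avg"),
              vgg16.preprocess_input),
    "resnet50": (ResNet50(weights="imagenet", include_top=False, pooling="avg"),
                 resnet50.preprocess_input),
}

def score_extractor(model, preprocess, images, labels):
    # images: float array of shape (n, 224, 224, 3); labels: shape (n,)
    features = model.predict(preprocess(images.copy()))
    clf = SVC(kernel="rbf")  # SVM-RBF, the best classifier so far
    return cross_val_score(clf, features, labels, cv=5).mean()

# for name, (model, pre) in extractors.items():
#     print(name, score_extractor(model, pre, images, labels))
```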

Day 20

I finished my classifier experiment today. The only classifier I had left to test was the SNN (Self-Normalizing Neural Network). I had narrowed it down to two combinations of parameters, so I decided to run thirty trials on each of them and see which one performed better. I got to the lab early today, so I started running those and went to the morning meeting. At the meeting, we each described where we were in our projects and said how prepared we were for the presentation. My PowerPoint was about half done, and I said I was just finishing my second experiment (the classifier experiment). After I got back to the lab, the tests I was running crashed because the network connection was lost, so I had to restart them. That was a bummer. When I finally finished those tests, I added the results from the best parameters to my results table. After that, I began my next experiment, which should take less time than my previous ones. It is to evaluate postprocessing methods. I already had …
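The head-to-head between the two parameter combinations boils down to repeating independent training runs and comparing mean accuracy. A minimal sketch of that pattern (the configuration names and the build_and_score helper are placeholders, not our actual code):

```python
import numpy as np

def compare_configs(configs, build_and_score, n_trials=30):
    """Run n_trials independent fits per configuration and summarize.

    build_and_score(params, seed) is a placeholder that should train one
    SNN with the given parameters and return its test accuracy.
    """
    results = {}
    for name, params in configs.items():
        scores = [build_and_score(params, seed=i) for i in range(n_trials)]
        results[name] = (np.mean(scores), np.std(scores))
    return results

# results = compare_configs({"combo_a": {...}, "combo_b": {...}}, build_and_score)
# Pick the combination with the higher mean accuracy.
```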

Day 19

My birthday is today, so I am now seventeen. I started today by adding the data from yesterday's Random Forest tests to my results table. I decided to use the data from the test with StandardScaler as the preprocessor. After that, I began testing the SNN (Self-Normalizing Neural Network) classifier. I already knew that I had to use StandardScaler with it, but I needed to figure out what value to use for the layer size, which is how many machine learning "neurons" are in each layer of the SNN. I also needed to figure out some other parameters, so I ran many parameter tests on the SNN today to find the best parameters to test it with, and I recorded a lot of data, most of which I will not need after today. While those tests were running, I began working on a PowerPoint for my presentation. I tried to make sure that I explained everything in my presentation so that anyone in the audience, even someone with absolutely no experience with the topic, will be able to understand it.
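For context, a self-normalizing network follows a specific recipe from Klambauer et al. (2017): SELU activations, lecun_normal weight initialization, AlphaDropout, and inputs standardized to zero mean and unit variance, which is exactly why StandardScaler is required. A minimal Keras sketch (the layer sizes and counts are placeholders for the values being searched):

```python
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, AlphaDropout

def build_snn(n_features, n_classes, layer_size=64, n_layers=2):
    model = Sequential()
    model.add(Dense(layer_size, activation="selu",
                    kernel_initializer="lecun_normal",
                    input_shape=(n_features,)))
    for _ in range(n_layers - 1):
        model.add(AlphaDropout(0.1))  # dropout variant that preserves self-normalization
        model.add(Dense(layer_size, activation="selu",
                        kernel_initializer="lecun_normal"))
    model.add(Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# SELU's self-normalizing property assumes standardized inputs:
# X = StandardScaler().fit_transform(X)
```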

Day 18

When I began today, I had finished all of my writing so far, which meant I did not have anything to work on while my classifier experiment code ran, so Ron introduced me to my next experiment: evaluating postprocessing methods. I started by documenting the already-written code for the postprocessors, which were a Markov Random Field (MRF) and a Conditional Random Field (CRF). Documenting code written by others remains a challenge for me; I usually document code that I write as I am writing it, so that I do not have to go back and figure out what it does. After I finished documenting the two postprocessors, I began writing the code for my postprocessor experiment, but I will likely have to alter it later, because my classifier experiment is not over yet and I do not yet know which classifier is the most accurate. For now, I have the postprocessing code use the SVM-RBF classifier, which is the best one so far. Finally, my trials on the MLP classifier with the …
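Since the best classifier may still change, the experiment code benefits from keeping the classifier swappable. A hypothetical sketch of that wiring, assuming the postprocessors smooth per-class probability maps (run_postprocessor stands in for the MRF/CRF code, which I am not reproducing here):

```python
from sklearn.svm import SVC

def postprocessor_experiment(X_train, y_train, X_test, run_postprocessor,
                             classifier=None):
    # Default to SVM-RBF, the best classifier so far, but allow any
    # other classifier to be dropped in once the experiment finishes.
    clf = classifier or SVC(kernel="rbf", probability=True)
    clf.fit(X_train, y_train)
    probs = clf.predict_proba(X_test)  # per-class probabilities
    return run_postprocessor(probs)    # MRF/CRF smooths the predicted labels
```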

Day 17

Today, using the parameters I had previously found through the search, I continued my classifier experiment by running trials on the MLP classifier and recording the results. These results, however, turned out not to be very good, with an average overall accuracy of only about fifty percent. I tried to improve the MLP's performance in various ways, such as using different numbers of epochs in each trial. I also increased the patience value, which is the number of epochs without improvement the program will tolerate before it terminates training early. It seems that early stopping is seriously limiting the accuracy. So far, though, these changes have not significantly improved it. While the trials were running, I finished writing and citing descriptions of the classifiers I am testing in the Overleaf document. I also wrote an overview of the machine learning pipeline, and edited and added a diagram of the pipeline's progression that I had previously made. I also prepared to …
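For reference, in Keras-style frameworks this kind of patience setting is just an EarlyStopping callback; the sketch below is illustrative rather than our exact setup:

```python
from tensorflow.keras.callbacks import EarlyStopping

# Training halts once validation loss fails to improve for `patience`
# consecutive epochs. Raising patience gives the MLP more time to escape
# a plateau before a trial is cut short. Values here are illustrative.
early_stop = EarlyStopping(monitor="val_loss", patience=20,
                           restore_best_weights=True)

# model.fit(X_train, y_train, validation_split=0.2,
#           epochs=500, callbacks=[early_stop])
```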

Day 16

Today at the morning meeting we reviewed more of the outlines, including mine. My outline was put on the projector, and I gave a brief description of what my project is and what I am going to present. At the lab, I worked with Ron to figure out the problem I had been having the previous day with the parameter search I was doing for the MLP classifier. The problem turned out to be more complicated than anticipated: the issue was not only with my code but also with the code for the MLP. The MLP had worked correctly when used by the pipeline, but not when used by the grid search I was performing, which runs outside of the pipeline. Also, at one point today Ashley, Emily, and Peter dropped by the lab to see what I was working on. I did not have much to show them, as all of my work is on my computer and I was just trying to fix an error at the time, but I did my best to describe and show them my experiment. After Ron and I fixed the problem with the MLP, we and two other students in …
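A classic cause of "works in the pipeline, breaks in a standalone grid search" is preprocessing that the pipeline applies automatically. I can't say for certain that was our exact bug, but the standard guard against it in scikit-learn is to wrap the scaler and classifier together so GridSearchCV applies the same preprocessing inside every fold. A sketch of that pattern, assuming a scikit-learn-style MLP:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV

# Pipeline steps are named so parameters can be addressed as "step__param".
search = GridSearchCV(
    Pipeline([("scale", StandardScaler()),
              ("mlp", MLPClassifier(max_iter=500))]),
    param_grid={"mlp__hidden_layer_sizes": [(50,), (100,), (100, 50)],
                "mlp__alpha": [1e-4, 1e-3]},
    cv=5,
)
# search.fit(X, y); print(search.best_params_)
```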

Day 15

Today I got to the lab earlier than usual and worked on my experiment before the morning meeting. Then, at the meeting, we reviewed some of our outlines. It seemed from the outlines that everyone's projects were coming along pretty well. I noticed that not many of the other interns have results yet, as their projects require longer-term preparation, while I already have results from my first experiment. When I got back to the lab, I continued working on my classifier experiment. The whole time I was in the lab today, I was writing code for a parameter search on the MLP classifier. I was able to reuse some of the code I had previously written, but the parameter search is outside of the pipeline, so I had to write some parts on my own that the pipeline had previously handled itself. I almost finished the parameter search code today, but I ran into a confusing error regarding cross validation (CV). Ron helped me look over the problem, but I am still having a tough time understanding CV …
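In case it helps future me: cross validation splits the data into k folds, trains on k−1 of them, scores on the held-out fold, and averages the k scores, so every sample is used for validation exactly once. A small self-contained scikit-learn example, with synthetic data standing in for ours:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# StratifiedKFold keeps the class proportions the same in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(MLPClassifier(max_iter=500), X, y, cv=cv)
print(scores.mean(), scores.std())  # average accuracy across the 5 folds
```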

Day 14

Today I continued testing different parameters for the MLP classifier. While I ran tests, I finished documenting the classifiers that I am testing. I also wrote an introduction to my descriptions of the preprocessing methods, finished the citations for those descriptions, and wrote a summary of my preprocessing experiment in an article about the machine learning pipeline being written in Overleaf. Although I am not yet ready to test the next classifier, RandomForest, I began writing code to find the optimal preprocessing methods and parameters to use with it. Throughout the day, I collected data on the metrics of different combinations of parameters for the MLP. I looked through this data and ran a larger number of trials on the most promising parameters, but it still appears that the default parameters are the most accurate, even though the overall accuracy from the default parameters averages about seventy percent, which is not very good compared to the other two classifiers I have tested so far.
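The preprocessing experiment summarized in the Overleaf article follows the same compare-and-score shape as everything else. A sketch of the idea (the scalers listed are common scikit-learn choices, not necessarily the exact ones we tested, and synthetic data stands in for ours):

```python
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler
from sklearn.pipeline import Pipeline
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Cross-validate the same classifier once per preprocessor and compare.
for name, scaler in [("standard", StandardScaler()),
                     ("minmax", MinMaxScaler()),
                     ("robust", RobustScaler())]:
    pipe = Pipeline([("scale", scaler),
                     ("mlp", MLPClassifier(max_iter=500))])
    print(name, cross_val_score(pipe, X, y, cv=5).mean())
```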