This week my primary focus has been gathering numerical data from the output of the main analysis script. The data is still preliminary, because I haven't yet done the extensive manual analysis needed to break the overall categories of change down into subcategories based on the (perceived) motivation for each change. For example, the raw count of synchronization routines added and removed over the course of development is less useful than knowing how many of those additions and removals were intended to fix race conditions and how many were simply refactoring. I began this manual evaluation this week and will continue it into next week. The graphs below are the results of my script's analysis:
As the graphs show, over the course of development the number of synchronization removals went down and the number of synchronization additions went up substantially. My guess is that as the number of servers increases over time, the need for synchronization increases with it. I am going to spend the rest of my time here primarily on manual analysis, to determine what the increase in synchronization correlates with and what the possible motivations for these changes are. So far, I have written some scripts that randomly sample the changes my original script found, so that I don't have to manually analyze hundreds of changes (we don't have that kind of time left). I've begun reading the LOG files and assigning the randomly sampled changes to subcategories of change motivation. I plan to continue this throughout next week and to keep meeting with Professor Lu to discuss what I'm finding.
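The random sampling step could look roughly like the sketch below. This is only an illustration and not the actual script: the record format, the `sample_changes` name, the sample size of 25, and the fixed seed are all assumptions made for the example.

```python
import random


def sample_changes(changes, k, seed=0):
    """Return up to k changes drawn uniformly at random, without replacement.

    A fixed seed makes the sample reproducible, so the same subset can be
    revisited in later weeks of manual analysis.
    """
    rng = random.Random(seed)      # private generator, does not touch global state
    k = min(k, len(changes))       # avoid ValueError when fewer than k changes exist
    return rng.sample(changes, k)


# Hypothetical stand-in for the changes found by the main analysis script.
changes = [
    {"id": i, "kind": "addition" if i % 3 else "removal"}
    for i in range(200)
]

subset = sample_changes(changes, 25)
for change in subset[:3]:
    print(change["id"], change["kind"])
```

Sampling without replacement keeps any change from being reviewed twice, and seeding the generator means the subset stays the same between runs.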