Dynamic Programming in the Wild


After studying Computer Science theory for 4 years and doing lots of practice problems, it's a little disappointing that zesty problems like graph algorithms or divide and conquer don't come up frequently. That is why I was thrilled to be handed a real-world problem that needed some solid CS theory:

Problem statement:
Pensions at Company A are calculated by taking the employee's average pay over either:
  • The last 10 years of their service, or
  • Any other combination of 10 periods of 52 weeks.
Goal: Maximize the pension of retiring employees.

Usually, employees take the first option because it's easy and often mostly correct, but is there an efficient way to find the optimal solution?

Option 1: Brute force

[Image: the relevant XKCD about runtimes]

Hopefully, you groaned as much reading "brute force" as I did writing it. While checking every possible combination of time slices does find the optimal answer, it takes a loooooooong time. How long, you ask?

If we assume a career of 40 years, that is 40 years × 52 weeks per year = 2080 input points.

So how many ways can we choose 10 non-overlapping chunks of 52 weeks each? This can be transformed into an easier combinatorics problem by noticing that if we collapse each chunk into a single element, we're really just asking how many ways we can arrange the selected chunks among the weeks that are not selected.

Once we make that simplification, there are actually 1570 elements that we are arranging:
  • 10 chunk-elements, which between them cover 520 weeks (10 chunks of length 52)
  • 1560 loose weeks which are not in any chunk (2080 − 520)
So now it's just a matter of choosing which 10 of those 1570 elements are the chunks: 1570 choose 10.

"Just" 1570 choose 10.

According to Wolfram Alpha that is 2.43 × 10²⁵. That number is HUGE. For reference:
  • There are ~7.5 × 10¹⁸ grains of sand on planet Earth
  • The observable universe has a diameter of about 10²⁷ meters
If all 7 billion people on earth used a computer that could check one of these combinations every 10 ms, it would take 1,140,794 years to check them all.
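
If you want to sanity-check those numbers yourself, Python's math.comb makes it quick (the 7-billion-machines-at-one-check-per-10-ms scenario is the one described above):

```python
import math

combos = math.comb(1570, 10)          # ways to place 10 chunks among 1570 elements
print(f"{combos:.2e}")                # ~2.43e+25

checks_per_sec = 7_000_000_000 * 100  # 7 billion machines, one check per 10 ms each
years = combos / checks_per_sec / (60 * 60 * 24 * 365)
print(f"{years:,.0f} years")          # on the order of a million years
```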

Okay, clearly brute force is a bad idea, but it's useful to know when it is and isn't feasible. Maybe we can trade accuracy for speed?

Option 2: Greedy Algorithms


One improvement would be to find the most "dollar dense" section of the series and make that one of your 10 regions. Then you can remove that section from the series and repeat.
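
Sketched in Python, that greedy strategy might look like this (a minimal illustration with names of my choosing; note that ties between equally dense blocks come down to an arbitrary tie-break, here the earliest start):

```python
def greedy_total(pay, chunk_len, num_chunks):
    """Repeatedly grab the densest remaining block. Fast, but not always optimal."""
    taken = [False] * len(pay)
    total = 0
    for _ in range(num_chunks):
        best_start, best_sum = None, float("-inf")
        for start in range(len(pay) - chunk_len + 1):
            if any(taken[start:start + chunk_len]):
                continue  # overlaps a block we already grabbed
            block_sum = sum(pay[start:start + chunk_len])
            if block_sum > best_sum:  # ties keep the earliest start
                best_start, best_sum = start, block_sum
        if best_start is None:
            raise ValueError("no room left for another block")
        for week in range(best_start, best_start + chunk_len):
            taken[week] = True
        total += best_sum
    return total
```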

This is not a terrible idea. Unlike Option 1, it does finish before the heat death of the universe. Also unlike Option 1, it does not always find the optimal answer. Consider the following data, where we want to select 2 blocks of length 3:

  3 0 3 3 0 3 1

The greedy strategy would take the 0 3 3 chunk, and then 0 3 1 is the only valid chunk left. That gives a total of 10:

  3 [0 3 3] [0 3 1]  →  6 + 4 = 10

Better is to split up the pair of 3s and take 3 0 3 and 3 0 3, for an optimal total of 12:

  [3 0 3] [3 0 3] 1  →  6 + 6 = 12

Greedy algorithms always carry this drawback: they commit to the locally best choice, so while they are fast, they are not generally optimal.

So the real question is: Can we find the optimal answer in a reasonable amount of time?

Option 3: Dynamic Programming


As the title of this write-up implies, I was very excited to run into a problem with a sound DP solution in the "real world." I've done plenty of Dynamic Programming problems for interview prep and in college, but it's just not something we run into every day.

The giveaway on this problem was that the optimal answer is composed of the optimal answers to sub-problems. This optimal substructure is the hallmark of Dynamic Programming problems, and in this case I was able to re-frame the problem as the well-known 0-1 Knapsack Problem.

In the traditional knapsack problem, a thief is deciding which items to put in their bag to maximize the value of what they take (there isn't room in the knapsack for everything). Here, I wasn't choosing which weeks to use, but which blocks of weeks to use. So instead of looping over individual weeks, I treated each possible block of weeks as an item, and from there it was a typical knapsack problem.

Enough monologuing, here's the code (it takes a CSV file name as an argument, as well as the chunk size and the number of chunks):
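
What follows is a minimal sketch of the approach (the identifier names and the one-pay-value-per-row CSV layout are my assumptions, not the original listing):

```python
import csv
import sys

def max_pension_total(pay, chunk_len, num_chunks):
    """Best total pay from num_chunks non-overlapping blocks of chunk_len weeks."""
    n = len(pay)
    NEG = float("-inf")
    # best[i][j]: max total over the first i weeks using exactly j blocks.
    best = [[NEG] * (num_chunks + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        best[i][0] = 0  # placing zero blocks is always possible
    for i in range(1, n + 1):
        for j in range(1, num_chunks + 1):
            # Week i is either unused...
            best[i][j] = best[i - 1][j]
            # ...or it closes out a block of chunk_len weeks.
            if i >= chunk_len and best[i - chunk_len][j - 1] != NEG:
                candidate = best[i - chunk_len][j - 1] + sum(pay[i - chunk_len:i])
                best[i][j] = max(best[i][j], candidate)
    return best[n][num_chunks]  # -inf if there isn't room for all the blocks

if __name__ == "__main__":
    # e.g. python pension.py pay_history.csv 52 10
    path, chunk_len, num_chunks = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
    with open(path, newline="") as f:
        pay = [float(row[0]) for row in csv.reader(f) if row]  # bare column, no header
    total = max_pension_total(pay, chunk_len, num_chunks)
    print(total / (chunk_len * num_chunks))  # the pension-relevant average pay
```

Each week either goes unused or closes out a block, in which case the best sub-answer from chunk_len weeks earlier (with one fewer block) gets extended. Jumping back a full block guarantees the chosen blocks can never overlap.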


By using dynamic programming we are able to catch edge cases like the one that broke the greedy strategy. Because the table records the best solution for every prefix of the data and every number of blocks, the algorithm can combine subproblems and discover that breaking up a dense region provides a better big-picture answer.

Of course, brute force finds the same optimal answers that dynamic programming does. They have very different runtimes, however. While brute force can't solve any interesting instance in your lifetime, my DP approach solves the 10-year sample problem instantaneously. I generated ~77,000 weeks of test data just to stress it, and it still finished in about a second.

There's clearly room for improvement with this code. I could cache/precompute the sums of all the ranges instead of counting them every time, and my code isn't very Pythonic. Ultimately it works and runs very quickly so I haven't taken much time to optimize.
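
That caching would be a small change. A sketch using prefix sums, so each block total becomes an O(1) lookup instead of a fresh sum():

```python
def block_totals(pay, chunk_len):
    """Precompute the total of every length-chunk_len block in one pass."""
    prefix = [0]
    for p in pay:
        prefix.append(prefix[-1] + p)  # prefix[i] = total of the first i weeks
    # totals[s] = sum(pay[s:s + chunk_len]) as a constant-time subtraction
    return [prefix[s + chunk_len] - prefix[s] for s in range(len(pay) - chunk_len + 1)]
```

With those totals precomputed, the DP loop drops from O(n · k · chunk_len) to O(n · k).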

Conclusion

So how much better is optimal than the lazy strategy the company was using? I checked in with them after they had been using this code for a few months, and they said that in some cases their employees get a 16% higher pension.

Even compared to the best answers they could find by hand (no small effort), the optimal answer my code produces is an extra 8% or so higher. This was a really fun challenge to code, but knowing it saved people a fair bit of money is a great feeling.

Overall, I was quite excited to see a real-world application of DP, and I'm glad I was able to implement a solution. Most of the other approaches I saw proposed for this problem suffered from the pitfalls of greedy algorithms or the inefficiency of brute force.

Once in a while, it's nice for some Computer Science Theory to save the day!

Click here to play with a JavaScript port of this solution.