LRU vs LFU:
LRU: remove element which hasn't used in the longest time, by keeping the access order of the elements.
- Pros:
- Cache automatically tuned when access pattern changes.
- Cons:
- Performance poorly if certain files are accessed every once in a while, consistently, while others are accessed very frequently for a short while and never again.
- Used most frequently in real world according to Dropbox.
- Pros:
LFU: Keeping a counter on each file, stating how many times its been accessed.
- Cons: does poorly when access pattern changes, if usage pattern is stable, it'll do well.
- For example, suppose you have a large project that you’re working on, and then you finish said project and no longer access those files. Your cache would be storing those old files instead of the new ones you’re using. So our analysis told us something surprising: that LFU, which looked so promising, could actually be absurdly bad, in perhaps real-world situations.
https://stackoverflow.com/questions/15613966/parallel-top-ten-algorithm-for-distributed-data
http://zpjiang.me/2017/11/13/top-k-elementes-system-design/
https://www.inf.ed.ac.uk/teaching/courses/exc/slides/DataStreams.pdf
https://www.jiuzhang.com/qa/896/
https://micvog.com/2015/07/18/frequency-counting-algorithms-over-data-streams/