首页 > 资讯 > 后端开发 > Python >LRU算法及Apache LRUMap源码实例解析

106

分享到

LRU算法及Apache LRUMap源码实例解析

2024-04-02 19:04:59 106人浏览薄情痞子

Python 官方文档：入门教程 => 点击学习

摘要

目录1. 什么是LRU1.1 自定义实现LRU的要求1.2 Apache LRUMap示例1.2.1 pom依赖1.2.2 demo2. 源码解析2.1 设计2.2 数据结构2.3

1. 什么是LRU

LRU(least recently used) : 最近最少使用

LRU就是一种经典的算法，在容器中，对元素定义一个最后使用时间，当新的元素写入的时候，如果容器已满，则淘汰最近最少使用的元素，把新的元素写入。

1.1 自定义实现LRU的要求

比如Redis，如何自己实现简易版的redis缓存。

那么我们需要一种数据结构，支持set和get操作。

1） get操作时间复杂度O(1);

2）需要支持RLU算法，空间不足时，需要将使用最少的元素移除，为新元素让空间；

3）时间失效remove(这个先不谈，比较麻烦)。

1.2 Apache LRUMap示例

1.2.1 pom依赖


        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-collections4</artifactId>
            <version>4.2</version>
        </dependency>

1.2.2 demo


        LRUMap<String, String> map = new LRUMap<>(3);
        map.put("1", "1");
        map.put("2", "2");
        map.put("3", "3");
 
        map.get("2");
 
        System.out.println("---------------------------------");
        map.forEach((k,v)->
            System.out.println(k+"\t"+v)
        );
 
        map.put("4", "4");
        map.put("5", "5");
 
        System.out.println("---------------------------------");
        map.forEach((k,v)->
                System.out.println(k+"\t"+v)
        );
 
        map.put("6", "6");
 
        System.out.println("---------------------------------");
        map.forEach((k,v)->
                System.out.println(k+"\t"+v)
        );

结果如下：

---------------------------------

1 1

3 3

2 2

---------------------------------

2 2

4 4

5 5

---------------------------------

4 4

5 5

6 6

可以看出在get("2")，2的位置挪后，然后移除的顺序就延后。

容量不足时，总是移除，使用最少的，时间最远的。

2. 源码解析

2.1 设计


public class LRUMap<K, V>
        extends AbstractLinkedMap<K, V> implements BoundedMap<K, V>, Serializable, Cloneable {

进一步查看AbstractLinkedMap，AbstractHashedMap


public abstract class AbstractLinkedMap<K, V> extends AbstractHashedMap<K, V> implements OrderedMap<K, V> {


public class AbstractHashedMap<K, V> extends AbstractMap<K, V> implements IterableMap<K, V> {

本质是自定义AbstractMap

我们看看HashMap LinkedHashMap


public class LinkedHashMap<K,V>
    extends HashMap<K,V>
    implements Map<K,V>


public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

可以看出AbstractMap，AbstractHashedMap，LRUMap的本质其实也是HashMap。

2.2 数据结构


protected static final int DEFAULT_MAX_SIZE = 100;
 
public LRUMap() {
        this(DEFAULT_MAX_SIZE, DEFAULT_LOAD_FACTOR, false);
}

可以看出默认初始化容量100，最大容量也是100.

进一步跟踪


public LRUMap(final int maxSize, final float loadFactor, final boolean scanUntilRemovable) {
        this(maxSize, maxSize, loadFactor, scanUntilRemovable);
}
 

    public LRUMap(final int maxSize,
                  final int initialSize,
                  final float loadFactor,
                  final boolean scanUntilRemovable) {
 
        super(initialSize, loadFactor);
        if (maxSize < 1) {
            throw new IllegalArgumentException("LRUMap max size must be greater than 0");
        }
        if (initialSize > maxSize) {
            throw new IllegalArgumentException("LRUMap initial size must not be greather than max size");
        }
        this.maxSize = maxSize;
        this.scanUntilRemovable = scanUntilRemovable;
    }

跟踪super(initialSize, loadFactor);


public abstract class AbstractLinkedMap<K, V> extends AbstractHashedMap<K, V> implements OrderedMap<K, V> {
 
    protected AbstractLinkedMap(final int initialCapacity, final float loadFactor) {
        super(initialCapacity, loadFactor);
    }
//又super，再上一层追踪
 
public class AbstractHashedMap<K, V> extends AbstractMap<K, V> implements IterableMap<K, V> {
    //定义一些基本初始化数据
    
    protected static final int DEFAULT_CAPACITY = 16;
    
    protected static final int DEFAULT_THRESHOLD = 12;
    
    protected static final float DEFAULT_LOAD_FACTOR = 0.75f;
    
    protected static final int MAXIMUM_CAPACITY = 1 << 30;
 
    
    transient float loadFactor;
    
    transient int size;
    
    transient HashEntry<K, V>[] data;
    
    transient int threshold;
    
    transient int modCount;
    
    transient EntrySet<K, V> entrySet;
    
    transient KeySet<K> keySet;
    
    transient Values<V> values;
 
    protected AbstractHashedMap(int initialCapacity, final float loadFactor) {
        super();
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Initial capacity must be a non negative number");
        }
        if (loadFactor <= 0.0f || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Load factor must be greater than 0");
        }
        this.loadFactor = loadFactor;
        initialCapacity = calculateNewCapacity(initialCapacity);
        this.threshold = calculateThreshold(initialCapacity, loadFactor);
        this.data = new HashEntry[initialCapacity];
        init();
    }
 
    
    protected void init() {
        //没有任何逻辑，仅用于子类构造
    }

DEFAULT_LOAD_FACTOR = 0.75f; 负载因子0.75

可以看出LRUMap的本质，HashEntry数组。

上面的init方法没有实现逻辑，但是在他的子类中AbstractLinkedMap有相关的定义。


    
    transient LinkEntry<K, V> header;
 
    
    @Override
    protected LinkEntry<K, V> createEntry(final HashEntry<K, V> next, final int hashCode, final K key, final V value) {
        return new LinkEntry<>(next, hashCode, converTKEy(key), value);
    }
 
    protected static class LinkEntry<K, V> extends HashEntry<K, V> {
        
        protected LinkEntry<K, V> before;
        
        protected LinkEntry<K, V> after;
 
        
        protected LinkEntry(final HashEntry<K, V> next, final int hashCode, final Object key, final V value) {
            super(next, hashCode, key, value);
        }
    }
 
    
    @Override
    protected void init() {
        header = createEntry(null, -1, null, null);
        header.before = header.after = header;
    }

这个很关键。可以看出LRUMap是持有LinkEntry header，的双链表结构，初始header为null，前后节点都是自身。将HashEntry转成LinkEntry。

解析HashEntry


transient HashEntry<K, V>[] data;
//构造初始化
this.data = new HashEntry[initialCapacity];

再跟踪


 protected static class HashEntry<K, V> implements Map.Entry<K, V>, KeyValue<K, V> {
        
        protected HashEntry<K, V> next;
        
        protected int hashCode;
        
        protected Object key;
        
        protected Object value;

key，value，next单链表。


public int hashCode() {
            return (getKey() == null ? 0 : getKey().hashCode()) ^
                   (getValue() == null ? 0 : getValue().hashCode());
        }

hashCode方法可以看出是key的hash与value的hash按位^运算。

在此我们看透LRU的本质了，数组+单链表。同时是持有头结点的双链表结构（怎么看就是LinkedHashMap的结构，只是有尾节点）。


public class LinkedHashMap<K,V>
    extends HashMap<K,V>
    implements Map<K,V>
{    
    
    transient LinkedHashMap.Entry<K,V> head;
 
    
    transient LinkedHashMap.Entry<K,V> tail;

那么LRUMap是如何实现LRU算法的？

2.3 方法解析put get remove

2.3.1 get方法


public V get(final Object key) {
        return get(key, true);
}
 
public V get(final Object key, final boolean updateToMRU) {
        final LinkEntry<K, V> entry = getEntry(key);
        if (entry == null) {
            return null;
        }
        if (updateToMRU) {
            moveToMRU(entry);
        }
        return entry.getValue();
}
 
//父类方法获取值entry
protected HashEntry<K, V> getEntry(Object key) {
        key = convertKey(key);
        final int hashCode = hash(key);
        HashEntry<K, V> entry = data[hashIndex(hashCode, data.length)]; // no local for hash index
        while (entry != null) {
            if (entry.hashCode == hashCode && isEqualKey(key, entry.key)) {
                return entry;
            }
            entry = entry.next;
        }
        return null;
}

下面看不一样的moveToMRU(entry);



    protected void moveToMRU(final LinkEntry<K, V> entry) {
        if (entry.after != header) {
            modCount++;
            // remove
            if(entry.before == null) {
                throw new IllegalStateException("Entry.before is null." +
                    " Please check that your keys are immutable, and that you have used synchronization properly." +
                    " If so, then please report this to dev@commons.apache.org as a bug.");
            }
            entry.before.after = entry.after;
            entry.after.before = entry.before;
            // add first
            entry.after = header;
            entry.before = header.before;
            header.before.after = entry;
            header.before = entry;
        } else if (entry == header) {
            throw new IllegalStateException("Can't move header to MRU" +
                " (please report this to dev@commons.apache.org)");
        }
    }

看出LRU的一个本质，每次get方法拨动指针，将get的元素移动到header的前一个位置。

2.3.2 remove方法

remove方法使用的父类的方法


    
    @Override
    public V remove(Object key) {
        key = convertKey(key);
        final int hashCode = hash(key);
        final int index = hashIndex(hashCode, data.length);
        HashEntry<K, V> entry = data[index];
        HashEntry<K, V> previous = null;
        while (entry != null) {
            if (entry.hashCode == hashCode && isEqualKey(key, entry.key)) {
                final V oldValue = entry.getValue();
                removeMapping(entry, index, previous);
                return oldValue;
            }
            previous = entry;
            entry = entry.next;
        }
        return null;
    }
 
    
    protected void removeMapping(final HashEntry<K, V> entry, final int hashIndex, final HashEntry<K, V> previous) {
        modCount++;
        removeEntry(entry, hashIndex, previous);
        size--;
        destroyEntry(entry);
    }
 
    protected void removeEntry(final HashEntry<K, V> entry, final int hashIndex, final HashEntry<K, V> previous) {
        if (previous == null) {
            data[hashIndex] = entry.next;
        } else {
            previous.next = entry.next;
        }
    }
 
    protected void destroyEntry(final HashEntry<K, V> entry) {
        entry.next = null;
        entry.key = null;
        entry.value = null;
    }

这里并没有移除header双链表的数据。

2.3.3 put方法


    
    @Override
    public V put(final K key, final V value) {
        final Object convertedKey = convertKey(key);
        final int hashCode = hash(convertedKey);
        final int index = hashIndex(hashCode, data.length);
        HashEntry<K, V> entry = data[index];
        //仅在元素存在才循环，更新updateEntry，header前一个位置
        while (entry != null) {
            if (entry.hashCode == hashCode && isEqualKey(convertedKey, entry.key)) {
                final V oldValue = entry.getValue();
                updateEntry(entry, value);
                return oldValue;
            }
            entry = entry.next;
        }
 
        addMapping(index, hashCode, key, value);
        return null;
    }

updateEntry(entry, value);


    
    @Override
    protected void updateEntry(final HashEntry<K, V> entry, final V newValue) {
        moveToMRU((LinkEntry<K, V>) entry);  // handles modCount
        entry.setValue(newValue);
    }

moveToMRU((LinkEntry<K, V>) entry); // handles modCount

上面get方法有讲，更新了链表的指针，新添加的元素在双链表的header前一个位置，仅在元素存在的时候，while循环才生效。

那么新增的元素呢？

下面看重点 addMapping(index, hashCode, key, value); 这句代码定义了，容量满了的处理策略。


    
    @Override
    protected void addMapping(final int hashIndex, final int hashCode, final K key, final V value) {
        //容量是否已满
        if (isFull()) {
            LinkEntry<K, V> reuse = header.after;
            boolean removeLRUEntry = false;
            //默认是false
            if (scanUntilRemovable) {
                //这里不知道干啥，难道是以后扩展？
                while (reuse != header && reuse != null) {
                    //removeLRU一定返回true，很奇怪，估计以后扩展用
                    if (removeLRU(reuse)) {
                        removeLRUEntry = true;
                        break;
                    }
                    reuse = reuse.after;
                }
                if (reuse == null) {
                    throw new IllegalStateException(
                        "Entry.after=null, header.after" + header.after + " header.before" + header.before +
                        " key=" + key + " value=" + value + " size=" + size + " maxSize=" + maxSize +
                        " Please check that your keys are immutable, and that you have used synchronization properly." +
                        " If so, then please report this to dev@commons.apache.org as a bug.");
                }
            } else {
                //一定返回true
                removeLRUEntry = removeLRU(reuse);
            }
 
            if (removeLRUEntry) {
                if (reuse == null) {
                    throw new IllegalStateException(
                        "reuse=null, header.after=" + header.after + " header.before" + header.before +
                        " key=" + key + " value=" + value + " size=" + size + " maxSize=" + maxSize +
                        " Please check that your keys are immutable, and that you have used synchronization properly." +
                        " If so, then please report this to dev@commons.apache.org as a bug.");
                }
                reuseMapping(reuse, hashIndex, hashCode, key, value);
            } else {
                super.addMapping(hashIndex, hashCode, key, value);
            }
        } else {
            super.addMapping(hashIndex, hashCode, key, value);
        }
    }
 
    protected boolean removeLRU(final LinkEntry<K, V> entry) {
        return true;
    }

先判断容量


public boolean isFull() {
        return size >= maxSize;
}

未满就直接添加

super.addMapping(hashIndex, hashCode, key, value);


    protected void addMapping(final int hashIndex, final int hashCode, final K key, final V value) {
        modCount++;
        final HashEntry<K, V> entry = createEntry(data[hashIndex], hashCode, key, value);
        addEntry(entry, hashIndex);
        size++;
        checkCapacity();
    }

//这里调用了AbstractLinkedMap的方法

addEntry(entry, hashIndex);


    
    @Override
    protected void addEntry(final HashEntry<K, V> entry, final int hashIndex) {
        final LinkEntry<K, V> link = (LinkEntry<K, V>) entry;
        link.after  = header;
        link.before = header.before;
        header.before.after = link;
        header.before = link;
        data[hashIndex] = link;
    }

放在header的前一个位置，最早的元素链接到header。

双向环回链表。

如果容量满了，执行LRU算法 reuseMapping(reuse, hashIndex, hashCode, key, value);


    
    protected void reuseMapping(final LinkEntry<K, V> entry, final int hashIndex, final int hashCode,
                                final K key, final V value) {
        // find the entry before the entry specified in the hash table
        // remember that the parameters (except the first) refer to the new entry,
        // not the old one
        try {
            //要干掉的元素下标
            final int removeIndex = hashIndex(entry.hashCode, data.length);
            final HashEntry<K, V>[] tmp = data;  // may protect against some sync issues
            HashEntry<K, V> loop = tmp[removeIndex];
            HashEntry<K, V> previous = null;
            //避免已经被删除
            while (loop != entry && loop != null) {
                previous = loop;
                loop = loop.next;
            }
            //如果被其他线程删除，抛异常
            if (loop == null) {
                throw new IllegalStateException(
                    "Entry.next=null, data[removeIndex]=" + data[removeIndex] + " previous=" + previous +
                    " key=" + key + " value=" + value + " size=" + size + " maxSize=" + maxSize +
                    " Please check that your keys are immutable, and that you have used synchronization properly." +
                    " If so, then please report this to dev@commons.apache.org as a bug.");
            }
 
            // reuse the entry
            modCount++;
            //双链表移除旧元素,AbstractHashedMap移除旧元素
            removeEntry(entry, removeIndex, previous);
            //复用移除的对象，减少创建对象和GC；增加AbstractHashedMap单链表next指向
            reuseEntry(entry, hashIndex, hashCode, key, value);
            //复用的元素加AbstractLinkedMap双链表和AbstractHashedMap单链表
            addEntry(entry, hashIndex);
        } catch (final NullPointerException ex) {
            throw new IllegalStateException(
                    "NPE, entry=" + entry + " entryIsHeader=" + (entry==header) +
                    " key=" + key + " value=" + value + " size=" + size + " maxSize=" + maxSize +
                    " Please check that your keys are immutable, and that you have used synchronization properly." +
                    " If so, then please report this to dev@commons.apache.org as a bug.");
        }
    }


    
    @Override
    protected void removeEntry(final HashEntry<K, V> entry, final int hashIndex, final HashEntry<K, V> previous) {
        final LinkEntry<K, V> link = (LinkEntry<K, V>) entry;
        link.before.after = link.after;
        link.after.before = link.before;
        link.after = null;
        link.before = null;
        super.removeEntry(entry, hashIndex, previous);
    }
 
    
    protected void removeEntry(final HashEntry<K, V> entry, final int hashIndex, final HashEntry<K, V> previous) {
        if (previous == null) {
            data[hashIndex] = entry.next;
        } else {
            previous.next = entry.next;
        }
    }
 
    
    protected void reuseEntry(final HashEntry<K, V> entry, final int hashIndex, final int hashCode,
                              final K key, final V value) {
        entry.next = data[hashIndex];
        entry.hashCode = hashCode;
        entry.key = key;
        entry.value = value;
    }
 
    
    @Override
    protected void addEntry(final HashEntry<K, V> entry, final int hashIndex) {
        final LinkEntry<K, V> link = (LinkEntry<K, V>) entry;
        link.after  = header;
        link.before = header.before;
        header.before.after = link;
        header.before = link;
        data[hashIndex] = link;
    }