UUID

我们平时在使用 UUID 的时候觉得非常简单,甚至很多人觉得这没什么技术含量。

那么深入思考一层,UUID 的实现原理是什么?

源码

类声明

public final class UUID implements java.io.Serializable, Comparable<UUID> {

    /**
     * Explicit serialVersionUID for interoperability.
     */
    private static final long serialVersionUID = -4856846361193249489L;
}

实现了两个常见的接口。

内部变量

 /*
 * The most significant 64 bits of this UUID.
 *
 * @serial
 */
private final long mostSigBits;
/*
 * The least significant 64 bits of this UUID.
 *
 * @serial
 */
private final long leastSigBits;
/*
 * The random number generator used by this class to create random
 * based UUIDs. In a holder class to defer initialization until needed.
 */
private static class Holder {
    static final SecureRandom numberGenerator = new SecureRandom();
}

前面2个时位定义。

下面一个 Holder 熟悉单例模式的都知道,这是通过内部静态类,达到单例效果的一种写法。

构造器

/**
 * Constructs a new {@code UUID} using the specified data.  {@code
 * mostSigBits} is used for the most significant 64 bits of the {@code
 * UUID} and {@code leastSigBits} becomes the least significant 64 bits of
 * the {@code UUID}.
 *
 * @param  mostSigBits
 *         The most significant bits of the {@code UUID}
 *
 * @param  leastSigBits
 *         The least significant bits of the {@code UUID}
 */
public UUID(long mostSigBits, long leastSigBits) {
    this.mostSigBits = mostSigBits;
    this.leastSigBits = leastSigBits;
}

这个构造器非常简单,就是做了下基本属性的初始化。

还有一个私有的构造器,内容如下:

/*
 * Private constructor which uses a byte array to construct the new UUID.
 */
private UUID(byte[] data) {
    long msb = 0;
    long lsb = 0;
    assert data.length == 16 : "data must be 16 bytes in length";
    for (int i=0; i<8; i++)
        msb = (msb << 8) | (data[i] & 0xff);
    for (int i=8; i<16; i++)
        lsb = (lsb << 8) | (data[i] & 0xff);
    this.mostSigBits = msb;
    this.leastSigBits = lsb;
}

这个方法是直接根据 byte[] 数组内容初始化内部变量。

randomUUID()

这个方法是我们平时用的最多的

UUID.randomUUID().toString()

下面我们一起来看一下这个 randomUUID() 的真面目

/**
 * Static factory to retrieve a type 4 (pseudo randomly generated) UUID.
 *
 * The {@code UUID} is generated using a cryptographically strong pseudo
 * random number generator.
 *
 * @return  A randomly generated {@code UUID}
 */
public static UUID randomUUID() {
    SecureRandom ng = Holder.numberGenerator;
    byte[] randomBytes = new byte[16];
    ng.nextBytes(randomBytes);
    randomBytes[6]  &= 0x0f;  /* clear version        */
    randomBytes[6]  |= 0x40;  /* set to version 4     */
    randomBytes[8]  &= 0x3f;  /* clear variant        */
    randomBytes[8]  |= 0x80;  /* set to IETF variant  */
    return new UUID(randomBytes);
}

这里首先通过前面的内部静态类,获取一个单例的随机数生成器。

然后做了 2 个位置的内容变换处理,这个需要一定的位运算知识。

最后在调用前面提到的构造器。

nameUUIDFromBytes(byte[])

/**
 * Static factory to retrieve a type 3 (name based) {@code UUID} based on
 * the specified byte array.
 *
 * @param  name
 *         A byte array to be used to construct a {@code UUID}
 *
 * @return  A {@code UUID} generated from the specified array
 */
public static UUID nameUUIDFromBytes(byte[] name) {
    MessageDigest md;
    try {
        md = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException nsae) {
        throw new InternalError("MD5 not supported", nsae);
    }
    byte[] md5Bytes = md.digest(name);
    md5Bytes[6]  &= 0x0f;  /* clear version        */
    md5Bytes[6]  |= 0x30;  /* set to version 3     */
    md5Bytes[8]  &= 0x3f;  /* clear variant        */
    md5Bytes[8]  |= 0x80;  /* set to IETF variant  */
    return new UUID(md5Bytes);
}

这里是初始化了一个 MD5 的算法类。

后面的处理和上一个方法就类似了。

fromString(String)

源码如下:

/**
 * Creates a {@code UUID} from the string standard representation as
 * described in the {@link #toString} method.
 *
 * @param  name
 *         A string that specifies a {@code UUID}
 *
 * @return  A {@code UUID} with the specified value
 *
 * @throws  IllegalArgumentException
 *          If name does not conform to the string representation as
 *          described in {@link #toString}
 *
 */
public static UUID fromString(String name) {
    String[] components = name.split("-");
    if (components.length != 5)
        throw new IllegalArgumentException("Invalid UUID string: "+name);
    for (int i=0; i<5; i++)
        components[i] = "0x"+components[i];
    long mostSigBits = Long.decode(components[0]).longValue();
    mostSigBits <<= 16;
    mostSigBits |= Long.decode(components[1]).longValue();
    mostSigBits <<= 16;
    mostSigBits |= Long.decode(components[2]).longValue();
    long leastSigBits = Long.decode(components[3]).longValue();
    leastSigBits <<= 48;
    leastSigBits |= Long.decode(components[4]).longValue();
    return new UUID(mostSigBits, leastSigBits);
}

这个方法我觉得应该和 toString() 结合起来看,这2个方法应该互为逆过程。

toString()

   /**
     * Returns a {@code String} object representing this {@code UUID}.
     *
     * <p> The UUID string representation is as described by this BNF:
     * <blockquote><pre>
     * {@code
     * UUID                   = <time_low> "-" <time_mid> "-"
     *                          <time_high_and_version> "-"
     *                          <variant_and_sequence> "-"
     *                          <node>
     * time_low               = 4*<hexOctet>
     * time_mid               = 2*<hexOctet>
     * time_high_and_version  = 2*<hexOctet>
     * variant_and_sequence   = 2*<hexOctet>
     * node                   = 6*<hexOctet>
     * hexOctet               = <hexDigit><hexDigit>
     * hexDigit               =
     *       "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
     *       | "a" | "b" | "c" | "d" | "e" | "f"
     *       | "A" | "B" | "C" | "D" | "E" | "F"
     * }</pre></blockquote>
     *
     * @return  A string representation of this {@code UUID}
     */
    public String toString() {
        return (digits(mostSigBits >> 32, 8) + "-" +
                digits(mostSigBits >> 16, 4) + "-" +
                digits(mostSigBits, 4) + "-" +
                digits(leastSigBits >> 48, 4) + "-" +
                digits(leastSigBits, 12));
    }
  • digits()

这里就是把数字转换为 16 进制

/** Returns val represented by the specified number of hex digits. */
private static String digits(long val, int digits) {
    long hi = 1L << (digits * 4);
    return Long.toHexString(hi | (val & (hi - 1))).substring(1);
}

感触

看完之后,是不是觉得并没有得到精髓呢?

我觉得算法的原理不清楚,单独看了下源码,实际上有些问题还是没有解决。

  • 为什么 UUID 可以保证唯一性?

  • 如何实现自己的,位数更少的 id 算法?

关于这部分将在其他文章中进行补充完善。

拓展阅读

java 位运算

参考资料

jdk8 源码