Darts-clone Q&A

Q: Darts-clone uses 8 bits to store a label and 1 bit to store a flag (has_leaf). The array size limit is really 2^29?

Darts-clone uses 21 bits to represent a relative offset. The remaining 1 bit is used to extend the limit. If the bit is 1, an offset value (represented by 21 bits) is shifted left by 8. As the result, the limit is 2^29.

Q: The structure seems to be a trie. A DAWG is just used in intermediate steps?

DoubleArrayImpl can represent not only a trie but also a DAWG.

Darts-clone stores values as a part of DAWG. If there are no duplicate values, there are no merge-able subtrees.

Darts-clone uses unique IDs as values if values are not specified. In this case, there are no duplicate values and Darts-clone directly builds a trie (not a DAWG) from a keyset because it is faster.

Otherwise, Darts-clone builds a DAWG and then builds a double-array from it.