
Example 1 with HashFunction

Use of HashFunction in project flink by apache.

In the class StreamGraphHasherV1, method traverseStreamGraphAndGenerateHashes.

public Map<Integer, byte[]> traverseStreamGraphAndGenerateHashes(StreamGraph streamGraph) {
    // The hash function used to generate the hash
    final HashFunction hashFunction = Hashing.murmur3_128(0);
    final Map<Integer, byte[]> hashes = new HashMap<>();
    Set<Integer> visited = new HashSet<>();
    Queue<StreamNode> remaining = new ArrayDeque<>();
    // We need to make the source order deterministic. The source IDs are
    // not returned in the same order, which means that submitting the same
    // program twice might result in different traversal, which breaks the
    // deterministic hash assignment.
    List<Integer> sources = new ArrayList<>();
    for (Integer sourceNodeId : streamGraph.getSourceIDs()) {
        sources.add(sourceNodeId);
    }
    Collections.sort(sources);
    // Start with source nodes
    for (Integer sourceNodeId : sources) {
        remaining.add(streamGraph.getStreamNode(sourceNodeId));
        visited.add(sourceNodeId);
    }
    StreamNode currentNode;
    while ((currentNode = remaining.poll()) != null) {
        // Generate the hash code. This fails if not all inputs of the
        // node have been hashed yet.
        if (generateNodeHash(currentNode, hashFunction, hashes, streamGraph.isChainingEnabled())) {
            // Add the child nodes
            for (StreamEdge outEdge : currentNode.getOutEdges()) {
                StreamNode child = outEdge.getTargetVertex();
                if (!visited.contains(child.getId())) {
                    remaining.add(child);
                    visited.add(child.getId());
                }
            }
        } else {
            // We will revisit this later.
            visited.remove(currentNode.getId());
        }
    }
    return hashes;
}
Also used: HashFunction, HashMap (java.util.HashMap), ArrayList (java.util.ArrayList), StreamEdge (org.apache.flink.streaming.api.graph.StreamEdge), StreamNode (org.apache.flink.streaming.api.graph.StreamNode), ArrayDeque (java.util.ArrayDeque), HashSet (java.util.HashSet)

Example 2 with HashFunction

Use of HashFunction in project flink by apache.

In the class StreamGraphHasherV2, method traverseStreamGraphAndGenerateHashes.

/**
 * Returns a map with a hash for each {@link StreamNode} of the {@link
 * StreamGraph}. The hash is used as the {@link JobVertexID} in order to
 * identify nodes across job submissions if they didn't change.
 *
 * <p>The complete {@link StreamGraph} is traversed. The hash is either
 * computed from the transformation's user-specified id (see
 * {@link StreamTransformation#getUid()}) or generated in a deterministic way.
 *
 * <p>The generated hash is deterministic with respect to:
 * <ul>
 * <li>node-local properties (like parallelism, UDF, node ID),
 * <li>chained output nodes, and
 * <li>input nodes hashes
 * </ul>
 *
 * @return A map from {@link StreamNode#id} to hash as 16-byte array.
 */
public Map<Integer, byte[]> traverseStreamGraphAndGenerateHashes(StreamGraph streamGraph) {
    // The hash function used to generate the hash
    final HashFunction hashFunction = Hashing.murmur3_128(0);
    final Map<Integer, byte[]> hashes = new HashMap<>();
    Set<Integer> visited = new HashSet<>();
    Queue<StreamNode> remaining = new ArrayDeque<>();
    // We need to make the source order deterministic. The source IDs are
    // not returned in the same order, which means that submitting the same
    // program twice might result in different traversal, which breaks the
    // deterministic hash assignment.
    List<Integer> sources = new ArrayList<>();
    for (Integer sourceNodeId : streamGraph.getSourceIDs()) {
        sources.add(sourceNodeId);
    }
    Collections.sort(sources);
    // Start with source nodes
    for (Integer sourceNodeId : sources) {
        remaining.add(streamGraph.getStreamNode(sourceNodeId));
        visited.add(sourceNodeId);
    }
    StreamNode currentNode;
    while ((currentNode = remaining.poll()) != null) {
        // Generate the hash code. This fails if not all inputs of the
        // node have been hashed yet.
        if (generateNodeHash(currentNode, hashFunction, hashes, streamGraph.isChainingEnabled())) {
            // Add the child nodes
            for (StreamEdge outEdge : currentNode.getOutEdges()) {
                StreamNode child = outEdge.getTargetVertex();
                if (!visited.contains(child.getId())) {
                    remaining.add(child);
                    visited.add(child.getId());
                }
            }
        } else {
            // We will revisit this later.
            visited.remove(currentNode.getId());
        }
    }
    return hashes;
}
Also used: HashFunction, HashMap (java.util.HashMap), ArrayList (java.util.ArrayList), ArrayDeque (java.util.ArrayDeque), HashSet (java.util.HashSet)
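
The traversal in Examples 1 and 2 can be tried without Flink or Guava. What follows is a minimal sketch, not Flink's implementation: the class DeterministicHashSketch, the method hashAll and the graph format (node id mapped to the ids of the nodes it feeds) are hypothetical, and a plain int combination stands in for murmur3_128. It shows the two ideas the Javadoc describes: sources are visited in sorted order, and a node is hashed only after all of its inputs have hashes, otherwise it is revisited later.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeMap;

final class DeterministicHashSketch {

    /** Hypothetical graph format: node id -> ids of the nodes it feeds. */
    static Map<Integer, Integer> hashAll(Map<Integer, List<Integer>> outputs) {
        // Sort by node id so the result does not depend on map iteration order.
        Map<Integer, List<Integer>> sorted = new TreeMap<>(outputs);

        // Derive each node's inputs from the output lists.
        Map<Integer, List<Integer>> inputs = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : sorted.entrySet()) {
            for (Integer child : e.getValue()) {
                inputs.computeIfAbsent(child, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Sources are the nodes without inputs, in ascending id order.
        List<Integer> sources = new ArrayList<>();
        for (Integer id : sorted.keySet()) {
            if (!inputs.containsKey(id)) {
                sources.add(id);
            }
        }

        Map<Integer, Integer> hashes = new LinkedHashMap<>();
        Set<Integer> visited = new HashSet<>(sources);
        Queue<Integer> remaining = new ArrayDeque<>(sources);
        Integer current;
        while ((current = remaining.poll()) != null) {
            if (tryHash(current, inputs, hashes)) {
                // Queue every child that has not been queued yet.
                for (Integer child : sorted.getOrDefault(current, Collections.emptyList())) {
                    if (visited.add(child)) {
                        remaining.add(child);
                    }
                }
            } else {
                // An input is still missing; allow the node to be queued again.
                visited.remove(current);
            }
        }
        return hashes;
    }

    /** Hashes the node if every input already has a hash; returns false otherwise. */
    private static boolean tryHash(Integer node,
                                   Map<Integer, List<Integer>> inputs,
                                   Map<Integer, Integer> hashes) {
        // Stand-in for node-local properties: the order in which nodes get hashed.
        int h = 31 * 17 + hashes.size();
        for (Integer in : inputs.getOrDefault(node, Collections.emptyList())) {
            Integer inHash = hashes.get(in);
            if (inHash == null) {
                return false;
            }
            h = 31 * h + inHash;
        }
        hashes.put(node, h);
        return true;
    }
}
```

For the graph 1 -> {2, 4}, 2 -> {3}, 3 -> {4}, node 4 is polled before node 3 has a hash, so it is dropped from the visited set and queued again once node 3 has been hashed.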

Example 3 with HashFunction

Use of HashFunction in project hive by apache.

In the class TestMurmur3, method testHashCodesM3_128_double.

@Test
public void testHashCodesM3_128_double() {
    int seed = 123;
    Random rand = new Random(seed);
    HashFunction hf = Hashing.murmur3_128(seed);
    for (int i = 0; i < 1000; i++) {
        double val = rand.nextDouble();
        byte[] data = ByteBuffer.allocate(8).putDouble(val).array();
        // guava stores the hashcodes in little endian order
        ByteBuffer buf = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        // copy the guava hash into the buffer, then rewind it for reading
        buf.put(hf.hashBytes(data).asBytes());
        buf.flip();
        long gl1 = buf.getLong();
        long gl2 = buf.getLong(8);
        long[] hc = Murmur3.hash128(data, 0, data.length, seed);
        long m1 = hc[0];
        long m2 = hc[1];
        assertEquals(gl1, m1);
        assertEquals(gl2, m2);
    }
}
Also used: Random (java.util.Random), HashFunction, ByteBuffer (java.nio.ByteBuffer), Test (org.junit.Test)
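
The buffer handling in this test can be looked at in isolation with the JDK alone. The sketch below is illustrative only: the class LittleEndianLongs and the method toLongs are hypothetical names, and the 16 input bytes are assumed to come from a 128-bit hash such as the one the test obtains from Guava. It shows why the buffer has to be filled and flipped before reading, and how the relative getLong() and the absolute getLong(8) pick up the two halves.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class LittleEndianLongs {

    /** Splits exactly 16 bytes into two longs, reading each half as little-endian. */
    static long[] toLongs(byte[] hash16) {
        ByteBuffer buf = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(hash16);              // write the bytes; position is now 16
        buf.flip();                   // limit = 16, position = 0, ready for reading
        long first = buf.getLong();   // relative read of bytes 0..7
        long second = buf.getLong(8); // absolute read of bytes 8..15
        return new long[] {first, second};
    }
}
```

With the bytes 01 00 00 00 00 00 00 00 followed by 02 00 00 00 00 00 00 00, the method returns 1 and 2, because the least significant byte of each half comes first.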

Example 4 with HashFunction

Use of HashFunction in project hive by apache.

In the class TestMurmur3, method testHashCodesM3_128_ints.

@Test
public void testHashCodesM3_128_ints() {
    int seed = 123;
    Random rand = new Random(seed);
    HashFunction hf = Hashing.murmur3_128(seed);
    for (int i = 0; i < 1000; i++) {
        int val = rand.nextInt();
        byte[] data = ByteBuffer.allocate(4).putInt(val).array();
        // guava stores the hashcodes in little endian order
        ByteBuffer buf = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        // copy the guava hash into the buffer, then rewind it for reading
        buf.put(hf.hashBytes(data).asBytes());
        buf.flip();
        long gl1 = buf.getLong();
        long gl2 = buf.getLong(8);
        long[] hc = Murmur3.hash128(data, 0, data.length, seed);
        long m1 = hc[0];
        long m2 = hc[1];
        assertEquals(gl1, m1);
        assertEquals(gl2, m2);
        // hashing the same bytes at an offset must give the same result
        byte[] offsetData = new byte[data.length + 50];
        System.arraycopy(data, 0, offsetData, 50, data.length);
        hc = Murmur3.hash128(offsetData, 50, data.length, seed);
        assertEquals(gl1, hc[0]);
        assertEquals(gl2, hc[1]);
    }
}
Also used: Random (java.util.Random), HashFunction, ByteBuffer (java.nio.ByteBuffer), Test (org.junit.Test)
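
The second half of this test checks that hashing a slice of a larger array gives the same result as hashing a standalone copy of those bytes. The same property can be checked with the JDK alone. In the sketch below, java.util.zip.CRC32 is swapped in for Murmur3 purely so the code runs without extra dependencies; the class OffsetHashSketch and the method checksum are hypothetical names.

```java
import java.util.zip.CRC32;

final class OffsetHashSketch {

    /** Checksums length bytes of data starting at offset (CRC32 as a stand-in for Murmur3). */
    static long checksum(byte[] data, int offset, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, offset, length);
        return crc.getValue();
    }
}
```

Copying a 4-byte array to position 50 of a 54-byte array and calling checksum(padded, 50, 4) should match checksum(data, 0, 4), mirroring the offsetData assertions above.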

Example 5 with HashFunction

Use of HashFunction in project hive by apache.

In the class TestMurmur3, method testHashCodesM3_32_string.

@Test
public void testHashCodesM3_32_string() {
    String key = "test";
    int seed = 123;
    HashFunction hf = Hashing.murmur3_32(seed);
    int hc1 = hf.hashBytes(key.getBytes()).asInt();
    int hc2 = Murmur3.hash32(key.getBytes(), key.getBytes().length, seed);
    assertEquals(hc1, hc2);
    key = "testkey";
    hc1 = hf.hashBytes(key.getBytes()).asInt();
    hc2 = Murmur3.hash32(key.getBytes(), key.getBytes().length, seed);
    assertEquals(hc1, hc2);
}
Also used: HashFunction, Test (org.junit.Test)

