Ambiguous encoding in pdatautil.MapHash that can lead to collisions #47241

@JakeDern

What happened?

Description

The encoding scheme used for ValueTypeBytes in pdatautil/hash.go is ambiguous because it does not include a length. As a result, two different maps can produce the same encoding and therefore hash to the same value, and it is easy to manufacture such collisions.

Writing a length prefix should fix the issue so that the end of the byte array is unambiguous.

Steps to Reproduce

Here are two different maps that encode to the same value and a unit test to go along with it:

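{ a = bytes([F4 62 F6 BB]) }           => F4 61 F6 F4 62 F6 BB
{ a = bytes([]), b = bytes([BB]) }     => F4 61 F6 F4 62 F6 BB

package pdatautil

import (
	"testing"

	"github.com/stretchr/testify/assert"
	"go.opentelemetry.io/collector/pdata/pcommon"
)

func TestBytesCollision(t *testing.T) {
	t.Parallel()

	// m1 holds one bytes value whose content mimics the encoding of a second key/value pair.
	m1 := pcommon.NewMap()
	m1A := m1.PutEmptyBytes("a")
	m1A.Append(0xF4, 0x62, 0xF6, 0xBB)

	// m2 holds an empty bytes value under "a" and a one-byte value under "b".
	m2 := pcommon.NewMap()
	_ = m2.PutEmptyBytes("a")
	m2B := m2.PutEmptyBytes("b")
	m2B.Append(0xBB)

	m1Hash := MapHash(m1)
	m2Hash := MapHash(m2)

	// Distinct maps should hash differently; this assertion currently fails.
	assert.NotEqual(t, m1Hash, m2Hash, "Hash collision")
}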

Expected Result

The unit test passes.

Actual Result

The unit test fails.

--- FAIL: TestBytesCollision (0.00s)
    hash_test.go:318:
                Error Trace:    /home/jakedern/repos/opentelemetry-collector-contrib/pkg/pdatautil/hash_test.go:318
                Error:          Should not be: [16]uint8{0x98, 0x60, 0x0, 0x28, 0x6a, 0xaa, 0x36, 0xc6, 0x4c, 0xa0, 0x4e, 0x9d, 0xa7, 0x1c, 0x3f, 0x43}
                Test:           TestBytesCollision
                Messages:       Hash collision
FAIL
exit status 1
FAIL    github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatautil 0.002s

Collector version

NA
