dockerhub analyzer #3861

kashifkhan0771 · 2025-01-31T10:09:27Z

Description:

This PR adds a new analyzer for dockerhub.
Note: We only detect dockerhub PAT(personal access token), and the scope for dockrehub PAT is only for dockerhub repositories.

Checklist:

Tests passing (make test-community)?
Lint passing (make lint this requires golangci-lint)?

ahrav

Looks great, nice job!

ahrav · 2025-02-09T18:23:53Z

pkg/detectors/dockerhub/v1/dockerhub_integration_test.go

@@ -116,7 +116,7 @@ func TestDockerhub_FromChunk(t *testing.T) {
 				got[i].Raw = nil
 				got[i].RawV2 = nil
 				got[i].ExtraData = nil
-			}
+			},


Looks like this slipped in.

ahrav · 2025-02-09T19:16:07Z

pkg/analyzer/analyzers/dockerhub/helper.go

+}
+
+// assignHighestPermission selects the highest available permission
+func assignHighestPermission(permissions []string) string {


suggestion (optional): Depending on the size of the permissions list, we can improve performance by avoiding map creation within this function for each permission. Instead, we can leverage the predefined ranking by creating a package-level ranking map computed once. This way, the function only iterates over the input list once. This approach adds some complexity, so I’ve left it as optional. Use your judgment, as you're likely more familiar with this code.

Ex:

// Precompute a ranking map for the ranking approach. // Lower index means higher permission. var permissionRank = func() map[string]int { rank := make(map[string]int, len(permissionHierarchy)) for i, perm := range permissionHierarchy { rank[perm] = i } return rank }() func assignHighestPermissionRank(permissions []string) string { bestRank := len(permissionHierarchy) bestPerm := "" for _, perm := range permissions { if rank, ok := permissionRank[perm]; ok { // Early exit if highest permission is found. if rank == 0 { return perm } if rank < bestRank { bestRank = rank bestPerm = perm } } } return bestPerm }

Benchmark Results for assignHighestPermission

These benchmarks compare the old (set-based) and new (ranking with early exit) implementations under two scenarios:

Best-case: The highest permission (repo:admin) appears at the beginning.

Worst-case: The highest permission is absent until the very end (or missing altogether).

The tests were run on:

OS: Darwin

Arch: arm64

CPU: Apple M1 Pro

Package: github.com/trufflesecurity/trufflehog/v3/pkg/analyzer/analyzers/dockerhub

Runtime (sec/op)

Test Case Old (sec/op) New (sec/op) Change

size=10, best-case 197.550n ± 1% 3.371n ± 0% -98.29% (p=0.000 n=10)

size=10, worst-case 223.40n ± 0% 66.80n ± 1% -70.10% (p=0.000 n=10)

size=100, best-case 1495.000n ± 1% 3.358n ± 0% -99.78% (p=0.000 n=10)

size=100, worst-case 1558.5n ± 7% 548.9n ± 0% -64.78% (p=0.000 n=10)

size=1000, best-case 16555.500n ± 6% 3.371n ± 0% -99.98% (p=0.000 n=10)

size=1000, worst-case 16.284µ ± 1% 5.386µ ± 6% -66.92% (p=0.000 n=10)

size=10000, best-case 120032.500n ± 1% 3.402n ± 3% -100.00% (p=0.000 n=10)

size=10000, worst-case 120.23µ ± 3% 55.33µ ± 3% -53.98% (p=0.000 n=10)

size=100000, best-case 949148.500n ± 5% 3.372n ± 3% -100.00% (p=0.000 n=10)

size=100000, worst-case 943.6µ ± 0% 546.3µ ± 2% -42.10% (p=0.000 n=10)

Geomean 14.30µ 138.6n -99.03%

Memory Usage (B/op)

Test Case Old (B/op) New (B/op) Change

size=10, best-case 320.0 ± 0% 0.0 ± 0% -100.00% (p=0.000 n=10)

size=10, worst-case 320.0 ± 0% 0.0 ± 0% -100.00% (p=0.000 n=10)

size=100, best-case 2.648Ki ± 0% 0.000Ki ± 0% -100.00% (p=0.000 n=10)

size=100, worst-case 2.648Ki ± 0% 0.000Ki ± 0% -100.00% (p=0.000 n=10)

size=1000, best-case 48.02Ki ± 0% 0.00Ki ± 0% -100.00% (p=0.000 n=10)

size=1000, worst-case 48.02Ki ± 0% 0.00Ki ± 0% -100.00% (p=0.000 n=10)

size=10000, best-case 328.0Ki ± 0% 0.0Ki ± 0% -100.00% (p=0.000 n=10)

size=10000, worst-case 328.0Ki ± 0% 0.0Ki ± 0% -100.00% (p=0.000 n=10)

size=100000, best-case 2.523Mi ± 0% 0.000Mi ± 0% -100.00% (p=0.000 n=10)

size=100000, worst-case 2.523Mi ± 0% 0.000Mi ± 0% -100.00% (p=0.000 n=10)

Geomean 32.03Ki ? (Not computed)

Allocations (allocs/op)

Test Case Old (allocs/op) New (allocs/op) Change

size=10, best-case 1.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=10, worst-case 1.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=100, best-case 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=100, worst-case 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=1000, best-case 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=1000, worst-case 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=10000, best-case 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=10000, worst-case 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=100000, best-case 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

size=100000, worst-case 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)

Geomean 1.741 ? * (Not computed)*

Here are the benchmarks if you were curious.

Your approach looks Solid! 🚀
I added more comments for explanation and applied it.

ahrav · 2025-02-09T19:20:52Z

pkg/analyzer/analyzers/dockerhub/requests.go

+		}
+
+		if errorLogin.Login2FAToken != "" {
+			// TODO: handle it more appropriately


question: Just confirming this TODO is valid, and not necessary for this initial implementation?

Yup, it is working. If 2FA is enabled for account we return error. Added TODO just in case we figure out a more better approach in future to handle these scenarios.

ahrav · 2025-02-09T19:27:13Z

pkg/analyzer/analyzers/dockerhub/dockerhub.go

+	// create the http client
+	client := analyzers.NewAnalyzeClientUnrestricted(cfg) // /user/login is a non-safe request
+
+	var secretInfo = &SecretInfo{}


optional: This is purely a personal preference, so feel free to disregard. When passing a pointer to functions responsible for populating a struct, I find it clearer to declare the variable without &. This forces the reviewer to recognize that the pointer is being passed explicitly, indicating that we're modifying a pre-existing struct. Again, this is entirely optional.

Ex:

var secretInfo SecretInfo ... if err := decodeTokenToSecretInfo(token, &secretInfo); err != nil { ... } if err := fetchRepositories(client, username, token, &secretInfo); err != nil { ... }

ahrav · 2025-02-09T19:29:22Z

pkg/analyzer/analyzers/dockerhub/dockerhub.go

+
+	result := analyzers.AnalyzerResult{
+		AnalyzerType: analyzers.AnalyzerTypeDockerHub,
+		Metadata:     map[string]any{},


nit: We can probably just set the Metadata field directly in here.

Ex:

Metadata: map[string]any{"Valid_Key": info.Valid}

ahrav · 2025-02-09T19:30:15Z

pkg/analyzer/analyzers/dockerhub/dockerhub.go

+	result := analyzers.AnalyzerResult{
+		AnalyzerType: analyzers.AnalyzerTypeDockerHub,
+		Metadata:     map[string]any{},
+		Bindings:     make([]analyzers.Binding, 0),


nit: We can pre-allocate here.

Ex:

Bindings: make([]analyzers.Binding, 0, len(info.Repositories))

kashifkhan0771 changed the title ~~dockerhub analyzer~~ dockerhub analyzer [WIP] Jan 31, 2025

kashifkhan0771 marked this pull request as ready for review January 31, 2025 11:27

kashifkhan0771 requested review from a team as code owners January 31, 2025 11:27

kashifkhan0771 changed the title ~~dockerhub analyzer [WIP]~~ dockerhub analyzer Jan 31, 2025

ahrav approved these changes Feb 9, 2025

View reviewed changes

kashifkhan0771 added 3 commits February 10, 2025 11:45

dockerhub analyzer

ecdaaec

fixed linter

879ca8c

added test cases

d711389

kashifkhan0771 force-pushed the feat/oss-96-dockerhub-analyzer branch from 5e72fa5 to d711389 Compare February 10, 2025 06:45

kashifkhan0771 and others added 3 commits February 10, 2025 12:24

resolved comments

c61c561

Merge branch 'main' into feat/oss-96-dockerhub-analyzer

5cd80c0

Merge branch 'main' into feat/oss-96-dockerhub-analyzer

6ce48f3

ahrav merged commit b10342e into trufflesecurity:main Feb 11, 2025
13 checks passed

kashifkhan0771 deleted the feat/oss-96-dockerhub-analyzer branch February 12, 2025 04:36

blsaccess mentioned this pull request Feb 13, 2025

Update trufflehog to 3.88.9 blacklanternsecurity/bbot#2251

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dockerhub analyzer #3861

dockerhub analyzer #3861

kashifkhan0771 commented Jan 31, 2025 •

edited

Loading

ahrav left a comment

ahrav Feb 9, 2025

ahrav Feb 9, 2025

kashifkhan0771 Feb 10, 2025

ahrav Feb 9, 2025

kashifkhan0771 Feb 10, 2025

ahrav Feb 9, 2025

ahrav Feb 9, 2025

ahrav Feb 9, 2025

Test Case	Old (sec/op)	New (sec/op)	Change
size=10, best-case	197.550n ± 1%	3.371n ± 0%	-98.29% (p=0.000 n=10)
size=10, worst-case	223.40n ± 0%	66.80n ± 1%	-70.10% (p=0.000 n=10)
size=100, best-case	1495.000n ± 1%	3.358n ± 0%	-99.78% (p=0.000 n=10)
size=100, worst-case	1558.5n ± 7%	548.9n ± 0%	-64.78% (p=0.000 n=10)
size=1000, best-case	16555.500n ± 6%	3.371n ± 0%	-99.98% (p=0.000 n=10)
size=1000, worst-case	16.284µ ± 1%	5.386µ ± 6%	-66.92% (p=0.000 n=10)
size=10000, best-case	120032.500n ± 1%	3.402n ± 3%	-100.00% (p=0.000 n=10)
size=10000, worst-case	120.23µ ± 3%	55.33µ ± 3%	-53.98% (p=0.000 n=10)
size=100000, best-case	949148.500n ± 5%	3.372n ± 3%	-100.00% (p=0.000 n=10)
size=100000, worst-case	943.6µ ± 0%	546.3µ ± 2%	-42.10% (p=0.000 n=10)
Geomean	14.30µ	138.6n	-99.03%

Test Case	Old (B/op)	New (B/op)	Change
size=10, best-case	320.0 ± 0%	0.0 ± 0%	-100.00% (p=0.000 n=10)
size=10, worst-case	320.0 ± 0%	0.0 ± 0%	-100.00% (p=0.000 n=10)
size=100, best-case	2.648Ki ± 0%	0.000Ki ± 0%	-100.00% (p=0.000 n=10)
size=100, worst-case	2.648Ki ± 0%	0.000Ki ± 0%	-100.00% (p=0.000 n=10)
size=1000, best-case	48.02Ki ± 0%	0.00Ki ± 0%	-100.00% (p=0.000 n=10)
size=1000, worst-case	48.02Ki ± 0%	0.00Ki ± 0%	-100.00% (p=0.000 n=10)
size=10000, best-case	328.0Ki ± 0%	0.0Ki ± 0%	-100.00% (p=0.000 n=10)
size=10000, worst-case	328.0Ki ± 0%	0.0Ki ± 0%	-100.00% (p=0.000 n=10)
size=100000, best-case	2.523Mi ± 0%	0.000Mi ± 0%	-100.00% (p=0.000 n=10)
size=100000, worst-case	2.523Mi ± 0%	0.000Mi ± 0%	-100.00% (p=0.000 n=10)
Geomean	32.03Ki	?	(Not computed)

Test Case	Old (allocs/op)	New (allocs/op)	Change
size=10, best-case	1.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=10, worst-case	1.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=100, best-case	2.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=100, worst-case	2.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=1000, best-case	2.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=1000, worst-case	2.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=10000, best-case	2.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=10000, worst-case	2.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=100000, best-case	2.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
size=100000, worst-case	2.000 ± 0%	0.000 ± 0%	-100.00% (p=0.000 n=10)
Geomean	1.741	?	* (Not computed)*

dockerhub analyzer #3861

dockerhub analyzer #3861

Conversation

kashifkhan0771 commented Jan 31, 2025 • edited Loading

Description:

Checklist:

ahrav left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Benchmark Results for assignHighestPermission

Runtime (sec/op)

Memory Usage (B/op)

Allocations (allocs/op)

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kashifkhan0771 commented Jan 31, 2025 •

edited

Loading

Benchmark Results for `assignHighestPermission`