Fix in scorer manager in picking the best target #35

mayabar · 2025-04-20T11:41:17Z

No description provided.


vMaroon · 2025-04-20T18:11:49Z

pkg/epp/scheduling/scorer.go

@@ -92,6 +92,7 @@ func (sm *ScorerMng) scoreTargets(ctx *types.Context, pods []*types.PodMetrics)
 		if isFirst {
 			maxScore = score
 			highestScoreTargets = []*types.PodMetrics{pod}
+			isFirst = false


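isFirst can be dropped completely: it does not fix the negative-score case, since the array is not sorted (and sorting it would defeat the point). If you want to account for negative scores, maxScore should be initialized to the lowest possible value, for example math.Inf(-1) or -math.MaxFloat64. math.SmallestNonzeroFloat64 is the smallest positive float64, not the most negative one, so it does not work for this.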

The code block can look like:

	// select pod with maximum score, if more than one with the max score - use random pods from the list
	var highestScoreTargets []*types.PodMetrics
	// score weights could be negative
	maxScore := 0.0
	for pod, score := range podsTotalScore {
		if score > maxScore {
			maxScore = score
			highestScoreTargets = []*types.PodMetrics{pod}
		} else if score == maxScore {
			highestScoreTargets = append(highestScoreTargets, pod)
		}
	}
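
If negative totals do need to be supported, the same loop can be kept and only the starting value changed. The following is a minimal, self-contained sketch under that assumption, not the code from this PR: `Pod` and `highestScoring` are hypothetical stand-ins for `types.PodMetrics` and the selection step in `scoreTargets`.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Pod is a hypothetical stand-in for types.PodMetrics.
type Pod struct{ Name string }

// highestScoring returns every pod that shares the maximum total score.
// maxScore starts at negative infinity, so the first finite score always
// replaces it, even when all scores are negative. No isFirst flag is needed.
func highestScoring(totals map[*Pod]float64) []*Pod {
	maxScore := math.Inf(-1)
	var best []*Pod
	for pod, score := range totals {
		switch {
		case score > maxScore:
			maxScore = score
			best = []*Pod{pod}
		case score == maxScore:
			best = append(best, pod)
		}
	}
	return best
}

func main() {
	a, b, c := &Pod{Name: "a"}, &Pod{Name: "b"}, &Pod{Name: "c"}
	best := highestScoring(map[*Pod]float64{a: -3, b: -1, c: -1})

	// Map iteration order is random, so sort the names before printing.
	names := make([]string, 0, len(best))
	for _, p := range best {
		names = append(names, p.Name)
	}
	sort.Strings(names)
	fmt.Println(names) // prints [b c]
}
```

With `maxScore := 0.0`, the same input would return an empty slice, because no negative score is ever greater than or equal to zero.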

I think negative scores should be prohibited by the documented contract of the Scorer interface (so that the caller does not have to check for them), because allowing them would let such scorers override the scoring of others. For the same reason, scores should be normalized (cc @elevran @nirrozenbaum).

I agree with @vMaroon. Scorers should never return a negative value, by definition.
The "normalized score" should express, as a fraction, how strongly the scorer recommends the pod (e.g., a float in the range 0-1).
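
One way to obtain such a value is min-max scaling of a scorer's raw output. The sketch below is an illustration only, assuming raw scores keyed by pod name; `normalize` is a hypothetical helper, and giving 1 to every pod when all raw scores are equal is an arbitrary choice made for the example.

```go
package main

import (
	"fmt"
	"math"
)

// normalize rescales raw scores into the range [0, 1] using min-max scaling.
// The lowest raw score maps to 0 and the highest maps to 1.
func normalize(raw map[string]float64) map[string]float64 {
	out := make(map[string]float64, len(raw))
	lo, hi := math.Inf(1), math.Inf(-1)
	for _, s := range raw {
		lo = math.Min(lo, s)
		hi = math.Max(hi, s)
	}
	for name, s := range raw {
		if hi == lo {
			// All pods are tied; assumption: treat them as equally recommended.
			out[name] = 1
			continue
		}
		out[name] = (s - lo) / (hi - lo)
	}
	return out
}

func main() {
	raw := map[string]float64{"pod-a": 2, "pod-b": 6, "pod-c": 10}
	fmt.Println(normalize(raw)) // prints map[pod-a:0 pod-b:0.5 pod-c:1]
}
```

Because every scorer then reports on the same 0-1 scale, only the weights decide how much each scorer contributes to the total.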

I recommend reading the NormalizeScore section of the Kubernetes scheduling framework documentation:
https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/

Negative values are useful: for example, preferring the pod with the best cache coverage and the lowest load can be expressed naively as kv - load when both are normalized. However, it is better to express that through weights rather than through the score itself.
The downside is that we need to decide what the value represents. If we assume that Pods with a higher score are preferred, the LoadScorer would rate pods as "1 - load", which gives a lower score to loaded Pods. That would not align with negative weights.
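
The two conventions can be compared side by side. The sketch below is hypothetical and not taken from the PR: `podStats`, `weighted`, `total`, and `best` are illustrative names, and the inputs are assumed to be normalized to the range 0-1. It pairs a raw load score with weight -1, and a "1 - load" score with weight +1, and shows that both rank the pods the same way, since kv + (1 - load) equals (kv - load) + 1.

```go
package main

import (
	"fmt"
	"math"
)

// podStats holds hypothetical inputs, both already normalized to [0, 1].
type podStats struct {
	KV   float64 // cache coverage, higher is better
	Load float64 // load, higher is worse
}

// weighted pairs a score function with the weight applied to its result.
type weighted struct {
	score  func(podStats) float64
	weight float64
}

func kv(p podStats) float64           { return p.KV }
func rawLoad(p podStats) float64      { return p.Load }
func freeCapacity(p podStats) float64 { return 1 - p.Load }

// Convention 1: non-negative scores, with a negative weight on load.
var negativeWeight = []weighted{{kv, 1}, {rawLoad, -1}}

// Convention 2: higher score is always better, and all weights are positive.
var invertedScore = []weighted{{kv, 1}, {freeCapacity, 1}}

// total is the weighted sum of all scorer results for one pod.
func total(p podStats, scorers []weighted) float64 {
	sum := 0.0
	for _, s := range scorers {
		sum += s.weight * s.score(p)
	}
	return sum
}

// best returns the name of the pod with the highest weighted total.
// Ties are broken by map iteration order, which is random in Go.
func best(pods map[string]podStats, scorers []weighted) string {
	name, top := "", math.Inf(-1)
	for n, p := range pods {
		if t := total(p, scorers); t > top {
			name, top = n, t
		}
	}
	return name
}

func main() {
	pods := map[string]podStats{
		"pod-a": {KV: 0.9, Load: 0.8},
		"pod-b": {KV: 0.5, Load: 0.1},
	}
	fmt.Println(best(pods, negativeWeight), best(pods, invertedScore)) // prints pod-b pod-b
}
```

Mixing the two conventions, for example applying weight -1 to a scorer that already returns "1 - load", would invert the preference, which is the misalignment described above.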

mayabar and others added 30 commits April 10, 2025 14:50

Add initial support for scorers, used as part of decision which pod w…

0c95c2a

…ill be the target for a request. Session affinity scorer added

Fixes in scores infrastructure & session aware scorer

bde57da

- Add cleanup for session->pod map

8f9785e

- Rename SessionId to SessionID - Remove datastore from scoreTargets, add datastore to SessionAffinityScorer - Rename ScoredPod to PodScore

Export score and pod for external implementations

52af66f

Rename session id header

90e23bc

Separate code of Scorer interface and scorer implementations + add sc…

c946acc

…orerManager

Remove vllmRequest from scoreTargets API since it exists in the context

137ca09

Support negative score weights

c42f72a

Fix fakeDataStore to be compatible with DataStore intereface

a05a573

- Check for nils in list of available pods in main scoring function o…

aca8e07

…f ScoreMng - If some specific scorer failed to score pods - just log the problem, skip it and continue to the next scorer

[version bump] Promote 0.0.2 to prod, bump dev to 0.0.3

bae2a66

chore: move openshift router deployment to extra

c398f4b

Signed-off-by: Shane Utt <shaneutt@linux.com>

feat: add deployment for sail operator

95e6bb1

Signed-off-by: Shane Utt <shaneutt@linux.com>

feat: add istio control-plane deployment

6d12b06

Signed-off-by: Shane Utt <shaneutt@linux.com>

feat: add vllm simulator deployment

8c4eb46

Signed-off-by: Shane Utt <shaneutt@linux.com>

feat: add inference-gateway deployment

58ef159

Signed-off-by: Shane Utt <shaneutt@linux.com>

feat: add kind environment deployment

f606e0d

Signed-off-by: Shane Utt <shaneutt@linux.com>

feat: kind dev env deployment script

c679724

Signed-off-by: Shane Utt <shaneutt@linux.com>

Merge pull request kubernetes-sigs#5 from neuralmagic/dependabot/go_m…

09e79e6

…odules/golang.org/x/net-0.38.0 Bump golang.org/x/net from 0.37.0 to 0.38.0

Merge pull request kubernetes-sigs#1 from mayabar/main

d8303a0

Add scorers support in scheduler

Merge pull request kubernetes-sigs#4 from shaneutt/shaneutt/initial-d…

dad8db2

…ev-deployments First iteration of development deployments & environments

fix: basic container image builds for linux

f608526

Signed-off-by: Shane Utt <shaneutt@linux.com>

fix: lint fix

2dd2ee7

Signed-off-by: Shane Utt <shaneutt@linux.com>

Merge pull request kubernetes-sigs#10 from shaneutt/shaneutt/fix-imag…

10be213

…e-builds fix: basic container image builds for linux

fix: move openshift deployment to environments

a24a801

Signed-off-by: Shane Utt <shaneutt@linux.com>

fix: retarget kustomize deployments in Makefile

950e07b

Signed-off-by: Shane Utt <shaneutt@linux.com>

empty top level kustomization.yaml - make CICD happy

47bed9d

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

Merge pull request kubernetes-sigs#17 from elevran/deploy_kustomizati…

0c4e6c8

…on_yaml empty top level kustomization.yaml - make CICD happy

Merge pull request kubernetes-sigs#15 from shaneutt/fix-kustomize-envs

e0dcba6

Fix kustomize envs

shaneutt and others added 27 commits April 18, 2025 08:35

Merge pull request kubernetes-sigs#21 from elevran/image_build

aad629b

Minor fixes to enable image building matching GIE

docs: add issue links for some TODOs

84da7a5

Merge pull request kubernetes-sigs#29 from elevran/kind_env

be9c800

draft changes in run-kind

Merge pull request kubernetes-sigs#24 from mayabar/dev

8004347

Add inference model and pool yamls

feat: add crd deployment component

6bd139f

Signed-off-by: Shane Utt <shaneutt@linux.com>

fix: remove podman load instructions that are no longer needed

9202462

Signed-off-by: Shane Utt <shaneutt@linux.com>

upgrade golang.org/x/oauth2 to v0.27.0

8e46ea9

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

Merge pull request kubernetes-sigs#31 from shaneutt/shaneutt/crd-depl…

e7a53af

…oyments Add CRDs deployments

Merge pull request kubernetes-sigs#32 from elevran/oauth2_vuln

142668c

upgrade golang.org/x/oauth2 to v0.27.0

feat: add istio crds to deployments

413c4a7

Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: add custom build for istio-control-plane

43ab7c1

This is required for full GIE support Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: cleanup vllm-sim deployments

a3355b3

Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: update gateway deployment for gie compat

f699eb7

Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: kind env script cleanup

760414e

Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: cleanup sail operator deployment

5659f22

Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: cleanup kind dev env deployment

1ef859f

Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: move kind dev env deploys

322a421

Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: move openshift dev env deploys

383d2db

Signed-off-by: Shane Utt <shaneutt@linux.com>

feat: add environment.dev.kind makefile target

55bf0f8

Signed-off-by: Shane Utt <shaneutt@linux.com>

docs: add development documentation

896270f

Signed-off-by: Shane Utt <shaneutt@linux.com>

chore: cleanup some language in the Makefile

a83015b

Signed-off-by: Shane Utt <shaneutt@linux.com>

Merge pull request kubernetes-sigs#33 from shaneutt/shaneutt/kind-ful…

33a10b5

…l-stack Add full stack deployment to Kind dev env

added infra pipeline run stuff

175fbec

added infra pipeline run stuff

b7bcf73

added infra pipeline run stuff

eea72e4

pre-commit hook added

72a4328

scorer fix

e195481

mayabar requested a review from shmuelk April 20, 2025 11:41

vMaroon reviewed Apr 20, 2025


mayabar force-pushed the dev branch from b0d29ec to a80bcfc Compare April 23, 2025 14:13


mayabar commented Apr 20, 2025

vMaroon Apr 20, 2025 (edited)

nirrozenbaum Apr 20, 2025

nirrozenbaum Apr 20, 2025

elevran Apr 22, 2025
