Description
PySDK Version
- PySDK V2 (2.x)
- PySDK V3 (3.4.0)
Describe the bug
When using ModelBuilder in "passthrough" mode (image_uri + env_vars, no model
or inference_spec), build() fails with a ValidationException because a
non-existent S3 path is injected as ModelDataUrl in the CreateModel request.
Passthrough mode is the intended path for deploying models on custom/DJL
containers where the container itself handles model loading (e.g. via
HF_MODEL_ID env var pointing to S3).
Expected behavior
Expected: Model created with no ModelDataUrl (container loads via HF_MODEL_ID)
Actual: ValidationException -- "Could not find model data at
s3:///model-builder///"
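As a rough illustration of the expected request shape, the sketch below assembles a `PrimaryContainer` dictionary of the kind passed to the SageMaker `CreateModel` API and leaves out `ModelDataUrl` when no model artifact is supplied. The helper name `build_primary_container` and the placeholder image URI are hypothetical and are not part of the SDK; only the `Image`, `Environment`, and `ModelDataUrl` keys come from the CreateModel request format.

```python
from typing import Dict, Optional


def build_primary_container(
    image_uri: str,
    env_vars: Dict[str, str],
    model_data_url: Optional[str] = None,
) -> Dict[str, object]:
    """Hypothetical helper: assemble a PrimaryContainer dict for CreateModel."""
    container: Dict[str, object] = {
        "Image": image_uri,
        "Environment": dict(env_vars),
    }
    # Include ModelDataUrl only when an artifact location was actually given.
    # In passthrough mode the container loads the model itself (HF_MODEL_ID).
    if model_data_url:
        container["ModelDataUrl"] = model_data_url
    return container


# Passthrough case: image plus environment variables, no model artifact.
container = build_primary_container(
    image_uri="example-image-uri",  # placeholder, not a real image
    env_vars={"HF_MODEL_ID": "s3://example-bucket/models/any-model"},
)
print(sorted(container))
```

With a request shaped like this, the service has no S3 path to validate, so the container is free to resolve the model from `HF_MODEL_ID` at startup.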
Suggested Fix
In _build_for_passthrough(), also clear s3_model_data_url:
def _build_for_passthrough(self) -> Model:
    if not self.image_uri:
        raise ValueError("image_uri is required for pass-through cases")
    self.s3_upload_path = None
    self.s3_model_data_url = None  # <-- add this line
    return self._create_model()
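The effect of the one-line change can be sketched with a small stand-in class. This is not the SDK's `ModelBuilder`: the class `PassthroughBuilderSketch`, the `clear_model_data_url` flag, and the example bucket path are all hypothetical, and the sketch assumes (without having confirmed it in the SDK source) that a default `s3_model_data_url` is populated before the passthrough branch runs.

```python
from typing import Dict, Optional


class PassthroughBuilderSketch:
    """Hypothetical stand-in for ModelBuilder; not the SDK's real class."""

    def __init__(self, image_uri: Optional[str], env_vars: Dict[str, str]):
        self.image_uri = image_uri
        self.env_vars = env_vars
        # Assumption: earlier setup fills in a default S3 location even
        # though nothing is ever uploaded there in passthrough mode.
        self.s3_upload_path: Optional[str] = (
            "s3://example-bucket/model-builder/example-id/"
        )
        self.s3_model_data_url: Optional[str] = self.s3_upload_path

    def _create_model(self) -> Dict[str, object]:
        # Mirrors the request shape: ModelDataUrl is sent whenever it is set.
        container: Dict[str, object] = {
            "Image": self.image_uri,
            "Environment": dict(self.env_vars),
        }
        if self.s3_model_data_url:
            container["ModelDataUrl"] = self.s3_model_data_url
        return container

    def build_for_passthrough(self, clear_model_data_url: bool) -> Dict[str, object]:
        if not self.image_uri:
            raise ValueError("image_uri is required for pass-through cases")
        self.s3_upload_path = None
        if clear_model_data_url:
            self.s3_model_data_url = None  # the proposed extra line
        return self._create_model()


env = {"HF_MODEL_ID": "s3://example-bucket/models/any-model"}
before_fix = PassthroughBuilderSketch("example-image-uri", env).build_for_passthrough(False)
after_fix = PassthroughBuilderSketch("example-image-uri", env).build_for_passthrough(True)
print("ModelDataUrl" in before_fix, "ModelDataUrl" in after_fix)  # True False
```

Under that assumption, clearing only `s3_upload_path` leaves the stale URL in the request, which matches the reported ValidationException; clearing both removes it.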
To reproduce
from sagemaker.core.helper.session_helper import Session, get_execution_role
from sagemaker.serve.model_builder import ModelBuilder
sagemaker_session = Session()
region = sagemaker_session.boto_session.region_name
role = get_execution_role(sagemaker_session, use_default=True)
bucket = sagemaker_session.default_bucket()
image_uri = f"763104351884.dkr.ecr.{region}.amazonaws.com/djl-inference:0.33.0-lmi15.0.0-cu128"
model_builder = ModelBuilder(
    image_uri=image_uri,
    role_arn=role,
    sagemaker_session=sagemaker_session,
    instance_type="ml.g5.2xlarge",
    env_vars={
        "HF_MODEL_ID": f"s3://{bucket}/models/any-model",
        "OPTION_ROLLING_BATCH": "vllm",
    },
)
model_builder.build(model_name="passthrough-bug-repro")