Category: Machine Learning Infrastructure. Authentication: API Key

Baseten REST API

Deploy and scale ML models with serverless infrastructure

Baseten is a serverless platform for deploying, managing, and scaling machine learning models in production. It provides infrastructure for serving models with autoscaling, GPU support, and low-latency inference. Developers use Baseten to deploy models from popular frameworks like PyTorch, TensorFlow, and Hugging Face without managing servers or Kubernetes clusters.

Base URL: https://api.baseten.co/v1

API Endpoints

Method  Endpoint                              Description
POST    /models/{model_id}/deployments        Deploy a new version of a machine learning model
GET     /models/{model_id}/deployments        List all deployments for a specific model
GET     /deployments/{deployment_id}          Get detailed information about a specific deployment
POST    /models/{model_id}/predict            Run inference on a deployed model with input data
POST    /models/{model_id}/predict_async      Submit an asynchronous inference request for long-running predictions
GET     /predictions/{prediction_id}          Retrieve the status and results of an async prediction
GET     /models                               List all models in your workspace
POST    /models                               Create a new model in your workspace
PATCH   /deployments/{deployment_id}          Update deployment configuration, including scaling and hardware settings
DELETE  /deployments/{deployment_id}          Delete a model deployment and release its resources
GET     /deployments/{deployment_id}/logs     Retrieve logs from a specific deployment for debugging
GET     /deployments/{deployment_id}/metrics  Get performance metrics, including latency, throughput, and error rates
POST    /models/{model_id}/secrets            Add environment secrets for model deployment
GET     /workspace/usage                      Get workspace resource usage and billing information
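
The asynchronous pair in the table (submit to predict_async, then read /predictions/{prediction_id}) implies a submit-and-poll loop. The following is a minimal Python sketch of that loop, not a confirmed client. It assumes the paths and the Api-Key header exactly as shown on this page, and it assumes the prediction response is a JSON object with a "status" field whose pending values are "queued" or "running". Those field names and values are hypothetical and should be checked against Baseten's official API reference.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.baseten.co/v1"  # base URL as listed on this page


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated request for a path from the table."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("Authorization", f"Api-Key {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def send(req):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_prediction(prediction_id, api_key, send_fn=send, interval=2.0, max_polls=30):
    """Poll GET /predictions/{prediction_id} until it leaves a pending state.

    ASSUMPTION: the response carries a "status" field and "queued"/"running"
    mean "not finished yet". This page does not document the response shape.
    """
    for _ in range(max_polls):
        req = build_request("GET", f"/predictions/{prediction_id}", api_key)
        result = send_fn(req)
        if result.get("status") not in ("queued", "running"):
            return result
        time.sleep(interval)
    raise TimeoutError(f"prediction {prediction_id} still pending after {max_polls} polls")
```

The transport is passed in as send_fn so the polling logic can be exercised with a stub before pointing it at a real workspace. The same build_request helper would cover the other rows of the table, for example build_request("GET", "/models", api_key).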

Code Examples

curl -X POST https://api.baseten.co/v1/models/MODEL_ID/predict \
  -H 'Authorization: Api-Key YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "instances": [
      {"text": "Classify this sentiment", "max_length": 100}
    ]
  }'
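
For callers who prefer a script to a shell command, the same request might be written in Python as sketched below. It mirrors the curl call above (same URL, headers, and "instances" body) using only the standard library. The environment variable name BASETEN_API_KEY and the function names are hypothetical choices for this sketch, and the response format is not documented on this page, so the decoded JSON is returned as-is.

```python
import json
import os
import urllib.error
import urllib.request

BASE_URL = "https://api.baseten.co/v1"  # base URL as listed on this page


def build_predict_request(model_id, api_key, instances):
    """Mirror the curl command: POST /models/{model_id}/predict with a JSON body."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/models/{model_id}/predict",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
    )


def predict(model_id, instances, api_key=None):
    """Send the request and return the decoded JSON response."""
    # ASSUMPTION: BASETEN_API_KEY is a variable name invented for this sketch.
    api_key = api_key or os.environ["BASETEN_API_KEY"]
    req = build_predict_request(model_id, api_key, instances)
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # Include the status code and response body so failures are debuggable.
        detail = err.read().decode("utf-8", errors="replace")
        raise RuntimeError(f"predict failed with HTTP {err.code}: {detail}") from err
```

A call such as predict("MODEL_ID", [{"text": "Classify this sentiment", "max_length": 100}]) would send the same payload as the curl example. Reading the key from the environment keeps it out of source files and shell history.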

Use Baseten from Claude / Cursor / ChatGPT

Get a hosted MCP endpoint for Baseten. Paste your Baseten API key, copy the generated URL, and add it to Claude Desktop, Cursor, or any AI client that supports remote MCP. Your AI client then calls Baseten with your credentials. There is nothing to install locally, and it works on mobile. The endpoint exposes the following tools:

deploy_ml_model: Deploy a machine learning model to Baseten infrastructure with a specified hardware and scaling configuration
run_inference: Execute model inference with input data and return predictions synchronously or asynchronously
monitor_deployment: Get real-time metrics and logs for deployed models, including latency, error rates, and resource utilization
manage_model_versions: List, update, or roll back model deployments across different versions
optimize_deployment_config: Analyze usage patterns and recommend scaling and hardware configurations that balance cost and performance
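
To make the tool list concrete, here is a rough Python sketch of the message an MCP client would send to invoke one of these tools. MCP carries tool invocations as JSON-RPC 2.0 requests with the method "tools/call". The tool names come from the list above, but the argument names ("model_id", "instances") are hypothetical, since this page does not publish the tools' input schemas. A real client would first read them from the server's tools/list response.

```python
import json

# Tool names as listed on this page.
BASETEN_TOOLS = {
    "deploy_ml_model",
    "run_inference",
    "monitor_deployment",
    "manage_model_versions",
    "optimize_deployment_config",
}


def tool_call_message(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    if tool_name not in BASETEN_TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# ASSUMPTION: these argument names are illustrative only; the actual schema
# for run_inference is defined by the hosted MCP server.
message = tool_call_message(
    1,
    "run_inference",
    {"model_id": "MODEL_ID", "instances": [{"text": "Classify this sentiment"}]},
)
print(json.dumps(message, indent=2))
```

In practice the AI client builds and sends this message itself. The sketch only shows what travels over the wire once the MCP URL is configured.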

Connect in 60 seconds

Paste your Baseten key → get an MCP URL → paste into Claude/Cursor. Hosted by IOX, encrypted at rest.
