inference-server

Here are 44 public repositories matching this topic...

roboflow / inference

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.

Updated Nov 26, 2024
Python

basetenlabs / truss

The simplest way to serve AI/ML models in production

open-source machine-learning packaging artificial-intelligence falcon easy-to-use whisper inference-server model-serving inference-api stable-diffusion wizardlm

Updated Nov 26, 2024
Python

pipeless-ai / pipeless

An open-source computer vision framework to build and deploy apps in minutes

Updated May 8, 2024
Rust

underneathall / pinferencia

Python + Inference - Model Deployment library in Python. Simplest model inference server ever.

Updated Feb 14, 2023
Python

NVIDIA / gpu-rest-engine

A REST API for Caffe using Docker and Go

docker caffe deep-learning gpu inference inference-server

Updated Jul 20, 2018
C++

containers / ramalama

The goal of RamaLama is to make working with AI boring.

ai local containers inference-server podman llms llamacpp vllm

Updated Nov 25, 2024
Shell

BMW-InnovationLab / BMW-YOLOv4-Inference-API-GPU

This is a repository for a no-code object detection inference API using the YOLOv3 and YOLOv4 Darknet framework.

Updated Jun 28, 2022
Python

BMW-InnovationLab / BMW-YOLOv4-Inference-API-CPU

This is a repository for a no-code object detection inference API using YOLOv4 and YOLOv3 with OpenCV.

Updated Jun 28, 2022
Python

BMW-InnovationLab / BMW-TensorFlow-Inference-API-CPU

This is a repository for an object detection inference API using the TensorFlow framework.

Updated Jun 28, 2022
Python

containers / podman-desktop-extension-ai-lab

Work with LLMs on a local environment using containers

ai local containers inference-server podman llms

Updated Nov 26, 2024
TypeScript

autodeployai / ai-serving

Serving AI/ML models in the open standard formats PMML and ONNX with both HTTP (REST API) and gRPC endpoints

inference pmml inference-server onnx onnx-models ai-serving pmml-model onnx-inference onnx-rest pmml-deployment pmml-rest pmml-grpc onnx-grpc pmml-realtime onnx-realtime pmml-inference

Updated Oct 20, 2024
Scala

vertexclique / orkhon

Orkhon: ML Inference Framework and Server Runtime

machine-learning async tensorflow multiprocessing python3 inference-server data-parallelism

Updated Feb 1, 2021
Rust

kibae / onnxruntime-server

ONNX Runtime Server: The ONNX Runtime Server is a server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference.

machine-learning ai deep-learning cuda inference-server nueral-networks contributions-welcome onnx onnxruntime

Updated Nov 24, 2024
C++

kf5i / k3ai

K3ai is a lightweight, fully automated, AI infrastructure-in-a-box solution that allows anyone to experiment quickly with Kubeflow pipelines. K3ai is perfect for anything from Edge to laptops.

kubernetes artificial-intelligence edge datascience machinelearning inference-server kubeflow kubeflow-pipelines k3s

Updated Nov 2, 2021
PowerShell

notAI-tech / fastDeploy

Deploy DL/ML inference pipelines with minimal extra code.

Updated Nov 20, 2024
Python

RubixML / Server

A standalone inference server for trained Rubix ML estimators.

api infrastructure php machine-learning microservice json-api rest-api inference http-server inference-server inference-engine model-deployment php-ml ml-infrastructure model-server rubix-ml php-machine-learning rubix-server

Updated Feb 18, 2024
PHP

friendliai / friendli-client

Friendli: the fastest serving engine for generative AI

ai ml inference gpt inference-server mistral inference-engine serving mlops gpt3 llm stable-diffusion llms generative-ai llmops llm-serving llm-inference llama2 llm-ops

Updated Nov 11, 2024
Python

curtisgray / wingman

Wingman is the fastest and easiest way to run Llama models on your PC or Mac.

windows macos linux downloader ai local download gpu chatbot inference openai gpu-acceleration llama inference-server inference-engine gpu-monitoring llm chatgpt llamacpp

Updated Jun 2, 2024
TypeScript

k9ele7en / Triton-TensorRT-Inference-CRAFT-pytorch

Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a converter from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server, multi-format). Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX.

inference pytorch text-detection nvidia-docker inference-server tensorrt inference-engine onnx onnx-torch tensorrt-conversion triton-inference-server text-detection-from-image

Updated Aug 18, 2021
Python

haicheviet / fullstack-machine-learning-inference

Fullstack machine learning inference template

aws machine-learning cloudformation full-stack infrastructure-as-code twitter-sentiment-analysis inference-server fastapi machine-learning-template machine-learning-infrastructure

Updated Nov 24, 2023
Jupyter Notebook
