Search "inference": 12 results found
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA J
AirLLM 70B inference with single 4GB GPU
PaddlePaddle (飞桨) High Performance Deep Learning Inference Engine for Mobile and Edge
LiteRT-LM is Google's production-ready, high-performance, open-source inference framework for deploying Large Language M
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agent
Self-hosted personalized AI in a mirror.
TinyML & Edge AI: On-device inference, model quantization, embedded ML, ultra-low-power AI for microcontrollers and IoT
Local AI Agent firmware running on ESP32-S3, integrating offline voice wake-up with cloud TTS, supporting local LLM infe
Wireless multimodal sensing: inferring sensor hardware data (e.g., temperature, humidity, and light).