ollama-python is a Python client for interacting with large language models. It provides an interface for sending prompts to receive text and chat completions, as well as a dedicated client for generating numerical vector embeddings from text.
The project includes a wrapper that emulates the OpenAI API, allowing applications built for that standard to interact with local models. It also provides a non-blocking asynchronous client for executing concurrent requests.
The library covers the full model lifecycle, including the ability to pull, create, list, and delete models within a local environment. It handles remote endpoint connectivity via configurable host addresses and authentication headers.