Floway: LLM API Gateway on Cloudflare Workers


Floway is a lightweight LLM API gateway designed for Cloudflare Workers, offering multi-API key management, usage statistics, and support for Copilot, Azure, and custom upstream models. Written in TypeScript, it's an ideal solution for individual developers or small teams needing centralized control over their LLM API calls without the overhead of traditional server infrastructure.

Project Overview

Managing multiple API keys for large language models (LLMs) can quickly become a headache, especially when you're trying to keep tabs on usage across a team or different projects. Who's using what? Are we hitting our quotas? Is anyone overspending? These aren't complex questions, but getting clear answers often involves a lot of manual tracking, which is far from ideal.

This is precisely the problem Floway aims to solve. It's an API gateway built to run on Cloudflare Workers, specifically tailored for LLM requests. With just a few lines of configuration, you can consolidate various upstream providers (think OpenAI, Azure, or even self-hosted models) into a single endpoint. Then, you can issue unique keys to different users or applications, and Floway automatically tracks request counts and token consumption for each.
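To make the "single endpoint, many upstreams" idea concrete, here is a minimal sketch of how a gateway could map a downstream key and a requested model to an upstream provider. This is not Floway's actual code or configuration schema: the `Upstream` and `DownstreamKey` shapes, the `resolveRoute` function, the `fw-` key prefix, and the example.com URLs are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; Floway's real schema may differ.
interface Upstream {
  name: string;
  baseUrl: string;
  models: string[]; // model names this upstream is assumed to serve
}

interface DownstreamKey {
  token: string; // key handed out to a user or application
  owner: string; // who the usage should be attributed to
}

interface Route {
  owner: string;
  upstream: Upstream;
}

// Placeholder data; the URLs and key values are made up.
const upstreams: Upstream[] = [
  { name: "azure", baseUrl: "https://azure.example.com/v1", models: ["gpt-4o"] },
  { name: "self-hosted", baseUrl: "https://llm.example.com/v1", models: ["llama-3-8b"] },
];

const downstreamKeys: DownstreamKey[] = [
  { token: "fw-alice", owner: "alice" },
  { token: "fw-bot", owner: "discord-bot" },
];

// Returns the owner and upstream for a request, or null if the key is
// unknown, the header is malformed, or no upstream serves the model.
function resolveRoute(authorization: string | null, model: string): Route | null {
  const prefix = "Bearer ";
  if (authorization === null || !authorization.startsWith(prefix)) return null;

  const token = authorization.slice(prefix.length);
  const key = downstreamKeys.find((k) => k.token === token);
  if (key === undefined) return null;

  const upstream = upstreams.find((u) => u.models.includes(model));
  if (upstream === undefined) return null;

  return { owner: key.owner, upstream };
}
```

With this shape, every client talks to one gateway URL with its own key, and the gateway decides which provider actually receives the call.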

Effortless Deployment, Minimal Upkeep

One of Floway's biggest draws is its deployment model. By leveraging Cloudflare Workers, it inherently benefits from a global edge network, ensuring low latency and broad geographical reach. You won't need to provision servers, manage Docker containers, or worry about infrastructure. A simple wrangler deploy command is all it takes to get Floway live. The project even provides a ready-to-use wrangler.toml template on GitHub, making setup as straightforward as tweaking a few environment variables.

For indie developers or small teams, this 'deploy and forget' experience is incredibly practical. No more late-night server reboots or certificate expiry scares; Cloudflare handles the underlying infrastructure, letting you focus on your application logic.

Multi-Key Management and Usage Insights

At its core, Floway excels at multi-key rotation and usage statistics. You can feed it a pool of upstream API keys, and the gateway will distribute requests across them according to the strategy you configure, such as weighted routing or simple round-robin. Each downstream request can carry a custom 'group' identifier, giving you granular visibility into which projects or users are consuming resources.
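The two strategies mentioned above are easy to picture in code. The following is a generic sketch of round-robin and weighted selection over a key pool, not Floway's implementation; the `UpstreamKey` shape and both function names are hypothetical.

```typescript
// Hypothetical key shape; Floway's own types are not shown in this article.
interface UpstreamKey {
  id: string;
  weight: number; // relative share of traffic, used only by pickWeighted
}

// Round-robin: hand out keys in order and wrap around at the end of the pool.
function createRoundRobin(keys: UpstreamKey[]): () => UpstreamKey {
  if (keys.length === 0) throw new Error("key pool is empty");
  let next = 0;
  return () => {
    const key = keys[next];
    next = (next + 1) % keys.length;
    return key;
  };
}

// Weighted: choose a key with probability proportional to its weight.
// `random` is injectable so the behaviour can be tested deterministically.
function pickWeighted(
  keys: UpstreamKey[],
  random: () => number = Math.random,
): UpstreamKey {
  const total = keys.reduce((sum, k) => sum + k.weight, 0);
  if (total <= 0) throw new Error("total weight must be positive");

  let threshold = random() * total;
  for (const key of keys) {
    threshold -= key.weight;
    if (threshold < 0) return key;
  }
  // Only reachable through floating-point drift; fall back to the last key.
  return keys[keys.length - 1];
}
```

One caveat for the round-robin variant: a counter held in Worker memory is not shared between isolates and can be reset at any time, so on Workers it only gives an approximate rotation unless the position is kept in external storage. The weighted variant needs no shared state at all, which makes it a natural fit for an edge runtime.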

Supports Copilot, Azure OpenAI, and any custom upstream compatible with the OpenAI API format.
Automatically logs token usage and request counts for each key.
Offers basic statistical views (though deeper analysis might require integrating with a dashboard or exporting logs).
Written in TypeScript, making it easy to extend or modify its logic if needed.
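As a rough illustration of the per-key, per-group accounting described in this list, the sketch below accumulates request counts and token totals from the `usage` object that OpenAI-style responses include (`prompt_tokens`, `completion_tokens`, `total_tokens`). The `UsageTracker` class and its in-memory `Map` are assumptions for the example; the article does not say how Floway stores these numbers, and a real Worker would need persistent storage because in-memory state does not survive between isolates.

```typescript
// Token counts as reported in an OpenAI-style response body.
interface TokenUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Running totals for one (key, group) pair.
interface UsageTotals {
  requests: number;
  promptTokens: number;
  completionTokens: number;
}

// Hypothetical in-memory tracker, for illustration only.
class UsageTracker {
  private readonly totals = new Map<string, UsageTotals>();

  // JSON-encode the pair so keys or groups containing separators cannot collide.
  private static bucket(key: string, group: string): string {
    return JSON.stringify([key, group]);
  }

  private static empty(): UsageTotals {
    return { requests: 0, promptTokens: 0, completionTokens: 0 };
  }

  // Add one completed request to the totals for this key and group.
  record(key: string, group: string, usage: TokenUsage): void {
    const id = UsageTracker.bucket(key, group);
    const current = this.totals.get(id) ?? UsageTracker.empty();
    this.totals.set(id, {
      requests: current.requests + 1,
      promptTokens: current.promptTokens + usage.prompt_tokens,
      completionTokens: current.completionTokens + usage.completion_tokens,
    });
  }

  // Totals for a key and group; all zeros if nothing has been recorded yet.
  get(key: string, group: string): UsageTotals {
    return this.totals.get(UsageTracker.bucket(key, group)) ?? UsageTracker.empty();
  }
}

// Example: two requests from the same key and group.
const tracker = new UsageTracker();
tracker.record("fw-alice", "docs-bot", { prompt_tokens: 120, completion_tokens: 30, total_tokens: 150 });
tracker.record("fw-alice", "docs-bot", { prompt_tokens: 80, completion_tokens: 20, total_tokens: 100 });
```

Keeping prompt and completion tokens separate is a deliberate choice here, since many providers price the two differently.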

While these features might sound fundamental, their practical value is immense. Simply understanding 'who used how much' can prevent nasty surprises, like discovering a key has been overused at the end of the billing cycle due to a lack of transparency.

Use Cases and Practical Considerations

Floway has a clear niche: it's lightweight, cost-effective, and developer-centric. If you're integrating LLMs into a few small applications, or perhaps setting up a shared gateway for a personal project or a friend's bot, it's more than capable. However, it's not designed to compete with enterprise-grade API management platforms like Kong or Kubernetes Gateway. You won't find advanced security policies beyond basic authentication, and features like rate limiting and caching are not built in.

It's also worth noting that the Cloudflare Workers free tier allows 100,000 requests per day, which is ample for most individuals and small teams. For higher volumes, you might need to consider a paid plan. Floway itself is open-source under the MIT license, giving you complete freedom with the code.

Ultimately, Floway is a well-crafted tool that doesn't try to be everything to everyone. It focuses on doing one thing exceptionally well on a minimalist platform: proxying and tracking LLM API calls. If you need that thin, efficient layer for your LLM integrations, it's definitely worth exploring.

Frequently Asked Questions