Feldera: Revolutionizing Incremental Computation with SQL Power

By Mark Howell, 29 September 2024 (4 min read)

Feldera stands out as a fast query engine designed for incremental computation. What sets it apart is its ability to evaluate arbitrary SQL programs incrementally, making it more powerful, expressive, and performant than existing alternatives like batch engines, warehouses, stream processors, or streaming databases.
A Feldera pipeline consists of a set of SQL tables and views, which can be deeply nested. Users have the flexibility to start, stop, or pause pipelines to manage and advance computations. These pipelines continuously process changes, which include any number of inserts, updates, or deletes to a set of tables. When changes are received, Feldera incrementally updates all the views by only looking at the changes, completely avoiding recomputation over older data. This approach makes Feldera incredibly fast, capable of handling millions of events per second on a laptop. It also enables unified offline and online compute over both live and historical data.
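
To make the idea of updating a view from the changes alone more concrete, here is a minimal Python sketch. It is not Feldera's implementation or API: the class name `IncrementalSumView`, the `orders` example, and the `(key, amount, weight)` change format are all assumptions made for illustration. Each change carries a weight of +1 for an insert and -1 for a delete, and an update is expressed as a delete followed by an insert.

```python
from collections import defaultdict


class IncrementalSumView:
    """Hypothetical stand-in for a view such as
    SELECT customer, SUM(amount) FROM orders GROUP BY customer.
    """

    def __init__(self):
        self.sums = defaultdict(int)    # per-group running sum
        self.counts = defaultdict(int)  # per-group row count, used to drop empty groups

    def apply(self, changes):
        """Apply a batch of (key, amount, weight) changes.

        weight is +1 for an insert and -1 for a delete. Only the groups that
        appear in the batch are touched; older rows are never rescanned.
        Returns the set of groups whose view row may have changed.
        """
        touched = set()
        for key, amount, weight in changes:
            self.sums[key] += weight * amount
            self.counts[key] += weight
            touched.add(key)
        # Remove groups that no longer contain any rows.
        for key in touched:
            if self.counts[key] == 0:
                del self.sums[key]
                del self.counts[key]
        return touched

    def snapshot(self):
        """Return the current contents of the view as a plain dict."""
        return dict(self.sums)


view = IncrementalSumView()
view.apply([("alice", 10, +1), ("bob", 5, +1), ("alice", 7, +1)])
# Update alice's 10 to 12 (delete + insert) and delete bob's only row.
view.apply([("alice", 10, -1), ("alice", 12, +1), ("bob", 5, -1)])
print(view.snapshot())  # {'alice': 19}
```

The work done by `apply` is proportional to the size of the change batch, not to the size of the table, which is the property the paragraph above describes.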

🎯 Defining Features

Full SQL Support and More: Feldera is the only engine that can evaluate full SQL syntax and semantics completely incrementally. This includes joins and aggregates, group by, correlated subqueries, window functions, complex data types, time series operators, UDFs, and recursive queries. Pipelines can process deeply nested hierarchies of views.
Fast Out-of-the-Box Performance: Users have reported getting complex use cases implemented in 30 minutes or less, achieving millions of events per second in performance on a laptop without any tuning.
Datasets Larger than RAM: Feldera is designed to handle datasets that exceed the available RAM by efficiently spilling to disk, leveraging recent advances in NVMe storage.
Strong Guarantees on Consistency and Freshness: Feldera is strongly consistent and guarantees that the state of the views always corresponds to what you'd get if you ran the queries in a batch system for the same input.
Connectors for Your Favorite Data Sources and Destinations: Feldera connects to various batch and streaming data sources like Kafka, HTTP, CDC streams, S3, Data Lakes, Warehouses, and more. If a needed connector is not yet supported, users can request it.
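
The consistency guarantee in the list above can be stated as a simple check: after every batch of changes, the incrementally maintained view should equal what a from-scratch batch query would return over the same input. The Python sketch below illustrates that property on a toy filtered count. The function names, the sample data, and the `(region, amount, weight)` change format are hypothetical, and the code says nothing about how Feldera itself is built or tested.

```python
from collections import Counter


def batch_view(rows):
    """Recompute from scratch, in the spirit of:
    SELECT region, COUNT(*) FROM t WHERE amount >= 100 GROUP BY region.
    """
    return dict(Counter(region for region, amount in rows if amount >= 100))


def apply_changes(view, changes):
    """Update the same view in place using only (region, amount, weight) changes."""
    for region, amount, weight in changes:
        if amount >= 100:
            new_count = view.get(region, 0) + weight
            if new_count:
                view[region] = new_count
            else:
                view.pop(region, None)
    return view


batches = [
    [("eu", 150, +1), ("us", 90, +1), ("us", 300, +1)],
    [("eu", 150, -1), ("eu", 120, +1), ("us", 500, +1)],
    [("us", 300, -1), ("eu", 120, -1)],
]

table = Counter()  # full input table as a multiset; kept only for the check
view = {}          # the incrementally maintained view
for changes in batches:
    for region, amount, weight in changes:
        table[(region, amount)] += weight
    apply_changes(view, changes)
    rows = list(table.elements())  # elements() skips rows whose count is below 1
    assert view == batch_view(rows)  # incremental result matches batch result

print(view)  # {'us': 1}
```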

💻 Architecture

The architecture of Feldera is designed to support its high performance and flexibility. A quick start with Docker is available for demos, development, and testing. Users can bring up a Feldera Platform deployment by downloading a Docker Compose file and running a single command. The Feldera web console becomes available shortly after at `http://localhost:8080`.

⚙️ Running Feldera from Sources

To run Feldera from sources, users need to install several dependencies, including the Rust toolchain (at least 1.75), Java (at least JDK 19), Maven, and TypeScript. After setting up these dependencies, users can build the SQL compiler and run the pipeline-manager. The Feldera web console can then be accessed at `http://localhost:8080`.

📖 Documentation

For more detailed information, users are encouraged to go through the Feldera documentation.

🤖 Benchmarks

Feldera is generally faster and uses less memory than comparable systems such as stream processors. Benchmarks are run by CI on every commit that goes into the main branch. For detailed results, users can visit benchmarks.feldera.io.

🎓 Theory

Feldera Platform is built on a solid mathematical foundation. The formal model underpinning the system, called DBSP, is described in a paper presented at the International Conference on Very Large Data Bases (VLDB) in August 2023. The model provides both semantics and an algorithm for generating incremental dataflow programs that are efficient and correct.

Description: Diagram illustrating the DBSP model.
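
As a rough illustration of the kind of rule such a model produces, the Python sketch below represents each table as a mapping from rows to integer weights (the DBSP paper calls these Z-sets) and computes the change to a join from the input changes using the standard identity Δ(A ⋈ B) = ΔA ⋈ B + A ⋈ ΔB + ΔA ⋈ ΔB. The helper names `zadd`, `join`, and `join_delta` and the sample data are assumptions for this example, not Feldera's code or API.

```python
from collections import Counter


def zadd(a, b):
    """Add two Z-sets (row -> integer weight), dropping rows whose weight is zero."""
    out = Counter(a)
    for row, w in b.items():
        out[row] += w
    return Counter({row: w for row, w in out.items() if w != 0})


def join(a, b):
    """Equi-join on the first field; the output weight is the product of input weights."""
    out = Counter()
    for (ka, va), wa in a.items():
        for (kb, vb), wb in b.items():
            if ka == kb:
                out[(ka, va, vb)] += wa * wb
    return Counter({row: w for row, w in out.items() if w != 0})


def join_delta(a, b, da, db):
    """Change to join(a, b) caused by changes da and db.

    a and b are the accumulated inputs before the change. The full join of
    a with b is never recomputed; every term involves at least one delta.
    """
    return zadd(zadd(join(da, b), join(a, db)), join(da, db))


a = Counter({("k1", "x"): 1, ("k2", "y"): 1})
b = Counter({("k1", "p"): 1})
da = Counter({("k2", "y"): -1, ("k1", "z"): 1})  # delete (k2, y), insert (k1, z)
db = Counter({("k2", "q"): 1})                   # insert (k2, q)

delta = join_delta(a, b, da, db)
# Old output plus the delta equals the join recomputed over the new inputs.
assert zadd(join(a, b), delta) == join(zadd(a, da), zadd(b, db))
print(dict(delta))  # {('k1', 'z', 'p'): 1}
```

Note how the row `(k2, y, q)` never reaches the output: it is produced with weight +1 by one term and -1 by another, so the weights cancel. A real engine would index the accumulated inputs by key rather than use nested loops.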

👍 Contributing

The software in the Feldera repository is governed by an open-source license, and contributions are welcome. Guidelines for contributing are available in the repository.

Remember these 3 key ideas for your startup:

  1. Leverage Full SQL Support: Feldera's ability to evaluate full SQL syntax incrementally can significantly enhance your data processing capabilities, making complex queries more efficient and faster.

  2. Optimize Performance: With Feldera, you can achieve high performance out-of-the-box, handling millions of events per second without extensive tuning. This can be a game-changer for startups needing rapid data processing.

  3. Handle Large Datasets Efficiently: Feldera is designed to handle datasets larger than RAM by spilling to disk, so you can work with data that does not fit in memory.


For more details, see the original source.

