Apache Beam

Unified programming model for data processing pipelines


Apache Beam is an open-source unified programming model for defining and executing data processing pipelines, including ETL, batch, and stream (continuous) processing.[2] Beam pipelines are defined using one of the provided SDKs and executed on one of Beam's supported runners (distributed processing back-ends), including Apache Flink, Apache Samza, Apache Spark, and Google Cloud Dataflow.[3]
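The separation between defining a pipeline and choosing where it runs can be illustrated with a small sketch. The code below is a toy model written with the Python standard library only; it does not use the Beam SDK, and the names `Pipeline`, `SequentialRunner`, `ThreadedRunner`, and `build_pipeline` are hypothetical, chosen for illustration. It shows one pipeline definition producing the same result on two different execution back-ends.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class Pipeline:
    """Toy pipeline: an ordered list of steps with no execution logic."""

    def __init__(self):
        self.steps = []

    def flat_map(self, fn):
        # Each step maps one element to an iterable of output elements.
        self.steps.append(fn)
        return self


def apply_steps(pipeline, element):
    """Push a single input element through every step of the pipeline."""
    items = [element]
    for fn in pipeline.steps:
        items = [out for item in items for out in fn(item)]
    return items


class SequentialRunner:
    """Hypothetical back-end: processes elements one at a time."""

    def run(self, pipeline, source):
        results = []
        for element in source:
            results.extend(apply_steps(pipeline, element))
        return results


class ThreadedRunner:
    """Hypothetical back-end: processes elements on a thread pool."""

    def __init__(self, workers=4):
        self.workers = workers

    def run(self, pipeline, source):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # pool.map preserves input order, so output order matches
            # the sequential runner.
            chunks = pool.map(lambda e: apply_steps(pipeline, e), source)
            return [out for chunk in chunks for out in chunk]


def build_pipeline():
    # The definition says what to compute, not where it runs.
    return (
        Pipeline()
        .flat_map(str.split)               # line -> words
        .flat_map(lambda w: [w.lower()])   # normalize case
    )


LINES = ["Beam unifies batch", "and stream", "batch and Beam"]

sequential = SequentialRunner().run(build_pipeline(), LINES)
threaded = ThreadedRunner(workers=2).run(build_pipeline(), LINES)
word_counts = Counter(sequential)
```

Only the runner changes between the two calls; the pipeline definition stays the same. A real Beam runner additionally handles distribution, windowing, and fault tolerance, none of which this sketch attempts.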

History

Apache Beam[3] is one implementation of the model described in the Dataflow model paper.[4] The Dataflow model builds on earlier work on distributed processing abstractions at Google, in particular FlumeJava[5] and MillWheel.[6][7]

Google released an open SDK implementation of the Dataflow model in 2014, along with an environment for executing Dataflow pipelines locally (non-distributed) as well as on the Google Cloud Platform service.

Timeline

Apache Beam publishes a minor release every six weeks.[8]

References

  1. "Blogs". beam.apache.org. The Apache Software Foundation. Retrieved 17 January 2024.
  2. Woodie, Alex (22 April 2016). "Apache Beam's Ambitious Goal: Unify Big Data Development". Datanami. Retrieved 4 August 2016.
  3. Akidau, Tyler; Schmidt, Eric; Whittle, Sam; Bradshaw, Robert; Chambers, Craig; Chernyak, Slava; Fernández-Moctezuma, Rafael J.; Lax, Reuven; McVeety, Sam; Mills, Daniel; Perry, Frances (1 August 2015). "The dataflow model" (PDF). Proceedings of the VLDB Endowment. 8 (12): 1792–1803. doi:10.14778/2824032.2824076. Retrieved 4 August 2016.
  4. Chambers, Craig; Raniwala, Ashish; Perry, Frances; Adams, Stephen; Henry, Robert R.; Bradshaw, Robert; Weizenbaum, Nathan (1 January 2010). "FlumeJava: Easy, efficient data-parallel pipelines". Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (PDF). ACM. pp. 363–375. doi:10.1145/1806596.1806638. ISBN 9781450300193. S2CID 14888571. Archived from the original (PDF) on 23 September 2016. Retrieved 4 August 2016.
  5. Akidau, Tyler; Whittle, Sam; Balikov, Alex; Bekiroğlu, Kaya; Chernyak, Slava; Haberman, Josh; Lax, Reuven; McVeety, Sam; Mills, Daniel; Nordstrom, Paul (27 August 2013). "MillWheel" (PDF). Proceedings of the VLDB Endowment. 6 (11): 1033–1044. doi:10.14778/2536222.2536229. Archived from the original (PDF) on 1 February 2016. Retrieved 4 August 2016.
  6. Pointer, Ian (14 April 2016). "Apache Beam wants to be uber-API for big data". InfoWorld. Retrieved 4 August 2016.
  7. "Policies". beam.apache.org. Retrieved 21 April 2022.

This article uses material from the Wikipedia article Apache_Beam, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.