E-book
Author Wu, Huijun (Writer on cloud computing)

Title Heron streaming : fundamentals, applications, operations, and insights / Huijun Wu, Maosong Fu
Published Cham : Springer, 2021

Description 1 online resource (211 pages)
Contents Intro -- Foreword -- Preface -- Who This Book Is For -- How This Book Is Organized -- What You Need for This Book -- Typographical Conventions -- Acknowledgments -- Contents -- About the Authors -- Part I Heron Fundamentals -- 1 Stream Processing -- 1.1 Big Data Processing -- 1.1.1 Lambda Architecture -- 1.1.1.1 Batch Processing Layer -- 1.1.1.2 Stream Processing Layer -- 1.1.1.3 Serving Layer -- 1.1.2 Kappa Architecture -- 1.2 Big Data Stream Processing -- 1.3 From Apache Storm to Apache Heron (Incubating) -- 1.3.1 Motivation for Heron -- 1.3.2 Heron Design Goals
1.3.3 Join the Apache Heron (Incubating) Community -- 1.4 Stream Processing Tools -- 1.5 Summary -- References -- 2 Heron Basics -- 2.1 Topology Data Model -- 2.1.1 Topology -- 2.1.2 Spout -- 2.1.3 Bolt -- 2.1.4 Grouping -- 2.2 Heron Architecture and Components -- 2.2.1 Cluster-Level Components (Six Components) -- 2.2.1.1 Scheduler -- 2.2.1.2 State Manager -- 2.2.1.3 Uploader -- 2.2.1.4 Heron CLI -- 2.2.1.5 Heron Tracker -- 2.2.1.6 Heron UI -- 2.2.2 Topology-Level Components (Four Components) -- 2.2.2.1 Heron Instance -- 2.2.2.2 Stream Manager -- 2.2.2.3 Topology Master -- 2.2.2.4 Metrics Manager
2.3 Submission Process and Failure Handling -- 2.4 Submit the First Topology -- 2.4.1 Preparation -- 2.4.2 Install the Heron Client -- 2.4.3 Heron Example Topologies -- 2.4.4 Submit the Topology JAR File -- 2.4.5 Observe the Running Topology -- 2.5 Summary -- References -- 3 Study Heron Code -- 3.1 Code Languages -- 3.2 Requirements for Compiling -- 3.3 Prepare the Compiling Environment -- 3.4 Source Organization -- 3.4.1 Directory Organization -- 3.4.2 Bazel Perspective -- 3.5 Compile Heron -- 3.6 Examine Compiling Results -- 3.6.1 Examine the API -- 3.6.2 Examine Packages -- 3.7 Run Tests
3.7.1 Unit Test -- 3.7.2 Integration Test -- 3.8 Summary -- References -- Part II Write Heron Topologies -- 4 Migrate Storm Topology to Heron -- 4.1 Prepare the Storm Topology Code -- 4.1.1 Examine the Storm Topology Code -- 4.1.2 Examine the Storm Flux Code -- 4.2 Migrate the Storm Topology Code to a HeronTopology Project -- 4.2.1 Adjust the Topology Java Code -- 4.2.2 Adjust the Project File pom.xml -- 4.2.2.1 Add Dependency -- 4.2.2.2 Build with Dependencies -- 4.2.3 Compile the Topology JAR File -- 4.3 Migrate Storm Flux to Heron ECO -- 4.4 Summary -- References -- 5 Write Topology Code
5.1 Before Writing Code -- 5.1.1 Design Topology -- 5.1.2 Choose a Heron API -- 5.2 Write Topology in Java -- 5.2.1 Code the Topology -- 5.2.1.1 Code Main -- 5.2.1.2 Code Spout -- 5.2.1.3 Code Bolt -- 5.2.2 Understand Tuple Flow -- 5.2.2.1 How Tuple Is Constructed -- 5.2.2.2 How Tuple Is Routed -- 5.3 Write Topology in Python -- 5.3.1 Code Main -- 5.3.2 Code Spout -- 5.3.3 Code Bolt -- 5.3.4 Compile and Run -- 5.4 Summary -- Reference -- 6 Heron Topology Features -- 6.1 Delivery Semantics -- 6.1.1 At-Least-Once -- 6.1.2 Effectively-Once -- 6.1.2.1 Requirements for Effectively-Once -- 6.1.2.2 Exactly-Once Versus Effectively-Once
Summary This book provides both a basic understanding of stream processing in general and practical guidance for development and research with Apache Heron in particular. It gives developers of streaming applications basic and systematic knowledge about Heron, which today is scattered across project documents, technical blogs, and code snippets on the Web. The book is organized in four parts. Part I covers basic knowledge about stream processing, Apache Storm, and Apache Heron (Incubating), and introduces the Heron source repository. Part II goes into detail, describing two data models for writing Heron topologies and frequently used topology features, including stateful processing. This part is aimed especially at software developers who write topologies using Heron APIs. Part III describes the Heron tools, including the command-line interface and the user interface, needed to manage a single topology or multiple topologies in a data center. This part is aimed particularly at operators who deploy and manage running jobs. Finally, Part IV describes the Heron source code and how to customize or extend Heron. This part is recommended for software engineers who would like to contribute code to the Heron repository and who are curious about Heron insights. Overall, the book is aimed at professionals who want to process streaming data with Apache Heron. A basic knowledge of Java and of Bash commands for Linux is assumed.
Bibliography Includes bibliographical references and index
Notes Print version record
In Springer Nature eBook
Subject Real-time data processing.
Computer programming.
Relational databases.
Information retrieval.
Data mining.
Big data.
Genre/Form Electronic books.
Form Electronic book
Author Fu, Maosong
ISBN 9783030600945
3030600947
9783030600952
3030600955