Transcript Spring XD
Introducing Spring XD
Mark Pollack, Sr. Software Engineer, Pivotal
© 2014 Pivotal

Spring XD
XD = eXtreme Data

What is a Big Data Application?

Big Data Architecture
[Diagram labels: FILES, SOCIAL, SENSORS, MOBILE, Ingest, Stream Processing, Analytics, Workflow Orchestration, Export, Predictive Modeling, MASTER DATASET, REALTIME VIEWS, BATCH VIEWS, Spring XD, Spring BOOT, XD>]

Lambda Architecture
[Diagram labels: same as the Big Data Architecture diagram, plus SPEED LAYER, SERVING LAYER, BATCH LAYER]

GemFire XD
[Diagram labels: same as the Lambda Architecture diagram, plus GemFire XD]

Spring IO Platform

Spring XD 10,000 ft view
[Diagram labels: FILES, SENSORS, SOCIAL, MOBILE]

Streams
HTTP, Tail, File, Mail, Twitter, Gemfire, Syslog, TCP, UDP, JMS, RabbitMQ, MQTT, Trigger, Reactor TCP/UDP
Filter, Transformer, Object-to-JSON, JSON-to-Tuple, Splitter, Aggregator, HTTP Client, Groovy Scripts, Java Code, JPMML Evaluator
File, HDFS, JDBC, TCP, Log, Mail, RabbitMQ, Gemfire, Splunk, MQTT, Dynamic Router, Counters

Streams
How can we make this easier?
http | filter | file

Taps
“Listen” to data on another stream

Analytics
Counters and Gauges
• Simple & Field Value Counter
  • How many tweets for #java
• Aggregate Counter
  • How many tweets for #java in the week/day/hour
• Gauge & Rich Gauge
  • How many requests per minute?
Abstract API. Implemented in
• In-Memory
• Redis

Predictive Models
• Is this transaction fraudulent?
Based on JPMML Evaluator
• Wide range of model types
Interoperable with R, Rattle, KNIME, RapidMiner

Jobs
CSV to JDBC, FTP to HDFS, JDBC to HDFS, HDFS to JDBC, HDFS to MongoDB

Spring XD Runtime
[Diagram labels: XD Shell; HTTP POST /streams/aStream “M1 | M2”; XD Admin, XD Admin (leader), XD Admin; ZooKeeper; XD Container; Data Transport]

Spring XD Runtime
[Diagram labels: as in the previous diagram, plus Container State, Spring App Context, M1]

Spring XD Runtime
[Diagram labels: as in the previous diagram, with two XD Containers, M1, M2]

Predictive Models

Concepts
Model
• Parameterized algorithm
Model Building
• Derive a parameterized algorithm from the data
• Slow process. Done offline, as a batch process, due to amount of data involved
Model Scoring
• Use the model to predict new information
• Fast process.
  Can be done as part of stream processing

PMML
Predictive Model Markup Language
XML interchange format for analytical models
From the Data Mining Group http://www.dmg.org
Processing + models
Supported by statistics and data mining tools
• R/Rattle, SAS Enterprise Miner, SPSS, Weka
Java Evaluator API
• JPMML-Evaluator project
• Provides model scoring

Distributed, Fault Tolerant Runtime

Spring XD – Runtime – Fault Tolerance
[Sequence of eight diagrams. Labels common to all: XD Shell, XD Admin nodes (one marked leader), Container State, ZooKeeper, Data Transport. Labels that differ per slide are listed below, one line per slide.]
[Diagram: HTTP POST /streams/aStream “M1 | M2”; three XD Admins; Spring App Context; two XD Containers; M1, M2]
[Diagram: HTTP POST /streams/aStream “M1 | M2”; three XD Admins; one XD Container; M2]
[Diagram: HTTP POST /streams/aStream “M1 | M2”; three XD Admins; one XD Container; M1, M2]
[Diagram: two XD Admins; one XD Container; M1, M2]
[Diagram: two XD Admins; two XD Containers; M1, M2]
[Diagram: two XD Admins; three XD Containers; M1, M2]
[Diagram: three XD Admins; three XD Containers; M1, M2]
[Diagram: HTTP POST /streams/aStream “M3| M4”; three XD Admins; three XD Containers; M1, M3, M4, M2]
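
The fault-tolerance slides show modules M1 and M2 ending up together on the one remaining container after another container disappears, and later a second stream (M3, M4) being deployed once more containers have joined. The toy Python sketch below mimics that behavior in memory. It is not Spring XD code: the `Cluster` class, its method names, the container names, and the least-loaded placement policy are all assumptions made for illustration, and the real runtime coordinates through ZooKeeper rather than a local dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for the admin leader's view of container state."""

    # container name -> names of the modules currently deployed on it
    containers: dict[str, list[str]] = field(default_factory=dict)

    def add_container(self, name: str) -> None:
        """Register a container that has just joined the cluster."""
        self.containers.setdefault(name, [])

    def deploy(self, module: str) -> str:
        """Place one module on the least-loaded container (assumed policy)."""
        if not self.containers:
            raise RuntimeError("no containers available")
        # Ties are broken by container name so placement is deterministic.
        target = min(self.containers, key=lambda c: (len(self.containers[c]), c))
        self.containers[target].append(module)
        return target

    def deploy_stream(self, definition: str) -> None:
        """Split a definition such as "M1 | M2" on pipes and deploy each module."""
        for module in (part.strip() for part in definition.split("|")):
            if module:
                self.deploy(module)

    def remove_container(self, name: str) -> list[str]:
        """Simulate a container leaving; redeploy its modules on the survivors."""
        orphaned = self.containers.pop(name)
        for module in orphaned:
            self.deploy(module)
        return orphaned

    def placement(self) -> dict[str, list[str]]:
        """Return a sorted snapshot of which modules run where."""
        return {name: sorted(mods) for name, mods in sorted(self.containers.items())}


cluster = Cluster()
cluster.add_container("c1")
cluster.add_container("c2")
cluster.deploy_stream("M1 | M2")   # one module lands on each container
cluster.remove_container("c1")     # the module from c1 is redeployed on c2
cluster.add_container("c3")
cluster.add_container("c4")
cluster.deploy_stream("M3| M4")    # new modules go to the empty containers
print(cluster.placement())
```

In this sketch, modules that were already running stay where they are when new containers join, which matches the slides only loosely; the slides do not say which container each module lands on.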