RTR IR4 Blog

AWS Big Data IOT With Spark Streaming

Business case: Use AWS big data services to process streaming data from an IoT device. An IoT device will publish the license plate and clocked speed of vehicles as data streams into AWS and Spark Streaming...

30 Jan 2021

Airflow Pipeline Kafka Cassandra

Business case: Create an Airflow pipeline that uses Kafka for real-time stock price streaming and Cassandra for real-time stock data warehousing. Real-time stock prices can be streamed from alphavantage.com (subscription required). I will be using fake stock data...

01 Jul 2020

MQTT Spark Streaming With Watermark Part 2

Earlier we created an IoT device / MQTT client using a Raspberry Pi. We published a message from the device with a simple push-button switch to simulate an event, using a breadboard connected to the...

02 May 2020

MQTT Publish Subscribe IOT Part 1

We will create an IoT device / MQTT client using a Raspberry Pi, with a simple push-button switch to simulate an event, using a breadboard connected to the Pi device. In the event of pressing...

01 May 2020

Custom MQTT Spark Receiver

Spark Streaming’s Receivers accept data in parallel and buffer it in the memory of Spark’s worker nodes. Each input DStream is associated with a Receiver object, which receives the data from a source and stores...

15 Apr 2020

Spark Structured Streaming Integrate Kafka Streams to Cassandra Sink

Kafka streams integrated with a Cassandra sink using Spark Structured Streaming. Stream Apache web server logs from Kafka, parse them using Spark Structured Streaming, and save the parsed results to Cassandra as a table. Also...

14 Apr 2020

Spark SQL Hive

In this post we will explore using Spark SQL to read, process, and write data stored in Apache Hive. Apache Hive translates SQL queries to MapReduce or Tez jobs on your cluster. Hive distributes SQL queries with...

14 Apr 2020

Kafka Streams

Kafka Streams word count. Kafka Streams key concepts: Stream: an ordered, replayable, and fault-tolerant sequence of immutable data records, where each data record is defined as a key-value pair. A source processor consumes data from...

04 Apr 2020

Inject Spark Stream to Elasticsearch Sink With Kibana

Elasticsearch is a distributed document search and analytics engine that supports real-time search. Kibana, paired with Elasticsearch, provides interactive exploration, dashboard creation, and analysis. Amazon offers Elasticsearch as a service. In this example we will try...

16 Mar 2020

AWS EMR Spark

Steps for deploying a Spark app on Amazon EMR: Make sure there are no paths to your local filesystem used in your script! Use HDFS, S3, etc. instead. Package up your Scala project into a JAR...

15 Mar 2020