SPA Notes

Uploaded by

2022dc04246

Notes - Session-01 , SPA [ 25-05-2024 ] , 3.50 PM
==================================

Instructor : SURYA PRAKASH GOTETI ( [email protected] )

Objectives

- Up to Lecture 6 : Applications and Architectures.
- Lecture 14-16 : Streaming Algorithms
- Other Lectures : Tools and techniques
- Refer Handouts
- Assignment 1 & 2 : Demo video of 5 mins & upload into Canvas .
- Assignment 1 : Lecture-6
- Assignment 2 : Lecture-10,11
- Quiz-1 : Lecture-7 , Quiz-2 : Lecture-13

Agenda :

- Scope
- Assessment
- Key aspects : Streaming platforms , Spark Structured Streaming ( RDD ,
Structured Streaming ) , Databricks

Streaming : Event Streaming

- An event is a data record , in the context of data streaming . Examples :
- Online transactions through cards .
- System logs ( timestamp , log info ) , Monitoring & Control
- An event is an immutable fact about something that occurred in a software system .
- Immutable : an event is not changed once it is recorded ; the stream of such records
is potentially endless and constantly evolving . Immutable by design - e.g. time spent ,
amount of txn , POS , Ref
- Stream processing is the act of performing continuous calculations on a potentially
endless and continuously evolving source of data .
- Enrichment : Aggregation , filtering , actions .
- Enrichment of event : Data at rest , Data in motion ( Batch / Stream )
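
A minimal sketch of the ideas above, in Python : events are modelled as immutable records and stream processing as a continuous filter plus running aggregation over a potentially endless source . The names CardTxn , enrich and sample_stream , the field names and the sample values are all hypothetical , chosen only for illustration ; this is not taken from the lecture material .

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)  # frozen=True makes each event immutable after creation
class CardTxn:
    ref: str          # transaction reference (hypothetical field)
    pos: str          # point of sale (hypothetical field)
    amount: float     # amount of the transaction
    timestamp: float  # event time in seconds


def enrich(events: Iterable[CardTxn], min_amount: float) -> Iterator[Tuple[CardTxn, float]]:
    """Filter out small transactions and keep a running total per point of sale.

    Results are yielded one event at a time, so the source may be endless.
    """
    totals: Dict[str, float] = {}
    for event in events:
        if event.amount < min_amount:  # filtering
            continue
        totals[event.pos] = totals.get(event.pos, 0.0) + event.amount  # aggregation
        yield event, totals[event.pos]


def sample_stream() -> Iterator[CardTxn]:
    """Stand-in for a live source; a real stream would never terminate."""
    yield CardTxn("r1", "store-A", 120.0, 1.0)
    yield CardTxn("r2", "store-B", 5.0, 2.0)
    yield CardTxn("r3", "store-A", 80.0, 3.0)


results = [(event.ref, total) for event, total in enrich(sample_stream(), min_amount=10.0)]
print(results)  # [('r1', 120.0), ('r3', 200.0)]
```

Because enrich is a generator , it never needs the whole stream in memory , which is the main difference from processing the same data at rest as a batch .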

Notes - Session-02 , SPA [ 01-06-2024 ] , 3.50 PM
=================================

1 . Assignment-01 : 29th June - 14th July , Quiz-01 : 6th-7th July .
Assignment-02 : 24th Aug - 08th Sept , Quiz-02 : 14th-15th Sept

2 . Agenda :

- Characteristics of Data
- Functional and Non-Functional requirements pertaining to data-intensive
applications .

3 . 5 V's : Volume , Velocity , Variety , Veracity , Value

4 . Computational model : Data Representation , Operation .

5 . Data processing applications

6 . Data systems

7 . Non-Functional requirements for Data Systems :

- Reliability
- Scalability
- Maintainability

8 . Web Analytics Application , Scaling with intermediate layer , Scaling with
Database partitions

9 . What are the bottlenecks / issues ?

10 . Rise of Big Data Systems

11 . Big Data systems :

12 . Desired properties of Big Data Systems .

13 . Data Model of Big Data Systems
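
Item 8 above mentions scaling a web analytics application with database partitions . A minimal sketch of that idea , assuming hash partitioning on the page URL : each in-memory dict stands in for one database partition , and the names partition_for , record_pageview and pageviews are hypothetical .

```python
import hashlib
from typing import Dict, List

NUM_PARTITIONS = 4  # assumed partition count, for illustration only

# Each dict stands in for one database partition (shard).
partitions: List[Dict[str, int]] = [dict() for _ in range(NUM_PARTITIONS)]


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def record_pageview(url: str) -> None:
    """Increment the pageview counter in the partition that owns this URL."""
    shard = partitions[partition_for(url, NUM_PARTITIONS)]
    shard[url] = shard.get(url, 0) + 1


def pageviews(url: str) -> int:
    """Read the counter back from the same partition."""
    return partitions[partition_for(url, NUM_PARTITIONS)].get(url, 0)


for url in ["/home", "/about", "/home"]:
    record_pageview(url)

print(pageviews("/home"))  # 2
```

One bottleneck this sketch makes visible : changing NUM_PARTITIONS changes where most keys map , so adding a partition means moving existing data ( re-sharding ) .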

Notes - Session-03 , SPA [ 08-06-2024 ] , 3.50 PM
===================================

From Session-02
===============

1 . Data Model for Big Data

- Properties of Data : Rawness , Immutability , Eternity

2 . Fact based model for Data

- Facts : Data is growing in one direction infinitely .
- Benefits
- Structure / Schema
- Different instances are associated with a relationship .
- Aspects of ( Traditional & Big Data ) : Flexibility , analytics ,
Architectures , Sourcing , EDA

3 . Architecture of Big Data System

- Reference :
https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/big-data
- Data warehousing , Data Lake , Lakehouse ( Databricks )
- Components , Advantages
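
The fact-based model listed above ( raw , immutable , eternal data that only grows ) can be sketched as an append-only store in which the current value is derived by a query instead of being overwritten . The Fact and FactStore names , the field layout and the sample values are assumptions made for illustration .

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)  # a fact is never modified once recorded
class Fact:
    entity: str     # who or what the fact is about
    attribute: str  # which property it describes
    value: Any
    timestamp: int  # when the fact became true


class FactStore:
    """Append-only store: facts are added, never updated or deleted."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []

    def append(self, fact: Fact) -> None:
        self._facts.append(fact)

    def history(self, entity: str, attribute: str) -> List[Fact]:
        """All facts for one entity/attribute pair, oldest first."""
        matching = [f for f in self._facts
                    if f.entity == entity and f.attribute == attribute]
        return sorted(matching, key=lambda f: f.timestamp)

    def current(self, entity: str, attribute: str) -> Optional[Any]:
        """Derive the latest value by querying the facts."""
        facts = self.history(entity, attribute)
        return facts[-1].value if facts else None

    def __len__(self) -> int:
        return len(self._facts)


store = FactStore()
store.append(Fact("user-1", "city", "Chennai", timestamp=1))
store.append(Fact("user-1", "city", "Pune", timestamp=2))

print(store.current("user-1", "city"))  # Pune
print(len(store))                       # 2 (the older fact is still kept)
```

Keeping every fact means a wrong derived value can be recomputed from the raw data , which is one of the benefits usually claimed for this model .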

From Session-03
================

1 . Classification of Real Time Systems :

- Hard , Soft , Near real time

2 . Difference between Real time and Stream Processing :

- Real Time stream processing .
- Streaming data system

3 . Difference between Batch Processing and Stream Processing


Notes - Session-04 , SPA [ 15-06-2024 ] , 3.50 PM
===================================

From Session-03

1 . Uses of Stream Processing

- Examples
- Credit card fraud detection
- Stock trading
- Defective manufacturing process

2 . Other Applications
- Complex Event Processing ( CEP )
- Stream Analytics
- Materialized view
- Stream Searching

3 . Sources of Streaming Data

- Operational Monitoring
- Web analytics
- Online Advertising
- Social Media
- Mobile data & IoT
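
The credit card fraud detection example listed above can be sketched as a simple velocity rule over a sliding time window : a card is flagged when it makes more than a set number of transactions within a set number of seconds . The VelocityRule class , its thresholds and the assumption that timestamps arrive in order per card are all illustrative ; real fraud detection uses far richer signals .

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class VelocityRule:
    """Flag a card that exceeds max_txns transactions within window_seconds.

    Assumes timestamps for each card arrive in non-decreasing order.
    """

    def __init__(self, max_txns: int, window_seconds: float) -> None:
        self.max_txns = max_txns
        self.window_seconds = window_seconds
        self._recent: Dict[str, Deque[float]] = defaultdict(deque)

    def observe(self, card_id: str, timestamp: float) -> bool:
        """Record one transaction; return True if the card looks suspicious."""
        window = self._recent[card_id]
        window.append(timestamp)
        # Evict transactions that have fallen out of the sliding window.
        while window and timestamp - window[0] >= self.window_seconds:
            window.popleft()
        return len(window) > self.max_txns


rule = VelocityRule(max_txns=3, window_seconds=60)
flags = [rule.observe("card-1", t) for t in (0, 10, 20, 30, 200)]
print(flags)  # [False, False, False, True, False]
```

State is kept per card and updated on every event , so the decision is available as soon as the transaction arrives instead of after a batch run .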

From Session-04

1 . Streaming Data System Components

- Collection
- Data Flow
- Processing
- Storage
- Delivery
2 . Generalized Architecture
- Collection System
- Data Flow Tier
- Processing / Analytics Tier
- Storage Tier
- Delivery Layer
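
A toy sketch of how the five tiers above could fit together in a single process . In a real system each tier would be a separate service ( for example a message broker for the data flow tier ) ; here a deque and a Counter stand in for them , and the function names and event format are assumptions made for illustration .

```python
import json
from collections import Counter, deque
from typing import Deque, Iterable, Iterator, List, Optional, Tuple


def collect(raw_lines: Iterable[str]) -> Iterator[dict]:
    """Collection tier: parse incoming records, dropping malformed ones."""
    for line in raw_lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def process(event: dict) -> Optional[str]:
    """Processing tier: keep only pageview events and extract the page."""
    if event.get("type") == "pageview":
        return event.get("page")
    return None


def deliver(storage: Counter) -> List[Tuple[str, int]]:
    """Delivery tier: expose results, most viewed page first."""
    return storage.most_common()


def run_pipeline(raw_lines: Iterable[str]) -> List[Tuple[str, int]]:
    queue: Deque[dict] = deque()  # data flow tier: buffer between tiers
    storage: Counter = Counter()  # storage tier: pageview counts
    for event in collect(raw_lines):
        queue.append(event)
    while queue:
        page = process(queue.popleft())
        if page is not None:
            storage[page] += 1
    return deliver(storage)


raw = [
    '{"type": "pageview", "page": "/home"}',
    'not json',
    '{"type": "click", "page": "/home"}',
    '{"type": "pageview", "page": "/home"}',
    '{"type": "pageview", "page": "/about"}',
]
print(run_pipeline(raw))  # [('/home', 2), ('/about', 1)]
```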

Notes - Webinar-01 , SPA [ 20-06-2024 ] , 7.30 PM
===================================

Apache Samza

1 . Introduction & Background
2 . Overview of Apache Samza
3 . How it Works
4 . Key Concepts
5 . Use Cases of Stream Processing

Notes - Session-05 , SPA [ 22-06-2024 ] , 3.50 PM
===================================

1 . Analysis tier : Data processing or event processing .

2 . Architecture for data processing .
- Lambda architecture .
- Kappa architecture .

3 . A case study problem . [ Refer document provided ]

- 3 Business Opportunities : Customer Segmentation , Product recommendation , More
selling , etc .
- You can go ahead with Lambda architecture .

4 . Real time system characteristics .

- Distinguishing Features of Streaming Data
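
A minimal sketch of the Lambda architecture named in item 2 above , assuming the usual three parts : a batch layer that recomputes a view from the full master dataset , a speed layer that keeps an incremental view of events not yet covered by a batch run , and a serving step that merges the two at query time . The class name LambdaCounts and the purchases-per-customer example are hypothetical . In a Kappa architecture there would be no separate batch path ; views would be rebuilt by replaying the same event log through the streaming code .

```python
from collections import Counter
from typing import List


class LambdaCounts:
    """Purchases per customer, served from a batch view plus a speed view."""

    def __init__(self) -> None:
        self.master: List[str] = []  # append-only master dataset
        self.batch_view: Counter = Counter()  # recomputed from scratch
        self.speed_view: Counter = Counter()  # incremental, recent events only

    def ingest(self, customer_id: str) -> None:
        """New events go to both the master dataset and the speed layer."""
        self.master.append(customer_id)
        self.speed_view[customer_id] += 1

    def run_batch(self) -> None:
        """Batch layer: rebuild the view from all data, then reset the speed view."""
        self.batch_view = Counter(self.master)
        self.speed_view.clear()

    def query(self, customer_id: str) -> int:
        """Serving step: merge the batch and speed views."""
        return self.batch_view[customer_id] + self.speed_view[customer_id]


counts = LambdaCounts()
for customer in ["c1", "c1", "c2"]:
    counts.ingest(customer)
before_batch = counts.query("c1")  # answered by the speed view only

counts.run_batch()
counts.ingest("c1")
after_batch = counts.query("c1")   # batch view (2) + speed view (1)

print(before_batch, after_batch)   # 2 3
```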

From Lecture Session-05:

1 . Service Configuration and Co-ordination Systems .

- Distributed Applications
- Motivation
- Distributed State Management
