SPA Notes

Uploaded by

2022dc04246

Notes - Session-01 , SPA [ 25-05-2024 ] , 3.50 PM
==================================

Instructor : SURYA PRAKASH GOTETI

Objectives

- Up to Lecture 6 : Applications and Architectures
- Lecture 14-16 : Streaming Algorithms
- Other Lectures : Tools and techniques
- Refer Handouts
- Assignment 1 & 2 : Demo video of 5 mins & upload into Canvas .
- Assignment 1 : 6
- Assignment 2 : Lecture-10,11
- Quiz-1 : Lecture-7 , Quiz-2 : Lecture-13

Agenda :

- Scope
- Assessment
- Key aspects : Streaming platforms , Spark Structured Streaming ( RDD ,
Structured Streaming ) , Databricks

Streaming : Event Streaming

- An event is a data record , in the context of data streaming .
- Online transactions through cards .
- System logs ( timestamp , log info ) , Monitoring & Control
- An event is an immutable fact about something that occurred in a software system .
- Immutable : once recorded , an event is never changed ; the stream of records is
potentially endless and constantly growing . Immutable by design - time spent ,
amount of txn , POS , Ref
- Stream processing is the act of performing continuous calculations on a potentially
endless and continuously evolving source of data .
- Enrichment : Aggregation , filtering , actions .
- Enrichment of event : Data at rest , Data in motion ( Batch / Stream )
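
The points above (immutable events, continuous calculation, filtering and aggregation) can be sketched in Python. This is only an illustration, not code from the lecture: the `CardTxn` fields, the sample values and the 100.0 threshold are all assumptions, and a short list stands in for a potentially endless source.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)  # frozen: an event cannot be modified after creation
class CardTxn:
    card_id: str   # hypothetical field names, chosen for illustration
    amount: float
    pos: str       # point of sale


def large_txns(events: Iterable[CardTxn], min_amount: float) -> Iterator[CardTxn]:
    """Filtering: pass through only events at or above min_amount."""
    for event in events:
        if event.amount >= min_amount:
            yield event


def running_total(events: Iterable[CardTxn]) -> Iterator[float]:
    """Aggregation: emit an updated total after every event."""
    total = 0.0
    for event in events:
        total += event.amount
        yield total


# A small finite list stands in for an endless stream of events.
sample = [
    CardTxn("c1", 40.0, "shop-a"),
    CardTxn("c2", 250.0, "shop-b"),
    CardTxn("c1", 120.0, "shop-a"),
]
totals = list(running_total(large_txns(sample, min_amount=100.0)))
print(totals)
```

Because both steps are generators, each event is handled as it arrives and no step needs the whole dataset, which is the difference from processing data at rest.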

Notes - Session-02 , SPA [ 01-06-2024 ] , 3.50 PM
=================================

1 . Assignment-01 : 29th June - 14th July , Quiz-01 : 6th-7th July .
Assignment-02 : 24th Aug - 08th Sept , Quiz-02 : 14th-15th Sept .

2 . Agenda :

- Characteristics of Data
- Functional and Non-Functional requirements pertaining to data-intensive
applications .

3 . 5V's : Volume , Velocity , Variety , Veracity , Value

4 . Computational model : Data Representation , Operation .

5 . Data processing applications

6 . Data systems

7 . Non-Functional requirements for Data Systems :

- Reliability
- Scalability
- Maintainability

8 . Web Analytics Application , Scaling with intermediate layer , Scaling with
Database partitions
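
The notes only name these two scaling steps, so the following Python sketch is one possible reading. It assumes the web analytics application counts page views per URL, that the intermediate layer batches increments before writing, and that partitions are chosen by a stable hash of the URL. `NUM_SHARDS`, `shard_for` and `apply_batch` are hypothetical names, and in-memory counters stand in for real database partitions.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed number of database partitions


def shard_for(url: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a partition from a stable hash of the key.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so would not route consistently.
    """
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# One in-memory Counter stands in for each database partition.
shards = [Counter() for _ in range(NUM_SHARDS)]


def apply_batch(pageviews: list) -> None:
    """Intermediate layer: collapse a batch into one increment per URL,
    then route each increment to the partition that owns that URL."""
    for url, count in Counter(pageviews).items():
        shards[shard_for(url)][url] += count


apply_batch(["/home", "/cart", "/home"])
apply_batch(["/home"])
print(shards[shard_for("/home")]["/home"])
```

Batching reduces the number of writes per partition, and hashing spreads keys across partitions. Changing `NUM_SHARDS` would move keys between partitions, which is one of the operational issues item 9 asks about.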

9 . What are the bottleneck/issues ?

10 . Rise of Big Data Systems

11 . Big Data systems :

12 . Desired properties of Big Data Systems .

13 . Data Model of Big Data Systems

Notes - Session-03 , SPA [ 08-06-2024 ] , 3.50 PM
===================================

From Session-02
===============

1 . Data Model for Big Data

- Properties of Data : Rawness , Immutability , Eternity

2 . Fact based model for Data

- Facts : Data only grows , in one direction ( append-only ) , potentially without bound .
- Benefits
- Structure / Schema
- Different instances are associated with a relationship .
- Aspects of ( Traditional & Big Data ) : Flexibility , analytics ,
Architectures , Sourcing , EDA
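
A minimal Python sketch of the fact-based model, assuming each fact is an immutable, timestamped record and that current state is derived by a query and never updated in place. The `Fact` fields and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # immutability: a recorded fact is never edited
class Fact:
    entity: str
    attribute: str
    value: str
    timestamp: int  # when the fact became true (assumed integer clock)


facts: list = []  # append-only store: the data grows in one direction


def record(fact: Fact) -> None:
    """Add a new fact; existing facts are never changed or deleted."""
    facts.append(fact)


def current_value(entity: str, attribute: str) -> Optional[str]:
    """Derive the current state as the latest matching fact."""
    matching = [f for f in facts if f.entity == entity and f.attribute == attribute]
    if not matching:
        return None
    return max(matching, key=lambda f: f.timestamp).value


record(Fact("user-1", "city", "Pune", 100))
record(Fact("user-1", "city", "Hyderabad", 200))
print(current_value("user-1", "city"), len(facts))
```

The older fact stays in the store, so history remains queryable, and a mistaken derivation can be fixed by recomputing from the raw facts.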

3 . Architecture of Big Data System

- Reference :
https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/big-data
- Data warehousing , Data Lake , Lakehouse ( Databricks )
- Components , Advantages

From Session-03
================

1 . Classification of Real Time Systems :

- Hard , Soft , Near real time

2 . Difference between Real time and Stream Processing :

- Real Time stream processing .
- Streaming data system

3 . Difference between Batch Processing and Stream Processing
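
One way to see the difference, sketched in Python under simple assumptions: the batch function needs the whole bounded dataset before it answers, while the streaming function keeps a small running state and emits an updated answer after every element. The function names and readings are illustrative only.

```python
from typing import Iterable, Iterator, List


def batch_average(values: List[float]) -> float:
    """Batch: the complete, bounded dataset is available up front."""
    return sum(values) / len(values)


def streaming_average(values: Iterable[float]) -> Iterator[float]:
    """Stream: update the mean incrementally, using constant state."""
    count = 0
    mean = 0.0
    for value in values:
        count += 1
        mean += (value - mean) / count  # incremental mean update
        yield mean


readings = [10.0, 20.0, 30.0, 40.0]
print(batch_average(readings))
print(list(streaming_average(readings)))
```

Both arrive at the same final value, but the streaming version has a result available at every point and never stores the input.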


Notes - Session-04 , SPA [ 15-06-2024 ] , 3.50 PM
===================================

From Session-03

1 . Uses of Stream Processing

- Examples
- Credit card fraud detection
- Stock trading
- Defective manufacturing process
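
As a hedged illustration of the credit card example, the sketch below flags a card when too many transactions arrive within a short time window. The rule, its thresholds and the `VelocityRule` name are assumptions for illustration and not a real fraud model. It also assumes timestamps arrive in order.

```python
from collections import defaultdict, deque


class VelocityRule:
    """Flag a card when more than max_txns occur within window_s seconds."""

    def __init__(self, max_txns: int, window_s: int):
        self.max_txns = max_txns
        self.window_s = window_s
        self._recent = defaultdict(deque)  # card_id -> recent timestamps

    def observe(self, card_id: str, ts: int) -> bool:
        """Record one transaction and return True if the card looks suspicious."""
        recent = self._recent[card_id]
        recent.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and recent[0] <= ts - self.window_s:
            recent.popleft()
        return len(recent) > self.max_txns


rule = VelocityRule(max_txns=2, window_s=60)
flags = [rule.observe("c1", t) for t in (0, 10, 20, 100)]
print(flags)
```

The decision is made per event as it arrives, which is why this kind of check fits stream processing better than a nightly batch job.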

2 . Other Applications
- Complex Event Processing ( CEP )
- Stream Analytics
- Materialized view
- Stream Searching

3 . Sources of Streaming Data

- Operational Monitoring
- Web analytics
- Online Advertising
- Social Media
- Mobile data & IoT

From Session-04

1 . Streaming Data System Components

- Collection
- Data Flow
- Processing
- Storage
- Delivery
2 . Generalized Architecture
- Collection System
- Data Flow Tier
- Processing / Analytics Tier
- Storage Tier
- Delivery Layer
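
The five tiers can be sketched as a toy in-process pipeline in Python. Everything here is an assumption for illustration: the log-line format, the function names, and the use of a `deque` and a `Counter` in place of a real message broker and database.

```python
from collections import Counter, deque

data_flow = deque()   # data flow tier: buffer between collection and processing
storage = Counter()   # storage tier: in-memory stand-in for a database


def collect(raw_line: str) -> None:
    """Collection: parse an incoming log line and hand it to the data flow tier."""
    level, _, message = raw_line.partition(" ")
    data_flow.append({"level": level, "message": message})


def process() -> None:
    """Processing / analytics: drain the buffer and count events per level."""
    while data_flow:
        event = data_flow.popleft()
        storage[event["level"]] += 1


def deliver() -> dict:
    """Delivery: expose the stored result to consumers."""
    return dict(storage)


for line in ["ERROR disk full", "INFO started", "ERROR timeout"]:
    collect(line)
process()
print(deliver())
```

The buffer decouples collection from processing, so either side can be scaled or restarted without the other knowing.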

Notes - Webinar-01 , SPA [ 20-06-2024 ] , 7.30 PM
===================================

Apache Samza

1.Introduction & Background
2.Overview of Apache Samza
3.How it Works
4.Key Concepts
5.Use Cases of Stream Processing

Notes - Session-05 , SPA [ 22-06-2024 ] , 3.50 PM
===================================

1 . Analysis tier : Data processing or event processing .

2 . Architecture for data processing .
- Lambda architecture .
- Kappa architecture .
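
A minimal Python sketch of the Lambda idea, assuming a simple per-key count: a batch view is recomputed from the master dataset, a speed view is updated incrementally for events that arrived since the last batch run, and a query merges the two. All names and data are hypothetical. In a Kappa architecture there would be no separate batch view, only the stream path, with recomputation done by replaying the stream.

```python
from collections import Counter

master_dataset = ["a", "b", "a"]  # all events up to the last batch run
recent_events = []                 # events that arrived after that run


def batch_view(events: list) -> Counter:
    """Batch layer: recompute the view from scratch over the master dataset."""
    return Counter(events)


batch = batch_view(master_dataset)
speed = Counter()  # speed layer: incremental view over recent events only


def on_event(key: str) -> None:
    """Speed layer update for one newly arrived event."""
    recent_events.append(key)
    speed[key] += 1


def query(key: str) -> int:
    """Serving layer: merge the batch view and the speed view."""
    return batch[key] + speed[key]


on_event("a")
on_event("c")
print(query("a"), query("b"), query("c"))
```

After the next batch run absorbs `recent_events` into the master dataset, the speed view would be reset, which this sketch leaves out.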

3 . A case study problem . [ Refer document provided ]

- 3 Business Opportunities : Customer Segmentation , Product recommendation , More
selling , etc
- You can go ahead with Lambda architecture .

4 . Real time system characteristics .

- Distinguishing Features of Streaming Data

From Lecture Session-05:

1 . Service Configuration and Co-ordination Systems .

- Distributed Applications
- Motivation
- Distributed State Management
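
To make the motivation concrete, here is a single-process Python sketch of one idea that coordination services commonly provide: versioned writes, so that a client holding a stale view cannot overwrite newer shared state. It is an assumption-laden toy (the `ConfigStore` class and its methods are hypothetical), not the API of any real coordination system, and a lock stands in for the distributed agreement a real service would need.

```python
import threading


class ConfigStore:
    """Toy shared-state store with optimistic, version-checked writes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (value, version)

    def get(self, key: str):
        """Return (value, version); a missing key reads as (None, 0)."""
        with self._lock:
            return self._data.get(key, (None, 0))

    def set_if_version(self, key: str, value: str, expected_version: int) -> bool:
        """Write only if the caller has seen the latest version."""
        with self._lock:
            _, version = self._data.get(key, (None, 0))
            if version != expected_version:
                return False  # caller's view is stale, reject the write
            self._data[key] = (value, version + 1)
            return True


store = ConfigStore()
ok_first = store.set_if_version("leader", "node-1", expected_version=0)
ok_stale = store.set_if_version("leader", "node-2", expected_version=0)
print(ok_first, ok_stale, store.get("leader"))
```

Two nodes racing to claim the same key cannot both succeed, which is the kind of guarantee distributed applications rely on for configuration and leader election.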
