Google Colab: pyspark.ml.classification, pyspark.context, pyspark.sql.session

This document summarizes a logistic regression model trained on sample LIBSVM data using PySpark. It loads the training data, trains a logistic regression model with hyperparameters, and prints the coefficients and intercept. It then trains a multinomial logistic regression model and prints the coefficient matrix and intercept vector.

Uploaded by

Darpan Sarode

Spark Assignment 2 M1084147

[9]: from google.colab import drive

[6]: from pyspark.ml.classification import LogisticRegression

[7]: from pyspark.context import SparkContext


from pyspark.sql.session import SparkSession

[8]: sc = SparkContext('local')
spark = SparkSession(sc)

[11]: training = spark.read.format("libsvm").load("sample_libsvm_data.txt")
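The LIBSVM text format loaded above stores each example as a label followed by sparse `index:value` pairs. A minimal pure-Python reader for one such line might look like this (the sample line is hypothetical, not taken from the actual data file):

```python
def parse_libsvm_line(line):
    """Parse one LIBSVM-format line into (label, {index: value})."""
    parts = line.split()
    label = float(parts[0])
    # Remaining tokens are "index:value" pairs describing the sparse features.
    features = {}
    for tok in parts[1:]:
        idx, val = tok.split(":")
        features[int(idx)] = float(val)
    return label, features

# Hypothetical sample line in the same format as sample_libsvm_data.txt
label, feats = parse_libsvm_line("1 128:51 129:159 130:253")
```

Spark's `libsvm` reader does this at scale and returns a DataFrame with `label` and (sparse) `features` columns.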

[12]: lr = LogisticRegression(maxIter=10, regParam=0.3, elasticNetParam=0.8)
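Here `regParam` (λ) sets the overall regularization strength and `elasticNetParam` (α) mixes L1 and L2 penalties: α = 0 is pure L2 (ridge), α = 1 is pure L1 (lasso). A rough sketch of the penalty term under the standard elastic-net formulation (illustrative weights, not from the model):

```python
def elastic_net_penalty(weights, reg_param=0.3, elastic_net_param=0.8):
    """Elastic-net penalty: lambda * (alpha * ||w||_1 + (1 - alpha)/2 * ||w||_2^2)."""
    l1 = sum(abs(w) for w in weights)          # L1 norm drives sparsity
    l2_sq = sum(w * w for w in weights)        # squared L2 norm shrinks weights
    lam, alpha = reg_param, elastic_net_param
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_sq)

penalty = elastic_net_penalty([0.5, -0.5])
```

With α = 0.8 the penalty is mostly L1, which is why so many of the 692 coefficients printed below are exactly zero or near zero.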

[13]: lrModel = lr.fit(training)

[18]: lrModel.coefficients

[18]: SparseVector(692, {272: -0.0001, 300: -0.0001, 323: 0.0, 350: 0.0004, 351: 0.0003, 378: 0.0006, 379: 0.0004, 405: 0.0004, 406: 0.0008, 407: 0.0005, 428: -0.0, 433: 0.0006, 434: 0.0009, 435: 0.0001, 455: -0.0, 456: -0.0, 461: 0.0005, 462: 0.0008, 483: -0.0001, 484: -0.0, 489: 0.0005, 490: 0.0005, 496: -0.0, 511: -0.0003, 512: -0.0001, 517: 0.0005, 539: -0.0001, 540: -0.0004, 568: -0.0001})

[14]: print("Coefficients: " + str(lrModel.coefficients))
      print("Intercept: " + str(lrModel.intercept))

Coefficients: (692,[272,300,323,350,351,378,379,405,406,407,428,433,434,435,455,456,461,462,483,484,489,490,496,511,512,517,539,540,568],[-7.52068987138421e-05,-8.115773146847101e-05,3.814692771846369e-05,0.0003776490540424337,0.00034051483661944103,0.0005514455157343105,0.0004085386116096913,0.000419746733274946,0.0008119171358670028,0.0005027708372668751,-2.3929260406601844e-05,0.000574504802090229,0.0009037546426803721,7.818229700244018e-05,-2.1787551952912764e-05,-3.4021658217896256e-05,0.0004966517360637634,0.0008190557828370367,-8.017982139522704e-05,-2.7431694037836214e-05,0.0004810832226238988,0.00048408017626778765,-8.926472920011488e-06,-0.00034148812330427335,-8.950592574121486e-05,0.00048645469116892167,-8.478698005186209e-05,-0.0004234783215831763,-7.29653577763134e-05])

Intercept: -0.5991460286401435
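The fitted coefficients and intercept define the binary model's prediction rule: P(y=1 | x) = sigmoid(w·x + b). A small pure-Python sketch of that scoring step, using the intercept printed above (the empty coefficient/feature dicts are just an illustrative edge case, not real model data):

```python
import math

def predict_prob(features, coefficients, intercept):
    """Binary logistic regression score: P(y=1|x) = sigmoid(w.x + b).
    `coefficients` and `features` are sparse {index: value} dicts."""
    z = intercept + sum(coefficients.get(i, 0.0) * v for i, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# With an all-zero feature vector the score reduces to sigmoid(intercept).
p = predict_prob({}, {}, -0.5991460286401435)
```

Because the intercept is negative, an input that activates none of the nonzero coefficients is scored below 0.5, i.e. assigned to class 0.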
[15]: mlr = LogisticRegression(maxIter=10, regParam=0.3, elasticNetParam=0.8, family="multinomial")

[16]: mlrModel = mlr.fit(training)

[17]: print("Multinomial coefficients: " + str(mlrModel.coefficientMatrix))
      print("Multinomial intercepts: " + str(mlrModel.interceptVector))

Multinomial coefficients: 2 X 692 CSRMatrix
(0,272) 0.0001
(0,300) 0.0001
(0,350) -0.0002
(0,351) -0.0001
(0,378) -0.0003
(0,379) -0.0002
(0,405) -0.0002
(0,406) -0.0004
(0,407) -0.0002
(0,433) -0.0003
(0,434) -0.0005
(0,435) -0.0001
(0,456) 0.0
(0,461) -0.0002
(0,462) -0.0004
(0,483) 0.0001
..
..
Multinomial intercepts: [0.2750587585718093,-0.2750587585718093]
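In the multinomial family, each of the two classes gets its own row of coefficients and its own intercept, and class probabilities come from a softmax over the per-class scores. A sketch of that final step, fed with the two intercepts printed above (which are exactly the class scores for an all-zero input):

```python
import math

def softmax(scores):
    """Convert raw per-class scores into probabilities (numerically stabilized)."""
    m = max(scores)                      # subtract max to avoid overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Per-class scores for an all-zero input reduce to the intercept vector.
probs = softmax([0.2750587585718093, -0.2750587585718093])
```

Note the two intercepts are exact negatives of each other: for two classes, the multinomial parameterization is an over-parameterized version of the binary model.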
