Week 14 Assignment
程序员文章站
2022-07-01 18:19:09
Anscombe's quartet
Anscombe's quartet comprises four datasets and is rather famous. Why? You'll find out in this exercise.
Modules:
import random
import numpy as np
import scipy as sp
import scipy.stats  # make sure the sp.stats submodule is loaded
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf
sns.set_context("talk")
Data:
anascombe = sns.load_dataset("anscombe")
print(anascombe)
Part 1
Compute the mean and variance of x and y for each dataset:
print("\nMean:")
print(anascombe.groupby("dataset").mean())
print("\nVariance:")
print(anascombe.groupby("dataset").var())
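One detail worth checking is which variance is being reported: pandas `.var()` defaults to the sample variance (divisor n - 1). The sketch below reproduces that with the standard library only, using the x values of dataset I typed in by hand; the variable names are illustrative and not part of the original assignment.

```python
import statistics

# x values of Anscombe dataset I, entered by hand for illustration
x1 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

mean_x = statistics.mean(x1)
sample_var = statistics.variance(x1)        # divides by n - 1, like pandas .var()
population_var = statistics.pvariance(x1)   # divides by n

print("mean:", mean_x)
print("sample variance:", sample_var)
print("population variance:", population_var)
```

The sample variance is the figure that should match the x column of the groupby output for this dataset.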
Result:
Compute the correlation coefficient between x and y for each dataset:
print("\nCorrelation coefficient:")
print(anascombe.groupby("dataset").x.corr(anascombe.y))
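Or:
X = []
Y = []
coefficients = []
# Each dataset occupies 11 consecutive rows, in the order I, II, III, IV
for i in range(0, 4):
    X.append(anascombe.x[i*11:i*11+11].values)
    Y.append(anascombe.y[i*11:i*11+11].values)
    # pearsonr returns (r, p-value); keep only r
    coefficients.append(sp.stats.pearsonr(X[i], Y[i])[0])
    print(coefficients[i])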
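To see what the Pearson coefficient is actually measuring, it can also be computed directly from its definition. This is a minimal sketch, assuming the dataset IV values are typed in by hand; `pearson_r` is a hypothetical helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Anscombe dataset IV, entered by hand for illustration
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

r4 = pearson_r(x4, y4)
print(round(r4, 3))
```

Dataset IV is a useful test case because ten of its eleven points share the same x value, so the coefficient is driven almost entirely by the single point at x = 19.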
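Result:
Compute the linear regression equation for each dataset:
# Reuses the X and Y lists built above
for i in range(0, 4):
    x = sm.add_constant(X[i])  # prepend a column of ones for the intercept
    model = sm.OLS(Y[i], x)
    results = model.fit()
    print("\nThe linear regression " + str(i+1))
    # params[0] is the intercept, params[1] is the slope
    print(" y = " + str(results.params[0]) + " + " + str(results.params[1]) + "x")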
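For a single predictor, the ordinary least squares fit has a closed form, which makes a handy cross-check on the statsmodels output. The sketch below applies it to dataset I with the values typed in by hand; `fit_line` is a hypothetical helper introduced only for this illustration.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Anscombe dataset I, entered by hand for illustration
x1 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

intercept, slope = fit_line(x1, y1)
print(" y = " + str(round(intercept, 2)) + " + " + str(round(slope, 2)) + "x")
```

The intercept and slope should agree with `results.params[0]` and `results.params[1]` for the first dataset, up to floating-point rounding.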
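Result:
Part 2
Scatter plots with regression lines:
# Note: seaborn 0.9 renamed the `size` argument of lmplot to `height`
sns.lmplot(x="x", y="y", col="dataset", hue="dataset", data=anascombe,
           col_wrap=2, ci=None, palette="muted", height=4,
           scatter_kws={"s": 80, "alpha": 1})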
plt.show()