Week 14 Python Homework Exercise

程序员文章站 2022-05-22 08:00:21

Click to view the original problem statement.


# %matplotlib inline  (enable this magic only when running in a Jupyter notebook)
  
import random  
  
import numpy as np  
import scipy as sp  
import pandas as pd  
import matplotlib.pyplot as plt  
import seaborn as sns  
  
import statsmodels.api as sm  
import statsmodels.formula.api as smf  

import math
   
#Part 1   
anscombe = sns.load_dataset("anscombe")  
print("The mean of both x and y")
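# Select several columns with a list; the bare-tuple form ['x', 'y'] is rejected by current pandas.
print(anscombe.groupby('dataset')[['x', 'y']].mean())
print("\nThe variance of both x and y")
print(anscombe.groupby('dataset')[['x', 'y']].var())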

print("\nThe correlation coefficient between x and y")
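# Restrict to the numeric columns first: cov() on the full frame fails on the non-numeric 'dataset' column.
# Note: this pools all four datasets (44 rows) rather than computing one coefficient per dataset.
xy = anscombe[['x', 'y']]
print(xy.cov().loc['x', 'y'] / math.sqrt(xy['x'].var() * xy['y'].var()))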

print("\nThe linear regression line: \n\t(hint: use statsmodels and look at the Statsmodels notebook)")
print(smf.ols('y ~ x', anscombe).fit().summary())

#Part 2
# `height` replaces the older `size` argument, which seaborn renamed in version 0.9.
g = sns.FacetGrid(anscombe, col="dataset", hue="dataset", height=3)
g.map(plt.scatter, 'x', 'y')  
plt.show() 
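The correlation and regression in the script above are computed over all 44 rows at once, while the exercise is usually read as asking for one value per dataset. The following is a minimal sketch of a per-dataset version, not the author's solution. It assumes the standard Anscombe values (typed in here so that no download is needed) and swaps statsmodels for `np.polyfit` to fit the line; `build_frame` and `per_dataset_summary` are hypothetical helper names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Anscombe's quartet, typed in directly so the sketch runs without a download.
X_COMMON = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_COMMON,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_COMMON,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_COMMON,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def build_frame():
    """Return the quartet in the same long layout as sns.load_dataset("anscombe")."""
    frames = [
        pd.DataFrame({"dataset": name, "x": xs, "y": ys})
        for name, (xs, ys) in QUARTET.items()
    ]
    return pd.concat(frames, ignore_index=True)


def per_dataset_summary(df):
    """Correlation and least-squares line (y = intercept + slope * x) per dataset."""
    rows = []
    for name, group in df.groupby("dataset", sort=False):
        # polyfit with deg=1 returns the coefficients highest power first.
        slope, intercept = np.polyfit(group["x"], group["y"], deg=1)
        rows.append({
            "dataset": name,
            "corr": group["x"].corr(group["y"]),
            "slope": slope,
            "intercept": intercept,
        })
    return pd.DataFrame(rows).set_index("dataset")


summary = per_dataset_summary(build_frame())
print(summary.round(3))
```

Each of the four datasets should come out with a correlation near 0.816 and a fitted line close to y = 3 + 0.5x, which is the point of the quartet: nearly identical summary statistics for very different scatter plots.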

Result:

Part 1:

The mean of both x and y
           x         y
dataset               
I        9.0  7.500909
II       9.0  7.500909
III      9.0  7.500000
IV       9.0  7.500909

The variance of both x and y
            x         y
dataset                
I        11.0  4.127269
II       11.0  4.127629
III      11.0  4.122620
IV       11.0  4.123249

The correlation coefficient between x and y
0.81636624276147

The linear regression line: 
	(hint: use statsmodels and look at the Statsmodels notebook)
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.666
Model:                            OLS   Adj. R-squared:                  0.659
Method:                 Least Squares   F-statistic:                     83.92
Date:                Wed, 13 Jun 2018   Prob (F-statistic):           1.44e-11
Time:                        17:44:49   Log-Likelihood:                -67.358
No. Observations:                  44   AIC:                             138.7
Df Residuals:                      42   BIC:                             142.3
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      3.0013      0.521      5.765      0.000       1.951       4.052
x              0.4999      0.055      9.161      0.000       0.390       0.610
==============================================================================
Omnibus:                        1.513   Durbin-Watson:                   2.327
Prob(Omnibus):                  0.469   Jarque-Bera (JB):                0.896
Skew:                           0.339   Prob(JB):                        0.639
Kurtosis:                       3.167   Cond. No.                         29.1
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
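
As a sanity check on the printed results, several entries of the OLS summary can be recomputed from the correlation coefficient alone, because the model has a single regressor. This is a small sketch using only numbers copied from the output above; the variable names are chosen here for illustration.

```python
# Values copied from the printed results: the pooled correlation, the number
# of observations, and the residual degrees of freedom from the OLS summary.
r = 0.81636624276147
n_obs = 44
df_resid = 42

# With one regressor, R-squared is the square of the correlation coefficient.
r_squared = r ** 2

# Adjusted R-squared penalizes for the number of fitted parameters.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# F-statistic for a model with one degree of freedom.
f_statistic = r_squared / (1 - r_squared) * df_resid

print(round(r_squared, 3), round(adj_r_squared, 3), round(f_statistic, 2))
```

The three values agree with the R-squared (0.666), Adj. R-squared (0.659), and F-statistic (83.92) rows of the summary table.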

Part 2:

[Figure: scatter plots of y against x, one panel per dataset (I, II, III, IV)]