
introduction to data science w4

程序员文章站 2024-01-04 20:44:34

NumPy provides a function for simulating draws from a binomial distribution:

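np.random.binomial(n, p)        # n is the number of trials per experiment, p is the probability of success on each trial
np.random.binomial(n, p, size)  # size is the number of experiments to simulate
# For example, np.random.binomial(20, 0.5, 10000) simulates 10000 experiments of 20 coin flips each.
# The result is an array in which each entry is the number of successes (heads) in one experiment.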

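import numpy as np

x = np.random.binomial(20, .5, 10000)  # 10000 experiments of 20 fair coin flips each

print((x>=15).mean())  # fraction of experiments with at least 15 heads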
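As a sanity check, the simulated fraction can be compared with the exact binomial tail probability. The following is a minimal sketch, assuming SciPy is installed; the function name `simulate_tail_probability` and the fixed seed are illustrative choices and not part of the original notes.

```python
import numpy as np
from scipy import stats


def simulate_tail_probability(n=20, p=0.5, threshold=15, size=10000, seed=0):
    """Estimate P(X >= threshold) for X ~ Binomial(n, p) by simulation."""
    rng = np.random.default_rng(seed)  # seeded only so the sketch is reproducible (assumption)
    x = rng.binomial(n, p, size)
    return (x >= threshold).mean()


simulated = simulate_tail_probability()
# binom.sf(k, n, p) is P(X > k), so P(X >= 15) equals sf(14)
exact = stats.binom.sf(14, 20, 0.5)
print(simulated, exact)
```

The two numbers should agree to roughly two decimal places; increasing `size` tightens the agreement.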


Q: Estimate how often a tornado occurs on two consecutive days

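chance_of_tornado = 0.01  # probability of a tornado on any given day
# one Bernoulli draw per day: 1 means a tornado occurred, 0 means it did not
tornado_events = np.random.binomial(1, chance_of_tornado, 1000000)
two_days_in_a_row = 0
# start at day 1 and compare each day with the previous one, including the last day
for j in range(1, len(tornado_events)):
    if tornado_events[j]==1 and tornado_events[j-1]==1:
        two_days_in_a_row+=1
print('{} tornadoes back to back in {} years'.format(two_days_in_a_row, 1000000/365))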
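The loop can also be written without explicit iteration, and the count can be checked against a simple expectation. This is a sketch under the assumption that days are independent; the names `n_days` and `expected` and the fixed seed are introduced here for illustration only.

```python
import numpy as np

chance_of_tornado = 0.01
n_days = 1_000_000

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
tornado_events = rng.binomial(1, chance_of_tornado, n_days)

# A day counts when both it and the previous day had a tornado.
today = tornado_events[1:] == 1
yesterday = tornado_events[:-1] == 1
two_days_in_a_row = int(np.sum(today & yesterday))

# Each of the n_days - 1 adjacent pairs is a hit with probability p * p.
expected = (n_days - 1) * chance_of_tornado ** 2
print(two_days_in_a_row, expected)
```

With p = 0.01, about one adjacent pair in 10000 is a hit, so the simulated count should land near 100 over a million days.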

np.std(distribution)  # standard deviation of the samples in distribution
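To make clear what np.std computes, the sketch below evaluates the standard deviation by hand and compares it with NumPy's result. The sample here is synthetic (a normal distribution with assumed parameters), since the notes do not say how `distribution` was created.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic sample; loc and scale are assumptions for illustration
distribution = rng.normal(loc=0.75, scale=1.0, size=1000)

# Population standard deviation: square root of the mean squared deviation from the mean
manual_std = np.sqrt(np.mean((distribution - distribution.mean()) ** 2))
print(manual_std, np.std(distribution))
```

By default np.std divides by N rather than N - 1; passing `ddof=1` gives the sample standard deviation instead.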

stats.skew(distribution) returns the skewness of a distribution (stats here is scipy.stats):

chi_squared_df5 = np.random.chisquare(5, size=10000)

stats.skew(chi_squared_df5)
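The skewness of a chi-squared distribution shrinks as the degrees of freedom grow; its theoretical value is sqrt(8 / df). The sketch below compares sample skewness with that formula for a few values of df. The chosen df values, the sample size, and the `results` dictionary are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
results = {}
for df in (2, 5, 50):
    sample = rng.chisquare(df, size=100_000)
    sample_skew = stats.skew(sample)
    theoretical_skew = np.sqrt(8 / df)  # skewness of a chi-squared distribution
    results[df] = (sample_skew, theoretical_skew)
    print(df, sample_skew, theoretical_skew)
```

A larger df gives a more symmetric distribution, so the printed skewness values decrease from the first row to the last.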

Recommended book: Think Stats (O'Reilly). A PDF version is available at greenteapress.com/thinkstats2/index.html

hypothesis: a statement you can test

alternative hypothesis: there is a difference between groups

null hypothesis: there is no difference between A and B

critical value (alpha): the threshold for how much chance of wrongly rejecting the null hypothesis you are willing to accept

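To check whether two distributions differ (more precisely, whether two independent samples have different means), use a t-test. SciPy provides one.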

from scipy import stats

# In IPython/Jupyter, a trailing ? displays the documentation for ttest_ind
stats.ttest_ind?

stats.ttest_ind(early['assignment1_grade'], late['assignment1_grade'])  # just pass in the two samples; the result holds the t statistic and the p-value

If the p-value from the t-test is greater than alpha, the null hypothesis cannot be rejected.
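The decision rule can be sketched end to end on synthetic data. The groups below are made-up stand-ins for grade columns, not the course data; the helper name `reject_null`, the threshold of 0.05 (called alpha here), and the seed are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
# Synthetic stand-ins for groups of grades (not the course data)
group_a = rng.normal(loc=75, scale=10, size=500)
group_b = rng.normal(loc=75, scale=10, size=500)  # same true mean as group_a
group_c = rng.normal(loc=80, scale=10, size=500)  # true mean shifted by 5 points

alpha = 0.05  # significance threshold chosen before looking at the data


def reject_null(sample_1, sample_2, alpha):
    """Return (decision, p_value); decision is True when the null hypothesis is rejected."""
    t_stat, p_value = stats.ttest_ind(sample_1, sample_2)
    return bool(p_value < alpha), float(p_value)


print(reject_null(group_a, group_b, alpha))
print(reject_null(group_a, group_c, alpha))
```

For groups drawn from the same distribution, the test still rejects about alpha of the time by chance, which is why a p-value above alpha means only that the null hypothesis cannot be rejected, not that it is proven.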

