I have created a heatmap that displays the correlation between all of the columns in a dataset of random numbers. The heatmap renders without errors, but it comes out very small, especially in the vertical direction. I attached an image of the heatmap to this post. The dataset is a pandas DataFrame loaded from a CSV file. The code is shown below:
import numpy as np
import matplotlib.pyplot as plt

# dfT is the pandas DataFrame loaded from the CSV file (defined elsewhere)

def colCorrelation():
    xData = []
    yData = []
    fig, ax = plt.subplots(figsize=(5,5))
    # calculates the correlation between all columns and all other columns
    for i in range(0,100):
        for e in range(0,100):
            ## tuple of the two columns being correlated and their correlation
            ## in the dictionary as key value pairs data structure.
            ## Ex: {(19, 17): -0.015262993060948592}
            dataFlow = dict(zip([(i,e+1)],
                                [np.corrcoef(dfT[i],dfT[e+1])[0,1]]))
            # keep only pairs whose correlation is below 0.9
            if list(dataFlow.values())[0] < .9:
                xData.append(list(dataFlow.keys())[0][0])
                yData.append(list(dataFlow.values())[0])
    ## Plot heatmap
    heatmap, xedges, yedges = np.histogram2d(xData,yData,bins=(50))
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
    plt.clf()
    plt.title('Random Data heatmap')
    plt.ylabel('y')
    plt.xlabel('x')
    plt.imshow(heatmap,extent=extent)
    plt.show()

colCorrelation()
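
A possible explanation, offered as a guess since it depends on the actual data: `imshow` uses an equal aspect ratio by default, so one data unit on the x axis takes the same screen length as one data unit on the y axis. Here `extent` spans roughly 0 to 99 in x (column indices) but only a fraction of a unit in y (correlation values), which would squash the image vertically. The sketch below passes `aspect="auto"` to let the image fill the axes. It runs on made-up random data rather than the real CSV, the names `df`, `corr`, `x_data`, and `y_data` are illustrative, and it swaps the nested loop for `DataFrame.corr()`, so the selected pairs are similar to, but not exactly, those in the code above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the real dataset (assumption): 200 rows, 101 random columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((200, 101)))

# Pairwise correlation of every column with every other column.
corr = df.corr().to_numpy()

# Keep (column index, correlation) pairs with correlation below 0.9.
# This also drops the diagonal, where each column correlates 1.0 with itself.
rows, _ = np.indices(corr.shape)
mask = corr < 0.9
x_data = rows[mask]
y_data = corr[mask]

heatmap, xedges, yedges = np.histogram2d(x_data, y_data, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]

fig, ax = plt.subplots(figsize=(5, 5))
# histogram2d puts x along the first array axis, so transpose and draw from
# the bottom so that x runs horizontally and y runs upward.
# aspect="auto" stretches the image to fill the axes instead of forcing
# equal scaling of x and y data units.
im = ax.imshow(heatmap.T, extent=extent, origin="lower", aspect="auto")
ax.set_title("Random Data heatmap")
ax.set_xlabel("x")
ax.set_ylabel("y")
fig.colorbar(im, ax=ax)
# plt.show()  # uncomment to display the figure interactively
```

In the original function the same idea would amount to changing the last plotting call to `plt.imshow(heatmap, extent=extent, aspect='auto')`. Note also that `plt.clf()` clears the figure created by `plt.subplots`, although the figure size itself should be kept.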