Visualizing the 19th Presidential Election Results
In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
In [22]:
election_result = pd.read_csv('./data/05. election_result.csv', index_col=0)
election_result.head(3)
Out[22]:
1. Shortening province and metropolitan city names to two characters
In [23]:
sido_candi = election_result['광역시도']
sido_candi = [name[:2] if name[:2] in ['서울', '부산', '대구', '광주', '인천', '대전', '울산'] else '' for name in sido_candi]
In [24]:
def cut_char_sigu(name):
    # Keep two-character names as-is; otherwise drop the trailing 시/군/구
    return name if len(name)==2 else name[:-1]
Cities that appear several times (one row per district) keep the city name as a prefix
- e.g. 성남 중원 / 성남 분당 / 성남 수정
In [25]:
import re
sigun_candi = ['']*len(election_result)
for n in election_result.index:
    each = election_result['시군'][n]
    if each[:2] in ['수원', '성남', '안양', '안산', '고양', '용인', '청주', '천안', '전주', '포항', '창원']:
        # Multi-district city: "<city> <district>", each without its unit suffix
        sigun_candi[n] = re.split('시', each)[0] + ' '+ cut_char_sigu(re.split('시', each)[1])
    else:
        sigun_candi[n] = cut_char_sigu(each)
print(sigun_candi)
In [26]:
ID_candi = [sido_candi[n]+' '+sigun_candi[n] for n in range(len(sigun_candi))]
ID_candi = [name[1:] if name[0] == ' ' else name for name in ID_candi]
ID_candi = [name[:2] if name[:2] == '세종' else name for name in ID_candi]
print(ID_candi)
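The ID-building steps above can be condensed into one function to make the rule easier to check on a few names. This is only a sketch: `make_id` is a hypothetical helper that is not part of the original notebook, and the sample inputs assume the CSV stores multi-district cities without a space (for example `성남시중원구`), which is what the `re.split` logic above implies.

```python
import re

METROS = ['서울', '부산', '대구', '광주', '인천', '대전', '울산']
MULTI_DISTRICT = ['수원', '성남', '안양', '안산', '고양', '용인',
                  '청주', '천안', '전주', '포항', '창원']

def cut_char_sigu(name):
    # Keep two-character names as-is; otherwise drop the trailing 시/군/구
    return name if len(name) == 2 else name[:-1]

def make_id(sido, sigun):
    """Hypothetical helper: build the map ID for one (광역시도, 시군) pair."""
    # Only metropolitan cities contribute a two-character prefix
    prefix = sido[:2] if sido[:2] in METROS else ''
    if sigun[:2] in MULTI_DISTRICT:
        # Assumed input shape: "<city>시<district>구", e.g. 성남시중원구
        city, district = re.split('시', sigun)[:2]
        name = city + ' ' + cut_char_sigu(district)
    else:
        name = cut_char_sigu(sigun)
    full = prefix + ' ' + name
    if full[0] == ' ':          # no prefix: drop the leading space
        full = full[1:]
    return full[:2] if full[:2] == '세종' else full

print(make_id('서울특별시', '종로구'))      # 서울 종로
print(make_id('경기도', '성남시중원구'))    # 성남 중원
print(make_id('강원도', '고성군'))          # 고성
```

The last call shows why 고성 needs manual handling later: two different counties collapse to the same ID.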
In [27]:
election_result['ID'] = ID_candi
election_result.head(10)
Out[27]:
In [28]:
election_result[['rate_moon', 'rate_hong', 'rate_ahn']] = election_result[['moon', 'hong', 'ahn']].div(election_result['pop'], axis=0)
election_result[['rate_moon', 'rate_hong', 'rate_ahn']] *= 100
election_result.head(3)
Out[28]:
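As a quick check of the row-wise division used above, here is a minimal sketch on a made-up two-row table. The numbers are invented for illustration and only the column names follow the notebook.

```python
import pandas as pd

# Toy data: invented counts, same column names as election_result
toy = pd.DataFrame({
    'pop':  [1000, 500],
    'moon': [400, 100],
    'hong': [250, 300],
    'ahn':  [200, 50],
})

# axis=0 aligns toy['pop'] with the row index, so every row is
# divided by its own 'pop' value rather than by a column label
rates = toy[['moon', 'hong', 'ahn']].div(toy['pop'], axis=0) * 100
rates.columns = ['rate_moon', 'rate_hong', 'rate_ahn']
print(rates)
```

Without `axis=0`, `div` would try to match the values of `pop` against column labels and produce a table of NaN.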
Regions where candidate Moon won the highest vote share
In [32]:
election_result.sort_values(['rate_moon'], ascending=[False]).head(10)
Out[32]:
Candidate Hong
In [33]:
election_result.sort_values(['rate_hong'], ascending=[False]).head(10)
Out[33]:
Candidate Ahn
In [34]:
election_result.sort_values(['rate_ahn'], ascending=[False]).head(10)
Out[34]:
2. Plotting on a map
In [36]:
draw_korea = pd.read_csv('./data/05. draw_korea.csv', encoding='utf-8', index_col=0)
draw_korea.head()
Out[36]:
In [37]:
set(draw_korea['ID'].unique( )) - set(election_result['ID'].unique())
Out[37]:
In [38]:
set(election_result['ID'].unique()) - set(draw_korea['ID'].unique( ))
Out[38]:
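The two set differences above answer two separate questions: which map cells have no election row, and which election rows have no map cell. A small sketch with invented ID lists (not the real CSV contents) shows the pattern:

```python
import pandas as pd

# Invented sample IDs, shaped like the mismatches fixed in the next step
map_ids = pd.Series(['고성(강원)', '고성(경남)', '부천 소사', '서울 종로'])
vote_ids = pd.Series(['고성', '고성', '부천', '서울 종로'])

only_in_map = set(map_ids.unique()) - set(vote_ids.unique())
only_in_votes = set(vote_ids.unique()) - set(map_ids.unique())

print(sorted(only_in_map))    # IDs the map expects but the results lack
print(sorted(only_in_votes))  # IDs the results use but the map lacks
```

Both sets must be empty before the merge, otherwise some cells on the map end up without data.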
2-1. The IDs for 고성 (Goseong), 부천 (Bucheon), and 창원 (Changwon) differ between the two datasets and need to be fixed
In [39]:
election_result[election_result['ID']=='고성']
Out[39]:
In [41]:
election_result.loc[125, 'ID'] = '고성(강원)'
election_result.loc[233, 'ID'] = '고성(경남)'
election_result[election_result['시군']=='고성군']
Out[41]:
In [45]:
election_result[election_result['ID']=='창원 마산합포']
Out[45]:
In [46]:
election_result[election_result['광역시도']=='경상남도']
Out[46]:
In [47]:
election_result.loc[228, 'ID'] = '창원 합포'
election_result.loc[229, 'ID'] = '창원 회원'
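2-2. 부천 (Bucheon)
Bucheon abolished its 소사, 오정, and 원미 districts in 2016, so the election data has only a single 부천 row. We simply reuse that row for each of the three district IDs.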
In [48]:
election_result[election_result['시군'] =='부천시']
Out[48]:
In [86]:
tmp = election_result[election_result['시군'] =='부천시'][['pop', 'moon', 'ahn', 'hong', 'rate_moon', 'rate_hong', 'rate_ahn']]
tmp['광역시도'] = '경기도'
tmp['시군'] = '부천시'
tmp['ID'] = '부천 소사'
for i in ['부천 오정', '부천 원미']:
    t = election_result[election_result['시군'] =='부천시'][['pop', 'moon', 'ahn', 'hong', 'rate_moon', 'rate_hong', 'rate_ahn']]
    t['광역시도'] = '경기도'
    t['시군'] = '부천시'
    t['ID'] = i
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    tmp = pd.concat([tmp, t])
tmp
Out[86]:
In [88]:
election_result2 = pd.concat([tmp, election_result])
election_result2[election_result2['시군']=='부천시']
Out[88]:
In [89]:
election_result2.reset_index(inplace=True)
In [92]:
del election_result2['index']
In [98]:
election_result2.drop([88], inplace=True)
election_result2[election_result2['시군']=='부천시']
Out[98]:
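The Bucheon cells above copy one row three times, prepend the copies, reset the index, and drop the original row by its position. The same result can be sketched more compactly on a toy table. The data below is invented, and this is an alternative formulation, not the notebook's own code.

```python
import pandas as pd

# Toy stand-in for election_result (invented values)
toy = pd.DataFrame({
    '시군': ['부천시', '수원시장안구'],
    'ID': ['부천', '수원 장안'],
    'moon': [100, 80],
})

is_bucheon = toy['시군'] == '부천시'
base = toy[is_bucheon]

# One copy of the Bucheon row per district ID used on the map
copies = [base.assign(ID=new_id)
          for new_id in ['부천 소사', '부천 오정', '부천 원미']]

# Copies first, then every non-Bucheon row; ignore_index renumbers from 0
expanded = pd.concat(copies + [toy[~is_bucheon]], ignore_index=True)
print(expanded)
```

Filtering with a boolean mask avoids hard-coding a row label such as `88`, which would silently point at a different row if the input order changed.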
In [99]:
final_elect_data = pd.merge(election_result2, draw_korea, how='left', on=['ID'])
final_elect_data.head(3)
Out[99]:
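A left merge keeps every election row even when its ID has no match in the map table, in which case `x` and `y` come back as NaN. One way to catch such rows is the `indicator` option of `pd.merge`, sketched here on invented data:

```python
import pandas as pd

# Invented sample rows; '고성' deliberately has no match in the grid
votes = pd.DataFrame({'ID': ['서울 종로', '고성'],
                      'rate_moon': [41.0, 30.0]})
grid = pd.DataFrame({'ID': ['서울 종로', '고성(강원)'],
                     'x': [6, 12], 'y': [4, 1]})

# indicator=True adds a '_merge' column: 'both' or 'left_only' for a left merge
merged = pd.merge(votes, grid, how='left', on=['ID'], indicator=True)
unmatched = merged.loc[merged['_merge'] == 'left_only', 'ID'].tolist()
print(unmatched)   # ['고성']
```

An empty `unmatched` list confirms that every region received map coordinates.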
In [100]:
final_elect_data['moon vs hong'] = final_elect_data['rate_moon'] - final_elect_data['rate_hong']
final_elect_data['moon vs ahn'] = final_elect_data['rate_moon'] - final_elect_data['rate_ahn']
final_elect_data['ahn vs hong'] = final_elect_data['rate_ahn'] - final_elect_data['rate_hong']
final_elect_data.head(3)
Out[100]:
Data preparation done
Let's draw it on the map!
In [101]:
BORDER_LINES = [[(5, 1), (5, 2), (7, 2), (7, 3), (11,3), (11, 0)],
[(5, 4), (5, 5), (2, 5), (2, 7), (4, 7), (4, 9), (7, 9), (7, 7), (9, 7), (9, 5), (10, 5), (10, 4), (5, 4)],
[(1, 7), (1, 8), (3, 8), (3, 10), (10, 10), (10, 7), (12, 7), (12, 6), (11, 6), (11, 5), (12, 5), (12, 4), (11, 4),(11, 3)],
[(8, 10), (8, 11), (6, 11), (6, 12)],
[(12, 5), (13, 5), (13, 4), (14, 4), (14, 5), (15, 5), (15, 4), (16, 4), (16, 2)],
[(16, 4), (17, 4), (17, 5), (16, 5), (16, 6), (19, 6), (19, 5), (20, 5), (20, 4), (21, 4), (21, 3), (19, 3), (19, 1)],
[(13, 5), (13, 6), (16, 6)],
[(13, 5), (14, 5)],
[(21, 2), (21, 3), (22, 3), (22, 4), (24, 4),(24,2), (21, 2)],
[(20, 5), (21, 5), (21, 6), (23, 6)],
[(10, 8), (12, 8), (12, 9), (14, 9), (14, 8), (16, 8), (16, 6)],
[(14, 9), (14, 11), (14, 12), (13, 12), (13, 13)],
[(15, 8), (17, 8), (17, 10), (16, 10), (16, 11), (14, 11)],
[(17, 9), (18, 9), (18, 8), (19, 8), (19, 9), (20, 9), (20, 10), (21, 10)],
[(16, 11), (16, 13)],
[(27, 5), (27, 6), (25, 6)],
]
In [103]:
def drawKorea(targetData, blockedMap, cmapname):
    gamma = .75
    whitelabelmin = (max(blockedMap[targetData]) - min(blockedMap[targetData]))*0.25 + min(blockedMap[targetData])
    datalabel = targetData
    vmin = min(blockedMap[targetData])
    vmax = max(blockedMap[targetData])
    # Turn the (x, y) cell list into a 2-D grid and hide cells without data
    mapdata = blockedMap.pivot_table(index='y', columns = 'x', values = targetData)
    masked_mapdata = np.ma.masked_where(np.isnan(mapdata), mapdata)
    plt.figure(figsize = (6, 8))
    plt.pcolor(masked_mapdata, vmin=vmin, vmax=vmax, cmap=cmapname, edgecolor='#aaaaaa', linewidth=0.5)
    # Label every cell with its region name
    for idx, row in blockedMap.iterrows():
        if len(row['ID'].split())==2:
            dispname = '{}\n{}'.format(row['ID'].split()[0], row['ID'].split()[1])
        elif row['ID'][:2] =='고성':
            dispname = '고성'
        else:
            dispname = row['ID']
        if len(dispname.splitlines()[-1]) >= 3:
            fontsize, linespacing = 8, 1.1
        else:
            fontsize, linespacing = 9, 0.9
        annocolor = 'white' if row[targetData] > whitelabelmin else 'black'
        plt.annotate(dispname, (row['x']+0.5, row['y']+0.5), weight='bold', fontsize= fontsize, ha='center', va='center', linespacing=linespacing)
    # Draw the province borders
    for path in BORDER_LINES:
        ys, xs = zip(*path)
        plt.plot(xs, ys, c='black', lw=2)
    plt.gca().invert_yaxis()
    plt.axis('off')
    cb = plt.colorbar(shrink=.1, aspect=10)
    cb.set_label(datalabel)
    plt.tight_layout()
    plt.show()
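The core of `drawKorea` is the step that turns a list of `(x, y, value)` cells into a 2-D grid and masks the empty cells so that `pcolor` leaves them blank. A minimal sketch of just that step, on three invented cells:

```python
import numpy as np
import pandas as pd

# Three invented cells; position (x=0, y=1) is intentionally missing
cells = pd.DataFrame({
    'x': [0, 1, 1],
    'y': [0, 0, 1],
    'value': [10.0, -5.0, 3.0],
})

# Rows are y, columns are x; the missing cell becomes NaN
grid = cells.pivot_table(index='y', columns='x', values='value')

# Mask the NaN cells so a plotting call such as plt.pcolor skips them
values = grid.to_numpy()
masked = np.ma.masked_where(np.isnan(values), values)
print(masked)
```

The sketch converts the pivot table with `to_numpy()` before masking. That is a choice made here for clarity; the function above passes the DataFrame directly.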
In [105]:
from matplotlib import font_manager, rc
rc('font', family='AppleGothic')
plt.rcParams['axes.unicode_minus'] = False
Candidate Moon vs. candidate Hong
In [106]:
drawKorea('moon vs hong', final_elect_data, 'RdBu')
Candidate Ahn vs. candidate Hong
In [107]:
drawKorea('ahn vs hong', final_elect_data, 'RdBu')
Candidate Moon vs. candidate Ahn
In [108]:
drawKorea('moon vs ahn', final_elect_data, 'RdBu')
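One thing to keep in mind when reading these maps: `drawKorea` sets `vmin` and `vmax` to the data's minimum and maximum, so with a diverging colormap such as `RdBu` the neutral middle color does not necessarily correspond to a difference of zero. If a zero-centered scale is wanted, symmetric limits could be computed as in the sketch below. `symmetric_limits` is a hypothetical helper, not part of the original notebook.

```python
import numpy as np

def symmetric_limits(values):
    """Hypothetical helper: return (-m, m), where m is the largest absolute value."""
    arr = np.asarray(values, dtype=float)
    m = float(np.nanmax(np.abs(arr)))   # nanmax ignores missing cells
    return -m, m

vmin, vmax = symmetric_limits([-12.5, 3.0, 30.0])
print(vmin, vmax)   # -30.0 30.0
```

Passing such limits to `plt.pcolor` in place of the min/max pair would put zero at the center of the colormap, at the cost of using less of the color range when the data is skewed to one side.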