Project goals

  • Use turnstile passenger-count data together with subway station coordinate data to see how many passengers board or alight at a given station in a given time slot
  • Learn the data cleaning, feature engineering, and visualization techniques needed for exploratory data analysis

Project contents

  1. Reading the data: load the boarding/alighting passenger data and check the DataFrame structure
    1.1. Loading the data
    1.2. Inspecting the data

  2. Data cleaning: after inspecting the data, convert types and handle outliers
    2.1. Extracting only the June 2021 boarding/alighting counts

  3. Data visualization: apply further cleaning or feature engineering to each variable, then use visualization to understand the data's characteristics
    3.1. Plotting ridership by line
    3.2. Extracting average boarding/alighting counts per station on a specific line
    3.3. Bar charts of average boarding/alighting counts in descending order
    3.4. Merging a line's congestion level with the station coordinate data
    3.5. Drawing a line's congestion level on a map

Data source

  • Seoul subway boarding/alighting passenger data by line and station (서울시 지하철 호선별 역별 승하차 인원 정보): http://data.seoul.go.kr/dataList/OA-12252/S/1/datasetView.do

 

Project overview

Even though we have grown used to life under COVID-19, there are still times when we head out and want to know which places to avoid because they are crowded. Looking at subway passenger counts should show which areas are the most congested.

In this project, we analyze the Seoul subway boarding/alighting passenger data by line and station provided by the Seoul Open Data Plaza, and use subway station coordinate data to see at a glance which station on a given line is the most crowded.

 

 

 

1. Reading the data

1.1. Loading the data

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt

# Read the boarding/alighting passenger data into a DataFrame with pd.read_csv.
metro_all = pd.read_csv("./data/서울시 지하철 호선별 역별 시간대별 승하차 인원 정보_20210705.csv", encoding='cp949')

# Print the first 5 rows of the boarding/alighting data.
metro_all.head()

# Print a summary of the boarding/alighting DataFrame.
metro_all.info()
'''
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 45338 entries, 0 to 45337
Data columns (total 52 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   사용월           45338 non-null  int64 
 1   호선명           45338 non-null  object
 2   지하철역          45338 non-null  object
 3   04시-05시 승차인원  45338 non-null  int64 
 4   04시-05시 하차인원  45338 non-null  int64 
 5   05시-06시 승차인원  45338 non-null  int64 
 6   05시-06시 하차인원  45338 non-null  int64 
 7   06시-07시 승차인원  45338 non-null  int64 
 8   06시-07시 하차인원  45338 non-null  int64 
 '''
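The encoding argument matters because the file is read as cp949 rather than UTF-8. A minimal sketch of what happens otherwise, assuming a tiny made-up CSV (the `raw` bytes and `toy` table below are illustrative, not the real file):

```python
import io
import pandas as pd

# Made-up CSV text, encoded as cp949 like the file loaded above.
raw = "사용월,호선명,지하철역\n202106,2호선,강남\n".encode("cp949")

# Decoding the same bytes as UTF-8 fails, which is why encoding must be set.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# With the right encoding, the Korean column names come through intact.
toy = pd.read_csv(io.BytesIO(raw), encoding="cp949")
print(utf8_ok, list(toy.columns))
```

If a CSV from the same portal fails to load, checking the encoding first is usually a good starting point.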

 

1.2. Inspecting the data

# Check the 사용월 (usage month) values in metro_all
sorted(list(set(metro_all['사용월'])))
'''
[201501,
 201502,
 201503,
 201504,...
 '''

# Check the 호선명 (line name) values in metro_all
sorted(list(set(metro_all['호선명'])))
'''
['1호선',
 '2호선',
 '3호선',
 '4호선',
 '''

# Check the 지하철역 (station) values
sorted(list(set(metro_all['지하철역'])))
'''
['4.19민주묘지',
 '가능',
 '가락시장',
 '가산디지털단지',
 '''

# Count the distinct 지하철역 (station) values
len(list(set(metro_all['지하철역']))) #579
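The same checks can also be written with pandas' own `unique()` and `nunique()` methods. A minimal sketch, assuming a small made-up table (`toy`) in place of metro_all:

```python
import pandas as pd

# Made-up rows standing in for metro_all (values are illustrative only).
toy = pd.DataFrame({
    '사용월': [202105, 202106, 202106, 202106],
    '호선명': ['2호선', '2호선', '6호선', '2호선'],
    '지하철역': ['강남', '강남', '이태원', '잠실'],
})

months = sorted(toy['사용월'].unique())   # distinct months, ascending
lines = sorted(toy['호선명'].unique())    # distinct line names
n_stations = toy['지하철역'].nunique()    # number of distinct station names

print(months, lines, n_stations)
```

`nunique()` gives the count directly, without building an intermediate set and list.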

 

 

 

2. Data cleaning

Using the most recent month of collected data, we want to find out which station on a given line is the most crowded.

2.1. Extracting only the June 2021 boarding/alighting counts

# Keep only the June 2021 passenger counts
metro_recent = metro_all[metro_all['사용월']==202106]
metro_recent

# Drop the unneeded 작업일자 (processing date) column
metro_recent = metro_recent.drop(columns='작업일자')
metro_recent
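The filter-then-drop step can be tried out on a small table first. A minimal sketch, assuming made-up rows (`toy`) with the same column names as the real data:

```python
import pandas as pd

# Made-up rows; only the column names follow the real data.
toy = pd.DataFrame({
    '사용월': [202105, 202106, 202106],
    '지하철역': ['강남', '강남', '잠실'],
    '04시-05시 승차인원': [10, 12, 7],
    '작업일자': [20210603, 20210703, 20210703],
})

# Boolean mask: True for rows whose month is June 2021.
is_june = toy['사용월'] == 202106

# .copy() makes the result independent of toy; errors='ignore' skips the
# drop quietly if the 작업일자 column is absent.
recent = toy[is_june].copy().drop(columns='작업일자', errors='ignore')
print(recent)
```

Taking a copy is optional here, but it avoids pandas warnings if the filtered table is modified later.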

 

 

 

3. Data visualization

3.1. Plotting ridership by line

import matplotlib.font_manager as fm

font_dirs = ['/usr/share/fonts/truetype/nanum', ]
font_files = fm.findSystemFonts(fontpaths=font_dirs)

for font_file in font_files:
    fm.fontManager.addfont(font_file)
    
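# Average each time-slot column per line. numeric_only=True skips the text
# 지하철역 column, which newer pandas versions refuse to average.
metro_line = metro_recent.groupby(['호선명']).mean(numeric_only=True).reset_index()
metro_line = metro_line.drop(columns='사용월').set_index('호선명')
# Average across all time slots and sort from the busiest line down.
metro_line = metro_line.mean(axis=1).sort_values(ascending=False)

plt.figure(figsize=(20,10))
plt.rc('font', family="NanumBarunGothic")
plt.rcParams['axes.unicode_minus'] = False

metro_line.plot(kind='bar')
plt.show()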
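The per-line figure is computed in two averaging steps: first per line for each time slot, then across the time slots. A minimal sketch of those two steps, assuming a made-up table (`toy`) with a single time slot:

```python
import pandas as pd

# Made-up rows; the real table has one pair of columns per hour.
toy = pd.DataFrame({
    '호선명': ['2호선', '2호선', '6호선'],
    '지하철역': ['강남', '잠실', '이태원'],
    '04시-05시 승차인원': [10, 30, 4],
    '04시-05시 하차인원': [20, 40, 8],
})

# Step 1: per-line mean of every numeric column (the text column is skipped).
per_line = toy.groupby('호선명').mean(numeric_only=True)

# Step 2: mean across the time-slot columns, giving one number per line.
line_avg = per_line.mean(axis=1).sort_values(ascending=False)
print(line_avg)
```

Because this is a mean of means, it describes a typical station and time slot on each line, not the line's total ridership.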

 

3.2. Extracting average boarding/alighting counts per station on a specific line

line = '2호선'
metro_st = metro_recent.groupby(['호선명','지하철역']).mean().reset_index()
metro_st_line2 = metro_st[metro_st['호선명']==line]
metro_st_line2

# Keep only the boarding (승차) columns
metro_get_on = pd.DataFrame()
metro_get_on['지하철역'] = metro_st_line2['지하철역']
for i in range(int((len(metro_recent.columns)-3)/2)):
    metro_get_on[metro_st_line2.columns[3+2*i]] = metro_st_line2[metro_st_line2.columns[3+2*i]]
metro_get_on = metro_get_on.set_index('지하철역')
metro_get_on

# Keep only the alighting (하차) columns
metro_get_off = pd.DataFrame()
metro_get_off['지하철역'] = metro_st_line2['지하철역']
for i in range(int((len(metro_recent.columns)-3)/2)):
    metro_get_off[metro_st_line2.columns[4+2*i]] = metro_st_line2[metro_st_line2.columns[4+2*i]]
metro_get_off = metro_get_off.set_index('지하철역')
metro_get_off

# Compute each station's average boarding/alighting counts, cast them to int, and store them in a DataFrame
df = pd.DataFrame(index = metro_st_line2['지하철역'])
df['평균 승차 인원 수'] = metro_get_on.mean(axis=1).astype(int)
df['평균 하차 인원 수'] = metro_get_off.mean(axis=1).astype(int)
df
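The loops above pick the boarding and alighting columns by position (every other column starting at index 3 or 4). An alternative is to pick them by name. A minimal sketch, assuming the column names end in 승차인원 and 하차인원 as in the info() output, with made-up values (`toy`):

```python
import pandas as pd

# Made-up per-station averages for two time slots.
toy = pd.DataFrame({
    '지하철역': ['강남', '잠실'],
    '04시-05시 승차인원': [10, 30],
    '04시-05시 하차인원': [20, 40],
    '05시-06시 승차인원': [50, 70],
    '05시-06시 하차인원': [60, 80],
}).set_index('지하철역')

# Select columns by their name suffix instead of by position.
on_cols = [c for c in toy.columns if c.endswith('승차인원')]
off_cols = [c for c in toy.columns if c.endswith('하차인원')]

summary = pd.DataFrame({
    '평균 승차 인원 수': toy[on_cols].mean(axis=1).astype(int),
    '평균 하차 인원 수': toy[off_cols].mean(axis=1).astype(int),
})
print(summary)
```

Selecting by name keeps working even if the column order changes, at the cost of depending on the naming pattern.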

 

3.3. Bar charts of average boarding/alighting counts in descending order

On Line 2 in June, the average boarding count was highest at Gangnam (강남) > Jamsil (잠실) > Sillim (신림) > Guro Digital Complex (구로디지털단지) > Hongik Univ. (홍대입구) > Seolleung (선릉), in that order.

# Top 10 stations by boarding count
top10_on = df.sort_values(by='평균 승차 인원 수', ascending=False).head(10)

plt.figure(figsize=(20,10))
plt.rc('font', family="NanumBarunGothic")
plt.rcParams['axes.unicode_minus'] = False

plt.bar(top10_on.index, top10_on['평균 승차 인원 수'])
# Label each bar with its value; the busiest station is highlighted in red.
for x, y in enumerate(list(top10_on['평균 승차 인원 수'])):
    if x == 0:
        plt.annotate(y, (x-0.15, y), color = 'red')
    else:
        plt.annotate(y, (x-0.15, y))

plt.title('June 2021 average boarding count Top10')
plt.show()

 

The average alighting count follows almost the same order: Gangnam (강남) > Jamsil (잠실) > Sillim (신림) > Guro Digital Complex (구로디지털단지) > Hongik Univ. (홍대입구) > Yeoksam (역삼).

# Top 10 stations by alighting count
top10_off = df.sort_values(by='평균 하차 인원 수', ascending=False).head(10)

plt.figure(figsize=(20,10))
plt.rc('font', family="NanumBarunGothic")
plt.rcParams['axes.unicode_minus'] = False

plt.bar(top10_off.index, top10_off['평균 하차 인원 수'])
# Label each bar with its value; the busiest station is highlighted in red.
for x, y in enumerate(list(top10_off['평균 하차 인원 수'])):
    if x == 0:
        plt.annotate(y, (x-0.15, y), color = 'red')
    else:
        plt.annotate(y, (x-0.15, y))

plt.title('June 2021 average alighting count Top10')
plt.show()

 

Quiz 1. Find the name of the Line 6 station with the highest boarding count.

# Change only the value of line in the first cell of 3.2., then
# re-run the code in 3.2. and 3.3. in order to get the answer.

top_on = df.sort_values(by='평균 승차 인원 수', ascending=False).head(1)
top_on.index[0]
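Sorting and taking the first row works, and pandas also offers `idxmax()`, which returns the index label of the largest value directly. A minimal sketch, assuming a made-up table with placeholder station names (역A, 역B, 역C):

```python
import pandas as pd

# Placeholder stations and counts, for illustration only.
toy = pd.DataFrame(
    {'평균 승차 인원 수': [120, 340, 90]},
    index=pd.Index(['역A', '역B', '역C'], name='지하철역'),
)

# idxmax() returns the index label (here, the station name) of the maximum.
busiest = toy['평균 승차 인원 수'].idxmax()
print(busiest)
```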

 

3.4. Merging a line's congestion level with the station coordinate data

To show information on a map for every station on a line, we need the coordinates of each location.

To solve this, the coordinates were collected beforehand with the Kakao API and saved as a csv file.

Sources:
https://developers.kakao.com/docs/latest/ko/local/dev-guide#search-by-keyword
https://developers.kakao.com/docs/latest/ko/local/dev-guide#address-coord

# Load the station coordinate data.
subway_location = pd.read_csv('./data/지하철 역 위치 좌표.csv')
subway_location

 

# Returns a DataFrame with the average boarding/alighting counts and the coordinates of each station on a given line.
def get_nums_and_location(line, metro_st):
    
    # Keep only the rows for the given line.
    metro_line_n = metro_st[metro_st['호선명']==line]
    
    # Keep only the boarding (승차) columns.
    metro_get_on = pd.DataFrame()
    metro_get_on['지하철역'] = metro_line_n['지하철역']
    for i in range(int((len(metro_line_n.columns)-3)/2)):
        metro_get_on[metro_line_n.columns[3+2*i]] = metro_line_n[metro_line_n.columns[3+2*i]]
    metro_get_on = metro_get_on.set_index('지하철역')
    
    # Keep only the alighting (하차) columns.
    metro_get_off = pd.DataFrame()
    metro_get_off['지하철역'] = metro_line_n['지하철역']
    for i in range(int((len(metro_line_n.columns)-3)/2)):
        metro_get_off[metro_line_n.columns[4+2*i]] = metro_line_n[metro_line_n.columns[4+2*i]]
    metro_get_off = metro_get_off.set_index('지하철역')
    
    # Compute each station's average boarding/alighting counts, cast them to int, and store them in a DataFrame.
    df = pd.DataFrame(index = metro_line_n['지하철역'])
    df['평균 승차 인원 수'] = metro_get_on.mean(axis=1).astype(int)
    df['평균 하차 인원 수'] = metro_get_off.mean(axis=1).astype(int)
    
    # Make the station names match the coordinate file: drop any parenthesized suffix and append '역'.
    temp = []
    df = df.reset_index()
    for name in df['지하철역']:
        temp.append(name.split('(')[0]+'역')
    df['지하철역'] = temp
    
    # Merge the two DataFrames on the station name (uses the subway_location table loaded above).
    df = df.merge(subway_location, left_on='지하철역', right_on='지하철역')
    return df

get_nums_and_location('6호선', metro_st)
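The merge above is an inner join, so any station whose normalized name has no match in the coordinate file disappears from the result without warning. A minimal sketch of how such stations could be spotted, assuming made-up tables (`counts`, `coords`), approximate coordinates, and a hypothetical unmatched station (가상):

```python
import pandas as pd

# Made-up counts; '가상' is a hypothetical station with no coordinates.
counts = pd.DataFrame({
    '지하철역': ['강남', '잠실(송파구청)', '가상'],
    '평균 승차 인원 수': [100, 80, 5],
})
# Made-up coordinate table (values are approximate, for illustration only).
coords = pd.DataFrame({
    '지하철역': ['강남역', '잠실역'],
    'x좌표': [37.497, 37.513],
    'y좌표': [127.027, 127.100],
})

# Same normalization as in the function: cut at '(' and append '역'.
counts['지하철역'] = counts['지하철역'].str.split('(').str[0] + '역'

# how='left' keeps every station; indicator=True adds a _merge column
# that says whether a match was found in coords.
merged = counts.merge(coords, on='지하철역', how='left', indicator=True)
unmatched = merged.loc[merged['_merge'] == 'left_only', '지하철역'].tolist()
print(unmatched)
```

Checking the unmatched list before plotting helps explain why a station might be missing from the map.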

 

3.5. Drawing a line's congestion level on a map

import folium

# Draw an OpenStreetMap centered on a given latitude and longitude
map_osm = folium.Map(location = [37.529622, 126.984307], zoom_start=12)
map_osm

# Get the average boarding/alighting counts and the coordinates for the stations on one line.
rail = '6호선'
df = get_nums_and_location(rail, metro_st)

# Center the map on 명동역 (Myeongdong Station), which lies in the middle of Seoul.
# .iloc[0] turns each one-row selection into a plain number for folium.
latitude = subway_location[subway_location['지하철역']=='명동역']['x좌표'].iloc[0]
longitude = subway_location[subway_location['지하철역']=='명동역']['y좌표'].iloc[0]
map_osm = folium.Map(location = [latitude, longitude], zoom_start = 12)

# Add a circle marker to the map at each station's location.
for i in df.index:
    marker = folium.CircleMarker([df['x좌표'][i],df['y좌표'][i]],
                        radius = (df['평균 승차 인원 수'][i]+1)/3000, # +1 keeps the radius above zero when the count is 0
                        popup = f"{df['지하철역'][i]}: {df['평균 승차 인원 수'][i]}", 
                        color = 'blue', 
                        fill_color = 'blue')
    marker.add_to(map_osm)

map_osm
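A CircleMarker radius is given in pixels, so dividing the count by a fixed 3000 can leave quiet stations with markers too small to see. One possible alternative is to scale the counts into a fixed pixel range. A minimal sketch, where `scale_radius` and its default sizes are hypothetical choices rather than part of the original code:

```python
def scale_radius(count, max_count, min_px=3.0, max_px=25.0):
    """Map a passenger count onto a pixel radius between min_px and max_px."""
    if max_count <= 0:
        return min_px
    # Share of the busiest station's count, clamped to the range 0..1.
    share = max(0.0, min(1.0, count / max_count))
    return min_px + share * (max_px - min_px)

# Made-up average boarding counts for three stations.
counts = [0, 1500, 6000]
radii = [scale_radius(c, max(counts)) for c in counts]
print(radii)
```

The result could be passed as `radius=` in the loop above; the linear scaling and the pixel bounds are design choices that can be tuned to the map's zoom level.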

 

Quiz 2. Find the x좌표 (latitude) of 강남역 (Gangnam Station).

# The get_nums_and_location() function makes this easy.
# 강남역 is on Line 2, so get the DataFrame with df = get_nums_and_location('2호선', metro_st).
# Then use df[df['지하철역']=='강남역']['x좌표'] to select the row whose '지하철역' column is '강남역' and read its 'x좌표' value.

df = get_nums_and_location('2호선', metro_st)
x = df[df['지하철역']=='강남역']['x좌표']
x.iloc[0] #37.4970572543978
