https://www.kaggle.com/datasets/hmavrodiev/london-bike-sharing-dataset
London bike sharing dataset
Historical data for bike sharing in London 'Powered by TfL Open Data'
import numpy as np
import pandas as pd
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
import matplotlib.pyplot as plt
import seaborn as sns
import missingno as msno
df = pd.read_csv('/kaggle/input/london-bike-sharing-dataset/london_merged.csv', parse_dates = ['timestamp'])
df.head()
print('Data shape :', df.shape)
print('Data types :', df.dtypes)
print('Data columns :', df.columns)
df.isna().sum()
msno.matrix(df)
plt.show()
df['year'] = df['timestamp'].dt.year
df['month'] = df['timestamp'].dt.month
df['dayofweek'] = df['timestamp'].dt.dayofweek
df['hour'] = df['timestamp'].dt.hour
df.head()
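To see what the `.dt` accessor calls above return, here is a minimal sketch on a tiny made-up frame. The `sample` name, the timestamps, and the counts are assumptions for illustration and are not taken from the dataset. Note that `dayofweek` counts from Monday = 0 to Sunday = 6.

```python
import pandas as pd

# Hypothetical sample rows; the values are made up for illustration only.
sample = pd.DataFrame({
    'timestamp': pd.to_datetime(['2015-01-04 00:00:00', '2016-07-15 18:00:00']),
    'cnt': [182, 2500],
})

# Same derived columns as in the cell above.
sample['year'] = sample['timestamp'].dt.year
sample['month'] = sample['timestamp'].dt.month
sample['dayofweek'] = sample['timestamp'].dt.dayofweek  # Monday=0 ... Sunday=6
sample['hour'] = sample['timestamp'].dt.hour

print(sample[['year', 'month', 'dayofweek', 'hour']])
```

2015-01-04 was a Sunday, so its `dayofweek` is 6; 2016-07-15 was a Friday, so it is 4.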
df['year'].value_counts()
# df['month'].value_counts()
# df['dayofweek'].value_counts()
# df['weather_code'].value_counts()
# x and y are passed as keywords; seaborn 0.12 and later reject them as positional arguments.
fig, ax = plt.subplots(1, 1, figsize=(10, 5))
sns.boxplot(x=df['year'], y=df['cnt'], ax=ax)
fig, ax = plt.subplots(1, 1, figsize=(10, 5))
sns.boxplot(x=df['month'], y=df['cnt'], ax=ax)
fig, ax = plt.subplots(1, 1, figsize=(10, 5))
sns.boxplot(x=df['dayofweek'], y=df['cnt'], ax=ax)
fig, ax = plt.subplots(1, 1, figsize=(10, 5))
sns.boxplot(x=df['hour'], y=df['cnt'], ax=ax)
def plot_bar(data, feature):
    fig = plt.figure(figsize = (12, 3))
    sns.barplot(x = feature, y = 'cnt', data = data, palette = 'Set3')
plot_bar(df, 'hour')
plot_bar(df, 'dayofweek')
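By default `sns.barplot` draws the mean of `cnt` for each category, so the bar heights can be cross-checked numerically with a `groupby`. The following is a minimal sketch on made-up numbers; the `sample` frame and its values are assumptions for illustration, not rows from the dataset.

```python
import pandas as pd

# Hypothetical sample; the values are made up for illustration only.
sample = pd.DataFrame({
    'hour': [8, 8, 17, 17, 3],
    'cnt': [3000, 3400, 2800, 3200, 100],
})

# Mean count per hour: the same statistic a default barplot shows as bar height.
hourly_mean = sample.groupby('hour')['cnt'].mean()
print(hourly_mean)
```

On the real data, `df.groupby('hour')['cnt'].mean()` should give the heights of the bars drawn by `plot_bar(df, 'hour')`.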