JUST DOWNLOAD THOSE TWO FILES FIRST:
NOTICE
CSV
The first dataset, bank.csv, must be read with sep=";" because, despite the .csv extension, it is semicolon-delimited rather than comma-separated.
Better still, read the file once, save it as a standard comma-separated file named bank_normal.csv, and use that file from then on.
bank <- read.csv("bank.csv", sep = ";")
str(bank)
length(bank$y)
# row.names = FALSE keeps R from writing an extra index column
write.csv(bank, "bank_normal.csv", row.names = FALSE)
File Encoding
For the second dataset, we have different problems to handle:
File encoding is not UTF-8
Extra index column
Missing characters or missing column names
Col names becoming …
….
The second dataset takes a little more work, so if you are interested in the process, check the code below:
library(readr)

# Specify column names
col_names <- c(
  "rented_bike_count", "hour", "temperature", "humidity", "wind_speed",
  "visibility", "dew_point_temperature", "solar_radiation", "rainfall",
  "snowfall", "seasons", "holiday", "functioning_day"
)

# Specify column types for each column
col_types <- cols(
  rented_bike_count = col_double(),
  hour = col_double(),
  temperature = col_double(),
  humidity = col_double(),
  wind_speed = col_double(),
  visibility = col_double(),
  dew_point_temperature = col_double(),
  solar_radiation = col_double(),
  rainfall = col_double(),
  snowfall = col_double(),
  seasons = col_character(),
  holiday = col_character(),
  functioning_day = col_character()
)

# Read in the CSV file with the specified column names and types
bike_data <- read_csv("SeoulBikeData.csv", col_names = col_names, col_types = col_types, skip = 1)

# Write out the dataset with UTF-8 encoding
write_csv(bike_data, "bike_data_utf8.csv")
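If you would rather convert the file than skip its header, here is a minimal base R sketch of the idea. It assumes the source encoding is Latin-1, which you should check against your own copy. The tiny two-column file and the names tmp_in, tmp_out and demo are made up for illustration and are not part of the real dataset.

```r
# Hypothetical stand-in for the real file: a header containing a
# Latin-1 degree sign (byte 0xB0), followed by two data rows.
tmp_in <- tempfile(fileext = ".csv")
con <- file(tmp_in, open = "wb")
writeBin(
  c(charToRaw("Temperature("), as.raw(0xB0), charToRaw("C),Hour\n-5.2,0\n-5.5,1\n")),
  con
)
close(con)

# Read the raw lines and convert them from the assumed source
# encoding (Latin-1) to UTF-8.
lines_raw  <- readLines(tmp_in)
lines_utf8 <- iconv(lines_raw, from = "latin1", to = "UTF-8")

# Write the converted lines back out byte for byte.
tmp_out <- tempfile(fileext = ".csv")
writeLines(lines_utf8, tmp_out, useBytes = TRUE)

# Read the UTF-8 copy and replace the awkward header with plain names.
demo <- read.csv(tmp_out, encoding = "UTF-8", check.names = FALSE)
names(demo) <- c("temperature", "hour")
str(demo)
```

The conversion only has to happen once. After that, every later read can use the UTF-8 copy without any encoding arguments.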
So from now on, you have two clean data files to work with, bank_normal.csv and bike_data_utf8.csv, and can spend your time on the real work.