Saturday, 18 August 2018

Data Handling - Importing CSV and Tabular data files in R Language

Setting up directories

→ We can change the current working directory as follows:
> setwd ("<location of the dataset>")

Example:
> setwd ("C:/RCourse/")
or
> setwd ("C:\\RCourse\\")

→ The following command returns the current working directory:

> getwd ( )
[1] "C:/RCourse/"
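
As a minimal sketch of the two commands working together, the snippet below saves the current working directory, switches to another folder, and then switches back. Here tempdir() is only a stand-in for a real data folder such as "C:/RCourse/", and old_dir is a name chosen for illustration.

```r
# Remember the current working directory so it can be restored later.
old_dir <- getwd()

# tempdir() is an assumption standing in for the real data folder.
setwd(tempdir())
print(getwd())

# List the files R can see in the new working directory.
print(list.files())

# Restore the original working directory.
setwd(old_dir)
```

Saving and restoring the directory keeps later relative file paths in a script from silently pointing at the wrong folder.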

Importing Data Files

Suppose we have some data on our computer and we want to import it into R.

Different file formats can be read in R:
  • comma-separated values (CSV) data files,
  • table file (TXT)
  • Spreadsheet (e.g., MS Excel) file,
  • files from other software like SPSS, Minitab etc.

One can also read a file directly from an Internet site.

We can read the file containing rent index data from the website:
http://home.iitk.ac.in/~shalab/Rcourse/munichdata.asc

as follows

> datamunich <- read.table (file = 
"http://home.iitk.ac.in/~shalab/Rcourse/munichdata.asc", header = TRUE)

The file name is munichdata.asc.

Comma-separated values (CSV) files

First set the working directory where the CSV file is located.
> setwd ("<location of your dataset>")

Example:
> setwd ("C:/RCourse/")


To read a CSV file
syntax: read.csv ("filename.csv")

R is case-sensitive, so the function name must be written in lowercase as read.csv.

Example:
> data <- read.csv ("example1.csv")

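Printing the data frame shows what was imported:
> data
  X1 X10 X100
1  2  20  200
2  3  30  300
3  4  40  400
4  5  50  500

Notice the difference between the first row of the Excel file and the output: by default read.csv treats the first row of the file (here 1, 10, 100) as the header, so it becomes the column names X1, X10 and X100 instead of a row of data.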
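
The header behaviour can be reproduced with a small self-contained sketch. The file below is a hypothetical reconstruction of the example data, written to a temporary location; csv_path and example_data are names chosen only for illustration.

```r
# Write a hypothetical five-line CSV file to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("1,10,100", "2,20,200", "3,30,300", "4,40,400", "5,50,500"), csv_path)

# read.csv uses header = TRUE by default, so the first line becomes the column names.
example_data <- read.csv(csv_path)

print(names(example_data))  # "X1" "X10" "X100"
print(nrow(example_data))   # 4 data rows remain
print(example_data)
```

The X prefix appears because column names in a data frame must be syntactically valid, and a name cannot start with a digit.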

Data files have many formats and accordingly we have options for loading them.

If the data file does not have headers in the first row, then use

> data <- read.csv ("datafile.csv", header=FALSE)


The resulting data frame will have columns named V1, V2, ...
We can rename the columns manually.
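
A minimal sketch of renaming is shown here, assuming a small headerless file written to a temporary location. The column names "units", "tens" and "hundreds" are hypothetical and chosen only for illustration.

```r
# Write a hypothetical headerless CSV file to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("2,20,200", "3,30,300"), csv_path)

# Without a header row, the columns get the default names V1, V2, V3.
renamed <- read.csv(csv_path, header = FALSE)
print(names(renamed))

# Option 1: assign new names after reading.
names(renamed) <- c("units", "tens", "hundreds")
print(renamed)

# Option 2: supply the names while reading, via the col.names argument.
named_on_read <- read.csv(csv_path, header = FALSE,
                          col.names = c("units", "tens", "hundreds"))
print(named_on_read)
```

Both options give the same column names; the second avoids a separate renaming step.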

We can set the delimiter with the sep argument.
If the file is tab-delimited, use sep="\t".
> data <- read.csv ("datafile.csv", sep="\t")

If it is space-delimited, use sep=" ".
> data <- read.csv ("datafile.csv", sep=" ")
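
The effect of sep can be checked with a short sketch. The two files below are hypothetical, hold the same values with different delimiters, and are written to temporary locations.

```r
# A hypothetical tab-delimited file with a header row.
tab_path <- tempfile(fileext = ".txt")
writeLines(c("x\ty\tz", "1\t10\t100", "2\t20\t200"), tab_path)
tab_data <- read.csv(tab_path, sep = "\t")

# The same values in a hypothetical space-delimited file.
space_path <- tempfile(fileext = ".txt")
writeLines(c("x y z", "1 10 100", "2 20 200"), space_path)
space_data <- read.csv(space_path, sep = " ")

print(tab_data)
print(space_data)
```

Note that sep=" " means exactly one space between fields. For tab-delimited files, read.delim is a related function whose default separator is already a tab.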

Reading Tabular Data Files

Tabular data files are text files with a simple format:
  • Each line contains one record.
  • Within each record, fields (items) are separated by a one-character delimiter, such as a space, tab, colon, or comma.
  • Each record contains the same number of fields.

Suppose we want to read a text file that contains a table of data.
The read.table function is used for this, and it returns a data frame.
> read.table ("FileName")
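
As a rough sketch of read.table with an explicit delimiter, the snippet below reads a colon-delimited file. The file contents, the path txt_path and the object name people are all made up for illustration.

```r
# Write a hypothetical colon-delimited table with a header line.
txt_path <- tempfile(fileext = ".txt")
writeLines(c("name:age:city", "Asha:31:Kanpur", "Ravi:27:Delhi"), txt_path)

# header = TRUE takes the column names from the first line; sep sets the delimiter.
people <- read.table(txt_path, header = TRUE, sep = ":")

print(class(people))  # "data.frame"
str(people)
print(people)
```

Unlike read.csv, read.table assumes header = FALSE and whitespace-separated fields unless told otherwise, so both arguments are given explicitly here.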
