Data frames are generic data objects of R, used to store tabular data.
Code :-
# Introduction to data frames
vec1 = c(1,2,3)
vec2 = c("R","Scilab","Java")
vec3 = c("For prototyping","For prototyping","For Scaleup")
df = data.frame(vec1,vec2,vec3)
print(df)
Console Output
  vec1   vec2            vec3
1    1      R For prototyping
2    2 Scilab For prototyping
3    3   Java     For Scaleup
Create a dataframe using data from a file
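- A dataframe can also be created by reading data from a file, for example with the read.table() function.
- In the path, use '/' instead of '\', because a single backslash starts an escape sequence in an R string.
- A separator can be specified with the sep argument to distinguish between entries. The default separator for read.table() is white space.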
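As a minimal sketch of the points above, the snippet below writes a small comma-separated file to a temporary location and reads it back with read.table(). The file contents, the temporary path and the names tmp and newDF are made up for illustration; header = TRUE and sep = "," are choices that match this sample file only.

```r
# Hypothetical sample file, created in a temporary location for this sketch
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,tool", "1,R", "2,Scilab", "3,Java"), tmp)

# Read the file: the first line holds the column names,
# and entries are separated by commas instead of white space
newDF <- read.table(tmp, header = TRUE, sep = ",")
print(newDF)

# Remove the temporary file again
unlink(tmp)
```

For a real file, the path (written with '/' as described above) would take the place of tmp.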
Accessing rows and columns
- df[val1,val2] refers to row "val1", column "val2". Each can be a number or a string (a row or column name).
- "val1" or "val2" can also be a vector of values, such as 1:2 or c(1,3).
- df[val2] (no comma) refers to column "val2" only.
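The string form of these indices can be sketched as follows. The data frame is rebuilt here with explicit column names so that the snippet runs on its own; the values are the ones used earlier in this article.

```r
df <- data.frame(vec1 = c(1, 2, 3),
                 vec2 = c("R", "Scilab", "Java"),
                 vec3 = c("For prototyping", "For prototyping", "For Scaleup"))

# Row 2, column "vec2": a single entry
print(df[2, "vec2"])

# Rows 1 and 3, columns "vec1" and "vec3"
print(df[c(1, 3), c("vec1", "vec3")])

# No comma: a data frame that holds only the column "vec2"
print(df["vec2"])
```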
Code :-
# accessing first & second row:
print(df[1:2,])
# accessing first & second column:
print(df[,1:2])
# accessing first & second column (alternate form):
print(df[1:2])
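Output :-
  vec1   vec2            vec3
1    1      R For prototyping
2    2 Scilab For prototyping
  vec1   vec2
1    1      R
2    2 Scilab
3    3   Java
  vec1   vec2
1    1      R
2    2 Scilab
3    3   Java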
Subset :-
The subset() function extracts a subset of the data based on conditions.
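A short sketch of subset() on the data frame from this article is given below. The conditions and the names proto and small are chosen only for illustration.

```r
df <- data.frame(vec1 = c(1, 2, 3),
                 vec2 = c("R", "Scilab", "Java"),
                 vec3 = c("For prototyping", "For prototyping", "For Scaleup"))

# Keep the rows whose vec3 entry equals "For prototyping"
proto <- subset(df, vec3 == "For prototyping")
print(proto)

# Combine a row condition with a column selection
small <- subset(df, vec1 > 1, select = c(vec1, vec2))
print(small)
```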
Editing dataframes
- A dataframe can also be edited using the edit() command.
- Create an instance of a data frame and call edit() on it to open a table editor, in which changes can be made manually.
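The steps above might look like the sketch below. Because edit() opens an editor window, the call is wrapped in an interactive() check, which is an addition made here so that the snippet also runs in a script. Note that edit() returns the edited copy, so the result has to be assigned back for the changes to be kept.

```r
df <- data.frame(vec1 = c(1, 2, 3),
                 vec2 = c("R", "Scilab", "Java"))

# Only open the table editor when a user is present to operate it
if (interactive()) {
  # edit() returns the modified data frame; df itself is not changed in place
  df <- edit(df)
}
print(df)
```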
Adding extra rows and columns
An extra row can be added with the rbind() function and an extra column with cbind().
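A minimal sketch of both functions is shown below. The new row, the column name vec4 and its values are invented for illustration.

```r
df <- data.frame(vec1 = c(1, 2, 3),
                 vec2 = c("R", "Scilab", "Java"),
                 vec3 = c("For prototyping", "For prototyping", "For Scaleup"))

# New row: a one-row data frame with the same column names
df <- rbind(df, data.frame(vec1 = 4, vec2 = "C", vec3 = "For Scaleup"))

# New column: one value for each of the four rows
df <- cbind(df, vec4 = c(10, 20, 30, 40))
print(df)
```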
Deleting rows and columns
There are several ways to delete a row or column, such as using a negative index or assigning NULL to a column.
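A few common cases can be sketched as follows. The result names (df_no_row3 and so on) are hypothetical and only keep the original df intact for comparison.

```r
df <- data.frame(vec1 = c(1, 2, 3),
                 vec2 = c("R", "Scilab", "Java"),
                 vec3 = c("For prototyping", "For prototyping", "For Scaleup"))

# Delete the third row with a negative row index
df_no_row3 <- df[-3, ]

# Delete the first column with a negative column index
df_no_col1 <- df[, -1]

# Delete a column by name by assigning NULL to it
df_no_vec3 <- df
df_no_vec3$vec3 <- NULL

# Delete rows by condition: keep every row where vec1 is not 2
df_kept <- df[df$vec1 != 2, ]

print(df_no_row3)
print(df_no_col1)
print(df_no_vec3)
print(df_kept)
```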
Manipulating rows - the factor issue
- When character columns are created in a data.frame, they become factors if stringsAsFactors = TRUE, which was the default before R 4.0.0.
- A factor variable stores a character column as a fixed set of categories, called factor levels.
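The issue can be reproduced with the sketch below. stringsAsFactors = TRUE is set explicitly here so that the behaviour is the same in every R version; the value "Python" and the name df2 are made up for illustration.

```r
# Force character columns to become factors, whatever the R version
df <- data.frame(vec1 = c(1, 2, 3),
                 vec2 = c("R", "Scilab", "Java"),
                 stringsAsFactors = TRUE)

print(is.factor(df$vec2))
print(levels(df$vec2))

# "Python" is not one of the existing levels, so the entry becomes NA.
# R also raises a warning, which is suppressed here to keep the output short.
df2 <- df
suppressWarnings(df2$vec2[3] <- "Python")
print(df2)
```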
Resolving factor issue
New entries need to be consistent with the factor levels, which are fixed when the dataframe is first created.
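Two possible ways around this are sketched below; they are common approaches, not necessarily the ones the original example used. The value "Python" and the names fixed1 and fixed2 are hypothetical.

```r
df <- data.frame(vec1 = c(1, 2, 3),
                 vec2 = c("R", "Scilab", "Java"),
                 stringsAsFactors = TRUE)

# Option 1: add the new level first, then use it
fixed1 <- df
levels(fixed1$vec2) <- c(levels(fixed1$vec2), "Python")
fixed1$vec2[3] <- "Python"
print(fixed1)

# Option 2: keep the column as plain character, so no levels apply
fixed2 <- data.frame(vec1 = c(1, 2, 3),
                     vec2 = c("R", "Scilab", "Java"),
                     stringsAsFactors = FALSE)
fixed2$vec2[3] <- "Python"
print(fixed2)
```

Option 1 keeps the column as a factor, which suits columns that really are categorical. Option 2 is simpler when the column holds free text.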