Thursday, 31 January 2019

Random forest in R Language

In the random forest approach, a large number of decision trees are created, and every observation is fed into every tree. For classification, each tree votes for a class and the most common outcome across the trees is used as the final output. A new observation is classified the same way: it is fed into all the trees and the majority vote is taken.
An error estimate is made from the cases that were not used while building each tree. This is called the OOB (out-of-bag) error estimate and is reported as a percentage.
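The voting step can be sketched in a few lines of base R. The vote matrix below is made up for illustration (it is not output from a real forest), and majority_vote is a hypothetical helper name, not part of any package.

```r
# Hypothetical votes: each row is one tree, each column is one observation.
votes <- rbind(
  c("yes", "no",  "yes"),
  c("yes", "no",  "no"),
  c("no",  "no",  "yes"),
  c("yes", "yes", "yes"),
  c("yes", "no",  "no")
)

# Hypothetical helper: return the most frequent label in a vector of votes.
# On a tie, which.max() picks the first maximum, i.e. the label that sorts first.
majority_vote <- function(v) names(which.max(table(v)))

# Take the majority vote for every observation (column).
final <- apply(votes, 2, majority_vote)
print(final)
```

With an odd number of trees and two classes, as here, a tie cannot occur.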
The R package "randomForest" is used to create random forests.
Install R Package
Use the command below in the R console to install the package. Any packages it depends on must be installed as well.
install.packages("randomForest")
The package "randomForest" has the function randomForest() which is used to create and analyze random forests.

SYNTAX
The basic syntax for creating a random forest in R is −
randomForest(formula, data)
Following is the description of the parameters used −
  • formula is a formula describing the predictor and response variables.
  • data is the name of the data set used.
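As a hedged sketch of the call shape, the block below fits a forest on R's built-in iris data set. iris is not used elsewhere in this post and is chosen here only because it needs no extra package. ntree and mtry are optional arguments of randomForest(); the values 200 and 2 and the seed are arbitrary.

```r
library(randomForest)

set.seed(42)  # arbitrary seed so repeated runs build the same forest

# formula: Species is the response; "." means all other columns are predictors.
# data: the data frame that holds those columns.
iris.forest <- randomForest(Species ~ ., data = iris,
                            ntree = 200,  # number of trees to grow
                            mtry = 2)     # variables tried at each split

print(iris.forest$type)   # kind of forest that was fitted
print(iris.forest$ntree)  # number of trees that were grown
```

Because the response is a factor, randomForest() builds a classification forest; a numeric response would give a regression forest instead.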

INPUT DATA
We will use the data set named readingSkills, which ships with the party package, to create a random forest. Each record holds a person's age, shoeSize and reading score, together with whether the person is a native speaker.
Here is the sample data.
# Load the party package. It will automatically load other required packages.
library(party)

# Print some records from data set readingSkills.
print(head(readingSkills))
When we execute the above code, it produces the following result −
  nativeSpeaker   age   shoeSize      score
1           yes     5   24.83189   32.29385
2           yes     6   25.95238   36.63105
3            no    11   30.42170   49.60593
4           yes     7   28.66450   40.28456
5           yes    11   31.88207   55.46085
6           yes    10   30.07843   52.83124
Loading required package: methods
Loading required package: grid
...............................
...............................

EXAMPLE
We will use the randomForest() function to create the forest and inspect its results.
# Load the party package. It will automatically load other required packages.
library(party)
library(randomForest)

# Create the forest.
output.forest <- randomForest(nativeSpeaker ~ age + shoeSize + score, 
           data = readingSkills)

# View the forest results.
print(output.forest) 

# Importance of each predictor.
print(importance(output.forest, type = 2))
When we execute the above code, it produces the following result −
Call:
 randomForest(formula = nativeSpeaker ~ age + shoeSize + score,     
                 data = readingSkills)
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 1

        OOB estimate of  error rate: 1%
Confusion matrix:
    no yes class.error
no  99   1        0.01
yes  1  99        0.01
         MeanDecreaseGini
age              13.95406
shoeSize         18.91006
score            56.73051

CONCLUSION
From the random forest shown above we can conclude that score and shoeSize are the most important predictors of whether someone is a native speaker, since they have the largest MeanDecreaseGini values. The OOB error estimate is only 1%, so we can expect roughly 99% accuracy on observations the trees have not seen.
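A fitted forest can also be queried directly. The sketch below refits the same model, reads the OOB error from the err.rate component, and classifies one new person with predict(). The seed and the values in new_person are hypothetical, so the numbers printed will not necessarily match the output above.

```r
library(party)         # provides the readingSkills data set
library(randomForest)

set.seed(1)  # arbitrary seed so the run is repeatable
forest <- randomForest(nativeSpeaker ~ age + shoeSize + score,
                       data = readingSkills)

# err.rate has one row per tree; the last row is the OOB error of the full forest.
oob <- forest$err.rate[forest$ntree, "OOB"]
print(oob)

# Hypothetical new observation with the same predictor columns.
new_person <- data.frame(age = 8, shoeSize = 27.5, score = 45)
prediction <- predict(forest, newdata = new_person)
print(prediction)
```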

Saturday, 26 January 2019

XPath in Selenium Webdriver


Automation involves identifying elements on the UI and performing actions on them. To identify UI elements, Selenium WebDriver offers various locators that work against the HTML DOM.
When an element cannot be found using other locators such as id, class name or CSS selector, XPath is used.

What is Xpath?

XPath (XML Path Language) is a query language for selecting nodes from an XML document.
XPath provides the ability to navigate around an XML tree and select nodes by various criteria.
Below is an example of an XML tree:




In this example, Product is the root element of the tree, and its two child nodes are Name and Detail.
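A tree of this shape can be explored from R. The sketch below assumes the xml2 package is installed (it is not mentioned in the original post), and the text inside Name and Detail is made up for illustration.

```r
library(xml2)

# Hypothetical XML document: Product is the root, Name and Detail are its children.
doc <- read_xml("<Product><Name>Laptop</Name><Detail>15 inch screen</Detail></Product>")

root <- xml_root(doc)
root_name <- xml_name(root)                  # name of the root element
child_names <- xml_name(xml_children(root))  # names of its child nodes

print(root_name)
print(child_names)

# Select the Name node by its path from the root and read its text.
product_name <- xml_text(xml_find_first(doc, "/Product/Name"))
print(product_name)
```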

Understanding the HTML DOM
Whenever a web page is loaded, the browser creates a Document Object Model of the page.
The HTML DOM is a standard object model and programming interface for HTML. It defines:
The HTML elements as objects
The properties of all HTML elements
The methods to access all HTML elements
The events for all HTML elements

In other words: The HTML DOM is a standard for how to get, change, add, or delete HTML elements.
The HTML DOM is constructed as a tree of objects:


Before studying XPath, it is important to have some basic knowledge of the DOM structure.

Consider the HTML DOM of the Facebook signup page below.



Let us concentrate on the highlighted portion of the DOM:
Here "input" is a tag, and class, data-type, name, aria-required, placeholder, aria-label and id are known as attributes. Each of these attributes has a value that is used when building the HTML page.

When should xpath be used:

Elements on any UI can be located using various HTML attributes such as id, class and name, but it often happens that:
1. The attributes that are supported by Selenium WebDriver are not present.
2. The attributes that are available are not unique.

In such cases XPath can be used to locate the UI elements. XPath can easily traverse the HTML nodes, and this gives a lot of flexibility for creating unique identifiers.

Basic syntax of xpath:

//tagname[@attribute_name='attribute_value']

Explanation:
// - Selects nodes in the document from the current node that match the selection, no matter where they are.

@ - Selects an attribute.

Let us look at a simple example below






We have created an xpath for locating the first name text box by using the basic syntax //tagname[@attribute_name='attribute_value']:

//input[@name='firstname']

In the above xpath, input is the tag name, name is the attribute, and 'firstname' is the value of the name attribute.

Typing the above xpath into the HTML DOM view highlights the HTML element that corresponds to the first name text box on the UI, which confirms that the xpath is correct.
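The same check can be reproduced without a browser. This sketch assumes R's xml2 package and evaluates the xpath against a small hand-written form; the markup, including the id and placeholder values, is hypothetical and only loosely modelled on a signup page.

```r
library(xml2)

# Hypothetical signup form with two text boxes.
page <- read_html("
<html><body>
  <form>
    <input type='text' name='firstname' id='u_0_1' placeholder='First name'>
    <input type='text' name='lastname'  id='u_0_2' placeholder='Surname'>
  </form>
</body></html>")

# //tagname[@attribute_name='attribute_value']
hits <- xml_find_all(page, "//input[@name='firstname']")

print(length(hits))                   # how many elements matched
print(xml_attr(hits, "placeholder"))  # which element was matched
```

In Selenium the same expression string would be handed to the XPath locator; a single match is the sign that the xpath is unique.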

 


Types of Xpath


There are 3 types of xpaths:

1. Absolute xpath
2. Relative xpath
3. Partial xpath

Let us go through each one of them one by one:

1.    Absolute xpath: It uses the complete path from the root element to the desired element. The key characteristic of an absolute XPath is that it begins with a single forward slash (/), which means the element is selected starting from the root node, and the root node of any web page is always 'html'.

Absolute xpaths are very fragile and are not widely used, since to reach a particular node you have to traverse from the root all the way to the required node. If extra web elements are later added to the page along that path, the absolute xpath that was created becomes invalid.


Below is an example of an absolute xpath that can be copied directly by using the FirePath plugin in the Mozilla Firefox browser.



In the above snapshot we can see that the xpath for Google Maps starts from the html tag. This is the root node, which marks the beginning of the DOM structure. From html we navigate down to the Google Maps element node by node.

2.   Relative Xpath:

As we have seen in the Google Maps example above, the generated absolute xpath is very lengthy, and there is a high chance that it will change with every release as UI elements are added or removed. To avoid frequent 'No such element found' errors we can use relative xpaths.
A relative xpath searches for the matching element anywhere in the HTML DOM.

It starts with a double slash (//), which means the element is searched for anywhere in the HTML DOM.

Basic syntax for this is:

//tagname[@attribute_name='attribute_value']
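The fragility of absolute xpaths can be shown with two versions of a page. The sketch assumes R's xml2 package; both pages are hypothetical, and the second differs from the first only by one extra wrapper div.

```r
library(xml2)

# Hypothetical page before and after a release that adds a wrapper <div>.
page_v1 <- read_html(
  "<html><body><div><form><input id='email'></form></div></body></html>")
page_v2 <- read_html(
  "<html><body><div><div><form><input id='email'></form></div></div></body></html>")

absolute <- "/html/body/div/form/input"  # full path from the root node
relative <- "//input[@id='email']"       # search anywhere in the DOM

abs_v1 <- length(xml_find_all(page_v1, absolute))
abs_v2 <- length(xml_find_all(page_v2, absolute))
rel_v1 <- length(xml_find_all(page_v1, relative))
rel_v2 <- length(xml_find_all(page_v2, relative))

# Number of matches per page for each style of xpath.
print(c(absolute_v1 = abs_v1, absolute_v2 = abs_v2))
print(c(relative_v1 = rel_v1, relative_v2 = rel_v2))
```

The absolute path stops matching once the extra div is inserted, while the relative xpath still finds the element on both pages.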

3.    Partial Xpath (Dynamic xpath):
      Dynamic/custom xpaths are used to narrow down the matching nodes, which helps to identify web elements efficiently.
      They are also used to create xpaths for web elements whose attributes change each time the webpage loads.

      There are various ways to create dynamic xpaths, each explained below:

1.Using contains() and starts-with()
2.Using AND and OR conditions
3.Using multiple attributes
4.Using Siblings
5.Using Ancestors


1.Using contains() and starts-with():

These functions can be used in the below situations:

  •     When the HTML properties/attributes are not very clear for any webelement.
  •     When we want to create a list of elements having same partial value
  •     When the values of attributes are dynamic

Syntax for contains:
  •     By text - //tagname[contains(text(),'anytext')]
  •     //tagname[contains(.,'text')]
Consider the example below, where the xpath for the 'Create an account' text is created using the contains function.



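In the above xpath, '.' can be replaced by text(). The only difference is that '.' tests the full string value of the element, including text inside nested child elements, while text() tests only the element's own text nodes.
E.g. //span[contains(text(),'Create an account')]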


By Attribute:

//tagname[contains(@attribute_name,'attributevalue')]

Let us look at the example below, where we have created an xpath for the title 'facebook' using a partial attribute value.
When the HTML attributes are not very clear, as in the example below, we can match part of the attribute value using contains().
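The contains() forms can be tried on a static snippet. This is a sketch assuming R's xml2 package; the markup is hypothetical and is not the real Facebook page. The second span wraps part of its text in a child element to show where text() and '.' behave differently.

```r
library(xml2)

# Hypothetical markup: an image with a title and two spans with the same visible text.
page <- read_html("
<html><body>
  <img title='Facebook logo' src='logo.png'>
  <span id='plain'>Create an account</span>
  <span id='nested'><b>Create</b> an account</span>
</body></html>")

# By text: text() looks only at the span's own text nodes.
by_text <- xml_find_all(page, "//span[contains(text(),'Create an account')]")

# By string value: '.' also includes text inside child elements such as <b>.
by_dot <- xml_find_all(page, "//span[contains(.,'Create an account')]")

# By attribute: match any element whose title contains 'Facebook'.
by_attr <- xml_find_all(page, "//*[contains(@title,'Facebook')]")

print(xml_attr(by_text, "id"))
print(xml_attr(by_dot, "id"))
print(xml_name(by_attr))
```

Only the plain span matches the text() form, because the first text node of the nested span is just " an account".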




Using starts-with()

  • Similar to contains(), the starts-with() function can be used to create partial xpaths. It is useful in situations where the attribute values are not clear. For example, in the case below the value of the class attribute is awkward to match in full since there is a space between the two words, hence we can use the starts-with function here.
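A minimal sketch of starts-with(), assuming R's xml2 package. The class values below are hypothetical stand-ins for a two-word class attribute.

```r
library(xml2)

# Hypothetical markup: two panels whose class values consist of two words.
page <- read_html("
<html><body>
  <div id='a' class='signup panel'>Sign up</div>
  <div id='b' class='login panel'>Log in</div>
</body></html>")

# Match on the leading part of the class value only.
hits <- xml_find_all(page, "//div[starts-with(@class,'signup')]")

print(length(hits))
print(xml_attr(hits, "id"))
```

An exact match on @class='signup' would find nothing here, because the full attribute value is 'signup panel'.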






2.Using AND and OR conditions
In an OR expression two conditions are used, and the element is matched if the 1st condition, the 2nd condition, or both are true. In other words, at least one condition must be true to find the element.
The XPath expression below identifies the elements for which one or both conditions are true. Note that XPath operators are case-sensitive and must be written in lowercase (or, and).
Xpath=//*[@type='submit' or @name='btnReset']
This highlights both elements: the "LOGIN" element, which has the attribute 'type', and the "RESET" element, which has the attribute 'name'.
Similarly, the AND condition can be used to combine 2 attributes, so that both conditions must be true.
Below is an xpath to identify the Experience button in a Naukri profile:
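Both operators can also be tried on a small hand-written form. This is a sketch assuming R's xml2 package; the form is hypothetical (it is not the Naukri page) and reuses the LOGIN and RESET buttons from the OR example. The btnLogin name is an assumption made for the AND case.

```r
library(xml2)

# Hypothetical login form with a text box and two buttons.
page <- read_html("
<html><body>
  <form>
    <input type='text'   name='username'>
    <input type='submit' name='btnLogin' value='LOGIN'>
    <input type='reset'  name='btnReset' value='RESET'>
  </form>
</body></html>")

# or: an element matches when at least one condition is true.
either <- xml_find_all(page, "//*[@type='submit' or @name='btnReset']")

# and: an element matches only when both conditions are true.
both <- xml_find_all(page, "//*[@type='submit' and @name='btnLogin']")

print(xml_attr(either, "value"))
print(xml_attr(both, "value"))
```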


3.Using multiple attributes:

Multiple attributes can be used to create unique xpaths.

Consider the example below, where several web elements share the same class name and you need an xpath for locating Today's Deal.

Below we have created an xpath using only the 'class' attribute, which shows that there are 7 elements on the webpage with the class name 'nav-a'.



But in order to locate Today's Deal we can take the help of another attribute, tabindex. Creating the xpath using these two attributes gives us a unique xpath. Refer to the snapshot below.
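The idea can be reproduced on a cut-down navigation bar. This sketch assumes R's xml2 package; the markup is hypothetical, with three links instead of seven and made-up tabindex values.

```r
library(xml2)

# Hypothetical navigation bar: every link shares the class 'nav-a'.
page <- read_html("
<html><body>
  <div id='nav'>
    <a class='nav-a' tabindex='47' href='#'>Your Account</a>
    <a class='nav-a' tabindex='48' href='#'>Today's Deals</a>
    <a class='nav-a' tabindex='49' href='#'>Gift Cards</a>
  </div>
</body></html>")

# One attribute: not unique, every link matches.
by_class <- xml_find_all(page, "//a[@class='nav-a']")

# Two attributes: a second predicate narrows the match to a single link.
by_both <- xml_find_all(page, "//a[@class='nav-a'][@tabindex='48']")

print(length(by_class))
print(xml_text(by_both))
```

Chaining two predicates has the same effect here as joining the conditions with and inside one predicate.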




Using xpath axes:

      An XPath axis defines a node-set relative to the current node. Names of axes include "ancestor", "descendant", "parent", "following" and "preceding".
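A few axes in action, as a sketch assuming R's xml2 package. The form below is hypothetical. The sibling axes (following-sibling, preceding-sibling) are standard XPath axes added here for illustration alongside the ones named above.

```r
library(xml2)

# Hypothetical form: a label, an input and a hint inside one row.
page <- read_html("
<html><body>
  <form id='signup'>
    <div class='row'>
      <label for='email'>Email</label>
      <input id='email' name='email'>
      <span class='hint'>We never share it</span>
    </div>
  </form>
</body></html>")

# parent: the element that directly contains the input.
parent_div <- xml_find_first(page, "//input[@id='email']/parent::div")

# ancestor: any enclosing element, here the form.
form <- xml_find_first(page, "//input[@id='email']/ancestor::form")

# following-sibling / preceding-sibling: nodes on the same level.
input_after_label <- xml_find_first(page, "//label[@for='email']/following-sibling::input")
label_before_hint <- xml_find_first(page, "//span[@class='hint']/preceding-sibling::label")

# descendant: any element nested inside the form, at any depth.
inputs_in_form <- xml_find_all(page, "//form[@id='signup']/descendant::input")

print(xml_attr(parent_div, "class"))
print(xml_attr(form, "id"))
print(xml_attr(input_after_label, "id"))
print(xml_text(label_before_hint))
print(length(inputs_in_form))
```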

