John Aaron Nelson


April 19, 2015

What is it?
Selenium is a web automation framework that provides a unified API for manipulating browsers. Tests can be built interactively with a Firefox plugin (v2.9), and Selenium also provides a rich architecture to code against in a variety of the most popular programming languages. Its practical purpose is to replay actions in a browser, either interactively or from code, and its most popular use is automated testing for web developers. PhantomJS is a similar technology, running exclusively with JavaScript on WebKit, but we will be talking about Selenium here as it is the broader technology.

Selenium is a relatively mature web technology. Begun in 2004, it currently sits at version 2.45. The technology has been implemented in many popular languages via "Bindings". Selenium allows us to control any browser, including headless browsers like HtmlUnit, using "Drivers". The promise of Selenium is that we can write once and test everywhere. Selenium uses your language of choice on a computer of your choice and communicates with the browser of your choice. Selenium also includes very good logging, so detailed notes about our tests are kept for us.

Why do we need it?
Regression hurts.  We don't want to break what we've already built.  The real test of any application is not in the beauty of the code, but that the app works as requested.  We must automate our manual tests so we can get visibility into whether our app works or not. Applications often break in frustratingly simple ways during optimization.  This technology should allow us to break that pattern by making it easier to test.
Let's get started!
The Selenium IDE is the easiest way to get started with writing Selenium tests. The IDE is a Firefox plugin, functionally similar to a macro recorder, that records our actions by converting browser events into Selenium commands. The IDE is started by pressing Ctrl+Shift+S. From then on, everything done in the browser will be recorded for playback.

When the IDE opens, we see the command table, which shows the recorded commands. When an event in the browser fires, it shows up in the table. The source code bound to this table can be seen by clicking the source tab. In the source tab, we see Selenese, the source language for Selenium tests. Documentation for Selenese is built into the IDE and is displayed when a command is clicked. Double-clicking a command will execute it in the browser.

Understanding Selenium
Selenese statements are made up of commands, locators, and values. Commands are the actions to take. Locators (selectors) allow us to select a DOM element for evaluation. Values allow us to set and check values of the selected element. It is a simple yet powerful language and syntax.

The testing part of Selenium is primarily achieved with assert and verify. Assert and verify allow us to make a test fail. Failing is good: it is what turns a recorded script into a test. These commands, and others, can be found in the context (right-click) menu of the IDE.
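
The two commands differ in what happens after a failure: a failed assert stops the test immediately, while a failed verify records the failure and lets the test carry on. Here is a minimal C# sketch of that distinction, independent of Selenium itself. The names Checks, AssertEqual, and VerifyEqual are hypothetical and chosen only for illustration; they are not part of any Selenium API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating assert vs. verify semantics.
public class Checks
{
    // Failures recorded by VerifyEqual; the test keeps running after each one.
    public readonly List<string> VerificationErrors = new List<string>();

    // Assert: a mismatch throws, which aborts the test right away.
    public void AssertEqual(string expected, string actual)
    {
        if (expected != actual)
            throw new InvalidOperationException(
                "Assert failed: expected '" + expected + "' but was '" + actual + "'");
    }

    // Verify: a mismatch is recorded and execution continues.
    public void VerifyEqual(string expected, string actual)
    {
        if (expected != actual)
            VerificationErrors.Add(
                "Verify failed: expected '" + expected + "' but was '" + actual + "'");
    }
}

public static class Demo
{
    public static void Main()
    {
        var checks = new Checks();
        checks.VerifyEqual("Welcome", "Wellcome"); // recorded, test continues
        checks.VerifyEqual("Home", "Home");        // passes, nothing recorded
        Console.WriteLine(checks.VerificationErrors.Count + " verification error(s)");

        try
        {
            checks.AssertEqual("Welcome", "Wellcome"); // throws, a real test would stop here
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

In practice a unit test framework supplies the assert side; the list of recorded errors is one simple way to get verify-style behavior on top of it.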

Moving on to code-based automation, we will use Visual Studio 2013 to write a console application that automates Firefox. The first thing to do after creating the console application is to use NuGet to bring in the Selenium WebDriver package. A simple sample:

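using System.Linq;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

class Program    {
   static void Main(string[] args)        {
     // Launch Firefox under WebDriver control
     IWebDriver driver = new FirefoxDriver();
     driver.Url = ""; // the URL is blank in this sample; set it to the page under test
     var searchBox = driver.FindElement(By.Name("q"));
     searchBox.SendKeys("Hello World");
     var imageLinks = driver.FindElements(By.LinkText("Images"));
     IWebElement imageMenuItem = imageLinks.First();
     var images = driver.FindElements(By.CssSelector("div.rg_el:nth-child(1) > a:nth-child(1) > img:nth-child(1)"));
     var image = images.First();
     driver.Quit(); // close the browser when done
   }
}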
Moving on to more advanced stuff...
Add the "Selenium WebDriver Support Classes" package from NuGet to enhance element selection abilities, for example wrapping a select (dropdown) element with "new SelectElement(FindElement(By...))". Alternatively, use Firebug to copy an element's XPath and pass it to the built-in "FindElement(By.XPath(""))". Firefox also has a "Copy unique selector" option.
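
As a rough illustration of both approaches, the sketch below wraps a dropdown in SelectElement and then finds the same element again by XPath. The page address, the element id "country", and the option text "Canada" are assumptions made up for this example; substitute values from your own page.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.Support.UI; // provided by the Support Classes package

class SelectExample
{
    static void Main(string[] args)
    {
        IWebDriver driver = new FirefoxDriver();
        try
        {
            // Hypothetical page containing <select id="country">...</select>
            driver.Url = "http://localhost/form.html";

            // SelectElement wraps a <select> element and adds option helpers.
            var dropdown = new SelectElement(driver.FindElement(By.Id("country")));
            dropdown.SelectByText("Canada");
            Console.WriteLine(dropdown.SelectedOption.Text);

            // The same element located with an XPath expression instead.
            IWebElement byXPath = driver.FindElement(By.XPath("//select[@id='country']"));
            Console.WriteLine(byXPath.TagName);
        }
        finally
        {
            driver.Quit(); // always close the browser, even if a lookup fails
        }
    }
}
```

SelectElement only works on select elements; for anything else, a plain FindElement with a CSS selector or XPath is the tool to reach for.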

When an implicit wait is configured, Selenium keeps polling the DOM each time we select an element, until the element appears or the timeout expires. Even so, we may run into selector issues because an element is not yet available, for example when content loads asynchronously. An alternative is an "explicit wait", which uses the support classes referenced above to poll (every half a second by default) for a specific condition.

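// Requires: using System; using OpenQA.Selenium; using OpenQA.Selenium.Support.UI;
WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(5));
// Poll until an element with class "qs" exists; time out after 5 seconds
IWebElement element = wait.Until(e => {
  var elements = driver.FindElements(By.ClassName("qs"));
  if (elements.Count > 0) {
    return elements[0];
  }
  return null; // null tells the wait to keep polling
});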
Selenium Server
Selenium runs in three modes: local, server, and grid. Selenium Server operates via HTTP, and its big win is cross-platform testing. For our code, the only practical difference is that when we instantiate our IWebDriver we use "RemoteWebDriver", give it the server address, and optionally specify the browser capabilities we expect. The Java server provides a remote interface to the same WebDriver drivers used when running locally. The servers are configured with JSON.
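
A minimal sketch of that difference, assuming a Selenium Server is already listening on this machine at the default port and using the Selenium 2.x-style DesiredCapabilities class. The page address is a placeholder, not something from the setup described here.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Remote;

class RemoteExample
{
    static void Main(string[] args)
    {
        // Assumed address: a server on this machine at the default port 4444.
        var serverUri = new Uri("http://localhost:4444/wd/hub");

        // Ask the server for a Firefox session.
        DesiredCapabilities capabilities = DesiredCapabilities.Firefox();

        // The only change from local code: RemoteWebDriver instead of FirefoxDriver.
        IWebDriver driver = new RemoteWebDriver(serverUri, capabilities);
        try
        {
            driver.Url = "http://localhost/"; // hypothetical page under test
            Console.WriteLine(driver.Title);
        }
        finally
        {
            driver.Quit(); // ends the remote session
        }
    }
}
```

Everything after the constructor call is ordinary IWebDriver code, so existing tests can usually be pointed at a server by changing that one line.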

To start the server, Java (JRE) must be installed and be in the PATH. Browse to the folder containing the Selenium JAR file. Ensure Java works by typing "java" in the terminal. Then execute the JAR.

$ java -jar selenium-standalone-2.45.0.jar -port 44

With the -port flag above, the web server will start at localhost:44/wd/hub. If the flag is omitted, the default port is 4444.

Grid mode is an alternate server mode which turns the server into a hub (orchestrator). The hub will communicate with and delegate tests to other servers (nodes) automagically, using JSON. Grid mode is useful for spreading resource requirements across machines and for running tests under different environments, like Mac or Linux.

To start the server as a hub, run

$ java -jar selenium-standalone-2.45.0.jar -role hub -port 4444

The hub server will start at localhost:4444/grid/console. If you browse to that address you will see the configuration settings of the hub. This is also where you will see the log.

To start the server as a node, run

$ java -jar selenium-standalone-2.45.0.jar -role node -hub http://localhost:4444/grid/register

The node server will start at localhost:5555.

Test code that targets a grid is exactly the same as code that targets a regular server, but the hub will delegate the test to another server, depending on requirements and capabilities. For detailed setups, configuration settings are your friend. Sauce Labs can provide a Selenium grid in the cloud, for a price.
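
To show how requirements steer delegation, the sketch below asks a hub for Firefox on Linux, again using Selenium 2.x-style capabilities. It assumes the hub from the commands above is running on localhost:4444 and that at least one registered node can satisfy the request. The helper name LinuxFirefox and the page address are made up for this example.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Remote;

class GridExample
{
    // Hypothetical helper: capabilities describing "Firefox, any version, on Linux".
    static DesiredCapabilities LinuxFirefox()
    {
        return new DesiredCapabilities("firefox", string.Empty, new Platform(PlatformType.Linux));
    }

    static void Main(string[] args)
    {
        // Tests talk to the hub; the hub picks a node matching the capabilities.
        var hubUri = new Uri("http://localhost:4444/wd/hub");
        IWebDriver driver = new RemoteWebDriver(hubUri, LinuxFirefox());
        try
        {
            driver.Url = "http://localhost/"; // hypothetical page under test
            Console.WriteLine(driver.Title);
        }
        finally
        {
            driver.Quit(); // frees the slot on the node
        }
    }
}
```

If no node matches the requested platform and browser, the session request will not be fulfilled, so it helps to check the grid console before running.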

Written by John Nelson, who lives and works in Chattanooga, building things for Clearbit.