Introduction to Selenium And Appium

If you are developing websites or mobile applications, then you must have heard about Selenium or Appium. If you are familiar with them and have had the chance to use them, then we congratulate you on keeping up with industry standards. You may now go back to work.

If you haven’t heard about them, or if you know about them but have never really used them, or have used them only at a very basic level, then this post is just for you.

What is Selenium or Appium?

Selenium and Appium are frameworks for automated UI testing. They allow you to automate user interaction with websites or apps. Unlike manual testing, automated tests scale: you can run a very large set of tests across many devices and platforms, which in turn increases test coverage.

How does it work really?

Selenium and Appium are capable of identifying, locating and interacting with the elements of a website’s DOM (Document Object Model, the tree of objects that a browser builds from the page’s HTML) or an application’s object structure. So if I had a webpage that had the following HTML as part of its structure, making up a login form:

I could then automate a login test using Selenium. Selenium is able to identify the input elements and enter values into them. It can then identify and click on the submit button. Expand this functionality to whole pages and entire sites, and you can automate complete workflows, from signup and login all the way to posting on forums and purchasing products.

Appium works much the same way. Like Selenium, it is capable of identifying HTML markup and interacting with it, but it goes beyond Selenium in that it is also able to identify and interact with a mobile app’s object structure. The object structure is a bit different from HTML, but the idea is the same.

See below the object structure of an app’s login page (the app is running on an iOS device):

Don’t be alarmed if this looks bloated. It is simply the elements and their corresponding properties.

Selenium and Appium recognize elements by a path (a string) that uniquely identifies an element. The path can be an XPath expression or can reference an element’s property such as id or name. Basically, any property can be used to identify elements.


XPath is a query language for selecting nodes from an XML document, but it also works for HTML. If I ask Selenium to find the element that corresponds to //*[@id="inputEmail"], it will scan the entire node structure for an element that has an id attribute with the value inputEmail. See the HTML example above and see if you can recognize the element that Selenium will locate. Telling Selenium how to locate the element allows it to interact with it. In this case, since the element is an input element of type text, Selenium will be able to enter or clear text. If we told Selenium to locate the element that corresponds to the XPath //*[@type="submit"], it would find the submit button, and we could then tell Selenium to click on it.

Appium operates in much the same way. If I ask Appium to find the element that corresponds to //*[@text="usernameTextField"], it will scan the entire node structure for an element that has a text attribute with the value usernameTextField. This is where I should warn you that the text attribute is bound to change. Once characters have been typed into this element, //*[@text="usernameTextField"] will no longer match, because the text value has changed. It is better to use a static attribute-value pair such as label or id, for example //*[@label="usernameTextField"], as these are not likely to change.
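The locator behaviour described above can be tried without a browser or a device. The following is a minimal sketch, not Selenium or Appium code: it uses the XPath engine that ships with the JDK (javax.xml.xpath) to run the same expressions against two small documents. Both documents are hypothetical stand-ins written for this illustration, and the class and method names are assumptions as well.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XPathLocatorSketch {

    // Hypothetical login form, written as well-formed XML so the JDK parser accepts it.
    static final String LOGIN_FORM =
          "<form id='loginForm'>"
        + "<input type='text' id='inputEmail' name='email'/>"
        + "<input type='password' id='inputPassword' name='password'/>"
        + "<button type='submit'>Sign in</button>"
        + "</form>";

    // Hypothetical fragment of a mobile app's object structure.
    static final String APP_SCREEN =
          "<screen>"
        + "<textField label='usernameTextField' text='usernameTextField'/>"
        + "<button label='loginButton' text='Login'/>"
        + "</screen>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the first element matching the XPath expression, or null if nothing matches.
    static Element find(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Element) xpath.evaluate(expression, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        // Web case: locate the email input by id and the submit button by type.
        Document form = parse(LOGIN_FORM);
        Element email = find(form, "//*[@id=\"inputEmail\"]");
        Element submit = find(form, "//*[@type=\"submit\"]");
        System.out.println(email.getTagName() + " / " + submit.getTagName());

        // App case: simulate the user typing, which changes the text attribute.
        Document screen = parse(APP_SCREEN);
        Element field = find(screen, "//*[@text=\"usernameTextField\"]");
        field.setAttribute("text", "alice");

        // The text-based locator no longer matches; the label-based one still does.
        System.out.println(find(screen, "//*[@text=\"usernameTextField\"]"));
        System.out.println(find(screen, "//*[@label=\"usernameTextField\"]").getTagName());
    }
}
```

Running main prints the tag names of the two form elements, then null for the text-based lookup after the simulated typing, and finally textField for the label-based lookup. That difference is the reason a stable attribute makes the safer locator.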

The Driver

The driver is the bread and butter of Selenium and Appium. It is a class packed with commands and properties that let you do almost anything when testing your website or mobile app. Selenium even goes as far as executing JavaScript, switching between tabs, and identifying frames within a webpage. Appium is capable of handling hybrid apps (apps that combine native and web elements), as well as installing and launching applications. These capabilities are just the tip of the iceberg of what Selenium and Appium can do.

WebDriver – Selenium

Selenium needs the browser driver in order to interact with the UI of a website. Let’s look at a simple Selenium test written in Java:

You can see the import statements for the required classes, WebDriver and ChromeDriver. The test creates a new driver as a ChromeDriver, opens the target site and gets the site title. It then populates the search field with the phrase “Appium and Selenium” and hits the search button. That’s it. The core capability of Selenium is its ability to find elements using the method findElement. Once the element has been located, you can populate it with data, click on it, clear it and more. As mentioned earlier, there is so much more you can do with Selenium. Review the list of available classes and methods to learn more.


AppiumDriver is an extension of WebDriver that is capable of interacting with native and hybrid mobile apps. Unlike Selenium, Appium requires a set of capabilities that determine the nature of the driver, the nature of the test, and what the test depends on in terms of the application and the devices being tested. Let’s look at a simple Appium test written in Java:


What this test does is install an application on an iPhone 6 running iOS 10.3. The test then launches the app and fetches the page source, which is the node structure of the app’s default page. Look at the import section and notice that Appium has many more dependencies than Selenium does. Appium also needs capabilities that determine which app will be tested and on which device. Finally, Appium requires a server to run against; in this case the server is located at localhost, port 4723. Like Selenium, Appium harbors a lot of potential. To learn what classes, methods and properties Appium has, review the list of available classes and methods.
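To make the idea of capabilities more concrete: they are essentially key/value pairs that are sent to the Appium server when a session starts. The sketch below is an illustration under assumptions, not the original test. It models the capabilities as a plain Java map using the commonly used capability names platformName, platformVersion, deviceName and app. The app path is a placeholder, the /wd/hub path on the server address is an assumption (it was the default in older Appium versions), and the class and method names are hypothetical. In a real test these pairs would be handed to the Appium client library rather than printed.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

class AppiumCapabilitiesSketch {

    // Host and port come from the text; the /wd/hub path is an assumption.
    static final URI SERVER = URI.create("http://localhost:4723/wd/hub");

    // Builds the key/value pairs that describe the device and the app under test.
    static Map<String, Object> iosCapabilities(String appPath) {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("platformName", "iOS");       // which platform the driver targets
        caps.put("platformVersion", "10.3");   // OS version of the device
        caps.put("deviceName", "iPhone 6");    // which device to run on
        caps.put("app", appPath);              // app to install and launch
        return caps;
    }

    public static void main(String[] args) {
        // Placeholder path; replace with the location of a real application file.
        Map<String, Object> caps = iosCapabilities("/path/to/MyApp.ipa");
        System.out.println("Server: " + SERVER);
        caps.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Keeping the capabilities in one small factory method makes it easy to see, and later change, which device and app a test is bound to.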

Let’s Recap

Selenium and Appium are both designed to automate UI testing. They rely on identifying elements within the DOM of a website or the node structure of a mobile app, and then interacting with these elements in various ways. Combining identification and interaction is what unlocks their potential: the ability to create a set of test cases that covers the UI/UX level of each and every feature that you incorporate into your website or mobile app.

Limitations of Selenium and Appium

Although widely used by developers and QA testers, Selenium and Appium don’t come without limitations. Both only automate UI testing. They don’t have access to the code that makes up the site or app. They require some setting up, especially Appium. To run Selenium or Appium tests, you must have some basic programming knowledge and the ability to set up your testing environment by fetching and integrating the Selenium dependencies. Appium is even more complex in that regard. To run Appium tests you have to install the Appium server using npm, and even then you are limited by your operating system: on Windows you can test only Android, while on a Mac you can test both Android and iOS. If you only have Windows, testing on iOS will not be possible. In addition, it is somewhat complex to run parallel tests using Appium, as you either have to launch multiple instances of the Appium server or register Appium with Selenium Grid. Appium also reacts badly to failed tests, often forcing you to kill and restart the server. solves most of the issues above.

Heads Up! Exceptions on The Way

One of the most challenging things about working with Selenium or Appium is that they are hardwired to fail whenever they are unable to execute a command. If you were looking for an element and it was not there, or the driver was unable to locate it (wrong XPath, page or app didn’t fully load, elements hidden by other elements), the test would throw an exception and stop running. You could catch the exception and handle it, but that should be done carefully. Only catch exceptions that come from steps that are not crucial for the completion of the test. Since Appium and Selenium are really unforgiving when it comes to not being able to locate elements or take actions (such as sending text or performing gestures), it is important that you develop your tests step by step, making sure that each step succeeds before adding more steps.
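The advice about catching exceptions only for non-crucial steps can be sketched in plain Java. This is a minimal illustration under assumptions, not Selenium or Appium code: the class StepRunnerSketch and its methods are hypothetical, and java.util.NoSuchElementException stands in for the driver's own "element not found" exception (in a Selenium test you would catch the exception type from the Selenium library instead).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

class StepRunnerSketch {

    final List<String> log = new ArrayList<>();

    // A required step: any exception propagates and ends the test.
    void required(String name, Runnable step) {
        step.run();
        log.add("PASS " + name);
    }

    // An optional step: only "element not found" is tolerated;
    // every other exception still propagates and fails the test.
    boolean optional(String name, Runnable step) {
        try {
            step.run();
            log.add("PASS " + name);
            return true;
        } catch (NoSuchElementException e) {
            log.add("SKIP " + name + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        StepRunnerSketch test = new StepRunnerSketch();

        // Stand-in for dismissing a banner that is not always shown.
        test.optional("dismiss banner", () -> {
            throw new NoSuchElementException("banner not present");
        });

        // Stand-in for a step the test cannot continue without.
        test.required("open login page", () -> { });

        test.log.forEach(System.out::println);
    }
}
```

Keeping the tolerated exception type narrow is the point of the design: a missing optional element is logged and skipped, while anything unexpected still stops the test at the step that caused it.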

Benefits of Using for Running your Selenium and Appium Tests

expands Selenium and Appium by improving your test coverage. allows you to run tests in parallel, without having to set up Selenium Grid or launch multiple instances of the Appium server. In addition, includes capabilities that are not included in Appium, such as:

  • Device query for pinpointing a specified subset of devices, e.g. all Samsung and LG devices running Android 6 and above
  • Executing scripts
  • Setting device geolocation
  • Setting test speed
  • Playing audio files for testing music recognition apps
  • Uninstalling applications
  • Testing outside the scope of the app
  • Monitoring CPU, memory and network
  • And much more…

also features Appium Studio, a desktop application compatible with Windows and Mac, which allows you to develop and run your automated tests with little to no programming knowledge. In under 10 minutes you can install Appium Studio, plug in your devices (Android and iOS alike, no matter the operating system), install your applications and start developing your tests.

Bottom Line

What you want to do is complement your manual QA with extensive and scalable test coverage that is built on automated test cases. Appium and Selenium are the natural choices for this purpose. With, it’s now easier than ever before.