Image Driven Automation Using Sikuli

By: Radha

If you want to automate applications across multiple platforms or devices and work with images at the pixel level, Sikuli is a strong choice. Many automation tools, such as Cucumber and Selenium, have difficulty interacting with images, and Sikuli addresses that gap. It can be used on its own through the Sikuli IDE, or as a jar file integrated with other automation tools. To learn more about Sikuli, please refer

What is Sikuli and how is it useful in automation?

Sikuli is an open-source, visual tool for automating and testing GUIs using images, without any API references. It can automate web as well as desktop applications on multiple platforms such as Windows, Linux, and Mac OS, and in simulators/emulators.

What does Sikuli consist of?

  • Sikuli IDE
  • Sikuli Script
  • Visual Scripting API for Jython

Sikuli IDE:

Sikuli is a visual approach to searching and automating graphical user interfaces using screenshots. It allows users to take a screenshot of a GUI element (such as a toolbar button, icon, or dialog box), and it provides a visual scripting API for automating GUI interactions, using screenshot patterns to direct mouse and keyboard events.

Searching by screenshot is easy to learn and faster to specify than keywords. Several automation tasks are well suited to visual scripting, such as map navigation and bug tracking, and visual scripting can also improve interactive help systems.

A programmer can insert a screenshot directly into a script statement and specify what keyboard or mouse actions to invoke when this element is seen on the screen.

Installation of Sikuli IDE:

Sikuli IDE can be installed on Windows, Linux, and Mac OS. At the time of writing, the latest version of Sikuli is Sikuli 1.0.1. Download it from the link below

Step 1: Unzip the downloaded Sikuli package.

Step 2: Go to the installation directory and double-click the file named “sikuli-ide.jar”.

Image-1 Sikuli IDE Installation Package

Step 3: After the installation, Sikuli IDE is ready for use. The home page of Sikuli IDE looks similar to the image below.

Image-2 Home page of Sikuli IDE

Sikuli IDE consists of a Title Bar, Menu Bar, Tool Bar, and Command List. The Title Bar displays the program name; the default name is “Untitled”. The Menu Bar contains all the available menus: File, Edit, Run, View, Tools, and Help. The Tool Bar contains the buttons Take screenshot, Insert Image, Create Region, Run, Run in Slow Motion, and Find. The Command List contains Find options, Mouse Actions, Keyboard Actions, and Event Observation.

Sikuli Script:

Sikuli Script is a Jython and Java library that automates GUI interaction, using image patterns to direct keyboard and mouse events. It can automate anything visible on the screen without support from the application's internal APIs. This includes web pages, Windows/Linux/Mac OS desktop applications, and iPhone/Android applications running in a simulator.

The Structure of a Sikuli Script:

  • A Sikuli Script is a directory (.sikuli) that contains a Python source file (.py) and all the image files (.png) used by that source file.
  • Each image used in a Sikuli Script is referenced simply by the path to its .png file in the .sikuli bundle.
  • The Python source file (.py) can be edited with any script editor.
  • An extra HTML file is also created in the .sikuli directory; it makes the script easy to share on the web.
  • A Sikuli executable script (.skl) is a zipped .sikuli directory. When a script is passed as a command-line argument, Sikuli recognizes its type by checking the file name extension. If it is .skl, Sikuli runs the script without opening the IDE window. If it is .sikuli, the IDE opens it in a source code editor.
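
The bundle layout and the extension check described above can be sketched in plain Python using only the standard library. This is a minimal illustration, not Sikuli's own packaging code: the helper names (`make_bundle`, `pack_skl`, `launch_mode`), the script name `compose_mail`, and the placeholder image are all hypothetical. It also assumes that the .py file shares the bundle's name and that the files sit at the root of the .skl archive.

```python
import os
import tempfile
import zipfile


def make_bundle(parent_dir, name, source, image_names):
    """Create <name>.sikuli holding <name>.py and placeholder .png files."""
    bundle = os.path.join(parent_dir, name + ".sikuli")
    os.makedirs(bundle)
    with open(os.path.join(bundle, name + ".py"), "w") as f:
        f.write(source)
    for image in image_names:
        # Empty placeholder; a real bundle holds actual screenshots.
        with open(os.path.join(bundle, image), "wb") as f:
            f.write(b"")
    return bundle


def pack_skl(bundle):
    """Zip the contents of a .sikuli directory into a sibling .skl file."""
    skl_path = bundle[:-len(".sikuli")] + ".skl"
    with zipfile.ZipFile(skl_path, "w") as archive:
        for file_name in sorted(os.listdir(bundle)):
            archive.write(os.path.join(bundle, file_name), file_name)
    return skl_path


def launch_mode(path):
    """Mirror the extension check: .skl is run, .sikuli is opened for editing."""
    extension = os.path.splitext(path.rstrip("/\\"))[1].lower()
    if extension == ".skl":
        return "run"       # execute without opening the IDE window
    if extension == ".sikuli":
        return "edit"      # open in the source code editor
    return "unknown"


workdir = tempfile.mkdtemp()
bundle_path = make_bundle(workdir, "compose_mail",
                          "click('compose.png')\n", ["compose.png"])
skl_file = pack_skl(bundle_path)
print(sorted(os.listdir(bundle_path)))
print(launch_mode(skl_file), launch_mode(bundle_path))
```

Keeping the images next to the source file is what lets a script refer to each screenshot by a bare file name.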

The file structure of a Sikuli Script looks similar to the image below:

Image-3 Sikuli Script File Structure

Below is a script for composing an email in a Gmail account using Sikuli.

Image-4 Sikuli Script
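
As a rough text counterpart to the script pictured above, the sketch below shows the general shape such a flow might take. It is not the author's script. Every .png name is a hypothetical placeholder for a screenshot you would capture yourself, and `RecordingScreen` is a made-up stand-in so the flow can be dry-run without Sikuli installed. It assumes the real Sikuli screen object offers `click`, `wait`, and `type` calls of the same shape.

```python
class RecordingScreen(object):
    """Hypothetical stand-in: records actions instead of driving a real GUI."""

    def __init__(self):
        self.actions = []

    def wait(self, image, timeout):
        self.actions.append(("wait", image, timeout))

    def click(self, image):
        self.actions.append(("click", image))

    def type(self, text):
        self.actions.append(("type", text))


def compose_email(screen, recipient, subject, body):
    """Fill in a mail compose form by image; all .png names are placeholders."""
    screen.click("compose_button.png")     # open the compose window
    screen.wait("new_message.png", 10)     # wait up to 10 seconds for it
    screen.click("to_field.png")
    screen.type(recipient)
    screen.click("subject_field.png")
    screen.type(subject)
    screen.click("body_area.png")
    screen.type(body)
    screen.click("send_button.png")


screen = RecordingScreen()
compose_email(screen, "someone@example.com", "Hello", "Sent from a Sikuli sketch.")
for action in screen.actions:
    print(action)
```

Inside Sikuli itself, the same function would be called with a real screen object in place of `RecordingScreen`, and each placeholder name would be replaced by a captured screenshot stored in the .sikuli bundle.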

Advantages of Sikuli:

  • Works across a variety of devices and platforms
  • Able to test complex applications
  • Able to verify images at the pixel level
  • Can automate emulators and simulators


Disadvantages of Sikuli:

  • Sikuli can automate only visible things; it cannot interact with hidden web elements.
  • Highly dependent on screen resolution
  • Cannot interact with moving or animated objects
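
The resolution dependence can be illustrated with a toy pixel-matching sketch. This is not Sikuli's matching algorithm, which scores similarity rather than requiring exact equality. It is a deliberately simplified stand-in in which images are small grids of integers, and `find_exact` and `scale_up` are hypothetical helpers. It only shows why a screenshot captured at one scale may no longer be found once the same GUI is rendered at another.

```python
def find_exact(screen, template):
    """Return (row, col) of the first exact pixel match, or None."""
    t_rows, t_cols = len(template), len(template[0])
    for r in range(len(screen) - t_rows + 1):
        for c in range(len(screen[0]) - t_cols + 1):
            if all(screen[r + i][c:c + t_cols] == template[i]
                   for i in range(t_rows)):
                return (r, c)
    return None


def scale_up(image, factor):
    """Nearest-neighbour upscaling, mimicking a larger display scale."""
    scaled = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        scaled.extend([list(wide) for _ in range(factor)])
    return scaled


# A 2x3 "button" screenshot and a 4x5 "screen" that contains it.
button = [[1, 2, 1],
          [3, 4, 3]]
screen = [[0, 0, 0, 0, 0],
          [0, 1, 2, 1, 0],
          [0, 3, 4, 3, 0],
          [0, 0, 0, 0, 0]]

print(find_exact(screen, button))               # found at the original scale
print(find_exact(scale_up(screen, 2), button))  # not found once the screen is scaled
```

In practice this is why screenshots are usually captured at the same resolution and scale as the machine that will run the script.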


Sikuli is platform independent and works on any GUI displayed on Windows, Linux, or Mac. It can also work on virtual machines, remote desktops, and mobile simulators for Android and iPhone. It can interact with web elements built with Flash or HTML + JavaScript. Sikuli programs are written against the GUI instead of an API. It can be used on its own through the IDE or integrated with other automation tools like Selenium and Cucumber.