Apache Apex Development Environment Setup
This document discusses the steps needed for setting up a development environment for creating applications that run on the Apache Apex platform.
There are a few tools that will be helpful when developing Apache Apex applications, including:
git - A revision control system (version 1.7.1 or later). There are multiple git clients available for Windows (http://git-scm.com/download/win for example), so download and install a client of your choice.
java JDK (not JRE) - Includes the Java Runtime Environment as well as the Java compiler and a variety of tools (version 1.7.0_79 or later). Can be downloaded from the Oracle website.
maven - Apache Maven is a build system for Java projects (version 3.0.5 or later). It can be downloaded from https://maven.apache.org/download.cgi.
IDE (Optional) - If you prefer to use an IDE (Integrated Development Environment) such as NetBeans, Eclipse or IntelliJ, install that as well.
After installing these tools, make sure that the directories containing the executable files are in your PATH environment variable.
- Windows - Open a console window and enter the command echo %PATH% to see the value of the PATH variable and verify that the above directories for Java, git, and maven executables are present. For JDK executables like java and javac, the directory might be something like C:\Program Files\Java\jdk1.7.0_80\bin; for git it might be C:\Program Files\Git\bin; and for maven it might be C:\Users\user\Software\apache-maven-3.3.3\bin. If not, you can change its value by clicking the button at Control Panel ⇨ Advanced System Settings ⇨ Advanced tab ⇨ Environment Variables.
- Linux and Mac - Open a console/terminal window and enter the command echo $PATH to see the value of the PATH variable and verify that the above directories for Java, git, and maven executables are present. If not, make sure the software is downloaded and installed, and optionally add and export the PATH entry in ~/.bash_profile. For example, to add maven located in /sfw/maven/apache-maven-3.3.3 to PATH, add the line:
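Assuming the maven executables sit in a bin subdirectory of that location, as in the Windows examples, the profile entry could look like the following sketch. The MAVEN_HOME variable name is only a convenience chosen here, not something the tools require.

```shell
# Assumed install location, taken from the example path in the text.
MAVEN_HOME=/sfw/maven/apache-maven-3.3.3

# Append the (assumed) bin subdirectory to PATH and export it so that
# child processes started from this shell can find the mvn executable.
export PATH="$PATH:$MAVEN_HOME/bin"
```

Open a new terminal, or source ~/.bash_profile, for the change to take effect.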
Confirm by running the following commands and comparing the results with the sample output indented under each command:
java -version
    java version "1.7.0_80"
    Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
    Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
git --version
    git version 2.6.1.windows.1
mvn --version
    Apache Maven 3.3.3 (7994120775791599e205a5524ec3e0dfe41d4a06; 2015-04-22T06:57:37-05:00)
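Before comparing version strings, it can help to confirm that each tool resolves on PATH at all. The following is a minimal sketch for Linux and Mac shells that relies on the POSIX command -v builtin; check_tool is a hypothetical helper name introduced here, not part of any of these tools.

```shell
# Hypothetical helper: report whether a single tool can be found on PATH.
# Returns 0 when the tool is found and 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: NOT found on PATH"
    return 1
  fi
}

# Check the tools listed in this document and count the missing ones.
missing=0
for tool in java javac git mvn; do
  check_tool "$tool" || missing=$((missing + 1))
done
echo "missing tools: $missing"
```

A tool reported as missing usually means its directory still needs to be added to PATH as described above.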
Creating New Apex Project
After the development tools are configured, you can use the maven archetype to create a basic Apache Apex project. Note: When executing the commands below, replace 3.4.0 with the latest available version of Apache Apex.
Windows - Create a new Windows command file called newapp.cmd by copying the lines below, and execute it. When you run this file, the properties will be displayed and you will be prompted with Y: :; just press Enter to complete the project generation. The caret (^) at the end of some lines indicates that a continuation line follows.
@echo off
@rem Script for creating a new application
setlocal
mvn archetype:generate ^
  -DarchetypeGroupId=org.apache.apex ^
  -DarchetypeArtifactId=apex-app-archetype -DarchetypeVersion=3.4.0 ^
  -DgroupId=com.example -Dpackage=com.example.myapexapp -DartifactId=myapexapp ^
  -Dversion=1.0-SNAPSHOT
endlocal
Linux - Execute the lines below in a terminal window. The new project will be created in the current working directory. The backslash (\) at the end of the lines indicates continuation.
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.apex \
  -DarchetypeArtifactId=apex-app-archetype -DarchetypeVersion=3.4.0 \
  -DgroupId=com.example -Dpackage=com.example.myapexapp -DartifactId=myapexapp \
  -Dversion=1.0-SNAPSHOT
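Because the archetype version has to be replaced with the latest release, it may be convenient to keep it in one place. The sketch below wraps the same command in a shell function; new_apex_app is a hypothetical name introduced here, and the default of 3.4.0 simply mirrors the version used in this document.

```shell
# Hypothetical helper (not part of Apex): generate the example project for a
# given archetype version, so the version number is written only once.
new_apex_app() {
  # First argument is the archetype version; 3.4.0 is only the default
  # used in this document and should be replaced by the latest release.
  apex_version="${1:-3.4.0}"
  mvn archetype:generate \
    -DarchetypeGroupId=org.apache.apex \
    -DarchetypeArtifactId=apex-app-archetype \
    -DarchetypeVersion="$apex_version" \
    -DgroupId=com.example -Dpackage=com.example.myapexapp \
    -DartifactId=myapexapp \
    -Dversion=1.0-SNAPSHOT
}
```

After defining the function, running new_apex_app followed by the desired version number in the directory where the project should be created is equivalent to typing the full command.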
When the run completes successfully, you should see a new directory named myapexapp containing a maven project for building a basic Apache Apex application. It includes 3 source files: Application.java, RandomNumberGenerator.java and ApplicationTest.java. You can now build the application by stepping into the new directory and running the maven package command:
cd myapexapp
mvn clean package -DskipTests
The build should create the application package file myapexapp/target/myapexapp-1.0-SNAPSHOT.apa. This application package can then be used to launch the example application via the apex CLI or visual management tools. When running, this application will generate a stream of random numbers and print them out, each prefixed by the string
Running Unit Tests
To run unit tests on Linux or OSX, simply run the usual maven command, for example:
On Windows, an additional file,
winutils.exe, is required; download it from
and unpack the archive to, say,
C:\hadoop; this file should be present under
hadoop-common-2.2.0-bin-master\bin within it.
Set the HADOOP_HOME environment variable system-wide to c:\hadoop\hadoop-common-2.2.0-bin-master as described at: https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/sysdm_advancd_environmnt_addchange_variable.mspx?mfr=true. You should now be able to run unit tests normally.
If you prefer not to set the variable globally, you can set it on the command line or within your IDE. For example, on the command line, specify the maven property:
mvn -Dhadoop.home.dir=c:\hadoop\hadoop-common-2.2.0-bin-master test
or set the environment variable separately:
set HADOOP_HOME=c:\hadoop\hadoop-common-2.2.0-bin-master
mvn test
Within your IDE, set the environment variable and then run the desired unit test in the usual way. For example, with NetBeans you can add:
at Properties ⇒ Actions ⇒ Run project ⇒ Set Properties.
Similarly, in Eclipse (Mars) add it to the project properties at Properties ⇒ Run/Debug Settings ⇒ ApplicationTest ⇒ Environment tab.
Building Apex Demos
If you want to see more substantial Apex demo applications and the associated source code, you can follow these simple steps to check out and build them.
Check out the source code repositories:
git clone https://github.com/apache/apex-core
git clone https://github.com/apache/apex-malhar
Switch to the appropriate release branch and build each repository:
cd apex-core
mvn clean install -DskipTests
cd ../apex-malhar
mvn clean install -DskipTests
The install argument to the mvn command installs resources from each project to your local maven repository (typically .m2/repository under your home directory), and not to the system directories, so Administrator privileges are not required. The -DskipTests argument skips running unit tests since they take a long time. If this is a first-time installation, it might take several minutes to complete because maven will download a number of associated plugins.
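The two builds can also be run from the parent directory in one step. The following is a minimal sketch for Linux and Mac shells; build_repos is a hypothetical helper name introduced here. Each build runs in a subshell so that the change of directory does not carry over to the next repository.

```shell
# Hypothetical helper: build each checked-out repository in turn.
# The parentheses start a subshell, so every cd is local to one build and
# the loop always resumes from the directory it was started in.
build_repos() {
  for repo in "$@"; do
    ( cd "$repo" && mvn clean install -DskipTests ) || return 1
  done
}
```

From the directory that contains both checkouts, build_repos apex-core apex-malhar builds them in the order given in the text and stops at the first failure.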
After the build completes, you should see the demo application package files in the target directory under each demo subdirectory in
To jump start development with an Apache Hadoop single node cluster, DataTorrent Sandbox powered by VirtualBox is available on Windows, Linux, or Mac platforms. The sandbox is configured by default to run with 6GB RAM; if your development machine has 16GB or more, you can increase the sandbox RAM to 8GB or more using the VirtualBox console. This will yield better performance and support larger applications. The advantage of developing in the sandbox is that most of the tools (e.g. jdk, git, maven), Hadoop YARN and HDFS, and a distribution of Apache Apex and DataTorrent RTS are pre-installed. The disadvantage is that the sandbox is a memory-limited environment, and requires settings changes and restarts to adjust memory available for development and testing.