Known Issues and Troubleshooting

Known Issues

This section describes the known issues related to the Qubole-Talend integration.

  • A SPARK_HOME issue: When running a Spark Job, you may encounter the following error.

    [ERROR]: org.apache.spark.SparkContext - Error initializing SparkContext.
    java.util.NoSuchElementException: key not found: SPARK_HOME
    

    Workaround

    To resolve this issue, install a Spark client on the machine where the Job is executed.

    1. Stop the JobServer. If you directly use the Studio to run your Job, stop the Talend Studio.

    2. Download the supported Spark version from Apache Spark.

      In this example, download:
      Spark release = 2.0.2 (Nov 14 2016)
      package type = Pre-built for Apache Hadoop 2.6
      
    3. Upload the downloaded archive to the JobServer machine and extract it to a directory of your choice, for example, /tmp/spark-2.0.2-bin-hadoop2.6. If you directly use the Studio to run your Job, perform these operations on the machine where your Studio is installed.

    4. Export the SPARK_HOME environment variable using the export command. For example, if you extracted the downloaded Spark package to /tmp/spark-2.0.2-bin-hadoop2.6, run:

      export SPARK_HOME=/tmp/spark-2.0.2-bin-hadoop2.6

    5. Restart the JobServer. If you directly use the Studio to run your Job, restart the Studio.
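
Before restarting, it can help to confirm that the process environment actually exposes SPARK_HOME. The following is a minimal sketch, not part of the Qubole or Talend tooling: the class name `SparkHomeCheck` and the method `describe` are hypothetical, and the check only verifies that the variable is set and points to an existing directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reports whether SPARK_HOME is set and is a directory.
public class SparkHomeCheck {

    /** Returns a short diagnostic for the given SPARK_HOME value (may be null). */
    public static String describe(String sparkHome) {
        if (sparkHome == null || sparkHome.trim().isEmpty()) {
            return "SPARK_HOME is not set";
        }
        Path dir = Paths.get(sparkHome);
        if (!Files.isDirectory(dir)) {
            return "SPARK_HOME is not a directory: " + sparkHome;
        }
        return "SPARK_HOME looks usable: " + sparkHome;
    }

    public static void main(String[] args) {
        // Reads the variable from the environment of the current process.
        System.out.println(describe(System.getenv("SPARK_HOME")));
    }
}
```

Run it from the same shell or service account that starts the JobServer or the Studio, because an exported variable is visible only to processes started from that environment.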

Troubleshooting

This section describes errors related to the Qubole-Talend integration and their resolutions.

Error: Locating the winutils binary in the Hadoop binary path fails on Windows.

Resolution:

  1. Install a full native Windows Hadoop version or get the WINUTILS.EXE binary from a Hadoop redistribution. See https://github.com/steveloughran/winutils

  2. Set the environment variable %HADOOP_HOME% to point to the parent directory of the bin directory that contains WINUTILS.EXE.

  3. Navigate to Designer >> Advanced Settings. Select Use specific JVM arguments as shown below.

    ../../../../_images/jvm-settings.png

  4. If you use a Job script for data integration, set the `hadoop.home.dir` system property to the `HADOOP_HOME` path.

    The following snippet shows how to set the Hadoop home directory on a Windows system.

    // Set hadoop.home.dir only when the Job runs on Windows.
    if (System.getProperty("os.name").toLowerCase().indexOf("win") >= 0) {
        System.setProperty("hadoop.home.dir", "C:/Program Files/hadoop");
    }
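
Steps 2 and 4 can also be combined so that the path is read from the HADOOP_HOME environment variable instead of being hard-coded. The following is a minimal sketch under that assumption: the class `HadoopHomeSetup` and its methods are hypothetical names, and the check for `bin/winutils.exe` is an extra safeguard that the original snippet does not perform.

```java
import java.io.File;
import java.util.Locale;

// Hypothetical helper: sets hadoop.home.dir from HADOOP_HOME on Windows.
public class HadoopHomeSetup {

    /** True when the given os.name value indicates Windows. */
    public static boolean isWindows(String osName) {
        // Locale.ROOT avoids locale-specific lowercasing surprises.
        return osName != null && osName.toLowerCase(Locale.ROOT).contains("win");
    }

    /** True when hadoopHome/bin/winutils.exe exists as a regular file. */
    public static boolean hasWinutils(String hadoopHome) {
        if (hadoopHome == null || hadoopHome.isEmpty()) {
            return false;
        }
        return new File(new File(hadoopHome, "bin"), "winutils.exe").isFile();
    }

    public static void main(String[] args) {
        if (!isWindows(System.getProperty("os.name"))) {
            return; // Nothing to configure on other operating systems.
        }
        String hadoopHome = System.getenv("HADOOP_HOME");
        if (hasWinutils(hadoopHome)) {
            System.setProperty("hadoop.home.dir", hadoopHome);
        } else {
            System.err.println("winutils.exe not found under HADOOP_HOME\\bin: " + hadoopHome);
        }
    }
}
```

Reading the environment variable keeps the Job script portable across machines where Hadoop is installed in different directories. The property has to be set before any Hadoop class is first used.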