Configuring Secure Endpoints (Azure)

Use this section if you want to run your QDS clusters in a more secure environment than QDS implements by default, or if you want to understand Azure Secure Endpoints and what your options are.

About Secure Endpoints

Secure endpoints extend your virtual network private address space, and the identity of your VNet, over a direct connection to the Azure services you use for QDS clusters, allowing you to restrict access to your Azure resources to your own virtual networks exclusively. Traffic from your VNet to the Azure services always remains on the Microsoft Azure backbone network.

Advantages

Secure endpoints provide the following benefits:

  • Improved security for your Azure resources: Whereas VNet private address spaces can overlap, and so cannot be used to uniquely identify traffic originating from your VNet, secure endpoints allow you to restrict access to your Azure resources to your virtual network exclusively.
  • Optimal routing for Azure service traffic from your virtual network: By default, any routing in your virtual network that forces internet traffic to your physical premises or virtual appliances (forced tunneling) also forces Azure service traffic to take the same route as the internet traffic. Secure endpoints allow you to separate Azure traffic and provide optimal routing for it.
  • All traffic on the Azure backbone: Secure endpoints always take service traffic directly from your virtual network to the service on the Microsoft Azure backbone network. Keeping traffic on the Azure backbone allows you to use forced tunneling to audit and monitor outbound internet traffic from your virtual networks, without impeding service traffic. See the Azure documentation for more information.
  • Simple set-up and less management overhead: If you use secure endpoints, you do not need reserved, public IP addresses in your virtual networks to allow Azure resources secure access through your firewall. Secure endpoints require no NAT or gateway devices and can be configured by simply clicking on a subnet in the Azure portal. Maintaining the endpoints requires no additional overhead.

Open versus Closed Endpoints in QDS

You can deploy QDS on Azure with:

Open Endpoints

../../_images/open-endpoints.png

This is the Azure default, and is the configuration you have in QDS if you followed the instructions in the Azure Quick Start Guide. All virtual machines created within a VNet with open endpoints are assigned a public IP address and communicate via public IP. Security is provided by the Network Security Group (NSG) for the VNet in which the virtual machines run. QDS generates an exception rule that allows access to and from the Qubole tunnel server. This configuration provides good security, but there is a drawback: because the Azure storage account accepts connections from any source, the data could be exposed to external attack.

Closed Endpoints

../../_images/closed-endpoints.png

In this case, network traffic uses private IP addresses and is confined to the internal Azure backbone network. Access to the storage account is limited to approved VNets and allowed IP addresses. See Configuring Secure Storage Endpoints without a Bastion Host for setup instructions.

Closed Endpoints with a Bastion Host

../../_images/closed-endpoints-with-Bastion.png

If you need an additional layer of security, you can add a secure Bastion host in a separate public subnet in the same VNet in which the cluster runs. This ensures that all communication between the Qubole tunnel server and your Azure VNet is SSL-encrypted. A drawback with any Bastion configuration is that the Bastion host could become a network bottleneck. See Configuring Secure Storage Endpoints with a Bastion Host for setup instructions.

Configuring Endpoints for QDS

Configuring Open Endpoints

This is the default setup that you are already using if you followed the instructions in the Azure Quick Start Guide.

Configuring Secure Storage Endpoints without a Bastion Host

Secure storage endpoints can be used with Blob and ADLS storage accounts. They restrict access to the storage account to approved VNets and allowed IP addresses. A firewall rule change is required to allow the Qubole tunnel server access to the storage account.
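As a rough mental model of the access rule described above (not Azure's actual implementation), the following Python sketch treats the storage firewall as a set of approved subnets plus a list of allowed IP ranges. The class name, the subnet label, and the external example address are hypothetical; only the tunnel server IP address comes from this document.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import List, Optional, Set

QUBOLE_TUNNEL_IP = "52.44.223.209"  # tunnel server address quoted in this document


@dataclass
class StorageFirewall:
    """Toy model of a storage account firewall; all names are illustrative."""
    approved_subnets: Set[str] = field(default_factory=set)
    allowed_ip_ranges: List[str] = field(default_factory=list)

    def allows(self, source_ip: Optional[str] = None,
               source_subnet: Optional[str] = None) -> bool:
        # Traffic arriving through a service endpoint is matched on subnet identity.
        if source_subnet is not None and source_subnet in self.approved_subnets:
            return True
        # Everything else is matched on its public source IP address.
        if source_ip is not None:
            addr = ip_address(source_ip)
            return any(addr in ip_network(r) for r in self.allowed_ip_ranges)
        return False


if __name__ == "__main__":
    firewall = StorageFirewall(
        approved_subnets={"my-vnet/private"},  # hypothetical subnet label
        allowed_ip_ranges=[QUBOLE_TUNNEL_IP],
    )
    print(firewall.allows(source_subnet="my-vnet/private"))  # node in the approved subnet
    print(firewall.allows(source_ip=QUBOLE_TUNNEL_IP))       # Qubole tunnel server
    print(firewall.allows(source_ip="203.0.113.10"))         # arbitrary external address
```

The model matches endpoint traffic on subnet identity rather than on private IP address because, as noted under Advantages, private address spaces can overlap between VNets.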

Prerequisites

  • An Azure storage account (either Blob or ADLS).
  • An Azure VNet with at least one subnet.

VNet Setup

  1. Log in to the Azure Portal and navigate to the VNet where you will be enabling the secure endpoints:
../../_images/Vnet.png
  2. In the Navigator, click on Service Endpoints under the Settings submenu:
../../_images/choose-service-endpoints.png
  3. In the main window, click on the + Add button to add a new endpoint. Select the Microsoft.Storage service in the Service selection box and the subnet you want to secure with the endpoint rule:

    ../../_images/add-service-endpoints.png

    Leave the Service Endpoint Policies setting at its default value.
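If you prefer to script the VNet setup instead of using the portal, the steps above can probably be approximated with the Azure CLI. The Python sketch below only builds and prints the command line; it does not run it. The function name and the resource names are placeholders, and you should verify the command against the Azure CLI reference for your CLI version before using it.

```python
import shlex


def service_endpoint_command(resource_group, vnet_name, subnet_name):
    """Build the Azure CLI call that enables the Microsoft.Storage
    service endpoint on a single subnet."""
    return [
        "az", "network", "vnet", "subnet", "update",
        "--resource-group", resource_group,
        "--vnet-name", vnet_name,
        "--name", subnet_name,
        "--service-endpoints", "Microsoft.Storage",
    ]


if __name__ == "__main__":
    # Placeholder names: replace with your own resource group, VNet, and subnet.
    cmd = service_endpoint_command("my-resource-group", "my-vnet", "my-subnet")
    print(shlex.join(cmd))
```

Printing the command rather than executing it lets you review it first, or paste it into a shell where you are already logged in to Azure.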

Storage Account Setup

  1. In the Azure Portal, navigate to the storage account you need to secure:

    • Either Blob/ADLS Gen2 Storage Account:
    ../../_images/storage-account-ex-1(Gen2).png
    • Or ADLS Gen1 Storage Account:
    ../../_images/storage-account-ex-2(ADLSGen1).png
  2. In the Navigator, click on Firewalls and virtual networks under Settings:

    ../../_images/choose-firewall.png
  3. In the main window, select the Selected networks radio button. This opens up the configurable options.

  4. Under Virtual Networks, click + Add existing virtual network and select the VNet and subnet to which you want to restrict storage account access:

../../_images/add-networks.png
  5. In the main window, under Firewall, make sure that the Qubole tunnel server IP address (currently 52.44.223.209) is added to the list of IP exceptions:
../../_images/check-exceptions.png
  6. If you are using ADLS Gen1, make sure that the Allow all Azure services to access… check box is selected under Exceptions:
../../_images/allow-all-Azure-services.png

Note

This is essential in order to ensure that virtual machines have access to the ADLS account; otherwise jobs will not run on your clusters.
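For a Blob or ADLS Gen2 storage account, the portal steps above can probably be approximated with the Azure CLI. The Python sketch below builds the three commands without running them: two network rules (one for the subnet, one for the Qubole tunnel server IP address) followed by a change of the default action to Deny, which is assumed here to correspond to the Selected networks option. The function name and resource names are placeholders. ADLS Gen1 accounts are managed through different commands and are not covered by this sketch.

```python
import shlex

QUBOLE_TUNNEL_IP = "52.44.223.209"  # tunnel server address quoted in this document


def storage_firewall_commands(resource_group, account, vnet, subnet,
                              tunnel_ip=QUBOLE_TUNNEL_IP):
    """Build Azure CLI calls that restrict a storage account to one subnet
    plus the Qubole tunnel server."""
    rule_add = [
        "az", "storage", "account", "network-rule", "add",
        "--resource-group", resource_group,
        "--account-name", account,
    ]
    return [
        # Allow the subnet that has the Microsoft.Storage endpoint enabled.
        rule_add + ["--vnet-name", vnet, "--subnet", subnet],
        # Allow the Qubole tunnel server by its public IP address.
        rule_add + ["--ip-address", tunnel_ip],
        # Deny everything else; done last so the allow rules already exist.
        ["az", "storage", "account", "update",
         "--resource-group", resource_group,
         "--name", account,
         "--default-action", "Deny"],
    ]


if __name__ == "__main__":
    # Placeholder names: replace with your own resources.
    for cmd in storage_firewall_commands("my-resource-group", "mystorageaccount",
                                         "my-vnet", "my-subnet"):
        print(shlex.join(cmd))
```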

Configuring Secure Storage Endpoints with a Bastion Host

This configuration is essentially the same as the no-Bastion setup, but the addition of the Bastion node provides an additional layer of security, preventing the running cluster nodes’ public IP addresses from being externally visible.

Prerequisites

  • An Azure storage account (either Blob or ADLS).
  • An Azure VNet with at least two subnets. Follow your own naming conventions, but the designation of these two subnets should be:
    • “public”, meaning that it will not be included in the Secure Storage Endpoint configuration.
    • “private”, meaning that it will be included in the Secure Storage Endpoint configuration.
  • A Network Security Group rule denying all inbound access from the internet and allowing access only from VNets associated with this Azure account.

VNet and Storage Account Setup

Follow the instructions under Configuring Secure Storage Endpoints without a Bastion Host to set up the storage endpoints. Make sure that the rules apply only to the “private” subnet in your VNet. Do not enable any endpoint on the “public” subnet.

Configuring a Network Security Group

If you have not already done so, create a custom Network Security Group with the following rules, to ensure that all inbound access is restricted appropriately:

../../_images/inbound-security-rules.png

These are the rules that are configured by default when you create a Network Security Group. By default, no access from the internet is allowed. This is exactly what you want.
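To illustrate why the default rules block internet traffic, the Python sketch below models priority-ordered rule evaluation: rules are checked from the lowest priority number upward and the first match decides. The three rules listed are the standard Azure default inbound rules as commonly documented; confirm them against the rules shown in your own Network Security Group. The class and function names are hypothetical, and matching is simplified to the source service tag only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    priority: int   # lower number = evaluated first
    name: str
    source: str     # service tag, or "*" for any source
    allow: bool


# Assumed to match the default inbound rules of a new Network Security Group.
DEFAULT_INBOUND_RULES = [
    InboundRule(65000, "AllowVnetInBound", "VirtualNetwork", True),
    InboundRule(65001, "AllowAzureLoadBalancerInBound", "AzureLoadBalancer", True),
    InboundRule(65500, "DenyAllInBound", "*", False),
]


def is_allowed(source_tag, rules=DEFAULT_INBOUND_RULES):
    """Return the decision of the first rule (by priority) that matches the source."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.source in ("*", source_tag):
            return rule.allow
    return False  # no rule matched


if __name__ == "__main__":
    for tag in ("VirtualNetwork", "AzureLoadBalancer", "Internet"):
        print(tag, "allowed" if is_allowed(tag) else "denied")
```

In this model, traffic tagged Internet matches only the final catch-all rule, so it is denied, while traffic from within the virtual network is allowed by the first rule.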

Adding the Custom Network Security Group to the Private Subnet

  1. In the Azure portal, navigate to the VNet you will be configuring.
  2. Under Settings, click on Subnets:
../../_images/choose-subnets.png
  3. In the main window, you should see the two subnets you configured. Select the “private” one to continue:
../../_images/select-private-subnet.png
  4. Click on Network Security Group and find the custom Network Security Group with the security rules you pre-configured, and enable it on the subnet as follows:
../../_images/enable-private-NSG.png
  5. Save all changes.
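The association made in the steps above can probably also be scripted with the Azure CLI. The Python sketch below builds the command without running it; the function name and all resource names are placeholders, and the command should be checked against the Azure CLI reference before use.

```python
import shlex


def attach_nsg_command(resource_group, vnet_name, subnet_name, nsg_name):
    """Build the Azure CLI call that associates a Network Security Group
    with one subnet of a VNet."""
    return [
        "az", "network", "vnet", "subnet", "update",
        "--resource-group", resource_group,
        "--vnet-name", vnet_name,
        "--name", subnet_name,
        "--network-security-group", nsg_name,
    ]


if __name__ == "__main__":
    # Placeholder names: the subnet should be the one you designated "private".
    cmd = attach_nsg_command("my-resource-group", "my-vnet",
                             "my-private-subnet", "my-custom-nsg")
    print(shlex.join(cmd))
```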

Bastion Setup

In the Azure Portal, follow the steps to create a virtual machine to serve as the Bastion host:

../../_images/create-virtual-machine.png

The following recommendations will help you to avoid any pitfalls when configuring the Bastion host:

  • Choose a machine type that can handle high network throughput.

  • Use the latest CentOS or Ubuntu Server image. Don’t use a Windows image.

  • Set the authentication type to SSH public key and add security credentials so you can access the Bastion host later to complete the setup.

  • Launch the Bastion host in the “public” subnet of the VNet that you configured earlier.

    The virtual machine (VM) setup wizard in the Azure Portal will automatically create a Network Security Group and associate it with the VM. Make sure there is an ingress rule to allow access from the Qubole tunnel server:

    ../../_images/ingress-rule.png

    You may also want to add your own IP address to the ingress rules list if you need to complete the setup after the Bastion host has been launched.
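If you need to add the ingress rule yourself, a command along the following lines may work with the Azure CLI. The Python sketch below builds the command without running it. The destination port is left as a required argument because this document does not state it in text; port 22 (SSH) in the usage example is an assumption, as are the rule name, the priority value, the function name, and all resource names.

```python
import shlex

QUBOLE_TUNNEL_IP = "52.44.223.209"  # tunnel server address quoted in this document


def bastion_ingress_command(resource_group, nsg_name, port,
                            source_ip=QUBOLE_TUNNEL_IP,
                            rule_name="AllowQuboleTunnel", priority=100):
    """Build the Azure CLI call that allows inbound TCP traffic from one
    source address to the Bastion host's Network Security Group."""
    return [
        "az", "network", "nsg", "rule", "create",
        "--resource-group", resource_group,
        "--nsg-name", nsg_name,
        "--name", rule_name,
        "--priority", str(priority),
        "--direction", "Inbound",
        "--access", "Allow",
        "--protocol", "Tcp",
        "--source-address-prefixes", f"{source_ip}/32",
        "--destination-port-ranges", str(port),
    ]


if __name__ == "__main__":
    # Port 22 is an assumption (SSH); use the port shown for your own setup.
    cmd = bastion_ingress_command("my-resource-group", "my-bastion-nsg", port=22)
    print(shlex.join(cmd))
```

Calling the function again with your own IP address as source_ip, a different rule_name, and a different priority would produce the optional second rule mentioned above.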

Wait for the Bastion host to start up, and then complete the setup following these instructions.