1. Introduction
This document is intended for IT system architects who want a detailed understanding of the system architecture and enterprise integration of the Techila Distributed Computing Engine (TDCE).
Please note that this document focuses on high level design and does not contain instructions on how to perform specific configurations. If you are unfamiliar with the TDCE system, please see Introduction to Techila Distributed Computing Engine.
System Architecture contains a description of different types of TDCE installations. These include on-premise, full cloud, hybrid, and multi-site installations.
Internal architecture contains a description of the internal architecture of TDCE core components.
Environment Administration Overview contains an overview of the administrative side of TDCE. This includes information about how to view the system status and which events can be used to trigger automated email reports. Information is also provided about generating usage reports.
Software Release Deployment contains a description about how TDCE is deployed and updated in different environments.
Quality Features contains high level information about the security, performance and maintainability features in TDCE. It also includes pointers to the documentation that covers these areas in more detail.
2. System Architecture
This Chapter contains an overview of the Techila Distributed Computing Engine (TDCE) software components and different types of TDCE installations.
2.1. Overview
TDCE consists of the following software components:
-
One Techila Server
-
One or several Techila Workers
-
One or several Techila SDKs
A minimal TDCE system consisting of these components is illustrated in Figure 1 below.
The table below explains the network connections used in TDCE. Please refer to this table when viewing figures in this Chapter.
| | Data Channel (1) | Signal Channel (2) | Management Channel (3) | Web Access (4) | License Server (5) | External Storage (6) |
|---|---|---|---|---|---|---|
| Port | TCP/20002 | TCP/20001 | TCP/25001 and TCP/25002 | TCP/443 | TCP/80 or TCP/443 | Not specified |
| Protocol | HTTPS | HTTPS + XML-RPC like | HTTPS + XML-RPC | HTTPS | HTTP or HTTPS | Not specified |
| Direction of Initialization | Techila Worker → Techila Server | Techila Worker → Techila Server | End-User → Techila Server | Browser → Techila Server | Techila Server → Techila License Server | Techila Worker → Storage, |
| Description | Transfer slots and request queues in server. | Always open. | Synchronous requests. | Techila Web Interface | Connects only intermittently. | Opened by End-User’s code. |
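As an illustration, the connection table can be expressed as data, which makes it easy to derive firewall requirements. The following is a minimal sketch, not part of TDCE: the `Channel` class and the `inbound_server_ports` function are hypothetical names, and only the port numbers and connection directions are taken from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    name: str
    ports: tuple      # TCP ports; empty when the table says "Not specified"
    initiator: str    # side that opens the connection
    target: str       # side that accepts the connection


# Values copied from the network connection table.
CHANNELS = (
    Channel("Data Channel", (20002,), "Techila Worker", "Techila Server"),
    Channel("Signal Channel", (20001,), "Techila Worker", "Techila Server"),
    Channel("Management Channel", (25001, 25002), "End-User", "Techila Server"),
    Channel("Web Access", (443,), "Browser", "Techila Server"),
    Channel("License Server", (80, 443), "Techila Server", "Techila License Server"),
    Channel("External Storage", (), "Techila Worker", "Storage"),
)


def inbound_server_ports(channels=CHANNELS):
    """Return the sorted TCP ports on which the Techila Server accepts connections."""
    ports = set()
    for channel in channels:
        if channel.target == "Techila Server":
            ports.update(channel.ports)
    return sorted(ports)


print(inbound_server_ports())  # → [443, 20001, 20002, 25001, 25002]
```

The License Server connection is outbound from the Techila Server, so it does not appear in the result.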
The following Chapters contain more detailed environment descriptions for different usage scenarios.
2.2. On-Premise Installation
On-premise installations can be used when a customer needs to have the computing environment in their on-premise facilities. In this case, the Techila Server will be deployed as a virtual appliance that will run in the customer’s own VMware ESXi environment. The Techila Workers will be installed from installation packages on the customer’s own clusters and/or workstations.
The Techila Server virtual appliance includes two separate virtual hard disk images (system and data disks), used to store Techila Server data. The system disk is the root partition of the operating system and is used to store operating system specific data. The initial size of the system hard disk image is approximately 700 MB, and the maximum size is 8 GB.
The data disk is used to store data generated during the usage of TDCE, including computational result files, Bundle files created by End-Users and the Techila Server database. This disk is also known as the Bundle Drive in the Techila Web Interface. The initial size of the data hard disk image is approximately 8 MB, because no data has been stored on the disk yet. By default, the maximum size of the data disk is set to 64 GB. This can be configured before starting the virtual appliance in the ESXi environment. The data disk is partitioned and its file system is created when the Techila Server virtual appliance is started for the first time.
Please see Techila Distributed Computing Engine Product Description for information on the related minimum technical operating environment.
The image below illustrates the architecture of an on-premise environment.
The Techila License Server is used to monitor the usage of the Techila license(s) installed on the Techila Server. The connection from the Techila Server to the Techila License Server can be made optional in systems that do not use usage-based licensing, if agreed separately in the license agreement. Data transferred to the Techila License Server only contains information on Techila license usage, meaning no sensitive data will be transferred.
2.3. Multisite Installation
Multisite installations can be used when a customer needs to have access to computational capacity located on several sites. In this case, the Techila Server will be deployed as a virtual appliance that will run in the customer’s own ESXi environment on the primary site. The sites can be connected by using VPN gateways, which will enable Techila Workers from multiple sites to connect to the Techila Server on the primary site.
Techila Workers can also be added from public cloud providers by creating a VPN gateway between the primary site and public cloud environment. Creating a VPN gateway is not required, but if there is no VPN gateway, the Techila Server ports used by the Signal and Data Channels will need to be accessible from the Internet.
Running a VPN gateway in the cloud can incur a small additional financial cost. The benefit of using a VPN tunnel for communication is that there is no need to open ports for incoming access to the Techila Server from the public Internet.
Please see the Techila Distributed Computing Engine Product Description for information on the related minimum technical operating environment.
2.4. Hybrid Installation
Environments where TDCE software components run on both on-premise hardware and virtual machines in the cloud are called hybrid environments. In hybrid environments, additional computing capacity can be acquired from supported public cloud providers to satisfy demand during peak load intervals. This is called cloud bursting. TDCE supports organizational and End-User specific cloud accounts, which enable individual End-Users to deploy computing capacity based on their individual needs.
Techila Workers can also be added from public cloud providers by creating an optional VPN gateway between the on-premise site and public cloud environment. Creating a VPN gateway is not required, but if there is no VPN gateway, the Techila Server ports used by the Signal and Data Channels will need to be accessible from the Internet.
Running a VPN gateway in the cloud can incur a small additional financial cost. The benefit of using a VPN tunnel for communication is that there is no need to open ports for incoming access to the Techila Server from the public Internet.
Please see the Techila Distributed Computing Engine Product Description for information on the related minimum technical operating environment.
2.5. Full Cloud Installation
Full cloud installations can be used when a customer wants to have access to scalable computation capacity from outside the customer’s on-premise facilities. Full cloud installations are available via the public cloud provider marketplace offerings:
-
Amazon Web Services (AWS)
-
Google Cloud Platform
The following Chapters contain more detailed descriptions of environments in different cloud environments.
2.5.1. Amazon Web Services - Marketplace
This Chapter contains a description of the TDCE components used in AWS.
Primary components:
-
Amazon EC2 (Elastic Compute Cloud) - Used to deploy instances for Techila Server and Techila Workers.
-
EBS (Elastic Block Storage) - Used to store data generated by the Techila Server, making the Techila Server persistent between reboots.
-
Security Groups - Used to configure firewall rules for the Techila Server and Techila Workers. Allows P2P communication between Techila Workers and between Techila Server and Techila Workers.
-
Elastic IP - Used to allocate a fixed IP address for the Techila Server
-
Amazon VPC (Virtual Private Cloud) - Used to isolate the Techila Server and Techila Worker instances from any other instances running in EC2
-
Amazon S3 (Simple Storage Service) - Can be used by the user to store large amounts of data and quickly access the data without using the Bundle transfer mechanism. Improves performance when performing data intensive computations. Can be accessed by the Techila Workers and End-Users.
Please see the Techila Distributed Computing Engine Product Description for information on the related minimum technical operating environment.
2.5.2. Google Cloud Platform - Using Techila Distributed Computing Engine in Google Cloud Marketplace
This Chapter contains a description of the TDCE components used in a Google Cloud Platform environment when using Techila Distributed Computing Engine in Google Cloud Marketplace.
Primary Google Cloud Platform components:
-
Google Compute Engine (GCE) - Used to deploy instances for Techila Server and Techila Workers
-
Google Cloud Storage - Used to store image sources used to deploy Techila Server and Techila Workers
Optional Google Cloud Platform components:
-
Google Cloud Storage - Can also be used to store large amounts of data and quickly access the data without using the Bundle transfer mechanism. Improves performance when performing data intensive computations. Can be accessed by the Techila Workers and End-Users.
-
Google BigQuery - Can be used to store large amounts of structured data and to quickly access it using the BigQuery API. Can be accessed by the Techila Workers and End-Users
Please see the Techila Distributed Computing Engine Product Description for information on the related minimum technical operating environment.
3. Internal architecture
Techila Distributed Computing Engine (TDCE) consists of multiple components, which contain core functionality such as network communication, security and database services. Core components can be divided into the following categories:
-
Common Services
-
Techila Server
-
Techila Worker
Components belonging to the Common Services category are used by both the Techila Server and Techila Workers. Other core components are specific to either the Techila Server or the Techila Worker.
All 3rd party licenses are available in the "licenses" directory of the TDCE software installation.
3.1. Core Components
The architecture stack is shown below.
The software components are explained in the following Chapters.
3.1.1. Core Bundles
All Core Bundles in the Techila Server and Techila Workers are OSGi modules. These modules contain essential functionality such as network communication, security and database services. The modules are controlled by an OSGi framework, which runs on top of a Java Virtual Machine.
3.1.2. OSGi Framework
The OSGi framework is a module system for the Java programming language that implements a component model. This model enables application components to be controlled in a flexible manner without requiring a reboot.
Techila Server software and Techila Worker software run on top of an OSGi framework. The OSGi framework extends the Java execution environment with lifecycle management making it more dynamic and more controllable. This allows on-the-fly updates, replaceable components and excellent options for configuring, monitoring and debugging the environment.
Please refer to the following web page for more information on the OSGi framework:
3.1.3. Java Runtime Environment
The Java Runtime Environment (JRE) consists of Java APIs and the Java Virtual Machine (JVM). This runtime environment is used to execute the Techila Server software and Techila Worker software.
All Techila Worker installation packages include a JRE, which will be automatically installed on the Techila Worker when the Techila Worker software is installed. This means no additional Java components will need to be installed on the Techila Worker. The JREs on the Techila Server and Techila Workers will be automatically updated when the Techila Server is updated using a Techila Service Pack.
3.1.4. Java Service Wrapper
TDCE uses a Java Service Wrapper, which enables a Java Application to be run as a Windows Service or UNIX Daemon.
The Java service wrapper is used to automatically launch the Techila Server and Techila Worker processes. Processes are automatically launched in the following situations:
-
The computer running the TDCE processes is rebooted
-
The Java process becomes unresponsive
-
The Java process requests a restart from the service wrapper
After the processes have been launched, the wrapper will monitor the Techila Server and Techila Worker Java processes and will automatically restart the processes if required.
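The launch and restart conditions listed above can be summarized as a small decision function. This is a conceptual sketch only; it does not reflect the configuration or internals of the actual Java Service Wrapper, and the `JvmState` and `wrapper_action` names are hypothetical.

```python
from enum import Enum, auto


class JvmState(Enum):
    RUNNING = auto()
    NOT_STARTED = auto()        # e.g. right after the computer has been rebooted
    UNRESPONSIVE = auto()       # the Java process no longer responds
    RESTART_REQUESTED = auto()  # the Java process asked the wrapper for a restart


def wrapper_action(state: JvmState) -> str:
    """Return what a wrapper-like supervisor would do for a given JVM state."""
    if state is JvmState.RUNNING:
        return "monitor"
    if state is JvmState.NOT_STARTED:
        return "launch"
    # Unresponsive processes and explicit restart requests are handled the same way.
    return "restart"


for state in JvmState:
    print(state.name, "->", wrapper_action(state))
```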
3.2. Dependencies to External Environment
3.2.1. Simple Mail Transfer Protocol (SMTP)
The Techila Server can be configured to send email notifications to the Techila Administrator and/or to the End-Users. This requires configuring SMTP server access for the Techila Server. The Techila Server will use this SMTP server to send the email notifications.
Configuring email notifications is an optional feature. Enabling this feature is recommended because it can provide warnings and proactive reports, and notify the administrator before any serious problems arise.
Please see Techila Server Monitoring and Techila Worker Monitoring for more information about email notifications.
4. Environment Administration Overview
Techila Distributed Computing Engine (TDCE) includes several user-friendly methods for monitoring, managing and reporting.
4.1. Environment Monitoring
This Chapter describes how the TDCE environment can be monitored.
4.1.1. Techila Server Monitoring
The status of the Techila Server can be monitored from the 'Status' page of the Techila Web Interface.
This page displays information about the following hardware properties of the Techila Server:
-
Total amount of disk space on the Techila Server Bundle Drive
-
Amount of free disk space on the Techila Server Bundle Drive
-
Amount of used disk space by Project results and Bundles (all Bundles)
The status page also displays information about the Techila License. This information includes:
-
The Techila License validity period
-
Maximum Techila Worker Cores allowed (capacity-based license)
-
CPU Hours Used / Allowed (usage-based license)
Please refer to Techila Distributed Computing Engine Administration Guide for more information about the 'Status' page.
The Techila Server can also be configured to send automatic email reports to the local Techila Administrator based on different triggers. Email reports can include the following information:
-
Low disk space warning for the Techila Server (including amount of free disk space left)
-
Notifications of untrusted and expired End-User Keys
In addition to the information available from the Techila Web Interface and email notifications, a Java service wrapper is always running on the computer that is running the Techila Server processes. The service wrapper monitors the JVM process and automatically restarts it if the JVM has crashed or become unresponsive. The restart is quick, taking only a few seconds once the service wrapper has observed that there is a problem.
Restarting the Techila Server processes or rebooting the computer running the Techila Server processes does not affect the status of Jobs being currently processed on Techila Workers. If Techila Server is rebooted, Techila Workers will continue processing the computational Jobs without interruption. Techila Workers will also automatically re-establish a network connection to the Techila Server after the reboot has been completed. If Jobs have been completed on Techila Workers while the Techila Server has been offline, the Job results will be automatically transferred to the Techila Server.
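The behavior described above, where Techila Workers keep computing and deliver results once the Techila Server is reachable again, resembles a simple store-and-forward pattern. The following sketch illustrates that pattern under stated assumptions; the `WorkerSketch` class and its methods are hypothetical and do not represent the actual Techila Worker implementation.

```python
class WorkerSketch:
    """Toy model of a Worker that buffers results while the server is offline."""

    def __init__(self):
        self.connected = True
        self.pending_results = []  # results completed while the server is offline
        self.delivered = []        # stands in for results stored on the server

    def complete_job(self, job_id, result):
        # Job processing never depends on the server being reachable.
        if self.connected:
            self.delivered.append((job_id, result))
        else:
            self.pending_results.append((job_id, result))

    def server_went_offline(self):
        self.connected = False

    def server_came_back(self):
        # Re-establish the connection and transfer everything that was buffered.
        self.connected = True
        self.delivered.extend(self.pending_results)
        self.pending_results.clear()


worker = WorkerSketch()
worker.complete_job(1, "result-1")
worker.server_went_offline()
worker.complete_job(2, "result-2")  # completed during the server reboot
worker.server_came_back()
print(worker.delivered)
```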
If there are active management connections from Techila SDKs to the Techila Server when the Techila Server processes are interrupted, these connections will either be retried or interrupted. This behavior will depend on the Techila SDK configuration. If the connection is interrupted while Project results are being transferred to the End-User’s computer, the results will be automatically stored on the Techila Server. Results can be re-downloaded after the connection has been re-established.
Additionally, if the customer is using 3rd party monitoring software, for example software based on SNMP (Simple Network Management Protocol), the customer can install this software directly on the operating system of the Techila Server. This requires that the software supports the operating system of the Techila Server. Please refer to the software installation instructions applicable for your software. Techila Technologies does not support the customer’s own software customizations or 3rd party software not included in the TDCE standard product.
4.1.2. Techila Worker Monitoring
The status of the Techila Workers can be monitored from the 'Status' and 'Workers' pages of the Techila Web Interface. The 'Status' page gives a general overview of the current status of Techila Workers.
More detailed information can be displayed on the 'Workers' page, which includes information about the following Techila Worker properties:
-
Operating system and processor architecture
-
Number of CPU cores (physical and virtual)
-
Number of Jobs currently being processed
-
Total amount of memory and hard disk space
-
Amount of free hard disk space
Please refer to the Techila Distributed Computing Engine Administration Guide for more details about what is displayed on the Worker pages.
The Techila Server can also be configured to send automatic email reports about the status of Techila Workers. The email reports can include the following information:
-
Low disk space warning for the Techila Worker (including amount of free disk space left)
-
Notifications of untrusted and expired Techila Worker Keys
-
Notifications of high error count on Techila Workers
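The report triggers listed above can be thought of as simple checks against the status of each Techila Worker. The sketch below is illustrative only: the `WorkerStatus` fields, the `worker_alerts` function and both threshold values are assumptions made for this example and are not actual TDCE settings.

```python
from dataclasses import dataclass


@dataclass
class WorkerStatus:
    name: str
    free_disk_gb: float
    key_trusted: bool
    key_expired: bool
    error_count: int


def worker_alerts(worker, min_free_disk_gb=5.0, max_errors=10):
    """Return the alert messages a report could contain for one Worker.

    The default thresholds are placeholders, not TDCE defaults.
    """
    alerts = []
    if worker.free_disk_gb < min_free_disk_gb:
        alerts.append(f"{worker.name}: low disk space ({worker.free_disk_gb:.1f} GB free)")
    if not worker.key_trusted:
        alerts.append(f"{worker.name}: untrusted Techila Worker Key")
    if worker.key_expired:
        alerts.append(f"{worker.name}: expired Techila Worker Key")
    if worker.error_count > max_errors:
        alerts.append(f"{worker.name}: high error count ({worker.error_count})")
    return alerts


status = WorkerStatus("worker-1", free_disk_gb=1.5, key_trusted=True,
                      key_expired=False, error_count=25)
for line in worker_alerts(status):
    print(line)
```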
As with the Techila Server, the Java service wrapper is always running on Techila Workers and monitors the JVM (Java Virtual Machine) process. The wrapper will automatically restart the JVM if it has crashed or become unresponsive.
If the Techila Worker processes are restarted while computational Jobs are being processed on the Techila Worker, the Jobs will be automatically re-assigned to other available Techila Workers. If the computational Project has been configured to use the Snapshot feature, computations will be automatically resumed using the latest available snapshot file, minimizing the amount of computational work lost due to the interruption.
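The idea behind snapshot-based resumption can be shown with a generic checkpointing loop. This sketch does not use the Techila Snapshot API; the `run_with_snapshots` function, the JSON snapshot format and the `fail_at` parameter (used only to simulate an interruption) are assumptions made for illustration.

```python
import json
import os
import tempfile


def run_with_snapshots(total_steps, snapshot_path, fail_at=None):
    """Sum the integers 0..total_steps-1, saving progress after every step."""
    step, acc = 0, 0
    # Resume from the latest snapshot if one exists.
    if os.path.exists(snapshot_path):
        with open(snapshot_path) as f:
            state = json.load(f)
        step, acc = state["step"], state["acc"]
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated interruption")
        acc += step
        step += 1
        with open(snapshot_path, "w") as f:
            json.dump({"step": step, "acc": acc}, f)
    return acc


path = os.path.join(tempfile.mkdtemp(), "snapshot.json")
try:
    run_with_snapshots(10, path, fail_at=6)   # interrupted before step 6 runs
except RuntimeError:
    pass
result = run_with_snapshots(10, path)         # resumes at step 6, not at step 0
print(result)
```

Only the steps after the last saved snapshot are recomputed, which is the property the Snapshot feature is described as providing.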
4.1.3. End-User CPU Time Quota Monitoring
End-User CPU time quotas can be configured on the 'End-User Quota' page of the Techila Web Interface. CPU time quotas can be specified for the following time windows:
-
Daily
-
Weekly
-
Monthly
-
Total
Quota usage information will be automatically updated and displayed on the 'End-User Quota' page. If an End-User exceeds their allowed CPU time quota, Jobs from their Projects will not be assigned to Techila Workers.
The Techila Administrator can increase, decrease, enable or disable quotas at any given time. For example, if an End-User has exceeded their CPU time quota while computing a Project, the Techila Administrator can increase their quota to allow computations to continue.
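The quota rule can be sketched as a check over the four time windows. The `Quota` class and `may_assign_jobs` function below are hypothetical and only model the behavior described in this Chapter: Jobs are withheld once usage exceeds an enabled quota, and raising the quota allows computations to continue.

```python
from dataclasses import dataclass, field

WINDOWS = ("daily", "weekly", "monthly", "total")


@dataclass
class Quota:
    limits: dict                               # window -> allowed CPU hours; missing = no limit
    used: dict = field(default_factory=dict)   # window -> CPU hours used
    enabled: bool = True


def may_assign_jobs(quota: Quota) -> bool:
    """Return True if Jobs from this End-User's Projects may be assigned."""
    if not quota.enabled:
        return True
    for window in WINDOWS:
        limit = quota.limits.get(window)
        if limit is not None and quota.used.get(window, 0.0) > limit:
            return False
    return True


quota = Quota(limits={"daily": 10.0, "monthly": 200.0},
              used={"daily": 12.0, "monthly": 50.0})
print(may_assign_jobs(quota))   # daily quota exceeded
quota.limits["daily"] = 20.0    # the administrator increases the quota
print(may_assign_jobs(quota))
```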
4.2. Management
This Chapter describes different concepts and tools that are used to manage Techila End-Users.
4.2.1. User Management
Techila Account
All Techila End-Users need a Techila account. This account will be used to grant (or deny) the End-User access to computational resources. The End-User certificate will be linked to the Techila Account. This account can be used to log in to the Techila Web Interface.
Techila Keytool
The Techila Keytool is used to create End-User Keys. All Techila End-Users need a personal End-User Key. During the key generation process, the End-User certificate will be transferred to the Techila Server and linked to the End-User’s Techila Account.
Please see the Techila Distributed Computing Engine Security Statement for more information about TDCE security features.
Techila Administrator Command Line Interface
End-Users can also be added and removed using the Techila Administrator Command Line Interface (CLI). The Techila Administrator CLI is included in the Techila SDK.
A script can be written to be used for integration with Active Directory. This script should use the Techila Administrator CLI and will require access to both an End-User Key and an Administrator Key.
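The core of such an integration script is usually a comparison between the directory and the existing Techila Accounts. The sketch below shows only that comparison step; the `plan_sync` function and the example user names are hypothetical, and the Techila Administrator CLI commands that would apply the plan are intentionally not shown.

```python
def plan_sync(directory_users, techila_users):
    """Return (to_add, to_remove) so that Techila Accounts match the directory."""
    directory_users = set(directory_users)
    techila_users = set(techila_users)
    to_add = sorted(directory_users - techila_users)
    to_remove = sorted(techila_users - directory_users)
    return to_add, to_remove


# Example input; a real script would read these from Active Directory
# and from the Techila Server respectively.
to_add, to_remove = plan_sync(["alice", "bob"], ["bob", "carol"])
print("add:", to_add)
print("remove:", to_remove)
```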
4.2.2. Certificate Management
All communication peers in Techila Distributed Computing Engine (TDCE) are authenticated using X.509 certificates. Please see Techila Distributed Computing Engine Security Statement for more information.
4.3. Logging
Please see Techila Distributed Computing Engine Security Statement for information about the logging mechanisms.
4.4. Reporting
Reports are typically used to extract CPU time usage information from the TDCE environment. New reports can be added by using the Techila Web Interface when logged in as a user with administrative access. See Techila Distributed Computing Engine Reporting Guide for more information.
4.4.1. Usage Reporting
Reports can be run by using the Techila Web Interface or the Techila Administrator CLI. The list below describes some of the most commonly used reports:
-
CPU time usage of all End-Users
-
CPU time used in a specified time interval
-
End-User specific CPU time usages
-
End-User specific CPU time reports grouped at intervals
Local Techila Administrators are able to write custom SQL queries and store these as reports using the Techila Web Interface.
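To give an idea of what a custom report query might look like, the sketch below runs an aggregate query against an in-memory SQLite database. The `job_usage` table and its columns are invented for this example and do not represent the actual Techila Server database schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute("CREATE TABLE job_usage (end_user TEXT, cpu_seconds REAL)")
conn.executemany(
    "INSERT INTO job_usage VALUES (?, ?)",
    [("alice", 3600.0), ("bob", 1800.0), ("alice", 7200.0)],
)

# CPU time usage of all End-Users, expressed in hours.
REPORT_SQL = """
    SELECT end_user, SUM(cpu_seconds) / 3600.0 AS cpu_hours
    FROM job_usage
    GROUP BY end_user
    ORDER BY end_user
"""
rows = conn.execute(REPORT_SQL).fetchall()
for end_user, cpu_hours in rows:
    print(end_user, cpu_hours)
```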
4.4.2. License Reporting
The status of the Techila License can be viewed on the 'Status' page of the Techila Web Interface. If multiple valid Techila Licenses are in use, the allowed amounts are summed together and the total value is displayed. Techila License data can also be retrieved in a programmatic manner by using the Techila Administrator CLI to generate a suitable report on the Techila Server.
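The summing of multiple valid licenses can be illustrated as follows. The dictionary layout, the `total_allowed_cores` function and the example values are assumptions for this sketch; they do not describe the actual license file format.

```python
from datetime import date


def total_allowed_cores(licenses, today):
    """Sum the allowed Techila Worker Cores over licenses valid on `today`."""
    return sum(
        lic["max_cores"]
        for lic in licenses
        if lic["valid_from"] <= today <= lic["valid_until"]
    )


licenses = [
    {"max_cores": 64, "valid_from": date(2024, 1, 1), "valid_until": date(2024, 12, 31)},
    {"max_cores": 32, "valid_from": date(2024, 6, 1), "valid_until": date(2025, 5, 31)},
    # Expired license, ignored in the total.
    {"max_cores": 16, "valid_from": date(2023, 1, 1), "valid_until": date(2023, 12, 31)},
]
total = total_allowed_cores(licenses, date(2024, 7, 1))
print(total)  # → 96
```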
5. Software Release Deployment
Techila Distributed Computing Engine (TDCE) software requires one initial installation. System updates are performed using Techila Service Packs.
5.1. Initial System Deployment
The steps for initial TDCE system deployment in different types of environments are described in the following Chapters.
5.1.1. On-Premises
Deploying a Techila Virtual Server is performed by importing the virtual appliance into the ESXi environment. Techila Workers are installed using operating system specific installation packages. All required installation packages can be downloaded from the URLs given by Techila support staff.
When performing the initial installation, it is recommended to use Techila Server and Techila Worker installation packages that contain the same versions of the core components. This will ensure that no unnecessary update procedures are performed on the Techila Workers after installation.
The version of an installation package can be verified from its release date.
5.1.2. Hybrid
Hybrid installations are performed in the same way as on-premise installations.
5.1.3. Multi-site
Multi-site installations are performed in the same way as hybrid installations.
5.1.4. Public Cloud
TDCE environments in public clouds can be deployed via the public cloud provider’s marketplace offering.
5.2. System Updates
Updating is performed by uploading a Techila Service Pack to the Techila Server. The Techila Service Pack contains updates for the Techila Server and Techila Workers. The upload is performed by using the Techila Web Interface. After uploading the Techila Service Pack, the Techila Server will update all required Server-side components.
After the Techila Server has been updated, the Techila Workers will automatically download the updated Worker-side components from the Techila Server. After the new components have been downloaded, the Techila Worker will install the updates and re-connect to the Techila Server.
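Conceptually, a Techila Worker only needs to fetch the components whose versions differ from those offered by the Techila Server. The sketch below illustrates that comparison; the `components_to_update` function, the component names and the version strings are hypothetical, and the actual update protocol is not described in this document.

```python
def components_to_update(server_versions, worker_versions):
    """Return the names of components the Worker would need to download."""
    return sorted(
        name
        for name, version in server_versions.items()
        if worker_versions.get(name) != version
    )


# Hypothetical component names and versions.
server = {"core": "2.1", "network": "1.4", "security": "3.0"}
worker = {"core": "2.1", "network": "1.3"}
print(components_to_update(server, worker))  # → ['network', 'security']
```

When the Server and Worker installation packages contain the same component versions, the result is empty and no update is performed.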
5.2.1. On-Premises
On-premise environments are updated by uploading the Techila Service Pack to the Techila Server. After uploading the Service Pack, the Techila Server and Techila Workers will be updated automatically.
After updating the Techila Server, it is recommended to perform any new Techila Worker installations from the latest Techila Worker installation packages, which can be downloaded from the URLs given by Techila support staff. This will ensure that no unnecessary update procedures are performed after the installation.
5.2.2. Hybrid
Hybrid environments are updated by uploading the Techila Service Pack to the Techila Server. After uploading the Service Pack, the Techila Server and Techila Workers will be updated automatically.
New Techila Workers deployed in supported public cloud environments are automatically installed using the latest available Techila Worker installation packages.
5.2.3. Multi-site
Multi-site environments are updated in the same way as hybrid environments.
5.2.4. Full Cloud
Full cloud environments are updated by uploading the Techila Service Pack to the Techila Server.
In supported public cloud environments, new Techila Workers will always be installed from the latest available installation packages. No additional action is required to update Techila Worker deployments.
6. Quality Features
6.1. Security
Please see Techila Distributed Computing Engine Security Statement for more information about the security features in TDCE.
6.2. Performance
Techila Technologies performs cloud benchmarks, where the performance of major public cloud providers is compared using computational use cases.
The latest benchmark performed by Techila Technologies can be viewed and downloaded from the following web site:
6.3. High Availability
The high availability features of TDCE allow computational workload to be processed normally on Techila Workers even in situations where the Techila Server or Techila Worker processes are interrupted.
Please see Chapters Techila Server Monitoring and Techila Worker Monitoring for more information about how interruptions of the Techila Server and Techila Worker processes are managed.
6.4. Maintainability
A TDCE environment, consisting of a Techila Server and multiple Techila Workers, is updated by transferring a Techila Service Pack to the Techila Server and approving it. Updates included in the Service Pack will be automatically transferred to all Techila Workers, which will be automatically updated. More information about updating the environment by using a Service Pack can be found in the Techila Distributed Computing Engine Administration Guide.
Information about new Service Pack releases will be delivered to all Techila Administrators via email. The latest Service Pack is always available for download from the URLs given by Techila support staff.
Techila SDKs used by End-Users are updated by downloading the latest Techila SDK from the URL given by Techila support staff and installing the Techila SDK over the existing Techila SDK installation. Programming language specific steps can be found here. Information about new features included in a Service Pack or Techila SDK release can be found in the release notes, which are available at the URL given by Techila support staff.