mirror of
https://github.com/game-ci/unity-builder.git
synced 2026-01-29 03:59:08 +08:00
Cloud Runner v0 - Reliable and trimmed down cloud runner (#353)
* Update cloud-runner-aws-pipeline.yml * Update cloud-runner-k8s-pipeline.yml * yarn build * yarn build * correct branch ref * correct branch ref passed to target repo * Create k8s-tests.yml * Delete k8s-tests.yml * correct branch ref passed to target repo * correct branch ref passed to target repo * Always describe AWS tasks for now, because unstable error handling * Remove unused tree commands * Use lfs guid sum * Simple override cache push * Simple override cache push and pull override to allow pure cloud storage driven caching * Removal of early branch (breaks lfs caching) * Remove unused tree commands * Update action.yml * Update action.yml * Support cache and input override commands as input + full support custom hooks * Increase k8s timeout * replace filename being appended for unknclear reason * cache key should not contain whitespaces * Always try and deploy rook for k8s * Apply k8s files for rook * Update action.yml * Apply k8s files for rook * Apply k8s files for rook * cache test and action description for kuber storage class * Correct test and implement dependency health check and start * GCP-secret run, cache key * lfs smudge set explicit and undo explicit * Run using external secret provider to speed up input * Update cloud-runner-aws-pipeline.yml * Add nodejs as build step dependency * Add nodejs as build step dependency * Cloud Runner Tests must be specified to capture logs from cloud runner for tests * Cloud Runner Tests must be specified to capture logs from cloud runner for tests * Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs * Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs * Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs * Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs * Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs * better defaults for new inputs * better defaults * merge latest * force build update * use npm n to update node in unity builder * use npm n to update node in unity builder * use npm n to update node in unity builder * correct new line * quiet zipping * quiet zipping * default secrets for unity username and password * default secrets for unity username and password * ls active directory before lfs install * Get cloud runner secrets from * Get cloud runner secrets from * Cleanup setup of default secrets * Various fixes * Cleanup setup of default secrets * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * Various fixes * AWS secrets manager support * less caching logs * default k8s storage class to pd-standard * more readable build commands * Capture aws exit code 1 reliably * Always replace /head from branch * k8s default storage class to standard-rwo * cleanup * further cleanup input * further cleanup input * further cleanup input * further cleanup input * further cleanup input * folder sizes to inspect caching * dir command for local cloud runner test * k8s wait for pending because pvc will not create earlier * prefer k8s standard storage * handle empty string as cloud runner cluster input * local-system is now used for cloud runner test implementation AND correctly unset test CLI input * local-system is now used for cloud runner test implementation AND correctly unset test CLI input * fix unterminated quote * fix unterminated quote * do not share build parameters in tests - in cloud runner this will cause conflicts with resouces of the same name * remove head and heads from branch prefix * fix reversed caching direction of cache-push * fixes * fixes * fixes * cachePull cli * fixes * fixes * fixes * fixes * fixes * order cache test to be first * order cache test to be first * fixes * populate cache key instead of using branch * cleanup cli * garbage-collect-aws cli can iterate over aws resources and cli scans all ts files * import cli methods * import cli files explicitly * import cli files explicitly * import cli files explicitly * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * import cli methods * log parameters in cloud runner parameter test * log parameters in cloud runner parameter test * log parameters in cloud runner parameter test * Cloud runner param test before caching because we have a fast local cache test now * Using custom build path relative to repo root rather than project root * aws-garbage-collect at end of pipeline * aws-garbage-collect do not actually delete anything for now - just list * remove some legacy du commands * Update cloud-runner-aws-pipeline.yml * log contents after cache pull and fix some scenarios with duplicate secrets * log contents after cache pull and fix some scenarios with duplicate secrets * log contents after cache pull and fix some scenarios with duplicate secrets * PR comments * Replace guid with uuid package * use fileExists lambda instead of stat to check file exists in caching * build failed results in core error message * Delete sample.txt
This commit is contained in:
201
src/model/cloud-runner/providers/k8s/index.ts
Normal file
201
src/model/cloud-runner/providers/k8s/index.ts
Normal file
@@ -0,0 +1,201 @@
|
||||
import * as k8s from '@kubernetes/client-node';
|
||||
import { BuildParameters, Output } from '../../..';
|
||||
import * as core from '@actions/core';
|
||||
import { ProviderInterface } from '../provider-interface';
|
||||
import CloudRunnerSecret from '../../services/cloud-runner-secret';
|
||||
import KubernetesStorage from './kubernetes-storage';
|
||||
import CloudRunnerEnvironmentVariable from '../../services/cloud-runner-environment-variable';
|
||||
import KubernetesTaskRunner from './kubernetes-task-runner';
|
||||
import KubernetesSecret from './kubernetes-secret';
|
||||
import waitUntil from 'async-wait-until';
|
||||
import KubernetesJobSpecFactory from './kubernetes-job-spec-factory';
|
||||
import KubernetesServiceAccount from './kubernetes-service-account';
|
||||
import CloudRunnerLogger from '../../services/cloud-runner-logger';
|
||||
import { CoreV1Api } from '@kubernetes/client-node';
|
||||
import DependencyOverrideService from '../../services/depdency-override-service';
|
||||
|
||||
class Kubernetes implements ProviderInterface {
|
||||
private kubeConfig: k8s.KubeConfig;
|
||||
private kubeClient: k8s.CoreV1Api;
|
||||
private kubeClientBatch: k8s.BatchV1Api;
|
||||
private buildGuid: string = '';
|
||||
private buildParameters: BuildParameters;
|
||||
private pvcName: string = '';
|
||||
private secretName: string = '';
|
||||
private jobName: string = '';
|
||||
private namespace: string;
|
||||
private podName: string = '';
|
||||
private containerName: string = '';
|
||||
private cleanupCronJobName: string = '';
|
||||
private serviceAccountName: string = '';
|
||||
|
||||
constructor(buildParameters: BuildParameters) {
|
||||
this.kubeConfig = new k8s.KubeConfig();
|
||||
this.kubeConfig.loadFromDefault();
|
||||
this.kubeClient = this.kubeConfig.makeApiClient(k8s.CoreV1Api);
|
||||
this.kubeClientBatch = this.kubeConfig.makeApiClient(k8s.BatchV1Api);
|
||||
CloudRunnerLogger.log('Loaded default Kubernetes configuration for this environment');
|
||||
|
||||
this.namespace = 'default';
|
||||
this.buildParameters = buildParameters;
|
||||
}
|
||||
public async setup(
|
||||
buildGuid: string,
|
||||
buildParameters: BuildParameters,
|
||||
// eslint-disable-next-line no-unused-vars
|
||||
branchName: string,
|
||||
// eslint-disable-next-line no-unused-vars
|
||||
defaultSecretsArray: { ParameterKey: string; EnvironmentVariable: string; ParameterValue: string }[],
|
||||
) {
|
||||
try {
|
||||
this.pvcName = `unity-builder-pvc-${buildGuid}`;
|
||||
this.cleanupCronJobName = `unity-builder-cronjob-${buildGuid}`;
|
||||
this.serviceAccountName = `service-account-${buildGuid}`;
|
||||
if (await DependencyOverrideService.CheckHealth()) {
|
||||
await DependencyOverrideService.TryStartDependencies();
|
||||
}
|
||||
await KubernetesStorage.createPersistentVolumeClaim(
|
||||
buildParameters,
|
||||
this.pvcName,
|
||||
this.kubeClient,
|
||||
this.namespace,
|
||||
);
|
||||
|
||||
await KubernetesServiceAccount.createServiceAccount(this.serviceAccountName, this.namespace, this.kubeClient);
|
||||
} catch (error) {
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
|
||||
async runTask(
|
||||
buildGuid: string,
|
||||
image: string,
|
||||
commands: string,
|
||||
mountdir: string,
|
||||
workingdir: string,
|
||||
environment: CloudRunnerEnvironmentVariable[],
|
||||
secrets: CloudRunnerSecret[],
|
||||
): Promise<string> {
|
||||
try {
|
||||
// setup
|
||||
this.buildGuid = buildGuid;
|
||||
this.secretName = `build-credentials-${buildGuid}`;
|
||||
this.jobName = `unity-builder-job-${buildGuid}`;
|
||||
this.containerName = `main`;
|
||||
await KubernetesSecret.createSecret(secrets, this.secretName, this.namespace, this.kubeClient);
|
||||
const jobSpec = KubernetesJobSpecFactory.getJobSpec(
|
||||
commands,
|
||||
image,
|
||||
mountdir,
|
||||
workingdir,
|
||||
environment,
|
||||
secrets,
|
||||
this.buildGuid,
|
||||
this.buildParameters,
|
||||
this.secretName,
|
||||
this.pvcName,
|
||||
this.jobName,
|
||||
k8s,
|
||||
);
|
||||
|
||||
//run
|
||||
const jobResult = await this.kubeClientBatch.createNamespacedJob(this.namespace, jobSpec);
|
||||
CloudRunnerLogger.log(`Creating build job ${JSON.stringify(jobResult.body.metadata, undefined, 4)}`);
|
||||
|
||||
await new Promise((promise) => setTimeout(promise, 5000));
|
||||
CloudRunnerLogger.log('Job created');
|
||||
this.setPodNameAndContainerName(await Kubernetes.findPodFromJob(this.kubeClient, this.jobName, this.namespace));
|
||||
CloudRunnerLogger.log('Watching pod until running');
|
||||
let output = '';
|
||||
// eslint-disable-next-line no-constant-condition
|
||||
while (true) {
|
||||
try {
|
||||
await KubernetesTaskRunner.watchUntilPodRunning(this.kubeClient, this.podName, this.namespace);
|
||||
CloudRunnerLogger.log('Pod running, streaming logs');
|
||||
output = await KubernetesTaskRunner.runTask(
|
||||
this.kubeConfig,
|
||||
this.kubeClient,
|
||||
this.jobName,
|
||||
this.podName,
|
||||
'main',
|
||||
this.namespace,
|
||||
CloudRunnerLogger.log,
|
||||
);
|
||||
break;
|
||||
} catch (error: any) {
|
||||
if (error.message.includes(`HTTP`)) {
|
||||
continue;
|
||||
} else {
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
}
|
||||
await this.cleanupTaskResources();
|
||||
return output;
|
||||
} catch (error) {
|
||||
CloudRunnerLogger.log('Running job failed');
|
||||
core.error(JSON.stringify(error, undefined, 4));
|
||||
await this.cleanupTaskResources();
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
|
||||
setPodNameAndContainerName(pod: k8s.V1Pod) {
|
||||
this.podName = pod.metadata?.name || '';
|
||||
this.containerName = pod.status?.containerStatuses?.[0].name || '';
|
||||
}
|
||||
|
||||
async cleanupTaskResources() {
|
||||
CloudRunnerLogger.log('cleaning up');
|
||||
try {
|
||||
await this.kubeClientBatch.deleteNamespacedJob(this.jobName, this.namespace);
|
||||
await this.kubeClient.deleteNamespacedPod(this.podName, this.namespace);
|
||||
await this.kubeClient.deleteNamespacedSecret(this.secretName, this.namespace);
|
||||
await new Promise((promise) => setTimeout(promise, 5000));
|
||||
} catch (error) {
|
||||
CloudRunnerLogger.log('Failed to cleanup, error:');
|
||||
core.error(JSON.stringify(error, undefined, 4));
|
||||
CloudRunnerLogger.log('Abandoning cleanup, build error:');
|
||||
throw error;
|
||||
}
|
||||
try {
|
||||
await waitUntil(
|
||||
async () => {
|
||||
const jobBody = (await this.kubeClientBatch.readNamespacedJob(this.jobName, this.namespace)).body;
|
||||
const podBody = (await this.kubeClient.readNamespacedPod(this.podName, this.namespace)).body;
|
||||
return (jobBody === null || jobBody.status?.active === 0) && podBody === null;
|
||||
},
|
||||
{
|
||||
timeout: 500000,
|
||||
intervalBetweenAttempts: 15000,
|
||||
},
|
||||
);
|
||||
// eslint-disable-next-line no-empty
|
||||
} catch {}
|
||||
}
|
||||
|
||||
async cleanup(
|
||||
buildGuid: string,
|
||||
buildParameters: BuildParameters,
|
||||
// eslint-disable-next-line no-unused-vars
|
||||
branchName: string,
|
||||
// eslint-disable-next-line no-unused-vars
|
||||
defaultSecretsArray: { ParameterKey: string; EnvironmentVariable: string; ParameterValue: string }[],
|
||||
) {
|
||||
CloudRunnerLogger.log(`deleting PVC`);
|
||||
await this.kubeClient.deleteNamespacedPersistentVolumeClaim(this.pvcName, this.namespace);
|
||||
await Output.setBuildVersion(buildParameters.buildVersion);
|
||||
// eslint-disable-next-line unicorn/no-process-exit
|
||||
process.exit();
|
||||
}
|
||||
|
||||
static async findPodFromJob(kubeClient: CoreV1Api, jobName: string, namespace: string) {
|
||||
const namespacedPods = await kubeClient.listNamespacedPod(namespace);
|
||||
const pod = namespacedPods.body.items.find((x) => x.metadata?.labels?.['job-name'] === jobName);
|
||||
if (pod === undefined) {
|
||||
throw new Error("pod with job-name label doesn't exist");
|
||||
}
|
||||
return pod;
|
||||
}
|
||||
}
|
||||
export default Kubernetes;
|
||||
Reference in New Issue
Block a user