
alphagov / gsp

Licence: MIT license
GSP is a container platform and curated suite of components helping government deploy, run, observe and secure their services



GSP


This project solved some specific needs of GDS. It was not generally useful for people outside of GDS. You should consider using GOV.UK PaaS if you are looking for somewhere to run your services. This is a decommissioning notice detailing issues that would need to be solved in order to re-use this codebase. It only documents issues known at the time the repository was archived, after which it will no longer be updated. For the old README prior to archiving, see README-old.md.


GSP (GDS Supported Platform) was a Kubernetes distribution based on Amazon EKS.

Technically:

  • The Kubernetes/EKS version is behind - we're on 1.16 and the latest is 1.20. 1.16 will not be possible to use after July 2021.
  • There's a TODO in pipelines/deployer/deployer.yaml about k8s 1.15 we can probably remove.
  • GSP relies on Istio 1.5.8 which became end-of-life on 2020-08-24. Also 1.6 was end-of-life on 2020-11-23, 1.7 was end-of-life on 2021-02-25, and 1.8 will be end-of-life on 2021-05-12.
  • GSP ran Prometheus and Grafana through prometheus-operator 8.15.6, which is no longer developed at git@github.com:helm/charts.git stable/prometheus-operator - the latest version of the chart we were using is now deprecated.
  • The check-vulnerabilities job in each cluster deployment pipeline would find all sorts of things in the third-party images we used like cluster-autoscaler, concourse-web, external-dns, fluentd-cloudwatch, fluentd-kubernetes-daemonset, and more. Some of these may be resolvable by upgrading the version of the software used.
  • It's based on Terraform but with some weird extra CloudFormation that should be merged into the Terraform - modules/k8s-cluster/data/nodegroup-v2.yaml and (especially obscure, but possibly unnecessary depending on the item below) modules/k8s-cluster/data/nodegroup.yaml.
  • We're not completely certain that the cluster-management nodes are still necessary; it may be possible to put gatekeeper and the cluster-autoscaler on normal worker nodes.
  • There is a strange distinction between the k8s-cluster and gsp-cluster terraform modules which should probably be eliminated.
  • Some of the docs are written with gds-cli in mind but gds-cli dropped support for GSP in version 4, so you'll either need to convince gds-cli to reimplement that support or eliminate the dependency by going back to plain aws-vault.
  • We wrote our own service operator, but then AWS released AWS Controllers for Kubernetes (https://aws.amazon.com/blogs/containers/aws-controllers-for-kubernetes-ack/), which probably obsoletes part of it already - and in future, likely all of it. If you're going to re-use the GSP code, remove the obsolete parts of our service operator (ECR, SQS, S3) and use AWS's, then eventually remove ours altogether when they add RDS and ElastiCache.
  • We tried to have working, daily replacement of the underlying EC2 instances, but our experience with node rolling in GSP was complicated, with multiple incidents. There are some problems lurking that need tracking down and solving. It may be possible to use EKS Fargate and eliminate the EC2 instances.
  • GSP did not support automatically getting certs for non-govsvc.uk domains (i.e., subdomains of service.gov.uk), leading us to do some odd tricks involving separately terraforming CloudFront distributions in front of GSP. You would probably want to solve this if you use it.
  • The lambda_splunk_forwarder terraform module still lurks, but new clusters should be using CSLS to ship logs, so this Lambda should be eliminated and not re-used. It only survived until GSP decommissioning because swapping it out on the existing production cluster might have led to dropped logs.
  • There's still some CloudHSM support lurking in GSP; this should be eliminated and not re-used.
  • It was set up for the govsvc.uk domain, which may have to be re-registered or transferred.
  • Permissions issues documented here
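
As a rough illustration of the Istio point in the list above, the following Python sketch checks which Istio release lines have passed end-of-life on a given date. The dates are copied from this notice; the function names (`eol_versions`, `days_past_eol`) and the example date are hypothetical, and nothing like this exists in the GSP codebase.

```python
from datetime import date

# End-of-life dates exactly as listed in this notice (not fetched from Istio).
ISTIO_EOL = {
    "1.5": date(2020, 8, 24),   # GSP ran 1.5.8
    "1.6": date(2020, 11, 23),
    "1.7": date(2021, 2, 25),
    "1.8": date(2021, 5, 12),
}


def eol_versions(as_of):
    """Return the release lines whose end-of-life date is on or before as_of."""
    return sorted(v for v, eol in ISTIO_EOL.items() if eol <= as_of)


def days_past_eol(version, as_of):
    """Return how many days as_of is past the end-of-life date (negative if not yet reached)."""
    return (as_of - ISTIO_EOL[version]).days


if __name__ == "__main__":
    today = date(2021, 3, 1)  # hypothetical evaluation date
    for version in eol_versions(today):
        print(version, "is", days_past_eol(version, today), "days past end-of-life")
```

Anyone reviving the platform would need to jump several release lines at once, so a check like this is only a starting point for planning the upgrade path.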

More broadly:

  • Ultimately, an organisation needs strong justification to run multiple platforms like this. If we were to spin this back up it'd have to replace existing systems.
  • Kubernetes is complicated from a developer point of view, requiring lots of tricky YAML documents.
  • The security model was motivated by a particular high-security project which made it tricky to get things done as everything needed to go through GitOps.