SW2 Cleanup (#13)
* Update code for versions

* Adding tests

* Update dockerfile

* update pydantic


---------

Co-authored-by: Boris <[email protected]>
bio-boris and Boris authored Sep 25, 2023
1 parent 956001e commit 90485bb
Showing 56 changed files with 2,460 additions and 925 deletions.
5 changes: 2 additions & 3 deletions .env
@@ -18,8 +18,7 @@ SERVICE_WIZARD_ADMIN_ROLE="SERVICE_WIZARD_ADMIN"
CATALOG_ADMIN_TOKEN="REDACTED"

# Kubernetes configs
# Note this also creates a toleration V1Toleration(effect="NoSchedule", key=namespace, operator="Exists")
KUBECONFIG="~/.kube/config"
NAMESPACE="staging-dynamic-services" # Note this also creates a toleration V1Toleration(effect="NoSchedule", key=namespace, operator="Exists")
NAMESPACE="staging-dynamic-services"
USE_INCLUSTER_CONFIG="false"
TAINT_TOLERATION_EXPRESSIONS=""
# APP_AFFINITY_FILE="app-affinity.yaml" # path to files that force apps to run on specific nodes
6 changes: 2 additions & 4 deletions .github/workflows/codeql.yml
@@ -1,12 +1,10 @@


name: "Code Scanning - Action"

on:
push:
branches: [main, develop]
branches: [ main, develop ]
pull_request:
branches: [main, develop]
branches: [ main, develop ]
schedule:
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
2 changes: 1 addition & 1 deletion .github/workflows/manual-build.yml
@@ -1,7 +1,7 @@
---
name: Manual Build & Push
on:
workflow_dispatch:
workflow_dispatch:
jobs:
build-push:
uses: kbase/.github/.github/workflows/reusable_build-push.yml@main
20 changes: 10 additions & 10 deletions .pre-commit-config.yaml
@@ -1,20 +1,20 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v3.2.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files

- repo: https://github.com/psf/black
- repo: https://github.com/psf/black
rev: 23.7.0
hooks:
- id: black
- id: black
language_version: python3.11

- repo: https://github.com/pycqa/flake8
- repo: https://github.com/pycqa/flake8
rev: 6.1.0
hooks:
- id: flake8
args: [--config, pyproject.toml]
- id: flake8
args: [ --config, pyproject.toml ]
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
FROM python:3.11
FROM python:3.11.4-bookworm

RUN mkdir -p /app
WORKDIR /app
7 changes: 3 additions & 4 deletions Pipfile
Expand Up @@ -4,13 +4,12 @@ verify_ssl = true
name = "pypi"

[packages]
fastapi = "==0.95.2"
uvicorn = {version = "==0.22.0", extras = ["standard"]}
fastapi = "==0.101.1"
uvicorn = { version = "==0.22.0", extras = ["standard"] }
sentry-sdk = "==1.25.0"
PySocks = "==1.7.1"
requests = "==2.31.0"
prometheus-fastapi-instrumentator = "==6.0.0"
pydantic = "==1.10.8"
prometheus-fastapi-instrumentator = "==6.0.0"
cacheout = "==0.14.1"
jinja-cli = "==1.2.2"
python-dotenv = "==0.19.1"
260 changes: 173 additions & 87 deletions Pipfile.lock


181 changes: 70 additions & 111 deletions README.md
@@ -1,36 +1,44 @@
# Service Wizard 2
[![codecov](https://codecov.io/gh/kbase/service_wizard2/graph/badge.svg?token=JxuP8XOFwU)](https://codecov.io/gh/kbase/service_wizard2)

The service wizard manages the lifecycle of "dynamic services" in KBase.
The previous service wizard talked directly to Rancher1; this one talks directly to Kubernetes.
Dynamic services are responsible for providing data and/or UI components for the KBase UI and Narrative.
Dynamic services are responsible for providing data and/or UI components for the KBase UI and Narrative.

# Known issues
* Still does not allow you to update environmental variables for a service that was launched once, it requires a new deployment.

* Still does not allow you to update environment variables for a service that was launched once; a new
  deployment is required.
* Starting up too many services causes the status endpoint to not respond.
* Only supports one type of toleration for now.
* Doesn't completely support multiple replicas for now.
* Doesn't support volumes, only bind mounts
* Doesn't yet support forcing a dynamic service to land on a specific host (e.g. staticnarrative service, htmlfilsetservice) or define behavior for multiple replicas on specific hosts
* If the catalog admin is not valid, you get an authentication error, but its not clear that its the auth token from the service rather than from the user request

* Doesn't yet support forcing a dynamic service to land on a specific host (e.g. staticnarrative service,
htmlfilsetservice) or define behavior for multiple replicas on specific hosts
* If the catalog admin token is not valid, you get an authentication error, but it's not clear that it's the auth token
  from the service rather than from the user request

# Environment Variables

The following environment variables are used to configure the application:
The following environment variables are used to configure the application.
Ensure that all the required environment variables are properly set before running the application.
See [.env](.env) file for example

## *Required Environment Variables*

## Client URLs

- `AUTH_SERVICE_URL`: Defines the URL of the authentication service used for user authentication and authorization.
- `CATALOG_URL`: Sets the URL for the catalog service, which manages and provides access to application catalogs.
- `AUTH_LEGACY_URL`: Defines the URL of the legacy authentication service to be appended to the env inside the dynamic service

- `AUTH_LEGACY_URL`: Defines the URL of the legacy authentication service to be appended to the env inside the dynamic
service

## Service Wizard URLs

- `EXTERNAL_SW_URL`: Specifies the URL for the external Service Wizard.
- `EXTERNAL_SW_URL`: Specifies the URL for the external Service Wizard. Also serves as identifier for Sentry
- `EXTERNAL_DS_URL`: Sets the URL for the external Dynamic Services.
- `KBASE_SERVICES_ENDPOINT`: Specifies the endpoint URL for the KBase service, which provides various functionalities for the application.
- `KBASE_SERVICES_ENDPOINT`: Specifies the endpoint URL for the KBase service, which provides various functionalities
for the application.
- `KBASE_ROOT_ENDPOINT`: Specifies the root endpoint URL for KBase.
- `ROOT_PATH`: Specifies the root path for the application.

@@ -43,18 +51,33 @@ See [.env](.env) file for example

## Kubernetes configs

- `KUBECONFIG`: Specifies the path to the kubeconfig file. This environment variable is required when `USE_INCLUSTER_CONFIG` is set to "false", else it will read from the default location.
- `KUBECONFIG`: Specifies the path to the kubeconfig file. This environment variable is required
when `USE_INCLUSTER_CONFIG` is set to "false", else it will read from the default location.
- `NAMESPACE`: Specifies the namespace for the application where it operates.
- `USE_INCLUSTER_CONFIG`: A boolean flag indicating whether the application should use in-cluster configuration. Set it to "true" to use in-cluster configuration or "false" to use an external configuration file.
- `USE_INCLUSTER_CONFIG`: A boolean flag indicating whether the application should use in-cluster configuration. Set it
to "true" to use in-cluster configuration or "false" to use an external configuration file.

**NOTE THAT** setting the `KUBECONFIG` environment variable will have no effect when `USE_INCLUSTER_CONFIG` is set to "
true". The application will automatically use the in-cluster configuration provided by the underlying infrastructure. If
you want to use an external configuration file, ensure that `USE_INCLUSTER_CONFIG` is set to "false" and provide the
path to the configuration file using the `KUBECONFIG` environment variable.

**NOTE THAT** setting the `KUBECONFIG` environment variable will have no effect when `USE_INCLUSTER_CONFIG` is set to "true". The application will automatically use the in-cluster configuration provided by the underlying infrastructure. If you want to use an external configuration file, ensure that `USE_INCLUSTER_CONFIG` is set to "false" and provide the path to the configuration file using the `KUBECONFIG` environment variable.
**NOTE THAT** setting `NAMESPACE` also creates a toleration V1Toleration(effect="NoSchedule", key=namespace, operator="Exists")
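The toleration that `NAMESPACE` creates has the following shape, shown here as the raw Kubernetes API dict (the service itself would construct a `kubernetes.client.V1Toleration`; this plain-dict form is a sketch):

```python
def namespace_toleration(namespace: str) -> dict:
    # Equivalent to V1Toleration(effect="NoSchedule", key=namespace, operator="Exists")
    return {"effect": "NoSchedule", "key": namespace, "operator": "Exists"}
```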

Ensure that all the required environment variables are properly set before running the application.
## *Optional Environment Variables*

## Telemetry and Miscellaneous configs

- `SENTRY_DSN`: The DSN for the sentry instance to use for error reporting
- `METRICS_USERNAME`: The username for the /metrics endpoint, which can be used by Prometheus
- `METRICS_PASSWORD`: The password for the /metrics endpoint, which can be used by Prometheus
**NOTE THAT** the `/metrics` endpoint will not be available unless both the username and password are set.
- `DOTENV_FILE_LOCATION`: The location of the .env file to use for local development. Defaults to .env
- `LOG_LEVEL`: The log level to use for the application. Defaults to INFO

# Code Review Request

* Organization and error handling for authorization; files in random places from ripping out FastAPI parts.
* Organization and directory structure of APP
* Organization and directory structure of TESTS
* Organization and directory structure of TESTS (unit tests)
Expand All @@ -65,115 +88,86 @@ Ensure that all the required environment variables are properly set before runni
* Dependency system design (passing around request.app.state)
* Caching
* Async/await


# Local Development

This repo uses pipenv to manage dependencies.
To install pipenv, run `pip install pipenv`.
To install dependencies, run

```
pipenv --python 3.11-service_wizard2
pipenv install --dev
pipenv shell
```

To start the server, run

```
uvicorn --host 0.0.0.0 --factory src.factory:create_app --reload --port 1234
```

To install the pre-commit hook and test it, run

```
pre-commit install
pre-commit run --all-files
```



Convenience scripts are provided in the [scripts](scripts) directory to setup the pipenv environment and install dependencies.
Convenience scripts are provided in the [scripts](scripts) directory to setup the pipenv environment and install
dependencies.

In order to connect to a kubernetes cluster, you will need to have a kubeconfig file in your home directory.
The kubeconfig file is typically located at `~/.kube/config`.
Read more about kubeconfig files [here](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/).
Ensure that your context is set to the correct cluster and namespace and matches the environmental variables in the [env](test/.env) file.

Read more about kubeconfig
files [here](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/).
Ensure that your context is set to the correct cluster and namespace and matches the environment variables in
the [env](test/.env) file.

# PYCHARM

You can run the service in pycharm as well, but you will need to set the following parameters in the run configuration:

script path =`/Users/XXX/.local/share/virtualenvs/service_wizard2-vG0FwGFD/bin/uvicorn`
parameters = `--reload --port 5002 --host 0.0.0.0 --factory src.factory:create_app `
parameters = `<uvicorn_path_goes_here> --reload --port 5002 --host 0.0.0.0 --factory src.factory:create_app `
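uvicorn's `--factory` flag expects `src.factory:create_app` to be a zero-argument callable that returns the ASGI app. A minimal stand-in illustrating that contract (the real factory builds a FastAPI app; this bare-ASGI version is only a sketch):

```python
def create_app():
    """Factory: uvicorn calls this once and serves the returned app."""
    async def app(scope, receive, send):
        # Handle plain HTTP requests with a fixed 200 response
        if scope["type"] != "http":
            return
        await send({"type": "http.response.start", "status": 200,
                    "headers": [(b"content-type", b"text/plain")]})
        await send({"type": "http.response.body", "body": b"ok"})
    return app
```

Run it with, e.g., `uvicorn --factory module_name:create_app` — without `--factory`, uvicorn would try to serve `create_app` itself rather than the app it returns.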

## Usage

OpenAPI documentation is provided at the `/docs` endpoint of the server (in KBase, this is
at `<host>/service/service_wizard2/docs`, for example
[https://ci.kbase.us/services/service_wizard2/docs](https://ci.kbase.us/services/service_wizard2/docs)).

However, the RPC endpoints are not documented. See the [original service wizard spec](src/ServiceWizard_Artifacts/ServiceWizard.spec) for details on how to use the endpoint.


### Error codes

Error codes are listed in [errors.py](src/service/errors.py).
Errors are returned as JSON-RPC errors.
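A hedged sketch of what such an error envelope looks like, in the JSON-RPC 1.1 style the original service wizard used (the field values here are illustrative — the actual codes are defined in errors.py):

```python
def jsonrpc_error(code: int, message: str, request_id: str = "1") -> dict:
    # Illustrative envelope only; consult errors.py for the real codes.
    return {
        "version": "1.1",
        "id": request_id,
        "error": {"code": code, "message": message},
    }
```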

## Administration

To start the service Docker container:
* Ensure the appropriate Kubernetes roles/rolebindings are in place for the service account
used by the service.
* Ensure that the namespace is created for both the Service Wizard and the Dynamic Services.
* Ensure that the environment is properly configured for the service.

* The collections listed in
[collection_and_field_names.py](src/common/storage/collection_and_field_names.py) must be
created in ArangoDB. The collections are not created automatically to allow service admins
to specify sharding to their liking. Indexes are created automatically, assuming the collections
exist.
* The environment variables listed in
[collections_config.toml.jinja](collections_config.toml.jinja)
must be provided to the Docker container, unless their default values are acceptable.
In particular, database access and credential information must be provided.

## File structure

* `/src/service` - service code
* `/src/loaders/[collection ID]` - loader code for collections, e.g. `/loaders/gtdb`
* `/src/common` - shared loader and service code
* `/src/common/storage` - data connection and access methods
* `/src/clients` - KBase and Kubernetes clients with caches
* `/src/configs` - the configuration for the app
* `/src/dependencies` - shared service code
* `/src/models` - models for the app returns, logic for calculating service status, other models
* `/src/routes` - the routes for the app
* `/src/rpc` - the RPC endpoints for the app and common code
* `/test/src` - test code. Subdirectories should mirror the folder structure above, e.g.
`/test/src/service` contains service test code
* `/test/ServiceWizard_Artifacts` - the original Service Wizard related code

## Development
* Update the release notes in the [RELEASE_NOTES.md](RELEASE_NOTES.md) file.
* You can run the app via `docker-compose.yaml`
* You can update your credentials in your `kubeconfig` to deploy and launch the app in Rancher2 Desktop

### Adding code

* In this alpha / prototype stage, we will be PRing (do not push directly) to `main`. In the
future, once we want to deploy beyond CI, we will add a `develop` branch.
* The PR creator merges the PR and deletes branches (after builds / tests / linters complete).
* To add new data products, see [Adding data products](/docs/adding_data_products.md)

#### Timestamps

* Timestamps visible in the API must be fully qualified ISO8601 timestamps in the format
`2023-01-29T21:41:48.867140+00:00`.
* Timestamps may be stored in the database as either the above format or as Unix epoch
milliseconds, depending on the use case.
* If timestamps are stored as epoch ms, they must be converted to the ISO8601 format prior to
returning them via the API.
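The epoch-ms to ISO8601 conversion described above is a one-liner with the standard library; `timespec="microseconds"` keeps the output in the fully qualified format shown earlier even when the microseconds are zero:

```python
from datetime import datetime, timezone

def epoch_ms_to_iso8601(ms: int) -> str:
    """Convert Unix epoch milliseconds to a fully qualified ISO8601 string."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).isoformat(
        timespec="microseconds")
```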

### Versioning

* The code is versioned according to [Semantic Versioning](https://semver.org/).
* The version must be updated in
* `/src/common/version.py`
* `/RELEASE_NOTES.md`
* any test files that test the version

### Code requirements for prototype code:

* Any code committed must at least have a test file that imports it and runs a noop test so that
the code is shown with no coverage in the coverage statistics. This will make it clear what
code needs tests when we move beyond the prototype stage.
* Each module should have its own test file. Eventually these will be expanded into unit tests
(or integration tests in the case of app.py)
* Any code committed must have regular code and user documentation so that future devs
converting the code to production can understand it.
* Release notes are not strictly necessary while deploying to CI, but a concrete version (e.g.
no `-dev*` or `-prototype*` suffix) will be required outside of that environment. On a case by
case basis, add release notes and bump the prototype version (e.g. 0.1.0-prototype3 ->
0.1.0-prototype4) for changes that should be documented.

### Running tests

@@ -184,38 +178,3 @@ pipenv sync --dev # only the first time or when Pipfile.lock changes
pipenv shell
PYTHONPATH=. pytest test
```

## TODO

* Logging IP properly (X-RealIP, X-Forwarded-For)
* Add request ID to logs and return in errors
* Compare log entries to SDK and see what we should keep
* Take a look at the jgi-kbase IDmapper service

### Prior to declaring this a non-prototype

* Coverage badge in Readme
* Run through all code, refactor to production quality
* Add tests where missing (which is a lot) and inspect current tests for completeness and quality
* E.g. don't assume existing tests are any good
* Async testing help
https://tonybaloney.github.io/posts/async-test-patterns-for-pytest-and-unittest.html
* Build & push tool images in GHA
* Consider using a base image for each tool with a "real" image that builds from the base image.
The "real" image should just copy the files into the image and set the entry point. This will
make GHA builds a lot faster
* Alternatively use docker's GHA cache feature
* Manual push only is probably fine, these images won't change that often
* JobRunner repo should be updated to push the callback server to a GHA KBase namespace
* Testing tool containers
* DO NOT import the tool specific scripts and / or run them directly in tests, as that will
require all their dependencies to be installed, creating dependency hell.
* Instead
* Test as a black box using `docker run`
* This won't work for gtdb_tk, probably. Automated testing for that is going to be
problematic.
* If necessary, add a `Dockerfile.test` dockerfile to build a test specific image and run
tests in there.
* Either mount a directory in which to save the coverage info or `docker cp` it when the
run is complete
* Figure out how to merge the various coverage files.
3 changes: 2 additions & 1 deletion codecov.yml
@@ -1,2 +1,3 @@
ignore:
- "src/clients"
- "src/clients/baseclient.py"
- "src/clients/CatalogClient.py"
5 changes: 4 additions & 1 deletion docker-compose.yaml
@@ -1,6 +1,7 @@
version: '3'

# This docker-compose is for developer convenience, not for running in production.
# Be careful: this mounts your kubeconfig file into the container, giving this application access to your k8s connections

services:

@@ -14,4 +15,6 @@ services:
ports:
- "5001:5000"
env_file:
- .env
- .env
volumes:
- ~/.kube/config:/root/.kube/config
8 changes: 4 additions & 4 deletions k8/README.md
@@ -1,4 +1,4 @@
# K8 Deployment Files
* You can use these to deploy into rancher desktop
* You will have to volume mount in kubconfig into the container, this is not yet added
*
# K8 Deployment Files For Local Testing and Development

* You can use these to deploy into Rancher Desktop or Podman Desktop
* You will have to volume mount your kubeconfig into the service wizard container, or use an incluster_config