42 changes: 42 additions & 0 deletions Pipfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
alabaster = "==0.7.12"
babel = "==2.9.1"
certifi = "==2021.10.8"
charset-normalizer = "==2.0.12"
commonmark = "==0.9.1"
docutils = "==0.17.1"
idna = "==3.3"
imagesize = "==1.3.0"
importlib-metadata = "==4.11.3"
jinja2 = "==3.0.3"
markupsafe = "==2.1.0"
packaging = "==21.3"
pyenchant = "==3.2.2"
pygments = "==2.11.2"
pyparsing = "==3.0.7"
pytz = "==2021.3"
recommonmark = "*"
requests = "==2.27.1"
snowballstemmer = "==2.2.0"
sphinx = "==4.4.0"
sphinxcontrib-applehelp = "==1.0.2"
sphinxcontrib-devhelp = "==1.0.2"
sphinxcontrib-htmlhelp = "==2.0.0"
sphinxcontrib-jsmath = "==1.0.1"
sphinxcontrib-qthelp = "==1.0.3"
sphinxcontrib-serializinghtml = "==1.1.5"
sphinxcontrib-spelling = "==7.3.2"
urllib3 = "==1.26.8"
zipp = "==3.7.0"
sphinx-autobuild = "*"
sphinx-material = "*"

[dev-packages]

[requires]
python_version = "3.9"
521 changes: 521 additions & 0 deletions Pipfile.lock

Large diffs are not rendered by default.

162 changes: 45 additions & 117 deletions README.md
@@ -1,138 +1,66 @@
# Clowder V2 (In Active Development)
![](docs/source/img/logo_full.png)

*For the previous version of Clowder, please see [Clowder V1](https://github.com/clowder-framework/clowder).*
# Clowder v2 (In active development)
[![Build Status](https://github.com/clowder-framework/clowder2/actions/workflows/pytest.yml/badge.svg?branch=main)](https://github.com/clowder-framework/clowder2/actions?query=branch%3Amain)
[![Slack](https://img.shields.io/badge/Slack-4A154B?&logo=slack&logoColor=white)](https://join.slack.com/t/clowder-software/shared_invite/zt-4e0vo0sh-YNndJEuLtPGRa7~uIlpcNA)

Clowder V2 is a reimagining of the [Clowder](https://clowderframework.org/) research data management system
using a different and newer technology stack. While the Clowder V1 has served us well, many of the underlying
technologies and libraries have not received enough support in recent years and new developers have a difficult
time learning how to contribute to it. Clowder V2 is also an opportunity to leverage our experience working with
research data in Clowder and deliver
a better solution to common problems researchers encounter when working with data.
[//]: # ([![Documentation Status](https://readthedocs.org/projects/clowder2/badge/?version=latest)](https://clowder2.readthedocs.io/en/latest/?badge=latest))

In this version of Clowder, the application is clearly divided into backend and frontend modules. This is somewhat
similar to Clowder V1, but V1 used the [Play Framework](https://www.playframework.com/), so its frontend was rendered
on the server side, next to a standalone web Application Programming Interface (API). In Clowder V2, the
frontend module is a standalone [React](https://react.dev/) application and the backend, a
standalone [FastAPI](https://fastapi.tiangolo.com/lo/) web API. We continue to leverage
[MongoDB](https://www.mongodb.com/), [RabbitMQ](https://www.rabbitmq.com/),
and [Elasticsearch](https://www.elastic.co/). We also use [MinIO](https://min.io/) out of the box as the default object
store and [Traefik](https://traefik.io/traefik/) as the application proxy.
*For the previous version of Clowder, please see [Clowder v1](https://github.com/clowder-framework/clowder).*

## Running in Docker
Clowder v2 is a reimagining of the [Clowder research data management system](https://clowderframework.org/)
using a different and newer technology stack. Clowder is a cloud native data management framework to support any
research domain. Clowder was developed to help researchers and scientists in data intensive domains manage raw data,
complex metadata, and automatic data pipelines.

To run the full stack using [Docker](https://www.docker.com/) (recommended), please use the following instructions:
While Clowder v1 has worked well over the years, many of the underlying
technologies and libraries have not received enough support in recent years and new developers have had a challenging
time learning how to contribute to it.
Clowder v2 is also an opportunity to leverage our experience working with
research data in Clowder and deliver a better solution to common problems researchers encounter when working with data.

1. Run all Docker services with `docker compose up --scale backend=4 --build`. This will start the services with four
instances of the backend module running in parallel. Note the `--build` flag used to build the images first. If using
default images, that flag can be removed. The images can also be built with `docker compose build`.
Clowder v2 provides:

2. The application will be running and available at `http://localhost`.
- a better user experience and user interface
- a code base that is easier to pick up and modify, written in Python/FastAPI and TypeScript/React
- new features based on our experience working with researchers

3. To access the Traefik dashboard, go to `http://localhost:8080`. To view the raw
settings, go to `http://localhost:8080/api/rawdata`.
## Documentation

## Developing
The v2 documentation is still a work in progress. It's available at https://clowder2.readthedocs.io.

When developing, the required services can be run using Docker. You can then run the backend
and frontend modules from the command line or in your favorite IDE (to make debugging easier). We recommend
using [PyCharm](https://www.jetbrains.com/pycharm/) and have
made our run configurations available in the `.run` folder. PyCharm should automatically import it, but you will have
to change the path to the Python virtual environment to point to your path on your host (see Initial Dependencies
section below).
The v1 documentation is not fully compatible with v2, but it does provide some still-relevant information.
It is available at https://clowder-framework.readthedocs.io.
There are a few other documentation links available on the [website](https://clowderframework.org/documentation.html).

### Initial Development Dependencies
## Installation

- Run `python3 -m venv venv` to create a Python Virtual Environment and add it to PyCharm by navigating to
`PyCharm -> Settings... -> Project: clowder2 -> Python Interpreter -> Add Interpreter`.
- Run `source venv/bin/activate && pip install --upgrade pip` to activate the created Python Virtual Environment and
upgrade
pip.
- Run `pip install pipenv` to install [Pipenv](https://pipenv.pypa.io/en/latest/).
The easiest way of running Clowder v2 is checking out the [code](https://github.com/clowder-framework/clowder2)
and running `docker compose up` in the main directory.

### Required Services
Helm charts are available for running Clowder v2 on Kubernetes. See the [helm](https://github.com/clowder-framework/clowder2/tree/main/deployments/kubernetes/charts) directory for more information.

- Running `./docker-dev.sh up` brings up the required services in the background.
- Running `docker-compose logs -f` displays the live logs for all containers. To view the logs of individual containers,
provide the container name. For example, for viewing the backend module logs, run `docker-compose logs -f backend`.
- Running `./docker-dev.sh down` brings down the required services.
## Contributing

**Note:** `./docker-dev.sh` sets the project name flag to `-p clowder2-dev`. This is so that the dev containers
don't get mixed with the production containers if the user is running both on the same machine using `docker-compose.yml`.
If this is not used, the keycloak container will use the volume created with the other docker compose and it will be
unable to run as the information stored in the postgres database is different.
We are always looking for contributors. This could be anything from fixing bugs, adding new features, providing new
feature requests, recommending UI/UX improvements, helping with the documentation, or just testing the system and
providing feedback. Here are a few ways to get started:

### Backend Module
- Join our [Slack](https://join.slack.com/t/clowder-software/shared_invite/zt-4e0vo0sh-YNndJEuLtPGRa7~uIlpcNA)
channel, introduce yourself, and ask questions about the specific aspects of the system you are interested in.
- Submit an issue (bug or feature request) on the [issue tracker](https://github.com/clowder-framework/clowder2/issues).
- Submit a [pull request](https://github.com/clowder-framework/clowder2/pulls) with a bug fix or new feature. For
larger changes, it's best to open an issue first or ask on Slack to discuss the changes.
- Develop new [information extractors](https://github.com/clowder-framework/pyclowder) and/or visualizations.

After starting up the required services, setup and run the backend module.
Please follow our [code of conduct](https://github.com/clowder-framework/clowder/blob/develop/CODE_OF_CONDUCT.md) when
interacting with the community.

The backend module is developed using [Python](https://www.python.org/), [FastAPI](https://fastapi.tiangolo.com/),
and [Motor](https://motor.readthedocs.io/en/stable/).
We recommend using [Python 3.9](https://www.python.org/downloads/)
and Pipenv for dependency management.
## Support & Contacts

#### Install Backend Dependencies
The easiest way to get in touch with us is [Slack](https://join.slack.com/t/clowder-software/shared_invite/zt-4e0vo0sh-YNndJEuLtPGRa7~uIlpcNA).
This is a public forum. If you prefer email, you can contact us at [[email protected]](mailto:[email protected]).

1. Switch to backend module directory `cd backend`.
2. Install dependencies using `pipenv install --dev`.
## License

#### Run Backend Module

You can run the backend module using either of the below options:

- Using PyCharm's run configuration by navigating to `PyCharm -> Run -> Run...` and clicking `uvicorn`. Running
directly from PyCharm helps the developer by providing easy access to its debugging features.
- From the command line by running `pipenv run uvicorn app.main:app --reload` .

Additional steps/details:

1. API docs are available at `http://localhost:8000/docs`. The API base URL is `http://localhost:8000/api/v2`.
2. Create a user using `POST /api/v2/users` and get a JWT token using `POST /api/v2/login`. Place the token in the
header of requests that require authentication using the `Authorization: Bearer <your token>` HTTP header.
* You can also run the frontend module below and use the Login link available there.
3. Manually run tests before pushing with `pipenv run pytest -v` or right-clicking on `test` folder and clicking `Run`
in PyCharm.
4. Linting is done using [Black](https://black.readthedocs.io/en/stable/). You can set up PyCharm to automatically
run it when you save a file using
these [instructions](https://black.readthedocs.io/en/stable/integrations/editors.html).
The git repository includes an action to run Black on push and pull_request.
5. Before pushing new code, please make sure all files are properly formatted by running the following command in
the `/backend` directory:
```pipenv run black app```
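
As a rough illustration of step 2, the sketch below builds requests against the API base URL given above and attaches the `Authorization: Bearer <your token>` header. It only constructs the request objects; nothing is sent. The helper name `build_request`, the `/datasets` path, and the login payload field names are hypothetical and are not taken from the Clowder code base; check the API docs at `http://localhost:8000/docs` for the real schemas.

```python
import json
import urllib.request

# Base URL as stated in the README for a locally running backend.
API_BASE = "http://localhost:8000/api/v2"


def build_request(path, payload=None, token=None, method="POST"):
    """Build (but do not send) a JSON request to the backend.

    `path` is appended to API_BASE. `payload`, if given, is JSON-encoded.
    `token`, if given, is sent as a bearer token.
    """
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(API_BASE + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req


# Hypothetical usage (field names are assumptions, not the documented schema):
# login = build_request("/login", {"email": "me@example.org", "password": "secret"})
# with urllib.request.urlopen(login) as resp:
#     token_response = json.load(resp)
# listing = build_request("/datasets", token="<your token>", method="GET")
```

Keeping request construction separate from sending makes it easy to inspect the headers before anything goes over the network.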

### Frontend Module

To run the frontend, both required services and the backend module must be running successfully.

The frontend module is developed using [TypeScript](https://www.typescriptlang.org/), [React](https://reactjs.org/),
[Material UI](https://mui.com/), [Redux](https://redux.js.org/), [webpack](https://webpack.js.org/), and
[Node.js](https://nodejs.org). We recommend using Node v16.15 LTS.

#### Install Frontend Dependencies

1. Switch to frontend directory `cd ../frontend`.
2. Install dependencies: `npm install`

#### Run Frontend Module

You can run the frontend module using either of the below options:

- Using PyCharm's run configuration by navigating to `PyCharm -> Run -> Run...` and clicking `start:dev`. Running
directly from PyCharm helps the developer by providing easy access to its debugging features.
- From the command line by running `npm run start:dev`
- By default, the backend module runs at `http://localhost:8000`. If running at a different URL/port, use:
`CLOWDER_REMOTE_HOSTNAME=http://<hostname or IP address>:<port number> npm start`
- After modifying the backend module API, update autogenerated client function calls:
- Backend module must be running
- Run codegen: `npm run codegen:v2:dev`

### Configuring Keycloak

- If you are a developer running the dev stack on your local machine, please import
the [Keycloak](https://www.keycloak.org/) realm setting file `/scripts/keycloak/clowder-realm-dev.json`
- If you are running the production docker compose on a local machine, please import the Keycloak realm setting file
`/scripts/keycloak/clowder-realm-prod.json`
- If you are deploying on the Kubernetes cluster (https://clowder2.software-dev.ncsa.cloud/), please import the
Keycloak realm setting file `/scripts/keycloak/mini-kube-clowder-realm-prod.json`

**For more details on how to set up Keycloak, please refer to
this [Documentation](docs/source/configure-keycloak-realm.md)**
Clowder v2 is licensed under the [Apache 2.0 license](https://github.com/clowder-framework/clowder2/blob/main/LICENSE).
8 changes: 4 additions & 4 deletions deployments/kubernetes/charts/README.md
@@ -1,6 +1,6 @@
# Clowder 2
# Clowder v2 Helm Charts

This depends on some subcharts, make sure to have them installed if you plan on modifying the helm chart:
The Helm chart depends on some subcharts. Make sure to have them installed if you plan on modifying the chart:

```bash
helm repo add bitnami https://charts.bitnami.com/bitnami
@@ -28,9 +28,9 @@ Now you can install (or upgrade) clowder using:
helm upgrade --install --namespace clowder2 --create-namespace --values local.yaml clowder2 .
```

# Docker Desktop
## Ingress Controller

You will need an ingress controller, I like Traefik as my ingress controller. You install this with:
You will need an ingress controller. Traefik works well as an ingress controller. You can install it with:

```bash
helm install --namespace traefik --create-namespace traefik traefik/traefik
18 changes: 16 additions & 2 deletions docs/README.md
@@ -5,18 +5,32 @@ Uses [Sphinx](https://www.sphinx-doc.org). Requires [enchant](https://pyenchant.

Currently deployed at https://clowder2.readthedocs.io.

If you have `sphinx-autobuild` installed, you can use it to automatically rebuild the docs
when you make changes.

```shell
sphinx-autobuild source build/html

open http://localhost:8000/
```

If you don't have `sphinx-autobuild` installed, you can use the Makefile.

```shell
# build
make html

# view
python3 -m http.server --directory build/html
python3 -m http.server 7000 --directory build/html
open http://localhost:7000/

# check links
make linkcheck
```

# spell checking
You can check spelling with `make spelling`. This requires `enchant` to be installed.

```shell
# install enchant once, on mac
brew install enchant

1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -18,6 +18,7 @@ recommonmark==0.7.1
requests==2.27.1
snowballstemmer==2.2.0
Sphinx==4.4.0
sphinx-autobuild==2021.3.14
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.0
9 changes: 9 additions & 0 deletions docs/source/architecture.md
@@ -0,0 +1,9 @@
# Architecture

Clowder v2 follows the architecture of v1, with the additional split of the server side into
multiple standalone containers.
Each box below is a docker service as defined in `docker-compose.yml`.
Boxes are grouped logically into backend, Keycloak, extractors, and database groups.
Lines show interactions between containers over the docker network.

![Clowder v2 architecture](img/architecture.jpg)
50 changes: 44 additions & 6 deletions docs/source/conf.py
@@ -17,12 +17,12 @@

# -- Project information -----------------------------------------------------

project = "Clowder v2"
copyright = "2022, Luigi Marini"
author = "Luigi Marini"
project = "Clowder2"
copyright = "2022, Clowder Devs"
author = "Clowder Devs"

# The full version, including alpha/beta/rc tags
release = "0.1"
release = "2.0.0-beta.1"


# -- General configuration ---------------------------------------------------
@@ -40,13 +40,51 @@
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "alabaster"
html_theme = "sphinx_material"

# Set link name generated in the top bar.
html_title = "Clowder v2"

# Material theme options (see theme.conf for more information)
html_sidebars = {
    "**": ["logo-text.html", "globaltoc.html", "localtoc.html", "searchbox.html"]
}
html_theme_options = {
    # Set the name of the project to appear in the navigation.
    "nav_title": "Clowder v2",
    # Set your GA account ID to enable tracking
    "google_analytics_account": "UA-XXXXX",
    # Specify a base_url used to generate sitemap.xml. If not
    # specified, then no sitemap will be built.
    "base_url": "https://clowder2.readthedocs.io/",
    # Set the color and the accent color
    "color_primary": "blue",
    "color_accent": "light-blue",
    # Set the repo location to get a badge with stats
    "repo_url": "https://github.com/clowder-framework/clowder2",
    "repo_name": "Clowder2",
    # Visible levels of the global TOC; -1 means unlimited
    "globaltoc_depth": 2,
    # If False, expand all TOC entries
    "globaltoc_collapse": True,
    # If True, show hidden TOC entries
    "globaltoc_includehidden": False,
}

import os

FORCE_CLASSIC = os.environ.get("SPHINX_MATERIAL_FORCE_CLASSIC", False)
FORCE_CLASSIC = FORCE_CLASSIC in ("1", "true")
if FORCE_CLASSIC:
    print("!!!!!!!!! Forcing classic !!!!!!!!!!!")
    html_theme = "classic"
    html_theme_options = {}
    html_sidebars = {"**": ["globaltoc.html", "localtoc.html", "searchbox.html"]}
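
The environment-variable toggle above can be sketched as a small, testable helper. This is an illustrative variation and not part of the PR: the names `env_flag` and `pick_theme` are hypothetical, and unlike the `conf.py` check, which matches only the exact strings `"1"` and `"true"`, this sketch compares case-insensitively.

```python
import os


def env_flag(name, environ=None):
    """Return True when the variable is set to "1" or "true".

    Assumption: matching is case-insensitive here, which differs from
    the exact-match check in conf.py.
    """
    env = os.environ if environ is None else environ
    return env.get(name, "").strip().lower() in ("1", "true")


def pick_theme(environ=None):
    """Choose the Sphinx theme name based on SPHINX_MATERIAL_FORCE_CLASSIC."""
    if env_flag("SPHINX_MATERIAL_FORCE_CLASSIC", environ):
        return "classic"
    return "sphinx_material"
```

Passing the environment as an argument keeps the helper easy to exercise without modifying `os.environ`. With the variable exported as `SPHINX_MATERIAL_FORCE_CLASSIC=1` before running the docs build, `pick_theme()` would return `"classic"`.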

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,