@longshuicy

In this PR I created views for all the major resources we need to list. I included a README on how to import them using Studio 3T (see https://github.com/clowder-framework/clowder2/blob/e6cc94251d05cb7d1df29018817296168c97cdbc/scripts/mongoviews/README.md).

For most of the resources I just join the resource's "dataset_id" with the "dataset_id" field in the "authorization" collection.
Metadata and listeners are trickier because they can be attached to either a "file" or a "dataset". So I first use a "facet" stage (acting as an if statement) to check whether the resource type is "file" or "dataset".
If it's a dataset, I join directly; if it's a file, I make an extra trip to the files collection first and then join with "authorization".
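
To make the two-branch idea concrete, here is a rough sketch of what such a pipeline could look like for metadata. It is not the definition from the README: the view name "metadata_view", the helper name, and the "datasets" value of "resource.collection" are assumptions, while the "files" value and the "file_details" and "auth" output fields follow the example documents in this PR.

```javascript
// Hypothetical helper: a $lookup that joins the given field against
// authorization.dataset_id and stores the matches in "auth".
function authLookup(localField) {
  return {
    $lookup: {
      from: "authorization",
      localField: localField,
      foreignField: "dataset_id",
      as: "auth",
    },
  };
}

// Branch 1: metadata attached directly to a dataset, so one join is enough.
// ASSUMPTION: dataset-level metadata stores "datasets" in resource.collection.
const datasetBranch = [
  { $match: { "resource.collection": "datasets" } },
  authLookup("resource.resource_id"),
];

// Branch 2: metadata attached to a file. Resolve the file first, then
// join authorization on the dataset_id of that file.
const fileBranch = [
  { $match: { "resource.collection": "files" } },
  {
    $lookup: {
      from: "files",
      localField: "resource.resource_id",
      foreignField: "_id",
      as: "file_details",
    },
  },
  authLookup("file_details.dataset_id"),
];

// $facet runs both branches over the same input. The last three stages
// merge the two result arrays back into a flat stream of documents.
const metadataViewPipeline = [
  { $facet: { dataset: datasetBranch, file: fileBranch } },
  { $project: { all: { $concatArrays: ["$dataset", "$file"] } } },
  { $unwind: "$all" },
  { $replaceRoot: { newRoot: "$all" } },
];

// In mongosh (view name is an assumption):
// db.createView("metadata_view", "metadata", metadataViewPipeline);
```

One caveat with this shape: $facet collects each branch into an array inside a single document, so it is subject to the BSON document size limit and may not suit very large collections.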

Some example documents from the views:

dataset

{
    "_id" : ObjectId("63e3ef7ab6a10a463793346b"),
    "name" : "test",
    "description" : "test",
    "author" : {
        "id" : ObjectId("63a32586cdf079dcd6077a1a"),
        "email" : "[email protected]",
        "first_name" : "Chen",
        "last_name" : "Wang"
    },
    "created" : ISODate("2023-02-08T18:52:42.597+0000"),
    "modified" : ISODate("2023-02-08T18:52:42.597+0000"),
    "status" : "PRIVATE",
    "views" : NumberInt(0),
    "downloads" : NumberInt(0),
    "auth" : [
        {
            "_id" : ObjectId("63f7919071578766527ad69c"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:26:20.496+0000"),
            "modified" : ISODate("2023-01-20T15:26:20.496+0000"),
            "dataset_id" : ObjectId("63e3ef7ab6a10a463793346b"),
            "user_id" : "[email protected]",
            "role" : "viewer"
        },
        {
            "_id" : ObjectId("63f7a44071578766527ad6cd"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:26:20.496+0000"),
            "dataset_id" : ObjectId("63e3ef7ab6a10a463793346b"),
            "modified" : ISODate("2023-01-20T15:26:20.496+0000"),
            "role" : "viewer",
            "user_id" : "[email protected]"
        }
    ]
}

file

{
    "_id" : ObjectId("63ee5eb09940b21e3765d610"),
    "name" : "HHS (2).png",
    "creator" : {
        "id" : ObjectId("63a32586cdf079dcd6077a1a"),
        "email" : "[email protected]",
        "first_name" : "Chen",
        "last_name" : "Wang"
    },
    "created" : ISODate("2023-02-16T22:57:40.966+0000"),
    "version_id" : "07c373b6-d747-49d1-b9f3-7a75be83e8d3",
    "version_num" : NumberInt(7),
    "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
    "folder_id" : null,
    "views" : NumberInt(0),
    "downloads" : NumberInt(14),
    "bytes" : NumberInt(226404),
    "content_type" : {
        "content_type" : "image/png",
        "main_type" : "image"
    },
    "auth" : [
        {
            "_id" : ObjectId("63cab262be029fb68ed002c4"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:25:22.470+0000"),
            "modified" : ISODate("2023-01-20T15:25:22.470+0000"),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "user_id" : "[email protected]",
            "role" : "owner"
        }
    ]
}

folder

{
    "_id" : ObjectId("63e66ed1350371efe874eb37"),
    "name" : "test",
    "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
    "parent_folder" : null,
    "author" : {
        "id" : ObjectId("63a32586cdf079dcd6077a1a"),
        "email" : "[email protected]",
        "first_name" : "Chen",
        "last_name" : "Wang"
    },
    "created" : ISODate("2023-02-10T16:20:33.712+0000"),
    "modified" : ISODate("2023-02-10T16:20:33.712+0000"),
    "auth" : [
        {
            "_id" : ObjectId("63cab262be029fb68ed002c4"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:25:22.470+0000"),
            "modified" : ISODate("2023-01-20T15:25:22.470+0000"),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "user_id" : "[email protected]",
            "role" : "owner"
        }
    ]
}

metadata

{
    "_id" : ObjectId("63f503aeb427ca74dfbf9441"),
    "context" : {

    },
    "context_url" : "https://clowder.ncsa.illinois.edu/contexts/metadata.jsonld",
    "definition" : null,
    "content" : {
        "lines" : "482",
        "words" : "2128",
        "characters" : "14451"
    },
    "resource" : {
        "collection" : "files",
        "resource_id" : ObjectId("63efe87d6d2f826653fa65fe"),
        "version" : NumberInt(1)
    },
    "agent" : {
        "id" : ObjectId("63f503aeb427ca74dfbf9444"),
        "creator" : {
            "id" : ObjectId("63a32586cdf079dcd6077a1a"),
            "email" : "[email protected]",
            "first_name" : "Chen",
            "last_name" : "Wang"
        },
        "listener" : {
            "id" : ObjectId("63f3ac04ea5ae920b0babaff"),
            "author" : "Rob Kooper <[email protected]>",
            "name" : "ncsa.wordcount",
            "version" : "2.0",
            "description" : "WordCount extractor. Counts the number of characters, words and lines in the text file that was uploaded.",
            "creator" : null,
            "created" : ISODate("2023-02-20T17:21:08.857+0000"),
            "modified" : ISODate("2023-02-20T17:21:08.857+0000"),
            "properties" : {
                "author" : "Rob Kooper <[email protected]>",
                "process" : {
                    "file" : [
                        "text/*",
                        "application/json"
                    ]
                },
                "maturity" : "Development",
                "name" : "ncsa.wordcount",
                "contributors" : [

                ],
                "contexts" : [
                    {
                        "lines" : "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#lines",
                        "words" : "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#words",
                        "characters" : "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#characters"
                    }
                ],
                "repository" : [
                    {
                        "id" : ObjectId("63f3ac04ea5ae920b0babafe"),
                        "repository_type" : "git",
                        "repository_url" : ""
                    }
                ],
                "external_services" : [

                ],
                "libraries" : [

                ],
                "bibtex" : [

                ],
                "default_labels" : [

                ],
                "categories" : [

                ],
                "parameters" : null,
                "version" : "2.0"
            }
        }
    },
    "created" : ISODate("2023-02-21T17:47:26.195+0000"),
    "file_details" : [
        {
            "_id" : ObjectId("63efe87d6d2f826653fa65fe"),
            "name" : "ChangeLog.txt",
            "creator" : {
                "id" : ObjectId("63a32586cdf079dcd6077a1a"),
                "email" : "[email protected]",
                "first_name" : "Chen",
                "last_name" : "Wang"
            },
            "created" : ISODate("2023-02-17T20:50:05.371+0000"),
            "version_id" : "75f9f740-283b-4d29-9335-3cff8ae987e5",
            "version_num" : NumberInt(1),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "folder_id" : null,
            "views" : NumberInt(0),
            "downloads" : NumberInt(2),
            "bytes" : NumberInt(14451),
            "content_type" : {
                "content_type" : "text/plain",
                "main_type" : "text"
            }
        }
    ],
    "auth" : [
        {
            "_id" : ObjectId("63cab262be029fb68ed002c4"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:25:22.470+0000"),
            "modified" : ISODate("2023-01-20T15:25:22.470+0000"),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "user_id" : "[email protected]",
            "role" : "owner"
        }
    ]
}

listener_jobs

{
    "_id" : ObjectId("63d146bc79a3d39c71d0e0b1"),
    "listener_id" : "ncsa.wordcount",
    "resource_ref" : {
        "collection" : "file",
        "resource_id" : ObjectId("63d140925cda10061fcd6768"),
        "version" : 1.0
    },
    "creator" : {
        "id" : ObjectId("63a32586cdf079dcd6077a1a"),
        "email" : "[email protected]",
        "first_name" : "Chen",
        "last_name" : "Wang"
    },
    "parameters" : {

    },
    "created" : ISODate("2023-01-25T15:11:56.615+0000"),
    "started" : null,
    "updated" : ISODate("2023-01-25T14:11:56.000+0000"),
    "finished" : null,
    "duration" : 500.0,
    "latest_message" : "StatusMessage.done: Done processing.",
    "status" : "StatusMessage.done: Done processing.",
    "file_details" : [

    ],
    "auth" : [

    ]
}

listener_job_update

{
    "_id" : ObjectId("63d146bdeac2ff40a3c8c31e"),
    "job_id" : ObjectId("63f6f73302e05c3c0823098e"),
    "timestamp" : ISODate("2023-01-25T14:11:56.000+0000"),
    "status" : "StatusMessage.processing: Downloading file.",
    "listener_job_details" : [
        {
            "_id" : ObjectId("63f6f73302e05c3c0823098e"),
            "listener_id" : "ncsa.wordcount",
            "resource_ref" : {
                "collection" : "dataset",
                "resource_id" : ObjectId("63e50a0fb6a10a46379335f2"),
                "version" : null
            },
            "creator" : {
                "id" : ObjectId("63a32586cdf079dcd6077a1a"),
                "email" : "[email protected]",
                "first_name" : "Chen",
                "last_name" : "Wang"
            },
            "parameters" : {

            },
            "created" : ISODate("2023-02-23T05:18:43.047+0000"),
            "started" : null,
            "updated" : null,
            "finished" : null,
            "duration" : null,
            "latest_message" : null,
            "status" : "CREATED"
        }
    ],
    "auth" : [
        {
            "_id" : ObjectId("63cab262be029fb68ed002c4"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:25:22.470+0000"),
            "modified" : ISODate("2023-01-20T15:25:22.470+0000"),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "user_id" : "[email protected]",
            "role" : "owner"
        }
    ]
}

@longshuicy longshuicy linked an issue Feb 23, 2023 that may be closed by this pull request
@lmarini

lmarini commented Feb 24, 2023

Should we have a script that runs when the database is first created? Are these operations idempotent, so that we can run the script every time we start up the services? Or do we have to worry about doing it only the first time?

@longshuicy

> Should we have a script that runs when the database is first created? Are these operations idempotent, so that we can run the script every time we start up the services? Or do we have to worry about doing it only the first time?

I think we need to import these when the database is first created. I need to look into a more programmatic way to import those scripts. But once the views are there, we don't need to worry about them when we start up the services.
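
On the idempotency question, one possible approach is to guard each creation with an existence check, so the same script can run on every startup. This is only a sketch: `ensureView` is a hypothetical helper, not something in this PR, and it assumes `db` is the mongosh database handle. It skips a view that already exists rather than updating its definition.

```javascript
// Hypothetical helper: create the view only if no collection or view
// with that name exists yet. Returns true when the view was created.
function ensureView(db, name, source, pipeline) {
  // getCollectionNames() lists views alongside regular collections.
  if (db.getCollectionNames().includes(name)) {
    return false; // already present, leave it untouched
  }
  db.createView(name, source, pipeline);
  return true;
}

// In mongosh:
// ensureView(db, "datasets_view", "datasets", [
//   { $lookup: { from: "authorization", localField: "_id",
//                foreignField: "dataset_id", as: "auth" } },
// ]);
```

If a view definition changes later, the script would need to drop and recreate it instead of skipping it.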

@tcnichol

I am going to mark this one approved. I was able to generate the views in Studio 3T and these also work with list access for the other pull request.

@tcnichol tcnichol left a comment

I was able to generate the views, and once I did that I could filter access on the other pull request.
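
For context, filtering access against one of these views could look roughly like the sketch below. This is a guess at the shape, not the code from the other pull request: `accessFilter` is a hypothetical name, and the field paths ("author.email", "auth.user_id") are taken from the dataset example document in this PR. Files use "creator" instead of "author", so the owner field would differ per view.

```javascript
// Hypothetical helper: match documents the user either owns or has an
// authorization entry for. Field names follow the dataset example.
function accessFilter(userEmail) {
  return {
    $or: [
      { "author.email": userEmail }, // the resource owner always has access
      { "auth.user_id": userEmail }, // any entry in the joined "auth" array
    ],
  };
}

// In mongosh (the address is a placeholder):
// db.datasets_view.find(accessFilter("someone@example.org"));
```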

@max-zilla

For reference, here is the equivalent mongo shell command without Studio 3T:
db.createView("datasets_view", "datasets", [{$lookup: {"from" : "authorization", "localField" : "_id", "foreignField" : "dataset_id", "as" : "auth"}}]);
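
The same pattern presumably extends to the other simple views, which join on their own "dataset_id" field instead of "_id". A sketch, assuming the view names "files_view" and "folders_view" and the source collections "files" and "folders" (only "datasets_view" appears in the command above):

```javascript
// View specs: which collection to read and which local field holds the
// dataset id. Names other than "datasets_view" are assumptions.
const viewSpecs = [
  { name: "datasets_view", source: "datasets", localField: "_id" },
  { name: "files_view", source: "files", localField: "dataset_id" },
  { name: "folders_view", source: "folders", localField: "dataset_id" },
];

// Build the single-stage pipeline that attaches matching authorization
// entries as an "auth" array.
function buildAuthPipeline(localField) {
  return [
    {
      $lookup: {
        from: "authorization",
        localField: localField,
        foreignField: "dataset_id",
        as: "auth",
      },
    },
  ];
}

// In mongosh:
// viewSpecs.forEach((s) =>
//   db.createView(s.name, s.source, buildAuthPipeline(s.localField)));
```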

@max-zilla max-zilla merged commit 934c8b9 into main Mar 2, 2023
@max-zilla max-zilla deleted the 342-design-mongodb-view-for-authorization branch March 2, 2023 14:20
longshuicy added a commit that referenced this pull request Mar 2, 2023
* add a readme

* add listener job update view

* temp

* add init

* add to production docker-compose as well

* 343 list resources that user has access (#355)

* add logic to filter user

* add filters to file

* list files working too

* add same filter for folder

* include executions

* exempt the current resource owner

* add test database

* rewrite createView part

* rewrite init script but it's still not working yet

* fix typo

* fix linting
max-zilla added a commit that referenced this pull request Mar 6, 2023
* Added group_id to authorization

* Added list of group_ids instead of one group_id per entry

* Addressing comments

* fixing lint failure

* Added new component for file actions (#352)

* Added new component for file actions

* Tested file download and file delete from File page

* Verified after deleting file, page navigates back to main dataset page

* Fixed version-not-updating error

* Updated onSave sequence to first synchronously update the file and then subsequently call listVersions API to display on the frontend

* Fixed lint issues by running eslint

* UI for displaying logs on extractors (#317)

* Static UI for displaying logs on extractors

* Created 2 React custom components:
* ExtractorStatus -- UI table to display summary of job/progress so far
* ExtractorLogs -- terminal-like UI to display logs in real-time as they are being fetched

* [WIP] Enable extractor job id retrieval from backend

* Updated redux store value when job id is returned

* Added actions for job update, job summary API endpoints

* [WIP DO NOT MERGE] Testing frontend functionality with dynamic log rendering

* Updated API endpoints, return payload, reducers and actions (#330)

* Updated API endpoints, return payload, reducers and actions for the extractor jobs

* Tested API calls to fetchJobSummary, fetchJobUpdates from frontend, able to receive extractor job data

* Tested submit job process flow, able to retrieve extractor job id from the backend

* Removed redundant code

---------

Co-authored-by: Chen Wang <[email protected]>

* Integrated UI with backend

* Tested with sample extractor logs, able to view both extractor status and summary from the UI

* Removed hardcoding of job id

* Fixed indentation

* Adding download button on UI to download correct version file (#334)

* Adding download button on UI to download correct version file

* fix filename issue

* Update FileVersion.ts

---------

Co-authored-by: Chen Wang <[email protected]>

* fix member typing (#336)

* fix member typing

* adjust the test so it can pass

* fixing error of blank page on submit file to extractor (#338)

files.py - renaming resubmit method
codegen
fixing error in listeners.py

* Modified extractor job summary duration calculation

* Moving fetchJobSummary, fetchJobSummary API calls to ExtractorStatus component to fix interval issue

---------

Co-authored-by: Chen Wang <[email protected]>
Co-authored-by: Dipannita <[email protected]>
Co-authored-by: Todd Nicholson <[email protected]>

* Properly interpret extractor statuses (#356)

* various fixes to message listener

* black formatting

* Millisecond (#360)

* it's seconds

* add more icons

* default to open tab

---------

Co-authored-by: Aruna Parameswaran <[email protected]>
Co-authored-by: Chen Wang <[email protected]>
Co-authored-by: Dipannita <[email protected]>
Co-authored-by: Todd Nicholson <[email protected]>

* Context matches v1 type. (#328)

* context is now changed to list union of AnyUrl or dict. This should make v2 compatible with some extractors from extractors-core which dynamically build the context.

* formatting
fix test

* formatting

* still getting pytest errors
changed context from optional[list] to list with default value empty, but still is not fixing the issues.

* should fix the tests

* formatting

* getting rid of class 'context element' not needed

* getting rid of class 'context element' not needed

* removing console log

---------

Co-authored-by: Chen Wang <[email protected]>

* Mongo views (#353)

* Replaced EmbeddedSearch with custom search box component to fix session refresh issue (#365)

* Tested dataset search by clicking button and pressing enter, able to view search results

* match user_ids which is a list now (#368)

* update to lookup query

* update authorization_deps queries

* simplify group logic in auth check

* formatting

* Formatting

---------

Co-authored-by: Aruna Parameswaran <[email protected]>
Co-authored-by: Chen Wang <[email protected]>
Co-authored-by: Todd Nicholson <[email protected]>
Co-authored-by: Max Burnette <[email protected]>
Development

Successfully merging this pull request may close these issues.

Design mongodb view for authorization

5 participants