Mongo views #353

longshuicy · 2023-02-23T20:41:23Z

In this PR I created views for all the major resources we need to list. I included a README on how to import them using Studio 3T (see https://github.com/clowder-framework/clowder2/blob/e6cc94251d05cb7d1df29018817296168c97cdbc/scripts/mongoviews/README.md).

For most of the resources I just join the resource's dataset id ("_id" for datasets, "dataset_id" for files and folders) with the "dataset_id" field in the "authorization" collection.
Metadata and listeners are trickier because they can be attached to either a file or a dataset, so I first use a "facet" (acting as an if statement) to check whether the resource type is "file" or "dataset".
If it is a dataset, just join. If it is a file, make an extra hop to the "files" collection first, then join with "authorization".
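
To make the two cases above concrete, here is a rough sketch of what such pipelines could look like. It is not the pipeline shipped in this PR: the function names and facet branch names are made up for illustration, and the "datasets" value for "resource.collection" is an assumption (the sample documents below only show "files" for metadata). The field names ("resource.resource_id", "file_details", "auth") follow the sample output.

```javascript
// Simple case: join a resource with "authorization" on its dataset id.
// localField is "_id" for datasets and "dataset_id" for files/folders.
function simpleAuthJoin(localField) {
  return [
    {
      $lookup: {
        from: "authorization",
        localField: localField,
        foreignField: "dataset_id",
        as: "auth",
      },
    },
  ];
}

// Metadata case: branch on the resource type with $facet, then merge the
// two branches back into a single stream of documents.
function metadataViewPipeline() {
  return [
    {
      $facet: {
        // Assumption: metadata on a dataset stores "datasets" here.
        onDataset: [
          { $match: { "resource.collection": "datasets" } },
          {
            $lookup: {
              from: "authorization",
              localField: "resource.resource_id",
              foreignField: "dataset_id",
              as: "auth",
            },
          },
        ],
        // Metadata on a file: look up the file first to get its dataset_id.
        onFile: [
          { $match: { "resource.collection": "files" } },
          {
            $lookup: {
              from: "files",
              localField: "resource.resource_id",
              foreignField: "_id",
              as: "file_details",
            },
          },
          {
            $lookup: {
              from: "authorization",
              localField: "file_details.dataset_id",
              foreignField: "dataset_id",
              as: "auth",
            },
          },
        ],
      },
    },
    // $facet emits one document holding both arrays; flatten it again.
    { $project: { all: { $concatArrays: ["$onDataset", "$onFile"] } } },
    { $unwind: "$all" },
    { $replaceRoot: { newRoot: "$all" } },
  ];
}
```

One caveat with this shape: $facet collects each branch into an array inside a single document, so it is bounded by the 16 MB document limit on large collections.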

Some example documents from the views:

dataset

{
    "_id" : ObjectId("63e3ef7ab6a10a463793346b"),
    "name" : "test",
    "description" : "test",
    "author" : {
        "id" : ObjectId("63a32586cdf079dcd6077a1a"),
        "email" : "[email protected]",
        "first_name" : "Chen",
        "last_name" : "Wang"
    },
    "created" : ISODate("2023-02-08T18:52:42.597+0000"),
    "modified" : ISODate("2023-02-08T18:52:42.597+0000"),
    "status" : "PRIVATE",
    "views" : NumberInt(0),
    "downloads" : NumberInt(0),
    "auth" : [
        {
            "_id" : ObjectId("63f7919071578766527ad69c"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:26:20.496+0000"),
            "modified" : ISODate("2023-01-20T15:26:20.496+0000"),
            "dataset_id" : ObjectId("63e3ef7ab6a10a463793346b"),
            "user_id" : "[email protected]",
            "role" : "viewer"
        },
        {
            "_id" : ObjectId("63f7a44071578766527ad6cd"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:26:20.496+0000"),
            "dataset_id" : ObjectId("63e3ef7ab6a10a463793346b"),
            "modified" : ISODate("2023-01-20T15:26:20.496+0000"),
            "role" : "viewer",
            "user_id" : "[email protected]"
        }
    ]
}

file

{
    "_id" : ObjectId("63ee5eb09940b21e3765d610"),
    "name" : "HHS (2).png",
    "creator" : {
        "id" : ObjectId("63a32586cdf079dcd6077a1a"),
        "email" : "[email protected]",
        "first_name" : "Chen",
        "last_name" : "Wang"
    },
    "created" : ISODate("2023-02-16T22:57:40.966+0000"),
    "version_id" : "07c373b6-d747-49d1-b9f3-7a75be83e8d3",
    "version_num" : NumberInt(7),
    "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
    "folder_id" : null,
    "views" : NumberInt(0),
    "downloads" : NumberInt(14),
    "bytes" : NumberInt(226404),
    "content_type" : {
        "content_type" : "image/png",
        "main_type" : "image"
    },
    "auth" : [
        {
            "_id" : ObjectId("63cab262be029fb68ed002c4"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:25:22.470+0000"),
            "modified" : ISODate("2023-01-20T15:25:22.470+0000"),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "user_id" : "[email protected]",
            "role" : "owner"
        }
    ]
}

folder

{
    "_id" : ObjectId("63e66ed1350371efe874eb37"),
    "name" : "test",
    "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
    "parent_folder" : null,
    "author" : {
        "id" : ObjectId("63a32586cdf079dcd6077a1a"),
        "email" : "[email protected]",
        "first_name" : "Chen",
        "last_name" : "Wang"
    },
    "created" : ISODate("2023-02-10T16:20:33.712+0000"),
    "modified" : ISODate("2023-02-10T16:20:33.712+0000"),
    "auth" : [
        {
            "_id" : ObjectId("63cab262be029fb68ed002c4"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:25:22.470+0000"),
            "modified" : ISODate("2023-01-20T15:25:22.470+0000"),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "user_id" : "[email protected]",
            "role" : "owner"
        }
    ]
}

metadata

{
    "_id" : ObjectId("63f503aeb427ca74dfbf9441"),
    "context" : {

    },
    "context_url" : "https://clowder.ncsa.illinois.edu/contexts/metadata.jsonld",
    "definition" : null,
    "content" : {
        "lines" : "482",
        "words" : "2128",
        "characters" : "14451"
    },
    "resource" : {
        "collection" : "files",
        "resource_id" : ObjectId("63efe87d6d2f826653fa65fe"),
        "version" : NumberInt(1)
    },
    "agent" : {
        "id" : ObjectId("63f503aeb427ca74dfbf9444"),
        "creator" : {
            "id" : ObjectId("63a32586cdf079dcd6077a1a"),
            "email" : "[email protected]",
            "first_name" : "Chen",
            "last_name" : "Wang"
        },
        "listener" : {
            "id" : ObjectId("63f3ac04ea5ae920b0babaff"),
            "author" : "Rob Kooper <[email protected]>",
            "name" : "ncsa.wordcount",
            "version" : "2.0",
            "description" : "WordCount extractor. Counts the number of characters, words and lines in the text file that was uploaded.",
            "creator" : null,
            "created" : ISODate("2023-02-20T17:21:08.857+0000"),
            "modified" : ISODate("2023-02-20T17:21:08.857+0000"),
            "properties" : {
                "author" : "Rob Kooper <[email protected]>",
                "process" : {
                    "file" : [
                        "text/*",
                        "application/json"
                    ]
                },
                "maturity" : "Development",
                "name" : "ncsa.wordcount",
                "contributors" : [

                ],
                "contexts" : [
                    {
                        "lines" : "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#lines",
                        "words" : "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#words",
                        "characters" : "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#characters"
                    }
                ],
                "repository" : [
                    {
                        "id" : ObjectId("63f3ac04ea5ae920b0babafe"),
                        "repository_type" : "git",
                        "repository_url" : ""
                    }
                ],
                "external_services" : [

                ],
                "libraries" : [

                ],
                "bibtex" : [

                ],
                "default_labels" : [

                ],
                "categories" : [

                ],
                "parameters" : null,
                "version" : "2.0"
            }
        }
    },
    "created" : ISODate("2023-02-21T17:47:26.195+0000"),
    "file_details" : [
        {
            "_id" : ObjectId("63efe87d6d2f826653fa65fe"),
            "name" : "ChangeLog.txt",
            "creator" : {
                "id" : ObjectId("63a32586cdf079dcd6077a1a"),
                "email" : "[email protected]",
                "first_name" : "Chen",
                "last_name" : "Wang"
            },
            "created" : ISODate("2023-02-17T20:50:05.371+0000"),
            "version_id" : "75f9f740-283b-4d29-9335-3cff8ae987e5",
            "version_num" : NumberInt(1),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "folder_id" : null,
            "views" : NumberInt(0),
            "downloads" : NumberInt(2),
            "bytes" : NumberInt(14451),
            "content_type" : {
                "content_type" : "text/plain",
                "main_type" : "text"
            }
        }
    ],
    "auth" : [
        {
            "_id" : ObjectId("63cab262be029fb68ed002c4"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:25:22.470+0000"),
            "modified" : ISODate("2023-01-20T15:25:22.470+0000"),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "user_id" : "[email protected]",
            "role" : "owner"
        }
    ]
}

listener_jobs

{
    "_id" : ObjectId("63d146bc79a3d39c71d0e0b1"),
    "listener_id" : "ncsa.wordcount",
    "resource_ref" : {
        "collection" : "file",
        "resource_id" : ObjectId("63d140925cda10061fcd6768"),
        "version" : 1.0
    },
    "creator" : {
        "id" : ObjectId("63a32586cdf079dcd6077a1a"),
        "email" : "[email protected]",
        "first_name" : "Chen",
        "last_name" : "Wang"
    },
    "parameters" : {

    },
    "created" : ISODate("2023-01-25T15:11:56.615+0000"),
    "started" : null,
    "updated" : ISODate("2023-01-25T14:11:56.000+0000"),
    "finished" : null,
    "duration" : 500.0,
    "latest_message" : "StatusMessage.done: Done processing.",
    "status" : "StatusMessage.done: Done processing.",
    "file_details" : [

    ],
    "auth" : [

    ]
}

listener_job_update

{
    "_id" : ObjectId("63d146bdeac2ff40a3c8c31e"),
    "job_id" : ObjectId("63f6f73302e05c3c0823098e"),
    "timestamp" : ISODate("2023-01-25T14:11:56.000+0000"),
    "status" : "StatusMessage.processing: Downloading file.",
    "listener_job_details" : [
        {
            "_id" : ObjectId("63f6f73302e05c3c0823098e"),
            "listener_id" : "ncsa.wordcount",
            "resource_ref" : {
                "collection" : "dataset",
                "resource_id" : ObjectId("63e50a0fb6a10a46379335f2"),
                "version" : null
            },
            "creator" : {
                "id" : ObjectId("63a32586cdf079dcd6077a1a"),
                "email" : "[email protected]",
                "first_name" : "Chen",
                "last_name" : "Wang"
            },
            "parameters" : {

            },
            "created" : ISODate("2023-02-23T05:18:43.047+0000"),
            "started" : null,
            "updated" : null,
            "finished" : null,
            "duration" : null,
            "latest_message" : null,
            "status" : "CREATED"
        }
    ],
    "auth" : [
        {
            "_id" : ObjectId("63cab262be029fb68ed002c4"),
            "creator" : "[email protected]",
            "created" : ISODate("2023-01-20T15:25:22.470+0000"),
            "modified" : ISODate("2023-01-20T15:25:22.470+0000"),
            "dataset_id" : ObjectId("63e50a0fb6a10a46379335f2"),
            "user_id" : "[email protected]",
            "role" : "owner"
        }
    ]
}

lmarini · 2023-02-24T16:13:37Z

Should we have a script that runs when the database is first created? Are these operations idempotent, so that we can run the script every time we start up the services? Or do we have to worry about doing it only the first time?

longshuicy · 2023-02-24T17:05:50Z

> Should we have a script that runs when the database is first created? Are these operations idempotent, so that we can run the script every time we start up the services? Or do we have to worry about doing it only the first time?

I think we need to import these when the database is first created. I need to look into a more programmatic way to import those scripts. But once the views are there, we don't need to worry about them when we start up the services.
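
On the idempotency question: calling createView for a name that already exists fails, so a script that runs on every startup would need a guard. Below is a minimal sketch of one, assuming it runs in the mongo shell where `db` is available; `ensureView` is a hypothetical helper, not something this PR adds.

```javascript
// Create the view only if nothing with that name exists yet.
// `db` is expected to behave like the mongo shell `db` object.
function ensureView(db, viewName, sourceCollection, pipeline) {
  const existing = db.getCollectionInfos({ name: viewName });
  if (existing.length > 0) {
    return false; // a collection or view with this name is already there
  }
  db.createView(viewName, sourceCollection, pipeline);
  return true;
}

// Example call from an init script (uncomment inside the mongo shell):
// ensureView(db, "datasets_view", "datasets", [
//   { $lookup: { from: "authorization", localField: "_id",
//                foreignField: "dataset_id", as: "auth" } },
// ]);
```

With a guard like this the script is safe to run repeatedly, but it will not pick up a changed pipeline; that would require dropping and recreating the view.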

tcnichol · 2023-02-27T21:53:40Z

I am going to mark this one approved. I was able to generate the views in Studio 3T and these also work with list access for the other pull request.

tcnichol

I was able to generate the views, and once I did that I could filter access in the other pull request.
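
For readers wondering what filtering on a view might look like, here is a hedged sketch of a query filter built from the fields visible in the sample documents ("auth.user_id", "auth.role", "author.email"). The helper name `accessFilter` and its parameters are assumptions for illustration; the actual filtering logic lives in the other pull request and is not shown in this thread.

```javascript
// Build a filter that matches documents the user owns or has an
// authorization entry for. ownerField differs per view in the samples:
// "author.email" for datasets and folders, "creator.email" for files.
function accessFilter(userEmail, ownerField = "author.email", roles = null) {
  const authMatch = { user_id: userEmail };
  if (roles !== null) {
    authMatch.role = { $in: roles }; // optionally restrict to given roles
  }
  return {
    $or: [
      { [ownerField]: userEmail },        // the resource owner is exempt
      { auth: { $elemMatch: authMatch } } // or an auth entry matches
    ],
  };
}

// Example in the mongo shell (the address is a placeholder):
// db.datasets_view.find(accessFilter("user@example.org"));
```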

max-zilla · 2023-03-02T14:18:56Z

For reference, here is the equivalent mongo shell command without Studio 3T:
db.createView("datasets_view", "datasets", [{$lookup: {"from" : "authorization", "localField" : "_id", "foreignField" : "dataset_id", "as" : "auth"}}]);

add a readme

e6cc942

longshuicy requested review from arunapa, ddey2, lmarini and tcnichol February 23, 2023 20:41

longshuicy requested a review from max-zilla as a code owner February 23, 2023 20:41

longshuicy linked an issue Feb 23, 2023 that may be closed by this pull request

Design mongodb view for authorization #342

Closed

add listener job update view

2636303

tcnichol approved these changes Feb 27, 2023

longshuicy added 9 commits February 28, 2023 13:28

temp

ff12565

add init

043004d

add to production docker-compose as well

c16f329

343 list resources that user has access (#355)

333af5c

* add logic to filter user
* add filters to file
* list files working too
* add same filter for folder
* include executions
* exempt the current resource owner

add test database

74e6e74

rewrite createView part

446ad06

rewrite init script but it's still not working yet

fde954f

fix typo

f225c1a

fix linting

714b69b

max-zilla approved these changes Mar 2, 2023

max-zilla merged commit 934c8b9 into main Mar 2, 2023

max-zilla deleted the 342-design-mongodb-view-for-authorization branch March 2, 2023 14:20
