Python: Add support for threat models by RasmusWL · Pull Request #17203 · github/codeql

RasmusWL · 2024-08-12T13:58:38Z

No description provided.

python/ql/lib/semmle/python/frameworks/data/ModelsAsData.qll

Naming in other languages:

- `SourceNode` (for QL only modeling)
- `ThreatModelFlowSource` (for active sources from QL or data-extensions)

However, since we use `LocalSourceNode` in Python, and `SourceNode` in JS (for local source nodes), it seems a bit confusing to follow the same naming convention as other languages, so I came up with new names instead.

Without this, it's impossible to write a test showing which threat-models are active by default... unless I provide a hardcoded list in the test itself, which is no fun.

I didn't want to put the configuration file in `semmle/python/frameworks/**/*.model.yml`, so I created `ext/`, as in other languages.

asgerf

Looks great! I mainly have a comment about the class names; otherwise the structure LGTM.

asgerf · 2024-08-20T09:50:56Z

python/ql/lib/semmle/python/Concepts.qll

+ * Extend this class to refine existing API models. If you want to model new APIs,
+ * extend `ThreatModelSource::Range` instead.
+ */
+class ThreatModelSource extends DataFlow::Node instanceof ThreatModelSource::Range {


Might I suggest the name FlowSource for this class? It seems consistent with C++ and Swift at least, and it works nicely with RemoteFlowSource being a special case of it.

Then we could rename ActiveThreatModelSource to ThreatModelFlowSource to be consistent with other languages. I do agree that the "active" prefix makes sense, but given that this will be the new go-to thing to use in isSource() predicates it seems that we really do want consistency for that class name.

RasmusWL · 2024-09-10

I've persuaded @michaelnebel to be in favor of ActiveThreatModelSource, so I've just filed a PR to make the existing languages use that as well 👍 (#17424)

FlowSource

Regarding renaming to FlowSource, I've tried to do that here: RasmusWL@dec5daa

I'm slightly hesitant to accept it. I can't quite put my finger on it, but I think it's because the name is so generic that it could be used to capture the set of sources for any data-flow/taint-tracking configuration, no matter the logic, whereas ThreatModelSource seems to convey a more specific meaning to me.

I realize that your suggestion probably fits pretty well with current naming in C#/Java/Go with SourceNode and C++/Swift with FlowSource 🤔 Maybe I'll see if anyone comes up with a convincing argument during the next round of review; otherwise it looks like I should just disagree-and-commit.

asgerf · 2024-09-11

Whatever we do, we ought to treat the naming of the two classes as a single decision, not something where we try to make a decision for each class in isolation.

The proposed rename to FlowSource made sense when the other class would be called ThreatModelFlowSource, but it doesn't work so nicely with ActiveThreatModelSource. I'd prefer the combination ThreatModelSource/ActiveThreatModelSource over FlowSource/ActiveThreatModelSource.

RasmusWL · 2024-09-16

Alright, I'm feeling strongly in favor of ActiveThreatModelSource, so it seems like I won't add the FlowSource commit 👍

python/ql/lib/semmle/python/security/dataflow/CodeInjectionCustomizations.qll

python/ql/test/library-tests/frameworks/stdlib/threat_models.py

Since using `.DictionaryElementAny` doesn't actually do a store on the source (so we can later follow any dict read-steps), I added the `ensure_tainted` steps to highlight that the result of the WHOLE expression ends up "tainted", and that we don't just mark `os.environ` as the source without further flow.
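As a rough illustration of the comment above, here is a minimal sketch of what such annotated reads could look like. It is not the contents of `threat_models.py`: the `ensure_tainted` stub and the `read_environment` helper are hypothetical, added only so the snippet can also be executed, and which expressions the real test covers is an assumption.

```python
import os


def ensure_tainted(*values):
    """Hypothetical runtime stand-in for the analysis-only test marker.

    In a CodeQL test file the call is only inspected by the test query;
    here it simply returns its arguments so the sketch can be run.
    """
    return values


def read_environment(name):
    # Each argument is a WHOLE expression that reads from the environment.
    # The annotation is meant to show that the value produced by every
    # expression is tainted, not merely the `os.environ` object itself.
    return ensure_tainted(
        os.environ[name],      # subscript read
        os.environ.get(name),  # method read
        os.getenv(name),       # module-level helper
    )
```

Wrapping the full read expression, rather than `os.environ` alone, is what lets a test distinguish "the mapping is a source" from "values read out of the mapping are tainted".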

@michaelnebel

As part of adding support for threat-models to Python/JS (see github#17203), we ran into some trouble with name clashes.

Naming in existing languages supporting threat-models:

- `SourceNode` (for QL only modeling)
- `ThreatModelFlowSource` (for active sources from QL or data-extensions)

However, since we use `LocalSourceNode` in Python, and `SourceNode` in JS (for local source nodes), it seems a bit confusing to follow the same naming convention as other languages, and we had to come up with new names.

Initially I used `ThreatModelSource` for the "QL only modeling", but that meant that we needed a new name to represent the active sources coming from either QL or data-extensions... for this I came up with `ActiveThreatModelSource`, and I really liked it. To me, it's much clearer that this class only contains the currently active threat model sources.

So to align languages, I got approval from @michaelnebel to rename the existing classes.
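The distinction between the two class names can be pictured with a toy model. This is only a conceptual sketch in Python, not how the QL classes are implemented: `ModeledSource`, `ALL_SOURCES`, and `active_sources` are hypothetical names, and the threat-model labels used are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeledSource:
    """Toy counterpart of `ThreatModelSource`: a source tagged with its threat model."""

    description: str
    threat_model: str


# Hypothetical catalogue of modeled sources (entries are illustrative only).
ALL_SOURCES = [
    ModeledSource("HTTP request parameter", "remote"),
    ModeledSource("os.environ lookup", "environment"),
    ModeledSource("sys.argv element", "commandargs"),
    ModeledSource("sys.stdin read", "stdin"),
]


def active_sources(sources, active_threat_models):
    """Toy counterpart of `ActiveThreatModelSource`: keep only enabled sources."""
    return [s for s in sources if s.threat_model in active_threat_models]
```

With only `{"remote"}` enabled, `active_sources(ALL_SOURCES, {"remote"})` keeps just the HTTP entry; enabling more threat models widens the set without changing how any individual source is modeled.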

tausbn

Just one minor comment, otherwise I think this looks really good! (I'm approving it now in case the minor comment is irrelevant.)

python/ql/lib/semmle/python/Concepts.qll

Co-authored-by: Taus <tausbn@github.com>

tausbn

github-actions bot added the Python label Aug 12, 2024

github-advanced-security bot found potential problems Aug 12, 2024

python/ql/lib/semmle/python/frameworks/data/ModelsAsData.qll (fixed)

RasmusWL force-pushed the threat-models branch from 1e35fd4 to 8747fa4 August 16, 2024 09:11

github-actions bot added the documentation label Aug 16, 2024

RasmusWL force-pushed the threat-models branch 2 times, most recently from b145618 to 2117d1f August 16, 2024 11:42

RasmusWL added 4 commits August 19, 2024 10:54

ThreatModels: Expose knownThreatModel

766dcc4

Python: Add test showing default active threat-models

617ab27

Python: Remove 'response' from default threat-models

8f7dec0

RasmusWL force-pushed the threat-models branch 2 times, most recently from e1b2ae4 to 93b5060 August 19, 2024 12:51

RasmusWL mentioned this pull request Aug 19, 2024

JS: Add support for threat models #17256

Merged

RasmusWL marked this pull request as ready for review August 20, 2024 08:57

RasmusWL requested a review from a team as a code owner August 20, 2024 08:57

asgerf reviewed Aug 20, 2024

RasmusWL added 14 commits September 10, 2024 14:32

Python: Make queries use ActiveThreatModelSource

528f08f

Python: Add basic support for environment/commandargs threat-models

b9239d7

Python: Proper threat-model handling for argparse

e1801f3

Python: Model stdin thread-model

66f389a

Python: Model file threat-model

d245db5

Python: Fixup modeling of os.open

7483075

Python: Add basic support for database threat-model

8d8cd05

Python: Add e2e threat-model test

a0b24d6

Python: Add change-note

0ccb5b1

Docs: Update threat-model list to include Python

7d3793e

Python: Add threat-modeling of raw_input

333367c

Python: Additional threatModelSource annotations

cbebf7b

Python: Add links to threat-model docs

5ff7b65

RasmusWL mentioned this pull request Sep 10, 2024

Go/Java/C#: Rename ThreatModelFlowSource to ActiveThreatModelSource #17424

Merged

Docs: Include 'Threat models' for Python

e35c2b2

RasmusWL force-pushed the threat-models branch from 93b5060 to e35c2b2 September 10, 2024 14:45

Docs: Fix link

e11bfc2

tausbn previously approved these changes Sep 17, 2024

python/ql/lib/semmle/python/Concepts.qll (outdated, resolved)

RasmusWL and others added 2 commits September 23, 2024 11:19

Merge branch 'main' into threat-models

4a21a85

Python: Minor simplification of ActiveThreatModelSource

535db98

Co-authored-by: Taus <tausbn@github.com>

RasmusWL dismissed tausbn’s stale review via 535db98 September 23, 2024 09:22

tausbn previously approved these changes Sep 24, 2024

Merge branch 'main' into threat-models

431a1af

RasmusWL dismissed tausbn’s stale review via 431a1af September 26, 2024 09:44

RasmusWL requested a review from tausbn September 26, 2024 10:25

tausbn approved these changes Sep 26, 2024

RasmusWL merged commit 7c32efc into github:main Sep 26, 2024

RasmusWL deleted the threat-models branch September 26, 2024 11:15

felickz mentioned this pull request Oct 24, 2024

End of life the local packs with new configurable threat-models setting in CodeQL GitHubSecurityLab/CodeQL-Community-Packs#69

Open

RasmusWL merged 23 commits into github:main from RasmusWL:threat-models

RasmusWL commented Aug 12, 2024

asgerf Aug 20, 2024

RasmusWL Sep 10, 2024

asgerf Sep 11, 2024

RasmusWL Sep 16, 2024

3 participants
