⚠️ This repository is archived and no longer maintained. For current DataJoint development, see datajoint-python.
Otumat (pronounced "Automate") is a suite of maintainer tools and utilities for pip packages.
The following features are currently supported.
Have you ever wondered:
- How many users are actually using my Python package?
- How often are they using my Python package?
- Which features/methods are most used and which ones are less interesting to the community?
- Is there a better way to track and measure usage than the anonymous download data available from PyPI's download logs in Google BigQuery? See Analyzing PyPI Package Downloads for more details on this.
Since a reasonable solution could not be found, I introduced this feature that provides the mechanism and building blocks to have usage tracking data as granular as you need it.
There are a few prerequisites or assumptions:
- User clients will have at least some periodic internet connection to upload usage logs to a centralized, remote usage-data aggregating host
- 4 HTTP routes need to be implemented on your remote usage-data host:
  - GET: GUI-based authenticated route to register package installations with a user. You may use it to collect consent, have your user complete a survey, etc.
  - POST: authenticated API route to accept the form submission of the above GUI route. An installId should be returned along with other details to ensure an 'open' connection.
  - POST: authenticated API route that accepts the user's event data and stores it in a medium of your choice.
  - POST: standard OAuth 2.0 route that allows refreshing access_tokens and refresh_tokens (PKCE flow currently implemented).
Specific request/response details for the above 4 routes to follow soon.
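Until those details are published, the refresh route can at least be illustrated from the OAuth 2.0 standard itself. The following is a minimal sketch, assuming the route follows the refresh-token grant of RFC 6749 (section 6); the function name `build_refresh_request`, the `client_id` parameter, and the placeholder token values are hypothetical and not taken from otumat. The host and route reuse the values from the `UsageAgent` example in this README.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def build_refresh_request(host, refresh_route, refresh_token, client_id):
    """Build (but do not send) an OAuth 2.0 refresh-token request."""
    # Standard form-encoded body for the refresh-token grant (RFC 6749, section 6).
    body = urlencode({
        'grant_type': 'refresh_token',
        'refresh_token': refresh_token,
        'client_id': client_id,  # hypothetical: how a public client might identify itself
    }).encode('ascii')
    return Request(
        url=urljoin(host, refresh_route),
        data=body,
        headers={'Content-Type': 'application/x-www-form-urlencoded'},
        method='POST',
    )


request = build_refresh_request(
    host='https://datajoint.io',
    refresh_route='/auth/token',
    refresh_token='example-refresh-token',  # placeholder value
    client_id='example-client-id',          # placeholder value
)
```

The request is only constructed here, not sent, so the sketch can be inspected without a running server. What the server returns, and which extra fields otumat may send, are not covered by this sketch.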
Once your remote server is ready, simply add the following to your package:
- Include otumat as a requirements dependency.
- In your __init__.py, instantiate a UsageAgent that your package can refer to. For example:

      from otumat.usage import UsageAgent as _UsageAgent
      usage_agent = _UsageAgent(author='DataJoint',
                                data_directory='datajoint-python',
                                package_name=__name__,
                                host='https://datajoint.io',
                                install_route='/user/usage-install',
                                event_route='/api/usage-event',
                                refresh_route='/auth/token',
                                response_timeout=300,
                                upload_frequency='12h')

  Therefore, the first time your package is imported on the client's machine, it will trigger the usage tracking installation enrollment. Users will need to opt in, though; the default is not to collect usage data.
- Log any interesting event within your package using the instantiated UsageAgent. For example, we can log imports by including the following also in our __init__.py:

      usage_agent.log(event_type='import')

  Events will be buffered locally until the upload interval arrives. Caches are then unloaded. The daemon service runs cross-platform on Windows, macOS, and Linux and activates on startup.
Specific example of what an implemented flow looks like to follow soon.
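To make the buffering behaviour described above more concrete, here is a minimal sketch of a local event cache that is flushed once the upload interval has elapsed. It is an illustration under stated assumptions, not otumat's implementation: the `EventBuffer` class, the `parse_frequency` helper, the JSON-lines cache file, and the accepted frequency suffixes are all hypothetical.

```python
import json
import re
import time
from pathlib import Path

# Hypothetical suffixes for strings such as '12h'; otumat's own rules may differ.
_UNITS = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400}


def parse_frequency(text):
    """Convert a string such as '12h' into a number of seconds."""
    match = re.fullmatch(r'(\d+)([smhd])', text)
    if match is None:
        raise ValueError('unsupported frequency: {!r}'.format(text))
    return int(match.group(1)) * _UNITS[match.group(2)]


class EventBuffer:
    """Append events to a local file and hand them to `upload` once per interval."""

    def __init__(self, cache_path, upload_frequency, upload, clock=time.time):
        self.cache_path = Path(cache_path)
        self.interval = parse_frequency(upload_frequency)
        self.upload = upload      # callable that receives a list of event dicts
        self.clock = clock        # injectable so the interval can be simulated
        self.last_upload = clock()

    def log(self, event_type):
        """Buffer one event locally as a JSON line."""
        record = {'event_type': event_type, 'timestamp': self.clock()}
        with self.cache_path.open('a') as cache:
            cache.write(json.dumps(record) + '\n')

    def flush_if_due(self):
        """Upload and clear the cache if the interval elapsed; return events sent."""
        if self.clock() - self.last_upload < self.interval:
            return 0
        if not self.cache_path.exists():
            return 0
        lines = self.cache_path.read_text().splitlines()
        events = [json.loads(line) for line in lines]
        self.upload(events)
        # Only reached if the upload did not raise, so events are not lost on failure.
        self.cache_path.unlink()
        self.last_upload = self.clock()
        return len(events)
```

Keeping the cache on disk rather than in memory means events survive between interpreter sessions, which matters when uploads happen only every few hours.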
There are some cases where it is undesirable to have the usage tracking flow triggered. For instance, you may want to depend on a package (e.g. datajoint) that has the usage tracking flow enabled, but would rather not trigger it within your own package. In that case, do the following in your package's __init__.py before your first import from datajoint. It effectively disables usage tracking checks, flows, and prompts in your package:
import otumat as _otumat
_otumat.DISABLE_USAGE_TRACKING_PACKAGES = (['datajoint'] +
                                           _otumat.DISABLE_USAGE_TRACKING_PACKAGES)
# first import from package with usage tracking enabled
import datajoint

This feature allows you to run a given script whenever a selected file is modified.
To watch a file, install otumat using pip install otumat, then run the command:
otumat watch [-h] -f WATCH_FILE [-i WATCH_INTERVAL] -s WATCH_SCRIPT [watch_args ...]
Help:
-h, --help
Required named arguments:
-f WATCH_FILE: Path to file to be watched.
-s WATCH_SCRIPT: Path to script to run on file change.
Optional named arguments:
-i WATCH_INTERVAL: Interval in seconds between polls.
  - Defaults to 5 seconds.
watch_args: Arguments providing state between runs.
  - Defaults to no arguments.
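The polling behaviour implied by these options can be sketched in a few lines of Python. This is a minimal illustration, assuming that a change is detected by comparing the file's modification time between polls and that watch_args are simply passed to the script; the `FileWatcher` class is hypothetical and is not how otumat is known to implement the command.

```python
import os
import subprocess
import time


class FileWatcher:
    """Run a script whenever the watched file's modification time changes."""

    def __init__(self, watch_file, watch_script, watch_args=()):
        self.watch_file = watch_file
        self.command = [watch_script, *watch_args]
        # Remember the current timestamp so only later changes trigger the script.
        self.last_mtime = os.stat(watch_file).st_mtime_ns

    def poll(self):
        """Check the file once; run the script and return True if it changed."""
        mtime = os.stat(self.watch_file).st_mtime_ns
        if mtime == self.last_mtime:
            return False
        self.last_mtime = mtime
        subprocess.run(self.command, check=True)
        return True

    def run(self, watch_interval=5):
        """Poll forever, sleeping watch_interval seconds between checks."""
        while True:
            self.poll()
            time.sleep(watch_interval)
```

Separating `poll` from `run` keeps the change check usable on its own, for example from a scheduler that already provides the loop.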
This package also includes a setuptools extension that provides new keyword arguments privkey_path and pubkey_path.
By specifying privkey_path, setuptools will generate the git hash (SHA1) of the module directory and sign the output using the PEM key at the given path. The resulting signature will be stored as egg metadata {{module_name}}.sig, accessible via the pkg_resources module.
If passing pubkey_path, this will simply be copied in as egg metadata {{module_name}}.pub.
This provides a way to determine the trustworthiness of plugins or extensions that the community may develop for a given pip package, provided the public key file of the RSA keypair is available. What to do when verification fails is up to you.
# setup.py that ships a public key as egg metadata
setuptools.setup(
    ...
    setup_requires=['otumat'],
    pubkey_path='./pubkey.pem',
    ...
)

# setup.py that signs the module directory hash with a private key
setuptools.setup(
    ...
    setup_requires=['otumat'],
    privkey_path='~/keys/privkey.pem',
    ...
)

import pkg_resources
from pathlib import Path
from otumat import hash_pkg, verify
base_name = 'base'
plugin_name = 'plugin1'
base_meta = pkg_resources.get_distribution(base_name)
plugin_meta = pkg_resources.get_distribution(plugin_name)
# hash the installed plugin's module directory
data = hash_pkg(pkgpath=str(Path(plugin_meta.module_path, plugin_name)))
# signature stored as egg metadata of the plugin
signature = plugin_meta.get_metadata('{}.sig'.format(plugin_name))
# public key stored as egg metadata of the base package
pubkey_path = str(Path(base_meta.egg_info, '{}.pub'.format(base_name)))
verify(pubkey_path=pubkey_path, data=data, signature=signature)

For reference, signatures may also be generated and verified using git and openssl by the following process:
$ cd {{/path/to/local/repo/dir}}
$ git add . --all
$ GIT_HASH=$(git ls-files -s {{/pip/package/dir}} | git hash-object --stdin)
$ printf $GIT_HASH | openssl dgst -sha256 -sign {{/path/to/privkey/pem}} -out {{pip_package_name}}.sigbin -sigopt rsa_padding_mode:pss
$ openssl enc -base64 -in {{pip_package_name}}.sigbin -out {{pip_package_name}}.sig
$ rm {{pip_package_name}}.sigbin
$ git reset

$ cd {{/path/to/local/repo/dir}}
$ git add . --all
$ GIT_HASH=$(git ls-files -s {{/pip/package/dir}} | git hash-object --stdin)
$ openssl enc -base64 -d -in {{pip_package_name}}.sig -out {{pip_package_name}}.sigbin
$ printf $GIT_HASH | openssl dgst -sha256 -verify {{/path/to/pubkey/pem}} -signature {{pip_package_name}}.sigbin -sigopt rsa_padding_mode:pss
$ rm {{pip_package_name}}.sigbin
$ git reset
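The `git hash-object --stdin` step in the commands above hashes its input as a git blob, that is, the SHA-1 of a `blob <size>\0` header followed by the bytes. The sketch below reproduces that one step in Python so the value of GIT_HASH can be understood without git. The function name `git_blob_hash` is hypothetical, and whether otumat's `hash_pkg` computes its digest this way is an assumption, not something this README confirms.

```python
import hashlib


def git_blob_hash(data):
    """Return the SHA-1 hex digest git assigns to `data` when stored as a blob."""
    # git prefixes the content with 'blob <length in bytes>' and a NUL byte.
    header = 'blob {}\0'.format(len(data)).encode('ascii')
    return hashlib.sha1(header + data).hexdigest()


# The empty blob has a well-known id in every git repository.
empty_blob_id = git_blob_hash(b'')
print(empty_blob_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

In the shell process, the bytes being hashed are the output of `git ls-files -s` for the package directory, so the resulting hash changes whenever any staged file's mode, content id, or path changes.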