radiant_mlhub package

Subpackages

Submodules

radiant_mlhub.exceptions module

exception radiant_mlhub.exceptions.APIKeyNotFound[source]

Bases: radiant_mlhub.exceptions.MLHubException

Raised when an API key cannot be found using any of the strategies described in the Authentication docs.

exception radiant_mlhub.exceptions.AuthenticationError[source]

Bases: radiant_mlhub.exceptions.MLHubException

Raised when the Radiant MLHub API cannot authenticate the request, either because the API key is invalid or expired, or because no API key was provided in the request.

exception radiant_mlhub.exceptions.EntityDoesNotExist[source]

Bases: radiant_mlhub.exceptions.MLHubException

Raised when attempting to fetch a collection that does not exist in the Radiant MLHub API.

exception radiant_mlhub.exceptions.MLHubException[source]

Bases: Exception

Base exception class for all Radiant MLHub exceptions
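Because every exception listed above derives from MLHubException, one handler for the base class covers them all, provided the more specific classes are checked first. The sketch below uses local stand-in classes that only mirror the documented hierarchy (they are not imported from the package), and describe_failure is a hypothetical helper added for illustration.

```python
# Stand-in classes mirroring the documented hierarchy; in real code these
# would be imported from radiant_mlhub.exceptions instead.
class MLHubException(Exception):
    """Base class for all Radiant MLHub exceptions."""


class APIKeyNotFound(MLHubException):
    pass


class AuthenticationError(MLHubException):
    pass


class EntityDoesNotExist(MLHubException):
    pass


def describe_failure(exc: Exception) -> str:
    """Map an exception to a short message (hypothetical helper)."""
    # Check the specific subclasses first, then fall back to the base class.
    if isinstance(exc, APIKeyNotFound):
        return "no API key configured"
    if isinstance(exc, AuthenticationError):
        return "API key rejected"
    if isinstance(exc, MLHubException):
        return "other MLHub error"
    return "unrelated error"
```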

radiant_mlhub.if_exists module

class radiant_mlhub.if_exists.DownloadIfExistsOpts(value)[source]

Bases: str, enum.Enum

Allowed values for download’s if_exists option.

overwrite = 'overwrite'

resume = 'resume'

skip = 'skip'
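Since the class derives from both str and enum.Enum, its members can be built from plain strings and compared against them. The sketch below re-declares the enum locally so that it runs without the package installed; the re-declaration is an assumption that mirrors the members listed above.

```python
from enum import Enum


# Local re-declaration mirroring the documented class (an assumption made
# so the example is self-contained).
class DownloadIfExistsOpts(str, Enum):
    resume = "resume"
    skip = "skip"
    overwrite = "overwrite"


# A plain string can be converted to a member by value...
opt = DownloadIfExistsOpts("skip")

# ...and members compare equal to their string values.
is_same_member = opt is DownloadIfExistsOpts.skip
equals_plain_string = DownloadIfExistsOpts.resume == "resume"
```

An unknown value such as DownloadIfExistsOpts("bogus") raises ValueError, which is standard Enum behavior.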

radiant_mlhub.retry_config module

radiant_mlhub.retry_config.config() → Optional[urllib3.util.retry.Retry][source]

Common configuration for the HTTP backoff/retry strategy. The delay before the final retry is computed as:

{backoff factor} * (2 ** ({number of total retries} - 1))

0.2 * (2 ** (10 - 1)) = 102.4 seconds
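The formula above can be checked with a few lines of Python. This is a minimal sketch: backoff_delay is a hypothetical helper, and the factor (0.2) and retry count (10) are taken from the worked line above rather than read from the library.

```python
def backoff_delay(backoff_factor: float, retry_number: int) -> float:
    """Delay in seconds before the given retry, per the formula above."""
    return backoff_factor * (2 ** (retry_number - 1))


# Values taken from the worked example: factor 0.2, 10 total retries.
delays = [backoff_delay(0.2, n) for n in range(1, 11)]
longest_delay = delays[-1]  # delay before the 10th retry
```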

radiant_mlhub.session module

Methods and classes to simplify constructing and authenticating requests to the MLHub API.

It is generally recommended that you use the get_session() function to create sessions, since this will properly handle resolution of the API key from function arguments, environment variables, and profiles as described in Authentication. See the get_session() docs for usage examples.

class radiant_mlhub.session.Session(*, api_key: Optional[str])[source]

Bases: requests.sessions.Session

Custom class inheriting from requests.Session with some additional conveniences:

Adds the API key as a key query parameter
Adds an Accept: application/json header
Adds a User-Agent header that contains the package name and version, plus basic system information like the OS name
Prepends the MLHub root URL (https://api.radiant.earth/mlhub/v1/) to any request paths without a domain
Raises a radiant_mlhub.exceptions.AuthenticationError for 401 (UNAUTHORIZED) responses
Calls requests.Response.raise_for_status() after all requests to raise exceptions for any status codes of 400 or above.

API_KEY_ENV_VARIABLE = 'MLHUB_API_KEY'

DEFAULT_ROOT_URL = 'https://api.radiant.earth/mlhub/v1/'

MLHUB_HOME_ENV_VARIABLE = 'MLHUB_HOME'

PROFILE_ENV_VARIABLE = 'MLHUB_PROFILE'

ROOT_URL_ENV_VARIABLE = 'MLHUB_ROOT_URL'

classmethod from_config(profile: Optional[str] = None) → radiant_mlhub.session.Session[source]

Create a session object by reading an API key from the given profile in the profiles file. By default, the client will look for the profiles file in a .mlhub directory in the user’s home directory (as determined by Path.home). However, if an MLHUB_HOME environment variable is present, the client will look in that directory instead.

Parameters: profile (str, optional) – The name of a profile configured in the profiles file.
Returns: session
Return type: Session
Raises: APIKeyNotFound – If the given config file does not exist, the given profile cannot be found, or there is no api_key property in the given profile section.
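The description above refers to profile sections that each hold an api_key property. Assuming the profiles file uses an INI-style layout (an assumption, since the format is not spelled out here), looking up a key could be sketched as follows. PROFILES_TEXT and read_api_key are hypothetical names, and LookupError stands in for APIKeyNotFound.

```python
import configparser
from typing import Optional

# Hypothetical profiles content; the INI layout is an assumption.
PROFILES_TEXT = """
[default]
api_key = default-key

[project1]
api_key = project1-key
"""


def read_api_key(profiles_text: str, profile: Optional[str] = None) -> str:
    """Return the api_key for a profile, or raise LookupError."""
    parser = configparser.ConfigParser()
    parser.read_string(profiles_text)
    section = profile or "default"
    if not parser.has_section(section):
        raise LookupError(f"profile {section!r} not found")
    api_key = parser.get(section, "api_key", fallback=None)
    if api_key is None:
        raise LookupError(f"no api_key in profile {section!r}")
    return api_key
```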

classmethod from_env() → radiant_mlhub.session.Session[source]

Create a session object using an API key read from the MLHUB_API_KEY environment variable.

Returns: session
Return type: Session
Raises: APIKeyNotFound – If the API key cannot be found in the environment

paginate(url: str, **kwargs: Any) → Iterator[Dict[str, Any]][source]

Makes a GET request to the given url and paginates through all results by looking for a link in each response with a rel type of "next". Any additional keyword arguments are passed directly to requests.Session.get().

Parameters: url (str) – The URL to which the initial request will be made. Note that this may either be a full URL or a path relative to the ROOT_URL as described in Session.request().
Yields: page (dict) – An individual response as a dictionary.
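The "next"-link pagination described above can be sketched without any network access. The version below is not the library's implementation: the fetch callable, the next_link helper, and the response shape (a "links" list of objects with "rel" and "href" keys) are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def next_link(page: Page) -> Optional[str]:
    """Return the href of the first link with rel == "next", if any."""
    for link in page.get("links", []):
        if link.get("rel") == "next":
            return link.get("href")
    return None


def paginate(fetch: Callable[[str], Page], url: str) -> Iterator[Page]:
    """Yield pages, following "next" links until none remains."""
    current: Optional[str] = url
    while current is not None:
        page = fetch(current)
        yield page
        current = next_link(page)


# In-memory stand-in for the API: two pages joined by a "next" link.
FAKE_PAGES: Dict[str, Page] = {
    "items?page=1": {
        "items": [1, 2],
        "links": [{"rel": "next", "href": "items?page=2"}],
    },
    "items?page=2": {
        "items": [3],
        "links": [{"rel": "self", "href": "items?page=2"}],
    },
}

pages = list(paginate(FAKE_PAGES.__getitem__, "items?page=1"))
```

Writing the loop as a generator means each page is requested only when the caller asks for it, which matches the Iterator return type in the signature above.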

request(method: str, url: str, **kwargs: Any) → requests.models.Response[source]

Overrides the default requests.Session.request() method to prepend the MLHub root URL if the given url does not include a scheme. This will raise an AuthenticationError if a 401 response is returned by the server, and an HTTPError if any other status code of 400 or above is returned.

Parameters

method (str) – The request method to use. Passed directly to the method argument of requests.Session.request()
url (str) – Either a full URL or a path relative to the ROOT_URL. For example, to make a request to the Radiant MLHub API /collections endpoint, you could use session.get('collections').
**kwargs – All other keyword arguments are passed directly to requests.Session.request() (see that documentation for an explanation of these keyword arguments).

Raises

AuthenticationError – If the response status code is 401
HTTPError – For all other response status codes at or above 400
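The URL handling described for request() (prepend the root URL only when the given url has no scheme) can be sketched with the standard library. resolve_url is a hypothetical helper, not part of the package, and the way the library joins paths internally may differ.

```python
from urllib.parse import urljoin, urlparse

DEFAULT_ROOT_URL = "https://api.radiant.earth/mlhub/v1/"


def resolve_url(url: str, root_url: str = DEFAULT_ROOT_URL) -> str:
    """Prepend root_url unless url already includes a scheme."""
    if urlparse(url).scheme:
        return url
    # Strip a leading slash so the path is joined beneath the root path
    # instead of replacing it.
    return urljoin(root_url, url.lstrip("/"))


relative = resolve_url("collections")
absolute = resolve_url("https://example.com/other")
```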

radiant_mlhub.session.get_session(*, api_key: Optional[str] = None, profile: Optional[str] = None) → radiant_mlhub.session.Session[source]

Gets a Session object that uses the given api_key for all requests. Resolves an API key by trying each of the following (in this order):

Use the api_key argument provided (Optional).

Use an MLHUB_API_KEY environment variable.

Use the profile argument provided (Optional).

Use the MLHUB_PROFILE environment variable.

Use the default profile

If none of the above strategies results in a valid API key, then an APIKeyNotFound exception is raised. See the Using Profiles section for details.

Parameters

api_key (str, optional) – The API key to use for all requests from the session. See description above for how the API key is resolved if not provided as an argument.
profile (str, optional) – The name of a profile configured in the .mlhub/profiles file. This will be passed directly to from_config().

Returns

session

Return type

Session

Raises

APIKeyNotFound – If no API key can be resolved.

Examples

>>> from radiant_mlhub import get_session
# Get a session that uses the API key from the "default" profile
>>> session = get_session()
# Get the session from the "project1" profile
# Alternatively, you could set the MLHUB_PROFILE environment variable to "project1"
>>> session = get_session(profile='project1')
# Pass an API key directly to the session
# Alternatively, you could set the MLHUB_API_KEY environment variable to "some-api-key"
>>> session = get_session(api_key='some-api-key')
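The resolution order documented for get_session() can be expressed as a small decision function. This is only a sketch of the documented order: resolve_key_source is a hypothetical helper that reports where the key would be looked up, and it takes the environment as a plain mapping so that it can be exercised without touching os.environ.

```python
from typing import Mapping, Optional


def resolve_key_source(
    api_key: Optional[str],
    profile: Optional[str],
    env: Mapping[str, str],
) -> str:
    """Describe where the API key would be looked up (hypothetical helper)."""
    if api_key is not None:
        return "api_key argument"
    if env.get("MLHUB_API_KEY"):
        return "MLHUB_API_KEY environment variable"
    if profile is not None:
        return f"profile {profile!r}"
    if env.get("MLHUB_PROFILE"):
        return f"profile {env['MLHUB_PROFILE']!r}"
    return "profile 'default'"
```

Note that, following the documented order, an MLHUB_API_KEY environment variable takes precedence over an explicit profile argument.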

Module contents

class radiant_mlhub.Collection(id: str, description: str, extent: pystac.collection.Extent, title: Optional[str] = None, stac_extensions: Optional[List[str]] = None, href: Optional[str] = None, extra_fields: Optional[Dict[str, Any]] = None, catalog_type: Optional[pystac.catalog.CatalogType] = None, license: str = 'proprietary', keywords: Optional[List[str]] = None, providers: Optional[List[pystac.provider.Provider]] = None, summaries: Optional[pystac.summaries.Summaries] = None, *, api_key: Optional[str] = None, profile: Optional[str] = None)[source]

Bases: pystac.collection.Collection

Class inheriting from pystac.Collection that adds some convenience methods for listing and fetching from the Radiant MLHub API.

property archive_size: Optional[int]: The size of the tarball archive for this collection in bytes (or None if the archive does not exist).

download(output_dir: Union[str, pathlib.Path], *, if_exists: radiant_mlhub.if_exists.DownloadIfExistsOpts = DownloadIfExistsOpts.resume, api_key: Optional[str] = None, profile: Optional[str] = None) → pathlib.Path[source]

Downloads the archive for this collection to the given output directory. If the parent directories for output_path do not exist, they will be created.

The if_exists argument determines how to handle an existing archive file in the output directory. See the documentation for the download_archive() function for details. The default behavior is to resume downloading if the existing file is incomplete and skip the download if it is complete.

Note

Some collections may be very large and take a significant amount of time to download, depending on your connection speed.

Parameters

output_dir (Path) – Path to a local directory to which the file will be downloaded. File name will be generated automatically based on the download URL.
if_exists (str, optional) – How to handle an existing archive at the same location. If "skip", the download will be skipped. If "overwrite", the existing file will be overwritten and the entire file will be re-downloaded. If "resume" (the default), the existing file size will be compared to the size of the download (using the Content-Length header). If the existing file is smaller, then only the remaining portion will be downloaded. Otherwise, the download will be skipped.
api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

output_path – The path to the downloaded archive file.

Return type

pathlib.Path

Raises

FileExistsError – If file at output_path already exists and both exist_okay and overwrite are False.
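The if_exists rules described for download() reduce to a small decision table. The sketch below is a hypothetical helper (plan_download is not part of the package) that returns the action to take and the byte offset to start from, given the size of any existing file and the Content-Length of the download.

```python
from typing import Optional, Tuple


def plan_download(
    if_exists: str,
    existing_size: Optional[int],
    content_length: int,
) -> Tuple[str, int]:
    """Return (action, start_byte) for a download.

    existing_size is None when no file exists at the output path.
    """
    if existing_size is None or if_exists == "overwrite":
        return ("download", 0)
    if if_exists == "skip":
        return ("skip", 0)
    if if_exists == "resume":
        # Fetch only the remaining portion if the file is incomplete.
        if existing_size < content_length:
            return ("download", existing_size)
        return ("skip", 0)
    raise ValueError(f"unknown if_exists value: {if_exists!r}")
```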

classmethod fetch(collection_id: str, *, api_key: Optional[str] = None, profile: Optional[str] = None) → Collection[source]

Creates a Collection instance by fetching the collection with the given ID from the Radiant MLHub API.

Parameters

collection_id (str) – The ID of the collection to fetch (e.g. bigearthnet_v1_source).
api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

collection

Return type

Collection

fetch_item(item_id: str, *, api_key: Optional[str] = None, profile: Optional[str] = None) → pystac.item.Item[source]

classmethod from_dict(d: Dict[str, Any], href: Optional[str] = None, root: Optional[pystac.catalog.Catalog] = None, migrate: bool = False, preserve_dict: bool = True, *, api_key: Optional[str] = None, profile: Optional[str] = None) → Collection[source]: Patches the pystac.Collection.from_dict() method so that it returns the calling class instead of always returning a pystac.Collection instance.

get_items(*, api_key: Optional[str] = None, profile: Optional[str] = None) → Iterator[pystac.item.Item][source]

Note

The get_items method is not implemented for Radiant MLHub Collection instances for performance reasons. Please use the Dataset.download() method to download Dataset assets.

Raises: NotImplementedError – Always, since this method is not implemented for Radiant MLHub Collection instances.

classmethod list(*, api_key: Optional[str] = None, profile: Optional[str] = None) → List[Collection][source]

Returns a list of Collection instances for all collections hosted by MLHub.

See the Authentication documentation for details on how authentication is handled for this request.

Parameters

api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

collections

Return type

List[Collection]

property registry_url: Optional[str]: The URL of the registry page for this Collection. The URL is based on the DOI identifier for the collection. If the Collection does not have a "sci:doi" property then registry_url will be None.

class radiant_mlhub.Dataset(id: str, collections: List[Dict[str, Any]], title: Optional[str] = None, registry: Optional[str] = None, doi: Optional[str] = None, citation: Optional[str] = None, *, api_key: Optional[str] = None, profile: Optional[str] = None, **_: Any)[source]

Bases: object

Class that brings together multiple Radiant MLHub “collections” that are all considered part of a single “dataset”. For instance, the bigearthnet_v1 dataset is composed of both a source imagery collection (bigearthnet_v1_source) and a labels collection (bigearthnet_v1_labels).

id

The dataset ID.

Type: str

title

The title of the dataset (or None if dataset has no title).

Type: str or None

registry_url

The URL to the registry page for this dataset, or None if no registry page exists.

Type: str or None

doi

The DOI identifier for this dataset, or None if there is no DOI for this dataset.

Type: str or None

citation

The citation information for this dataset, or None if there is no citation information.

Type: str or None

property collections: radiant_mlhub.models.dataset._CollectionList

List of collections associated with this dataset. The list that is returned has 2 additional attributes (source_imagery and labels) that represent the list of collections corresponding to each type.

Note

This is a cached property, so updating self.collection_descriptions after calling self.collections the first time will have no effect on the results. See functools.cached_property() for details on clearing the cached value.

Examples

>>> from radiant_mlhub import Dataset
>>> dataset = Dataset.fetch('bigearthnet_v1')
>>> len(dataset.collections)
2
>>> len(dataset.collections.source_imagery)
1
>>> len(dataset.collections.labels)
1

To loop through all collections:

>>> for collection in dataset.collections:
...     pass  # Do something here

To loop through only the source imagery collections:

>>> for collection in dataset.collections.source_imagery:
...     pass  # Do something here

To loop through only the label collections:

>>> for collection in dataset.collections.labels:
...     pass  # Do something here

download(output_dir: Union[pathlib.Path, str] = pathlib.Path.cwd(), *, asset_output_dir: Optional[Union[pathlib.Path, str]] = None, catalog_only: bool = False, if_exists: radiant_mlhub.if_exists.DownloadIfExistsOpts = DownloadIfExistsOpts.resume, api_key: Optional[str] = None, profile: Optional[str] = None, bbox: Optional[List[float]] = None, intersects: Optional[Dict[str, Any]] = None, datetime: Optional[Union[datetime.datetime, Tuple[datetime.datetime, datetime.datetime]]] = None, collection_filter: Optional[Dict[str, List[str]]] = None) → None[source]

Downloads the dataset’s STAC catalog and all linked assets. The download may be customized and controlled by providing the bbox, intersects, datetime, and collection_filter options.

Parameters

output_dir (str or pathlib.Path) – The directory into which the STAC catalog will be written. If no asset_output_dir is specified, the assets will also be saved to the output_dir. Defaults to current working directory.
asset_output_dir (Optional[str, pathlib.Path]) – The directory into which the archives will be written. If not defined by the user, the assets are saved to their respective asset level STAC catalog directories in the output_dir, which is in the current working directory by default.
catalog_only (bool) – If True, the STAC catalog will be downloaded and unarchived, but no assets will be downloaded. Defaults to False.
if_exists (Optional[str]) – Allowed values: skip, overwrite, or resume (default).
bbox (Optional[List[float]]) – List representing a bounding box of coordinates, for spatial intersection filter. Must be in CRS EPSG:4326.
intersects (Optional[GeoJSON]) – GeoJSON object for spatial intersects filter. Must be a parsed GeoJSON dict with a geometry property.
datetime (Optional[datetime, Tuple[datetime, datetime]]) – Single datetime or datetime range for temporal filter.
collection_filter (Optional[Dict[str, list]]) – Mapping of collection_id to the asset keys to include (exclusively).

examples:
- download will only include this collection:
  dict(ref_landcovernet_sa_v1_source_sentinel_2=[])
- download will only include this collection and only these asset keys:
  dict(ref_landcovernet_sa_v1_source_sentinel_2=["B02", "B03", "B04"])
api_key (Optional[str]) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (Optional[str]) – Authentication Profile to use when making this request.

Raises

IOError – If output_dir exists and is not a directory, or if unrecoverable download errors occur.
ValueError – If provided filters are incompatible, for example bbox and intersects.
RuntimeError – If filters result in zero assets to download.

Any unrecoverable download errors will be logged to {output_dir}/{dataset_id}/err_report.csv.
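The filter options described for download() can be assembled and sanity-checked before any request is made. The sketch below is a hypothetical helper, not part of the package: build_download_kwargs only mirrors the documented rule that bbox and intersects are incompatible, and the bounding box coordinates are made-up illustrative values.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple


def build_download_kwargs(
    bbox: Optional[List[float]] = None,
    intersects: Optional[Dict[str, Any]] = None,
    datetime_range: Optional[Tuple[datetime, datetime]] = None,
    collection_filter: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, Any]:
    """Collect filter keyword arguments, rejecting bbox with intersects."""
    if bbox is not None and intersects is not None:
        raise ValueError("bbox and intersects cannot be combined")
    kwargs: Dict[str, Any] = {}
    if bbox is not None:
        kwargs["bbox"] = bbox
    if intersects is not None:
        kwargs["intersects"] = intersects
    if datetime_range is not None:
        kwargs["datetime"] = datetime_range
    if collection_filter is not None:
        kwargs["collection_filter"] = collection_filter
    return kwargs


kwargs = build_download_kwargs(
    bbox=[-1.0, 5.0, 0.0, 6.0],  # illustrative coordinates in EPSG:4326
    datetime_range=(datetime(2020, 1, 1), datetime(2020, 12, 31)),
    collection_filter={
        "ref_landcovernet_sa_v1_source_sentinel_2": ["B02", "B03", "B04"],
    },
)
```

With the package installed and a Dataset instance in hand, the resulting dictionary could then be passed along as dataset.download(**kwargs).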

property estimated_dataset_size: Optional[int]: Estimated size of the entire dataset, in bytes.

classmethod fetch(dataset_id_or_doi: str, *, api_key: Optional[str] = None, profile: Optional[str] = None) → Dataset[source]

Creates a Dataset instance by first trying to fetch the dataset by ID, then falling back to fetching by DOI.

Parameters

dataset_id_or_doi (str) – The ID or DOI of the dataset to fetch (e.g. bigearthnet_v1).
api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

dataset

Return type

Dataset

classmethod fetch_by_doi(dataset_doi: str, *, api_key: Optional[str] = None, profile: Optional[str] = None) → Dataset[source]

Creates a Dataset instance by fetching the dataset with the given DOI from the Radiant MLHub API.

Parameters

dataset_doi (str) – The DOI of the dataset to fetch (e.g. 10.6084/m9.figshare.12047478.v2).
api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

dataset

Return type

Dataset

classmethod fetch_by_id(dataset_id: str, *, api_key: Optional[str] = None, profile: Optional[str] = None) → Dataset[source]

Creates a Dataset instance by fetching the dataset with the given ID from the Radiant MLHub API.

Parameters

dataset_id (str) – The ID of the dataset to fetch (e.g. bigearthnet_v1).
api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

dataset

Return type

Dataset

classmethod list(*, tags: Optional[Union[str, Iterable[str]]] = None, text: Optional[Union[str, Iterable[str]]] = None, api_key: Optional[str] = None, profile: Optional[str] = None) → List[Dataset][source]

Returns a list of Dataset instances for each dataset hosted by MLHub.

See the Authentication documentation for details on how authentication is handled for this request.

Parameters

tags (str or Iterable[str], optional) – A tag or list of tags to filter datasets by. If not None, only datasets containing all provided tags will be returned.
text (str or Iterable[str], optional) – A text phrase or list of text phrases to filter datasets by. If not None, only datasets containing all phrases will be returned.
api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

datasets

Return type

List[Dataset]

property stac_catalog_size: Optional[int]: Size of the dataset_id.tar.gz STAC archive (bytes)

class radiant_mlhub.DownloadIfExistsOpts(value)[source]

Bases: str, enum.Enum

Allowed values for download’s if_exists option.

overwrite = 'overwrite'

resume = 'resume'

skip = 'skip'

class radiant_mlhub.MLModel(id: str, geometry: Optional[Dict[str, Any]], bbox: Optional[List[float]], datetime: Optional[datetime.datetime], properties: Dict[str, Any], stac_extensions: Optional[List[str]] = None, href: Optional[str] = None, collection: Optional[Union[str, pystac.collection.Collection]] = None, extra_fields: Optional[Dict[str, Any]] = None, *, api_key: Optional[str] = None, profile: Optional[str] = None)[source]

Bases: pystac.item.Item

classmethod fetch(model_id: str, *, api_key: Optional[str] = None, profile: Optional[str] = None) → radiant_mlhub.models.ml_model.MLModel[source]

Fetches an MLModel instance by ID.

Parameters

model_id (str) – The ID of the ML Model to fetch (e.g. model-cyclone-wind-estimation-torchgeo-v1).
api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

model

Return type

MLModel

classmethod from_dict(d: Dict[str, Any], href: Optional[str] = None, root: Optional[pystac.catalog.Catalog] = None, migrate: bool = False, preserve_dict: bool = True, *, api_key: Optional[str] = None, profile: Optional[str] = None) → radiant_mlhub.models.ml_model.MLModel[source]: Patches the pystac.Item.from_dict() method so that it returns the calling class instead of always returning a pystac.Item instance.

classmethod list(*, api_key: Optional[str] = None, profile: Optional[str] = None) → List[radiant_mlhub.models.ml_model.MLModel][source]

Returns a list of MLModel instances for all models hosted by MLHub.

See the Authentication documentation for details on how authentication is handled for this request.

Parameters

api_key (str) – An API key to use for this request. This will override an API key set in a profile or in an environment variable.
profile (str) – A profile to use when making this request.

Returns

models

Return type

List[MLModel]

session_kwargs: Dict[str, Any] = {}

MLModel is a class inheriting from pystac.Item that adds some convenience methods for listing and fetching from the Radiant MLHub API.

radiant_mlhub.get_session(*, api_key: Optional[str] = None, profile: Optional[str] = None) → radiant_mlhub.session.Session[source]

Gets a Session object that uses the given api_key for all requests. Resolves an API key by trying each of the following (in this order):

Use the api_key argument provided (Optional).

Use an MLHUB_API_KEY environment variable.

Use the profile argument provided (Optional).

Use the MLHUB_PROFILE environment variable.

Use the default profile

If none of the above strategies results in a valid API key, then an APIKeyNotFound exception is raised. See the Using Profiles section for details.

Parameters

api_key (str, optional) – The API key to use for all requests from the session. See description above for how the API key is resolved if not provided as an argument.
profile (str, optional) – The name of a profile configured in the .mlhub/profiles file. This will be passed directly to from_config().

Returns

session

Return type

Session

Raises

APIKeyNotFound – If no API key can be resolved.

Examples

>>> from radiant_mlhub import get_session
# Get a session that uses the API key from the "default" profile
>>> session = get_session()
# Get the session from the "project1" profile
# Alternatively, you could set the MLHUB_PROFILE environment variable to "project1"
>>> session = get_session(profile='project1')
# Pass an API key directly to the session
# Alternatively, you could set the MLHUB_API_KEY environment variable to "some-api-key"
>>> session = get_session(api_key='some-api-key')