SQLiteStore
- class SQLiteStore(*args, **kwargs)
SQLite backed annotation store.
Uses an rtree index for fast spatial queries.
- Version History:
- 1.0.0:
Initial version.
- 1.0.1 (07/10/2022):
Added optional “area” column and queries sorted/filtered by area.
Initialize SQLiteStore.
Methods
add_area_column – Add a column to store the area of the geometry.
append_many – Appends new annotations to specified keys.
bquery – Query the store for annotation bounding boxes.
clear – Remove all annotations from the store.
close – Closes the SQLiteStore.
commit – Commit any in-memory changes to disk.
compile_options – Get the list of options that sqlite3 was compiled with.
create_index – Create an SQLite expression index based on the provided predicate.
deserialize_geometry – Deserialize a geometry from a string or bytes.
drop_index – Drop an index from the store.
dump – Serialise a copy of the whole store to a file-like object.
dumps – Serialise and return a copy of store as a string or bytes.
features – Return annotations as a list of geoJSON features.
get_connection – Get a connection to the database.
indexes – Return a list of the names of all indexes in the store.
iquery – Query the store for annotation keys.
items – Return iterable (generator) over key and annotations.
keys – Return an iterable (usually generator) of all keys in the store.
open – Opens SQLiteStore from file pointer or path.
optimize – Optimize the database with VACUUM and ANALYZE.
patch_many – Bulk patch of annotations.
pquery – Query the store for annotation properties.
query – Query the store for annotations.
remove_area_column – Remove the area column from the store.
remove_many – Bulk removal of annotations by keys.
serialise_geometry – Serialise a geometry to WKB with optional compression.
to_dataframe – Converts AnnotationStore to pandas.DataFrame.
values – Return an iterable of all annotations in the store.
- Parameters:
- add_area_column(*, mk_index=True)
Add a column to store the area of the geometry.
- Parameters:
self (SQLiteStore)
mk_index (bool)
- Return type:
None
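As a rough illustration of what a stored area column makes possible, the sketch below uses the standard-library sqlite3 module directly rather than SQLiteStore. The table name `annotations`, its columns, and the index name `idx_area` are hypothetical and do not reflect the store's actual schema; the bounding-box area stands in for the true geometry area.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical schema: one row per annotation with its bounding box.
con.execute(
    "CREATE TABLE annotations ("
    "key TEXT PRIMARY KEY, min_x REAL, min_y REAL, max_x REAL, max_y REAL)"
)
con.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?, ?, ?)",
    [("small", 0, 0, 1, 1), ("large", 0, 0, 10, 10)],
)

# Add the column, back-fill it, and index it (compare mk_index=True).
con.execute("ALTER TABLE annotations ADD COLUMN area REAL")
con.execute("UPDATE annotations SET area = (max_x - min_x) * (max_y - min_y)")
con.execute("CREATE INDEX idx_area ON annotations (area)")

# Filtering and sorting by area can now use the stored values.
large_keys = [
    row[0]
    for row in con.execute(
        "SELECT key FROM annotations WHERE area >= ? ORDER BY area DESC", (50,)
    )
]
print(large_keys)
```

Storing the value once trades a little disk space for queries that no longer need to recompute the area for every row.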
- append_many(annotations, keys=None)
Appends new annotations to specified keys.
- Parameters:
self (SQLiteStore)
annotations (Iterable[Annotation])
keys (Iterable[str] | None)
- Return type:
- bquery(geometry=None, where=None)
Query the store for annotation bounding boxes.
Acts similarly to AnnotationStore.query except it checks for intersection between stored and query geometry bounding boxes. This may be faster than a regular query in some cases, e.g. for SQLiteStore with a large number of annotations.
Note that this method only checks for bounding box intersection and therefore may give a different result to using AnnotationStore.query with a box polygon and the “intersects” geometry predicate. Also note that geometry predicates are not supported for this method.
- Parameters:
geometry (Geometry or Iterable) – Geometry to use when querying. This can be a bounds (iterable of length 4) or a Shapely geometry (e.g. Polygon). If a geometry is provided, the bounds of the geometry will be used for the query. Full geometry intersection is not used for the query method.
where (str or bytes or Callable) – A statement which should evaluate to a boolean value. Only annotations for which this predicate is true will be returned. Defaults to None (assume always true). This may be a string, Callable, or pickled function as bytes. Callables are called to filter each result returned from the annotation store backend in Python before being returned to the user. A pickle object is, where possible, hooked into the backend as a user defined function to filter results during the backend query. Strings are expected to be in a domain specific language and are converted to SQL on a best-effort basis. For supported operators of the DSL see tiatoolbox.annotation.dsl. E.g. a simple Python expression props["class"] == 42 will be converted to a valid SQLite predicate when using SQLiteStore and inserted into the SQL query. This should be faster than filtering in Python after or during the query. Additionally, the same string can be used across different backends (e.g. the previous example predicate string is valid for both DictionaryStore and SQLiteStore). On the other hand it has many more limitations. It is important to note that untrusted user input should never be passed to this argument, as arbitrary code can be run via pickle or the parsing of the string statement.
self (SQLiteStore)
- Returns:
A dictionary mapping each annotation key to its bounding box.
- Return type:
Example
>>> from tiatoolbox.annotation.storage import Annotation, SQLiteStore
>>> from shapely.geometry import Polygon
>>> store = SQLiteStore()
>>> store.append(
...     Annotation(
...         geometry=Polygon.from_bounds(0, 0, 1, 1),
...         properties={"class": 42},
...     ),
...     key="foo",
... )
>>> store.bquery(where="props['class'] == 42")
{'foo': (0.0, 0.0, 1.0, 1.0)}
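The bounding-box test described for this method can be sketched with the standard-library sqlite3 module alone. This is not the store's implementation: the table name `boxes`, its column names, and the helper `boxes_intersecting` are all hypothetical. The sketch prefers an R-tree virtual table and falls back to a plain table when SQLite was built without the R-tree module, since the same SQL works on both.

```python
import sqlite3

con = sqlite3.connect(":memory:")
columns = "id, min_x, max_x, min_y, max_y"
try:
    # Preferred: an R-tree virtual table (needs SQLite built with ENABLE_RTREE).
    con.execute(f"CREATE VIRTUAL TABLE boxes USING rtree({columns})")
except sqlite3.OperationalError:
    # Fallback: a plain table returns the same rows, just without the index.
    con.execute(f"CREATE TABLE boxes ({columns})")

con.executemany(
    "INSERT INTO boxes VALUES (?, ?, ?, ?, ?)",
    [
        (1, 0, 1, 0, 1),  # unit square at the origin
        (2, 5, 6, 5, 6),  # unit square far from the origin
        (3, 0.5, 2, 0.5, 2),  # overlaps box 1
    ],
)


def boxes_intersecting(bounds):
    """Return ids of boxes overlapping bounds = (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = bounds
    rows = con.execute(
        "SELECT id FROM boxes "
        "WHERE max_x >= ? AND min_x <= ? AND max_y >= ? AND min_y <= ? "
        "ORDER BY id",
        (min_x, max_x, min_y, max_y),
    )
    return [row[0] for row in rows]


hits = boxes_intersecting((0, 0, 1, 1))
print(hits)
```

Because only the four bounds are compared, two shapes whose boxes overlap are reported as hits even if the shapes themselves never touch, which is the difference from a full "intersects" query noted above.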
- clear()
Remove all annotations from the store.
- Parameters:
self (SQLiteStore)
- Return type:
None
- close()
Closes the SQLiteStore.
- Parameters:
self (SQLiteStore)
- Return type:
None
- commit()
Commit any in-memory changes to disk.
- Parameters:
self (SQLiteStore)
- Return type:
None
- static compile_options()
Get the list of options that sqlite3 was compiled with.
Example
>>> for opt in SQLiteStore.compile_options():
...     print(opt)
COMPILER=gcc-7.5.0
ENABLE_COLUMN_METADATA
ENABLE_DBSTAT_VTAB
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
ENABLE_FTS3_TOKENIZER
ENABLE_FTS4
ENABLE_FTS5
ENABLE_JSON1
ENABLE_LOAD_EXTENSION
ENABLE_PREUPDATE_HOOK
ENABLE_RTREE
ENABLE_SESSION
ENABLE_STMTVTAB
ENABLE_UNLOCK_NOTIFY
ENABLE_UPDATE_DELETE_LIMIT
HAVE_ISNAN
LIKE_DOESNT_MATCH_BLOBS
MAX_SCHEMA_RETRY=25
MAX_VARIABLE_NUMBER=250000
OMIT_LOOKASIDE
SECURE_DELETE
SOUNDEX
TEMP_STORE=1
THREADSAFE=1
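The same information can be read without the store by asking SQLite directly. The sketch below uses the standard-library sqlite3 module and SQLite's `PRAGMA compile_options`; the helper name `sqlite_compile_options` is made up for this example, and the output depends on how the local SQLite library was built.

```python
import sqlite3


def sqlite_compile_options():
    """Return the compile-time options of the linked SQLite library."""
    con = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in con.execute("PRAGMA compile_options")]
    finally:
        con.close()


options = sqlite_compile_options()
# ENABLE_RTREE indicates that R-tree virtual tables are available.
has_rtree = "ENABLE_RTREE" in options
print(sqlite3.sqlite_version, "R-tree available:", has_rtree)
```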
- create_index(name, where, *, analyze=True)
Create an SQLite expression index based on the provided predicate.
Note that an expression index will only be used if the query expression (in the WHERE clause) exactly matches the expression used when creating the index (excluding minor inconsequential changes such as whitespace).
SQLite expression indexes require SQLite version 3.9.0 or higher.
- Parameters:
name (str) – Name of the index to create.
where (str | bytes) – The predicate used to create the index.
analyze (bool) – Whether to run the “ANALYZE” command after creating the index.
self (SQLiteStore)
- Return type:
None
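The exact-match requirement can be observed in plain SQLite. The following sketch uses the standard-library sqlite3 module with a hypothetical `items` table and an index named `idx_lower_label`; it illustrates expression indexes in general and is not how the store builds its own indexes.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
con.executemany(
    "INSERT INTO items (label) VALUES (?)",
    [("Tumour",), ("stroma",), ("TUMOUR",)],
)
# Index the expression lower(label) rather than the raw column.
con.execute("CREATE INDEX idx_lower_label ON items (lower(label))")


def query_plan(sql):
    """Return the detail text of EXPLAIN QUERY PLAN for a statement."""
    rows = con.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(str(row[-1]) for row in rows)


# Same expression as the index: the planner can search the index.
matching = query_plan("SELECT id FROM items WHERE lower(label) = 'tumour'")
# Different expression: the index does not apply and the table is scanned.
different = query_plan("SELECT id FROM items WHERE upper(label) = 'TUMOUR'")
print(matching)
print(different)
```

Inspecting the query plan in this way is a practical check that a predicate passed to a query is written the same way as the one used to create the index.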
- deserialize_geometry(data)
Deserialize a geometry from a string or bytes.
- Parameters:
data (bytes or str) – The serialised representation of a Shapely geometry.
self (SQLiteStore)
- Returns:
The deserialized Shapely geometry.
- Return type:
Geometry
- drop_index(name)
Drop an index from the store.
- Parameters:
name (str) – The name of the index to drop.
self (SQLiteStore)
- Return type:
None
- dump(fp)
Serialise a copy of the whole store to a file-like object.
- Parameters:
fp (Path or str or IO) – A file path or file handle object for output to disk.
self (SQLiteStore)
- Return type:
None
- dumps()
Serialise and return a copy of store as a string or bytes.
- Returns:
The serialised store.
- Return type:
- Parameters:
self (SQLiteStore)
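One way to serialise an entire SQLite database, shown here with the standard-library sqlite3 module, is a textual SQL dump. This page does not specify the format that dump and dumps produce, so the SQL-text format, the `annotations` table, and the two helper functions below are assumptions made for illustration only.

```python
import io
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE annotations (key TEXT PRIMARY KEY, properties TEXT)")
con.execute("INSERT INTO annotations VALUES (?, ?)", ("foo", '{"class": 42}'))
con.commit()


def dumps(connection):
    """Return the whole database as a single SQL script string."""
    return "\n".join(connection.iterdump())


def dump(connection, fp):
    """Write the SQL script to a file-like object."""
    fp.write(dumps(connection))


buffer = io.StringIO()
dump(con, buffer)

# Replaying the script into a fresh database restores the rows.
restored = sqlite3.connect(":memory:")
restored.executescript(buffer.getvalue())
restored_rows = restored.execute(
    "SELECT key, properties FROM annotations"
).fetchall()
print(restored_rows)
```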
- features()
Return annotations as a list of geoJSON features.
- Returns:
List of features as dictionaries.
- Return type:
- Parameters:
self (SQLiteStore)
- get_connection(thread_id)
Get a connection to the database.
- Parameters:
self (SQLiteStore)
thread_id (int)
- Return type:
- indexes()
Return a list of the names of all indexes in the store.
- Returns:
The list of index names.
- Return type:
List[str]
- Parameters:
self (SQLiteStore)
- iquery(geometry=None, where=None, geometry_predicate='intersects', distance=0)
Query the store for annotation keys.
Acts the same as AnnotationStore.query except returns keys instead of annotations.
- Parameters:
geometry (Geometry or Iterable) – Geometry to use when querying. This can be a bounds (iterable of length 4) or a Shapely geometry (e.g. Polygon).
where (str or bytes or Callable) – A statement which should evaluate to a boolean value. Only annotations for which this predicate is true will be returned. Defaults to None (assume always true). This may be a string, Callable, or pickled function as bytes. Callables are called to filter each result returned from the annotation store backend in Python before being returned to the user. A pickle object is, where possible, hooked into the backend as a user defined function to filter results during the backend query. Strings are expected to be in a domain specific language and are converted to SQL on a best-effort basis. For supported operators of the DSL see tiatoolbox.annotation.dsl. E.g. a simple Python expression props["class"] == 42 will be converted to a valid SQLite predicate when using SQLiteStore and inserted into the SQL query. This should be faster than filtering in Python after or during the query. Additionally, the same string can be used across different backends (e.g. the previous example predicate string is valid for both DictionaryStore and SQLiteStore). On the other hand it has many more limitations. It is important to note that untrusted user input should never be passed to this argument, as arbitrary code can be run via pickle or the parsing of the string statement.
geometry_predicate (str) – A string which defines which binary geometry predicate to use when comparing the query geometry and a geometry in the store. Only annotations for which this binary predicate is true will be returned. Defaults to "intersects". For more information see the shapely documentation on binary predicates.
distance (float) – Distance used when performing a distance based query. E.g. “centers_within_k” geometry predicate.
self (SQLiteStore)
- Returns:
A list of keys for each Annotation.
- Return type:
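The idea of hooking a Python predicate into the backend as a user-defined function can be sketched with the standard-library sqlite3 module. The `annotations` table, the JSON text column, and the function name `is_class_42` are hypothetical; the store's own mechanism (including how pickled functions are loaded) is not shown here.

```python
import json
import sqlite3


def is_class_42(properties_json):
    """Predicate run by SQLite for each candidate row; returns 1 or 0."""
    props = json.loads(properties_json)
    return int(props.get("class") == 42)


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE annotations (key TEXT PRIMARY KEY, properties TEXT)")
con.executemany(
    "INSERT INTO annotations VALUES (?, ?)",
    [("foo", json.dumps({"class": 42})), ("bar", json.dumps({"class": 7}))],
)

# Register the Python callable so it can be used inside SQL statements.
con.create_function("is_class_42", 1, is_class_42)

keys = [
    row[0]
    for row in con.execute(
        "SELECT key FROM annotations WHERE is_class_42(properties) ORDER BY key"
    )
]
print(keys)
```

Filtering happens during the query, so rows that fail the predicate are never returned to Python as results. The callable runs with the full privileges of the process, which is why predicates from untrusted sources must not be accepted.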
- items()
Return iterable (generator) over key and annotations.
- Parameters:
self (SQLiteStore)
- Return type:
- keys()
Return an iterable (usually generator) of all keys in the store.
- Returns:
An iterable of keys.
- Return type:
Iterable[str]
- Parameters:
self (SQLiteStore)
- classmethod open(fp)
Opens SQLiteStore from file pointer or path.
- Parameters:
- Return type:
- optimize(limit=1000, *, vacuum=True)
Optimize the database with VACUUM and ANALYZE.
- Parameters:
vacuum (bool) – Whether to run VACUUM.
limit (int) – The approximate maximum number of rows to examine when running ANALYZE. If zero or negative, no limit will be used. For more information see https://www.sqlite.org/pragma.html#pragma_analysis_limit.
self (SQLiteStore)
- Return type:
None
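In raw SQLite terms, the two steps look roughly like the sketch below, written against the standard-library sqlite3 module. The helper name `optimize_sketch` and the `items` table are hypothetical, and the real method may differ in detail. Note that `PRAGMA analysis_limit` only takes effect on SQLite 3.32.0 or later; older versions silently ignore unknown pragmas.

```python
import sqlite3


def optimize_sketch(con, limit=1000, vacuum=True):
    """Optionally VACUUM, then run ANALYZE with a bounded row budget."""
    if vacuum:
        con.execute("VACUUM")
    # SQLite uses 0 to mean "no limit", so clamp non-positive values to 0.
    con.execute(f"PRAGMA analysis_limit = {max(0, int(limit))}")
    con.execute("ANALYZE")


# Autocommit mode: VACUUM cannot run inside an open transaction.
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value REAL)")
con.execute("CREATE INDEX idx_value ON items (value)")
con.executemany(
    "INSERT INTO items (value) VALUES (?)", [(i / 10,) for i in range(100)]
)

optimize_sketch(con, limit=1000)
row_count = con.execute("SELECT COUNT(*) FROM items").fetchone()[0]
# ANALYZE records its statistics in the sqlite_stat1 table.
stat_rows = con.execute("SELECT COUNT(*) FROM sqlite_stat1").fetchone()[0]
print(row_count, stat_rows)
```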
- patch_many(keys, geometries=None, properties_iter=None)
Bulk patch of annotations.
This may be more efficient than calling patch repeatedly in a loop.
- Parameters:
geometries (iter(Geometry)) – An iterable of geometries to update.
properties_iter (iter(dict)) – An iterable of properties to update.
keys (iter(str)) – An iterable of keys for each annotation to be updated.
self (SQLiteStore)
- Return type:
None
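Why a bulk update can beat a loop of single updates is easiest to see at the SQLite level: the whole batch goes through one prepared statement inside one transaction. The sketch below uses the standard-library sqlite3 module with a hypothetical `annotations` table and a stand-in function `patch_many_sketch`. As a simplifying assumption it replaces the stored properties outright, which may not match how the store merges patches.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE annotations (key TEXT PRIMARY KEY, properties TEXT)")
con.executemany(
    "INSERT INTO annotations VALUES (?, ?)",
    [(f"key-{i}", json.dumps({"class": 0})) for i in range(3)],
)


def patch_many_sketch(keys, properties_iter):
    """Update many rows with one statement inside a single transaction."""
    params = (
        (json.dumps(props), key) for key, props in zip(keys, properties_iter)
    )
    with con:  # commits once at the end, or rolls back on error
        con.executemany(
            "UPDATE annotations SET properties = ? WHERE key = ?", params
        )


patch_many_sketch(["key-0", "key-2"], [{"class": 1}, {"class": 2}])
patched = {
    key: json.loads(props)
    for key, props in con.execute("SELECT key, properties FROM annotations")
}
print(patched)
```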
- pquery(select, geometry=None, where=None, geometry_predicate='intersects', *, unique=True, squeeze=True)
Query the store for annotation properties.
Acts similarly to AnnotationStore.query but returns only the value defined by select.
- Parameters:
select (str or bytes or Callable) – A statement defining the value to look up from the annotation properties. If select = “*”, all properties are returned for each annotation (unique must be False).
geometry (Geometry or Iterable) – Geometry to use when querying. This can be a bounds (iterable of length 4) or a Shapely geometry (e.g. Polygon). If a geometry is provided, the bounds of the geometry will be used for the query. Full geometry intersection is not used for the query method.
where (str or bytes or Callable) – A statement which should evaluate to a boolean value. Only annotations for which this predicate is true will be returned. Defaults to None (assume always true). This may be a string, Callable, or pickled function as bytes. Callables are called to filter each result returned from the annotation store backend in Python before being returned to the user. A pickle object is, where possible, hooked into the backend as a user defined function to filter results during the backend query. Strings are expected to be in a domain specific language and are converted to SQL on a best-effort basis. For supported operators of the DSL see tiatoolbox.annotation.dsl. E.g. a simple Python expression props["class"] == 42 will be converted to a valid SQLite predicate when using SQLiteStore and inserted into the SQL query. This should be faster than filtering in Python after or during the query. It is important to note that untrusted user input should never be passed to this argument, as arbitrary code can be run via pickle or the parsing of the string statement.
geometry_predicate (str) – A string defining which binary geometry predicate to use when comparing the query geometry and a geometry in the store. Only annotations for which this binary predicate is true will be returned. Defaults to "intersects". For more information see the shapely documentation on binary predicates.
unique (bool) – If True, only unique values for each selected property will be returned as a list of sets. If False, all values will be returned as a dictionary mapping keys to values. Defaults to True.
squeeze (bool) – If True, when querying for a single value with unique=True, the result will be a single set instead of a list of sets.
self (SQLiteStore)
- Return type:
dict[str, Properties] | list[set[Properties]] | set[Properties]
Examples
>>> from tiatoolbox.annotation.storage import Annotation, SQLiteStore
>>> from shapely.geometry import Point
>>> store = SQLiteStore()
>>> annotation = Annotation(
...     geometry=Point(0, 0),
...     properties={"class": 42},
... )
>>> store.append(annotation, "foo")
>>> store.pquery("*", unique=False)
{'foo': {'class': 42}}

>>> from tiatoolbox.annotation.storage import Annotation, SQLiteStore
>>> from shapely.geometry import Point
>>> store = SQLiteStore()
>>> annotation = Annotation(
...     geometry=Point(0, 0),
...     properties={"class": 42},
... )
>>> store.append(annotation, "foo")
>>> store.pquery("props['class']")
{42}
>>> annotation = Annotation(Point(1, 1), {"class": 123})
>>> store.append(annotation, "bar")
>>> store.pquery("props['class']")
{42, 123}
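The interaction between unique and squeeze can be modelled in a few lines of plain Python. This is only a toy model of the documented return shapes for a single selected value: `pquery_sketch` is a hypothetical function operating on an ordinary dictionary of properties, not on a store.

```python
def pquery_sketch(annotations, select, *, unique=True, squeeze=True):
    """Model the return shapes for one selected value.

    annotations: dict mapping key -> properties dict.
    select: callable taking a properties dict and returning one value.
    """
    values = {key: select(props) for key, props in annotations.items()}
    if not unique:
        # One entry per annotation, keyed by annotation key.
        return values
    # One set of distinct values per selected property (only one here).
    result = [set(values.values())]
    return result[0] if squeeze else result


annotations = {"foo": {"class": 42}, "bar": {"class": 123}}


def get_class(props):
    """Stand-in for the select statement props['class']."""
    return props["class"]


unique_values = pquery_sketch(annotations, get_class)
per_key_values = pquery_sketch(annotations, get_class, unique=False)
unsqueezed = pquery_sketch(annotations, get_class, squeeze=False)
```

Here `unique_values` is a single set, `unsqueezed` is a list containing that one set, and `per_key_values` is a dictionary keyed by annotation key.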
- query(geometry=None, where=None, geometry_predicate='intersects', min_area=None, distance=0)
Query the store for annotations.
- Parameters:
self (SQLiteStore)
geometry (QueryGeometry | None)
where (Predicate | None)
geometry_predicate (str)
min_area (float | None)
distance (float)
- Return type:
- remove_area_column()
Remove the area column from the store.
- Parameters:
self (SQLiteStore)
- Return type:
None
- remove_many(keys)
Bulk removal of annotations by keys.
- Parameters:
keys (iter(str)) – An iterable of keys for the annotation to be removed.
self (SQLiteStore)
- Return type:
None
- serialise_geometry(geometry)
Serialise a geometry to WKB with optional compression.
Converts shapely geometry objects to well-known binary (WKB) and applies optional compression.
- Parameters:
geometry (Geometry) – The Shapely geometry to be serialised.
self (SQLiteStore)
- Returns:
The serialised geometry.
- Return type:
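To make the WKB-plus-compression idea concrete without depending on Shapely, the sketch below hand-encodes a 2D point in little-endian WKB using the standard library and optionally compresses it. The use of zlib is an assumption (this page does not name the compression scheme), and all four helper functions are hypothetical.

```python
import struct
import zlib

# WKB for a 2D point: byte order (1 = little endian), geometry type
# (1 = Point) as uint32, then x and y as 64-bit floats.
POINT_FORMAT = "<BIdd"


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    return struct.pack(POINT_FORMAT, 1, 1, x, y)


def wkb_to_point(data):
    """Decode little-endian WKB for a 2D point back into (x, y)."""
    byte_order, geometry_type, x, y = struct.unpack(POINT_FORMAT, data)
    if byte_order != 1 or geometry_type != 1:
        raise ValueError("expected a little-endian WKB point")
    return x, y


def serialise(wkb, compress=True):
    """Optionally compress the WKB bytes (zlib is an assumed choice)."""
    return zlib.compress(wkb) if compress else wkb


def deserialise(blob, compressed=True):
    """Reverse serialise(): decompress if needed and return raw WKB."""
    return zlib.decompress(blob) if compressed else blob


blob = serialise(point_to_wkb(1.5, -2.0))
round_trip = wkb_to_point(deserialise(blob))
print(round_trip)
```

For tiny geometries such as a single point, compression can make the payload larger; it tends to pay off for polygons with many vertices.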
- to_dataframe()
Converts AnnotationStore to pandas.DataFrame.
- Parameters:
self (SQLiteStore)
- Return type:
- values()
Return an iterable of all annotations in the store.
- Returns:
An iterable of annotations.
- Return type:
Iterable[Annotation]
- Parameters:
self (SQLiteStore)