TensorStore, Google’s Python and C++ Library to Address Storing Issues

This Python and C++ library is designed for storage and manipulation of n-dimensional data

Google's TensorStore has already been used to tackle fundamental engineering problems in scientific computing, notably the management and processing of enormous datasets in neuroscience. TensorStore is an open-source Python and C++ library developed by Google AI for storing and manipulating n-dimensional data. It supports several storage systems, including Google Cloud Storage and local and network filesystems, and offers a unified API for reading and writing diverse array formats. The library also provides read/writeback caching and transactions, with strong atomicity, isolation, consistency, and durability (ACID) guarantees. Optimistic concurrency ensures safe access from multiple processes and machines.

Getting started with Google's Python and C++ library

To get started with the Python API, begin with the tutorial and the guide to indexing operations in the TensorStore documentation, then consult the detailed Python API reference. Setup instructions are in the documentation's Building and Installing section. At its core, the library is built for reading and writing large multi-dimensional arrays.

Concepts of Google's TensorStore

The core abstraction, a TensorStore, is an asynchronous view of a multi-dimensional array. Every TensorStore is backed by a driver, which connects the high-level TensorStore interface to an underlying data storage mechanism.

Opening or creating a TensorStore is done using a JSON Spec, which is analogous to a URL/file path/database connection string.

TensorStore introduces a new indexing abstraction, the Index transform, which underlies all indexing operations. All indexing operations result in virtual views and are fully composable. Dimension labels are also supported and can be used in indexing operations through the dimension expression mechanism.

Shared resources like in-memory caches and concurrency limits are configured using the context mechanism.

Properties of a TensorStore, like the domain, data type, chunk layout, fill value, and encoding, can be queried and specified/constrained in a uniform way using the schema mechanism.

In summary, Google's Python and C++ library for storing and manipulating n-dimensional data:

  • Provides a uniform API for reading and writing multiple array formats, including zarr and N5.
  • Natively supports multiple storage systems, including Google Cloud Storage, local and network filesystems, HTTP servers, and in-memory storage.
  • Supports read/writeback caching and transactions, with strong atomicity, isolation, consistency, and durability (ACID) guarantees.
  • Supports safe, efficient access from multiple processes and machines via optimistic concurrency.
  • Offers an asynchronous API to enable high-throughput access even to high-latency remote storage.
  • Provides advanced, fully composable indexing operations and virtual views.
